Analysis of the FNB allows us to exemplify the potential of openly available bibliographic data resources in terms of data enrichment and reuse.
By releasing the source code of our algorithms, we aim to contribute to the growing body of algorithms tailored specifically for use in this field. Moreover, we hope that openly available analysis methods will gradually pave the way towards the opening of valuable bibliographic data resources. Related successes in other fields offer a precedent: the human genome sequencing project and subsequent research programs rely critically on openly licensed and centrally maintained data resources, together with thousands of algorithmic tools that members of the research community have built independently and that increase the value of these data collections.
Scalable data harmonization, enrichment and validation
Data access, parsing, cleaning, harmonization, enrichment
Obtaining valid conclusions depends on efficient and reliable harmonization and augmentation of the raw entries.
Furthermore, we show how external sources of metadata, for instance, on authors, publishers, or geographical places, can be used to enrich and verify bibliographic information. This type of ecosystem has potential for wider implementation in related studies and other bibliographies.
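As a minimal sketch of the harmonization and enrichment steps described above, the following Python example maps variant spellings of a publication place to a canonical name and then checks and enriches the entry against an external gazetteer. The synonym table, the gazetteer, the field names, and the function names are all hypothetical and serve only as an illustration; they are not the released implementation, and the coordinates are approximate.

```python
import re
import unicodedata

# Hypothetical synonym table: variant spellings -> canonical place name.
PLACE_SYNONYMS = {
    "aboae": "Turku",
    "abo": "Turku",
    "turku": "Turku",
    "helsingfors": "Helsinki",
    "helsinki": "Helsinki",
}

# Hypothetical external gazetteer (approximate, illustrative coordinates).
GAZETTEER = {
    "Turku": {"country": "Finland", "lat": 60.45, "lon": 22.27},
    "Helsinki": {"country": "Finland", "lat": 60.17, "lon": 24.94},
}


def normalize_key(raw):
    """Strip diacritics, punctuation, and case so that variants compare equal."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z ]", "", text.lower())
    return " ".join(text.split())


def harmonize_place(raw):
    """Return the canonical place name, or None if the variant is unknown."""
    return PLACE_SYNONYMS.get(normalize_key(raw))


def enrich_record(record):
    """Add a harmonized place, a validation flag, and gazetteer metadata."""
    place = harmonize_place(record.get("publication_place", ""))
    enriched = dict(record)  # leave the raw entry untouched
    enriched["place_harmonized"] = place
    enriched["place_validated"] = place in GAZETTEER
    if place in GAZETTEER:
        enriched.update(GAZETTEER[place])
    return enriched
```

Keeping the raw field alongside the harmonized one means that every automated decision can be traced back to the original entry, and entries that fail validation can be listed for manual review.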
Research use is part of validation
Discussion: potential ML/AI
Something that would be very useful is explanations of how the various estimates were made. These would be worth writing almost as a separate piece first, so that they can then be used elsewhere as well, and after that merged into the text and perhaps shortened. I mean, for example, a description of how missing format information has been filled in. Shouldn't these be included somehow?

Towards a unified view: catalogue integration

This paper demonstrates how such challenges can be overcome by specifically tailored data analytical ecosystems that provide scalable tools for data processing and analysis.
Recognition of duplicates
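One simple way to approach duplicate recognition, shown here only as an illustrative sketch, is to group entries by a normalized match key. The example assumes hypothetical `author`, `title`, and `year` fields and exact matching after normalization; it does not describe the method actually used for the catalogues, and a real workflow would likely need fuzzy matching and manual verification as well.

```python
import re
import unicodedata
from collections import defaultdict


def _norm(text):
    """Lowercase, strip diacritics and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text or "")
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    return " ".join(text.split())


def match_key(record):
    """Build a comparison key from hypothetical author, title, and year fields."""
    return (
        _norm(record.get("author")),
        _norm(record.get("title")),
        str(record.get("year") or ""),
    )


def find_duplicate_groups(records):
    """Return lists of record indices that share the same match key."""
    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[match_key(rec)].append(idx)
    return [ids for ids in groups.values() if len(ids) > 1]


# Invented records for illustration only.
records = [
    {"author": "Doe, Jane", "title": "A Récueil of Examples.", "year": 1766},
    {"author": "DOE, Jane", "title": "A recueil of examples", "year": "1766"},
    {"author": "Doe, Jane", "title": "A recueil of examples", "year": 1778},
]
duplicate_groups = find_duplicate_groups(records)
```

In this invented example the first two records share a key and form one candidate group, while the third differs in year and is kept separate. Candidate groups of this kind are best treated as suggestions to be confirmed, not as final merges.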

Open bibliographic data science

data organization, data and code sharing, interfaces, software modules, analytical ecosystems