Research use is itself part of validation. Automating the workflow would, in principle, also allow an analysis of how robust this approach is to varying technical choices in the data harmonization, although such analysis falls beyond the scope of this manuscript. Future development could take increasing advantage of machine learning and borrow further methods from ecology and related fields that have well-established methods for spatio-temporal data analysis. Machine learning and artificial intelligence (AI) could help to significantly improve the scalability and accuracy of data harmonization and verification. For instance, the raw page count fields have a systematic structure. Instead of a lengthy process of constructing conversion algorithms by hand, adaptive machine learning algorithms could be trained with a limited set of well-chosen training examples, and the accuracy of the conversions into page counts could be monitored and quantified until satisfactory accuracy and coverage are reached.
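As a minimal sketch of how such monitoring could look, the following Python example measures coverage (the share of records for which any estimate is produced) and accuracy (the share of produced estimates that match a hand-labelled value). It is an illustration under stated assumptions, not the pipeline used in this work: the names `toy_page_count`, `evaluate`, and `LABELLED` are hypothetical, the example strings are invented rather than taken from any catalog, and the simple rule-based converter merely stands in for whatever converter, hand-written or learned, is being evaluated.

```python
import re
from typing import Callable, Optional

# Values of single Roman numeral characters (lower case).
ROMAN = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral such as 'xii' to an integer."""
    total, prev = 0, 0
    for ch in reversed(numeral.lower()):
        val = ROMAN[ch]
        if val < prev:
            total -= val  # subtractive notation, e.g. the 'i' in 'iv'
        else:
            total += val
            prev = val
    return total


def toy_page_count(raw: str) -> Optional[int]:
    """Hypothetical stand-in converter (an assumption, not the real method).

    Sums comma-separated segments written as Arabic or Roman numerals,
    optionally in square brackets. Returns None for anything else.
    """
    text = re.sub(r"\s*p\.?$", "", raw.strip().lower())  # drop trailing "p."
    if not text:
        return None
    total = 0
    for segment in text.split(","):
        segment = segment.strip().strip("[]").strip()
        if segment.isdigit():
            total += int(segment)
        elif segment and all(ch in ROMAN for ch in segment):
            total += roman_to_int(segment)
        else:
            return None  # unrecognized pattern: give no estimate
    return total


def evaluate(convert: Callable[[str], Optional[int]],
             labelled: list[tuple[str, int]]) -> dict[str, float]:
    """Report coverage and accuracy of `convert` on hand-labelled examples."""
    answered = correct = 0
    for raw, truth in labelled:
        estimate = convert(raw)
        if estimate is not None:
            answered += 1
            correct += int(estimate == truth)
    return {
        "coverage": answered / len(labelled) if labelled else 0.0,
        "accuracy": correct / answered if answered else 0.0,
    }


# Invented examples with hand-assigned page counts (illustrative only).
LABELLED = [
    ("xii, 230 p.", 242),
    ("[2], 45, [3] p.", 50),
    ("48 p", 48),
    ("1 sheet", 2),
]

if __name__ == "__main__":
    print(evaluate(toy_page_count, LABELLED))
```

Running the same evaluation after each change to the converter or to the training examples gives a simple way to track whether accuracy and coverage are approaching an acceptable level.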
This work provides a starting point and guidelines for more extensive integration of national catalogs. National bibliographies are essentially about mapping the national canon of publishing, but integrating data across borders should be managed in a way that takes specific local circumstances into account while also helping to overcome the national view in analyzing the past. Such integration can help scholarship to reach a more precise view of print culture beyond the confines of national bibliographies.
Open availability of the raw data as well as the analysis methods is central for efficient, collaborative, and transparent research use of bibliographic collections in modern society. Although traditional data management policies do not support open sharing of these digital resources, the time is ripe for change. Open availability of bibliographic data collections and supporting data sources can foster innovative and nontraditional research use of the catalogs, as demonstrated in this article. In this rapidly changing field, a shift toward more collaborative development of research methods can advance the transition from data management toward collaborative quality control and research.
Our work demonstrates that comprehensive data harmonization is essential for accurate and useful data retrieval and relevant for the overall usability of the catalog information. It also shows how the available classification and subject analyses, geographical information, and other data can be utilized, augmented, enriched, and validated based on auxiliary information sources, such as digital maps. Integration of national bibliographies, special collections, and archives is relevant for international aspects of digital cataloging.
As such, the work highlights specific bottlenecks and shortcomings in the available cataloging and classification information, and can therefore provide relevant information for education, training, and management of cataloging. Finally, we demonstrate how bibliographic catalog records can be used as a digital research resource, rather than a mere information retrieval tool.
What is the significance of this work and these publications relative to what has already been published, including a projection concerning other comparable catalogs? This is a valuable part of the paper itself. - Yes, this should be elaborated further / LL