Importantly, our key observations on vernacularization and the rise of the octavo are supported by similar trends across multiple independently maintained library catalogues.
While documentation and polishing continue, we have made all source code openly available so that the details of the data processing can be independently investigated and verified. This paper demonstrates how such challenges can be overcome by specifically tailored data analytical ecosystems that provide scalable tools for bibliographic data analysis.
We have investigated four bibliographies of different types (FNB, SNB, ESTC, and HPBD). Each catalogue is associated with a similar open-source harmonization workflow, which provides a detailed and transparent account of the data processing steps, from raw data harmonization to the final statistical analyses, summaries, and visualizations; a minimal sketch of such a harmonization step is given below. Furthermore, we have shown how external sources of metadata, for instance on authors, publishers, or places, can be used to enrich and verify the catalogue information (see the second sketch below).

Future developments could take increasing advantage of machine learning, as well as of methods from ecology and related fields, which have well-established techniques for spatio-temporal data analysis. Adaptive machine learning could significantly improve the scalability of data harmonization: models could be trained on a limited set of well-chosen examples, and the accuracy of the conversions could be monitored until satisfactory accuracy and coverage are reached (see the final sketch below). This type of data analytical ecosystem has potential for wider implementation in related studies and other bibliographies, as many of the data analytical problems we encountered are common in digital humanities research.
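To make the harmonization step concrete, the following Python sketch normalizes raw imprint place strings to canonical names. It is a minimal illustration under assumed inputs: the PLACE_SYNONYMS table and the harmonize_place function are hypothetical stand-ins for the curated, openly versioned mapping files used in the actual workflows, not part of any of the four catalogues' schemas.

```python
import re

# Hypothetical synonym table; real workflows derive such mappings from
# curated, openly versioned lookup files rather than inline dicts.
PLACE_SYNONYMS = {
    "londini": "London",
    "london": "London",
    "holmiae": "Stockholm",
    "stockholm": "Stockholm",
}

def harmonize_place(raw: str) -> str | None:
    """Map a raw imprint place string to a canonical modern name."""
    cleaned = raw.lower()
    cleaned = re.sub(r"\bs\.?l\.?\b", " ", cleaned)   # drop 'sine loco' markers
    cleaned = re.sub(r"[\[\]().,:;]", " ", cleaned)   # strip cataloguing punctuation
    cleaned = " ".join(cleaned.split())
    return PLACE_SYNONYMS.get(cleaned)                # None marks an unresolved entry

print(harmonize_place("Londini :"))  # -> London
print(harmonize_place("[Holmiae]"))  # -> Stockholm
print(harmonize_place("s.l."))       # -> None, left for manual review
```

Returning None rather than guessing keeps unresolved entries visible, so coverage can be reported transparently at each processing step.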
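Enrichment with external metadata can be expressed as a join between harmonized records and an external reference source. The sketch below uses illustrative field names and toy data rather than the actual FNB, SNB, ESTC, or HPBD schemas: it links harmonized place names to coordinates from a hypothetical gazetteer, and the unmatched rows double as a verification signal.

```python
import pandas as pd

# Toy harmonized catalogue entries; identifiers and field names are
# illustrative, not the actual schema of any of the four catalogues.
records = pd.DataFrame({
    "id": ["fnb-001", "estc-042"],
    "place": ["Stockholm", "London"],
    "gatherings": ["8vo", "2fo"],
})

# Hypothetical external gazetteer with coordinates (e.g. derived from
# an open geographical database).
gazetteer = pd.DataFrame({
    "place": ["London", "Stockholm"],
    "lat": [51.51, 59.33],
    "lon": [-0.13, 18.07],
})

# A left join keeps every catalogue record; rows with missing
# coordinates reveal places the external source could not verify.
enriched = records.merge(gazetteer, on="place", how="left")
print(enriched)
```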
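The adaptive approach could resemble a standard active learning loop: train on a small labeled seed set, query labels for the records the model is least certain about, and monitor held-out accuracy until a threshold is reached. The sketch below is one possible instantiation, using uncertainty sampling with scikit-learn on synthetic format strings; the data, features, threshold, and stopping rule are all assumptions for illustration, not the method used in this paper.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Toy task: classify raw format strings as octavo (8vo) vs. folio (2fo).
pool = ["8:o", "octavo", "in-8", "2:o", "folio", "in-fol", "8vo", "fol."]
pool_labels = ["8vo", "8vo", "8vo", "2fo", "2fo", "2fo", "8vo", "2fo"]  # oracle

labeled = {0, 3}                                        # well-chosen seed set
holdout = (["in-octavo", "in-folio"], ["8vo", "2fo"])   # fixed evaluation set

vec = TfidfVectorizer(analyzer="char", ngram_range=(1, 3))
X_all = vec.fit_transform(pool + holdout[0])
X_pool, X_hold = X_all[: len(pool)], X_all[len(pool):]

for step in range(len(pool)):
    idx = sorted(labeled)
    clf = LogisticRegression().fit(X_pool[idx], [pool_labels[i] for i in idx])
    acc = accuracy_score(holdout[1], clf.predict(X_hold))
    print(f"step {step}: {len(labeled)} labels, holdout accuracy {acc:.2f}")
    if acc >= 0.99:                    # stop once accuracy is satisfactory
        break
    unlabeled = [i for i in range(len(pool)) if i not in labeled]
    if not unlabeled:                  # pool exhausted
        break
    # Uncertainty sampling: request a label for the least confident record.
    probs = clf.predict_proba(X_pool[unlabeled]).max(axis=1)
    labeled.add(unlabeled[int(np.argmin(probs))])
```

Monitoring the held-out accuracy at every step, as in the loop above, is what makes the conversion quality auditable while the number of manual annotations stays small.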