Obtaining valid conclusions depends on reliable harmonization and augmentation of the raw entries. However, the power of a large-scale approach is that broad patterns in knowledge production are often overwhelmingly clear, despite occasional inaccuracies and collection biases in individual data sets. Already the HPBD, with its uneven coverage, can be used to assess some general trends in publishing history although it does not compete in reliability and level of detail in the other used bibliographic metadata collections. This is exemplified by our key observations on vernacularization and the rise of the octavo, which are supported by similar trends across multiple independently maintained bibliographic metadata collections. For a a more detailed comparison across European cities, further harmonization and augmentation of the collections are needed.