In this work, we focus on a few selected fields, namely publication year and place, language, and physical dimensions. We have corrected spelling errors, disambiguated and standardized terms, and augmented and validated missing values. We have also developed custom algorithms, such as conversions from the raw MARC notation to numerical page count estimates [REFS], which we have implemented in the bibliographica R package. In addition, we have added derived fields, such as print area, which quantifies the total number of sheets in distinct documents in a given period and thus reflects the overall breadth of printing activity. Print area complements both the plain title count and the overall paper consumption, which additionally takes print run estimates into account. An overview of the harmonized data sets and full algorithmic details of our analysis are available via the Helsinki Computational History Group website [LINK: https://comhis.github.io/2019_CCQ].
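
To make the print area definition concrete, the following is a minimal Python sketch and not the bibliographica implementation. It assumes that each record is already a distinct document, that the physical extent has been reduced to a simple statement with arabic page counts, and that the format is one of four common book formats. The record keys, the function names, and the format labels are hypothetical and chosen only for illustration. Real MARC extent statements also contain roman numerals, plates, and volume counts, which this sketch ignores.

```python
import re
from collections import defaultdict

# Pages printed on one sheet (both sides) for common book formats:
# a folio sheet yields 2 leaves (4 pages), a quarto 4 leaves (8 pages), etc.
PAGES_PER_SHEET = {"folio": 4, "quarto": 8, "octavo": 16, "duodecimo": 24}


def estimate_pages(extent: str) -> int:
    """Sum the arabic page counts in a simplified extent statement.

    Assumption: statements look like '[2], 46 p.'; roman numerals,
    plates, and multi-volume notation are not handled here.
    """
    return sum(int(n) for n in re.findall(r"\d+", extent))


def sheets(pages: int, book_format: str) -> float:
    """Number of sheets needed for one copy of a document."""
    return pages / PAGES_PER_SHEET[book_format]


def print_area_by_decade(records) -> dict:
    """Total sheets over distinct documents, grouped by decade."""
    totals = defaultdict(float)
    for rec in records:
        decade = rec["year"] // 10 * 10
        totals[decade] += sheets(estimate_pages(rec["extent"]), rec["format"])
    return dict(totals)


# Hypothetical example records, not taken from any catalogue.
records = [
    {"year": 1701, "extent": "[2], 46 p.", "format": "octavo"},     # 48 pages, 3 sheets
    {"year": 1705, "extent": "16 p.", "format": "quarto"},          # 16 pages, 2 sheets
    {"year": 1712, "extent": "[4], 92 p.", "format": "duodecimo"},  # 96 pages, 4 sheets
]

print(print_area_by_decade(records))  # {1700: 5.0, 1710: 4.0}
```

Under the same assumptions, multiplying each document's sheet count by a print run estimate before summing would give a paper consumption figure, whereas counting records per decade gives the title count.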