Patent indicator data extraction process
Search queries were constructed from the technology classification categories and, where applicable, the keywords specified in Table \ref{table:search_terms}. The results of these queries were exported in batches of up to 10,000 records in a tabulated HTML format. Only the representative family member of each FamPat grouping was exported, in order to avoid duplicating records across multiple jurisdictions. Each exported record included the key patent information together with full details of the patent and non-patent literature references cited in that record. As some searches could generate very large numbers of records (hundreds of thousands), exporting in batches kept the data in manageable files, but required the batches to be subsequently imported into a tool capable of processing the volume of data involved. MATLAB was used for this purpose: a script (provided in Appendix B), based on a pre-existing conversion script, was developed to convert each HTML batch file into a corresponding .MAT file, ready for data cleaning.
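To illustrate the batch conversion step, the following is a minimal sketch in Python of how tabulated HTML exports could be parsed and merged into a single list of records. It is not the MATLAB script of Appendix B. It assumes that each batch contains a single table whose first row holds the column headers, and the names \texttt{TableRowParser}, \texttt{batch\_to\_records} and \texttt{merge\_batches} are hypothetical.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # row being built, or None outside a <tr>
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            if self._cell is not None and self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            if self._row is not None:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def batch_to_records(html_text):
    """Convert one HTML batch into a list of dicts keyed by column header.

    Assumption: the first table row contains the column headers.
    """
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def merge_batches(batches):
    """Concatenate the records of several batches, in the order given."""
    records = []
    for html_text in batches:
        records.extend(batch_to_records(html_text))
    return records
```

Parsing each batch independently and concatenating afterwards mirrors the one-file-per-batch conversion described above, so a malformed batch can be re-exported without reprocessing the others.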
Patent indicator data cleaning process
Whilst the Questel-Orbit patent data is of a high standard of consistency, several cleaning steps are still required before patent indicator metrics can be extracted from it. These steps ensure that the datasets are translated into a tabulated format suitable for the automated analysis processes to follow, and correct any easily rectifiable data entry errors that may be present in the extracted data (such as application or priority dates that are missing from the relevant columns but are available elsewhere). This allows a more accurate chronology of patent events to be established. The cleaning process is not discussed in detail here; full details are provided in Appendix C.
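As an illustration of the kind of correction described above, the following Python sketch fills a missing date column from another field of the same record. It is not the procedure of Appendix C. The field names used in the example, the ISO date format and the function names are all assumptions made for illustration only.

```python
from datetime import datetime


def parse_date(text, fmt="%Y-%m-%d"):
    """Return a date for a non-empty, well-formed string, otherwise None."""
    if not text or not text.strip():
        return None
    try:
        return datetime.strptime(text.strip(), fmt).date()
    except ValueError:
        return None


def fill_missing_date(record, target, fallbacks):
    """Fill record[target] from the first fallback field holding a valid date.

    A new dict is returned; the input record is left unmodified. If the
    target already holds a valid date, or no fallback does, the copy is
    returned unchanged.
    """
    cleaned = dict(record)
    if parse_date(cleaned.get(target)) is not None:
        return cleaned
    for field in fallbacks:
        if parse_date(cleaned.get(field)) is not None:
            cleaned[target] = cleaned[field].strip()
            break
    return cleaned
```

For example, calling \texttt{fill\_missing\_date(record, "application\_date", ["earliest\_filing\_date"])} on a record with an empty \texttt{application\_date} would copy across the value of the (hypothetical) \texttt{earliest\_filing\_date} field, provided that it parses as a date.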
Technology Life Cycle stage matching process
With bibliometric profiles extracted for each of the technologies considered in this study, the first stage of analysis consists of identifying the transition points between different stages of the Technology Life Cycle. These transition points are used to establish time series segments for the subsequent comparative analysis. For the technologies considered in this study, evidence was identified in the literature to suggest when these transitions had occurred, such as the innovation timeline assessments prepared for a range of technologies by Hanna \cite{hanna2015innovation}. Full details of the transition points used in this study are provided in Table \ref{table:TLC_transition_points}.
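The segmentation described above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in this study. It assumes annual data, and it assumes that a transition year is counted as the first year of the later stage. The stage names are supplied by the caller, and \texttt{segment\_by\_stage} is a hypothetical name.

```python
from bisect import bisect_right


def segment_by_stage(series, transitions, stages):
    """Split a {year: value} series into one segment per life cycle stage.

    transitions: sorted list of years at which each later stage begins
    stages: stage names, exactly one more than the number of transitions
    Assumption: a transition year belongs to the stage that starts in it.
    """
    if len(stages) != len(transitions) + 1:
        raise ValueError("need exactly one more stage than transitions")
    segments = {stage: [] for stage in stages}
    for year in sorted(series):
        # Number of transitions at or before this year gives the stage index.
        stage = stages[bisect_right(transitions, year)]
        segments[stage].append((year, series[year]))
    return segments
```

With transition years taken from Table \ref{table:TLC_transition_points}, each technology's annual record counts would be split into one list of (year, value) pairs per stage, which can then be compared across technologies.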