Chemistry also accounts for such a large fraction of the data sharing articles because the Cambridge Crystallographic Data Centre (CCDC) is the major contributor of links between data and articles in Scholix. We consider this a feature of the culture in crystallography rather than a bias: the CCDC repository is one of the largest scientific data repositories, and it is the product of a deep data sharing culture in this discipline.
Next steps
We plan to extend the above analysis by examining the countries where data sharing is more common, as well as the journals and authors that practice data sharing.
Following the landscape analysis, we will start measuring impact in the coming months. Using Scopus and Plum metrics, we will assess the impact of associated datasets not only on article citations, but also on downloads and several alternative metrics.
In addition, we will analyze how often datasets are cited in their own right. Two different methods will be employed:
- Using Scholix, we will analyze how often 'cited by' links are contributed by publishers and data repositories.
- Using text mining methods, we will search the Scopus database by DOI prefix to see how often different data repositories and datasets appear in the reference lists of articles.
Employing both methods will also give us insight into, first, the completeness of Scholix and, second, how often links between articles and datasets are made outside of reference lists.
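As a rough illustration of the second method and of the completeness check, the sketch below counts DOI prefixes in reference strings and compares the resulting links with a set of Scholix links. It is a minimal sketch under stated assumptions, not the pipeline used in this study: the prefix-to-repository table, the function names, and the representation of a link as an (article DOI, dataset DOI) pair are all hypothetical choices made for the example, and the reference strings are assumed to be available as plain text.

```python
import re
from collections import Counter

# Illustrative prefix-to-repository table (an assumption for this sketch;
# a real analysis would need a curated, verified list of prefixes).
REPOSITORY_PREFIXES = {
    "10.5517": "CCDC",
    "10.5061": "Dryad",
    "10.5281": "Zenodo",
    "10.6084": "figshare",
}

# A DOI starts with "10.", then a registrant code of 4 to 9 digits,
# then "/" and a suffix. Group 1 captures the prefix only.
DOI_PATTERN = re.compile(r"\b(10\.\d{4,9})/[^\s\"<>]+")


def count_repository_mentions(references):
    """Count DOIs per known repository across a list of reference strings."""
    counts = Counter()
    for reference in references:
        for match in DOI_PATTERN.finditer(reference):
            repository = REPOSITORY_PREFIXES.get(match.group(1))
            if repository is not None:
                counts[repository] += 1
    return counts


def links_missing_from_scholix(reference_links, scholix_links):
    """Return links found in reference lists that are absent from Scholix.

    Both arguments are iterables of (article_doi, dataset_doi) pairs.
    """
    return set(reference_links) - set(scholix_links)
```

Matching on the prefix alone keeps the search cheap, but it only works for repositories that mint DOIs under a dedicated prefix; repositories that share a prefix with other content would need a check on the DOI suffix as well. The set difference gives a first estimate of how many reference-list links Scholix does not yet contain, and swapping the arguments gives the links that exist only outside reference lists.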
These results can serve as input for projects currently underway on data metrics, such as the Make Data Count project\cite{Kratz2015}.