The largest number of domains identified in Figure \ref{692056} comes from Life Sciences journals. Among the most highly referenced resolvable domains, Dryad (datadryad.org), the collection of repositories at the NIH (nih.gov), GitHub (github.com), and the general-use data repository Figshare (figshare.com) are the four most-used data sharing and storage locations. These top-referenced domains and resources were referenced largely by researchers from the Life Sciences. It is worth noting that GitHub, ResearchGate, and Google appear in our results. These organisations do not currently offer the same commitments to data preservation, access, and citation that specialised repository services offer for research data. The high number of referenced domains from Life Sciences journals, particularly in Ecology (see Figure \ref{762092} above), may also reflect that a large portion of Wiley's Life Sciences portfolio has required data archiving since as far back as 2011.
In Figure \ref{692056}, Mathematics & Statistics is the only WOL Level 1 category in which doi.org (unresolved) is not the most frequent domain; there, most links point to github.com. Similar patterns are seen in the WOL Level 2 categories Statistics and Oncology & Radiotherapy, discussed below. As well as having the most DASs, Life Sciences also has the largest range of repositories used (22). Physical Sciences & Engineering uses the fewest, at only 2 repositories. It is clear from Figure \ref{692056} that github.com is used not only by computer science researchers but by researchers publishing in a diverse range of disciplines. We speculate that this reflects the increasingly important role of computation and data science across the research spectrum in handling the analysis of big data \cite{frontier}, and the increasingly interdisciplinary nature of research. Figure \ref{692056} also shows that in Social & Behavioral Sciences data sharing is dominated by osf.io, the Open Science Framework from the Center for Open Science. Interestingly, osf.io does not describe itself as specifically for social and behavioral scientists; it seems the Center for Open Science's roots in psychology run deep. Other results for top WOL Level 1 categories (Life Sciences; Social & Behavioral Sciences; Earth & Environmental Sciences; Medicine; and Business, Economics, Finance & Accounting) show the success of repositories or domains that exclusively (or predominantly) serve researchers from a particular discipline or subject category; some of these are discussed below.
Figure \ref{762092} shows the dominance of datadryad.org in Ecology, as is the case across the Life Sciences: DASs that link to datadryad.org are almost exclusively in the Life Sciences (Figure \ref{692056}). This speaks to datadryad.org's origins in evolutionary biology and ecology, and to its future across Life Sciences and perhaps beyond: datadryad.org only relatively recently recast itself as a general-purpose data repository.
Figure \ref{802504} shows specialist domains serving data sharing in Earth Sciences, namely usgs.gov, noaa.gov, and pangaea.de. The generalist repositories zenodo.org and figshare.com also feature highly, and github.com features when we broaden out from the Earth Sciences WOL Level 2 category to consider the Earth & Environmental Sciences WOL Level 1 category in general (Figure \ref{692056}).
Figure \ref{170316} shows that github.com dominates the links found in DASs in Statistics. In fact, github.com scores higher than unresolvable doi.org links for Statistics DASs. This tells us something about precision and familiarity with data sharing in Statistics, where more doi.org links resolve than do not (unlike in Ecology, Earth Sciences, or Economics; see Figures \ref{762092}, \ref{802504}, and \ref{792929}).
Figure \ref{792929} shows that in Economics worldbank.org is the most common domain. For the first time in this analysis researchgate.net, more commonly described as a professional network for science than as a data sharing service, makes an appearance. As mentioned above, ResearchGate does not currently offer the same commitments to data preservation, access, and citation that specialised repository services offer for research data.
Figure \ref{144977} reports data for Oncology & Radiotherapy, where the top domains are nih.gov, cancer.gov, iarc.fr, and clinicalstudydatarequest.com. The first three of these domains score higher than unresolvable doi.org links in Oncology & Radiotherapy DASs. Again, as with Statistics above, this tells us something about precision and familiarity with data sharing in Oncology & Radiotherapy, where more doi.org links resolve than do not. Domains like nih.gov host multiple services and repositories to support data sharing; our methods did not deliver information about these.
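To illustrate why a domain-level count cannot distinguish between the services hosted under a domain such as nih.gov, the following is a minimal Python sketch. It is not the method used in this study. The function `registered_domain` is a hypothetical helper that naively keeps the last two labels of a hostname (an assumption that would fail for suffixes such as .co.uk), and the URLs are illustrative examples rather than links drawn from our DASs.

```python
from collections import Counter
from urllib.parse import urlparse


def registered_domain(url: str) -> str:
    """Naively reduce a URL to the last two labels of its hostname.

    Assumption: a two-label cut is adequate for the domains discussed
    here (nih.gov, cancer.gov); it is wrong for suffixes like .co.uk.
    """
    host = urlparse(url).hostname or ""
    labels = host.lower().split(".")
    return ".".join(labels[-2:])


# Illustrative links only; these are not taken from the study data.
example_links = [
    "https://www.ncbi.nlm.nih.gov/geo/",
    "https://www.ncbi.nlm.nih.gov/sra",
    "https://pubchem.ncbi.nlm.nih.gov/",
    "https://seer.cancer.gov/",
]

# Counting at the domain level merges distinct services into one bucket.
domain_counts = Counter(registered_domain(u) for u in example_links)

# Counting full hostnames keeps some, but not all, of the distinction:
# services that differ only by URL path still share a hostname.
host_counts = Counter(urlparse(u).hostname for u in example_links)

for domain, n in domain_counts.most_common():
    print(domain, n)
```

In this sketch three different services collapse into a single nih.gov count, which is the kind of detail that a domain-level analysis cannot recover without also inspecting hostnames and paths.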
To restate, the purpose of our study is to highlight repositories used most by researchers across disciplines, rather than to make robust comparisons about the frequency of data sharing between disciplines. Figures \ref{140748} and \ref{372269} show that our data include DASs from many more submitted articles in Life Sciences than in any other WOL Level 1 category, and with more links to domains and repositories (28,327 DASs; 7% with links). Following Life Sciences is Medicine (4,905 DASs; 11% with links), then Business, Economics, Finance & Accounting (3,706; 18%), Earth & Environmental Sciences (2,271; 17%), and Social & Behavioral Sciences (2,146; 16%). Across all WOL Level 1 categories a mean of 17% of DASs contain links to repositories or data sharing services. The smallest number of DASs included in this study (Figure \ref{140748}) is from Wiley Physical Sciences journals. The number of domains retrieved and analysed in Physical Sciences is also very small, and includes links to just two domains: nih.gov and usgs.gov. The main reason for these small numbers is the source of our data. Our source data come from Wiley journals that use one type of editorial office software; many Wiley Physical Sciences journals use a different piece of editorial office software. The Wiley journal portfolio in the Physical Sciences is notably strong and deep, and extraction of suitable data for analysis and further investigation would be valuable.
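The counts and percentages quoted above can be combined in a short worked calculation, sketched here in Python. It uses only the five categories and figures given in the text; because the percentages are rounded, the derived counts are approximate. The variable names are illustrative, and the 17% mean reported above covers all WOL Level 1 categories, so it is not expected to match the five-category values computed here.

```python
# (category, number of DASs, share of DASs with links), as quoted in the text.
reported = [
    ("Life Sciences", 28327, 0.07),
    ("Medicine", 4905, 0.11),
    ("Business, Economics, Finance & Accounting", 3706, 0.18),
    ("Earth & Environmental Sciences", 2271, 0.17),
    ("Social & Behavioral Sciences", 2146, 0.16),
]

# Approximate number of DASs with links in each category.
approx_counts = {name: round(n * share) for name, n, share in reported}

# Pooled share: all DASs with links divided by all DASs (weighted by size).
total_das = sum(n for _, n, _ in reported)
pooled_share = sum(n * share for _, n, share in reported) / total_das

# Unweighted mean: every category counts equally, regardless of size.
unweighted_mean = sum(share for _, _, share in reported) / len(reported)

for name, count in approx_counts.items():
    print(f"{name}: about {count} DASs with links")
print(f"Pooled share over these five categories: {pooled_share:.1%}")
print(f"Unweighted mean over these five categories: {unweighted_mean:.1%}")
```

Under these assumptions Life Sciences contributes roughly 1,983 DASs with links, more than the other four categories combined, even though it has the lowest share with links. This is one reason the results are better read as a guide to where researchers share data than as a comparison of sharing rates between disciplines.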
Conclusion
Our aim with this study was to turn data from DASs into a useful resource for researchers who want or need to share new data, by showing them where other researchers choose to share similar data. Similarly, we aimed to support journal editors who want to recommend repositories and data sharing services to authors based on evidence.
We have provided detailed information about frequently used domains and repositories across research disciplines and subject areas in the Table, the Figures, and the associated data files. These can be used as directional advice and inspiration by research authors and journal editors looking for repositories to use and recommend.
What we have provided should be used in combination with other resources, for example: FAIRsharing.org \cite{gdozw} and the project led by FAIRsharing.org to identify criteria that matter for repository selection \cite{matter}; CoreTrustSeal \cite{httpswwwcoretrustsealorgwhy-certificationcertified-repositories}; and the ELIXIR Core Data Resources \cite{elixir}. Together, these resources will help facilitate a deeper understanding of best practices around data sharing, reporting standards, and data repositories.
Data availability statement
Processed source data are shared in association with Table \ref{133180}. Beyond that, data from originally submitted DASs are not shared, for the same reasons described in our previous study \cite{Graf}.
Disclosure of conflicts of interest
All authors are employed by Wiley and benefit from the company's success.