Abstract
Most research-intensive institutions provide some form of data management support. However, the form and extent of these services differ between institutions, which makes them difficult to compare. An objective comparison of the different types of services is needed to evaluate the effectiveness of the diverse approaches and to make informed decisions about their usefulness. In this practice paper, we discuss a collaborative effort between Delft University of Technology (TU Delft), École Polytechnique Fédérale de Lausanne (EPFL), the University of Cambridge and the University of Illinois, which resulted in the development of a short survey to assist institutions in increasing the effectiveness of their data management support services and in guiding their evolution.

Different approaches to a common goal

Informal discussions between the research data service teams of TU Delft, EPFL, the University of Cambridge and the University of Illinois revealed that each institution had taken a different approach to designing its data support services. TU Delft has a central research data support team at the Library [1], which is also part of a consortium of four Dutch technical universities (4TU) [2]. In addition, TU Delft is embarking on a Data Stewardship project, which will provide disciplinary support for data management embedded at faculties (Teperek et al., “Data Stewardship – addressing disciplinary data management needs”, abstract submitted to IDCC18). EPFL has a central data management support team, which provides generic as well as on-demand, tailored training and data consultations to the research community [3]. This team is assisted by liaison librarians, who know the data management needs of their faculties and help the central team shape its services to meet disciplinary requirements. EPFL is also an active player in the national Digital Lifecycle Management project [4]. The University of Cambridge, in addition to a small central team supporting researchers in data management, has a dedicated programme of Data Champions: researchers who volunteer their time to advocate for good data management in their local communities [5]. The University of Illinois has a central data management support team and is also part of a national network of subject-specific data curators [6]. Despite these differences, the goal of the four service providers is the same: to improve data management practice within their research communities. How, then, can we compare how effective our approaches are in achieving this common goal?

Evaluation of existing measures

Members of the four institutions first reviewed existing tools for assessing data management support services. We began with the Research Infrastructure Self Evaluation Framework (RISE) survey created by the Digital Curation Centre [7]. However, we concluded that this framework is better suited to assessing the maturity of the data services offered than to assessing researchers’ data management practice. We then looked at the Data Asset Framework (DAF) used by several UK institutions [8]. The DAF survey is a comprehensive tool that allows institutions to assess researchers’ data management practice and identify gaps in service provision; in principle, it should therefore meet our requirements. However, the DAF survey consists of over sixty questions, which makes it impractical for the repeated assessments we plan to carry out. We therefore decided to follow its general principle, but to design something simpler and less resource-intensive.

Short survey on data management practice

Based on the DAF survey, we drew up a list of ten multiple-choice questions that we considered essential for capturing researchers’ data management practice. Limiting the survey to ten multiple-choice questions has two aims: first, to respect researchers’ time, and secondly, to allow results to be standardised and compared. In addition to the commonly agreed set of questions, each institution was able to add its own questions to obtain more granular information about its different research units and to get feedback on specific services provided to its research community.
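To illustrate how a shared set of multiple-choice questions lends itself to standardised comparison, the following is a minimal sketch in Python, not the analysis the four institutions will actually perform. It converts the answers to one shared question into per-institution shares, so that institutions with different numbers of respondents can be read side by side. The function names, institution labels and answer options are hypothetical, and the answers are invented placeholders rather than survey results.

```python
from collections import Counter


def answer_shares(responses):
    """Return the share of respondents choosing each option for one question."""
    counts = Counter(responses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {option: count / total for option, count in counts.items()}


def compare_institutions(responses_by_institution):
    """Build a table of option shares per institution for one shared question."""
    # Collect every option seen anywhere so that all rows share the same columns.
    options = sorted(
        {option for responses in responses_by_institution.values() for option in responses}
    )
    table = {}
    for institution, responses in responses_by_institution.items():
        shares = answer_shares(responses)
        table[institution] = {option: shares.get(option, 0.0) for option in options}
    return table


# Hypothetical answers to one hypothetical question; these are NOT survey results.
example = {
    "Institution A": ["yes", "yes", "no", "unsure"],
    "Institution B": ["no", "no", "yes", "no"],
}

table = compare_institutions(example)
for institution, shares in table.items():
    print(institution, shares)
```

Working with shares rather than raw counts is one simple way to keep the comparison independent of response volume. It does not correct for self-selection among respondents.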

Anticipated outcomes

TU Delft and EPFL will launch the survey in October 2017, followed by the University of Cambridge and the University of Illinois. We anticipate that the first comparative results will be available at the beginning of 2018. We expect the results of the survey to provide a useful initial assessment of current data management practices across research communities, showing institutions where the biggest gaps are and where more work is needed. The results will help us understand the different disciplinary needs and the maturity of subject-specific data management practice, thus allowing a more targeted approach. In addition, we hope that comparing the results between the institutions will highlight the strengths and weaknesses of the different approaches taken in developing data management support and will lead to an exchange of best practice.

Limitations

As with any other survey-based methodology, our approach has limitations that will affect the type of conclusions that can be drawn. First, the respondents will be self-selected and may therefore not be representative of the research communities we are trying to sample. Secondly, institutions need to be cautious when interpreting differences in results between groups of respondents, as these might not be directly related to the quality or availability of data support services and might instead reflect external factors, such as community norms, specific funders’ policies or the influence of local authorities. Finally, the small number of questions in the survey limits the depth of possible conclusions about data management practices.
Nonetheless, we believe that the benefits of our lightweight data management practice assessment make the approach worth testing.

Next steps

The initial results, expected in early 2018, will show whether the survey is suitable for comparative assessment of data management practice. If it proves suitable, we will continue to use it to regularly evaluate the maturity of researchers’ data management practice at our respective institutions. Additionally, we plan to share the survey under a CC BY licence to enable others to use the tool for their own assessments and to allow comparisons and collaborations with other institutions.

References

1. Research Data Management. TU Delft. Available at: https://www.tudelft.nl/library/themaportalen/research-data-management/. (Accessed: 19th October 2017)
2. 4TU.ResearchData: Home. Available at: http://researchdata.4tu.nl/home/. (Accessed: 19th October 2017)
3. ResearchData | EPFL. Available at: https://researchdata.epfl.ch/. (Accessed: 19th October 2017)
4. Home :: DLCM. Available at: https://www.dlcm.ch/. (Accessed: 19th October 2017)
5. Higman, R., Teperek, M. & Kingsley, D. Creating a Community of Data Champions. bioRxiv 104661 (2017). doi:10.1101/104661
6. Johnston, L. R. et al. Data Curation Network: A Cross-Institutional Staffing Model for Curating Research Data. (2017).
7. RISE, a self-start tool for research data management service review | Digital Curation Centre. Available at: http://www.dcc.ac.uk/news/rise-self-start-tool-research-data-management-service-review. (Accessed: 15th October 2017)
8. Johnson, R., Parsons, T. & Chiarelli, A. Jisc Data Asset Framework Toolkit 2016. (Zenodo, 2016). doi:10.5281/zenodo.177876