Applying Community Data Reporting Formats to Open-Source Water Quality Data
  • Dylan O'Ryan,
  • Robert Crystal-Ornelas,
  • Deb Agarwal,
  • Kristin Boye,
  • Shreyas Cholia,
  • Joan Damerow,
  • Wenming Dong,
  • Kenneth Williams,
  • Charuleka Varadharajan
Dylan O'Ryan
Lawrence Berkeley National Laboratory

Corresponding Author:[email protected]

Robert Crystal-Ornelas
Lawrence Berkeley National Laboratory
Deb Agarwal
Lawrence Berkeley National Laboratory
Kristin Boye
SLAC National Accelerator Laboratory
Shreyas Cholia
Lawrence Berkeley National Laboratory
Joan Damerow
Lawrence Berkeley National Laboratory
Wenming Dong
Earth and Environment Sciences Area, Lawrence Berkeley National Laboratory
Kenneth Williams
Earth and Environment Sciences Area
Charuleka Varadharajan
Lawrence Berkeley National Laboratory

Abstract

Data standardization can enable data reuse by streamlining how data are collected, providing descriptive metadata, and enabling machine readability. Standardized open-source data can be more readily reused in interdisciplinary research that requires large amounts of data, such as climate modeling. Despite the importance given to both FAIR (Findable, Accessible, Interoperable, Reusable) data practices and open-source data, a remaining question is how research data providers can adopt community data standards and open-source data and ultimately achieve FAIR data practices. To address this question, we applied newly created water quality community data reporting formats to open-source water quality data. The water quality reporting format was developed alongside several related formats (e.g., CSV and sample metadata reporting formats) and targets research communities that have historically published water quality data in a variety of formats. It aims to standardize how these types of data are stored in the ESS-DIVE (Environmental Systems Science Data Infrastructure for a Virtual Ecosystem) data repository. Adopting these formats also supports FAIR practices, increases machine readability, and increases reuse of these data. We applied this community format to open-source water quality data produced by the Watershed Function Scientific Focus Area (WFSFA), a large watershed study in the East River, Colorado, which involves many national laboratories, institutions, scientists, and disciplines. In this presentation, we demonstrate a relatively efficient process for converting open-source water quality data into a format that adheres to a community data standard.
We created examples of water quality data translated to the reporting formats that demonstrate the functionality of these data standards; the translation produced descriptive metadata and sample names, streamlined data entries, and increased machine readability. As the community data standards are integrated into the WFSFA data collection processes, and ultimately those of all ESS-DIVE data providers, these steps may enable interdisciplinary data discovery, increase reuse, and support FAIR data practices.
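
As a rough illustration of the kind of translation described above, the following minimal sketch renames ad hoc column headers to a consistent vocabulary and fills empty cells with an explicit placeholder. The column names, the `N/A` placeholder, the `standardize` helper, and the sample row are all hypothetical and chosen for illustration; they are not taken from the actual water quality reporting format or from WFSFA data.

```python
import csv
import io

# Hypothetical mapping from ad hoc lab headers to standardized column names.
# These names are illustrative assumptions, not the reporting format's vocabulary.
COLUMN_MAP = {
    "Sample": "Sample_ID",
    "Date": "Collection_Date",
    "NO3 (mg/L)": "Nitrate_mg_per_L",
    "Cl (mg/L)": "Chloride_mg_per_L",
}
MISSING = "N/A"  # assumed placeholder for missing values


def standardize(raw_csv: str) -> str:
    """Return CSV text with renamed columns and explicit missing-value markers."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(COLUMN_MAP.values()), lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Copy each known column under its new name; empty cells become MISSING.
        writer.writerow(
            {
                new: (row.get(old) or "").strip() or MISSING
                for old, new in COLUMN_MAP.items()
            }
        )
    return out.getvalue()


# Made-up input row with one empty measurement.
raw = "Sample,Date,NO3 (mg/L),Cl (mg/L)\nER-001,2019-06-01,0.42,\n"
standardized = standardize(raw)
print(standardized)
```

Keeping the header mapping in a single dictionary means the same conversion could be rerun on other files with the same layout, which is one plausible way to keep such a translation step relatively efficient.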