Inaugurated in March 2017
Natural cold conditions, no energy needed
Signed by 43 countries in 2018
What can be stored? Piql films are the storage medium and they can handle every (digital) format
Binary data is encoded as a 2D barcode and printed on 35mm film, but it is digital, not analog.
Guaranteed information recovery for 500 years (ISO standards). I guess we can call this "long-term preservation".
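The encoding idea above can be pictured as a toy round trip: data bytes are laid out as a 2D grid of black/white cells, and a sensor reading the grid back recovers the bytes. This is only an illustration of the principle; real Piql frames add error correction, sync patterns and control data that this sketch omits.

```python
# Toy sketch of 2D-barcode-on-film encoding (illustrative only, not Piql's
# actual frame format): each bit of the payload becomes one black/white cell.

def bytes_to_grid(data: bytes, width: int = 16) -> list:
    """Lay the bits of `data` out row by row as 0/1 cells."""
    bits = []
    for byte in data:
        bits.extend((byte >> i) & 1 for i in range(7, -1, -1))  # MSB first
    while len(bits) % width:  # pad the last row with white cells
        bits.append(0)
    return [bits[i:i + width] for i in range(0, len(bits), width)]

def grid_to_bytes(grid: list, n_bytes: int) -> bytes:
    """Reverse step: a sensor reads the cells and reassembles the bytes."""
    flat = [cell for row in grid for cell in row][:n_bytes * 8]
    out = bytearray()
    for i in range(0, len(flat), 8):
        value = 0
        for bit in flat[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)
```

Decoding needs nothing but the cell values, which is why only a light source and a sensor are required to read the film back.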
Data is visible
Open platform for reading: only a light source and a sensor are necessary. Metadata is physically embedded on the film.
OAIS model compliant
Offline storage, migration-free, few people manage the data, trustworthy materials, automatic archival system
FAR's conclusion: But it's not OPEN!
Eliane's reply: Well, it's either open or long-term preserved. You have to choose!
Data Deposit Recommendations, DANS, P. Doorn
https://halshs.archives-ouvertes.fr/halshs-01531337
Built on re3data.org
"user-friendly service guiding humanities researchers to deposit data"
Tailoring re3data, which is a well-known service. To improve this tool, re3data itself needs to be improved.
Here is the site: https://ddrs-dev.dariah.eu/ddrs/
Helps you find an adequate repository for your field of research and your country. Note: this only concerns HUMANITIES RESEARCH.
As this tool is built upon re3data, the quality of the mData will depend on the quality of the mData included in re3data.

DMPRoadmap : new features in DMPonline and DMPTool (finally we had time to see this one too)

Sarah Jones DCC

https://dmponline.dcc.ac.uk/ = old version
https://dmponline-test.dcc.ac.uk/ = beta version of the update
There's been an update on the DMPTool
"ask for feedback" functionnality
"invite collaborator" functionnality

What about the bit in the middle

A lot of talk about the context in which research data is currently produced, but...
What about active data management? They told us at the beginning of the presentation that they would definitely talk about active data management (the term is even in the subtitle of the talk).
Ten minutes later, we got a demo of a product, "AnalytixAgility" (we practised together for an hour to be able to pronounce the name correctly)
https://www.aridhia.com/platform/analytixagility/
Different kinds of quite powerful visualisations are possible.
It is possible to code in R inside the software. LaTeX is also enabled in the platform.
Genomic data
------------------------------------------------------

Parallel sessions II

----------------------------------------------

More than Data (Fantin)

Incorporating software curation into research data management services: lessons learned

(F. Rios, Arizona) 

Can we treat software differently from other generic data? Yes, but "we are more interested in what the software does than in what it is".
We may lose knowledge about: its execution / its relationships (workflows) / other aspects (code, intellectual property) / attribution (visibility)
Software in RDM
- consulting (best practices / funder & publisher reqs. / IP / repository selection / mData / how does it fit in the DMP)
- archiving (institutional data repo / dataset curation)
- education (workshops / training)
Knowledge-building & planning are linked with the 3 bullet points above.
Planning outcomes :
- expand the archiving workflow (treat software separately from data / scope: only software produced as part of the research => DON'T TRY TO CAPTURE EVERYTHING OR DO FULL RE-EXECUTION)
- expand consulting expertise (elevate team knowledge of research software in RDM)
- training
Sharing software in an archive:
- how to archive the workflow?
- how can software be made more visible in the archive?
- how should it be curated? => packaged as one entity / broken down into multiple modules? => more of the second option (give enough info to replicate / give credit / reuse)
mData is added while uploading, THEN controlled by the researcher who created it once uploaded.
Best practices (advice): version control / code organization
The team learned a lot about software data while creating educational materials.
Future goals: choose a better mData schema; CodeMeta could be a good solution https://github.com/codemeta/codemeta
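As a rough idea of what CodeMeta metadata looks like, here is a minimal codemeta.json built in Python. The field names follow the CodeMeta 2.0 vocabulary; all values are invented placeholders, not taken from the talk.

```python
import json

# Minimal codemeta.json sketch; every value below is a placeholder.
codemeta = {
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "@type": "SoftwareSourceCode",
    "name": "example-analysis-tool",  # hypothetical software name
    "version": "1.0.0",
    "description": "Scripts produced as part of a research project.",
    "author": [{"@type": "Person", "givenName": "Jane", "familyName": "Doe"}],
    "license": "https://spdx.org/licenses/MIT",
    "codeRepository": "https://example.org/repo",
}

print(json.dumps(codemeta, indent=2))
```

Depositing such a file alongside the code gives the archive structured attribution and replication info without any custom schema work.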

Curating scientific workflows for biomolecular nuclear magnetic resonance spectroscopy

(M. Gryk, Illinois, UCONN, Health, USA)

Workflow: transform raw data (1) into frequency spectra (2) / identify signals (3) / turn them into a biophysical claim (4). The process takes between months and years.
(2) Shell scripts => not really reproducible (although NMR specialists say they are). They created their own workflow design / execution system (CONNJUR Workflow Builder). Exports are in XML, but XML is not really human-intelligible: they exchanged one problem for another.
Implementation of PREMIS in those XML exports => finally developed a CONNJUR XML schema (not found on GitHub)
Goal of this: workflows should be more understandable to a broader audience.
CWB sits inside NMRbox (a downloadable VM)
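To make the idea concrete, here is a toy sketch of exporting a processing workflow as XML, in the spirit of the CONNJUR export described above. The element names are invented for illustration; they are not the actual CONNJUR schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow export: each processing step is recorded with its
# order, the tool used, and a human-readable description.
workflow = ET.Element("workflow")
steps = [
    ("fourier_transform", "raw FID data to frequency spectrum"),
    ("peak_pick", "identify signals in the spectrum"),
]
for order, (tool, description) in enumerate(steps, start=1):
    step = ET.SubElement(workflow, "step", order=str(order), tool=tool)
    step.text = description

xml_text = ET.tostring(workflow, encoding="unicode")
print(xml_text)
```

The structured record captures the execution order explicitly, which is exactly what an ad-hoc shell script loses.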

Embedded metadata patterns across web sharing environments

(S. Thompson, Houston)

Image mData => it's quite chaotic, since everybody is allowed to take photographs and upload them (smartphones, cheap cameras)
Typical metadata :
- exif
- icc profile
- GFIF (not sure)
- IPTC
- too fast
First, they chose a sample from various locations. Then they developed data collection, tested metadata stability and finally recorded the results (which platform it was shared on / embedded mData)
Result: software like Photoshop / Windows media viewer CHANGED the embedded mData. No use of such software for now.
Goal: see whether embedded mData travels with the image when uploaded to a social media platform. It does, but it gets changed; some of it doesn't survive. It's easy to change, as some fields are editable in a standard media viewer.
Some mData schemes are not readable by some social media platforms.
It raises more questions than answers, but it gives a clearer overview of the reusability of such media files.
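Embedded metadata lives in the file bytes themselves, which is why re-saving or re-encoding by a platform can alter it. As a small stdlib-only sketch, this function walks JPEG marker segments and reports whether an Exif APP1 block is present (it does not parse the Exif payload itself).

```python
def has_exif(jpeg_bytes: bytes) -> bool:
    """Walk JPEG marker segments and detect an embedded Exif APP1 block."""
    if jpeg_bytes[:2] != b"\xff\xd8":  # SOI marker: not a JPEG
        return False
    i = 2
    while i + 4 <= len(jpeg_bytes) and jpeg_bytes[i] == 0xFF:
        marker = jpeg_bytes[i + 1]
        length = int.from_bytes(jpeg_bytes[i + 2:i + 4], "big")
        # APP1 (0xE1) segments carrying Exif start with the "Exif\0\0" header
        if marker == 0xE1 and jpeg_bytes[i + 4:i + 10] == b"Exif\x00\x00":
            return True
        i += 2 + length  # skip the two marker bytes plus the segment body
    return False
```

A study like the one above could run such a check on the original file and on the copy downloaded from each platform to see what survived the upload.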

Designing and building interactive curation pipelines for natural hazards engineering data

(M. Esteva, Texas)

Natural hazards engineering research: large-scale experiments / simulation / hybrid simulation / etc.
DesignSafe-CI: large national initiative. Cyberinfrastructure platform: data management / data analysis / data curation / data publication > RD END-TO-END PLATFORM.
https://www.designsafe-ci.org/
Data challenges: changes across research steps / final results achieved through several iterations / large numbers of files, in different formats / multi-relational datasets / involved documentation / user roles: data creators VS re-users > quite conflicting roles
At first: only a curation/publishing module. They had to go through the whole R&D process (interviews / meetings / modelling research workflows + translating those workflows into front-end UI and back-end mechanisms)
User-centered design: choice of vocabularies made with researchers. Every type of experiment is made possible in the platform.
For the researchers, the boundaries between the steps of the DLC are quite blurry. They need their data AT ANY TIME. They don't think step by step: the software architecture needs to follow this mindset.
Interactive mockups were set up and tested by researchers. Their requirements, for instance: all models should have sensor information
Transition from active management to publication (they should be able to choose "when" they publish, which files, etc.)
mData: editable after active management => mData transformed into Dublin Core in the back-end
Browsing interface (categories / relations between data and process / relations between data and documentation) => They get a DOI at the end, through EZID
Evaluation: the curation process still seems a bit painful for researchers. Further automation should be implemented
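The back-end metadata transformation mentioned above can be pictured as a simple crosswalk from project-specific fields to Dublin Core terms. The field names and the mapping below are illustrative assumptions, not DesignSafe's actual schema.

```python
# Hypothetical crosswalk from experiment metadata to Dublin Core terms.
DC_MAP = {
    "experiment_title": "dc:title",
    "pi_name": "dc:creator",
    "completion_date": "dc:date",
    "sensor_notes": "dc:description",
}

def to_dublin_core(record: dict) -> dict:
    """Keep only mapped fields and rename them to their DC terms."""
    return {DC_MAP[key]: value for key, value in record.items() if key in DC_MAP}

published = to_dublin_core({
    "experiment_title": "Shake table test 42",
    "pi_name": "J. Doe",
    "internal_id": "xyz",  # unmapped internal fields are dropped at publication
})
```

Mapping to a standard vocabulary at publication time is what lets the rich, project-specific records stay editable during active management.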
-----------------------

Repository Services (Eliane)

Two libraries using one Texas data repository / A. Dabrowski, Texas
- ILSSI : provide workflow for dataset submission
- UTCT : visibility of data
Challenges:
- navigating relationships with stakeholders
- integrating with existing practices
- dealing with large quantities of data
- ensuring maintenance and continuation
Lessons learnt:
- labs face complicated social, policy and technical challenges
- further software and policy development
- consortial model is an advantage for sharing solutions
- scaling local solutions involves more engagement from liaison librarians
Future work:
- working with liaison librarians
- costmodels for preservation
- .... too fast!
From passive to active, from generic to focused: how can an institutional data archive remain relevant in a rapidly evolving landscape, M. J. Cruz, Delft University
- 4TU.Dataservice
- 10 years old https://data.4tu.nl/repository/
- DSA Certified
- 15 years long-term preservation ensured
- metadata is checked and improved
How can we remain relevant in the next years?
- repositories need to have a subject or format focus to remain relevant
--> netCDF: 90% of the data https://www.unidata.ucar.edu/software/netcdf/ : the standard in atmospheric science, climate science and geology
The presentation is online: https://zenodo.org/record/1175238#.WoxCJefjI2w
Some services which could be provided:
- visualisation services, data processing and data mining
- scalable data citation as recommended by: https://rd-alliance.org/group/data-citation-wg/outcomes/data-citation-recommendation.html
- training, advice and guidance
- outreach and international collaboration
if researchers want to share datasets in netCDF, TU Delft would be willing to collaborate
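A format-focused repository can cheaply validate deposits at ingest. For instance, netCDF classic files are identified by a 4-byte magic number ("CDF" plus a version byte), while netCDF-4 files are HDF5 containers with a different signature. A minimal sniffing check:

```python
def sniff_netcdf(header: bytes) -> str:
    """Classify a file header as netCDF classic, netCDF-4/HDF5, or unknown."""
    if header[:3] == b"CDF" and header[3:4] in (b"\x01", b"\x02"):
        return "netcdf-classic"   # \x01 = classic, \x02 = 64-bit offset
    if header[:8] == b"\x89HDF\r\n\x1a\n":
        return "netcdf4-or-hdf5"  # HDF5 signature; netCDF-4 files use it
    return "unknown"
```

A check like this only proves the container format; actual curation would still open the file and inspect its variables and attributes.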
Building open-source digital curation services and repositories at scale, R. Marciano, Maryland, USA
Digital curation service
Project: Digital Repository at scale: layering different frameworks together
http://dcic.umd.edu/10032016-introducing-open-source-platform-dras-tic/
Example: developing a cloud-based digital curation service: the NSF Brown Dog Project
Another example: creation of a testbed of justice, human rights, and cultural heritage collections
-----
-----
-----
DAY 2
-----
-----
-----

Remediation Data Management Plans: a Tool for Recovering Research Data from Messy, Messy Projects

Clara Llebot Lorente (OSU Libraries and Press, Oregon State University, USA)

Environmental project : "experimental forest"
WRC => they had no RDM at all. At the synthesis/analysis phase of their data, they started to think "hmmm... maybe we should have implemented some data management planning" and contacted the OSU Libraries
The library created a "remedial" rDMP
Difference between a classical DMP and an rDMP:
1. Audience : researchers for DMP / researchers OR administrators & manager for rDMP
2. Data inventory: this was the most difficult part because there were so many data files: http://watershedsresearch.org/
- Relational databases, tabular data, fish database, other digital data, physical samples
- First of all: group the data that belong together (like climate data and nutritional data)
- a looooot of data: 50,000 Excel files, 520,000 images, 334 databases, 9,000 documents, 17,000 PDFs
- Second: collect metadata about the data: subject, location, responsibilities, manager, versions, formats, documentation, sensitivity, sharing status
- Third: implementation is different; focus on priorities and not on guidelines (i.e. priority 1: clean, document, preserve in controlled datasets; priority 2: clean, document and preserve data associated with past publications; priority 3: triage data in shared drive folders)
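The first inventory step (counting what is actually on the shared drive, format by format) is easy to automate. A minimal sketch, assuming plain directory access:

```python
from collections import Counter
from pathlib import Path

def inventory(root: str) -> Counter:
    """Count files per extension under `root`, as a first data inventory."""
    return Counter(
        path.suffix.lower() for path in Path(root).rglob("*") if path.is_file()
    )
```

Running this over a shared drive gives the kind of headline numbers quoted above (so many spreadsheets, so many images) and a starting point for grouping related data.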
Problems: motivation of researchers, what are the priorities, what should be selected
Conclusion: DMPs can be used for remediation. But is this true?
------------------

A landscape survey of #ActiveDMPs

Sarah Jones, for an international team

Transform static documents into active, machine-actionable DMPs
During IDCC 2017, discussions about: 
- interoperability with research systems
- institutional perspective
- repository use cases
- evaluation & monitoring
- utilising PIDs
A white paper with the results of this workshop was published: https://riojournal.com/article/13086/
common standards
leveraging PIDs
capacity planning
increasing data discovery & reuse
..... too fast
GitHub user stories were created: https://github.com/RDA-DMP-Common/activeDMPs
A study was made in the framework of OpenAIRE: "what did H2020 DMP users prioritise?": https://zenodo.org/record/1120245#.Wo0_1-fjI2w
There are still WGs working on this matter:
- during the next RDA face-to-face meeting (March 2018 in Berlin)
- Force11 : Force2018 Montreal https://www.force11.org/meetings/force2018
- Australian DMP IG is another group
- Data Management Record in Queensland Australia
- DMPRoadmap https://github.com/DMPRoadmap/roadmap/wiki
- UQRDM https://guides.library.uq.edu.au/for-researchers/research-data-management
- ReDBox DLC
- Auckland DMP Tool
- Data Stewardship Wizard Elixir
- ezDMP IEDA
- Data planning tool, UNINETT Sigma2
- DMP Service OpenAIRE, EUDAT
To be read: https://www.slideshare.net/sjDCC/10-simple-rules-for-machineactionable-dmps
Website soon open: https://activedmps.org/
Slides will be on: http://doi.org/10.5821/zenodo.1174283
Mandate to integrate DMPs in the curriculum for students (an idea for EPFL?)
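To give a feel for what "machine-actionable" means in practice, here is a minimal DMP expressed as structured JSON rather than a static document. The field names are merely indicative of the direction the RDA DMP Common Standards WG was discussing at the time, not a final standard; all values are placeholders.

```python
import json

# Hypothetical machine-actionable DMP fragment (illustrative structure only).
madmp = {
    "dmp": {
        "title": "Example project DMP",
        "dataset": [
            {
                "title": "Survey responses",
                "distribution": [
                    {"format": "text/csv", "host": "https://example.org/repo"}
                ],
            }
        ],
    }
}

print(json.dumps(madmp, indent=2))
```

Because the plan is structured data, a repository or funder system can read the dataset list and formats directly instead of parsing prose.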
--------------------------------------------------
PARALLEL Sessions
-------------------------------------------------

Data Policy and guidance (Eliane)

Advancing policy and changes for graduate data management, Virginia Tech, Zhiwu Xie

- policy wording: encourage DMPs, ought to release, as widely as possible, must be provided
- it is necessary to use specific wording
- once data management is routine, it is no longer necessary to educate on it
- Librarians are not really convincing
- Read the book "Nudge"
- The university owns the data, but you can/should/ought to share it; however, if something goes wrong, you are responsible for it
The impact on authors and editors of introducing data availability statements at Nature journals, R. Grant, Springer Nature
Springer Nature has four policy types
- Sharing data and data citation is encouraged but not required
- Data sharing and evidence of data sharing encouraged
- Data sharing encouraged and statements of data availability required
- Data sharing not encouraged
 
Data availability statement (DAS)
- Templates are available
https://www.nature.com/news/announcement-where-are-the-data-1.20541
Editors were asked to record the time they needed to write a DAS
Processing times go up when data policies are made mandatory