Digital data, services and infrastructure

Outstanding result / EaSy Data collects ‘long-tail’ data on the Earth and the environment

Set up in 2023, EaSy Data is the national thematic warehouse for so-called ‘orphan’ or ‘long-tail’ data on the environment and the Earth system. Forming part of the Data Terra research infrastructure and supported by BRGM, EaSy Data seeks to capitalise on a whole range of public research data in the field of Earth and environmental sciences.

The aim of the "EaSy Data" national data warehouse is to centralise, organise and share the large quantity of so-called orphan data on the environment and the Earth system. © BRGM

Unlike observational data, which are structured by nature, a significant proportion of the data generated by public research into the Earth system and the environment is not organised and/or shared. These data are not systematically archived in warehouses and are not sufficiently documented. Referred to as ‘orphan’ or ‘long-tail’ data, they are nevertheless of strategic importance: the goal is to build on the results of the research behind them.

This is a broad issue that covers all areas of scientific research. The Ministry of Higher Education and Research encourages the opening up and sharing of data, publications and source codes in every field through its national plan for open science, and in Earth sciences the issue has now been addressed.

Wide and compulsory dissemination of public research data 

On 6 November 2023, the Ministry of Higher Education and Research inaugurated EaSy Data, France's national thematic warehouse for orphan or long-tail data on the environment and the Earth system.

Supported by the Data Terra national research infrastructure, this repository is operated by BRGM. More broadly, it is part of the national plan for open science initiated in 2018, whose purpose is to structure initiatives to promote the opening up and sharing of data, publications and source codes from publicly funded projects.

A data warehouse developed by a virtual project team 

Warehouses of this type allow researchers to store and reference the data from their work. A national platform, Recherche Data Gouv (RDG), has been created to bring these data together, and EaSy Data is one of its first components.

EaSy Data has been made available to the Earth system scientific community to help it address major environmental issues such as climate change, water resources, natural hazards and sustainable energy. It will allow data to be compared, reused, shared and rediscovered.

This data warehouse applies ISO 19115, the international metadata standard for geographic information. The cataloguing tool used is GeoNetwork, an open source project. Data are stored in the BRGM data centre. The project involved a team of almost 20 people whose expertise was key to the developments carried out with the support of the infrastructure teams.

A 'virtual' steering team held meetings over a period of two years to set up the project: Véronique Bertrand, CNRS - Epos-France; Hélène Bressan, BRGM; Christelle Pierkot, CNRS - DataTerra; and Marine Vernet, IFREMER - DataTerra. An application overlay tailored to the needs of researchers has been developed to facilitate data entry. A moderation team made up of data centre scientists and volunteers is also on hand to ensure that deposits fall within the defined scope.

Encouraging feedback from data depositors

The data warehouse is a clear success for BRGM, as well as a great human story about four women who worked on EaSy Data over a period of two years as part of an exclusively virtual approach. 

Researchers immediately began to deposit data. Feedback has been encouraging, and the warehouse has already allocated over 20 persistent digital object identifiers (DOIs), showing that it is meeting a real need among different communities. While depositing data is not compulsory, the practice is widely encouraged, as it is clearly in each researcher's interest to share their work, improve data citation and contribute to more reproducible research. 

New prospects are already emerging, for example efforts to improve access and work on vocabulary and semantics to make depositing data easier. The implementation of the EaSy Data warehouse reflects the actions carried out by BRGM as part of its open science policy.

The data entry interface for the dataset description allows users to enter the necessary information easily, in French and/or English. Depositors identify themselves using their ORCID or Renater ID, or by creating an account. © BRGM

Hélène Bressan, Vocabulary and data quality project manager
EaSy Data is part of the National Plan for Open Science, which seeks to open up and share data, publications and source codes.