How Would Publishers Benefit From FAIR Data Principles?

Arunav Moitra

The present scenario has pushed industries and businesses to accelerate their digital transformation to avoid ongoing disruptions. To keep up, publishers should also go beyond conventional publishing workflows and cultivate interoperable metadata and data sharing standards. These initiatives would support the swift and extensive publication of journal articles and facilitate the reuse and republication of essential research. The first step is to improve metadata quality, data sharing practices, and standards, which publishers can do by implementing the FAIR data principles in their academic publishing workflows. FAIR means making research data and metadata "Findable," "Accessible," "Interoperable," and "Reusable."

In this pandemic era, the need for machine-readable FAIR data has never been more evident. It brings immense benefits to the publisher’s table. The usage of quality metadata, open data, and machine-readable text-and-data mining-supported journal articles is already showing results. It is playing a remarkable role in coronavirus research linking and discovery tools. In like manner, FAIR data principles have been crucial in establishing new publishing initiatives and artificial intelligence tools to facilitate rapid research publications.

However, even with the significant advances made in research aggregation and data analysis, the pandemic has revealed just how unFAIR the majority of metadata outputs and datasets remain. The primary reasons include inconsistencies in publishers' article-level metadata, a lack of data sharing policies, and the need for greater standardization of metadata and dataset collection among repositories, along with interoperability between them.

Metadata creators, curators, custodians, and consumers all play an important role in fulfilling existing metadata guidelines and developing new ones, and awareness of this role needs to be raised. New initiatives are being taken up and continue to gain momentum. Publishers and other stakeholders now appear to be putting more resources towards data sharing and metadata standardization.

The FAIR data principles benefit many stakeholders, including:

  • Researchers wanting to share and reuse experimental data,
  • Scientific publishers and funding agencies for long-term data stewardship,
  • Software providers for data management, analysis, and processing, and
  • The data science community, which uses new and existing data to advance discovery.

That was a brief introduction to FAIR data. Now let’s look at what FAIR data means more broadly.

What Is FAIR Data?

There has long been a need for a set of principles to govern the discovery, management, and reuse of scientific data. To address this need, a group of prominent scientists formulated the FAIR data principles, which aim to make data more valuable to researchers and scientists.

These fundamental principles state that all research objects should be:

“Findable, Accessible, Interoperable and Reusable (FAIR) for both machines and people”.

The emphasis is on making data machine-actionable, that is, understandable to machines. The FAIR data principles strengthen the ability of machines to find and use data automatically. They also support the reuse of research data in further publications. Thus, they help various stakeholders with data management, data sharing, and data reuse.

The FAIR data principles demand that data and metadata be easily found, accessed, understood, exchanged, and reused. They have four basic components:

  1. Findable: The first aspect is that data must be "Findable." This means data and metadata must be assigned a globally unique and persistent identifier so that computers can easily find them. In addition, metadata and data are registered or indexed as searchable resources.
  2. Accessible: The second aspect is that data must be "Accessible." This means data can be retrieved by their identifier via a standardized protocol that is open, free, and universally implementable.
  3. Interoperable: The third aspect is that data must be "Interoperable." This means data must use a formal, accessible, shared, and broadly applicable language for knowledge representation, which allows them to be integrated with other data sources without ambiguity.
  4. Reusable: The last aspect is that data must be "Reusable." This means data can be further used and repurposed by machines. To achieve this goal, data need detailed provenance and a rich description of metadata attributes.
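
As a rough illustration of the four components, the sketch below builds a small machine-readable metadata record in Python. The field names, the DOI, and the repository URL are placeholders invented for this example; they do not come from any particular metadata standard.

```python
import json

# Hypothetical metadata record: every field name and value is a placeholder.
record = {
    # Findable: a globally unique, persistent identifier (placeholder DOI).
    "identifier": "https://doi.org/10.1234/example-dataset",
    "title": "Example dataset",
    # Accessible: retrievable through a standard, open protocol (HTTPS).
    "access_url": "https://repository.example.org/datasets/42",
    # Interoperable: an open format and shared keywords.
    "format": "text/csv",
    "keywords": ["virology", "genomics"],
    # Reusable: a clear license and a provenance statement.
    "license": "CC-BY-4.0",
    "provenance": "Derived from survey responses collected in 2020",
}


def to_machine_readable(rec):
    """Serialize the record as JSON so that software can parse it."""
    return json.dumps(rec, indent=2, sort_keys=True)


print(to_machine_readable(record))
```

A real publishing workflow would normally follow an established metadata schema instead of ad hoc field names like these.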

Now, let’s dig deeper into what these FAIR data principles are.

What Are The FAIR Data Principles

The FAIR data principles outline considerations for modern data publishing, supporting both manual and automated deposition, exploration, sharing, and reuse. They describe robust data practices concisely and without relying on domain-specific knowledge, so they apply to a wide range of scholarly outputs and research content. The elements of the FAIR principles are related but independent and separable. Together, they describe the critical attributes that present-day data resources, infrastructures, vocabularies, tools, and systems must demonstrate to facilitate visibility and reusability by various stakeholders.

So, here are the pointers explaining the FAIR data principles.

To make data Findable, data objects:

  • Must be findable again at any later point in time
  • Should be persistent, with particular emphasis on their metadata
  • Must contain basic machine-actionable metadata that distinguishes them from other data objects
  • Must have identifiers that are unique and persistent

To make data Accessible, it must be possible to obtain data objects:

  • By appropriate authorization
  • Through a well-defined protocol
  • In a way that lets users judge the actual accessibility of each data object

Data objects can be Interoperable when;

  • Metadata is machine-actionable or machine-readable
  • Metadata formats utilize shared vocabularies and tools
  • Metadata within the data object is syntactically parseable and semantically machine-accessible

For data objects to be Reusable:

  • They should comply with the rest of the principles
  • Metadata should be sufficiently well-described and rich that it can be automatically integrated
  • Published data objects should refer to their sources with rich metadata to enable citation

These pointers elucidate how data objects must be modified to make them FAIR and benefit the publishers.
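
Some of these pointers can be checked automatically. The sketch below is one possible approach, not an established tool: it flags identifiers that are reused across records and lists records that lack basic metadata. The `REQUIRED_FIELDS` tuple and the sample records are assumptions made for illustration only.

```python
from collections import Counter

# Assumed minimum set of fields; this is not an official requirement list.
REQUIRED_FIELDS = ("identifier", "title", "license")


def audit(records):
    """Return (duplicate identifiers, {record index: missing fields})."""
    counts = Counter(r["identifier"] for r in records if r.get("identifier"))
    duplicates = sorted(ident for ident, n in counts.items() if n > 1)
    missing = {}
    for index, rec in enumerate(records):
        absent = [field for field in REQUIRED_FIELDS if not rec.get(field)]
        if absent:
            missing[index] = absent
    return duplicates, missing


# Made-up sample records with placeholder identifiers.
records = [
    {"identifier": "doi:10.1234/a", "title": "Dataset A", "license": "CC-BY-4.0"},
    {"identifier": "doi:10.1234/a", "title": "Dataset B", "license": "CC0-1.0"},
    {"identifier": "", "title": "Dataset C"},
]

duplicates, missing = audit(records)
print("Identifiers used more than once:", duplicates)
print("Records with missing fields:", missing)
```

Here the first two records share an identifier, so it is not unique, and the third record has neither an identifier nor a license.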

Let’s explore the benefits of FAIR data.

Why Is FAIR Data Important?

The FAIR data principles enhance the overall value of the data. By using unique identifiers, finding data or sets of data becomes effortless and straightforward. Further, it helps in combining and integrating the data. This kind of data is less challenging to reuse, repurpose, and share as machines can understand everything about the data. Moreover, FAIR data expedites research, encourages cooperation and assists in reusing scientific research. Thus, it helps in achieving maximum impact from research.

The importance of FAIR data is evident from the fact that, in 2016, the G20 (Group of Twenty) leaders voiced their support for open science and research based on the FAIR principles. Likewise, the European Union (EU) is embracing these principles and constituted an expert group to report on turning the FAIR data principles into present-day reality. Similarly, the Office of Science at the Department of Energy in the United States announced US $8.5 million for new research and development aimed at advancing the FAIR data principles in Artificial Intelligence (AI).

FAIRness Is A Prerequisite For Proper Data Management And Data Stewardship

The FAIR data principles reflect, combine, and build upon machine-actionability and the harmonization of data structures and semantics. They focus on citable primary scholarly data and on its discoverability and availability for reuse. Further, they emphasize the capability to support more rigorous scholarship.

The FAIR principles help the academic community manage and steward its valuable digital resources more rigorously. They provide a set of mileposts for data producers and publishers, guiding the implementation of good data management and stewardship practices. This in turn helps researchers adhere to their funding agencies' expectations and requirements. It is therefore crucial for data producers and publishers to examine and implement these principles. If we work together towards shared goals, the valuable data produced by our community will gradually become FAIR. Proper data management and stewardship is thus not the ultimate goal; rather, it is a basic precondition that enables and supports innovation, future-proof research, and knowledge discovery.

But how do you identify whether data is FAIR?

Here is a checklist that helps you find out:

Are Your Data FAIR? — Checklist For Identifying FAIR Data

Use the following checks to determine whether your data is FAIR (Findable, Accessible, Interoperable, Reusable):

Findable

  1. [ ] It must be possible for others to discover your data.
  2. [ ] Rich metadata should be available online in a searchable resource such as a catalogue or data repository.
  3. [ ] Data should be assigned a persistent identifier.

Accessible

  1. [ ] It should be possible for humans and machines to gain access to your data.
  2. [ ] The protocol by which data can be retrieved follows recognized standards.
  3. [ ] The access procedure includes authentication and authorization steps, if necessary.
  4. [ ] Metadata are accessible wherever possible, even if the data aren't.

Interoperable

  1. [ ] Data and metadata are provided in commonly understood open formats and standards so that they can be combined and exchanged.
  2. [ ] The metadata provided follows relevant standards.
  3. [ ] Controlled vocabularies, keywords, thesauri or ontologies are used where possible.
  4. [ ] Data and metadata have qualified links and references.

Reusable

  1. [ ] Data must have relevant attributes.
  2. [ ] Data are accurate and clearly described.
  3. [ ] Data have a comprehensible and accessible usage license.
  4. [ ] Complete information on the data’s history and lineage is provided.
  5. [ ] The data and metadata must fulfill the appropriate domain standards.
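
For teams that want to track this checklist in a script, the following is a minimal sketch of one way to record which items a dataset satisfies and to summarize the result per principle. The item labels are shortened paraphrases of the checklist above, and the `summarize` function and the `example` answers are hypothetical.

```python
# Shortened labels for the checklist items; the wording is abbreviated here.
CHECKLIST = {
    "Findable": [
        "discoverable",
        "rich metadata in a searchable resource",
        "persistent identifier",
    ],
    "Accessible": [
        "human and machine access",
        "standard protocol",
        "authentication where needed",
        "metadata stays accessible",
    ],
    "Interoperable": [
        "open formats",
        "metadata standards",
        "controlled vocabularies",
        "qualified links",
    ],
    "Reusable": [
        "relevant attributes",
        "accurate and clearly described",
        "usage license",
        "provenance",
        "domain standards",
    ],
}


def summarize(ticked):
    """Count satisfied items per principle.

    `ticked` maps a principle name to the set of item labels that are met.
    """
    summary = {}
    for principle, items in CHECKLIST.items():
        met = ticked.get(principle, set())
        satisfied = [item for item in items if item in met]
        summary[principle] = (len(satisfied), len(items))
    return summary


# Made-up answers for a single dataset.
example = {
    "Findable": {"discoverable", "persistent identifier"},
    "Accessible": set(CHECKLIST["Accessible"]),
}

for principle, (done, total) in summarize(example).items():
    print(f"{principle}: {done}/{total}")
```

Since FAIR is a set of principles and not a standard, a per-principle count like this is only a rough self-assessment aid.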

Conclusion

As the scholarly publishing industry looks towards a FAIR future, creating a unified framework for data and metadata will be challenging, and our knowledge is constantly changing as we dive deeper into complex technology. However, life science research and healthcare stand to benefit significantly from the data integration that the FAIR principles make possible.

To advance science and research, embracing the FAIR principles is the need of the hour for publishers. The research and academic community would then benefit immensely from the potential of data and metadata. At the same time, we must remember two things. First, FAIR is a set of principles, not a standard. Second, data can meet the FAIR principles while remaining private or being shared only under certain restrictions; following the principles doesn't mean that everyone has to share their data openly. But making data FAIR does add significant value to the modern publishing lifecycle.