How JATS XML Can Positively Impact Publishers
Academic Publishing

Arunav Moitra
Publishers may be missing out on valuable indexing opportunities. As long as they publish journal articles only in human-readable formats such as PDF, those articles cannot achieve their full potential and reach. JATS XML is increasingly being adopted as the industry’s standard format for indexing, so publishing articles in XML is fast becoming a prerequisite for any publisher.

Publishers often see the switch to JATS XML as an uphill task, which is probably the biggest reason for slow adoption. It is a tough decision to make, but it brings benefits, the most significant being content standardization. A publisher that decides to integrate JATS XML into its workflow must first choose which tag set to use: Archiving, Publishing, or Authoring. It must then settle on a table model, a citation model, and a tagging style.

In scholarly publishing, a publisher should therefore review the platforms and services built on this global model and choose what works best for it. Doing this before moving from a proprietary content model to JATS maximizes the benefit to the business and its finances.

Let us move on to understand the basics of XML and JATS.

JATS XML In A Nutshell

There are two components in JATS XML. Understanding both of them is extremely important. Let us start with XML first.

XML stands for Extensible Markup Language. It is a markup language that encodes content in a way that is both human-readable and machine-readable, and independent of layout. The same XML source can therefore be reused to produce a variety of output formats such as PDF and HTML. XML markup enhances the usability and visibility of research content: it makes documents highly discoverable, easily accessible, and well suited to storage. It also allows text mining and opens the door to content enrichment through multimedia and semantic tagging.
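
To make the idea of layout independence concrete, here is a minimal Python sketch that renders one XML source as both plain text and HTML. The markup is a simplified, hypothetical example (it is not JATS), and the function names to_plain_text and to_html are our own.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, simplified article markup (not JATS), for illustration only.
SOURCE = """<article>
  <title>Soil Carbon in Dry Climates</title>
  <para>Carbon stocks vary with rainfall.</para>
  <para>Sampling depth matters &amp; is often underreported.</para>
</article>"""


def to_plain_text(root):
    """Render the title (upper-cased) and paragraphs as plain text."""
    lines = [root.findtext("title").upper()]
    lines += [para.text for para in root.findall("para")]
    return "\n".join(lines)


def to_html(root):
    """Render the same content as a small HTML fragment."""
    parts = ["<h1>" + escape(root.findtext("title")) + "</h1>"]
    parts += ["<p>" + escape(para.text) + "</p>" for para in root.findall("para")]
    return "\n".join(parts)


root = ET.fromstring(SOURCE)
plain = to_plain_text(root)
html_fragment = to_html(root)
print(plain)
print(html_fragment)
```

The content is written once, and each output format is just a different function applied to the same tree. A production pipeline would more likely use XSLT or a dedicated rendering tool, but the principle is the same.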

JATS stands for Journal Article Tag Suite. It is an international standard XML vocabulary: a named collection of XML elements and attributes that mark up the structure and semantics of a single journal article as journals are published today. JATS grew out of the journal article DTDs created by the National Center for Biotechnology Information (NCBI) at the US National Library of Medicine. The National Information Standards Organization (NISO) later standardized it as Z39.96, with approval from the American National Standards Institute (ANSI). Scientific, technical, engineering, and medical journals were the first to use the JATS standard, but journals in the humanities, sociology, economics, and the soft sciences now use JATS XML markup as well.
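
As a rough illustration of what "elements and attributes" means in practice, the Python sketch below parses a heavily trimmed article that uses JATS element names. The DOI, title, and text are invented for the example, and a real JATS file carries far more metadata than this.

```python
import xml.etree.ElementTree as ET

# A heavily trimmed, hypothetical article using JATS element names.
# The DOI and title are invented for illustration.
JATS_SNIPPET = """<article article-type="research-article">
  <front>
    <article-meta>
      <article-id pub-id-type="doi">10.1234/example.2024.001</article-id>
      <title-group>
        <article-title>Soil Carbon in Dry Climates</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>Carbon stocks vary with rainfall.</p>
    </sec>
  </body>
</article>"""

article = ET.fromstring(JATS_SNIPPET)

# Attributes qualify elements: the article type, and which kind of ID this is.
article_type = article.get("article-type")
doi = article.findtext(".//article-id[@pub-id-type='doi']")

# Elements carry the structure: the article title and the section titles.
title = article.findtext(".//article-title")
section_titles = [sec.findtext("title") for sec in article.iter("sec")]

print(article_type, doi, title, section_titles)
```

Because every JATS article uses the same element names, the same few lines work on any publisher’s file, which is what makes the format attractive to indexers.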

JATS XML: A Prerequisite For Publishers

Since its inception, publishers have found JATS useful for producing articles and preprints and for quality assurance testing. Today, JATS also helps in building large journal repositories, and publishers produce new content in JATS for interchange between organizations. Some big publishing houses still use their own custom tag sets and are slow to accept the global standard. Medium and small publishers, however, are encoding their new journals in JATS and converting their backfiles from PDF or proprietary tag sets into JATS. Public archives such as libraries and scholarly portals prefer to receive article data from different sources in one standard format, and private and commercial archives may require JATS for ingestion. Numerous web-hosting and service vendors support JATS too. In effect, the scholarly communities have converged on a single data format for publishing journal content.

Why Should Publishers Use JATS?

  • Integrating JATS XML into OJS (Open Journal Systems) can help publishers automate full-text publishing for their journals.
  • When journal publishers use JATS, Google Scholar and other indexing services can process their articles more easily, which improves SEO for the journal site.
  • Indexing services and third-party online repositories, such as PubMed Central, require or prefer JATS.

Advantages of JATS for Journal Articles And Publishers (A Broader View)

  • Declarative: JATS markup is structural and declarative rather than presentational or behavioral. This makes journal articles easier to process and helps ensure the longevity of the data.
  • Designed for Articles: JATS is an XML model that fits the way journals and preprints are published today.
  • Documented Tag Set: Extensive Tag Libraries, with explanations and examples of element and attribute usage, are available online, as are many best-practice recommendations.
  • Low Cost: Tag sets, Tag library documentation, tagged examples, and some tools for QA and output production are available for free.
  • Not a Static Standard: JATS changes as publishers and other users request new features or modifications.

Why XML Metadata Matters More Than Ever

The primary objective of a journal article is to have its contents disseminated as widely as possible. This helps in reaching as many people in as many ways as possible. Ultimately, it serves the larger goal of advancing science and humanities research further.

To meet these objectives, a journal article has to pass through many places: peer review systems (often several times), the publisher’s website, databases, aggregators, search engines, discovery platforms, partner publishers, repositories, archives, and more. Given the huge volume of journal articles produced every day, achieving wide dissemination is a herculean task. Here, metadata comes to the rescue. Article metadata gives the gatekeepers of key article destinations assertions about the provenance, authorship, ownership, access, funding, and relevance of an article, all of which help it travel smoothly from one destination to the next.
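
The Python sketch below suggests how a downstream system might read such assertions from the front matter of a JATS file. The element names are from JATS, but the sample data (names, DOI, funder, award number, and the placeholder ORCID iD) is invented, and the summarize function is a hypothetical helper rather than part of any real indexing service.

```python
import xml.etree.ElementTree as ET

# Hypothetical front matter using JATS element names; all values are invented.
FRONT = """<front>
  <article-meta>
    <article-id pub-id-type="doi">10.1234/example.2024.001</article-id>
    <contrib-group>
      <contrib contrib-type="author">
        <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0000-0000-0000</contrib-id>
        <name><surname>Rao</surname><given-names>Meera</given-names></name>
      </contrib>
      <contrib contrib-type="author">
        <name><surname>Okafor</surname><given-names>Chidi</given-names></name>
      </contrib>
    </contrib-group>
    <funding-group>
      <award-group>
        <funding-source>Example Science Foundation</funding-source>
        <award-id>ESF-0001</award-id>
      </award-group>
    </funding-group>
  </article-meta>
</front>"""


def summarize(front):
    """Collect identifier, authorship, and funding assertions into a dict."""
    authors = []
    for contrib in front.iter("contrib"):
        if contrib.get("contrib-type") != "author":
            continue
        given = contrib.findtext("name/given-names")
        surname = contrib.findtext("name/surname")
        authors.append({
            "name": given + " " + surname,
            # None when the author has no ORCID iD recorded.
            "orcid": contrib.findtext("contrib-id[@contrib-id-type='orcid']"),
        })
    funders = [
        (group.findtext("funding-source"), group.findtext("award-id"))
        for group in front.iter("award-group")
    ]
    return {
        "doi": front.findtext(".//article-id[@pub-id-type='doi']"),
        "authors": authors,
        "funders": funders,
    }


summary = summarize(ET.fromstring(FRONT))
print(summary)
```

Note that the second author has no ORCID iD, so the sketch records None for it; gaps like this are exactly what a machine reader sees when metadata is incomplete.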

Moreover, in scholarly publishing today, virtually every article is read by a machine first. The research content reaches humans only after a machine has processed its metadata. There is also a growing need to standardize content for improved interoperability, which in turn increases the need for persistent identifiers and other new forms of metadata. The objective is to capture all of these within an article’s XML.

Typeset is an integrated platform for writing, collaborating on, and publishing research papers. Over the years it has evolved into a go-to destination for academic writing tools: it helps researchers format content, serves publishers as a conversion tool, and gives institutions a way to disseminate their research. Its intuitive online collaborative writing platform helps researchers author papers faster.

Why Use Typeset’s Services For Conversion To JATS XML?

Typeset is an automated typesetting platform that automates referencing, corrections, and the conversion of human-readable formats into machine-readable ones. Its proprietary technology uses machine learning to capture and analyze data sets, which helps make publication workflows more efficient. Publishers can use Typeset to automatically convert author submissions to any publication format, including JATS XML.

Typeset differs from other conversion methods in that it is more convenient, quicker, and automated. Automation reduces manual errors, avoids the endless back-and-forth with third-party vendors, and keeps the conversion from burning a hole in your pocket.

Learn more about Typeset and Typeset for Publishers.

Final Thoughts

From a technology perspective, JATS leaves publishers free to develop their own presentation, delivery, and distribution of content objects. An article’s metadata, body text, figures, tables, and references all rest on the same XML content model, which is what content standardization across platforms means. JATS does not try to lead or coerce publishers; rather, it strives to capture common journal practices. The JATS format also largely preserves the existing text order, or reading sequence, of an article.

The idea behind JATS is to be straightforward, so that anyone’s journal articles can be encoded in the JATS format. Put simply, JATS follows an age-old document model: metadata about the article comes first, followed by the body and then the back matter. This model fits current journal production. Publishers that are not yet using JATS XML should start working toward it, because it has the potential to change the publishing industry for the better.
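
The document model described above (metadata first, then body, then back matter) can be sketched as a simple ordering check in Python. The element names front, body, and back are from JATS; the function follows_document_model is a hypothetical helper, it treats front as required, and it ignores the other top-level elements JATS allows. It is not a substitute for validating a file against the official JATS DTD or schema.

```python
import xml.etree.ElementTree as ET

# Metadata first, then the body, then the back matter.
EXPECTED_ORDER = ["front", "body", "back"]


def follows_document_model(xml_text):
    """Return True if front, body, and back appear in the expected order.

    This sketch requires <front> and treats <body> and <back> as optional.
    """
    root = ET.fromstring(xml_text)
    parts = [child.tag for child in root]
    present = [tag for tag in parts if tag in EXPECTED_ORDER]
    expected = [tag for tag in EXPECTED_ORDER if tag in parts]
    return "front" in parts and present == expected


# Two hypothetical, nearly empty articles used only to exercise the check.
GOOD = "<article><front/><body/><back/></article>"
BAD = "<article><body/><front/></article>"

print(follows_document_model(GOOD))
print(follows_document_model(BAD))
```

The second sample fails the check because its body comes before its metadata, which is the opposite of the reading order the model expects.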

