Showing papers in "Biodiversity Informatics in 2013"

Open Access

Journal Article

Content assessment of the primary biodiversity data published through GBIF network: status, challenges and potentials

Samy Gaiji, Vishwas Chavan, Arturo H. Ariño¹, Javier Otegui¹, Donald Hobern, Rajesh Sood, Estrella Robles¹

University of Navarra¹

09 Jul 2013-Biodiversity Informatics

TL;DR: In this paper, the authors present a comprehensive assessment of the coverage of the content mobilised so far through GBIF, as well as a means to identify the existing gaps and reflect on fitness-for-use requirements.

Abstract: With the establishment of the Global Biodiversity Information Facility (GBIF) in 2001 as an inter-governmental co-ordinating body, concerted efforts were made during the past decade to establish a global research infrastructure to facilitate the sharing, discovery and access to primary biodiversity data. To date, the participants in GBIF have enabled the discovery of and access to over 267 million such data records. While this remarkable achievement in terms of volume of data must be acknowledged, concerns about the quality and ‘fitness-for-use’ of the data should also be carefully considered in future developments. This contribution is therefore a direct response to the calls for comprehensive content assessment of the GBIF mobilised data. It is the first comprehensive assessment of the coverage of the content mobilised so far through GBIF, as well as a means to identify the existing gaps and reflect on fitness-for-use requirements. This paper describes the complementary methodologies adopted by the GBIF Secretariat and University of Navarra for the development of a comprehensive content assessment. Outcomes of these research initiatives are summarised in four categories, namely, (a) data quality assessment, (b) trends/patterns assessment, (c) fitness-for-use assessment, and (d) ecosystem-specific data diversity assessment. In conclusion we make specific suggestions to the GBIF community on the adoption of common indicators to assess progress towards future targets as well as recommendations to populate such an exercise at various levels within the GBIF Network from national level to thematic levels.

68 citations

Journal Article

Bridging the biodiversity data gaps: Recommendations to meet users’ data needs

Daniel P. Faith, Ben Collen, Arturo H. Ariño, Patricia Koleff, John M. Guinotte, Jeremy Kerr, Vishwas Chavan

09 Jul 2013-Biodiversity Informatics

TL;DR: A broad range of recommendations are found from a content need assessment survey conducted by the Global Biodiversity Information Facility (GBIF), principally concerning issues such as data quality, bias, and coverage, and extending ease of access.

Abstract: A strong case has been made for freely available, high quality data on species occurrence, in order to track changes in biodiversity. However, one of the main issues surrounding the provision of such data is that sources vary in quality, scope, and accuracy. Therefore publishers of such data must face the challenge of maximizing quality, utility and breadth of data coverage, in order to make such data useful to users. Here, we report a number of recommendations that stem from a content need assessment survey conducted by the Global Biodiversity Information Facility (GBIF). Through this survey, we aimed to distil the main user needs regarding biodiversity data. We find a broad range of recommendations from the survey respondents, principally concerning issues such as data quality, bias, and coverage, and extending ease of access. We recommend a candidate set of actions for the GBIF that fall into three classes: 1) addressing data gaps, data volume, and data quality, 2) aggregating new kinds of data for new applications, and 3) promoting ease-of-use and providing incentives for wider use. Addressing the challenge of providing high quality primary biodiversity data can potentially serve the needs of many international biodiversity initiatives, including the new 2020 biodiversity targets of the Convention on Biological Diversity, the emerging global biodiversity observation network (GEO BON), and the new Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES).

34 citations

Journal Article

Discovery and publishing of primary biodiversity data associated with multimedia resources: The Audubon Core strategies and approaches

Robert A. Morris¹, Vijay Barve, Mihail Carausu, Vishwas Chavan², José Cuadra², Chris Freeland³, Gregor Hagedorn⁴, Patrick Leary, Dimitry Mozzherin, Annette Olson⁵, G. Riccardi⁶, Ivan Teage⁷, Greg Whitbread

University of Massachusetts Boston¹, Global Biodiversity Information Facility², Missouri Botanical Garden³, Museum für Naturkunde⁴, United States Geological Survey⁵, Florida State University⁶, American Museum of Natural History⁷

09 Jul 2013-Biodiversity Informatics

TL;DR: The Audubon Core Multimedia Resource Metadata Schema is a representation-free vocabulary for the description of biodiversity multimedia resources and collections that seeks to lighten the burden for providing or using multimedia useful for biodiversity science.

Abstract: The Audubon Core Multimedia Resource Metadata Schema is a representation-free vocabulary for the description of biodiversity multimedia resources and collections, now in the final stages as a proposed Biodiversity Information Standards (TDWG) standard. By defining only six terms as mandatory, it seeks to lighten the burden for providing or using multimedia useful for biodiversity science. At the same time it offers rich optional metadata terms that can help curators of multimedia collections provide authoritative media that document species occurrence, ecosystems, identification tools, ontologies, and many other kinds of biodiversity documents or data. About half of the vocabulary is re-used from other relevant controlled vocabularies that are often already in use for multimedia metadata, thereby reducing the mapping burden on existing repositories. A central design goal is to allow consuming applications to have a high likelihood of discovering suitable resources, reducing the human examination effort that might be required to decide if the resource is fit for the purpose of the application.

21 citations

Journal Article

Assessment of user needs of primary biodiversity data: analysis, concerns, and challenges

Arturo H. Ariño¹, Vishwas Chavan², Daniel P. Faith³

University of Navarra¹, Global Biodiversity Information Facility², Australian Museum³

09 Jul 2013-Biodiversity Informatics

TL;DR: Analysis of the responses showed some lack of awareness about the availability of accessible primary data, and pointed out some types of data in high demand for linking to distribution and taxonomical data now derived from the GBIF cache.

Abstract: A Content Needs Assessment (CNA) survey has been conducted in order to determine what GBIF-mediated data users may be using, what they would be using if available, and what they need in terms of primary biodiversity data records. The survey was launched in 2009 in six languages, and collected more than 700 individual responses. Analysis of the responses showed some lack of awareness about the availability of accessible primary data, and pointed out some types of data in high demand for linking to distribution and taxonomical data now derived from the GBIF cache. A notable example was linkages to molecular data. Also, the CNA survey uncovered some biases in the design of user needs surveys, by showing demographic and linguistic effects that may have influenced the distribution of responses received in analogous surveys conducted at the global scale.

19 citations

Journal Article

On the dates of GBIF mobilised primary biodiversity records

Javier Otegui¹, Arturo H. Ariño¹, Vishwas Chavan², Samy Gaiji²

University of Navarra¹, Global Biodiversity Information Facility²

09 Jul 2013-Biodiversity Informatics

TL;DR: The types of errors and misprocessing in dates through the sources and the published records are analysed; their impact on the overall data quality of the published index is assessed, and corrective measures are suggested.

Abstract: There are more than 267 million primary biodiversity data records published by hundreds of data publishers through the GBIF network. Thus, the GBIF network is the single most comprehensive index for this kind of data. Ensuring or, at least, assessing data quality is of capital importance for the reliability and usability of this data. While conducting a time data gap analysis on this mass of data, we have detected some issues with the way date information is processed and shared. Dates can be obscured or altered under certain circumstances, when a specific combination of publisher’s error or date handling features, and faulty or inadequate date parsing and processing routines get chained together. The extent of the date unreliability (either at the source or through the GBIF portal) is not high, and further it is concentrated in a few data publishers. We analyse the types of errors and misprocessing in dates through the sources and the published records; assess their impact on the overall data quality of the published index; and suggest corrective measures.

14 citations