
Showing papers in "Biodiversity Informatics in 2013"


Journal Article
TL;DR: In this paper, the authors present a comprehensive assessment of the coverage of the content mobilised so far through GBIF, as well as a means to identify the existing gaps and reflect on fitness-for-use requirements.
Abstract: With the establishment of the Global Biodiversity Information Facility (GBIF) in 2001 as an inter-governmental co-ordinating body, concerted efforts were made during the past decade to establish a global research infrastructure to facilitate the sharing and discovery of, and access to, primary biodiversity data. To date, the participants in GBIF have enabled the discovery of and access to more than 267 million such data records. While this remarkable achievement in terms of data volume must be acknowledged, concerns about the quality and ‘fitness-for-use’ of the data should also be carefully considered in future developments. This contribution is therefore a direct response to the calls for a comprehensive content assessment of the GBIF-mobilised data. It is the first comprehensive assessment of the coverage of the content mobilised so far through GBIF, as well as a means to identify the existing gaps and reflect on fitness-for-use requirements. This paper describes the complementary methodologies adopted by the GBIF Secretariat and the University of Navarra for the development of a comprehensive content assessment. Outcomes of these research initiatives are summarised in four categories, namely, (a) data quality assessment, (b) trends/patterns assessment, (c) fitness-for-use assessment, and (d) ecosystem-specific data diversity assessment. In conclusion, we make specific suggestions to the GBIF community on the adoption of common indicators to assess progress towards future targets, as well as recommendations to extend such exercises to various levels within the GBIF Network, from the national level to thematic levels.

68 citations


Journal Article
TL;DR: A content needs assessment survey conducted by the Global Biodiversity Information Facility (GBIF) yields a broad range of recommendations, principally concerning data quality, bias, and coverage, and extending ease of access.
Abstract: A strong case has been made for freely available, high quality data on species occurrence, in order to track changes in biodiversity. However, one of the main issues surrounding the provision of such data is that sources vary in quality, scope, and accuracy. Therefore, publishers of such data face the challenge of maximizing the quality, utility and breadth of data coverage in order to make the data useful to users. Here, we report a number of recommendations that stem from a content needs assessment survey conducted by the Global Biodiversity Information Facility (GBIF). Through this survey, we aimed to distil the main user needs regarding biodiversity data. We find a broad range of recommendations from the survey respondents, principally concerning issues such as data quality, bias, and coverage, and extending ease of access. We recommend a candidate set of actions for the GBIF that fall into three classes: 1) addressing data gaps, data volume, and data quality, 2) aggregating new kinds of data for new applications, and 3) promoting ease-of-use and providing incentives for wider use. Addressing the challenge of providing high quality primary biodiversity data can potentially serve the needs of many international biodiversity initiatives, including the new 2020 biodiversity targets of the Convention on Biological Diversity, the emerging global biodiversity observation network (GEO BON), and the new Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES).

34 citations


Journal Article
TL;DR: The Audubon Core Multimedia Resource Metadata Schema is a representation-free vocabulary for the description of biodiversity multimedia resources and collections that seeks to lighten the burden for providing or using multimedia useful for biodiversity science.
Abstract: The Audubon Core Multimedia Resource Metadata Schema is a representation-free vocabulary for the description of biodiversity multimedia resources and collections, now in the final stages as a proposed Biodiversity Informatics Standards (TDWG) standard. By defining only six terms as mandatory, it seeks to lighten the burden for providing or using multimedia useful for biodiversity science. At the same time it offers rich optional metadata terms that can help curators of multimedia collections provide authoritative media that document species occurrence, ecosystems, identification tools, ontologies, and many other kinds of biodiversity documents or data. About half of the vocabulary is re-used from other relevant controlled vocabularies that are often already in use for multimedia metadata, thereby reducing the mapping burden on existing repositories. A central design goal is to allow consuming applications to have a high likelihood of discovering suitable resources, reducing the human examination effort that might be required to decide if the resource is fit for the purpose of the application.

21 citations


Journal Article
TL;DR: Analysis of the responses showed some lack of awareness about the availability of accessible primary data, and pointed out some types of data in high demand for linking to distribution and taxonomical data now derived from the GBIF cache.
Abstract: A Content Needs Assessment (CNA) survey has been conducted in order to determine what GBIF-mediated data users may be using, what they would be using if available, and what they need in terms of primary biodiversity data records. The survey was launched in 2009 in six languages, and collected more than 700 individual responses. Analysis of the responses showed some lack of awareness about the availability of accessible primary data, and pointed out some types of data in high demand for linking to distribution and taxonomical data now derived from the GBIF cache. A notable example was linkages to molecular data. Also, the CNA survey uncovered some biases in the design of user needs surveys, by showing demographic and linguistic effects that may have influenced the distribution of responses received in analogous surveys conducted at the global scale.

19 citations


Journal Article
TL;DR: The types of errors and misprocessing in dates through the sources and the published records are analysed; their impact on the overall data quality of the published index is assessed, and corrective measures are suggested.
Abstract: There are more than 267 million primary biodiversity data records published by hundreds of data publishers through the GBIF network. Thus, the GBIF network is the single most comprehensive index for this kind of data. Ensuring or, at least, assessing data quality is of capital importance for the reliability and usability of this data. While conducting a temporal data gap analysis on this mass of data, we detected some issues with the way date information is processed and shared. Dates can be obscured or altered under certain circumstances, when a publisher's error or date-handling features are chained together with faulty or inadequate date parsing and processing routines. The extent of the date unreliability (either at the source or through the GBIF portal) is not high and, moreover, it is concentrated in a few data publishers. We analyse the types of errors and misprocessing in dates through the sources and the published records, assess their impact on the overall data quality of the published index, and suggest corrective measures.

14 citations
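
The date-handling problem described in the last abstract above can be illustrated with a minimal Python sketch. It is not the authors' method and does not reproduce any GBIF routine; the function names `parse_candidates` and `is_ambiguous` and the two candidate formats are assumptions chosen only for illustration. The idea is that a publisher's date string is tried against more than one plausible format, and the record is flagged when the formats disagree, instead of silently accepting the first match.

```python
from datetime import date, datetime

# Hypothetical candidate formats: day-first and month-first numeric dates.
CANDIDATE_FORMATS = ("%d/%m/%Y", "%m/%d/%Y")


def parse_candidates(raw, formats=CANDIDATE_FORMATS):
    """Return the set of distinct dates that `raw` could denote under each format."""
    results = set()
    for fmt in formats:
        try:
            results.add(datetime.strptime(raw, fmt).date())
        except ValueError:
            # This format does not fit the string; try the next one.
            continue
    return results


def is_ambiguous(raw):
    """True when the string parses to more than one distinct date."""
    return len(parse_candidates(raw)) > 1


if __name__ == "__main__":
    for raw in ("03/04/2005", "25/12/2005", "05/05/2005", "2005"):
        print(raw, sorted(parse_candidates(raw)), is_ambiguous(raw))
```

A string such as `03/04/2005` yields two different dates (3 April and 4 March), so a parser that assumes a single convention could alter the date without raising any error. A string that yields an empty set, such as a bare year, would need separate handling rather than a default day and month.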