
Supporting better treatments for meeting health consumers' needs: extracting semantics in social data for representing a consumer health ontology.

01 Dec 2016-Information Research: An International Electronic Journal (Thomas D. Wilson. 9 Broomfield Road, Broomhill, Sheffield, S10 2SE, UK. Web site: http://informationr.net/ir)-Vol. 21, Iss: 4
Abstract: Introduction. The purpose of this paper is to provide a framework for building a consumer health ontology using social tags. This would assist health users when they are accessing health information and increase the number of documents relevant to their needs. Methods. In order to extract concepts from social tags, this study conducted an empirical study on terms collected from a social networking site. The semantics of tags were analyzed and a concept list was developed by using the middle-out strategy. Analysis. This study analysed the semantic values of tags by employing Latent Semantic Analysis (LSA). This is a method for extracting and representing the contextual-usage meaning of words by analyzing relationships between documents and the terms they contain and word semantics. Results. The process of building an ontology using social tags shows how using this consumer health ontology could improve user access and retrieval. It demonstrates how terms extracted from tags are related to each other with similarity and relationships within hierarchies in the ontology. Conclusion. The study has implications for better design of ontology applications that support the search for health-related resources. This will enhance the communication between health consumers and professionals.


Yunseon Choi

Introduction
As a large number of online health resources have become
available, there has been a great increase in the number of
health consumers relying on online health resources
available on the World Wide Web (Andreassen,
Bujnowska-Fedak, Chronaki, Dumitru, and Pudule, 2007;
Fox, 2011; Rice, 2006; MacLean and Heer, 2013). It has
been reported that health consumers should be able to
access and utilise relevant health information
effectively to meet their needs (Nutbeam, 2008; World
Health Organisation, 2011). A Pew Research Center survey
indicates that 72% of U.S. adult Internet users have looked
for health information online (Fox and Duggan, 2013).
Studies also show that most consumers lack the skills to
access and use online health resources effectively (Friel,
Bond, and Lahoz, 2015; Gray, 2005; Jain and Bickham,
2014; Ratzan and Parker, 2000; Rowlands et al., 2013).
There have been efforts to provide access to reliable health
information on the World Wide Web; MedlinePlus and
InformedHealthOnline are two examples. MedlinePlus is a
Web-based consumer health information service maintained
by the National Library of Medicine (Miller,
Lacroix, and Joyce, 2000). InformedHealthOnline is
published by the German Institut für Qualität und
Wirtschaftlichkeit im Gesundheitswesen (or IQWiG) and is
the English-language version of the German website, which
provides health information to the public and patients.
Information in health or medical domains is critical and
should be provided to health consumers without difficulty.
However, the growing amount of health information on
the web has increased concern about effective access to
quality health information, because the terminology
currently used for organising health or medical information
is generated by professionals and may not be familiar to
users. The terminology gap between users' and
professionals' vocabulary in describing medical-related
web documents was also uncovered by a study on indexing
consistency of social tagging in comparison with
professional indexing (Choi, 2014). Health consumers and
healthcare professionals tend to use different terms to
describe health-related concepts, for example, dry mouth
vs. xerostomia and flu vs. influenza (Vydiswaran, Vinod,
Hanauer, and Zheng, 2014). This terminology gap in the
health domain prevents health consumers from accessing
health information relevant to their information needs.
For example, when a health consumer tries to find
information related to nosebleed symptoms, she/he may
not find resources that contain only the term epistaxis in
the meta tags, title and text (Zielstorff, 2003). On large
consumer health websites, it has been reported that
searches returned no results when a consumer's terms
differed from physician-defined terms,
for example, heart attack vs. myocardial infarction (Zeng,
Kogan, Ash, and Greenes, 2001) and shakes vs. tremor
(Zielstorff, 2003).
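The effect of this terminology gap on retrieval can be illustrated with a minimal sketch. It assumes a hypothetical mapping from consumer terms to professional terms, populated only with the example pairs cited above, and a toy set of documents; the names CONSUMER_TO_PROFESSIONAL, DOCUMENTS, search and expanded_search are invented for illustration and do not represent an actual vocabulary or search system.

```python
import re

# Hypothetical consumer-to-professional term map, built only from the
# example pairs cited in the text; a real vocabulary would be far larger.
CONSUMER_TO_PROFESSIONAL = {
    "dry mouth": "xerostomia",
    "flu": "influenza",
    "nosebleed": "epistaxis",
    "heart attack": "myocardial infarction",
    "shakes": "tremor",
}

# Toy documents described only with professional terminology (assumed data).
DOCUMENTS = {
    "doc1": "management of epistaxis in adults",
    "doc2": "myocardial infarction risk factors",
    "doc3": "seasonal influenza vaccination",
}


def search(query, documents=DOCUMENTS):
    """Return sorted ids of documents containing the query as a whole phrase."""
    pattern = re.compile(r"\b" + re.escape(query.lower()) + r"\b")
    return sorted(
        doc_id for doc_id, text in documents.items()
        if pattern.search(text.lower())
    )


def expanded_search(query, documents=DOCUMENTS):
    """Search with the consumer term and, if known, its professional equivalent."""
    terms = [query.lower()]
    professional = CONSUMER_TO_PROFESSIONAL.get(query.lower())
    if professional is not None:
        terms.append(professional)
    hits = set()
    for term in terms:
        hits.update(search(term, documents))
    return sorted(hits)


if __name__ == "__main__":
    print(search("nosebleed"))           # consumer term alone
    print(expanded_search("nosebleed"))  # consumer term plus mapped term
```

In this toy setting the consumer term alone matches nothing, while the expanded query reaches the document described with the professional term. Whole-phrase matching is used so that a short term such as flu does not accidentally match inside influenza.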
On the other hand, as networked information resources on
the web continue to grow rapidly, digital information
environments have led librarians and information
professionals to manage digital resources on the web.
Thus, this trend has required new tools for organizing and
providing more effective access to the web. Subject
directories or Web directories are such tools for internet
resource discovery since subject directories organise Web
documents by subject areas. Yet, studies have shown that
subject directories based on traditional organisation
schemes are not sufficient for the web (Golub, 2006;
Nowick and Mering, 2003; Macgregor and McCulloch,
2006). This is because they were developed using
traditional library schemes, which were designed
with a focus on physical library collections. Web
documents, however, were originally organized and
indexed by professionally generated keywords. This means
they do not reflect users' current needs as users
intuitively and instantaneously express them (Macgregor
and McCulloch, 2006).
Although there have been efforts to involve users in
developing information organization systems, they are not
necessarily based on users' real languages. Accordingly,
social tagging has received significant attention as a
promising way to solve this challenge since users' tags
reflect their interests and their languages. Social tags are
good sources for identifying users' terms. Several
researchers have discussed the impact of tagging on
retrieval performance on the web (Bao, 2007; Choi, 2009;
Choy and Lui, 2006; Golder and Huberman, 2006;
Heymann, Koutrika, and Garcia-Molina, 2008; Sen et al.,
2013; Yanbe, Jatowt, Nakamura, and Tanaka, 2006).
Although social tags have been discussed regarding their
usefulness as additional access points for classification and
retrieval (Trant, 2009; Choi, 2014), there has been little
research conducted on the use of social tags to improve
practices in information organization. Since social tags
provide additional access points as user-generated terms,
using them would improve information access and
promote effective reasoning for retrieval.
Ontologies have been used for information organization
and information integration. An ontology is a shared
understanding of a domain that can be communicated
between people and computers (Ding, 2001). In the medical
and health services in particular, information systems
should be able to communicate difficult and complex
concepts. However, analysing the structure and concepts
of medical terminologies cannot be easily achieved.
There have been very few studies conducted on building
health or medical ontologies that feature concepts and
vocabularies familiar to health consumers. Mayo
consumer vocabulary, a taxonomy of consumer health
terms and concepts, was developed and maintained by
Mayo Clinic (Seedorff et al., 2013). The Consumer Health
Vocabulary Initiative resulted in the creation of the Open
access collaborative consumer health vocabulary, which
was designed to complement the existing framework of the
Unified medical language system and to serve the needs of
consumer health applications (U.S. National Library of
Medicine, 2012). However, this vocabulary is not
implemented using a knowledge representation language
such as the Web Ontology Language, which supports
semantic search and knowledge reasoning.
The aforementioned important components of effective
health information organization are applied in this study:
Because health consumers are unfamiliar with the
terminology currently used for organizing health or
medical information, medical information systems
need to include user-friendly vocabulary.
Considering the characteristics and quality of social
tags in representing users' views, social tags should
be utilised to improve practices in information
organization.
To establish a closer link between health consumers'
information needs and professionals' responses, a
powerful semantic-based ontology needs to be built.
This paper is part of a larger research project which aims
to answer questions about how we can assist users when
they are accessing health information in order to increase
the number of documents they find relevant to their needs.
The ultimate goal of the project is to build a consumer
health ontology by utilising social tags assigned to health-
related documents. The main objective of this paper is,
therefore, to provide the framework for a consumer health
ontology by discussing the process of building an ontology
featuring social tags. This paper intends to show how
social tags can be utilised for developing class hierarchies
in the ontology in order to identify implicit relations
among social tags unambiguously.
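As a rough illustration of what a class hierarchy over tags makes explicit, the sketch below stores is-a links and consumer labels in plain dictionaries. The class names, the links between them, and the helper functions ancestors, resolve and related are hypothetical and chosen for illustration only; they are not taken from the ontology developed in this project, and an actual implementation would use a knowledge representation language such as Web Ontology Language rather than Python dictionaries.

```python
# Hypothetical is-a links (child -> parent); illustrative only.
IS_A = {
    "epistaxis": "bleeding",
    "bleeding": "symptom",
    "tremor": "symptom",
}

# Hypothetical consumer tags attached to classes as alternative labels.
CONSUMER_LABELS = {
    "epistaxis": {"nosebleed"},
    "tremor": {"shakes"},
}


def ancestors(cls):
    """Follow is-a links upward and return all superclasses, nearest first."""
    chain = []
    seen = {cls}
    while cls in IS_A:
        cls = IS_A[cls]
        if cls in seen:  # guard against accidental cycles
            break
        seen.add(cls)
        chain.append(cls)
    return chain


def resolve(tag):
    """Map a tag to the class it names or labels, or return None if unknown."""
    tag = tag.lower()
    if tag in IS_A or tag in IS_A.values():
        return tag
    for cls, labels in CONSUMER_LABELS.items():
        if tag in labels:
            return cls
    return None


def related(tag_a, tag_b):
    """Return the nearest class shared by two tags, or None if there is none."""
    a, b = resolve(tag_a), resolve(tag_b)
    if a is None or b is None:
        return None
    line_b = [b] + ancestors(b)
    for cls in [a] + ancestors(a):
        if cls in line_b:
            return cls
    return None
```

With these assumed links, two tags that share no words, such as nosebleed and shakes, are connected through a common superclass. This is the kind of implicit relation that a hierarchy states explicitly.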
Ontologies for information organization and information integration
Definitions of ontologies
The term ontology has been used in several disciplines,
from philosophy to computer science. As a branch of
philosophy, ontology studies the structures of the objects,
properties and relations of reality (Smith, 1997). In
computer science, which adopted the term by way of
artificial intelligence, an ontology is a model that
represents objects in the world together with their
properties and relationships (Garshol, 2004). An ontology
is defined as a formal, explicit specification of a
conceptualisation (Gruber, 1993; Studer, Benjamins, and
Fensel, 1998):
Conceptualisation refers to 'an abstract, simplified view
of the world that we wish to represent for some purpose'
(Gruber, 1993, p. 1).
Explicit refers to the 'type of concepts used, and the
constraints on their use are explicitly defined' (Studer
et al., 1998, p. 25).

