Showing papers on "Conceptual schema published in 2009"

PDF

Open Access

Journal Article•DOI•

Building a Conceptual Framework: Philosophy, Definitions, and Procedure:

[...]

01 Dec 2009-The International Journal of Qualitative Methods

TL;DR: A new qualitative method for building conceptual frameworks for phenomena that are linked to multidisciplinary bodies of knowledge based on grounded theory method and redefines the key terms of concept, conceptual framework, and conceptual framework analysis.

...read moreread less

Abstract: In this paper the author proposes a new qualitative method for building conceptual frameworks for phenomena that are linked to multidisciplinary bodies of knowledge. First, he redefines the key terms of concept, conceptual framework, and conceptual framework analysis. Concept has some components that define it. A conceptual framework is defined as a network or a “plane” of linked concepts. Conceptual framework analysis offers a procedure of theorization for building conceptual frameworks based on grounded theory method. The advantages of conceptual framework analysis are its flexibility, its capacity for modification, and its emphasis on understanding instead of prediction.

...read moreread less

970 citations

Journal Article•DOI•

On Modelling Route Choice Sets in Transportation Networks: A Synthesis

[...]

Piet H. L. Bovy¹•Institutions (1)

Delft University of Technology¹

01 Jan 2009-Transport Reviews

TL;DR: This article reviews a number of topics related to the modelling and generation of route choice sets, specifically for applications in large networks, and shows that it is advantageous to distinguish the processes of choice set formation and choice per se.

...read moreread less

162 citations

Patent•

Methods and systems to train models to extract and integrate information from data sources

[...]

Justin Boyan, Glenn McDonald, Margaret Benthall, Ray Molnar

15 May 2009

TL;DR: In this article, the authors present methods and systems to model and acquire data from a variety of data and information sources, to integrate the data into a structured database, and to manage the continuing reintegration of updated data from those sources over time.

...read moreread less

Abstract: Methods and systems to model and acquire data from a variety of data and information sources, to integrate the data into a structured database, and to manage the continuing reintegration of updated data from those sources over time. For any given domain, a variety of individual information and data sources that contain information relevant to the schema can be identified. Data elements associated with a schema may be identified in a training source, such as by user tagging. A formal grammar may be induced appropriate to the schema and layout of the training source. A Hidden Markov Model (HMM) corresponding to the grammar may learn where in the sources the elements can be found. The system can automatically mutate its schema into a grammar matching the structure of the source documents. By following an inverse transformation sequence, data that is parsed by the mutated grammar can be fit back into the original grammar structure, matching the original data schema defined through domain modeling. Features disclosed herein may be implemented with respect to web-scraping and data acquisition, and to represent data in support of data-editing and data-merging tasks. A schema may be defined with respect to a graph-based domain model.

...read moreread less

138 citations

Proceedings Article•DOI•

Structural characterizations of schema-mapping languages

[...]

Balder ten Cate¹, Phokion G. Kolaitis²•Institutions (2)

University of Amsterdam¹, University of California, Santa Cruz²

23 Mar 2009

TL;DR: In this paper, the structural properties of schema mappings specified by source-to-target (s-t) dependencies have been analyzed in terms of universal solutions or closure under target homomorphisms.

...read moreread less

Abstract: Schema mappings are declarative specifications that describe the relationship between two database schemas. In recent years, there has been an extensive study of schema mappings and of their applications to several different data inter-operability tasks, including applications to data exchange and data integration. Schema mappings are expressed in some logical formalism that is typically a fragment of first-order logic or a fragment of second-order logic. These fragments are chosen because they possess certain desirable structural properties, such as existence of universal solutions or closure under target homomorphisms. In this paper, we turn the tables and focus on the following question: can we characterize the various schema-mapping languages in terms of structural properties possessed by the schema mappings specified in these languages? We obtain a number of characterizations of schema mappings specified by source-to-target (s-t) dependencies, including characterizations of schema mappings specified by LAV (local-as-view) s-t tgds, schema mappings specified by full s-t tgds, and schema mappings specified by arbitrary s-t tgds. These results shed light on schema-mapping languages from a new perspective and, more importantly, demarcate the properties of schema mappings that can be used to reason about them in data inter-operability applications.

...read moreread less

73 citations

Book Chapter•DOI•

A metric conceptual space algebra

[...]

Benjamin Adams¹, Martin Raubal¹•Institutions (1)

University of California, Santa Barbara¹

21 Sep 2009

TL;DR: This paper presents a metric conceptual space algebra that is designed to facilitate the creation of conceptual space knowledge bases and inferencing systems and demonstrates the applicability of the algebra to spatial information systems with a proof-of-concept application.

...read moreread less

Abstract: The modeling of concepts from a cognitive perspective is important for designing spatial information systems that interoperate with human users. Concept representations that are built using geometric and topological conceptual space structures are well suited for semantic similarity and concept combination operations. In addition, concepts that are more closely grounded in the physical world, such as many spatial concepts, have a natural fit with the geometric structure of conceptual spaces. Despite these apparent advantages, conceptual spaces are underutilized because existing formalizations of conceptual space theory have focused on individual aspects of the theory rather than the creation of a comprehensive algebra. In this paper we present a metric conceptual space algebra that is designed to facilitate the creation of conceptual space knowledge bases and inferencing systems. Conceptual regions are represented as convex polytopes and context is built in as a fundamental element. We demonstrate the applicability of the algebra to spatial information systems with a proof-of-concept application.

...read moreread less

68 citations

Journal Article•DOI•

Improving XML schema matching performance using Prüfer sequences

[...]

Alsayed Algergawy¹, Eike Schallehn¹, Gunter Saake¹•Institutions (1)

Otto-von-Guericke University Magdeburg¹

01 Aug 2009

TL;DR: This paper develops and implements the XPruM system, which consists mainly of two parts-schema preparation and schema matching, and introduces the concept of compatible nodes to identify semantic correspondences across complex elements first, then the matching process is refined to identify correspondences among simple elements inside each pair of compatible node.

...read moreread less

Abstract: Schema matching is a critical step for discovering semantic correspondences among elements in many data-shared applications. Most of existing schema matching algorithms produce scores between schema elements resulting in discovering only simple matches. Such results partially solve the problem. Identifying and discovering complex matches is considered one of the biggest obstacle towards completely solving the schema matching problem. Another obstacle is the scalability of matching algorithms on large number and large-scale schemas. To tackle these challenges, in this paper, we propose a new XML schema matching framework based on the use of Prufer encoding. In particular, we develop and implement the XPruM system, which consists mainly of two parts-schema preparation and schema matching. First, we parse XML schemas and represent them internally as schema trees. Prufer sequences are constructed for each schema tree and employed to construct a sequence representation of schemas. We capture schema tree semantic information in Label Prufer Sequences (LPS) and schema tree structural information in Number Prufer Sequences (NPS). Then, we develop a new structural matching algorithm exploiting both LPS and NPS. To cope with complex matching discovery, we introduce the concept of compatible nodes to identify semantic correspondences across complex elements first, then the matching process is refined to identify correspondences among simple elements inside each pair of compatible nodes. Our experimental results demonstrate the performance benefits of the XPruM system.

...read moreread less

54 citations

Journal Article•DOI•

Laconic schema mappings: computing the core with SQL queries

[...]

Balder ten Cate¹, Laura Chiticariu², Phokion G. Kolaitis³, Wang-Chiew Tan³•Institutions (3)

École normale supérieure de Cachan¹, IBM², University of California, Santa Cruz³

01 Aug 2009

TL;DR: A method for directly computing the core by SQL queries, when schema mappings are specified by source-to-target tuple-generating dependencies (s-t tgds), which has the property that a "direct translation" of the source instance according to the laconic schema mapping produces the core.

...read moreread less

Abstract: A schema mapping is a declarative specification of the relationship between instances of a source schema and a target schema. The data exchange (or data translation) problem asks: given an instance over the source schema, materialize an instance (or solution) over the target schema that satisfies the schema mapping. In general, a given source instance may have numerous different solutions. Among all the solutions, universal solutions and core universal solutions have been singled out and extensively studied. A universal solution is a most general one and also represents the entire space of solutions, while a core universal solution is the smallest universal solution and is unique up to isomorphism (hence, we can talk about the core).The problem of designing efficient algorithms for computing the core has attracted considerable attention in recent years. In this paper, we present a method for directly computing the core by SQL queries, when schema mappings are specified by source-to-target tuple-generating dependencies (s-t tgds). Unlike prior methods that, given a source instance, first compute a target instance and then recursively minimize that instance to the core, our method avoids the construction of such intermediate instances. This is done by rewriting the schema mapping into a laconic schema mapping that is specified by first-order s-t tgds with a linear order in the active domain of the source instances. A laconic schema mapping has the property that a "direct translation" of the source instance according to the laconic schema mapping produces the core. Furthermore, a laconic schema mapping can be easily translated into SQL, hence it can be optimized and executed by a database system to produce the core. We also show that our results are optimal: the use of the linear order is inevitable and, in general, schema mappings with constraints over the target schema cannot be rewritten to a laconic schema mapping.

...read moreread less

48 citations

Proceedings Article•DOI•

Discovering functional dependencies for multidimensional design

[...]

Oscar Romero¹, Diego Calvanese², Alberto Abelló¹, Mariano Rodriguez-Muro²•Institutions (2)

Polytechnic University of Catalonia¹, Free University of Bozen-Bolzano²

06 Nov 2009

TL;DR: This work proposes an algorithm to discover functional dependencies from the domain ontology that exploits the inference capabilities of DL-Lite, thus fully taking into account the semantics of the domain.

...read moreread less

Abstract: Nowadays, it is widely accepted that the data warehouse design task should be largely automated. Furthermore, the data warehouse conceptual schema must be structured according to the multidimensional model and as a consequence, the most common way to automatically look for subjects and dimensions of analysis is by discovering functional dependencies (as dimensions functionally depend on the fact) over the data sources. Most advanced methods for automating the design of the data warehouse carry out this process from relational OLTP systems, assuming that a RDBMS is the most common kind of data source we may find, and taking as starting point a relational schema. In contrast, in our approach we propose to rely instead on a conceptual representation of the domain of interest formalized through a domain ontology expressed in the DL-Lite Description Logic. We propose an algorithm to discover functional dependencies from the domain ontology that exploits the inference capabilities of DL-Lite, thus fully taking into account the semantics of the domain. We also provide an evaluation of our approach in a real-world scenario.

...read moreread less

46 citations

Journal Article•DOI•

A model of process documentation to determine provenance in mash-ups

[...]

Paul Groth¹, Simon Miles², Luc Moreau³•Institutions (3)

University of Southern California¹, King's College London², University of Southampton³

23 Feb 2009-ACM Transactions on Internet Technology

TL;DR: A generic conceptual data model is defined that supports the autonomous creation of attributable, factual process documentation for dynamic multi-institutional applications and is evaluated with respect to questions about the provenance of results generated by a complex bioinformatics mash-up.

...read moreread less

Abstract: Through technologies such as RSS (Really Simple Syndication), Web Services, and AJAX (Asynchronous JavaScript and XML), the Internet has facilitated the emergence of applications that are composed from a variety of services and data sources. Through tools such as Yahoo Pipes, these “mash-ups” can be composed in a dynamic, just-in-time manner from components provided by multiple institutions (i.e., Google, Amazon, your neighbor). However, when using these applications, it is not apparent where data comes from or how it is processed. Thus, to inspire trust and confidence in mash-ups, it is critical to be able to analyze their processes after the fact. These trailing analyses, in particular the determination of the provenance of a result (i.e., the process that led to it), are enabled by process documentation, which is documentation of an application's past process created by the components of that application at execution time. In this article, we define a generic conceptual data model that supports the autonomous creation of attributable, factual process documentation for dynamic multi-institutional applications. The data model is instantiated using two Internet formats, OWL and XML, and is evaluated with respect to questions about the provenance of results generated by a complex bioinformatics mash-up.

...read moreread less

44 citations

Journal Article•DOI•

HAMSTER: using search clicklogs for schema and taxonomy matching

[...]

Arnab Nandi¹, Philip A. Bernstein²•Institutions (2)

University of Michigan¹, Microsoft²

01 Aug 2009

TL;DR: This work addresses the problem of unsupervised matching of schema information from a large number of data sources into the schema of a data warehouse by proposing a new technique based on the search engine's clicklogs.

...read moreread less

Abstract: We address the problem of unsupervised matching of schema information from a large number of data sources into the schema of a data warehouse. The matching process is the first step of a framework to integrate data feeds from third-party data providers into a structured-search engine's data warehouse. Our experiments show that traditional schema-based and instance-based schema matching methods fall short. We propose a new technique based on the search engine's clicklogs. Two schema elements are matched if the distribution of keyword queries that cause click-throughs on their instances are similar. We present experiments on large commercial datasets that show the new technique has much better accuracy than traditional techniques.

...read moreread less

43 citations

Book Chapter•DOI•

Conceptual Modeling for Data Integration

[...]

Diego Calvanese¹, Giuseppe De Giacomo², Domenico Lembo², Maurizio Lenzerini², Riccardo Rosati² - Show less +1 more•Institutions (2)

Free University of Bozen-Bolzano¹, Sapienza University of Rome²

04 Jul 2009

TL;DR: The main goal of this paper is to analyze how to express the conceptual model representing the global schema, and proposes a particular Description Logic, called $\textit{DL-Lite}_{\mathcal A,id}$.

...read moreread less

Abstract: The goal of data integration is to provide a uniform access to a set of heterogeneous data sources, freeing the user from the knowledge about where the data are, how they are stored, and how they can be accessed. One of the outcomes of the research work carried out on data integration in the last years is a clear architecture, comprising a global schema, the source schema, and the mapping between the source and the global schema. Although in many research works and commercial tools the global schema is simply a data structure integrating the data at the sources, we argue that the global schema should represent, instead, the conceptual model of the domain. However, to fully pursue such an approach, several challenging issues are to be addressed. The main goal of this paper is to analyze one of them, namely, how to express the conceptual model representing the global schema. We start our analysis with the case where such a schema is expressed in terms of a UML class diagram, and we end up with a proposal of a particular Description Logic, called $\textit{DL-Lite}_{\mathcal A,id}$. We show that the data integration framework based on such a logic has several interesting properties, including the fact that both reasoning at design time, and answering queries at run time can be done efficiently.

...read moreread less

Journal Article•DOI•

Towards a unified conceptual model representation: a case study in healthcare

[...]

Bhakti Stephan Onggo¹•Institutions (1)

Lancaster University¹

05 Mar 2009-Journal of Simulation

TL;DR: A number of diagrams are proposed to represent each of the conceptual model components and a case study in healthcare is used to show how the proposed unified conceptual modelling representation can be applied in practice.

...read moreread less

Abstract: One of the critical success factors in a simulation project is good communication between different stakeholders in the project, especially in the early stages. Good documentation or representation is essential for communicating conceptual models between stakeholders effectively. Despite the lack of a single accepted definition for a conceptual model, most definitions agree that a conceptual model contains a set of components, each of which specifies different aspects of a conceptual model. This paper advocates the use of a standard multi-faceted representation of conceptual models. A number of diagrams are proposed to represent each of the conceptual model components. Our intention is to initiate discussion and the development of a standard multi-faceted conceptual model representation that will benefit stakeholders involved in a simulation project. A case study in healthcare is used to show how the proposed unified conceptual modelling representation can be applied in practice.

...read moreread less

Proceedings Article•DOI•

The PRISM Workwench: Database Schema Evolution without Tears

[...]

Carlo Curino, Hyun Jin Moon¹, Myungwon Ham¹, Carlo Zaniolo¹•Institutions (1)

University of California, Los Angeles¹

29 Mar 2009

TL;DR: The PRISM system demonstrates a major new advance toward automating schema evolution (including query mapping and database conversion), by improving predictability, logical independence, and auditability of the process.

...read moreread less

Abstract: Information Systems are subject to a perpetual evolution, which is particularly pressing in Web Information Systems, due to their distributed and often collaborative nature. Such continuous adaptation process, comes with a very high cost, because of the intrinsic complexity of the task and the serious rami?cations of such changes upon database-centric Information System softwares. Therefore, there is a need to automate and simplify the schema evolution process and to ensure predictability and logical independence upon schema changes. Current relational technology makes it easy to change the database content or to revise the underlaying storage and indexes but does little to support logical schema evolution which nowadays remains poorly supported by commercial tools. The PRISM system demonstrates a major new advance toward automating schema evolution (including query mapping and database conversion), by improving predictability, logical independence, and auditability of the process. In fact, PRISM exploits recent theoretical results on mapping composition, invertibility and query rewriting to provide DB Administrators with an intuitive, operational workbench usable in their everyday activities—thus enabling graceful schema evolution. In this demonstration, we will show (i) the functionality of PRISM and its supportive AJAX interface, (ii) its architecture built upon a simple SQL–inspired language of Schema Modi?cation Operators, and (iii) we will allow conference participants to directly interact with the system to test its capabilities. Finally, some of the most interesting evolution steps of popular Web Information Systems, such as Wikipedia, will be reviewed in a brief "Saga of Famous Schema Evolutions".

...read moreread less

Journal Article•DOI•

Formal description of conceptual relationships with a view to implementing them in the ontology editor Protégé

[...]

Nava Maroto García, Amparo Alcina

01 Jan 2009-Terminology

TL;DR: A catalogue of conceptual relationships in which each relationship is defined formally in terms of its properties and the nature of the conceptual classes involved, by making explicit the conceptual relationships of the catalogue using the standard ontology editor Protege.

...read moreread less

Abstract: In this article we present a catalogue of conceptual relationships in which each relationship is defined formally in terms of its properties and the nature of the conceptual classes involved. By making explicit the conceptual relationships of the catalogue using the standard ontology editor Protege we should be able to retrieve conceptual knowledge in an onomasiological way using the Queries function of the editor. In the final part of the article we present a sample query taken from the analysis of the terminology of finished ceramic products in order to show how information about relationships can be retrieved.

...read moreread less

Proceedings Article•DOI•

Top-k generation of integrated schemas based on directed and weighted correspondences

[...]

Ahmed Radwan¹, Lucian Popa², Ioana Stanoi², Akmal A. Younis¹•Institutions (2)

University of Miami¹, IBM²

29 Jun 2009

TL;DR: This paper proposes a more automatic approach to schema integration that is based on the use of directed and weighted correspondences between the concepts that appear in the source schemas and shows that the algorithm runs in polynomial time and has good performance in practice.

...read moreread less

Abstract: Schema integration is the problem of creating a unified target schema based on a set of existing source schemas and based on a set of correspondences that are the result of matching the source schemas. Previous methods for schema integration rely on the exploration, implicit or explicit, of the multiple design choices that are possible for the integrated schema. Such exploration relies heavily on user interaction; thus, it is time consuming and labor intensive. Furthermore, previous methods have ignored the additional information that typically results from the schema matching process, that is, the weights and in some cases the directions that are associated with the correspondences. In this paper, we propose a more automatic approach to schema integration that is based on the use of directed and weighted correspondences between the concepts that appear in the source schemas. A key component of our approach is a novel top-k ranking algorithm for the automatic generation of the best candidate schemas. The algorithm gives more weight to schemas that combine the concepts with higher similarity or coverage. Thus, the algorithm makes certain decisions that otherwise would likely be taken by a human expert. We show that the algorithm runs in polynomial time and moreover has good performance in practice.

...read moreread less

Journal Article•

Natural Language Processing for Conceptual Modeling.

[...]

Lilac Al-Safadi¹•Institutions (1)

King Saud University¹

01 Jan 2009-International Journal of Digital Content Technology and Its Applications

TL;DR: A semi-automated approach for the design of databases in enhanced-ERD notation is presented, which focuses on the very early stage of the database development which is the stage of user requirement analysis.

...read moreread less

Abstract: A semi-automated approach for the design of databases in enhanced-ERD notation is presented. It focuses on the very early stage of the database development which is the stage of user requirement analysis. It is supposed to be used between the requirements determination stage and analysis. The approach provides the opportunity of using natural language text documents as a source of knowledge for semi-automated generation of a conceptual data model. The system performs information extraction by parsing the syntax of the sentences and semantically analyzing their content.

...read moreread less

Journal Article•DOI•

Normalization and optimization of schema mappings

[...]

Georg Gottlob¹, Reinhard Pichler², Vadim Savenkov²•Institutions (2)

University of Oxford¹, Vienna University of Technology²

01 Aug 2009

TL;DR: The normalization of schema mappings allows us to eliminate the effect of the concrete syntactic representation of the st-tgds from the semantics of query answering, and how these results can be fruitfully applied to aggregate queries.

...read moreread less

Abstract: Schema mappings are high-level specifications that describe the relationship between two database schemas. They are an important tool in several areas of database research, notably in data integration and data exchange. However, a concrete theory of schema mapping optimization including the formulation of optimality criteria and the construction of algorithms for computing optimal schema mappings is completely lacking to date. The goal of this work is to fill this gap. We start by presenting a system of rewrite rules to minimize sets of source-to-target tuple-generating dependencies (st-tgds, for short). Moreover, we show that the result of this minimization is unique up to variable renaming. Hence, our optimization also yields a schema mapping normalization. By appropriately extending our rewrite rule system, we also provide a normalization of schema mappings containing equality-generating target-dependencies (egds). An important application of such a normalization is in the area of defining the semantics of query answering in data exchange, since several definitions in this area depend on the concrete syntactic representation of the st-tgds. This is, in particular, the case for queries with negated atoms and for aggregate queries. The normalization of schema mappings allows us to eliminate the effect of the concrete syntactic representation of the st-tgds from the semantics of query answering. We discuss in detail how our results can be fruitfully applied to aggregate queries.

...read moreread less

Book Chapter•DOI•

Structure–Property Relations Between Macro and Micro Representations: Relevant Meso-levels in Authentic Tasks

[...]

Marijn R. Meijer¹, Astrid M. W. Bulte, Albert Pilot•Institutions (1)

Utrecht University¹

01 Jan 2009

TL;DR: This chapter redefined this domain in terms of a coherent set of philosophical, substantive and pedagogical substructures, and presents a first exploration of students’ learning of authentic tasks, focusing on their conceptual development.

...read moreread less

Abstract: In chemistry education, micro–macro thinking using structure–property relations is considered as a key conceptual area for students. However, it is difficult but challenging for students and teachers. In this chapter, we have redefined this domain in terms of a coherent set of philosophical, substantive and pedagogical substructures. Starting from the philosophy that chemistry should be considered as a human activity, scientific and technological developments are interrelated with issues in society and part of our cultures. In many communities of practice in society, knowledge is regarded as a tool necessary for performing the activities of those practices. Learning chemistry can be seen as participation in relevant social practices. Within this vision, we have selected tasks belonging to authentic chemical practices in which structure–property relations were explored in different sub-domains (biochemistry, inorganic material science and organic polymeric material science). Within the substantive substructure, meso-structures are essential to iterate between the macro- and the sub-microscopic level. Interrelating structure–property relations connect student learning of these chemical concepts to the contexts of their everyday lives and to contemporary science and technological issues. Using this way of macro–micro thinking, two units for teaching structure–property relations were designed. These units focus on macro–micro thinking with steps in between: what we have termed ‘meso-levels’. The results of the conceptual analysis of structure~– property relations and how these relations are used in macro–micro thinking are discussed. We also present a first exploration of students’ learning of authentic tasks, focusing on their conceptual development.

...read moreread less

Proceedings Article•DOI•

"All You Can Eat" Ontology-Building: Feeding Wikipedia to Cyc

[...]

Samuel Sarjant¹, Catherine Legg¹, Michael Robinson¹, Olena Medelyan¹•Institutions (1)

University of Waikato¹

15 Sep 2009

TL;DR: This work describes methods used to add 35K new concepts mined from Wikipedia to collections in ResearchCyc entirely automatically and shows how Cyc itself can be leveraged for ontological quality control by ‘feeding’ it assertions one by one, enabling it to reject those that contradict its other knowledge.

...read moreread less

Abstract: In order to achieve genuine web intelligence, building some kind of large general machine-readable conceptual scheme (i.e. ontology) seems inescapable. Yet the past 20 years have shown that manual ontology-building is not practicable. The recent explosion of free user-supplied knowledge on the Web has led to great strides in automatic ontology-building, but quality-control is still a major issue. Ideally one should automatically build onto an already intelligent base. We suggest that the long-running Cyc project is able to assist here. We describe methods used to add 35K new concepts mined from Wikipedia to collections in ResearchCyc entirely automatically. Evaluation with 22 human subjects shows high precision both for the new concepts’ categorization, and their assignment as individuals or collections. Most importantly we show how Cyc itself can be leveraged for ontological quality control by ‘feeding’ it assertions one by one, enabling it to reject those that contradict its other knowledge.

...read moreread less

Book Chapter•DOI•

Schema Normalization for Improving Schema Matching

[...]

Serena Sorrentino¹, Sonia Bergamaschi¹, Maciej Gawinecki¹, Laura Po¹•Institutions (1)

University of Modena and Reggio Emilia¹

10 Nov 2009

TL;DR: This work proposes a method to perform schema labels normalization which increases the number of comparable labels, and empirically proves that the normalization method helps in the identification of similarities among schema elements of different data sources, thus improving schema matching accuracy.

...read moreread less

Abstract: Schema matching is the problem of finding relationships among concepts across heterogeneous data sources (heterogeneous in format and in structure). Starting from the "hidden meaning" associated to schema labels (i.e. class/attribute names) it is possible to discover relationships among the elements of different schemata. Lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical resource) helps in associating a "meaning" to schema labels. However, accuracy of semi-automatic lexical annotation methods on real-world schemata suffers from the abundance of non-dictionary words such as compound nouns and word abbreviations. In this work, we address this problem by proposing a method to perform schema labels normalization which increases the number of comparable labels. Unlike other solutions, the method semi-automatically expands abbreviations and annotates compound terms, without a minimal manual effort. We empirically prove that our normalization method helps in the identification of similarities among schema elements of different data sources, thus improving schema matching accuracy.

...read moreread less

Journal Article•DOI•

Terminological concept modelling and conceptual data modelling

[...]

Bodil Nistrup Madsen¹, Hanne Erdman Thomsen¹•Institutions (1)

Copenhagen Business School¹

01 Nov 2009-International Journal of Metadata, Semantics and Ontologies

TL;DR: It is argued that the development of an ontology is crucial for setting up a conceptual data model, and therefore it should always be added as an initial stage to data modelling.

...read moreread less

Abstract: Ontologies are useful for many purposes. The use of an ontology is, for example, crucial for writing consistent definitions of concepts within a specific domain. In this paper, we will argue that the principles of rigorous terminology work are useful for building consistent ontologies. In many cases, developers of IT systems encounter severe problems, because they neglect the necessity of developing a proper ontology (concept model) before they develop a conceptual data model as a basis for an IT system. In this paper, we will argue that the development of an ontology is crucial for setting up a conceptual data model, and therefore it should always be added as an initial stage to data modelling. Also we will give some examples of the mapping between ontologies and conceptual data models. Future research will reveal to what extent it will be possible to set up rules for automatic mapping of concepts of an ontology into classes and attributes of a conceptual data model.

...read moreread less

Book Chapter•DOI•

Instance-Based OWL Schema Matching

[...]

Luiz André P. Paes Leme¹, Marco A. Casanova¹, Karin Breitman¹, Antonio L. Furtado¹•Institutions (1)

Pontifical Catholic University of Rio de Janeiro¹

06 May 2009

TL;DR: This paper describes an instance-based schema matching technique for an OWL dialect that is based on similarity functions and is backed up by experimental results with real data downloaded from data sources found on the Web.

...read moreread less

Abstract: Schema matching is a fundamental issue in many database applications, such as query mediation and data warehousing. It becomes a challenge when different vocabularies are used to refer to the same real-world concepts. In this context, a convenient approach, sometimes called extensional, instance-based or semantic, is to detect how the same real world objects are represented in different databases and to use the information thus obtained to match the schemas. This paper describes an instance-based schema matching technique for an OWL dialect. The technique is based on similarity functions and is backed up by experimental results with real data downloaded from data sources found on the Web.

...read moreread less

Book Chapter•DOI•

Tractable Query Answering over Conceptual Schemata

[...]

Andrea Calì¹, Georg Gottlob¹, Andreas Pieris¹•Institutions (1)

University of Oxford¹

10 Nov 2009

TL;DR: In this article, the problem of answering conjunctive queries over extended entity-relationship schemata, which are called EER (Extended Entity-Relationship Schemata), with is-a among entities and relationships, and cardinality constraints is addressed.

...read moreread less

Abstract: We address the problem of answering conjunctive queries over extended Entity-Relationship schemata, which we call EER (Extended ER) schemata, with is-a among entities and relationships, and cardinality constraints. This is a common setting in conceptual data modelling, where reasoning over incomplete data with respect to a knowledge base is required. We adopt a semantics for EER schemata based on their relational representation. We identify a wide class of EER schemata for which query answering is tractable in data complexity; the crucial condition for tractability is the separability between maximum-cardinality constraints (represented as key constraints in relational form) and the other constraints. We provide, by means of a graph-based representation, a syntactic condition for separability: we show that our conditions is not only sufficient, but also necessary, thus precisely identifying the class of separable schemata. We present an algorithm, based on query rewriting, that is capable of dealing with such EER schemata, while achieving tractability. We show that further negative constraints can be added to the EER formalism, while still keeping query answering tractable. We show that our formalism is general enough to properly generalise the most widely adopted knowledge representation languages.

...read moreread less

Journal Article•DOI•

An intelligent fuzzy object-oriented database framework for video database applications

[...]

Nezihe Burcu Ozgur¹, Murat Koyuncu², Adnan Yazici¹•Institutions (2)

Middle East Technical University¹, Atılım University²

01 Aug 2009-Fuzzy Sets and Systems

TL;DR: The introduced fuzzy conceptual model is used in this framework, which provides modeling of complex and rich semantic content and knowledge of video data including uncertainty, and it supports various flexible queries including (fuzzy) semantic, temporal and spatial queries, based on the video data model.

...read moreread less

Posted Content•

Pork, by Any Other Name...Building a Conceptual Scheme of Distributive Politics

[...]

Susan C. Stokes¹•Institutions (1)

Yale University¹

01 Jan 2009-Social Science Research Network

TL;DR: This paper propose a conceptual scheme to distinguish terms and concepts such as clientelism, patronage, vote buying and pork-barrel and distributive politics, and others, guided by the scope of strategies that one observes in the world and by their normative implications.

...read moreread less

Abstract: When parties or candidates get voters to support them not by promising desirable public policies, advertising their past record in office, making ideological appeals, invoking shared identities, or relying on partisan attachments, but instead by offering them material inducement, scholars are not sure what to call this strategy. The problem is more than terminological: without clear conceptual distinctions, our attempts at causal explanation can be frustrated. Much research in distributive politics is implicitly motivated by a sense that these strategies depart from what is normatively desirable. Yet just as conceptual confusion obscures causal explanation, so it obscures normative considerations. This paper proposes a conceptual scheme to distinguish terms and concepts such as clientelism, patronage, vote buying and pork-barrel and distributive politics, and others. The distinctions are guided by the scope of strategies that one observes in the world and by their normative implications.

...read moreread less

Proceedings Article•DOI•

Automating database schema evolution in information system upgrades

[...]

Carlo Curino¹, Hyun Jin Moon¹, Carlo Zaniolo¹•Institutions (1)

University of California, Los Angeles¹

25 Oct 2009

TL;DR: The PRISM project seeks to develop the methods and tools that turn this error-prone and time-consuming process into one that is controllable, predictable and avoids down-time, and develops a language of Schema Modification Operators (SMO) to express concisely these histories.

...read moreread less

Abstract: The complexity, cost, and down-time currently created by the database schema evolution process is the source of incessant problems in the life of information systems and a major stumbling block that prevent graceful upgrades. Furthermore, our studies shows that the serious problems encountered by traditional information systems are now further exacerbated in web information systems and cooperative scientific databases where the frequency of schema changes has increased while tolerance for downtimes has nearly disappeared. The PRISM project seeks to develop the methods and tools that turn this error-prone and time-consuming process into one that is controllable, predictable and avoids down-time. Toward this goal, we have assembled a large testbed of schema evolution histories, and developed a language of Schema Modification Operators (SMO) to express concisely these histories. Using this language, the database administrator can specify new schema changes, and then rely on PRISM to (i) predict the effect of these changes on current applications, (ii) translate old queries and updates to work on the new schema version, (iii) perform data migration, and (iv) generate full documentation of intervened changes. Furthermore, PRISM achieves good usability and scalability by incorporating recent advances on mapping composition and invertibility in the implementation of (ii). The progress in automating schema evolution so achieved provides the enabling technology for other advances, such as light-weight database design methodologies that embrace changes as the regular state of software. While these topics remain largely unexplored, and thus provide rich opportunities for future research, an important area which we have been investigated is that of archival information systems, where PRISM query mapping techniques were used to support flashback and historical queries for database archives under schema evolution.

...read moreread less

Journal Article•DOI•

Conceptual and ontological modeling in information systems

[...]

Mikhail R. Kogalovsky¹, Leonid A. Kalinichenko¹•Institutions (1)

Russian Academy of Sciences¹

20 Sep 2009-Programming and Computer Software

TL;DR: Conceptual modeling of a subject domain, which produces its conceptual model, is an important stage in designing information systems and much attention in the development of such systems has been given to reusing information resources and to providing access to them at the semantic level.

...read moreread less

Abstract: Conceptual modeling of a subject domain, which produces its conceptual model, is an important stage in designing information systems. In recent years, much attention in the development of such systems has been given to reusing information resources and to providing access to them at the semantic level. Methods and technologies of ontological modeling have lately been under intensive development. In this paper, problems and preconditions of conceptual modeling of the subject domain in database technologies and information systems are discussed. Various approaches to conceptual modeling, conceptual modeling languages, and the respective tools are considered, various interpretations of the role of the conceptual model of the subject domain are discussed, and the current state of conceptual modeling tools produced by software industry is assessed. The relationships between the conceptual schemas of the subject domain and ontologies are analyzed and their similarities and differences are described. Terminological issues and the directions of research in the field of conceptual and ontological modeling are considered. An extensive list of references is given.

...read moreread less

Towards a Catalogue of Patterns for defining Metrics over i* Models.

[...]

Xavier Franch¹, Gemma Grau¹•Institutions (1)

Polytechnic University of Catalonia¹

01 Jan 2009

TL;DR: In this article, the authors propose the use of patterns to design these metrics, with emphasis in their definition over i* models, organized in the form of a catalogue structured along several dimensions, and expressed using a template.

...read moreread less

Abstract: Metrics applied at the early stages of the Information Systems development process are useful for assessing further decisions. Agent-oriented models provide descriptions of processes as a network of relationships among actors and their analysis allows discerning whether a model fulfils some required properties, or comparing models according to some criteria. In this paper, we adopt metrics to drive this analysis and we propose the use of patterns to design these metrics, with emphasis in their definition over i* models. Patterns are organized in the form of a catalogue structured along several dimensions, and expressed using a template. The patterns and the metrics are written using OCL expressions defined over a UML conceptual data model for i*. As a result, we promote reusability improving the metrics definition process in terms of accuracy and efficiency of the process.

...read moreread less

Patent•

Relational Database Page-Level Schema Transformations

[...]

Donna M. Di Carlo¹, Thomas G. Price¹, Stanely J. Dee¹•Institutions (1)

BMC Software¹

07 May 2009

TL;DR: In this article, the conversion of database objects from one schema version to another schema version (e.g., an earlier version) without requiring the objects be unloaded and reloaded are described.

...read moreread less

Abstract: Methods, devices and systems which facilitate the conversion of database objects from one schema version (e.g., an earlier version) to another schema version (e.g., a newer version) without requiring the objects be unloaded and reloaded are described. In general, data object conversion applies to both table space objects and index space objects. The described transformation techniques may be used to convert any object whose schema changes occur at the page-level.

...read moreread less

Proceedings Article•

Schema Matching Based on Attribute Values and Background Ontology

[...]

Abadie Nathalie¹•Institutions (1)

Institut géographique national¹

02 Jun 2009

TL;DR: This paper proposes to use external domain knowledge, namely background ontology, to improve the schema matching strategy, which implies local ontology matching, and presents and discus the results of two schema matching tests based on this strategy.

...read moreread less

Abstract: This paper focuses on the specific problem of geographic database schema matching, as a first step in an integration application. We propose, a schema matching approach based on attributes values and background ontology. We follow the intuition that comparing only schema classes is not sufficient and that there are specific class attributes in geographic database classes whose role consists in specifying the exact nature of each class instance. Their enumerated values refer to geographic concepts. We assume that it is possible to take advantage of this additional knowledge for upgrading the level of granularity of schema classifications by making it explicit in local ontologies created from each database schema that we have to match. Moreover, we propose to use external domain knowledge, namely background ontology, to improve our schema matching strategy, which implies local ontology matching. We lastly present and discus the results of two schema matching tests based on this schema matching strategy.

...read moreread less