
Showing papers on "XML" published in 2008


Journal ArticleDOI
TL;DR: This survey describes and classifies top-k processing techniques in relational databases, including query models, data access methods, implementation levels, data and query certainty, and supported scoring functions, and shows the implications of each dimension on the design of the underlying techniques.
Abstract: Efficient processing of top-k queries is a crucial requirement in many interactive environments that involve massive amounts of data. In particular, efficient top-k processing in domains such as the Web, multimedia search, and distributed systems has shown a great impact on performance. In this survey, we describe and classify top-k processing techniques in relational databases. We discuss different design dimensions in the current techniques including query models, data access methods, implementation levels, data and query certainty, and supported scoring functions. We show the implications of each dimension on the design of the underlying techniques. We also discuss top-k queries in the XML domain, and show their connections to relational approaches.
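To make the flavor of such techniques concrete, here is a minimal Python sketch of threshold-style early termination over ranked lists with a monotonic sum as the aggregation function. It is a generic, textbook-style illustration under assumed inputs (per-source lists sorted by descending score), not code from the survey.

```python
import heapq

def top_k(sorted_lists, k):
    """Threshold-style top-k over per-source lists sorted by descending score.

    sorted_lists: list of [(item_id, score), ...], each sorted by score descending.
    The aggregate is a monotonic sum of per-source scores (missing score = 0.0).
    """
    index = [dict(lst) for lst in sorted_lists]   # random-access lookups
    seen, top = set(), []                         # top: min-heap of (aggregate, item)
    depth = 0
    while True:
        threshold, exhausted = 0.0, True
        for lst in sorted_lists:
            if depth >= len(lst):
                continue
            exhausted = False
            item, score = lst[depth]
            threshold += score                    # best possible aggregate of any unseen item
            if item not in seen:
                seen.add(item)
                agg = sum(ix.get(item, 0.0) for ix in index)
                heapq.heappush(top, (agg, item))
                if len(top) > k:
                    heapq.heappop(top)
        depth += 1
        # Stop once no unseen item can beat the current k-th best aggregate.
        if exhausted or (len(top) == k and top[0][0] >= threshold):
            return sorted(top, reverse=True)

ranked_by_price = [("a", 0.9), ("b", 0.8), ("c", 0.1)]
ranked_by_rating = [("b", 0.7), ("a", 0.6), ("c", 0.5)]
print(top_k([ranked_by_price, ranked_by_rating], k=2))  # 'a' and 'b', both with aggregate 1.5
```

The loop stops as soon as no unseen item can beat the current k-th aggregate, which is the property that lets this family of algorithms avoid scanning the ranked lists to the end.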

893 citations


Book
07 Mar 2008
TL;DR: This chapter discusses modeling with a general-purpose language and with a domain-specific language, and defines the DSM solution as a continuous process in the real world.
Abstract: Foreword. Preface . PART 1: BACKGROUND AND MOTIVATION. 1. Introduction. 1.1 Seeking the better level of abstraction. 1.2 Code-driven and model-driven development. 1.3 An example: modeling with a general-purpose language and with a domain-specific language. 1.4 What is DSM? 1.5 When to use DSM? 1.6 Summary. 2. Business value. 2.1 Productivity. 2.2 Quality. 2.3 Leverage expertise. 2.4 The economics of DSM. 2.5 Summary. PART 2: FUNDAMENTALS. 3. DSM defined. 3.1 DSM characteristics. 3.2 Implications of DSM for users. 3.3 Difference to other modeling approaches. 3.4 Tooling for DSM. 3.5 Summary. 4. Architecture of DSM. 4.1 Introduction. 4.2 Language. 4.3 Models. 4.4 Code generator. 4.5 Domain framework and target environment. 4.6 DSM organization and process. 4.7 Summary. PART 3: DSM EXAMPLES. 5. IP telephony and call processing. 5.1 Introduction and objectives. 5.2 Development process. 5.3 Language for modeling call processing services. 5.4 Modeling IP telephony service. 5.5 Generator for XML. 5.6 Framework support. 5.7 Main results. 5.8 Summary. 6. Insurance products. 6.1 Introduction and objectives. 6.2 Development process. 6.3 Language for modeling insurances. 6.4 Modeling insurance products. 6.5 Generator for Java. 6.6 Framework support. 6.7 Main results. 6.8 Summary. 7. Home Automation. 7.1 Introduction and objectives. 7.2 Development process. 7.3 Home automation modeling language. 7.4 Home automation modeling language in use. 7.5 Generator. 7.6 Main results. 7.7 Summary. 8. Mobile phone applications using Python framework. 8.1 Introduction and objectives. 8.2 Development process. 8.3 Language for application modeling. 8.4 Modeling phone applications. 8.5 Generator for Python. 8.6 Framework support. 8.7 Main results. 8.8 Extending the solution to native S60 C++. 8.9 Summary. 9. Digital Wristwatch. 9.1 Introduction and Objectives. 9.2 Development Process. 9.3 Modeling Language. 9.4 Models. 9.5 Code Generation for Watch Models. 9.6 The Domain Framework. 9.7 Main Results. 9.8 Summary. PART 4: CREATING DSM SOLUTIONS. 10 DSM language definition. 10.1 Introduction and objectives. 10.2 Identifying and defining modeling concepts. 10.3 Formalizing languages with metamodeling. 10.4 Defining language rules. 10.5 Integrating multiple languages. 10.6 Notation for the language. 10.7 Testing the languages. 10.8 Maintaining the languages. 10.9 Summary. 11. Generator definition. 11.1 "Here's one I made earlier". 11.2 Types of generator facilities. 11.3 Generator output patterns. 11.4 Generator structure. 11.5 Process. 11.6 Summary. 12. Domain Framework. 12.1 Removing duplication from generated code. 12.2 Hiding platform details. 12.3 Providing an interface for the generator. 12.4 Summary. 13. DSM definition process. 13.1 Choosing among possible candidate domains. 13.2 Organizing for DSM. 13.3 Proof of concept. 13.4 Defining the DSM solution. 13.5 Pilot project. 13.6 DSM deployment. 13.7 DSM as a continuous process in the real world. 13.8 Summary. 14. Tools for DSM. 14.1 Different approaches to building tool support. 14.2 A Brief History of Tools. 14.3 What is needed in a DSM environment. 14.4 Current tools. 14.5 Summary. 15. DSM in use. 15.1 Model reuse. 15.2 Model sharing and splitting. 15.3 Model versioning. 15.4 Summary. 16. Conclusion. 16.1 No sweat shops--But no Fritz Lang's Metropolis either. 16.2 The onward march of DSM. Appendix A: Metamodeling Language. References. Index.

825 citations


Patent
29 Sep 2008

340 citations


Journal ArticleDOI
TL;DR: This article describes investigations into the creation of sound and complete relevance assessments for the evaluation of content-oriented XML retrieval, as carried out at INEX, the evaluation campaign for XML retrieval.
Abstract: In information retrieval research, comparing retrieval approaches requires test collections consisting of documents, user requests and relevance assessments. Obtaining relevance assessments that are as sound and complete as possible is crucial for the comparison of retrieval approaches. In XML retrieval, the problem of obtaining sound and complete relevance assessments is further complicated by the structural relationships between retrieval results. A major difference between XML retrieval and flat document retrieval is that the relevance of elements (the retrievable units) is not independent of that of related elements. This has major consequences for the gathering of relevance assessments. This article describes investigations into the creation of sound and complete relevance assessments for the evaluation of content-oriented XML retrieval as carried out at INEX, the evaluation campaign for XML retrieval. The campaign, now in its seventh year, has had three substantially different approaches to gather assessments and has finally settled on a highlighting method for marking relevant passages within documents—even though the objective is to collect assessments at element level. The different methods of gathering assessments at INEX are discussed and contrasted. The highlighting method is shown to be the most reliable of the methods.

256 citations


Patent
James Michael Ferris1
26 Nov 2008
TL;DR: In this paper, the authors describe a system and methods for embedding a cloud-based resource request in a specification language wrapper, such as an XML object, which can be transmitted to a marketplace to seek the response of available clouds which can support the application or appliance according to the specifications contained in the specification language wrappers.
Abstract: Embodiments relate to systems and methods for embedding a cloud-based resource request in a specification language wrapper. In embodiments, a set of applications and/or a set of appliances can be registered to be instantiated in a cloud-based network. Each application or appliance can have an associated set of specified resources with which the user wishes to instantiate those objects. For example, a user may specify a maximum latency for input/output of the application or appliance, a geographic location of the supporting cloud resources, a processor throughput, or other resource specification to instantiate the desired object. According to embodiments, the set of requested resources can be embedded in a specification language wrapper, such as an XML object. The specification language wrapper can be transmitted to a marketplace to seek the response of available clouds which can support the application or appliance according to the specifications contained in the specification language wrapper.
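As a rough sketch of the idea, the following Python snippet wraps a resource request in a small XML envelope; the element and attribute names (resourceRequest, maxLatency, geoLocation, processorThroughput) are invented for illustration and are not taken from the patent.

```python
import xml.etree.ElementTree as ET

def wrap_resource_request(app_name, max_latency_ms, region, min_cpu_ghz):
    """Embed a cloud resource request in a simple XML wrapper (names are illustrative)."""
    req = ET.Element("resourceRequest", attrib={"application": app_name})
    ET.SubElement(req, "maxLatency", unit="ms").text = str(max_latency_ms)
    ET.SubElement(req, "geoLocation").text = region
    ET.SubElement(req, "processorThroughput", unit="GHz").text = str(min_cpu_ghz)
    return ET.tostring(req, encoding="unicode")

print(wrap_resource_request("photo-archive", 50, "eu-west", 2.4))
# <resourceRequest application="photo-archive"><maxLatency unit="ms">50</maxLatency>...
```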

203 citations


Journal ArticleDOI
TL;DR: The paper addresses the application of information retrieval technology in a DW to exploit text-rich document collections and introduces the problem of dealing with semi-structured data in a DW.
Abstract: This paper surveys the most relevant research on combining Data Warehouse (DW) and Web data. It studies the XML technologies that are currently being used to integrate, store, query and retrieve web data, and their application to DWs. The paper reviews different DW distributed architectures and the use of XML languages as an integration tool in these systems. It also introduces the problem of dealing with semi-structured data in a DW. It studies Web data repositories, the design of multidimensional databases for XML data sources and the XML extensions of On-Line Analytical Processing techniques. The paper addresses the application of information retrieval technology in a DW to exploit text-rich document collections. The authors hope that the paper will help to discover the main limitations and opportunities that the combination of the DW and Web fields offers, as well as to identify open research lines.

160 citations


Journal ArticleDOI
01 Aug 2008
TL;DR: This paper describes and motivates the AXML model and language, overviews the research results obtained in the course of the project, and shows how all the pieces come together in the implementation.
Abstract: This paper provides an overview of the Active XML project developed at INRIA over the past five years. Active XML (AXML, for short), is a declarative framework that harnesses Web services for distributed data management, and is put to work in a peer-to-peer architecture. The model is based on AXML documents, which are XML documents that may contain embedded calls to Web services, and on AXML services, which are Web services capable of exchanging AXML documents. An AXML peer is a repository of AXML documents that acts both as a client by invoking the embedded service calls, and as a server by providing AXML services, which are generally defined as queries or updates over the persistent AXML documents. The approach gracefully combines stored information with data defined in an intensional manner as well as dynamic information. This simple, rather classical idea leads to a number of technically challenging problems, both theoretical and practical. In this paper, we describe and motivate the AXML model and language, overview the research results obtained in the course of the project, and show how all the pieces come together in our implementation.
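The following Python sketch illustrates the core intuition of documents with embedded service calls: elements in a reserved namespace stand for calls, and materializing the document replaces them with the data the calls return. The namespace, element names, and the in-process "services" are invented stand-ins, not the actual AXML vocabulary or peer implementation.

```python
import xml.etree.ElementTree as ET

# Toy stand-ins for Web services; a real AXML peer would invoke remote services.
SERVICES = {"getTemperature": lambda city: f"{city}: 21 C"}

DOC = """
<newspaper xmlns:sc="urn:example:axml">
  <weather>
    <sc:call service="getTemperature" arg="Paris"/>
  </weather>
</newspaper>
"""

def materialize(xml_text):
    """Replace embedded service-call elements with the data the calls return."""
    ns = "{urn:example:axml}"
    root = ET.fromstring(xml_text)
    # Collect (parent, call) pairs first so the tree is not mutated while iterating.
    calls = [(p, c) for p in root.iter() for c in list(p) if c.tag == ns + "call"]
    for parent, call in calls:
        result = SERVICES[call.get("service")](call.get("arg"))
        parent.remove(call)
        ET.SubElement(parent, "data").text = result
    return ET.tostring(root, encoding="unicode")

print(materialize(DOC))  # the sc:call element is replaced by a <data> element
```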

156 citations


Journal ArticleDOI
01 Aug 2008
TL;DR: This work investigates an axiomatic framework that includes two intuitive and non-trivial properties that an XML keyword search technique should ideally satisfy, monotonicity and consistency with respect to data and query, and proposes a novel semantics for identifying relevant matches that satisfies both properties.
Abstract: Keyword search is a user-friendly mechanism for retrieving XML data in web and scientific applications. An intuitively compelling but vaguely defined goal is to identify matches to query keywords that are relevant to the user. However, it is hard to directly evaluate the relevance of query results due to the inherent ambiguity of search semantics. In this work, we investigate an axiomatic framework that includes two intuitive and non-trivial properties that an XML keyword search technique should ideally satisfy: monotonicity and consistency, with respect to data and query. This is the first work that reasons about keyword search strategies from a formal perspective. Then we propose a novel semantics for identifying relevant matches, which, to the best of our knowledge, is the only existing algorithm that satisfies both properties. An efficient algorithm is designed for realizing this semantics. Extensive experimental studies have verified the intuition of the properties and shown the effectiveness of the proposed algorithm.
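The paper's own match semantics is not reproduced here, but as a point of reference the sketch below computes a widely used baseline for XML keyword search, the smallest lowest common ancestor (SLCA): elements whose subtree contains every query keyword while no child's subtree does.

```python
import xml.etree.ElementTree as ET

DOC = """
<bib>
  <book><title>XML keyword search</title><author>Liu</author></book>
  <book><title>Relational databases</title><author>Chen</author></book>
</bib>
"""

def slca(xml_text, keywords):
    """Return elements whose subtree contains every keyword, but whose children do not."""
    root = ET.fromstring(xml_text)
    keywords = {k.lower() for k in keywords}
    hits = {}   # element -> set of keywords found in its subtree

    def visit(elem):
        found = {k for k in keywords if k in (elem.text or "").lower()}
        for child in elem:
            found |= visit(child)
        hits[elem] = found
        return found

    visit(root)
    return [e for e in hits
            if hits[e] == keywords and not any(hits[c] == keywords for c in e)]

for match in slca(DOC, ["xml", "liu"]):
    print(match.tag)    # -> book (the first one), not the enclosing bib element
```

The point of the baseline is that the answer is the tightest element covering all keywords; the paper's axioms (monotonicity and consistency) are ways of judging whether such a semantics behaves sensibly as data and queries change.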

152 citations


Patent
04 Jun 2008
TL;DR: In this article, an event server running an event driven application implementing an event processing network can be specified by XML that is an extension of SPRING framework XML, and the event processing network can include at least one processor to implement a rule on at least one input stream.
Abstract: An event server running an event driven application implementing an event processing network. The event processing network can include at least one processor to implement a rule on at least one input stream. The event driven application can be specified by XML that is an extension of SPRING framework XML.

147 citations


Patent
22 Oct 2008

136 citations


Journal ArticleDOI
01 Mar 2008
TL;DR: The biomedical informatics research network (BIRN) has developed a federated and distributed infrastructure for the storage, retrieval, analysis, and documentation of biomedical imaging data.
Abstract: The aggregation of imaging, clinical, and behavioral data from multiple independent institutions and researchers presents both a great opportunity for biomedical research and a formidable challenge. Many research groups have well-established data collection and analysis procedures, as well as data and metadata format requirements that are particular to that group. Moreover, the types of data and metadata collected are quite diverse, including image, physiological, and behavioral data, as well as descriptions of experimental design, and preprocessing and analysis methods. Each of these types of data utilizes a variety of software tools for collection, storage, and processing. Furthermore, sites are reluctant to release control over the distribution and access to the data and the tools. To address these needs, the biomedical informatics research network (BIRN) has developed a federated and distributed infrastructure for the storage, retrieval, analysis, and documentation of biomedical imaging data. The infrastructure consists of distributed data collections hosted on dedicated storage and computational resources located at each participating site, a federated data management system and data integration environment, an extensible markup language (XML) schema for data exchange, and analysis pipelines, designed to leverage both the distributed data management environment and the available grid computing resources.

Journal ArticleDOI
TL;DR: A mapping from Workflow Nets (WF-nets) to BPEL is provided, which builds on the rich theory of Petri nets and can also be used to map other languages onto BPEL.
Abstract: The Business Process Execution Language for Web Services (BPEL) has emerged as the de facto standard for implementing processes. Although intended as a language for connecting web services, its application is not limited to cross-organizational processes. It is expected that in the near future a wide variety of process-aware information systems will be realized using BPEL. While being a powerful language, BPEL is difficult to use. Its XML representation is very verbose and only readable for the trained eye. It offers many constructs and typically things can be implemented in many ways, e.g., using links and the flow construct or using sequences and switches. As a result only experienced users are able to select the right construct. Several vendors offer a graphical interface that generates BPEL code. However, the graphical representations are a direct reflection of the BPEL code and not easy to use by end-users. Therefore, we provide a mapping from Workflow Nets (WF-nets) to BPEL. This mapping builds on the rich theory of Petri nets and can also be used to map other languages (e.g., UML, EPC, BPMN, etc.) onto BPEL. In addition to this we have implemented the algorithm in a tool called WorkflowNet2BPEL4WS.

Journal ArticleDOI
TL;DR: This paper explores how organizational and technological factors explain the adoption of e-business functions in 4570 European companies and the migration from EDI-based to XML-based e- business frameworks in 329 European companies.

BookDOI
TL;DR: This volume collects the papers of the INEX 2007 workshop, covering the Ad Hoc, Book Search, XML-Mining, Entity Ranking, Interactive, Link-the-Wiki, and Multimedia tracks.
Abstract: Ad Hoc Track.- Overview of the INEX 2007 Ad Hoc Track.- INEX 2007 Evaluation Measures.- XML Retrieval by Improving Structural Relevance Measures Obtained from Summary Models.- TopX @ INEX 2007.- The Garnata Information Retrieval System at INEX'07.- Dynamic Element Retrieval in the Wikipedia Collection.- The Simplest XML Retrieval Baseline That Could Possibly Work.- Using Language Models and Topic Models for XML Retrieval.- UJM at INEX 2007: Document Model Integrating XML Tags.- Phrase Detection in the Wikipedia.- Indian Statistical Institute at INEX 2007 Adhoc Track: VSM Approach.- A Fast Retrieval Algorithm for Large-Scale XML Data.- LIG at INEX 2007 Ad Hoc Track: Using Collectionlinks as Context.- Book Search Track.- Overview of the INEX 2007 Book Search Track (BookSearch'07).- Logistic Regression and EVIs for XML Books and the Heterogeneous Track.- CMIC at INEX 2007: Book Search Track.- XML-Mining Track.- Clustering XML Documents Using Closed Frequent Subtrees: A Structural Similarity Approach.- Probabilistic Methods for Structured Document Classification at INEX'07.- Efficient Clustering of Structured Documents Using Graph Self-Organizing Maps.- Document Clustering Using Incremental and Pairwise Approaches.- XML Document Classification Using Extended VSM.- Entity Ranking Track.- Overview of the INEX 2007 Entity Ranking Track.- L3S at INEX 2007: Query Expansion for Entity Ranking Using a Highly Accurate Ontology.- Entity Ranking Based on Category Expansion.- Entity Ranking from Annotated Text Collections Using Multitype Topic Models.- An n-Gram and Initial Description Based Approach for Entity Ranking Track.- Structured Document Retrieval, Multimedia Retrieval, and Entity Ranking Using PF/Tijah.- Using Wikipedia Categories and Links in Entity Ranking.- Integrating Document Features for Entity Ranking.- Interactive Track.- A Comparison of Interactive and Ad-Hoc Relevance Assessments.- Task Effects on Interactive Search: The Query Factor.- Link-the-Wiki Track.- Overview of INEX 2007 Link the Wiki Track.- Using and Detecting Links in Wikipedia.- GPX: Ad-Hoc Queries and Automated Link Discovery in the Wikipedia.- University of Waterloo at INEX2007: Adhoc and Link-the-Wiki Tracks.- Wikipedia Ad Hoc Passage Retrieval and Wikipedia Document Linking.- Multimedia Track.- The INEX 2007 Multimedia Track.

Book
01 Jan 2008
TL;DR: This book constitutes the refereed proceedings of the 7th International XML Database Symposium, XSym 2010, held in Singapore, in September 2010, and is organized in topical sections on XML query processing, XML update and applications, and XML modeling.
Abstract: This book constitutes the refereed proceedings of the 7th International XML Database Symposium, XSym 2010, held in Singapore, in September 2010. The 11 papers were carefully reviewed and selected from 20 submissions. The papers are organized in topical sections on XML query processing, XML update and applications, and XML modeling.

Proceedings ArticleDOI
09 Jun 2008
TL;DR: This paper identifies that a good XML result snippet should be a self-contained meaningful information unit of a small size that effectively summarizes this query result and differentiates it from others, according to which users can quickly assess the relevance of the query result.
Abstract: Snippets are used by almost every text search engine to complement ranking schemes in order to effectively handle user searches, which are inherently ambiguous and whose relevance semantics are difficult to assess. Despite the fact that XML is a standard representation format of web data, research on generating result snippets for XML search remains untouched. In this paper we present a system, eXtract, which addresses this important yet open problem. We identify that a good XML result snippet should be a self-contained meaningful information unit of a small size that effectively summarizes this query result and differentiates it from others, according to which users can quickly assess the relevance of the query result. We have designed and implemented a novel algorithm to satisfy these requirements and verified its efficiency and effectiveness through experiments.
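As a very rough sketch of the pruning idea (ignoring the size bound and result differentiation that the actual eXtract system enforces), the following Python code keeps only the branches of a query result that contain a keyword match.

```python
import xml.etree.ElementTree as ET

RESULT = """
<book>
  <title>XML retrieval in practice</title>
  <publisher><name>ACM Press</name><year>2008</year></publisher>
  <chapter><title>Snippets</title><body>Generating result snippets for XML search</body></chapter>
</book>
"""

def snippet(elem, keywords):
    """Copy only the branches of a result element that contain a keyword match."""
    copy = ET.Element(elem.tag, elem.attrib)
    text = (elem.text or "").strip()
    if any(k.lower() in text.lower() for k in keywords):
        copy.text = text
    for child in elem:
        pruned = snippet(child, keywords)
        if pruned.text or len(pruned):      # keep only subtrees with matches
            copy.append(pruned)
    return copy

root = ET.fromstring(RESULT)
print(ET.tostring(snippet(root, ["snippets"]), encoding="unicode"))
# -> <book><chapter><title>Snippets</title><body>Generating result snippets ...</body></chapter></book>
```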

Journal ArticleDOI
TL;DR: A technique is presented that represents the tree structure of an XML document in an efficient way by compressing it, while the functionality of basic tree operations, like traversal along edges, is preserved under the compressed representation.
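One generic way to compress tree structure, which may differ from the scheme in the paper, is to share identical subtrees so the tree becomes a DAG; the Python sketch below does this for the element structure only, ignoring text content.

```python
import xml.etree.ElementTree as ET

DOC = "<a><b><c/><c/></b><b><c/><c/></b></a>"

def compress(elem, pool):
    """Map each distinct element-structure subtree to one shared id (text is ignored)."""
    key = (elem.tag, tuple(compress(child, pool) for child in elem))
    return pool.setdefault(key, len(pool))   # identical subtrees get the same id

root = ET.fromstring(DOC)
pool = {}
compress(root, pool)
print(sum(1 for _ in root.iter()), "tree nodes ->", len(pool), "shared DAG nodes")
# -> 7 tree nodes -> 3 shared DAG nodes (one for <c/>, one for <b><c/><c/></b>, one for the root)
```

Edge traversal remains possible on the compressed form because each shared node still records the ids of its children.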

Book ChapterDOI
01 May 2008
TL;DR: This track overview introduces the track setup, and discusses the implications of the new relevance notion for entity ranking in comparison to ad hoc retrieval.
Abstract: Many realistic user tasks involve the retrieval of specific entities instead of just any type of documents. Examples of information needs include 'Countries where one can pay with the euro' or 'Impressionist art museums in The Netherlands'. The Initiative for Evaluation of XML Retrieval (INEX) started the XML Entity Ranking track (INEX-XER) to create a test collection for entity retrieval in Wikipedia. Entities are assumed to correspond to Wikipedia entries. The goal of the track is to evaluate how well systems can rank entities in response to a query; the set of entities to be ranked is assumed to be loosely defined either by a generic category (entity ranking) or by some example entities (list completion). This track overview introduces the track setup, and discusses the implications of the new relevance notion for entity ranking in comparison to ad hoc retrieval.

Journal ArticleDOI
01 Jan 2008
TL;DR: The main contributions of this paper unfold into four main points: fully implemented models and algorithms for ranked XML retrieval with XPath Full-Text functionality, efficient and effective top-k query processing for semistructured data, support for integrating thesauri and ontologies with statistically quantified relationships among concepts, and a comprehensive description of the TopX system.
Abstract: Recent IR extensions to XML query languages such as XPath 1.0 Full-Text or the NEXI query language of the INEX benchmark series reflect the emerging interest in IR-style ranked retrieval over semistructured data. TopX is a top-k retrieval engine for text and semistructured data. It terminates query execution as soon as it can safely determine the k top-ranked result elements according to a monotonic score aggregation function with respect to a multidimensional query. It efficiently supports vague search on both content- and structure-oriented query conditions for dynamic query relaxation with controllable influence on the result ranking. The main contributions of this paper unfold into four main points: (1) fully implemented models and algorithms for ranked XML retrieval with XPath Full-Text functionality, (2) efficient and effective top-k query processing for semistructured data, (3) support for integrating thesauri and ontologies with statistically quantified relationships among concepts, leveraged for word-sense disambiguation and query expansion, and (4) a comprehensive description of the TopX system, with performance experiments on large-scale corpora like TREC Terabyte and INEX Wikipedia.

Journal ArticleDOI
TL;DR: BP-QL is a novel query language for querying business processes, based on an intuitive model of business processes that abstracts the emerging BPEL (business process execution language) standard.

Proceedings ArticleDOI
07 Apr 2008
TL;DR: Muse is described, a mapping design wizard that uses data examples to assist designers in understanding and refining a schema mapping towards the desired specification.
Abstract: A fundamental problem in information integration is that of designing the relationships, called schema mappings, between two schemas. The specification of a semantically correct schema mapping is typically a complex task. Automated tools can suggest potential mappings, but few tools are available for helping a designer understand mappings and design alternative mappings. We describe Muse, a mapping design wizard that uses data examples to assist designers in understanding and refining a schema mapping towards the desired specification. We present novel algorithms behind Muse and show how Muse systematically guides the designer on two important components of a mapping design: the specification of the desired grouping semantics for sets of data and the choice among alternative interpretations for semantically ambiguous mappings. In every component, Muse infers the desired semantics based on the designer's actions on a short sequence of small examples. Whenever possible, Muse draws examples from a familiar database, thus facilitating the design process even further. We report our experience with Muse on some publicly available schemas.

Book
24 Sep 2008
TL;DR: The Ultimate Guide for Designing and Governing Web Service Contracts For Web services to succeed as part of SOA, they require balanced, effective technical contracts that enable services to be evolved and repeatedly reused for years to come.
Abstract: The Ultimate Guide for Designing and Governing Web Service Contracts. For Web services to succeed as part of SOA, they require balanced, effective technical contracts that enable services to be evolved and repeatedly reused for years to come. Now, a team of industry experts presents the first end-to-end guide to designing and governing Web service contracts. Writing for developers, architects, governance specialists, and other IT professionals, the authors cover the following areas:
Understanding Web Service Contract Technologies. Initial chapters and ongoing supplementary content help even the most inexperienced professional get up to speed on how all of the different technologies and design considerations relate to the creation of Web service contracts. For example, a visual anatomy of a Web service contract documented from logical and physical perspectives is provided, along with a chapter dedicated to describing namespaces in plain English. The book is further equipped with numerous case study examples and many illustrations.
Fundamental and Advanced WSDL. Tutorial coverage of WSDL 1.1 and 2.0 and detailed descriptions of their differences is followed by numerous advanced WSDL topics and design techniques, including extreme loose coupling, modularization options, use of extensibility elements, asynchrony, message dispatch, service instance identification, non-SOAP HTTP binding, and WS-BPEL extensions. Also explained is how WSDL definitions are shaped by key SOA design patterns.
Fundamental and Advanced XML Schema. XML Schema basics are covered within the context of Web services and SOA, after which advanced XML Schema chapters delve into a variety of specialized message design considerations and techniques, including the use of wildcards, reusability of schemas and schema fragments, type inheritance and composition, CRUD-style message design, and combining industry and custom schemas.
Fundamental and Advanced WS-Policy. Topics such as Policy Expression Structure, Composite Policies, Operator Composition Rules, and Policy Attachment establish a foundation upon which more advanced topics, such as policy reusability and centralization and nested, parameterized, and ignorable assertions, are covered, along with an exploration of creating concurrent policy-enabled contracts and designing custom policy assertions and vocabularies.
Fundamental Message Design with SOAP. A broad range of message design-related topics are covered, including SOAP message structures, SOAP nodes and roles, SOAP faults, designing custom SOAP headers and working with industry-standard SOAP headers.
Advanced Message Design with WS-Addressing. The art of message design is taken to a new level with in-depth descriptions of WS-Addressing endpoint references (EPRs) and MAP headers and an exploration of how they are applied via SOA design patterns. Also covered are WSDL binding considerations, related MEP rules, WS-Addressing policy assertions, and detailed coverage of how WS-Addressing relates to SOAP Action values.
Advanced Message Design with MTOM and SwA. Developing SOAP messages capable of transporting large documents or binary content is explored with a documentation of the MTOM packaging and serialization framework (including MTOM-related policy assertions), together with the SOAP with Attachments (SwA) standard and the related WS-I Attachments Profile.
Versioning Techniques and Strategies. Fundamental versioning theory starts off a series of chapters that dive into a variety of versioning techniques based on proven SOA design patterns, including backward and forward compatibility, version identification strategies, service termination, policy versioning, validation by projection, concurrency control, partial understanding, and versioning with and without wildcards.
Web Service Contracts and SOA. The constant focus of this book is on the design and versioning of Web service contracts in support of SOA and service-orientation. Relevant SOA design principles and design patterns are periodically discussed to demonstrate how specific Web service technologies can be applied and further optimized. Furthermore, several of the advanced chapters provide expert techniques for designing Web service contracts while taking SOA governance considerations into account.
About the Web Sites. www.soabooks.com supplements this book with a variety of resources, including a diagram symbol legend, glossary, supplementary articles, and source code available for download. www.soaspecs.com provides further support by establishing a descriptive portal to XML and Web services specifications referenced in all of Erl's Service-Oriented Architecture books.
Foreword. Preface. Chapter 1: Introduction. Chapter 2: Case Study Background.
Part I: Fundamental Service Contract Design. Chapter 3: SOA Fundamentals and Web Service Contracts. Chapter 4: Anatomy of a Web Service Contract. Chapter 5: A Plain English Guide to Namespaces. Chapter 6: Fundamental XML Schema: Types and Message Structure Basics. Chapter 7: Fundamental WSDL Part I: Abstract Description Design. Chapter 8: Fundamental WSDL Part II: Concrete Description Design. Chapter 9: Fundamental WSDL 2.0: New Features and Design Options. Chapter 10: Fundamental WS-Policy: Expression, Assertion, and Attachment. Chapter 11: Fundamental Message Design: SOAP Envelope Structure and Header Block Processing.
Part II: Advanced Service Contract Design. Chapter 12: Advanced XML Schema Part I: Message Flexibility and Type Inheritance and Composition. Chapter 13: Advanced XML Schema Part II: Reusability, Derived Types, and Relational Design. Chapter 14: Advanced WSDL Part I: Modularization, Extensibility, MEPs, and Asynchrony. Chapter 15: Advanced WSDL Part II: Message Dispatch, Service Instance Identification, and Non-SOAP HTTP Binding. Chapter 16: Advanced WS-Policy Part I: Policy Centralization and Nested, Parameterized, and Ignorable Assertions. Chapter 17: Advanced WS-Policy Part II: Custom Policy Assertion Design, Runtime Representation, and Compatibility. Chapter 18: Advanced Message Design Part I: WS-Addressing Vocabularies. Chapter 19: Advanced Message Design Part II: WS-Addressing Rules and Design Techniques.
Part III: Service Contract Versioning. Chapter 20: Versioning Fundamentals. Chapter 21: Versioning WSDL Definitions. Chapter 22: Versioning Message Schemas. Chapter 23: Advanced Versioning.
Part IV: Appendices. Appendix A: Case Study Conclusion. Appendix B: A Comparison of Web Services and REST Services. Appendix C: How Technology Standards are Developed. Appendix D: Alphabetical Pseudo Schema Reference. Appendix E: SOA Design Patterns Related to This Book.
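Since much of the book revolves around XML Schema as part of a service contract, a tiny validation example may help ground the idea; this sketch uses the third-party lxml library and a made-up order schema, and is not drawn from the book.

```python
# Requires the third-party lxml package (pip install lxml).
from lxml import etree

XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="id" type="xs:int"/>
        <xs:element name="quantity" type="xs:positiveInteger"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = etree.XMLSchema(etree.fromstring(XSD))

good = etree.fromstring(b"<order><id>42</id><quantity>3</quantity></order>")
bad = etree.fromstring(b"<order><id>42</id><quantity>0</quantity></order>")

print(schema.validate(good))   # True
print(schema.validate(bad))    # False: 0 is not a positiveInteger
```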

Journal ArticleDOI
01 Aug 2008
TL;DR: This paper proposes a data model for tracking historical information in an XML document and for recovering the state of the document as of any given time, and introduces a new class of summaries, denoted TSummary, that adds the time dimension to the well-known path summarization schemes.
Abstract: In this paper we address the problem of modeling and implementing temporal data in XML. We propose a data model for tracking historical information in an XML document and for recovering the state of the document as of any given time. We study the temporal constraints imposed by the data model, and present algorithms for validating a temporal XML document against these constraints, along with methods for fixing inconsistent documents. In addition, we discuss different ways of mapping the abstract representation into a temporal XML document, and introduce TXPath, a temporal XML query language that extends XPath 2.0. In the second part of the paper, we present our approach for summarizing and indexing temporal XML documents. In particular we show that by indexing continuous paths, i.e., paths that are valid continuously during a certain interval in a temporal XML graph, we can dramatically increase query performance. To achieve this, we introduce a new class of summaries, denoted TSummary, that adds the time dimension to the well-known path summarization schemes. Within this framework, we present two new summaries: LCP and Interval summaries. The indexing scheme, denoted TempIndex, integrates these summaries with additional data structures. We give a query processing strategy based on TempIndex and a type of ancestor-descendant encoding, denoted temporal interval encoding. We present a persistent implementation of TempIndex, and a comparison against a system based on a non-temporal path index, and one based on DOM. Finally, we sketch a language for updates, and show that the cost of updating the index is compatible with real-world requirements.
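A simple way to picture the temporal model, though not the paper's exact representation, is to attach validity intervals to elements and recover a snapshot of the document as of a given time; the Python sketch below uses invented start/end attributes with an exclusive end.

```python
import xml.etree.ElementTree as ET

# Validity intervals as attributes; 'end' is exclusive, 9999 stands for "still current".
DOC = """
<employee start="2000" end="9999">
  <salary start="2000" end="2005">40000</salary>
  <salary start="2005" end="9999">55000</salary>
</employee>
"""

def snapshot(elem, t):
    """Return a copy of the element containing only nodes valid at time t."""
    start, end = int(elem.get("start", 0)), int(elem.get("end", 9999))
    if not (start <= t < end):
        return None
    copy = ET.Element(elem.tag)
    copy.text = elem.text
    for child in elem:
        kept = snapshot(child, t)
        if kept is not None:
            copy.append(kept)
    return copy

root = ET.fromstring(DOC)
print(ET.tostring(snapshot(root, 2003), encoding="unicode"))
# -> an employee element containing only the salary (40000) that was valid in 2003
```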

Journal ArticleDOI
01 Nov 2008
TL;DR: A proposal for the implementation of the model management operator ModelGen, which translates schemas from one model to another, for example from object-oriented to SQL or from SQL to XML schema descriptions, is discussed.
Abstract: We discuss a proposal for the implementation of the model management operator ModelGen, which translates schemas from one model to another, for example from object-oriented to SQL or from SQL to XML schema descriptions. The operator can be used to generate database wrappers (e.g., object-oriented or XML to relational), default user interfaces (e.g., relational to forms), or default database schemas from other representations. The approach translates schemas from one model to another, within a predefined, but large and extensible, set of models: given a source schema S expressed in a source model, and a target model TM, it generates a schema S′ expressed in TM that is "equivalent" to S. A wide family of models is handled by using a metamodel in which models can be succinctly and precisely described. The approach expresses the translation as Datalog rules and exposes the source and target of the translation in a generic relational dictionary. This makes the translation transparent, easy to customize and model-independent. The proposal includes automatic generation of translations as composition of basic steps.

Proceedings ArticleDOI
18 Aug 2008
TL;DR: This paper presents recent advances in an established treebank annotation framework comprising an abstract XML-based data format, a fully customizable editor of tree-based annotations, a toolkit for all kinds of automated data processing with support for cluster computing, and a work-in-progress database-driven search engine with a graphical user interface built into the tree editor.
Abstract: This paper presents recent advances in an established treebank annotation framework comprising an abstract XML-based data format, a fully customizable editor of tree-based annotations, a toolkit for all kinds of automated data processing with support for cluster computing, and a work-in-progress database-driven search engine with a graphical user interface built into the tree editor.


Journal ArticleDOI
TL;DR: The first meta-service for information extraction in molecular biology, the BioCreative MetaServer (BCMS), provides automatically generated annotations for PubMed/Medline abstracts and is intended to be used by biomedical researchers and database annotators, and in biomedical language processing.
Abstract: We introduce the first meta-service for information extraction in molecular biology, the BioCreative MetaServer (BCMS; http://bcms.bioinfo.cnio.es/). This prototype platform is a joint effort of 13 research groups and provides automatically generated annotations for PubMed/Medline abstracts. Annotation types cover gene names, gene IDs, species, and protein-protein interactions. The annotations are distributed by the meta-server in both human and machine readable formats (HTML/XML). This service is intended to be used by biomedical researchers and database annotators, and in biomedical language processing. The platform allows direct comparison, unified access, and result aggregation of the annotations.

Patent
12 Nov 2008
TL;DR: A computer system and method for integrating legacy insurance policy underwriting are described, integrating an older legacy policy generating system with on-line rating systems where users access the system through browsers.
Abstract: The invention relates generally to a computer system and method for integrating insurance policy underwriting. In one aspect, it integrates an older legacy insurance policy generating system to on-line rating systems where users access the system through browsers. The computer system to perform the process of dynamically rating includes generating an input XML file of risk information that is sent to a web service and calculated in a calculation engine. The processed data is retrieved by the web service and transmitted as an XML file to a user interface that parses the rating information and displays the data.
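A minimal Python sketch of this flow might look as follows; the endpoint URL, field names, and the premium element in the reply are all hypothetical and not prescribed by the patent.

```python
import urllib.request
import xml.etree.ElementTree as ET

def build_risk_xml(driver_age, vehicle_year, zip_code):
    """Build an input XML file of risk information (field names are illustrative)."""
    risk = ET.Element("riskInformation")
    ET.SubElement(risk, "driverAge").text = str(driver_age)
    ET.SubElement(risk, "vehicleYear").text = str(vehicle_year)
    ET.SubElement(risk, "zipCode").text = zip_code
    return ET.tostring(risk)

def request_rating(payload, url="https://rating.example.com/rate"):  # hypothetical endpoint
    """POST the risk XML to the rating web service and parse the premium from the reply."""
    req = urllib.request.Request(url, data=payload,
                                 headers={"Content-Type": "application/xml"})
    with urllib.request.urlopen(req) as resp:
        reply = ET.fromstring(resp.read())
    return reply.findtext("premium")

payload = build_risk_xml(34, 2006, "10001")
# print(request_rating(payload))   # would contact the (hypothetical) rating service
```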

Proceedings ArticleDOI
10 May 2008
TL;DR: This paper develops an algorithm to construct XRGs and a novel family of data flow testing criteria to test WS-BPEL applications and proposes a data structure called XPath rewriting graph (XRG), which not only models how an XPath is conceptually rewritten but also tracks individual rewritings progressively.
Abstract: WS-BPEL applications are a kind of service-oriented application. They use XPath extensively to integrate loosely-coupled workflow steps. However, XPath may extract wrong data from the XML messages received, resulting in erroneous results in the integrated process. Surprisingly, although XPath plays a key role in workflow integration, inadequate research has been conducted to address the important issues in software testing. This paper tackles the problem. It also demonstrates a novel transformation strategy to construct artifacts. We use the mathematical definitions of XPath constructs as rewriting rules, and propose a data structure called XPath rewriting graph (XRG), which not only models how an XPath is conceptually rewritten but also tracks individual rewritings progressively. We treat the mathematical variables in the applied rewriting rules as if they were program variables, and use them to analyze how information may be rewritten in an XPath conceptually. We thus develop an algorithm to construct XRGs and a novel family of data flow testing criteria to test WS-BPEL applications. Experiment results show that our testing approach is promising.
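To see the kind of fault the proposed testing criteria target, consider how a subtly wrong XPath silently extracts the wrong data from a received message; the sketch below (using the third-party lxml library for full XPath 1.0 support) is illustrative only and does not reproduce the paper's XRG construction.

```python
# Requires the third-party lxml package for full XPath 1.0 support.
from lxml import etree

MESSAGE = etree.fromstring(
    b"<order>"
    b"<item><price currency='EUR'>99</price></item>"
    b"<item><price currency='USD'>10</price></item>"
    b"</order>")

# Intended extraction: the USD price among the order's items.
right = MESSAGE.xpath("//item/price[@currency='USD']/text()")
# Faulty variant: the predicate is missing, so whichever price comes first is taken.
wrong = MESSAGE.xpath("//item/price/text()")[:1]

print(right)  # ['10']
print(wrong)  # ['99'] -- wrong data flows silently into the rest of the process
```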