Showing papers on "Simple API for XML" published in 2018

Open Access

Journal Article•DOI•

Machine learning techniques for XML (co-)clustering by structure-constrained phrases

Gianni Costa¹, Riccardo Ortale¹•Institutions (1)

ICAR-CNR (Institute for High Performance Computing and Networking, National Research Council of Italy)¹

01 Feb 2018

TL;DR: Experiments over real-world benchmark XML corpora show that the effectiveness of the three approaches improves with contextualized n-grams of suitable length, which confirms the validity of the devised method from multiple clustering perspectives.

Abstract: A new method is proposed for clustering XML documents by structure-constrained phrases. It is implemented by three machine-learning approaches previously unexplored in the XML domain, namely non-negative matrix (tri-)factorization, co-clustering and automatic transactional clustering. A novel class of XML features approximately captures structure-constrained phrases as n-grams contextualized by root-to-leaf paths. Experiments over real-world benchmark XML corpora show that the effectiveness of the three approaches improves with contextualized n-grams of suitable length. This confirms the validity of the devised method from multiple clustering perspectives. Two of the approaches outperform several state-of-the-art competitors. The scalability of the three approaches is also investigated.
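
The idea of n-grams contextualized by root-to-leaf paths can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenization, the choice to read text only from leaf elements, and the function name `contextualized_ngrams` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def contextualized_ngrams(xml_text, n=2):
    """Return (root-to-leaf path, n-gram) pairs for every leaf element.

    Assumption: a "leaf" is an element without child elements, and its
    text is tokenized by lowercasing and splitting on whitespace.
    """
    root = ET.fromstring(xml_text)
    features = []

    def walk(elem, path):
        path = path + [elem.tag]
        children = list(elem)
        if not children:
            tokens = (elem.text or "").lower().split()
            # Slide a window of n tokens over the leaf's text.
            for i in range(len(tokens) - n + 1):
                features.append(("/".join(path), tuple(tokens[i:i + n])))
        for child in children:
            walk(child, path)

    walk(root, [])
    return features


# Hypothetical toy document, not taken from the benchmark corpora.
doc = (
    "<paper><title>clustering xml documents</title>"
    "<body><sec>xml documents</sec></body></paper>"
)
features = contextualized_ngrams(doc, n=2)
```

In this sketch the same bigram ("xml", "documents") yields two distinct features because it occurs under two different paths, which is the sense in which the structure constrains the phrase.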

8 citations

Proceedings Article•

UNIFICATION OF XML DOCUMENT STRUCTURES FOR DOCUMENT WAREHOUSE (DocW)

Ines Ben Messaoud¹, Jamel Feki¹, Kaïs Khrouf¹, Gilles Zurfluh²•Institutions (2)

University of Sfax¹, University of Toulouse²

09 Aug 2018

TL;DR: This method consists of two steps: i) unification of XML document structures in order to set a global and generic perception/view of the distributed document warehouse, and ii) multidimensional modeling of unified documents for decisional purposes.

Abstract: Data warehouses and OLAP (Online Analytical Processing) technologies analyse huge amounts of structured data that companies store in conventional databases. Recent work underlines the importance of textual data for the decision-making process and has therefore led to the building of document warehouses. Documents help decision makers to better understand the evolution of their business activities. In general, these documents exist in XML format, are geographically distributed, and are described by multiple, different structures. This paper deals with a method to build a distributed document warehouse. The method consists of two steps: i) unification of XML document structures in order to set a global and generic perception/view of the distributed document warehouse, and ii) multidimensional modeling of unified documents for decisional purposes. More specifically, this paper focuses on the unification step.

6 citations

Journal Article•DOI•

Evaluating Queries and Updates on Big XML Documents

Nicole Bidoit¹, Dario Colazzo², Noor Malla, Carlo Sartiani•Institutions (2)

University of Paris-Sud¹, Paris Dauphine University²

01 Feb 2018-Information Systems Frontiers

TL;DR: This paper presents Andromeda, a system for processing queries and updates on large XML documents that statically and dynamically partitions the input document so as to distribute the computing load among the machines of a MapReduce cluster.

Abstract: In this paper we present Andromeda, a system for processing queries and updates on large XML documents. The system is based on the idea of statically and dynamically partitioning the input document, so as to distribute the computing load among the machines of a MapReduce cluster.

5 citations

Patent•

Method for quickly positioning and processing XML tag

Wang Changsheng, Li Xindong

21 Aug 2018

TL;DR: This patent presents a method for quickly positioning and processing an XML tag. Compared with a traditional SAX parser, it saves a lot of parsing time, handles XML nodes conveniently and flexibly, and keeps the program easy to maintain and extend.

Abstract: The invention discloses a method for quickly positioning and processing an XML tag. The quick positioning operation for an XML node comprises the creation and arrangement of a document context processor, and the quick processing operation comprises high-efficiency matching of an XML element node, high-efficiency acquisition of an XML element's attributes, and an immediate-exit mechanism after processing. With this method, a to-be-processed XML node can be located quickly and processed, and the whole XML parse completes quickly. Compared with a traditional XML SAX parser, a lot of parsing time is saved (here, parsing time means the time spent before a node is located plus the idle time spent after the node is processed), XML nodes can be handled conveniently and flexibly, and the program is easy to maintain and extend.
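
The "immediate exit after processing" idea can be sketched with Python's standard `xml.sax` module. This is not the patented method, only a common way to stop a SAX parse early: the handler raises an exception once the target element has been handled. The names `StopParsing`, `FirstMatchHandler` and `find_first` are hypothetical.

```python
import xml.sax


class StopParsing(Exception):
    """Raised by the handler to abort the parse once the work is done."""


class FirstMatchHandler(xml.sax.ContentHandler):
    def __init__(self, target_tag):
        super().__init__()
        self.target_tag = target_tag
        self.attributes = None   # attributes of the first matching element
        self.elements_seen = 0   # start tags visited before stopping

    def startElement(self, name, attrs):
        self.elements_seen += 1
        if name == self.target_tag:
            # Copy the attributes, then exit without reading the rest.
            self.attributes = dict(attrs.items())
            raise StopParsing()


def find_first(xml_bytes, target_tag):
    handler = FirstMatchHandler(target_tag)
    try:
        xml.sax.parseString(xml_bytes, handler)
    except StopParsing:
        pass  # expected: the target was found and processed
    return handler


# Hypothetical input document.
doc = b"<root><a id='1'/><b id='2' kind='x'/><c/><d/></root>"
result = find_first(doc, "b")
```

In this sketch, elements after the first match (`c` and `d` above) are never delivered to the handler, which corresponds to the idle time after processing that the abstract says is saved.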

1 citation

Book Chapter•DOI•

OntoGen Based Ontology Concepts Generation from Graph

Abid Saeed¹, Muhammad Kamran¹, Shoaib Saleem Khan¹, Rao Muhammad Kamran¹•Institutions (1)

Islamia University¹

23 Oct 2018

TL;DR: This paper describes a unique approach to generating a graph-based ontology. The proposed tool consists of four phases and, once improved, produces more accurate results.

Abstract: This paper describes a unique approach to generating a graph-based ontology, in which the ontology is created from a text graph. Many other tools are available for creating ontologies, but each has its own method and a complex structure for generating them. Ontologies are popular in many fields today and have become a necessary part of the World Wide Web. The proposed tool consists of four phases, each with its own purpose. First, the text graph is entered in Notepad and processed in Java (Eclipse). Second, the output of the first step is converted into an XML file. Third, the XML file is parsed with a DOM or SAX parser. Finally, the XML file is converted into an RDF file, which is validated with an online RDF parser. As future work, the RDF file is to be converted into RDFS, from which the ontology is finally created. After this, the proposed tool works better and produces more accurate results.
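
The third and fourth phases (parse the XML file, then emit RDF) can be sketched as follows. The abstract does not give the XML layout or the RDF vocabulary, so everything specific here is an assumption: the `<edge source=… label=… target=…/>` format, the `http://example.org/` namespace, and the function name `edges_to_ntriples`. The sketch uses Python's DOM parser and writes N-Triples rather than RDF/XML for brevity.

```python
from xml.dom import minidom

# Hypothetical namespace for the generated resources.
BASE = "http://example.org/"


def edges_to_ntriples(xml_text):
    """Parse a graph stored as XML with a DOM parser and emit N-Triples.

    Assumption: each edge of the text graph is an <edge> element with
    'source', 'label' and 'target' attributes.
    """
    dom = minidom.parseString(xml_text)
    lines = []
    for edge in dom.getElementsByTagName("edge"):
        subject = edge.getAttribute("source")
        predicate = edge.getAttribute("label")
        obj = edge.getAttribute("target")
        lines.append(f"<{BASE}{subject}> <{BASE}{predicate}> <{BASE}{obj}> .")
    return lines


# Hypothetical graph document.
graph_xml = (
    "<graph>"
    "<edge source='Dog' label='isA' target='Animal'/>"
    "<edge source='Dog' label='has' target='Tail'/>"
    "</graph>"
)
triples = edges_to_ntriples(graph_xml)
```

A DOM parser is used here because the whole graph is small enough to hold in memory; for large files a SAX handler that emits one triple per `edge` start tag would be the streaming alternative.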

1 citation