Author

Ángeles Saavedra Places

Bio: Ángeles Saavedra Places is an academic researcher from the University of A Coruña. The author has contributed to research on topics including Web applications and Geographic Information Systems. The author has an h-index of 10 and has co-authored 62 publications receiving 368 citations.


Papers
Journal ArticleDOI
TL;DR: This article introduces a different kind of index that replaces the text using essentially the same space required by the compressed text alone (compression ratio around 35%).
Abstract: The inverted index supports efficient full-text searches on natural language text collections. It requires some extra space over the compressed text that can be traded for search speed. It is usually fast for single-word searches, yet phrase searches require more expensive intersections. In this article we introduce a different kind of index. It replaces the text using essentially the same space required by the compressed text alone (compression ratio around 35%). Within this space it supports not only decompression of arbitrary passages, but efficient word and phrase searches. Searches are orders of magnitude faster than those over inverted indexes when looking for phrases, and still faster on single-word searches when little space is available. Our new indexes are particularly fast at counting the occurrences of words or phrases. This is useful for computing the relevance of words or phrases. We adapt self-indexes that succeeded in indexing arbitrary strings within compressed space to deal with large alphabets. Natural language texts are then regarded as sequences of words, not characters, to achieve word-based self-indexes. We design an architecture that separates the searchable sequence from its presentation aspects. This permits applying case folding, stemming, stopword removal, etc., as is usual on inverted indexes.

105 citations
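The architecture described above keeps the searchable word sequence separate from presentation aspects such as case and stopwords. A minimal sketch of that word-based view, with a toy tokenizer and an invented stopword list (neither taken from the paper):

```python
import re

STOPWORDS = {"the", "a", "of", "and"}  # toy list, for illustration only

def to_word_sequence(text):
    """Case-fold, drop stopwords, and map each word to an integer ID,
    so the searchable sequence is a string over a large word alphabet."""
    words = [w.lower() for w in re.findall(r"\w+", text)]
    words = [w for w in words if w not in STOPWORDS]
    vocab, seq = {}, []
    for w in words:
        seq.append(vocab.setdefault(w, len(vocab)))
    return vocab, seq

vocab, seq = to_word_sequence("The rose is a rose is a rose")
print(vocab)  # {'rose': 0, 'is': 1}
print(seq)    # [0, 1, 0, 1, 0]
```

A word-based self-index is then built over `seq` rather than over raw characters; the paper's indexes additionally store this sequence in compressed form, which the sketch does not attempt.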

Journal ArticleDOI
TL;DR: The results show that the sustainability–quality model integrated with the architectural decision maps can be used to identify sustainability-quality requirements as design concerns because most of its quality attributes have been either addressed in the software project or acknowledged as relevant.
Abstract: In recent years, software engineering researchers have defined sustainability as a quality requirement of software, but not enough effort has been devoted to developing new methods and techniques to support the analysis and assessment of software sustainability. In this study, we present the Sustainability Assessment Framework (SAF), which consists of two instruments: the software sustainability–quality model and the architectural decision map. We then use participatory and technical action research, in close collaboration with the software industry, to validate the SAF regarding its applicability in specific cases. The unit of analysis of our study is a family of software products (Geographic Information System- and Mobile-based Workforce Management Systems) that aim to address sustainability goals (e.g., efficient collection of dead animals to mitigate social and environmental sustainability risks). The results show that the sustainability–quality model integrated with the architectural decision maps can be used to identify sustainability–quality requirements as design concerns, because most of its quality attributes (QAs) have been either addressed in the software project or acknowledged as relevant (i.e., creating awareness of the multidimensional sustainability nature of certain QAs). Moreover, the action–research method has been helpful in enriching the sustainability–quality model by identifying missing QAs (e.g., regulation compliance, data privacy). Finally, the architectural decision maps have proved useful in guiding software architects and designers in their decision-making process.

21 citations

Journal ArticleDOI
TL;DR: This paper proposes a new index structure that combines an inverted index with a spatial index based on an ontology of geographic space, and describes the architecture of a geographic information retrieval system that defines a workflow for extracting the geographic references contained in documents.
Abstract: Both Geographic Information Systems and Information Retrieval have been very active research fields in the last decades. Lately, a new research field called Geographic Information Retrieval has emerged from the intersection of these two fields. The main goal of this field is to define index structures and techniques to efficiently store and retrieve documents using both the text and the geographic references contained within the text. We present in this paper two contributions to this research field. First, we propose a new index structure that combines an inverted index and a spatial index based on an ontology of geographic space. This structure improves the query capabilities of other proposals. Then, we describe the architecture of a system for geographic information retrieval that defines a workflow for the extraction of the geographic references in documents. The architecture also uses the index structure that we propose to solve pure spatial and textual queries as well as hybrid queries that combine both a textual and a spatial component. Furthermore, query expansion can be performed on geographic references because the index structure is based on an ontology.

21 citations
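To make the combination of an inverted index with an ontology-based spatial index concrete, here is a minimal sketch; the place hierarchy, documents, and query function are invented for illustration and are not the paper's actual structures:

```python
from collections import defaultdict

# Toy ontology of geographic space: each place lists its sub-places.
ONTOLOGY = {"Spain": ["Galicia"], "Galicia": ["A Coruna"], "A Coruna": []}

def descendants(place):
    """All places at or below `place` in the ontology."""
    out = {place}
    for child in ONTOLOGY.get(place, []):
        out |= descendants(child)
    return out

docs = {
    1: ("storm hits harbour", "A Coruna"),
    2: ("new museum opens", "Galicia"),
    3: ("storm warning issued", "Spain"),
}

inverted = defaultdict(set)   # word  -> doc ids
spatial = defaultdict(set)    # place -> doc ids
for doc_id, (text, place) in docs.items():
    for word in text.split():
        inverted[word].add(doc_id)
    spatial[place].add(doc_id)

def hybrid_query(word, place):
    """Docs containing `word` that are located in `place` or any
    sub-place: the ontology drives the spatial query expansion."""
    in_area = set().union(*(spatial[p] for p in descendants(place)))
    return inverted[word] & in_area

print(hybrid_query("storm", "Galicia"))  # {1}; doc 3 lies outside Galicia
```

The role of the ontology is visible in `descendants`: a query for Galicia automatically covers A Coruña without the documents being re-indexed.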

Book ChapterDOI
10 Nov 2008
TL;DR: This paper shows that the self-index requires space very close to that of the best word-based compressors, and that it obtains better search time than inverted indexes when searching for phrases.
Abstract: Self-indexing is a concept developed for indexing arbitrary strings. It has been enormously successful in reducing the size of the large indexes typically used on strings, namely suffix trees and arrays. Self-indexes represent a string in space close to its compressed size and provide indexed searching on it. On natural language, a compressed inverted index over the compressed text already provides a reasonable alternative, in space and time, for indexed searching of words and phrases. In this paper we explore the possibility of regarding natural language text as a string of words and applying a self-index to it. There are several challenges involved, such as dealing with a very large alphabet and detaching searchable content from non-searchable presentation aspects in the text. As a result, we show that the self-index requires space very close to that of the best word-based compressors, and that it obtains better search time than inverted indexes (using the same overall space) when searching for phrases.

20 citations
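To see why treating the text as a string of words enables indexed phrase search, consider this minimal sketch using a plain suffix array over word IDs; it ignores compression entirely, which is the hard part the paper's self-index actually solves:

```python
import bisect

def build_suffix_array(seq):
    """Naive suffix array over a sequence of word IDs."""
    return sorted(range(len(seq)), key=lambda i: seq[i:])

def count_phrase(seq, sa, phrase):
    """Count occurrences of `phrase` (a tuple of word IDs): all matching
    suffixes form one contiguous range of the suffix array."""
    prefixes = [tuple(seq[i:i + len(phrase)]) for i in sa]
    lo = bisect.bisect_left(prefixes, tuple(phrase))
    hi = bisect.bisect_right(prefixes, tuple(phrase))
    return hi - lo

seq = [0, 1, 0, 1, 0]                 # e.g. "rose is rose is rose" as IDs
sa = build_suffix_array(seq)
print(count_phrase(seq, sa, (0, 1)))  # 2 occurrences of the phrase "rose is"
```

A phrase over words is a single contiguous pattern here, whereas an inverted index would have to intersect the posting lists of the individual words.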

Journal ArticleDOI
TL;DR: A model-driven procedure for recovering business processes from legacy information systems is proposed; it defines a set of models at different abstraction levels, along with the model transformations between them, and provides a supporting tool that facilitates its adoption.
Abstract: Business processes have become one of the key assets of organizations, since these processes allow them to discover and control what occurs in their environments, with information systems automating most of an organization's processes. Unfortunately, and as a result of uncontrolled maintenance, information systems age over time until it is necessary to replace them with new and modernized systems. However, while systems are aging, meaningful business knowledge that is not present in any of the organization's other assets gradually becomes embedded in them. The preservation of this knowledge through the recovery of the underlying business processes is, therefore, a critical problem. This paper provides, as a solution to the aforementioned problem, a model-driven procedure for recovering business processes from legacy information systems. The procedure proposes a set of models at different abstraction levels, along with the model transformations between them. The paper also provides a supporting tool, which facilitates its adoption. Moreover, a real-life case study concerning an e-government system applies the proposed recovery procedure to validate its effectiveness and efficiency. The case study was carried out by following a formal protocol to improve its rigor and replicability.

18 citations
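As a rough illustration of the model-driven idea, a transformation can walk a lower-level model and emit elements of a higher-level process model. The element names and the one-task-per-element rule below are invented for illustration; they are not the paper's actual metamodels or transformations:

```python
# Hypothetical "code-level" model: program units annotated with the
# business task they implement.
code_model = [
    {"function": "validate_form", "task": "Receive request"},
    {"function": "check_permit", "task": "Check eligibility"},
    {"function": "send_letter", "task": "Notify citizen"},
]

def to_process_model(code_model):
    """Naive model-to-model transformation: each code element becomes a
    process task, linked by sequential control flow."""
    tasks = [{"id": i, "name": e["task"]} for i, e in enumerate(code_model)]
    flows = [(i, i + 1) for i in range(len(tasks) - 1)]
    return {"tasks": tasks, "flows": flows}

print(to_process_model(code_model))
```

A real recovery procedure chains several such transformations across abstraction levels rather than performing this single step.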


Cited by

Journal Article
TL;DR: This paper is a comparative study of requirements handling in Web methodologies, showing trends in the use of techniques for capturing, specifying, and validating Web requirements.
Abstract: The requirements engineering discipline has become increasingly important in recent years. Tasks such as requirements elicitation, the specification of requirements, and requirements validation are essential to assure the quality of the resulting software. The development of Web systems usually involves more heterogeneous stakeholders than the construction of traditional software. In addition, Web systems have additional requirements for the navigational and multimedia aspects, as well as for usability, since no user training is possible. Therefore, a thorough requirements analysis is even more relevant. In contrast, most of the methodologies that have been proposed for the development of Web applications focus on design, paying less attention to requirements engineering. This paper is a comparative study of requirements handling in Web methodologies, showing trends in the use of techniques for capturing, specifying, and validating Web requirements.

183 citations

Journal ArticleDOI
TL;DR: The Pizza&Chili site is introduced, which offers tuned implementations and a standardized API for the most successful compressed full-text self-indexes, together with effective test-beds and scripts for their automatic validation and testing.
Abstract: A compressed full-text self-index represents a text in a compressed form and still answers queries efficiently. This represents a significant advancement over the (full-)text indexing techniques of the previous decade, whose indexes required several times the size of the text. Although it is relatively new, this algorithmic technology has matured to the point where theoretical research is giving way to practical developments. Nonetheless this requires significant programming skills, a deep engineering effort, and a strong algorithmic background to dig into the research results. To date only isolated implementations and focused comparisons of compressed indexes have been reported, and they missed a common API, which prevented their re-use or deployment within other applications. The goal of this article is to fill this gap. First, we present the existing implementations of compressed indexes from a practitioner's point of view. Second, we introduce the Pizza&Chili site, which offers tuned implementations and a standardized API for the most successful compressed full-text self-indexes, together with effective test-beds and scripts for their automatic validation and testing. Third, we show the results of our extensive experiments on these codes with the aim of demonstrating the practical relevance of this novel algorithmic technology.

175 citations
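The value of a common API is that different self-indexes become interchangeable for test-beds and applications. The following is a hypothetical Python rendering of such an interface; the actual Pizza&Chili API is in C, and all names here are illustrative only:

```python
from abc import ABC, abstractmethod

class SelfIndex(ABC):
    """Hypothetical common interface: every index supports counting,
    locating, and extracting, so test scripts can drive any of them."""

    @abstractmethod
    def count(self, pattern: str) -> int:
        """Number of occurrences of `pattern` in the indexed text."""

    @abstractmethod
    def locate(self, pattern: str) -> list:
        """Positions of all occurrences of `pattern`."""

    @abstractmethod
    def extract(self, start: int, length: int) -> str:
        """Return text[start:start+length] from the index alone."""

class NaiveIndex(SelfIndex):
    """Trivial reference implementation that stores the text verbatim
    (a real self-index would keep it in compressed form)."""
    def __init__(self, text):
        self.text = text
    def count(self, pattern):
        return len(self.locate(pattern))
    def locate(self, pattern):
        return [i for i in range(len(self.text) - len(pattern) + 1)
                if self.text.startswith(pattern, i)]
    def extract(self, start, length):
        return self.text[start:start + length]

idx = NaiveIndex("abracadabra")
print(idx.count("abra"), idx.locate("abra"), idx.extract(0, 4))
# 2 [0, 7] abra
```

Any compressed index implementing the same three operations could be swapped in behind the interface without changing the test scripts.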

Journal ArticleDOI
TL;DR: A detailed analytical mapping of OMSA research work is presented, and the progress of the discipline on various useful parameters is charted.
Abstract: The new transformed read-write Web has resulted in a rapid growth of user-generated content on the Web, producing a huge volume of unstructured data. A substantial part of this data is unstructured text such as reviews and blogs. Opinion mining and sentiment analysis (OMSA) has emerged as a research discipline during the last 15 years and provides a methodology to computationally process unstructured data, mainly to extract opinions and identify their sentiments. The relatively new but fast-growing research discipline has changed a lot during these years. This paper presents a scientometric analysis of research work done on OMSA during 2000–2016. For the scientometric mapping, research publications indexed in the Web of Science (WoS) database are used as input data. The publication data is analyzed computationally to identify the year-wise publication pattern, rate of growth of publications, types of authorship of papers on OMSA, collaboration patterns in publications on OMSA, most productive countries, institutions, journals and authors, citation patterns and a year-wise citation reference network, and theme density plots and keyword bursts in OMSA publications during the period. A somewhat detailed manual analysis of the data is also performed to identify popular approaches (machine learning and lexicon-based) used in these publications, the levels (document, sentence, or aspect level) at which sentiment analysis work was done, and the major application areas of OMSA. The paper presents a detailed analytical mapping of OMSA research work and charts the progress of the discipline on various useful parameters.

157 citations
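A minimal sketch of the kind of year-wise computation such a scientometric mapping performs; the records below are fabricated placeholders, not WoS data:

```python
from collections import Counter

# Placeholder publication records; a real analysis would parse WoS exports.
records = [
    {"year": 2014}, {"year": 2015}, {"year": 2015},
    {"year": 2016}, {"year": 2016}, {"year": 2016},
]

by_year = Counter(r["year"] for r in records)   # year-wise publication pattern
years = sorted(by_year)
for prev, cur in zip(years, years[1:]):        # year-over-year growth rate
    growth = (by_year[cur] - by_year[prev]) / by_year[prev] * 100
    print(f"{cur}: {by_year[cur]} papers ({growth:+.0f}% vs {prev})")
# 2015: 2 papers (+100% vs 2014)
# 2016: 3 papers (+50% vs 2015)
```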

Journal Article
TL;DR: This paper shows how to build a data structure based on the Ziv-Lempel trie that takes 4n log₂ n (1 + o(1)) bits of space and reports the R occurrences of a pattern of length m in worst-case time O(m² log(mσ) + (m + R) log n).
Abstract: Let a text of u characters over an alphabet of size σ be compressible to n symbols by the LZ78 or LZW algorithm. We show that it is possible to build a data structure based on the Ziv-Lempel trie that takes 4n log₂ n (1 + o(1)) bits of space and reports the R occurrences of a pattern of length m in worst-case time O(m² log(mσ) + (m + R) log n).

132 citations
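A minimal sketch of the LZ78 parsing that underlies such a trie: each step extends the longest previously seen phrase by one character, and every new phrase becomes a trie node. The rank/select machinery that achieves the stated 4n log₂ n (1 + o(1))-bit space bound is far beyond this illustration:

```python
def lz78_parse(text):
    """LZ78 parse: returns each new phrase as (parent_phrase_id, char),
    where id 0 is the empty root phrase. A trailing phrase that merely
    repeats an existing one is ignored here for brevity."""
    trie = {}                 # (node_id, char) -> node_id
    phrases = []
    node, next_id = 0, 1
    for ch in text:
        if (node, ch) in trie:
            node = trie[(node, ch)]      # keep extending the current phrase
        else:
            trie[(node, ch)] = next_id   # new phrase = known phrase + ch
            phrases.append((node, ch))
            node, next_id = 0, next_id + 1
    return phrases

print(lz78_parse("abababa"))
# [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'a')]  i.e. phrases a, b, ab, aba
```

The number of phrases produced here is the n that appears in the space and time bounds quoted above.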