Institution
INESC-ID
Nonprofit • Lisbon, Portugal
About: INESC-ID is a nonprofit organization based in Lisbon, Portugal. It is known for its research contributions in the topics of Computer science and Context (language use). The organization has 932 authors who have published 2618 publications, receiving 37658 citations.
Topics: Computer science, Context (language use), Field-programmable gate array, Control theory, Adaptive control
Papers published on a yearly basis
Papers
TL;DR: A new approach for corpus-based speech enhancement is presented that significantly improves over a method published by Xiao and Nickel in 2010, employing a Gaussian mixture model instead of a vector quantizer in the phoneme recognition front-end.
Abstract: We present a new approach for corpus-based speech enhancement that significantly improves over a method published by Xiao and Nickel in 2010. Corpus-based enhancement systems do not merely filter an incoming noisy signal, but resynthesize its speech content via an inventory of pre-recorded clean signals. The goal of the procedure is to perceptually improve the sound of speech signals in background noise. The proposed new method modifies Xiao's method in four significant ways. Firstly, it employs a Gaussian mixture model (GMM) instead of a vector quantizer in the phoneme recognition front-end. Secondly, the state decoding of the recognition stage is supported with an uncertainty modeling technique. With the GMM and the uncertainty modeling it is possible to eliminate the need for noise-dependent system training. Thirdly, the post-processing of the original method via sinusoidal modeling is replaced with a powerful cepstral smoothing operation. And lastly, due to the improvements of these modifications, it is possible to extend the operational bandwidth of the procedure from 4 kHz to 8 kHz. The performance of the proposed method was evaluated across different noise types and different signal-to-noise ratios. The new method was able to significantly outperform traditional methods, including the one by Xiao and Nickel, in terms of PESQ scores and other objective quality measures. Results of subjective CMOS tests over a smaller set of test samples support our claims.
18 citations
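The first of the four modifications above, the GMM phoneme recognition front-end, can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' implementation: the feature frames are synthetic stand-ins for real cepstral features, and the phoneme labels, dimensions, and model sizes are arbitrary.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic training frames for two illustrative phoneme classes
# (a real system would use cepstral features of recorded speech).
train = {
    "aa": rng.normal(loc=0.0, scale=1.0, size=(500, 13)),
    "sh": rng.normal(loc=3.0, scale=1.0, size=(500, 13)),
}

# One GMM per phoneme class forms the recognition front-end.
models = {
    ph: GaussianMixture(n_components=4, random_state=0).fit(X)
    for ph, X in train.items()
}

def classify_frame(frame):
    """Return the phoneme whose GMM assigns the highest log-likelihood."""
    scores = {ph: gmm.score(frame.reshape(1, -1)) for ph, gmm in models.items()}
    return max(scores, key=scores.get)

print(classify_frame(np.zeros(13)))
print(classify_frame(np.full(13, 3.0)))
```

A real front-end would decode phoneme sequences over time rather than classify isolated frames, which is where the abstract's uncertainty modeling of the state decoding enters.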
01 Jan 2010
TL;DR: This chapter studies different approaches that have been proposed for XML fuzzy duplicate detection, and shows that the DogmatiX system is the most effective overall, as it yields the highest recall and precision values for various kinds of differences between duplicates.
Abstract: Fuzzy duplicate detection aims at identifying multiple representations of real-world objects in a data source, and is a task of critical relevance in data cleaning, data mining, and data integration tasks. It has a long history for relational data, stored in a single table or in multiple tables with an equal schema. However, algorithms for fuzzy duplicate detection in more complex structures, such as hierarchies of a data warehouse, XML data, or graph data, have only recently emerged. These algorithms use similarity measures that consider the duplicate status of their direct neighbors to improve duplicate detection effectiveness. In this chapter, we study different approaches that have been proposed for XML fuzzy duplicate detection. Our study includes a description and analysis of the different approaches, as well as a comparative experimental evaluation performed on both artificial and real-world data. The two main dimensions used for comparison are the methods' effectiveness and efficiency. Our comparison shows that the DogmatiX system [44] is the most effective overall, as it yields the highest recall and precision values for various kinds of differences between duplicates. Another system, called XMLDup [27], has a similar performance, being especially effective at low recall values. Finally, the SXNM system [36] is the most efficient, as it avoids executing too many pairwise comparisons, but its effectiveness is greatly affected by errors in the data.
18 citations
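As a deliberately naive illustration of the idea these systems share, a pair of XML nodes scored by its own textual similarity combined with its children's similarities, consider the following sketch. It is none of the surveyed systems (DogmatiX, XMLDup, SXNM); the node representation, weights, and string measure are all assumptions.

```python
from difflib import SequenceMatcher

def text_sim(a, b):
    """Crude string similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()

def node_sim(n1, n2, alpha=0.6):
    """Similarity of (text, [children]) trees: own text plus best-matched children."""
    own = text_sim(n1[0], n2[0])
    c1, c2 = n1[1], n2[1]
    if not c1 or not c2:
        return own
    # Greedy best match per child, averaged; real systems do this far more carefully.
    child = sum(max(node_sim(a, b) for b in c2) for a in c1) / len(c1)
    return alpha * own + (1 - alpha) * child

# Two slightly different representations of the same person.
a = ("John Smith", [("Lisboa", []), ("1980", [])])
b = ("Jon Smith", [("Lisbon", []), ("1980", [])])
print(round(node_sim(a, b), 2))
```

The recursion is what makes the measure neighbor-aware: a match between child elements raises the score of the parent pair even when the parents' own text differs.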
09 Dec 2019
TL;DR: It is shown that Pando can provide throughput improvements compared to a single personal device on a variety of compute-bound applications, including animation rendering and image processing, and the flexibility of the approach is shown by deploying Pando on personal devices connected over a local network.
Abstract: The large penetration and continued growth in ownership of personal electronic devices represents a freely available and largely untapped source of computing power. To leverage those, we present Pando, a new volunteer computing tool based on a declarative concurrent programming model and implemented using JavaScript, WebRTC, and WebSockets. This tool enables a dynamically varying number of failure-prone personal devices contributed by volunteers to parallelize the application of a function on a stream of values, by using the devices' browsers. We show that Pando can provide throughput improvements compared to a single personal device, on a variety of compute-bound applications including animation rendering and image processing. We also show the flexibility of our approach by deploying Pando on personal devices connected over a local network, on Grid5000, a French-wide computing grid in a virtual private network, and seven PlanetLab nodes distributed in a wide area network over Europe.
18 citations
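Pando itself is implemented in JavaScript over WebRTC and WebSockets; purely as an illustration of its declarative model, applying one function to a stream of values on a dynamically varying set of failure-prone workers, here is a toy Python sketch in which items from failed workers are simply re-queued:

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

def flaky_square(x):
    """Stand-in for a compute-bound task on a volunteer that may leave mid-job."""
    if random.random() < 0.3:
        raise RuntimeError("volunteer left")
    return x * x

def volunteer_map(func, values, workers=4):
    """Apply func to every value, retrying values whose worker failed."""
    vals = list(values)
    results, pending = {}, vals[:]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            futures = {pool.submit(func, v): v for v in pending}
            pending = []
            for fut in as_completed(futures):
                v = futures[fut]
                try:
                    results[v] = fut.result()
                except RuntimeError:
                    pending.append(v)  # churn: hand the value to another worker
    return [results[v] for v in vals]

print(volunteer_map(flaky_square, range(8)))
```

Because results are keyed by input value, stragglers and retries do not disturb the output order, mirroring the declarative "map over a stream" semantics described above.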
03 Aug 2013
TL;DR: Thorough automatic evaluation shows that the new centrality-based relevance model for automatic summarization achieves state-of-the-art performance, both in written text and in automatically transcribed speech summarization, even when compared to considerably more complex approaches.
Abstract: In automatic summarization, centrality-as-relevance means that the most important content of an information source, or of a collection of information sources, corresponds to the most central passages, considering a representation where such notion makes sense (graph, spatial, etc.). We assess the main paradigms and introduce a new centrality-based relevance model for automatic summarization that relies on the use of support sets to better estimate the relevant content. Geometric proximity is used to compute semantic relatedness. Centrality (relevance) is determined by considering the whole input source (and not only local information), and by taking into account the existence of minor topics or lateral subjects in the information sources to be summarized. The method consists in creating, for each passage of the input source, a support set consisting only of the most semantically related passages. Then, the determination of the most relevant content is achieved by selecting the passages that occur in the largest number of support sets. This model produces extractive summaries that are generic, and language- and domain-independent. Thorough automatic evaluation shows that the method achieves state-of-the-art performance, both in written text, and automatically transcribed speech summarization, even when compared to considerably more complex approaches.
18 citations
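The support-set procedure described above is simple to state; the following sketch illustrates it with a crude bag-of-words cosine standing in for the paper's semantic-relatedness measure, and with illustrative values for the support-set size `k` and the summary length.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    num = sum(a[w] * b[w] for w in a)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def summarize(passages, k=2, size=1):
    """Select the passages that occur in the largest number of support sets."""
    vecs = [Counter(p.lower().split()) for p in passages]
    counts = Counter()
    for i, vi in enumerate(vecs):
        # Support set of passage i: its k most semantically related passages.
        sims = sorted(((cosine(vi, vj), j) for j, vj in enumerate(vecs) if j != i),
                      reverse=True)
        for _, j in sims[:k]:
            counts[j] += 1
    best = sorted(j for j, _ in counts.most_common(size))
    return [passages[j] for j in best]

passages = [
    "the festival attracts visitors to lisbon",
    "lisbon hosts a festival for visitors",
    "visitors enjoy the lisbon festival",
    "quantum chips use qubits",
]
print(summarize(passages))
```

Note how the off-topic passage earns a place in no support set, which is how the method tolerates minor or lateral topics in the input.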
TL;DR: An overview is provided of the design, deployment, and utilization of the CitySDK Tourism API, which aims to provide access to information about Points of Interest, Events, and Itineraries.
Abstract: Tourism is a major social and cultural activity with relevant economic impact. In an effort to promote their attractions with tourists, some cities have adopted the open-data model, publishing touristic data for programmers to use in their own applications. Unfortunately, each city publishes touristic information in its own way.
18 citations
Authors
Showing all 967 results
| Name | H-index | Papers | Citations |
| --- | --- | --- | --- |
| João Carvalho | 126 | 1278 | 77017 |
| Jaime G. Carbonell | 72 | 496 | 31267 |
| Chris Dyer | 71 | 240 | 32739 |
| Joao P. S. Catalao | 68 | 1039 | 19348 |
| Muhammad Bilal | 63 | 720 | 14720 |
| Alan W. Black | 61 | 413 | 19215 |
| João Paulo Teixeira | 60 | 636 | 19663 |
| Bhiksha Raj | 51 | 359 | 13064 |
| Joao Marques-Silva | 48 | 289 | 9374 |
| Paulo Flores | 48 | 321 | 7617 |
| Ana Paiva | 47 | 472 | 9626 |
| Miadreza Shafie-khah | 47 | 450 | 8086 |
| Susana Cardoso | 44 | 400 | 7068 |
| Mark J. Bentum | 42 | 226 | 8347 |
| Joaquim Jorge | 41 | 290 | 6366 |