Institution

INESC-ID

Nonprofit, Lisbon, Portugal
About: INESC-ID is a nonprofit organization based in Lisbon, Portugal. It is known for research contributions in the topics Computer science and Context (language use). The organization has 932 authors who have published 2618 publications receiving 37658 citations.


Papers
Proceedings Article
25 Aug 2013
TL;DR: This work addresses the privacy problem in a speaker verification system that uses a factor-analysis-based front-end extractor, the so-called i-vectors, by hashing the i-vectors with a scheme known as Secure Binary Embeddings.
Abstract: Remote speaker verification services typically rely on the system having access to the users' recordings, or features derived from them, and also to a model of each user's voice. This conventional scheme raises several privacy concerns. In this work, we address this privacy problem in the context of a speaker verification system using a factor-analysis-based front-end extractor, the so-called i-vectors. Speaker verification without exposing speaker data is achieved by transforming speaker i-vectors into bit strings in a way that allows the computation of approximate distances instead of exact ones. The transformation relies on a hashing scheme known as Secure Binary Embeddings. A modified SVM kernel then permits operating on the i-vector hashes. Experiments on subsets of NIST SRE 2008 showed that the secure system yielded results similar to those of its non-private counterpart.
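The hashing step described in this abstract can be illustrated with a minimal sketch. It assumes the commonly described form of Secure Binary Embeddings, in which each bit is the parity of a dithered, quantized random projection; the abstract itself does not spell out the construction. The names `make_sbe` and `hamming`, and all parameter values, are hypothetical and chosen only for illustration. The modified SVM kernel is not shown.

```python
import math
import random


def make_sbe(dim, n_bits, delta, seed=0):
    """Build a hashing function from real vectors to bit lists.

    Assumed form (not taken from the abstract):
        bit_i = floor((a_i . x + w_i) / delta) mod 2
    with a_i drawn from a standard Gaussian and w_i uniform in [0, delta).
    The projection rows and dither values act as the secret key.
    """
    rng = random.Random(seed)
    rows = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    dither = [rng.uniform(0.0, delta) for _ in range(n_bits)]

    def embed(x):
        bits = []
        for row, w in zip(rows, dither):
            projection = sum(a * xi for a, xi in zip(row, x)) + w
            # Parity of the quantization cell index gives one hash bit.
            bits.append(math.floor(projection / delta) % 2)
        return bits

    return embed


def hamming(u, v):
    """Normalized Hamming distance between two equal-length bit lists."""
    return sum(a != b for a, b in zip(u, v)) / len(u)
```

Under these assumptions, the normalized Hamming distance between two hashes grows with the Euclidean distance of nearby vectors and levels off around 0.5 for distant ones, so only approximate distances between close vectors can be read from the hashes.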

12 citations

Book Chapter
Rui Amaral, Isabel Trancoso
26 Jun 2003
TL;DR: A topic segmentation and indexation system for TV broadcast news programs spoken in European Portuguese to enhance the retrieval of specific spoken documents that have been automatically transcribed, using speech recognition.
Abstract: This paper describes a topic segmentation and indexation system for TV broadcast news programs spoken in European Portuguese. The system is integrated in an alert system for selective dissemination of multimedia information developed in the scope of a European project. The goal of this work is to enhance the retrieval of specific spoken documents that have been automatically transcribed using speech recognition. Our segmentation algorithm is based on simple heuristics related to anchor detection. The indexation is based on hierarchical concept trees (thesaurus), containing 22 main thematic domains, for which Hidden Markov models and topic language models were created. Ongoing experiments related to multiple topic indexing are also described, where a confidence measure based on the likelihood ratio test is used as the hypothesis test.

12 citations

Book Chapter
11 Oct 2006
TL;DR: A new compressed self-index that locates the occurrences of P in $O((m + occ)\log n)$ time, where occ is the number of occurrences and σ is the size of the alphabet of T; a comparison against the LZ-Index, the FM-index and a compressed suffix array shows it to be very competitive in practice.
Abstract: A compressed full-text self-index for a text T, of size u, is a data structure used to search patterns P, of size m, in T that requires reduced space, i.e. that depends on the empirical entropy ($H_k$, $H_0$) of T, and is, furthermore, able to reproduce any substring of T. In this paper we present a new compressed self-index able to locate the occurrences of P in $O((m + occ)\log n)$ time, where occ is the number of occurrences and σ the size of the alphabet of T. The fundamental improvement over previous LZ78 based indexes is the reduction of the search time dependency on m from $O(m^2)$ to $O(m)$. To achieve this result we point out the main obstacle to linear time algorithms based on LZ78 data compression and expose and explore the nature of a recurrent structure in LZ-indexes, the $\mathcal{T}_{78}$ suffix tree. We show that our method is very competitive in practice by comparing it against the LZ-Index, the FM-index and a compressed suffix array.
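For readers unfamiliar with the LZ78 parsing that such indexes build on, the following is a minimal sketch of the standard LZ78 phrase decomposition and its inverse. It is not the index proposed in the paper; the function names `lz78_parse` and `lz78_decode` are hypothetical.

```python
def lz78_parse(text):
    """Split text into LZ78 phrases.

    Each phrase is a pair (prefix_id, next_char): the id of the longest
    previously seen phrase that matches, followed by one new character.
    Id 0 denotes the empty phrase.
    """
    trie = {}      # (phrase_id, char) -> id of the extended phrase
    phrases = []
    node = 0       # current position in the trie, 0 is the root
    for ch in text:
        child = trie.get((node, ch))
        if child is not None:
            node = child                      # keep extending the match
        else:
            phrases.append((node, ch))        # emit a new phrase
            trie[(node, ch)] = len(phrases)   # ids start at 1
            node = 0
    if node != 0:
        # The text ended inside a known phrase: emit it with no new character.
        phrases.append((node, ""))
    return phrases


def lz78_decode(phrases):
    """Rebuild the original text from the list of LZ78 phrases."""
    expanded = [""]  # expanded[0] is the empty phrase
    for prefix_id, ch in phrases:
        expanded.append(expanded[prefix_id] + ch)
    return "".join(expanded[1:])
```

Because every phrase extends an earlier phrase by one character, the phrases form a trie, which is the kind of structure that LZ78 based indexes organize and search.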

11 citations

Proceedings Article
08 Mar 2010
TL;DR: This work proposes a new FPGA structure based on a low-power quaternary voltage-mode device; the most important characteristics of the proposed architecture are the reduced fanout, low number of wires and switches, and the small wire length.
Abstract: FPGA structures are widely used due to early time-to-market and reduced non-recurring engineering costs in comparison to ASIC designs. Interconnections play a crucial role in modern FPGAs, because they dominate delay, power and area. Multiple-valued logic allows the reduction of the number of signals in the circuit, and hence can serve as a means to effectively curtail the impact of interconnections. In this work we propose a new FPGA structure based on a low-power quaternary voltage-mode device. The most important characteristics of the proposed architecture are the reduced fanout, low number of wires and switches, and the small wire length. We use a set of FIR filters as a demonstrator of the benefits of the quaternary representation in FPGAs. Results show a significant reduction in power consumption with small timing penalties.
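The claim that multiple-valued logic reduces the number of signals can be made concrete with a small arithmetic sketch: a quaternary signal carries one of four levels, so it encodes two bits, and an n-bit word needs only ceil(n/2) quaternary signals. The function name `to_quaternary` is hypothetical, and the sketch says nothing about the device or architecture proposed in the paper.

```python
def to_quaternary(value, n_bits):
    """Encode an unsigned n_bits-wide integer as base-4 digits.

    Digits are returned least significant first. Each digit stands for one
    four-level signal, so the result has ceil(n_bits / 2) entries.
    """
    if value < 0 or value >= (1 << n_bits):
        raise ValueError("value does not fit in n_bits")
    n_digits = (n_bits + 1) // 2
    # Each base-4 digit is a pair of adjacent bits.
    return [(value >> (2 * i)) & 0b11 for i in range(n_digits)]
```

For example, an 8-bit word that would occupy eight binary wires maps to four quaternary digits under this encoding.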

11 citations

Proceedings Article
08 Oct 2012
TL;DR: This work presents a MapReduce runtime that tolerates arbitrary faults and runs in a set of clouds at a reasonable cost in terms of computation and execution time.
Abstract: MapReduce is a framework for processing large data sets that is widely used in cloud computing. MapReduce implementations like Hadoop can tolerate crashes and file corruptions, but there is evidence that general arbitrary faults do occur and can affect the correctness of job executions. Furthermore, many individual cloud outages have been reported, raising concerns about depending on a single cloud. We present a MapReduce runtime that tolerates arbitrary faults and runs in a set of clouds at a reasonable cost in terms of computation and execution time. The main challenge is to avoid sending over the internet the huge amount of data that would normally be exchanged between map and reduce tasks.

11 citations


Authors

Showing all 967 results

Name                      H-index   Papers   Citations
João Carvalho             126       1278     77017
Jaime G. Carbonell        72        496      31267
Chris Dyer                71        240      32739
Joao P. S. Catalao        68        1039     19348
Muhammad Bilal            63        720      14720
Alan W. Black             61        413      19215
João Paulo Teixeira       60        636      19663
Bhiksha Raj               51        359      13064
Joao Marques-Silva        48        289      9374
Paulo Flores              48        321      7617
Ana Paiva                 47        472      9626
Miadreza Shafie-khah      47        450      8086
Susana Cardoso            44        400      7068
Mark J. Bentum            42        226      8347
Joaquim Jorge             41        290      6366
Network Information
Related Institutions (5)
Carnegie Mellon University
104.3K papers, 5.9M citations

88% related

Eindhoven University of Technology
52.9K papers, 1.5M citations

88% related

Microsoft
86.9K papers, 4.1M citations

88% related

Vienna University of Technology
49.3K papers, 1.3M citations

86% related

Performance
Metrics
No. of papers from the Institution in previous years
Year   Papers
2023   11
2022   52
2021   96
2020   131
2019   133
2018   126