Institution

INESC-ID

Nonprofit, Lisbon, Portugal
About: INESC-ID is a nonprofit organization based in Lisbon, Portugal. It is known for research contributions on the topics of Computer science and Context (language use). The organization has 932 authors who have published 2618 publications receiving 37658 citations.


Papers
Journal Article
01 Apr 2008
TL;DR: A regular expression pattern matching approach for reconfigurable hardware that is 10–20× more efficient than previous Field Programmable Gate Array approaches, and the generated designs have comparable area requirements to current application-specific integrated circuit solutions.
Abstract: In this paper we describe a regular expression pattern matching approach for reconfigurable hardware. Following a Non-deterministic Finite Automata direction, we introduce three new basic building blocks to support constraint repetitions syntaxes more efficiently than previous works. In addition, a number of optimization techniques are employed to reduce the area cost of the designs and maximize performance. Our design methodology is supported by a tool that automatically generates the circuitry for the given regular expressions and outputs Hardware Description Language representations ready for logic synthesis. The proposed approach is evaluated on network Intrusion Detection Systems (IDS). Recent IDS use regular expressions to represent hazardous packet payload contents. They require high-speed packet processing providing a challenging case study for pattern matching using regular expressions. We use a number of IDS rulesets to show that our approach scales well as the number of regular expressions increases, and present a step-by-step optimization to survey the benefits of our techniques. The synthesis tool described in this study is used to generate hardware engines to match 300 to 1,500 IDS regular expressions using only 10–45 K logic cells and achieving throughput of 1.6–2.2 and 2.4–3.2 Gbps on Virtex2 and Virtex4 devices, respectively. Concerning the throughput per area required per matching non-Meta character, our hardware engines are 10–20× more efficient than previous Field Programmable Gate Array approaches. Furthermore, the generated designs have comparable area requirements to current application-specific integrated circuit solutions.

86 citations

Journal Article
TL;DR: MIAPPE 1.1 marks a major step towards enabling plant phenotyping data reusability, thanks to its extended coverage, and especially the formalisation of its data model, which facilitates its implementation in different formats.
Abstract: Enabling data reuse and knowledge discovery is increasingly critical in modern science, and requires an effort towards standardising data publication practices. This is particularly challenging in the plant phenotyping domain, due to its complexity and heterogeneity. We have produced the MIAPPE 1.1 release, which enhances the existing MIAPPE standard in coverage, to support perennial plants, in structure, through an explicit data model, and in clarity, through definitions and examples. We evaluated MIAPPE 1.1 by using it to express several heterogeneous phenotyping experiments in a range of different formats, to demonstrate its applicability and the interoperability between the various implementations. Furthermore, the extended coverage is demonstrated by the fact that one of the datasets could not have been described under MIAPPE 1.0. MIAPPE 1.1 marks a major step towards enabling plant phenotyping data reusability, thanks to its extended coverage, and especially the formalisation of its data model, which facilitates its implementation in different formats. Community feedback has been critical to this development, and will be a key part of ensuring adoption of the standard.

85 citations

Journal Article
TL;DR: The results showed significant differences in the performance of the algorithms evaluated, and the newly proposed jitter measure, LocJitt, generally performed as well as or better than the commonly used tools MDVP and Praat.
Abstract: This work is focused on the evaluation of different methods to estimate the amount of jitter present in speech signals. The jitter value is a measure of the irregularity of a quasiperiodic signal and is a good indicator of the presence of pathologies in the larynx such as vocal fold nodules or a vocal fold polyp. Given the irregular nature of the speech signal, each jitter estimation algorithm relies on its own model, making a direct comparison of the results very difficult. For this reason, the evaluation of the different jitter estimation methods targeted their ability to detect pathological voices. Two databases were used for this evaluation: a subset of the MEEI database and a smaller database acquired in the scope of this work. The results showed that there were significant differences in the performance of the algorithms being evaluated. Surprisingly, in the largest database the best results were not achieved with the commonly used relative jitter, measured as a percentage of the glottal cycle, but with absolute jitter values measured in microseconds. Also, the newly proposed measure for jitter, LocJitt, generally performed as well as or better than the commonly used tools MDVP and Praat.

84 citations

Proceedings Article
19 Apr 2009
TL;DR: This paper describes experiments with SVM and HMM-based classifiers, using a 290-hour corpus of sound effects, and reports promising results, despite the difficulties posed by the mixtures of audio events that characterize real sounds.
Abstract: Audio event detection is one of the tasks of the European project VIDIVIDEO. This paper focuses on the detection of non-speech events, and as such only searches for events in audio segments that have been previously classified as non-speech. Preliminary experiments with a small corpus of sound effects have shown the potential of this type of corpus for training purposes. This paper describes our experiments with SVM and HMM-based classifiers, using a 290-hour corpus of sound effects. Although we have only built detectors for 15 semantic concepts so far, the method seems easily portable to other concepts. The paper reports experiments with multiple features, different kernels and several analysis windows. Preliminary experiments on documentaries and films yielded promising results, despite the difficulties posed by the mixtures of audio events that characterize real sounds.

84 citations

Book Chapter
17 Apr 2012
TL;DR: A methodology for automatically enlarging a Portuguese sentiment lexicon for mining social judgments from text, i.e., detecting opinions on human entities, produces results at least as good as the best that have been reported for this task.
Abstract: We present a methodology for automatically enlarging a Portuguese sentiment lexicon for mining social judgments from text, i.e., detecting opinions on human entities. Starting from publicly available language resources, the identification of human adjectives is performed through the combination of a linguistic-based strategy, for extracting human adjective candidates from corpora, and machine learning for filtering the human adjectives from the candidate list. We then create a graph of the synonymic relations among the human adjectives, which is built from multiple open thesauri. The graph provides distance features for training a model for polarity assignment. Our initial evaluation shows that this method produces results at least as good as the best that have been reported for this task.

81 citations


Authors

Showing all 967 results

Name                   H-index   Papers   Citations
João Carvalho          126       1278     77017
Jaime G. Carbonell     72        496      31267
Chris Dyer             71        240      32739
Joao P. S. Catalao     68        1039     19348
Muhammad Bilal         63        720      14720
Alan W. Black          61        413      19215
João Paulo Teixeira    60        636      19663
Bhiksha Raj            51        359      13064
Joao Marques-Silva     48        289      9374
Paulo Flores           48        321      7617
Ana Paiva              47        472      9626
Miadreza Shafie-khah   47        450      8086
Susana Cardoso         44        400      7068
Mark J. Bentum         42        226      8347
Joaquim Jorge          41        290      6366

Network Information
Related Institutions (5)
Carnegie Mellon University
104.3K papers, 5.9M citations

88% related

Eindhoven University of Technology
52.9K papers, 1.5M citations

88% related

Microsoft
86.9K papers, 4.1M citations

88% related

Vienna University of Technology
49.3K papers, 1.3M citations

86% related

Performance
Metrics
No. of papers from the Institution in previous years
Year   Papers
2023   11
2022   52
2021   96
2020   131
2019   133
2018   126