Institution
Idiap Research Institute
Facility•Martigny-Combe, Switzerland•
About: Idiap Research Institute is a facility organization based out in Martigny-Combe, Switzerland. It is known for research contribution in the topics: Hidden Markov model & Speaker recognition. The organization has 434 authors who have published 1915 publications receiving 58976 citations. The organization is also known as: Institut d'intelligence artificielle perceptive.
Topics: Hidden Markov model, Speaker recognition, Speaker diarisation, Speech processing, Word error rate
Papers published on a yearly basis
Papers
More filters
•
01 Jan 2011
TL;DR: The design of Kaldi is described, a free, open-source toolkit for speech recognition research that provides a speech recognition system based on finite-state automata together with detailed documentation and a comprehensive set of scripts for building complete recognition systems.
Abstract: We describe the design of Kaldi, a free, open-source toolkit for speech recognition research. Kaldi provides a speech recognition system based on finite-state automata (using the freely available OpenFst), together with detailed documentation and a comprehensive set of scripts for building complete recognition systems. Kaldi is written is C++, and the core library supports modeling of arbitrary phonetic-context sizes, acoustic modeling with subspace Gaussian mixture models (SGMM) as well as standard Gaussian mixture models, together with all commonly used linear and affine transforms. Kaldi is released under the Apache License v2.0, which is highly nonrestrictive, making it suitable for a wide community of users.
5,857 citations
••
TL;DR: This paper shows that reformulating that step as a constrained flow optimization results in a convex problem and takes advantage of its particular structure to solve it using the k-shortest paths algorithm, which is very fast.
Abstract: Multi-object tracking can be achieved by detecting objects in individual frames and then linking detections across frames. Such an approach can be made very robust to the occasional detection failure: If an object is not detected in a frame but is in previous and following ones, a correct trajectory will nevertheless be produced. By contrast, a false-positive detection in a few frames will be ignored. However, when dealing with a multiple target problem, the linking step results in a difficult optimization problem in the space of all possible families of trajectories. This is usually dealt with by sampling or greedy search based on variants of Dynamic Programming which can easily miss the global optimum. In this paper, we show that reformulating that step as a constrained flow optimization results in a convex problem. We take advantage of its particular structure to solve it using the k-shortest paths algorithm, which is very fast. This new approach is far simpler formally and algorithmically than existing techniques and lets us demonstrate excellent performance in two very different contexts.
1,076 citations
••
19 Jun 2006
TL;DR: The third BCI Competition to address several of the most difficult and important analysis problems in BCI research is organized and the paper describes the data sets that were provided to the competitors and gives an overview of the results.
Abstract: A brain-computer interface (BCI) is a system that allows its users to control external devices with brain activity. Although the proof-of-concept was given decades ago, the reliable translation of user intent into device control commands is still a major challenge. Success requires the effective interaction of two adaptive controllers: the user's brain, which produces brain activity that encodes intent, and the BCI system, which translates that activity into device control commands. In order to facilitate this interaction, many laboratories are exploring a variety of signal analysis techniques to improve the adaptation of the BCI system to the user. In the literature, many machine learning and pattern classification algorithms have been reported to give impressive results when applied to BCI data in offline analyses. However, it is more difficult to evaluate their relative value for actual online use. BCI data competitions have been organized to provide objective formal evaluations of alternative methods. Prompted by the great interest in the first two BCI Competitions, we organized the third BCI Competition to address several of the most difficult and important analysis problems in BCI research. The paper describes the data sets that were provided to the competitors and gives an overview of the results.
814 citations
•
21 Jun 2014TL;DR: This work proposes an approach that consists of a recurrent convolutional neural network which allows us to consider a large input context while limiting the capacity of the model, and yields state-of-the-art performance on both the Stanford Background Dataset and the SIFT FlowDataset while remaining very fast at test time.
Abstract: The goal of the scene labeling task is to assign a class label to each pixel in an image. To ensure a good visual coherence and a high class accuracy, it is essential for a model to capture long range (pixel) label dependencies in images. In a feed-forward architecture, this can be achieved simply by considering a sufficiently large input context patch, around each pixel to be labeled. We propose an approach that consists of a recurrent convolutional neural network which allows us to consider a large input context while limiting the capacity of the model. Contrary to most standard approaches, our method does not rely on any segmentation technique nor any task-specific features. The system is trained in an end-to-end manner over raw pixels, and models complex spatial dependencies with low inference cost. As the context size increases with the built-in recurrence, the system identifies and corrects its own errors. Our approach yields state-of-the-art performance on both the Stanford Background Dataset and the SIFT Flow Dataset, while remaining very fast at test time.
747 citations
•
27 Sep 2012TL;DR: This paper inspects the potential of texture features based on Local Binary Patterns (LBP) and their variations on three types of attacks: printed photographs, and photos and videos displayed on electronic screens of different sizes and concludes that LBP show moderate discriminability when confronted with a wide set of attack types.
Abstract: Spoofing attacks are one of the security traits that biometric recognition systems are proven to be vulnerable to. When spoofed, a biometric recognition system is bypassed by presenting a copy of the biometric evidence of a valid user. Among all biometric modalities, spoofing a face recognition system is particularly easy to perform: all that is needed is a simple photograph of the user. In this paper, we address the problem of detecting face spoofing attacks. In particular, we inspect the potential of texture features based on Local Binary Patterns (LBP) and their variations on three types of attacks: printed photographs, and photos and videos displayed on electronic screens of different sizes. For this purpose, we introduce REPLAY-ATTACK, a novel publicly available face spoofing database which contains all the mentioned types of attacks. We conclude that LBP, with ∼15% Half Total Error Rate, show moderate discriminability when confronted with a wide set of attack types.
707 citations
Authors
Showing all 440 results
Name | H-index | Papers | Citations |
---|---|---|---|
Samy Bengio | 95 | 390 | 56904 |
François Fleuret | 91 | 936 | 42585 |
Di Wu | 87 | 965 | 48697 |
Tinne Tuytelaars | 71 | 374 | 46089 |
Daniel P. W. Ellis | 67 | 355 | 20791 |
Arun Ross | 66 | 323 | 28023 |
Amir H. Mohammadi | 62 | 698 | 16044 |
Junichi Yamagishi | 60 | 500 | 14665 |
Daniel Gatica-Perez | 59 | 318 | 13488 |
Anindya Roy | 59 | 301 | 14306 |
José del R. Millán | 56 | 332 | 11839 |
Steve Renals | 56 | 368 | 10827 |
Sébastien Marcel | 55 | 252 | 10480 |
Barbara Caputo | 53 | 257 | 11628 |
Hervé Bourlard | 51 | 451 | 13929 |