Author

Emmanuel Dupoux

Bio: Emmanuel Dupoux is an academic researcher at Facebook. The author has contributed to research on topics including computer science and vowels, has an h-index of 63, and has co-authored 267 publications that have received 14,315 citations. Previous affiliations of Emmanuel Dupoux include Centre national de la recherche scientifique and PSL Research University.


Papers
Journal Article
TL;DR: It is concluded that LSTMs can capture a non-trivial amount of grammatical structure given targeted supervision, but stronger architectures may be required to further reduce errors; furthermore, the language modeling signal is insufficient for capturing syntax-sensitive dependencies, and should be supplemented with more direct supervision if such dependencies need to be captured.
Abstract: The success of long short-term memory (LSTM) neural networks in language processing is typically attributed to their ability to capture long-distance statistical regularities. Linguistic regularities are often sensitive to syntactic structure; can such dependencies be captured by LSTMs, which do not have explicit structural representations? We begin addressing this question using number agreement in English subject-verb dependencies. We probe the architecture's grammatical competence both using training objectives with an explicit grammatical target (number prediction, grammaticality judgments) and using language models. In the strongly supervised settings, the LSTM achieved very high overall accuracy (less than 1% errors), but errors increased when sequential and structural information conflicted. The frequency of such errors rose sharply in the language-modeling setting. We conclude that LSTMs can capture a non-trivial amount of grammatical structure given targeted supervision, but stronger architectures may be required to further reduce errors; furthermore, the language modeling signal is insufficient for capturing syntax-sensitive dependencies, and should be supplemented with more direct supervision if such dependencies need to be captured.

691 citations

Journal Article
01 Oct 1998-Brain
TL;DR: Findings suggest that, at least for pairs of L1 and L2 languages that are fairly close, attained proficiency is more important than age of acquisition as a determinant of the cortical representation of L2.
Abstract: Functional imaging methods show differences in the pattern of cerebral activation associated with the subject's native language (L1) compared with a second language (L2). In a recent PET investigation on bilingualism we showed that auditory processing of stories in L1 (Italian) engages the temporal lobes and temporoparietal cortex more extensively than L2 (English). However, in that study the Italian subjects learned L2 late and attained a fair, but not an excellent command of this language (low proficiency, late acquisition bilinguals). Thus, the different patterns of activation could be ascribed either to age of acquisition or to proficiency level. In the current study we use a similar paradigm to evaluate the effect of early and late acquisition of L2 in highly proficient bilinguals. We studied a group of Italian-English bilinguals who acquired L2 after the age of 10 years (high proficiency, late acquisition bilinguals) and a group of Spanish-Catalan bilinguals who acquired L2 before the age of 4 years (high proficiency, early acquisition bilinguals). The differing cortical responses we had observed when low proficiency volunteers listened to stories in L1 and L2 were not found in either of the high proficiency groups in this study. Several brain areas, similar to those observed for L1 in low proficiency bilinguals, were activated by L2. These findings suggest that, at least for pairs of L1 and L2 languages that are fairly close, attained proficiency is more important than age of acquisition as a determinant of the cortical representation of L2.

679 citations

Journal Article
TL;DR: Variations in accent are sufficient to evoke social preferences observed in infants before they produce or comprehend speech and are exhibited by children even when they comprehend the foreign-accented speech.
Abstract: The Gileadites captured the fords of the Jordan leading to Ephraim, and whenever a survivor of Ephraim said, "Let me go over," the men of Gilead asked him, "Are you an Ephraimite?" If he replied, "No," they said, "All right, say 'Shibboleth'." If he said, "Sibboleth," because he could not pronounce the word correctly, they seized him and killed him at the fords of the Jordan. Forty-two thousand Ephraimites were killed at that time. Judges 12:5-6.

673 citations

Journal Article
TL;DR: Four experiments on timed 2-digit number comparison with French subjects extend the findings of Hinrichs, Yurko, and Hu (1981); the data are compatible with a holistic model, in which the whole magnitude of each number is computed before comparison, rather than with a digit-by-digit symbolic model.
Abstract: Do Ss compare multidigit numbers digit by digit (symbolic model) or do they compute the whole magnitude of the numbers before comparing them (holistic model)? In 4 experiments of timed 2-digit number comparisons with a fixed standard, the findings of Hinrichs, Yurko, and Hu (1981) were extended with French Ss. Reaction times (RTs) decreased with target-standard distance, with discontinuities at the boundaries of the standard's decade appearing only with standards 55 and 66 but not with 65. The data are compatible with the holistic model. A symbolic interference model that posits the simultaneous comparison of decades and units can also account for the results. To separate the 2 models, the decades and units digits of target numbers were presented asynchronously in Experiment 4. Contrary to the prediction of the interference model, presenting the units before the decades did not change the influence of units on RTs. Pros and cons of the holistic model are discussed.

653 citations

Journal Article
TL;DR: The hypothesis that first language acquisition relies on a dedicated left-hemispheric cerebral network, while late second language acquisition is not necessarily associated with a reproducible biological substrate is supported.
Abstract: Functional magnetic resonance imaging was used to assess inter-subject variability in the cortical representation of language comprehension processes. Moderately fluent French-English bilinguals were scanned while they listened to stories in their first language (L1 = French) or in a second language (L2 = English) acquired at school after the age of seven. In all subjects, listening to L1 always activated a similar set of areas in the left temporal lobe, clustered along the left superior temporal sulcus. Listening to L2, however, activated a highly variable network of left and right temporal and frontal areas, sometimes restricted only to right-hemispheric regions. These results support the hypothesis that first language acquisition relies on a dedicated left-hemispheric cerebral network, while late second language acquisition is not necessarily associated with a reproducible biological substrate. The postulated contribution of the right hemisphere to L2 comprehension is found to hold only on average, individual subjects varying from complete right lateralization to standard left lateralization for L2.

522 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This work covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal Article
13 Dec 1996-Science
TL;DR: The present study shows that a fundamental task of language acquisition, segmentation of words from fluent speech, can be accomplished by 8-month-old infants based solely on the statistical relationships between neighboring speech sounds.
Abstract: Learners rely on a combination of experience-independent and experience-dependent mechanisms to extract information from the environment. Language acquisition involves both types of mechanisms, but most theorists emphasize the relative importance of experience-independent mechanisms. The present study shows that a fundamental task of language acquisition, segmentation of words from fluent speech, can be accomplished by 8-month-old infants based solely on the statistical relationships between neighboring speech sounds. Moreover, this word segmentation was based on statistical learning from only 2 minutes of exposure, suggesting that infants have access to a powerful mechanism for the computation of statistical properties of the language input.

4,352 citations

Journal Article
TL;DR: The model can handle some of the main observations in the domain of speech errors (the major empirical domain for most other theories of lexical access), and the theory opens new ways of approaching the cerebral organization of speech production by way of high-temporal-resolution imaging.
Abstract: Preparing words in speech production is normally a fast and accurate process. We generate them two or three per second in fluent conversation; and overtly naming a clear picture of an object can easily be initiated within 600 msec after picture onset. The underlying process, however, is exceedingly complex. The theory reviewed in this target article analyzes this process as staged and feed-forward. After a first stage of conceptual preparation, word generation proceeds through lexical selection, morphological and phonological encoding, phonetic encoding, and articulation itself. In addition, the speaker exerts some degree of output control, by monitoring of self-produced internal and overt speech. The core of the theory, ranging from lexical selection to the initiation of phonetic encoding, is captured in a computational model, called WEAVER++. Both the theory and the computational model have been developed in interaction with reaction time experiments, particularly in picture naming or related word production paradigms, with the aim of accounting for the real-time processing in normal word production. A comprehensive review of theory, model, and experiments is presented. The model can handle some of the main observations in the domain of speech errors (the major empirical domain for most other theories of lexical access), and the theory opens new ways of approaching the cerebral organization of speech production by way of high-temporal-resolution imaging.

3,958 citations