Showing papers in "Computer Speech & Language in 2019"

TL;DR: An automatic speech recognition based procedure for the extraction of a special set of acoustic features and a linguistic feature set that is extracted from the transcripts of the same speech signals to tell apart Alzheimer’s patients from those with mild cognitive impairment.

120 citations

Journal Article•DOI•

Preserving privacy in speaker and speech characterisation

[...]

Andreas Nautsch¹, Andreas Nautsch², Abelino Jiménez³, Amos Treiber⁴, Jascha Kolberg¹, Catherine Jasserand, Els Kindt⁵, Héctor Delgado², Massimiliano Todisco², Mohamed Amine Hmani⁶, Aymen Mtibaa, Mohammed Ahmed Abdelraheem, Alberto Abad, Francisco Teixeira, Driss Matrouf⁷, Marta Gomez-Barrero¹, Dijana Petrovska-Delacrétaz, Gérard Chollet⁶, Nicholas Evans², Thomas Schneider⁴, Jean-François Bonastre⁷, Bhiksha Raj³, Isabel Trancoso, Christoph Busch¹•Institutions (7)

Darmstadt University of Applied Sciences¹, Institut Eurécom², Carnegie Mellon University³, Technische Universität Darmstadt⁴, Catholic University of Leuven⁵, Université Paris-Saclay⁶, University of Avignon⁷

TL;DR: The requirements for effective privacy preservation are established, generic cryptography-based solutions are reviewed, followed by specific techniques that are applicable to speaker characterisation and speech characterisation (biometrics and non-biometric applications), and common, empirical evaluation metrics for the assessment of privacy-preserving technologies for speech data are outlined.

91 citations

Journal Article•DOI•

Dementia detection using automatic analysis of conversations

[...]

Bahman Mirheidari¹, Daniel Blackburn¹, Traci Walker¹, Markus Reuber², Heidi Christensen¹•Institutions (2)

University of Sheffield¹, Royal Hallamshire Hospital²

TL;DR: An automatic classification system using an intelligent virtual agent (IVA) is presented, and it is shown that using acoustic, lexical and CA-inspired features enables ND/FMD classification rates of 90.0% for the neurologist-patient conversations and 90.9% for the IVA-patient conversations.

60 citations

Journal Article•DOI•

Unsupervised sentence representations as word information series: Revisiting TF–IDF

[...]

Ignacio Arroyo-Fernández¹, Carlos-Francisco Méndez-Cruz¹, Gerardo Sierra¹, Juan-Manuel Torres-Moreno², Grigori Sidorov•Institutions (2)

National Autonomous University of Mexico¹, University of Avignon²

TL;DR: This paper proposes an unsupervised method that models a sentence as a weighted series of word embeddings, which are fitted by using Shannon's Mutual Information (MI) among words, sentences and the corpus.

59 citations

Journal Article•DOI•

Tracking depression severity from audio and video based on speech articulatory coordination

[...]

James R. Williamson¹, Diana Young², Andrew A. Nierenberg³, James B. Niemi², Brian S. Helfer¹, Thomas F. Quatieri¹•Institutions (3)

Massachusetts Institute of Technology¹, Wyss Institute for Biologically Inspired Engineering², Harvard University³

TL;DR: An algorithm is proposed that estimates the articulatory coordination of speech from audio and video signals, and uses these coordination features to learn a prediction model to track depression severity with treatment, allowing rapid assessment of treatment efficacy as well as improved long-term care of individuals at high risk for depression.

52 citations

Journal Article•DOI•

Character convolutions for Arabic Named Entity Recognition with Long Short-Term Memory Networks

[...]

Muhammad Khalifa¹, Khaled Shaalan²•Institutions (2)

Freelancer.com¹, British University in Dubai²

TL;DR: This paper proposes to augment the existing LSTM neural tagging model for Arabic NER with a Convolutional Neural Network for the extraction of relevant character-level features and shows that character CNN is able to outperform the previously used character-level Bi-directional Long Short-Term Memory Networks (BiLSTM) in many settings.

45 citations

Journal Article•DOI•

Affective and Behavioural Computing: Lessons Learnt from the First Computational Paralinguistics Challenge

[...]

Björn Schuller¹, Björn Schuller², Björn Schuller³, Felix Weninger⁴, Yue Zhang¹, Yue Zhang⁴, Fabien Ringeval⁵, Anton Batliner⁶, Anton Batliner³, Stefan Steidl⁶, Florian Eyben, Erik Marchi⁴, Alessandro Vinciarelli⁷, Klaus R. Scherer², Mohamed Chetouani⁸, Marcello Mortillaro²•Institutions (8)

Imperial College London¹, University of Geneva², University of Augsburg³, Technische Universität München⁴, University of Grenoble⁵, University of Erlangen-Nuremberg⁶, University of Glasgow⁷, Pierre-and-Marie-Curie University⁸

TL;DR: This article presents the four Sub-Challenges of ComParE 2013 and provides details of the Challenge databases and a meta-analysis by conducting experiments of logistic regression on single features and evaluating the performances achieved by the participants.

43 citations

Journal Article•DOI•

Multilingual word embeddings for the assessment of narrative speech in mild cognitive impairment

[...]

Kathleen C. Fraser¹, Kristina Lundholm Fors¹, Dimitrios Kokkinakis¹•Institutions (1)

University of Gothenburg¹

TL;DR: The hypothesis that subtle differences in language can be detected in narrative speech, even at the very early stages of cognitive decline, is supported when scores on screening tools such as the Mini-Mental State Exam are still in the “normal” range.

40 citations

Journal Article•DOI•

Online learning for effort reduction in interactive neural machine translation

[...]

Álvaro Peris¹, Francisco Casacuberta¹•Institutions (1)

Polytechnic University of Valencia¹

TL;DR: In this article, the authors explore the incremental update of neural machine translation systems during the post-editing or interactive translation processes and show that online learning effectively achieves the objective of reducing the human effort required for obtaining high-quality translations.

39 citations

Journal Article•DOI•

Overview of the sixth dialog system technology challenge: DSTC6

[...]

Chiori Hori¹, Julien Perez, Ryuichiro Higashinaka², Takaaki Hori¹, Y-Lan Boureau³, Michimasa Inaba⁴, Yuiko Tsunomori⁵, Tetsuro Takahashi⁶, Koichiro Yoshino⁷, Seokhwan Kim⁸•Institutions (8)

Mitsubishi Electric Research Laboratories¹, Nippon Telegraph and Telephone², Facebook³, Hiroshima City University⁴, NTT DoCoMo⁵, Fujitsu⁶, Nara Institute of Science and Technology⁷, Adobe Systems⁸

TL;DR: Blending end-to-end trainable models with meaningful prior knowledge performs best for restaurant retrieval in Track 1; Hybrid Code Network and Memory Network have been the best models for this task.

38 citations

Journal Article•DOI•

An automated assessment framework for atypical prosody and stereotyped idiosyncratic phrases related to autism spectrum disorder

[...]

Ming Li¹, Dengke Tang², Junlin Zeng², Tianyan Zhou², Huilin Zhu², Biyuan Chen², Xiaobing Zou²•Institutions (2)

Duke University¹, Sun Yat-sen University²

TL;DR: This paper presents an automated assessment framework for quantifying atypical prosody and stereotyped idiosyncratic phrases related to ASD and proposes both a hand-crafted feature based method and an end-to-end deep learning framework for detecting atypical prosody from speech.

Journal Article•DOI•

Semantic vector learning for natural language understanding

[...]

Sangkeun Jung¹•Institutions (1)

Chungnam National University¹

TL;DR: This work proposes a framework that learns to embed semantic correspondence between text and its extracted semantic knowledge, called a semantic frame, and demonstrates three key areas where the embedding model can be effective: visualization, distance-based semantic search, similarity-based intent classification and re-ranking.

Journal Article•DOI•

Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition

[...]

Ondřej Novotný¹, Oldřich Plchot¹, Ondřej Glembek¹, Jan Cernocký¹, Lukas Burget¹•Institutions (1)

Brno University of Technology¹

TL;DR: In this article, a DNN-based autoencoder for speech enhancement, dereverberation and denoising is presented, which can be used to build a robust speaker verification system for various target domains.

Journal Article•DOI•

Exploiting social and local contexts propagation for inducing Chinese microblog-specific sentiment lexicons

[...]

Chuanjun Zhao¹, Suge Wang¹, Suge Wang², Deyu Li¹, Deyu Li²•Institutions (2)

Shanxi University¹, Chinese Ministry of Education²

TL;DR: Experiments on two real-world microblog data sets demonstrate that the proposed novel sentiment unit context propagation framework can generate microblog-specific sentiment lexicons effectively and significantly outperform state-of-the-art baselines.

Journal Article•DOI•

An investigation of linguistic stress and articulatory vowel characteristics for automatic depression classification

[...]

Brian Stasak¹, Brian Stasak², Julien Epps¹, Julien Epps², Roland Goecke³•Institutions (3)

University of New South Wales¹, Commonwealth Scientific and Industrial Research Organisation², University of Canberra³

TL;DR: By examining vowel articulatory parameters, statistically significant differences in articulatory characteristics are found at a paraphonetic level and linguistic stress feature results indicate that specific vowel set analysis provides better discrimination of clinically depressed and non-depressed speakers.

Journal Article•DOI•

Vocal effort compensation for MFCC feature extraction in a shouted versus normal speaker recognition task

[...]

Emma Jokinen¹, Rahim Saeidi¹, Tomi Kinnunen², Paavo Alku¹•Institutions (2)

Aalto University¹, University of Eastern Finland²

TL;DR: Two compensation methods are proposed to tackle the mismatch in a shouted versus normal speaker recognition task by modifying the spectral envelopes of shouts to be closer to those in normal speech.

Journal Article•DOI•

Improving LSTM CRFs using character-based compositions for Korean named entity recognition

[...]

Seung-Hoon Na¹, Hyun Kim², Jinwoo Min¹, Kangil Kim³•Institutions (3)

Chonbuk National University¹, Pohang University of Science and Technology², Konkuk University³

01 Mar 2019-Computer Speech & Language

TL;DR: This paper addresses Korean NER tasks and proposes an extension of a bidirectional LSTM CRF by investigating character-based representation and deploys a hybrid representation using ConvNet and LSTM for the sequential modeling of characters, namely a character-based LSTM-ConvNet hybrid representation.

Journal Article•DOI•

Single document keyword extraction via quantifying higher-order structural features of word co-occurrence graph

[...]

Yan Chen¹, Jie Wang¹, Ping Li¹, Guo Peilun¹•Institutions (1)

Southwest Petroleum University¹

01 Sep 2019-Computer Speech & Language

TL;DR: This work proposes a new graph-based measure for keyword extraction that leverages higher-order structural features of the word co-occurrence graph, and shows superior performance of the proposed method compared to TF-IDF and PageRank based methods.

Journal Article•DOI•

Residual convolutional neural network with attentive feature pooling for end-to-end language identification from short-duration speech

[...]

Joao Monteiro¹, Jahangir Alam¹, Tiago H. Falk¹•Institutions (1)

Institut national de la recherche scientifique¹

TL;DR: Residual convolutional neural networks are employed to this end, aiming at exploiting the ability of such architectures to take into account large contextual segments of input data, and learnable attention mechanisms are introduced on top of the convolutional stack for data-driven feature pooling across time.

Journal Article•DOI•

A Bi-LSTM memory network for end-to-end goal-oriented dialog learning

[...]

Byoungjae Kim¹, KyungTae Chung, Jeongpil Lee¹, Jungyun Seo¹, Myoung-Wan Koo¹•Institutions (1)

Sogang University¹

TL;DR: A model to satisfy the requirements of Dialog System Technology Challenge 6 (DSTC6) Track 1, building end-to-end dialog systems for goal-oriented applications, is developed; it achieves state-of-the-art performance among the memory networks and is comparable to hybrid code networks and a hierarchical LSTM model.

Journal Article•DOI•

Automatic evaluation of end-to-end dialog systems with adequacy-fluency metrics

[...]

Luis Fernando D'Haro¹, Rafael E. Banchs², Chiori Hori³, Haizhou Li⁴•Institutions (4)

Technical University of Madrid¹, Nanyang Technological University², Mitsubishi Electric Research Laboratories³, National University of Singapore⁴

TL;DR: A two-dimensional evaluation metric designed to operate at sentence level, which considers the syntactic and semantic information carried along the answers generated by an end-to-end dialog system with respect to a set of references, is evaluated.

Journal Article•DOI•

Polysemy and brevity versus frequency in language

[...]

Bernardino Casas¹, Antoni Hernández-Fernández¹, Neus Català¹, Ramon Ferrer-i-Cancho¹, Jaume Baixeries¹•Institutions (1)

Polytechnic University of Catalonia¹

TL;DR: In this article, the authors focus on two laws that have been studied less intensively: the meaning-frequency law, i.e. the tendency of more frequent words to be more polysemous, and the law of abbreviation, which refers to the fact that more frequent words tend to be shorter.

Journal Article•DOI•

Articulatory and bottleneck features for speaker-independent ASR of dysarthric speech

[...]

Emre Yilmaz¹, Emre Yilmaz², Vikramjit Mitra³, Ganesh Sivaraman, Horacio Franco⁴•Institutions (4)

National University of Singapore¹, Radboud University Nijmegen², University of Maryland, College Park³, SRI International⁴

TL;DR: This work compares the ASR performance of speaker-independent bottleneck and articulatory features on dysarthric speech used in conjunction with dedicated neural network-based acoustic models that have been shown to be robust against spectrotemporal deviations.

Journal Article•DOI•

Neural sentence fusion for diversity driven abstractive multi-document summarization

[...]

Tanvir Ahmed Fuad¹, Mir Tafseer Nayeem¹, Asif Mahmud¹, Yllias Chali¹•Institutions (1)

University of Lethbridge¹

TL;DR: This work designs complementary models for two different tasks, sentence clustering and neural sentence fusion, and applies them to implement a full abstractive multi-document summarization system which simultaneously considers importance, coverage, and diversity under a desired length limit.

Journal Article•DOI•

Enhancing generative conversational service agents with dialog history and external knowledge

[...]

Zongsheng Wang, Zhuoran Wang¹, Yinong Long², Jianan Wang³, Zhen Xu⁴, Baoxun Wang•Institutions (4)

Tsinghua University¹, Central South University², Shanghai Jiao Tong University³, Harbin Institute of Technology⁴

01 Mar 2019-Computer Speech & Language

TL;DR: This paper describes an attempt at generating natural and informative responses for customer-service-oriented dialog systems by incorporating dialog-history-related information and external knowledge in two improved sequence-to-sequence frameworks.

Journal Article•DOI•

Unsupervised-learning-based keyphrase extraction from a single document by the effective combination of the graph-based model and the modified C-value method

[...]

Hongseon Yeom¹, Youngjoong Ko², Jungyun Seo¹•Institutions (2)

Sogang University¹, Sungkyunkwan University²

TL;DR: An effective method that combines a statistical model, the C-value method, and a graph-based model to overcome the drawbacks of each model is proposed, and its results outperformed the state-of-the-art model among unsupervised models and the existing graph-based ranking models.

Journal Article•DOI•

Predicting emotional reactions to news articles in social networks

[...]

Omar Juárez Gambino¹, Hiram Calvo¹•Institutions (1)

Instituto Politécnico Nacional¹

TL;DR: This work proposes a method to predict the emotional reactions that Twitter users would have after reading a news article by using a multi-target classification strategy and obtains an emotional reactions similarity of 89%.

Journal Article•DOI•

A unified framework and models for integrating translation memory into phrase-based statistical machine translation

[...]

Yang Liu¹, Kun Wang¹, Chengqing Zong¹, Keh-Yih Su²•Institutions (2)

Chinese Academy of Sciences¹, Academia Sinica²

01 Mar 2019-Computer Speech & Language

TL;DR: Under this unified framework, several integrated models are proposed to incorporate different types of information extracted from TM to guide the SMT decoding and let SMT implicitly and indirectly utilize global context with a local dependency model.

Journal Article•DOI•

Toward differential diagnosis of autism spectrum disorder using multimodal behavior descriptors and executive functions

[...]

Chin-Po Chen¹, Chin-Po Chen², Susan Shur-Fen Gau³, Chi-Chun Lee², Chi-Chun Lee¹•Institutions (3)

AmeriCorps VISTA¹, National Tsing Hua University², National Taiwan University³

TL;DR: This work proposes to compute signal-derived multimodal behavior descriptors of ASD subjects during dyadic interactions of Autism Diagnostic Observation Schedule (ADOS), and further examines these behavior features’ discriminatory power in differentiating between the three groups in ASD.

Journal Article•DOI•

Automatic classification of speech overlaps: Feature representation and algorithms

[...]

Shammur Absar Chowdhury¹, Evgeny A. Stepanov¹, Morena Danieli¹, Giuseppe Riccardi¹•Institutions (1)

University of Trento¹