Showing papers in "Computer Speech & Language in 2022"

Journal Article•DOI•

Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: Theory, implementation and analysis on standard tasks

Federico Landini¹, Ruas, Maria Aparecida Soares², Ján Profant, Mireia Diez¹, Lukas Burget¹•Institutions (2)

Brno University of Technology¹, Osservatorio Astrofisico di Torino²

01 Jan 2022-Computer Speech & Language

TL;DR: This work shows that VBx achieves superior performance on three of the most popular datasets for evaluating diarization (CALLHOME, AMI and DIHARD II) and presents for the first time the derivation and update formulae for the VBx model.

110 citations

Journal Article•DOI•

A review of speaker diarization: Recent advances with deep learning

Tae Jin Park¹, Naoyuki Kanda², Dimitrios Dimitriadis², Kyu Jeong Han, Shinji Watanabe³, Shrikanth S. Narayanan¹•Institutions (3)

University of Southern California¹, Microsoft², Johns Hopkins University³

01 Mar 2022-Computer Speech & Language

TL;DR: Speaker diarization is the task of labeling audio or video recordings with classes that correspond to speaker identity, or in short, identifying "who spoke when".

55 citations

Journal Article•DOI•

Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: Theory, implementation and analysis on standard tasks

Ruas, Maria Aparecida Soares¹•Institutions (1)

Osservatorio Astrofisico di Torino¹

01 Jan 2022-Computer Speech & Language

TL;DR: The VBx model uses a Bayesian hidden Markov model to find speaker clusters in a sequence of x-vectors and achieves superior performance on three popular datasets for evaluating diarization: CALLHOME, AMI and DIHARD II.

41 citations

Journal Article•DOI•

Hate speech detection on Twitter using transfer learning

Raza Ali, Umair Arshad, Waseem Shahzad, Mirza Omer Beg

01 Feb 2022-Computer Speech & Language

TL;DR: In this paper, the authors developed an Urdu-language hate lexicon, used it to build an annotated dataset of 10,526 Urdu tweets, and applied various machine learning techniques for hate speech detection.

41 citations

Journal Article•DOI•

Spoken language interaction with robots: Recommendations for future research

Matthew Marge¹, Carol Y. Espy-Wilson², Nigel Ward³, Abeer Alwan⁴, Yoav Artzi⁵, Mohit Bansal⁶, G.L. Blankenship², Joyce Y. Chai⁷, Hal Daumé², Debadeepta Dey⁸, Mary P. Harper¹, Thomas M. Howard⁹, Casey Kennington¹⁰, Ivana Kruijff-Korbayová, Dinesh Manocha², Cynthia Matuszek¹¹, Ross Mead, Raymond J. Mooney¹², Roger K. Moore¹³, Mari Ostendorf¹⁴, Heather Pon-Barry¹⁵, Alexander I. Rudnicky¹⁶, Matthias Scheutz¹⁷, Robert St. Amant¹, Tong Sun¹⁸, Stefanie Tellex¹⁹, David Traum²⁰, Zhou Yu²¹•Institutions (21)

01 Jan 2022-Computer Speech & Language

TL;DR: This article identifies key scientific and engineering advances needed to enable effective spoken language interaction with robotics, and makes 25 recommendations, involving eight general themes: putting human needs first, better modeling the social and interactive aspects of language, improving robustness, creating new methods for rapid adaptation, and improving research infrastructure and resources.

38 citations

Journal Article•DOI•

Combining context-relevant features with multi-stage attention network for short text classification

Yingying Liu¹, Peipei Li¹, Xuegang Hu¹•Institutions (1)

Hefei University of Technology¹

01 Jan 2022-Computer Speech & Language

TL;DR: This work proposes a novel short text classification approach combining Context-Relevant Features with multi-stage Attention model based on Temporal Convolutional Network (TCN) and CNN, called CRFA, which uses Probase as external knowledge to enrich the semantic representation for the solution to the data sparsity and ambiguity of short texts.

32 citations

Journal Article•

Deep reinforcement and transfer learning for abstractive text summarization: A review

Ayham Alomari, Norisma Idris, Aznul Qalid Md Sabri, Izzat Alsmadi

Computer Speech & Language

32 citations

Journal Article•DOI•

Combining context-relevant features with multi-stage attention network for short text classification

01 Jan 2022-Computer Speech & Language

TL;DR: This paper proposed a novel short text classification approach combining Context-Relevant Features with multi-stage Attention model based on Temporal Convolutional Network (TCN) and CNN, called CRFA.

25 citations

Journal Article•DOI•

BERT syntactic transfer: A computational experiment on Italian, French and English languages

Raffaele Guarasci¹, Stefano Silvestri², Stefano Silvestri¹, Giuseppe De Pietro¹, Hamido Fujita, Massimo Esposito¹•Institutions (2)

Indian Council of Agricultural Research¹, University of Bologna²

01 Jan 2022-Computer Speech & Language

TL;DR: The results of the experimental assessment have shown a transfer of syntactic knowledge of the mBERT model among languages belonging to different branches of the Indo-European languages, namely English, Italian and French, which present very different syntactic constructions.

24 citations

Journal Article•DOI•

Deep reinforcement and transfer learning for abstractive text summarization: A review

01 Jan 2022-Computer Speech & Language

TL;DR: Automatic Text Summarization (ATS) is an important area in NLP with the goal of shortening a long text into a more compact version by conveying the most important points in a readable form.

23 citations

Journal Article•DOI•

An automatic Alzheimer’s disease classifier based on spontaneous spoken English

Flavio Bertini¹, Davide Allevi², Gianluca Lutero², Laura Calzà², Danilo Montesi²•Institutions (2)

University of Parma¹, University of Bologna²

01 Mar 2022-Computer Speech & Language

TL;DR: A fully automated method is proposed that classifies the spontaneous spoken production of the subjects using the spectrogram of the audio signal, which is the visual representation of the speech of the subject, together with a specific data augmentation approach that avoids distorting the original samples.

Journal Article•DOI•

Deep reinforcement and transfer learning for abstractive text summarization: A review

Ayham Alomari¹, Norisma Idris¹, Aznul Qalid Md Sabri¹, Izzat Alsmadi²•Institutions (2)

University of Malaya¹, Texas A&M University²

01 Jan 2022-Computer Speech & Language

TL;DR: Automatic Text Summarization (ATS) is an important area in Natural Language Processing (NLP) with the goal of shortening a long text into a more compact version by conveying the most important points in a readable form.

Journal Article•DOI•

Hate speech and offensive language detection in Dravidian languages using deep ensemble framework

Pradeep Kumar Roy, Snehaan Bhawal, C. N. Subalalitha

01 Apr 2022-Computer Speech & Language

TL;DR: In this paper, a weighted ensemble framework is proposed for identifying hate speech and offensive language in code-mixed posts on social networking platforms.

Journal Article•DOI•

Generative adversarial networks for speech processing: A review

Aamir Wali¹, Zareen Alamgir¹, Saira Karim¹, Ather Fawaz¹, Mubariz Barkat Ali¹, Muhammad Adan¹, Malik Mujtaba¹•Institutions (1)

National University of Computer and Emerging Sciences¹

01 Mar 2022-Computer Speech & Language

TL;DR: A comprehensive review of the novel and emerging GAN-based speech frameworks and algorithms that have revolutionized speech processing, which categorizes speech GANs based on application areas: speech synthesis, speech enhancement & conversion, and data augmentation in automatic speech recognition and emotion speech recognition systems.

Journal Article•DOI•

The VoicePrivacy 2020 Challenge: Results and findings

01 Jul 2022-Computer Speech & Language

TL;DR: The first VoicePrivacy 2020 Challenge focused on developing anonymization solutions for speech technology. This paper presents the results and analyses stemming from the challenge, including the objective and subjective evaluation metrics and attack models.

Journal Article•DOI•

BERT syntactic transfer: A computational experiment on Italian, French and English languages

Stefano Silvestri¹•Institutions (1)

University of Bologna¹

01 Jan 2022-Computer Speech & Language

TL;DR: This paper investigated the ability of multilingual BERT (mBERT) language model to transfer syntactic knowledge cross-lingually, verifying if and to which extent syntactic dependency relationships learnt in a language are maintained in other languages.

Journal Article•DOI•

Investigations on speech recognition systems for low-resource dialectal Arabic–English code-switching speech

01 Mar 2022-Computer Speech & Language

TL;DR: In this article, the authors presented their work on code-switched Egyptian Arabic-English ASR using DNN-based hybrid and Transformer-based end-to-end models.

Journal Article•DOI•

Investigations on speech recognition systems for low-resource dialectal Arabic–English code-switching speech

Injy Hamed¹, Pavel Denisov¹, Chia-Yu Li¹, Mohamed Elmahdy, Slim Abdennadher², Ngoc Thang Vu¹•Institutions (2)

University of Stuttgart¹, German University in Cairo²

01 Mar 2022-Computer Speech & Language

TL;DR: In this paper, the authors presented their work on code-switched Egyptian Arabic-English ASR using DNN-based hybrid and Transformer-based end-to-end models.

Journal Article•DOI•

An automatic Alzheimer’s disease classifier based on spontaneous spoken English

01 Mar 2022-Computer Speech & Language

TL;DR: In this paper, the authors proposed a fully automated method able to classify the spontaneous spoken production of the subjects; in particular, they trained an artificial neural network using the spectrogram of the audio signal, which is the visual representation of the speech of the subject.

Journal Article•DOI•

Generative adversarial networks for speech processing: A review

01 Mar 2022-Computer Speech & Language

TL;DR: This article gives a comprehensive review of the novel and emerging GAN-based speech frameworks and algorithms that have revolutionized speech processing, and categorizes speech GANs based on application areas: speech synthesis, speech enhancement and conversion, and data augmentation in automatic speech recognition and emotion speech recognition systems.

Journal Article•DOI•

Improving the potential of Enhanced Teager Energy Cepstral Coefficients (ETECC) for replay attack detection

Ankur T. Patil¹, Rajul Acharya¹, Hemant A. Patil¹, Rodrigo Capobianco Guido²•Institutions (2)

Dhirubhai Ambani Institute of Information and Communication Technology¹, Sao Paulo State University²

01 Mar 2022-Computer Speech & Language

TL;DR: Comprehensive evaluations which include a detailed mathematical analysis, a simulation on amplitude and frequency modulated (AM-FM) signals, and a spectrographic inspection involving different filterbank structures, along with their experimental results are provided in this paper.

Journal Article•DOI•

X-vector anonymization using autoencoders and adversarial training for preserving speech privacy

Juan M. Perero-Codosero, Fernando Espinoza-Cuadros, Luis A. Hernández Gómez

01 Jan 2022-Computer Speech & Language

TL;DR: In this paper, the authors proposed a speech anonymization method based on autoencoders and adversarial training. But the method is limited to the English utterance and cannot handle other languages, such as French, German, and Dutch.

Journal Article•DOI•

Joint emotion label space modeling for affect lexica

Luna De Bruyne¹, Pepa Atanasova², Isabelle Augenstein²•Institutions (2)

Ghent University¹, University of Copenhagen²

01 Jan 2022-Computer Speech & Language

TL;DR: The overall findings are that emotion lexica can offer complementary information to even extremely large pre-trained models such as BERT, and that the performance of the models is comparable to state-of-the-art models that are specifically engineered for certain datasets, and even exceeds the state of the art on four datasets.

Journal Article•DOI•

Language-independent extractive automatic text summarization based on automatic keyword extraction

Ángel Hernández-Castañeda, René Arnulfo García-Hernández, Yulia Ledeneva, Christian Eduardo Millán-Hernández

01 Jan 2022-Computer Speech & Language

TL;DR: This study proposes a language and domain independent approach for automatic extractive text summarization (EATS) tasks, which is based on a clustering scheme supported by a genetic algorithm (GA), to find an optimal grouping of sentences.

Journal Article•DOI•

Towards sound based testing of COVID-19—Summary of the first Diagnostics of COVID-19 using Acoustics (DiCOVA) Challenge

01 May 2022-Computer Speech & Language

TL;DR: In this paper, the authors presented the results from the top four teams, which achieved an area under the receiver operating characteristic curve (AUC-ROC) of 95.1% on the blind test data.

Journal Article•DOI•

Evaluating voice-assistant commands for dementia detection

Xiaohui Liang¹, John A. Batsis², Youxiang Zhu¹, Tiffany M. Driesse², Robert M. Roth³, David Kotz³, Brian MacWhinney⁴•Institutions (4)

University of Massachusetts Boston¹, University of North Carolina at Chapel Hill², Dartmouth College³, Carnegie Mellon University⁴

01 Mar 2022-Computer Speech & Language

TL;DR: This paper explores the voice commands using a Voice-Assistant System (VAS), i.e., Amazon Alexa, from 40 older adults who were either Healthy Control (HC) participants or Mild Cognitive Impairment (MCI) participants, age 65 or older, to demonstrate the promise of future home-based cognitive assessments using Voice-Assistant Systems.

Journal Article•DOI•

Perceptions and reactions to conversational privacy initiated by a conversational user interface

Birgit Brüggemeier¹, Philip Lalone¹•Institutions (1)

Fraunhofer Society¹

01 Jan 2022-Computer Speech & Language

TL;DR: In this article, the authors used a bespoke data collection interface to generate speaking chatbots and made them available as tasks on the crowdsourcing platform Mechanical Turk to simulate how privacy can be communicated in a dialogue between user and machine.

Journal Article•DOI•

Named entity recognition using neural language model and CRF for Hindi language

Richa Sharma, Sudha Morwal, Basant Agarwal

01 Jan 2022-Computer Speech & Language

TL;DR: In this paper, a state-of-the-art Hindi NER system based on the MuRIL language model and CRF is proposed. But the model is not suitable for the Hindi named entity recognition task.

Journal Article•DOI•

Arabic speech recognition by end-to-end, modular systems and human

01 Jan 2022-Computer Speech & Language

TL;DR: The authors performed a comprehensive benchmarking for end-to-end transformer ASR, modular HMM-DNN ASR and human speech recognition (HSR) on the Arabic language and its dialects.

Journal Article•DOI•

Evaluating voice-assistant commands for dementia detection

01 Mar 2022-Computer Speech & Language

TL;DR: In this article, the authors explored the voice commands using a Voice-Assistant System (VAS), i.e., Amazon Alexa, from 40 older adults who were either Healthy Control (HC) participants or Mild Cognitive Impairment (MCI) participants, age 65 or older.
