Open Access Proceedings Article
Framewise phoneme classification with bidirectional LSTM and other neural network architectures
Alex Graves, Jürgen Schmidhuber
- Vol. 18, pp 602-610
TL;DR: This paper applies a modified, full-gradient version of the LSTM learning algorithm to framewise phoneme classification on the TIMIT database. The results support the view that contextual information is crucial to speech processing and suggest that bidirectional networks outperform unidirectional ones.
Abstract:
In this paper, we present bidirectional Long Short Term Memory (LSTM) networks, and a modified, full gradient version of the LSTM learning algorithm. We evaluate Bidirectional LSTM (BLSTM) and several other network architectures on the benchmark task of framewise phoneme classification, using the TIMIT database. Our main findings are that bidirectional networks outperform unidirectional ones, and that LSTM is much faster and also more accurate than both standard Recurrent Neural Nets (RNNs) and time-windowed Multilayer Perceptrons (MLPs). Our results support the view that contextual information is crucial to speech processing, and suggest that BLSTM is an effective architecture with which to exploit it.
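The bidirectional idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain tanh recurrent layer in place of LSTM cells, random untrained weights, and hypothetical names (`rnn_pass`, `make_params`, `bidirectional_framewise`). It only shows how one pass reads the frames left to right, another reads them right to left, and each frame's class scores are computed from both hidden states, so every frame sees past and future context.

```python
import math
import random


def rnn_pass(frames, w_in, w_rec, reverse=False):
    """Run a simple tanh recurrent layer; return one hidden vector per frame."""
    hidden_size = len(w_rec)
    h = [0.0] * hidden_size
    n = len(frames)
    order = range(n - 1, -1, -1) if reverse else range(n)
    states = [None] * n
    for t in order:
        x = frames[t]
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(hidden_size))
            )
            for j in range(hidden_size)
        ]
        states[t] = h  # stored at the frame's own index in both directions
    return states


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_params(n_features, n_hidden, n_classes, seed=0):
    """Random, untrained weights (assumption: for illustration only)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "w_in_f": mat(n_hidden, n_features),
        "w_rec_f": mat(n_hidden, n_hidden),
        "w_in_b": mat(n_hidden, n_features),
        "w_rec_b": mat(n_hidden, n_hidden),
        "w_out": mat(n_classes, 2 * n_hidden),
    }


def bidirectional_framewise(frames, params):
    """Return one class-probability vector per frame, using both directions."""
    fwd = rnn_pass(frames, params["w_in_f"], params["w_rec_f"])
    bwd = rnn_pass(frames, params["w_in_b"], params["w_rec_b"], reverse=True)
    outputs = []
    for h_f, h_b in zip(fwd, bwd):
        joint = h_f + h_b  # concatenate forward and backward hidden states
        scores = [sum(w * v for w, v in zip(row, joint)) for row in params["w_out"]]
        outputs.append(softmax(scores))
    return outputs
```

For example, `bidirectional_framewise(frames, make_params(3, 4, 5))` returns one 5-way probability vector per input frame of 3 features. In this sketch, changing only the last frame leaves the forward state at the first frame untouched but alters the backward state there, which is the extra context a unidirectional network lacks.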
Citations
Book Chapter
PlaNet - Photo Geolocation with Convolutional Neural Networks
TL;DR: This work subdivides the surface of the earth into thousands of multi-scale geographic cells and trains a deep network using millions of geotagged images, and shows that the resulting model, called PlaNet, outperforms previous approaches and even attains superhuman accuracy in some cases.
Proceedings Article
UDPipe: Trainable Pipeline for Processing CoNLL-U Files Performing Tokenization, Morphological Analysis, POS Tagging and Parsing
TL;DR: UDPipe, a pipeline processing CoNLL-U-formatted files, performs tokenization, morphological analysis, part-of-speech tagging, lemmatization and dependency parsing for nearly all treebanks of Universal Dependencies 1.2.
Journal Article
Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence
Huiying Liang, Brian Tsui, Hao Ni, Carolina C. S. Valentim, Sally L. Baxter, Guangjian Liu, Wenjia Cai, Daniel S. Kermany, Xin Sun, Jiancong Chen, Liya He, Jie Zhu, Pin Tian, Hua Shao, Lianghong Zheng, Rui Hou, Sierra Hewett, Gen Li, Ping Liang, Xuan Zang, Zhiqi Zhang, Liyan Pan, Huimin Cai, Rujuan Ling, Shuhua Li, Yongwang Cui, Shusheng Tang, Hong Ye, Xiaoyan Huang, Waner He, Wenqing Liang, Qing Zhang, Jianmin Jiang, Wei Yu, Jianqun Gao, Wanxing Ou, Yingmin Deng, Qiaozhen Hou, Bei Wang, Cuichan Yao, Yan Liang, Shu Zhang, Yaou Duan, Runze Zhang, Sarah Gibson, Charlotte Zhang, Oulan Li, Edward Zhang, Gabriel Karin, Nathan Nguyen, Xiaokang Wu, Cindy Wen, Jie Xu, Wenqin Xu, Bochu Wang, Winston Wang, Jing Li, Bianca Pizzato, Caroline Bao, Daoman Xiang, Wanting He, Suiqin He, Yugui Zhou, Weldon W Haw, Michael H. Goldbaum, Adriana H. Tremoulet, Chun-Nan Hsu, Hannah Carter, Long Zhu, Kang Zhang, Huimin Xia
TL;DR: This study shows that MLCs can query EHRs in a manner similar to the hypothetico-deductive reasoning used by physicians and unearth associations that previous statistical methods have not found. It provides a proof of concept for an AI-based system that aids physicians in tackling large amounts of data, augments diagnostic evaluations, and provides clinical decision support in cases of diagnostic uncertainty or complexity.
Posted Content
Interpretable 3D Human Action Analysis with Temporal Convolutional Networks
Tae Soo Kim, Austin Reiter
TL;DR: In this paper, a new class of models known as Temporal Convolutional Neural Networks (TCN) is proposed to explicitly learn readily interpretable spatio-temporal representations for 3D human action recognition.
Journal Article
Semi-Supervised Deep Learning Using Pseudo Labels for Hyperspectral Image Classification
Hao Wu, Saurabh Prasad
TL;DR: This paper uses deep convolutional recurrent neural networks for hyperspectral image classification by treating each hyperspectral pixel as a spectral sequence, and proposes a constrained Dirichlet process mixture model (C-DPMM) for semi-supervised clustering that includes pairwise must-link and cannot-link constraints, resulting in improved initialization of the deep neural network.
References
Journal Article
Long short-term memory
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Book
Neural networks for pattern recognition
TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.
Journal Article
Bidirectional recurrent neural networks
Mike Schuster, Kuldip K. Paliwal
TL;DR: It is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution.
DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM (TIMIT), NIST
John S. Garofolo, Lori Lamel, W. M. Fisher, Jonathan G. Fiscus, David S. Pallett, Nancy L. Dahlgren
Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies
Sepp Hochreiter, Yoshua Bengio