Open Access Journal Article
Natural Language Processing (Almost) from Scratch
TL;DR: A unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks, including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling, is proposed.

Abstract:
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.
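The kind of architecture the abstract describes can be sketched as a lookup-table embedding layer feeding a small feed-forward scorer over a window of words. Everything below (vocabulary size, dimensions, random initialisation, the plain `tanh` in place of the paper's hard tanh) is illustrative, not the authors' released system:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: word ids index a learned embedding lookup table.
vocab_size, emb_dim, window, hidden, n_tags = 100, 50, 5, 300, 10
E = rng.normal(scale=0.1, size=(vocab_size, emb_dim))       # lookup table
W1 = rng.normal(scale=0.1, size=(hidden, window * emb_dim))
b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.1, size=(n_tags, hidden))
b2 = np.zeros(n_tags)

def tag_scores(word_ids):
    """Score each tag for the centre word of a context window."""
    x = E[word_ids].reshape(-1)          # concatenate the window's embeddings
    h = np.tanh(W1 @ x + b1)             # hidden layer (stand-in for hard tanh)
    return W2 @ h + b2                   # one score per tag

window_ids = [3, 17, 42, 8, 99]          # ids of 5 consecutive words
scores = tag_scores(window_ids)
print(scores.shape)                      # (10,) -> one score per tag
```

In the paper's setting the embedding table itself is trained, first on large unlabeled corpora and then on the labeled task, which is what makes the same architecture reusable across tagging tasks.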
Citations
Proceedings Article
Intrinsic Evaluation of Word Vectors Fails to Predict Extrinsic Performance
TL;DR: It is demonstrated that most intrinsic evaluations are poor predictors of downstream performance, and this issue can be traced in part to a failure to distinguish specific similarity from relatedness in intrinsic evaluation datasets.
Proceedings Article
Context- and Content-aware Embeddings for Query Rewriting in Sponsored Search
TL;DR: This work proposes a rewriting method based on a novel query embedding algorithm, which jointly models query content as well as its context within a search session, and shows that the proposed approach significantly outperforms the existing state of the art, strongly indicating its benefits and monetization potential.
Proceedings Article
GrandSLAm: Guaranteeing SLAs for Jobs in Microservices Execution Frameworks
TL;DR: This study presents GrandSLAm, a microservice execution framework that improves utilization of datacenters hosting microservices, and significantly increases throughput by up to 3x compared to the baseline, without violating SLAs for a wide range of real-world AI and ML applications.
Journal Article
Towards Making the Most of BERT in Neural Machine Translation
TL;DR: A concerted training framework (CTnmt) is key to integrating pre-trained LMs into neural machine translation (NMT); it consists of three techniques, including asymptotic distillation, which ensures that the NMT model retains the previously pre-trained knowledge.
Proceedings Article
Neural Word Segmentation with Rich Pretraining
Jie Yang, Yue Zhang, Fei Dong +2 more
TL;DR: This work investigates the effectiveness of a range of external training sources for neural word segmentation by building a modular segmentation model and pretraining its most important submodule on rich external sources; such pretraining significantly improves the model, leading to accuracies competitive with the best methods on six benchmarks.
References
Journal Article
Gradient-based learning applied to document recognition
Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner
TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition; gradient-based learning is used to synthesize a complex decision surface that can classify high-dimensional patterns such as handwritten characters.
Journal Article
A tutorial on hidden Markov models and selected applications in speech recognition
TL;DR: In this paper, the authors provide an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and give practical details on methods of implementation of the theory along with a description of selected applications of HMMs to distinct problems in speech recognition.
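One of the core methods covered by that tutorial is the forward algorithm, which computes the probability of an observation sequence by summing over all hidden state paths. A minimal sketch, with illustrative two-state parameters that are not taken from the tutorial:

```python
import numpy as np

# Toy HMM: 2 hidden states, 2 observation symbols (parameters are made up).
pi = np.array([0.6, 0.4])                 # initial state distribution
A = np.array([[0.7, 0.3],                 # state transition probabilities
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],                 # emission probabilities per state
              [0.2, 0.8]])

def forward(obs):
    """Forward algorithm: P(obs) marginalised over all state sequences."""
    alpha = pi * B[:, obs[0]]             # initialise with first observation
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]     # propagate, then weight by emission
    return alpha.sum()                    # total probability of the sequence

p = forward([0, 1, 0])
print(round(p, 4))                        # 0.1089
```

The recursion runs in O(T * N^2) time for T observations and N states, rather than the exponential cost of enumerating state paths directly.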
Book
Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference
TL;DR: Probabilistic Reasoning in Intelligent Systems is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty, providing a coherent explication of probability as a language for reasoning with partial belief.
Journal Article
A fast learning algorithm for deep belief nets
TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
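The greedy scheme summarised above trains one layer at a time, each on the representation produced by the layer below. A toy sketch using one-step contrastive divergence (CD-1); sizes, learning rate, and the use of hidden probabilities for the reconstruction step are illustrative simplifications, not the paper's exact recipe:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def train_rbm(data, n_hidden, lr=0.1, epochs=5):
    """Train a single RBM layer with one-step contrastive divergence (CD-1)."""
    n_visible = data.shape[1]
    W = rng.normal(scale=0.01, size=(n_visible, n_hidden))
    for _ in range(epochs):
        v0 = data
        h0 = sigmoid(v0 @ W)                       # hidden probabilities
        h0_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(h0_sample @ W.T)              # one-step reconstruction
        h1 = sigmoid(v1 @ W)
        W += lr * (v0.T @ h0 - v1.T @ h1) / len(data)
    return W

# Greedy stacking: the second RBM trains on the first layer's hidden activations.
data = (rng.random((20, 6)) > 0.5).astype(float)
W1 = train_rbm(data, 4)
W2 = train_rbm(sigmoid(data @ W1), 3)
print(W1.shape, W2.shape)                          # (6, 4) (4, 3)
```

Each layer only ever sees a fixed input distribution, which is what makes the layer-wise procedure fast and greedy.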
Journal Article
Machine learning
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.