
Showing papers on "Character (mathematics)" published in 2017


Journal ArticleDOI
TL;DR: This paper proposes a new approach based on the skip-gram model in which each word is represented as a bag of character n-grams and a word's vector is the sum of its n-gram vectors, which makes training on large corpora fast and yields representations for words that did not appear in the training data.
Abstract: Continuous word representations, trained on large unlabeled corpora, are useful for many natural language processing tasks. Popular models that learn such representations ignore the morphology of words by assigning a distinct vector to each word. This is a limitation, especially for languages with large vocabularies and many rare words. In this paper, we propose a new approach based on the skip-gram model, where each word is represented as a bag of character n-grams. A vector representation is associated with each character n-gram, and words are represented as the sum of these representations. Our method is fast, allowing models to be trained quickly on large corpora, and it allows word representations to be computed for words that did not appear in the training data. We evaluate our word representations on nine different languages, on both word similarity and analogy tasks. By comparing to recently proposed morphological word representations, we show that our vectors achieve state-of-the-art performance on these tasks.

7,537 citations
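
The bag-of-character-n-grams idea in the abstract above can be illustrated with a minimal sketch. The boundary markers `<` and `>`, the n-gram range of 3 to 6, and the toy vector table are assumptions made for this illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import Dict, List


def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> List[str]:
    """Return all character n-grams of a word, n_min <= n <= n_max."""
    # Assumption of this sketch: wrap the word in boundary markers so that
    # prefixes and suffixes yield distinct n-grams.
    marked = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


def word_vector(word: str, ngram_vectors: Dict[str, List[float]], dim: int) -> List[float]:
    """Represent a word as the sum of the vectors of its known n-grams."""
    total = [0.0] * dim
    for gram in char_ngrams(word):
        vec = ngram_vectors.get(gram)
        if vec is not None:
            for k in range(dim):
                total[k] += vec[k]
    return total


# Hypothetical 2-dimensional n-gram vectors, for illustration only.
toy_vectors = {"<wh": [1.0, 0.0], "her": [0.0, 2.0], "re>": [0.5, 0.5]}
vec_where = word_vector("where", toy_vectors, dim=2)
```

Because the vector is assembled from n-grams, a word that never occurred in training still receives a representation as long as some of its n-grams are in the table.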


Journal ArticleDOI
20 Jul 2017
TL;DR: A real-time character control mechanism using a novel neural network architecture called a Phase-Functioned Neural Network that takes as input user controls, the previous state of the character, the geometry of the scene, and automatically produces high quality motions that achieve the desired user control.
Abstract: We present a real-time character control mechanism using a novel neural network architecture called a Phase-Functioned Neural Network. In this network structure, the weights are computed via a cyclic function which uses the phase as an input. Along with the phase, our system takes as input user controls, the previous state of the character, the geometry of the scene, and automatically produces high quality motions that achieve the desired user control. The entire network is trained in an end-to-end fashion on a large dataset composed of locomotion such as walking, running, jumping, and climbing movements fitted into virtual environments. Our system can therefore automatically produce motions where the character adapts to different geometric environments such as walking and running over rough terrain, climbing over large rocks, jumping over obstacles, and crouching under low ceilings. Our network architecture produces higher quality results than time-series autoregressive models such as LSTMs as it deals explicitly with the latent variable of motion relating to the phase. Once trained, our system is also extremely fast and compact, requiring only milliseconds of execution time and a few megabytes of memory, even when trained on gigabytes of motion data. Our work is most appropriate for controlling characters in interactive scenes such as computer games and virtual reality systems.

440 citations
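
The statement that the network weights are "computed via a cyclic function which uses the phase as an input" can be sketched as follows. The abstract does not say which cyclic function is used, so this sketch assumes simple cyclic linear interpolation between a few control weight matrices; `phase_weights`, `linear_layer`, and the toy control matrices are hypothetical names and values.

```python
import math
from typing import List

Matrix = List[List[float]]


def phase_weights(phase: float, control: List[Matrix]) -> Matrix:
    """Blend control weight matrices cyclically according to the phase.

    The control matrices are treated as equally spaced around the phase
    circle [0, 2*pi); the last one blends back into the first.
    """
    k = len(control)
    pos = (phase % (2 * math.pi)) / (2 * math.pi) * k
    i0 = int(pos) % k          # control matrix at or before the phase
    i1 = (i0 + 1) % k          # next control matrix, wrapping around
    t = pos - int(pos)         # interpolation factor in [0, 1)
    return [
        [(1 - t) * a + t * b for a, b in zip(row0, row1)]
        for row0, row1 in zip(control[i0], control[i1])
    ]


def linear_layer(weights: Matrix, x: List[float]) -> List[float]:
    """Apply a bias-free linear layer with the given weights."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


# Four hypothetical 1x1 control matrices around the cycle.
control = [[[0.0]], [[1.0]], [[0.0]], [[-1.0]]]
output = linear_layer(phase_weights(math.pi / 4, control), [2.0])
```

The point of the design is that the same input produces different outputs at different phases of the motion cycle, because the phase selects the weights rather than being one more input feature.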


Proceedings ArticleDOI
01 Sep 2017
TL;DR: Character n-gram F-score (CHRF) is shown to correlate very well with human relative rankings of different machine translation outputs, especially for morphologically rich target languages; however, its relation with direct human assessments is not yet clear.
Abstract: Character n-gram F-score (CHRF) is shown to correlate very well with human relative rankings of different machine translation outputs, especially for morphologically rich target languages. However, its relation with direct human assessments is not yet clear. In this work, Pearson’s correlation coefficients for direct assessments are investigated for two currently available target languages, English and Russian. First, different β parameters (in range from 1 to 3) are re-investigated with direct assessment, and it is confirmed that β = 2 is the optimal option. Then separate character and word n-grams are investigated, and the main finding is that, apart from character n-grams, word 1-grams and 2-grams also correlate rather well with direct assessments. Further experiments show that adding word unigrams and bigrams to the standard CHRF score improves the correlations with direct assessments, though it is still not clear which option is better, unigrams only (CHRF+) or unigrams and bigrams (CHRF++). This should be investigated in future work on more target languages.

203 citations
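
As a rough illustration of a character n-gram F-score with a β parameter, the sketch below averages n-gram precision and recall over n = 1..6 and combines them with the usual F_β formula. Removing spaces, the maximum order of 6, and skipping orders with no n-grams are assumptions of this sketch; it is not the reference CHRF implementation and omits the word n-gram extensions (CHRF+, CHRF++) discussed above.

```python
from collections import Counter


def char_ngram_counts(text: str, n: int) -> Counter:
    """Count character n-grams, ignoring spaces (an assumption of this sketch)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Character n-gram F-beta score between a hypothesis and a reference."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngram_counts(hypothesis, n)
        ref = char_ngram_counts(reference, n)
        if not hyp or not ref:
            continue  # one side is too short to have n-grams of this order
        overlap = sum((hyp & ref).values())  # clipped matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # beta > 1 weights recall more heavily than precision
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

With β = 2, a hypothesis that is precise but misses part of the reference is penalised more than it would be with β = 1, since recall carries more weight.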


Proceedings ArticleDOI
10 Nov 2017
TL;DR: The authors use character n-grams, word n-grams and word skip-grams as features to detect hate speech in social media and distinguish it from general profanity, and find that the main challenge lies in discriminating profanity and hate speech from each other.
Abstract: In this paper we examine methods to detect hate speech in social media, while distinguishing this from general profanity. We aim to establish lexical baselines for this task by applying supervised classification methods using a recently released dataset annotated for this purpose. As features, our system uses character n-grams, word n-grams and word skip-grams. We obtain results of 78% accuracy in identifying posts across three classes. Results demonstrate that the main challenge lies in discriminating profanity and hate speech from each other. A number of directions for future work are discussed.

189 citations
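
The three feature types named in the abstract can be sketched as plain extraction functions. The function names, whitespace tokenisation, and the convention that a k-skip bigram allows up to k skipped words (and therefore includes ordinary bigrams) are assumptions of this illustration, not details taken from the paper.

```python
from typing import List, Tuple


def char_ngrams(text: str, n: int) -> List[str]:
    """All contiguous character n-grams of the text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """All contiguous word n-grams of the token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skip_bigrams(tokens: List[str], max_skip: int) -> List[Tuple[str, str]]:
    """Word pairs in order, with at most max_skip words skipped between them."""
    pairs = []
    for i, left in enumerate(tokens):
        for gap in range(1, max_skip + 2):
            j = i + gap
            if j < len(tokens):
                pairs.append((left, tokens[j]))
    return pairs


tokens = "a b c d".split()
features = {
    "char_3grams": char_ngrams("a b c d", 3),
    "word_2grams": word_ngrams(tokens, 2),
    "skip_1_bigrams": skip_bigrams(tokens, 1),
}
```

In a supervised setup such as the one described, feature lists like these would typically be counted per post and passed to a classifier.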


Proceedings ArticleDOI
22 Aug 2017
TL;DR: A weakly supervised framework is proposed that can utilize word annotations, either tight quadrangles or looser bounding boxes, for character detector training, making it possible to train a robust character detector by exploiting word annotations in rich, large-scale real scene text datasets, e.g. ICDAR15 and COCO-text.
Abstract: Imagery texts are usually organized as a hierarchy of several visual elements, i.e. characters, words, text lines and text blocks. Among these elements, the character is the most basic one for various languages and notations, such as Western, Chinese and Japanese text and mathematical expressions. It is natural and convenient to construct a common text detection engine based on character detectors. However, training character detectors requires a vast number of location-annotated characters, which are expensive to obtain. In practice, existing real text datasets are mostly annotated at the word or line level. To remedy this dilemma, we propose a weakly supervised framework that can utilize word annotations, either in tight quadrangles or the more loose bounding boxes, for character detector training. When applied in scene text detection, we are thus able to train a robust character detector by exploiting word annotations in the rich large-scale real scene text datasets, e.g. ICDAR15 [19] and COCO-text [39]. The character detector plays a key role in the pipeline of our text detection engine. It achieves state-of-the-art performance on several challenging scene text detection benchmarks. We also demonstrate the flexibility of our pipeline in various scenarios, including deformed text detection and math expression recognition.

164 citations


Book
01 Jun 2017
TL;DR: This book on strengths-based practice covers seven core concepts of the science of character, signature strengths, integration strategies, behavioral traps and misconceptions, and research-based interventions for applying character strengths.
Abstract: Dedication; Foreword; Preface; Acknowledgements; Chapter 1. Foundations of Strengths-Based Practice: Seven Core Concepts of the Science of Character; Chapter 2. Signature Strengths: Research and Practice; Chapter 3. Practice Essentials: Six Integration Strategies for a Strengths-Based Practice; Chapter 4. Behavioral Traps, Misconceptions, and Strategies; Chapter 5. Advanced Issues in Applying Character Strengths; Chapter 6. Character Strength Spotlights: 24 Practitioner-Friendly Handouts; Chapter 7. How to Apply Character Strengths Interventions; Chapter 8. Research-Based Interventions for Character Strengths; Chapter 9. Afterword; References; Appendix A. Background on the VIA Classification of Character Strengths and the VIA Survey; Appendix B. Checklist for Strengths-Based Practitioners; Appendix C. A Sampling of Strengths-Based Models; Appendix D. Frequently Asked Questions About Character Strengths; Appendix E. Comparison of VIA Survey with StrengthsFinder (Gallup) and Myers-Briggs Type Indicator (MBTI); Appendix F. Flagship Papers on Character Strengths; Appendix G. 10 Character Strengths Concepts and Applications in Specific Movies; Appendix H. About the VIA Institute on Character; Index

142 citations


Posted Content
TL;DR: The authors propose a weakly supervised framework that can utilize word annotations, either tight quadrangles or looser bounding boxes, for character detector training.
Abstract: Imagery texts are usually organized as a hierarchy of several visual elements, i.e. characters, words, text lines and text blocks. Among these elements, the character is the most basic one for various languages and notations, such as Western, Chinese and Japanese text and mathematical expressions. It is natural and convenient to construct a common text detection engine based on character detectors. However, training character detectors requires a vast number of location-annotated characters, which are expensive to obtain. In practice, existing real text datasets are mostly annotated at the word or line level. To remedy this dilemma, we propose a weakly supervised framework that can utilize word annotations, either in tight quadrangles or the more loose bounding boxes, for character detector training. When applied in scene text detection, we are thus able to train a robust character detector by exploiting word annotations in the rich large-scale real scene text datasets, e.g. ICDAR15 and COCO-text. The character detector plays a key role in the pipeline of our text detection engine. It achieves state-of-the-art performance on several challenging scene text detection benchmarks. We also demonstrate the flexibility of our pipeline in various scenarios, including deformed text detection and math expression recognition.

129 citations


Posted Content
TL;DR: This paper aims to establish lexical baselines for this task by applying supervised classification methods using a recently released dataset annotated for this purpose, and obtains results of 78% accuracy in identifying posts across three classes.
Abstract: In this paper we examine methods to detect hate speech in social media, while distinguishing this from general profanity. We aim to establish lexical baselines for this task by applying supervised classification methods using a recently released dataset annotated for this purpose. As features, our system uses character n-grams, word n-grams and word skip-grams. We obtain results of 78% accuracy in identifying posts across three classes. Results demonstrate that the main challenge lies in discriminating profanity and hate speech from each other. A number of directions for future work are discussed.

94 citations


Journal ArticleDOI
TL;DR: In this article, the authors summarize recent results on the complete solvability of Hermitian and rectangular complex matrix models, and show that integrability and the Virasoro constraints are simple corollaries, but not vice versa.

94 citations


Patent
19 Apr 2017
TL;DR: This patent discloses a text named entity recognition method based on Bi-LSTM, CNN and CRF: an end-to-end model that requires no data pre-processing of the unmarked corpus apart from pre-trained word vectors.
Abstract: The invention discloses a text named entity recognition method based on Bi-LSTM, CNN and CRF. The method includes the following steps: (1) using a convolutional neural network to encode the character-level information of each word in the text into a character vector; (2) combining the character vector and the word vector and passing the combination as input to a bidirectional LSTM neural network, which models the contextual information of every word; and (3) at the output end of the LSTM neural network, using continuous conditional random fields to decode the labels of the whole sentence and mark the entities in the sentence. The invention is an end-to-end model that needs no data pre-processing of the unmarked corpus apart from the pre-trained word vectors; it can therefore be widely applied to sentence labeling in different languages and fields.

93 citations
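
Step (3) above, decoding labels for a whole sentence with a CRF, is commonly done with Viterbi search over per-token scores and label-transition scores. The sketch below assumes a linear-chain formulation with additive scores; the `viterbi` function, the score dictionaries, and the toy numbers are hypothetical and do not come from the patent.

```python
from typing import Dict, List


def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the highest-scoring label sequence for one sentence.

    emissions[t][label] is the per-token score (e.g. from a Bi-LSTM);
    transitions[prev][cur] is the score of moving from prev to cur.
    """
    labels = list(emissions[0])
    best = {lab: emissions[0][lab] for lab in labels}
    back = []  # back[t][cur] = best previous label for cur at step t + 1
    for scores in emissions[1:]:
        new_best, pointers = {}, {}
        for cur in labels:
            prev = max(labels, key=lambda p: best[p] + transitions[p][cur])
            new_best[cur] = best[prev] + transitions[prev][cur] + scores[cur]
            pointers[cur] = prev
        best = new_best
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda lab: best[lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy scores: "I" directly after "O" is heavily penalised.
labels = ["O", "B", "I"]
transitions = {a: {b: 0.0 for b in labels} for a in labels}
transitions["O"]["I"] = -10.0
emissions = [
    {"O": 2.0, "B": 0.0, "I": 0.0},
    {"O": 0.0, "B": 1.0, "I": 1.5},
    {"O": 0.0, "B": 0.0, "I": 2.0},
]
decoded = viterbi(emissions, transitions)
```

Choosing the best label per token independently would give O, I, I here; decoding the sentence jointly avoids the penalised O to I transition.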



Journal ArticleDOI
TL;DR: In this paper, a framework of leader character is proposed, which provides rigor through a three-phase, multi-method approach involving 1,817 leaders, and relevance by using an engaged scholarship epistemology to validate the framework with practicing leaders.
Abstract: While the construct of character is well grounded in philosophy, ethics, and more recently psychology, it lags in acceptance and legitimacy within management research and mainstream practice. Our research seeks to remedy this through four contributions. First, we offer a framework of leader character that provides rigor through a three-phase, multi-method approach involving 1,817 leaders, and relevance by using an engaged scholarship epistemology to validate the framework with practicing leaders. This framework highlights the theoretical underpinnings of the leader character model and articulates the character dimensions and elements that operate in concert to promote effective leadership. Second, we bring leader character into mainstream management research, extending the traditional competency and interpersonal focus on leadership to embrace the foundational component of leader character. In doing this, we articulate how leader character complements and strengthens several existing theories of leadership. Third, we extend the virtues-based approach to ethical decision making to the broader domain of judgment and decision making in support of pursuing individual and organization effectiveness. Finally, we offer promising directions for future research on leader character that will also serve the larger domain of leadership research.

Journal ArticleDOI
TL;DR: It is claimed that the 2m-fold Gaussian correlators of rank-r tensors are given by r-linear combinations of dimensions with Young diagrams of size m, which emphasizes a close similarity between the technical methods in matrix and tensor models and supports the hope of understanding the emerging structures in very similar terms.

Journal ArticleDOI
TL;DR: The findings support a tripartite taxonomy of character in the school context, and positive peer relations were most consistently predicted by interpersonal character, class participation by intellectual character, and report card grades by intrapersonal character.

Journal ArticleDOI
TL;DR: The fractional occupation number weighted density analysis is explored as a general theoretical diagnostic for complicated electronic structures and opens a full quantum-mechanical, unbiased route to the automatic detection of errors in experimental protein X-ray structures, such as false protonation states or misplaced atoms.
Abstract: The fractional occupation number weighted density (FOD) analysis is explored as a general theoretical diagnostic for complicated electronic structures. Its main feature is to provide robustly and quickly the information on the localization of "hot" (strongly correlated and chemically active) electrons in a molecule. We demonstrate its usage in four different prototypical applications: 1) As a new and fast measure of the biradical character of polycyclic aromatic hydrocarbons, 2) for the selection of active orbital spaces in multiconfigurational or complete active space self consistent field (MCSCF/CASSCF) treatments, 3) as a possibility to describe molecular-energy landscapes consistently in regions with varying biradical character, as exemplified by partial double-bond torsions, and 4) as a powerful visualization method for static electron correlation effects in large biomolecules in connection with an efficient semi-empirical tight-binding molecular orbital scheme. The last application opens a full quantum-mechanical, unbiased route to the automatic detection of errors in experimental protein X-ray structures, such as false protonation states or misplaced atoms. In the first example, the complete (unfragmented) quantum-chemical calculation of the FOD for an entire metalloprotein with more than 7500 atoms is described.

Posted Content
TL;DR: This article found that character representations are effective across typologies, and that a previously unstudied combination of character trigram representations composed with bi-LSTMs outperforms most others.
Abstract: Words can be represented by composing the representations of subword units such as word segments, characters, and/or character n-grams. While such representations are effective and may capture the morphological regularities of words, they have not been systematically compared, and it is not understood how they interact with different morphological typologies. On a language modeling task, we present experiments that systematically vary (1) the basic unit of representation, (2) the composition of these representations, and (3) the morphological typology of the language modeled. Our results extend previous findings that character representations are effective across typologies, and we find that a previously unstudied combination of character trigram representations composed with bi-LSTMs outperforms most others. But we also find room for improvement: none of the character-level models match the predictive accuracy of a model with access to true morphological analyses, even when learned from an order of magnitude more data.

Journal ArticleDOI
TL;DR: In this article, the Kontsevich-Soibelman construction of the cohomological Hall algebra (CoHA) of BPS states and Lusztig's construction of canonical bases for quantum enveloping algebras were studied.
Abstract: Pursuing the similarity between the Kontsevich–Soibelman construction of the cohomological Hall algebra (CoHA) of BPS states and Lusztig's construction of canonical bases for quantum enveloping algebras, and the similarity between the integrality conjecture for motivic Donaldson–Thomas invariants and the PBW theorem for quantum enveloping algebras, we build a coproduct on the CoHA associated to a quiver with potential. We also prove a cohomological dimensional reduction theorem, further linking a special class of CoHAs with Yangians, and explaining how to connect the study of character varieties with the study of CoHAs.

01 Jan 2017
TL;DR: The Clothing as Culture dissertation investigates the emergence and didactic functions of costume prints produced between 1600 and 1650, reframing them as artifacts of an era when clothing was considered the primary visual indicator of cultural difference.
Abstract: At the turn of the seventeenth century, European printmakers began issuing single-sheet series portraying how people dressed in different parts of the world. These works are only briefly acknowledged in artists’ monographs—if such studies exist—or treated summarily in studies of fashion illustration, where their aims are insufficiently differentiated from those of fashion plates. This dissertation investigates the emergence and didactic functions of costume prints produced between 1600 and 1650, reframing them as artifacts of an era when clothing was considered the primary visual indicator of cultural difference. Collected in the albums of connoisseurs, affixed to the walls of alehouses, or incorporated into household objects, costume prints that pair national types with descriptions of customs and behaviors instructed viewers to read clothing as an index of civility, morality, and status. The project addresses the interplay between images and inscriptions, parallels with period texts, and the varied modes of reception. To acknowledge the fluid boundaries of early modern print culture, it encompasses a range of artists, audiences, and regions, chiefly the Low Countries, England, and France. Arranged chronologically and according to the geographic scope of each costume series, the dissertation traces how Europeans’ increasing knowledge of global sartorial diversity precipitated an intensified preoccupation with the role of dress in their own societies. 
In three chapters, the project considers how Pieter de Jode’s series of European costumes draw from the representational strategies of illustrated voyage accounts and from the principles of antiquarianism, cosmography, and geography; explores how allegories of the Twelve Months and the Four Continents rely on the premise of an inextricable bond between appearance and character to rank the peoples of the world; and examines the divergent attitudes toward luxury in English society by contrasting the demonization of French fashion in popular satires with Wenceslaus Hollar’s sensuous depictions of women’s attire. Through these studies, Clothing as Culture situates costume prints in the ongoing process of self-awareness about the capacity of clothing to constitute individual and collective identities in early modern Europe. Degree Type: Dissertation. Degree Name: Doctor of Philosophy (PhD). Graduate Group: History of Art. First Advisor: Larry Silver.

Posted Content
TL;DR: The proposed scene text recognition method applies character models to a convolutional feature map; because the character models are trained free of a lexicon, it can recognize unknown words, among a number of other appealing properties.
Abstract: Scene text recognition has attracted great interest from the computer vision and pattern recognition community in recent years. State-of-the-art methods use convolutional neural networks (CNNs), recurrent neural networks with long short-term memory (RNN-LSTM) or a combination of the two. In this paper, we investigate the intrinsic characteristics of text recognition, and inspired by human cognition mechanisms in reading texts, we propose a scene text recognition method with character models on a convolutional feature map. The method simultaneously detects and recognizes characters by sliding over the text line image with character models, which are learned end-to-end on text line images labeled with text transcripts. The character classifier outputs on the sliding windows are normalized and decoded with a Connectionist Temporal Classification (CTC) based algorithm. Compared to previous methods, our method has a number of appealing properties: (1) it avoids the difficulty of character segmentation, which hinders the performance of segmentation-based recognition methods; (2) the model can be trained simply and efficiently because it avoids the vanishing/exploding gradients that arise in training RNN-LSTM based models; (3) it is based on character models trained free of a lexicon, and can recognize unknown words; (4) the recognition process is highly parallel and enables fast recognition. Our experiments on several challenging English and Chinese benchmarks, including the IIIT-5K, SVT, ICDAR03/13 and TRW15 datasets, demonstrate that the proposed method yields superior or comparable performance to state-of-the-art methods while the model size is relatively small.
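
The decoding step mentioned above (classifier outputs on sliding windows decoded with a CTC-based algorithm) can be illustrated with the simplest CTC decoder, greedy best-path decoding: take the top label per window, collapse consecutive repeats, and drop blanks. Greedy decoding, the label list, and the blank index are assumptions of this sketch; the abstract does not say which CTC decoding variant the authors use.

```python
from typing import List, Sequence


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    labels: Sequence[str],
    blank: int = 0,
) -> str:
    """Best-path CTC decoding of per-window label scores.

    frame_scores[t][k] is the score of label k at sliding-window position t;
    labels[blank] is the CTC blank symbol and is never emitted.
    """
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    chars: List[str] = []
    previous = blank
    for index in best_path:
        # Emit a character only when the label changes and is not blank.
        if index != previous and index != blank:
            chars.append(labels[index])
        previous = index
    return "".join(chars)


labels = ["-", "a", "b"]  # "-" stands for the blank symbol


def one_hot(k: int) -> List[float]:
    """Toy score vector that puts all mass on label k."""
    return [1.0 if i == k else 0.0 for i in range(len(labels))]


# Window-level best labels: a a - a b b -
frames = [one_hot(k) for k in [1, 1, 0, 1, 2, 2, 0]]
decoded = ctc_greedy_decode(frames, labels)
```

The blank between the second and third windows is what allows the doubled "a" to survive the collapsing of repeats.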

Journal ArticleDOI
TL;DR: Out of the 24 character strengths, the happiness-related strengths were more likely to correlate with PWB and SWB than any other character strength and “Persistence” showed the highest correlation with the PWB aspect mastery.
Abstract: Research has shown that character strengths are positively linked with well-being in general. However, there has not been a fine-grained analysis to date. This study examines the individual relational aspects between the 24 character strengths, subjective well-being (SWB), and different aspects of psychological well-being (PWB) at two times of measurement (N=117). Results showed that overall the “good character” was significantly more strongly related to PWB than to SWB. The character strength “hope” was at least moderately correlated with the PWB aspects meaning, optimism and autonomy, and “zest” with the PWB aspects relationships and engagement. “Persistence” showed the highest correlation with the PWB aspect mastery. Out of the 24 character strengths, the happiness-related strengths (hope, zest, gratitude, curiosity, and love) were more likely to correlate with PWB and SWB than any other character strength. This study offers a more fine-grained and thorough understanding of specific relational aspects between the 24 character strengths and a broad range of well-being aspects. Future studies should take up a detailed strategy when exploring relationships between character strengths and well-being.

Journal ArticleDOI
TL;DR: It is argued that “intellectual character education,” which emphasizes the development of intellectual virtues like curiosity, open-mindedness, and intellectual courage, is an underexplored but especially promising approach in this context.
Abstract: The moral and civic dimensions of personal character have been widely recognized and explored. Recent work by philosophers, psychologists, and education theorists has drawn attention to two additional dimensions of character: intellectual character and "performance" character. This article sketches a "four-dimensional" conceptual model of personal character and some of the character strengths or "virtues" proper to each dimension. In addition to exploring how the dimensions of character are related to each other, the article also examines the implications of this account for character education undertaken in a youth or adolescent context. It is argued that "intellectual character education," which emphasizes the development of intellectual virtues like curiosity, open-mindedness, and intellectual courage, is an underexplored but especially promising approach in this context. The relationship between intellectual character education and traditional character education, which emphasizes the development of moral and civic virtues like kindness, generosity, and tolerance, is also explored.

Proceedings ArticleDOI
01 Sep 2017
TL;DR: The authors propose a neural model that predicts a tag for each character using word and character information, and demonstrate that it outperforms the state-of-the-art neural English NER model on Japanese.
Abstract: Recently, neural models have shown superior performance over conventional models in NER tasks. These models use CNN to extract sub-word information along with RNN to predict a tag for each word. However, these models have been tested almost entirely on English texts. It remains unclear whether they perform similarly in other languages. We worked on Japanese NER using neural models and discovered two obstacles of the state-of-the-art model. First, CNN is unsuitable for extracting Japanese sub-word information. Secondly, a model predicting a tag for each word cannot extract an entity when a part of a word composes an entity. The contributions of this work are (1) verifying the effectiveness of the state-of-the-art NER model for Japanese, (2) proposing a neural model for predicting a tag for each character using word and character information. Experimentally obtained results demonstrate that our model outperforms the state-of-the-art neural English NER model in Japanese.
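
The second obstacle above (a word-level tagger cannot extract an entity that is only part of a word) is easier to see with per-character tags. The sketch below converts entity spans into character-level tags; the BIO scheme, the `char_bio_tags` helper, the example sentence, and its segmentation are all assumptions chosen for illustration rather than details from the paper.

```python
from typing import List, Tuple


def char_bio_tags(text: str, entities: List[Tuple[int, int, str]]) -> List[str]:
    """Assign one BIO tag per character.

    Each entity is (start, end, label) with end exclusive, in character offsets.
    """
    tags = ["O"] * len(text)
    for start, end, label in entities:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


# Hypothetical example: if a segmenter treats "訪米" as a single word, the
# location "米" is only part of that word and cannot be tagged at word level.
sentence = "首相が訪米した"
tags = char_bio_tags(sentence, [(4, 5, "LOC")])
```

With one tag per character, the entity boundary no longer has to coincide with a word boundary produced by the segmenter.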

Proceedings ArticleDOI
17 Apr 2017
TL;DR: This paper renders each character as an image and runs it through a convolutional neural network (CNN) to produce a visual character embedding; on a text classification task in Chinese, Japanese, and Korean, the model learns to focus on the parts of characters that carry topical content, resulting in embeddings that are coherent in visual space.
Abstract: Previous work has modeled the compositionality of words by creating character-level models of meaning, reducing problems of sparsity for rare words. However, in many writing systems compositionality has an effect even on the character-level: the meaning of a character is derived by the sum of its parts. In this paper, we model this effect by creating embeddings for characters based on their visual characteristics, creating an image for the character and running it through a convolutional neural network to produce a visual character embedding. Experiments on a text classification task demonstrate that such a model allows for better processing of instances with rare characters in languages such as Chinese, Japanese, and Korean. Additionally, qualitative analyses demonstrate that our proposed model learns to focus on the parts of characters that carry topical content, resulting in embeddings that are coherent in visual space.

Proceedings ArticleDOI
01 Dec 2017
TL;DR: A convolutional deep model for recognizing Bengali handwritten characters is proposed; it first learns a useful set of features using kernels and local receptive fields, and then employs densely connected layers for the discrimination task.
Abstract: Handwritten character recognition is a nontrivial task as it seeks to recognize the correct class for user-independent handwritten characters. This problem becomes even more challenging for a language like Bengali, whose characters are highly stylized, morphologically complex, and potentially juxtapositional. As a result, the improvements over the years in Bengali character recognition are significantly smaller than for other languages. In this paper, we propose a convolutional deep model to recognize Bengali handwritten characters. We first learn a useful set of features by using kernels and local receptive fields, and then employ densely connected layers for the discrimination task. Our system has been tested on the BanglaLekha-Isolated dataset. It achieves 98.66% accuracy on numerals (10 character classes), 94.99% accuracy on vowels (11 character classes), 91.60% accuracy on compound letters (20 character classes), 91.23% accuracy on alphabets (50 character classes), and 89.93% accuracy on almost all Bengali characters (80 character classes). Most of the errors incurred by our model in the recognition task are due to extreme proximity in shapes among characters. A significant number of errors were caused by mislabeled, irrecoverably distorted, and illegal data examples.

Proceedings ArticleDOI
01 Sep 2017
TL;DR: This work investigates training character embeddings on a word-based context in a similar way, showing that the simple method improves state-of-the-art neural word segmentation models significantly, beating tri-training baselines for leveraging auto-segmented data.
Abstract: Neural parsers have benefited from automatically labeled data via dependency-context word embeddings. We investigate training character embeddings on a word-based context in a similar way, showing that the simple method improves state-of-the-art neural word segmentation models significantly, beating tri-training baselines for leveraging auto-segmented data.

Proceedings ArticleDOI
01 Apr 2017
TL;DR: This paper investigates neural character-based morphological tagging for languages with complex morphology and large tag sets and shows that the CNN based approach performs slightly worse and less consistently than the RNN based approach.
Abstract: This paper investigates neural character-based morphological tagging for languages with complex morphology and large tag sets. Character-based approaches are attractive as they can handle rarely- and unseen words gracefully. We evaluate on 14 languages and observe consistent gains over a state-of-the-art morphological tagger across all languages except for English and French, where we match the state-of-the-art. We compare two architectures for computing character-based word vectors using recurrent (RNN) and convolutional (CNN) nets. We show that the CNN based approach performs slightly worse and less consistently than the RNN based approach. Small but systematic gains are observed when combining the two architectures by ensembling.

Proceedings ArticleDOI
01 Dec 2017
TL;DR: A novel method for end-to-end ASR decoding with LMs at both the character and word level, which achieved 5.6% WER on the Eval'92 test set using only the SI284 training set and WSJ text data, the best score reported for end-to-end ASR systems on this benchmark.
Abstract: We propose a combination of character-based and word-based language models in an end-to-end automatic speech recognition (ASR) architecture. In our prior work, we combined a character-based LSTM RNN-LM with a hybrid attention/connectionist temporal classification (CTC) architecture. The character LMs improved recognition accuracy to rival state-of-the-art DNN/HMM systems in Japanese and Mandarin Chinese tasks. Although a character-based architecture can provide for open vocabulary recognition, the character-based LMs generally under-perform relative to word LMs for languages such as English with a small alphabet, because of the difficulty of modeling linguistic constraints across long sequences of characters. This paper presents a novel method for end-to-end ASR decoding with LMs at both the character and word level. Hypotheses are first scored with the character-based LM until a word boundary is encountered. Known words are then re-scored using the word-based LM, while the character-based LM provides for out-of-vocabulary scores. In a standard Wall Street Journal (WSJ) task, we achieved 5.6% WER for the Eval'92 test set using only the SI284 training set and WSJ text data, which is the best score reported for end-to-end ASR systems on this benchmark.
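
The scoring scheme described above (character LM until a word boundary, then re-scoring known words with the word LM and keeping the character LM score for out-of-vocabulary words) can be sketched in a much simplified form. The context-free toy LMs, the `score_hypothesis` function, and the use of a space as the word boundary are assumptions of this sketch; the actual system uses neural LMs conditioned on history inside beam search.

```python
import math
from typing import Callable, Dict


def score_hypothesis(
    text: str,
    char_logprob: Callable[[str], float],
    word_logprobs: Dict[str, float],
) -> float:
    """Log-score a hypothesis with character and word LMs combined."""
    total = 0.0
    pending = 0.0      # character-LM score of the word in progress
    word_chars = []
    for ch in text.strip() + " ":   # trailing space closes the last word
        pending += char_logprob(ch)
        if ch != " ":
            word_chars.append(ch)
            continue
        word = "".join(word_chars)
        if word in word_logprobs:
            total += word_logprobs[word]   # known word: use the word LM
        elif word:
            total += pending               # OOV word: keep the character LM score
        word_chars, pending = [], 0.0
    return total


def toy_char_lm(ch: str) -> float:
    """Hypothetical character LM: every character has probability 0.1."""
    return math.log(0.1)


toy_word_lm = {"the": math.log(0.5)}  # hypothetical unigram word LM
score = score_hypothesis("the zyx", toy_char_lm, toy_word_lm)
```

The known word "the" receives the word LM score, while the unknown "zyx" still gets a finite score from the character LM, which is what keeps the vocabulary open.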

Journal ArticleDOI
TL;DR: In this article, a categorification of the Chern character is proposed, which refines earlier work of Toën and Vezzosi and of Ganter and Kapranov, and shows that the secondary Chern character factors through secondary K-theory.

Journal ArticleDOI
TL;DR: In this article, a new orbital-entanglement-based multi-configurational diagnostic termed Zs(1) was proposed, which can be evaluated from a partially converged, but qualitatively correct, and therefore inexpensive density matrix renormalization group wave function.
Abstract: One of the most critical tasks at the very beginning of a quantum chemical investigation is the choice of either a multi- or single-configurational method. Naturally, many proposals exist to define a suitable diagnostic of the multi-configurational character for various types of wave functions in order to assist this crucial decision. Here, we present a new orbital-entanglement-based multi-configurational diagnostic termed Zs(1). The correspondence of orbital entanglement and static (or non-dynamic) electron correlation permits the definition of such a diagnostic. We chose our diagnostic to meet important requirements such as well-defined limits for pure single-configurational and multi-configurational wave functions. The Zs(1) diagnostic can be evaluated from a partially converged, but qualitatively correct, and therefore inexpensive density matrix renormalisation group wave function as in our recently presented automated active orbital selection protocol. Its robustness and the fact that it can ...

01 Jan 2017
TL;DR: Programmatically deriving sentiment has been the topic of many a thesis: its application in analyzing 140 character sentences, to that of 400-word Hemingway sentences; the methods ranging from nai ...
Abstract: Programmatically deriving sentiment has been the topic of many a thesis: its application in analyzing 140 character sentences, to that of 400-word Hemingway sentences; the methods ranging from nai ...