
Showing papers on "Character (mathematics)" published in 2015


Proceedings ArticleDOI
01 Sep 2015
TL;DR: The proposed use of character n-gram F-score for automatic evaluation of machine translation output shows very promising results, especially for the CHRF3 score – for translation from English, this variant showed the highest segment-level correlations outperforming even the best metrics on the WMT14 shared evaluation task.
Abstract: We propose the use of character n-gram F-score for automatic evaluation of machine translation output. Character n-grams have already been used as a part of more complex metrics, but their individual potential has not been investigated yet. We report system-level correlations with human rankings for 6-gram F1-score (CHRF) on the WMT12, WMT13 and WMT14 data as well as segment-level correlation for 6-gram F1 (CHRF) and F3-scores (CHRF3) on WMT14 data for all available target languages. The results are very promising, especially for the CHRF3 score – for translation from English, this variant showed the highest segment-level correlations outperforming even the best metrics on the WMT14 shared evaluation task.

743 citations
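
The character n-gram F-score described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `char_ngrams` and `chrf` are hypothetical, and two details are assumptions rather than facts from the listing, namely that whitespace is dropped before n-gram extraction and that precision and recall are averaged arithmetically over n = 1..6 before being combined into an F-beta score.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of order n (assumption: spaces are ignored)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis, reference, max_n=6, beta=1.0):
    """Character n-gram F-beta score; beta=1 mirrors CHRF, beta=3 mirrors CHRF3."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # One side is too short to contain n-grams of this order.
            continue
        # Clipped overlap: each n-gram counts at most as often as in the reference.
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

With `beta=3` recall is weighted more heavily than precision, which is the variant the abstract reports as correlating best at segment level.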


Proceedings ArticleDOI
09 Aug 2015
TL;DR: A model for constructing vector representations of words by composing characters using bidirectional LSTMs that requires only a single vector per character type and a fixed set of parameters for the compositional model, and which yields state-of-the-art results in language modeling and part-of-speech tagging.
Abstract: We introduce a model for constructing vector representations of words by composing characters using bidirectional LSTMs. Relative to traditional word representation models that have independent vectors for each word type, our model requires only a single vector per character type and a fixed set of parameters for the compositional model. Despite the compactness of this model and, more importantly, the arbitrary nature of the form‐function relationship in language, our “composed” word representations yield state-of-the-art results in language modeling and part-of-speech tagging. Benefits over traditional baselines are particularly pronounced in morphologically rich languages (e.g., Turkish).

538 citations


Posted Content
Abstract: We introduce a model for constructing vector representations of words by composing characters using bidirectional LSTMs. Relative to traditional word representation models that have independent vectors for each word type, our model requires only a single vector per character type and a fixed set of parameters for the compositional model. Despite the compactness of this model and, more importantly, the arbitrary nature of the form-function relationship in language, our "composed" word representations yield state-of-the-art results in language modeling and part-of-speech tagging. Benefits over traditional baselines are particularly pronounced in morphologically rich languages (e.g., Turkish).

519 citations



Proceedings Article
Xinxiong Chen1, Lei Xu1, Zhiyuan Liu1, Maosong Sun1, Huanbo Luan1 
25 Jul 2015
TL;DR: A character-enhanced word embedding model (CWE) is presented to address the issues of character ambiguity and non-compositional words, and the effectiveness of CWE on word relatedness computation and analogical reasoning is evaluated.
Abstract: Most word embedding methods take a word as a basic unit and learn embeddings according to words' external contexts, ignoring the internal structures of words. However, in some languages such as Chinese, a word is usually composed of several characters and contains rich internal information. The semantic meaning of a word is also related to the meanings of its composing characters. Hence, we take Chinese for example, and present a character-enhanced word embedding model (CWE). In order to address the issues of character ambiguity and non-compositional words, we propose multiple prototype character embeddings and an effective word selection method. We evaluate the effectiveness of CWE on word relatedness computation and analogical reasoning. The results show that CWE outperforms other baseline methods which ignore internal character information. The codes and data can be accessed from https://github.com/Leonard-Xu/CWE.

265 citations


Posted Content
TL;DR: A neural machine translation model that views the input and output sentences as sequences of characters rather than words, which alleviates many of the challenges associated with preprocessing/tokenization of the source and target languages.
Abstract: We introduce a neural machine translation model that views the input and output sentences as sequences of characters rather than words. Since word-level information provides a crucial source of bias, our input model composes representations of character sequences into representations of words (as determined by whitespace boundaries), and then these are translated using a joint attention/translation model. In the target language, the translation is modeled as a sequence of word vectors, but each word is generated one character at a time, conditional on the previous character generations in each word. As the representation and generation of words is performed at the character level, our model is capable of interpreting and generating unseen word forms. A secondary benefit of this approach is that it alleviates much of the challenges associated with preprocessing/tokenization of the source and target languages. We show that our model can achieve translation results that are on par with conventional word-based models.

255 citations


Book
17 Apr 2015
TL;DR: This book provides a reconstruction of Aristotelian character education, shedding new light on what moral character really is, and how it can be highlighted, measured, nurtured and taught in current schooling.
Abstract: This book provides a reconstruction of Aristotelian character education, shedding new light on what moral character really is, and how it can be highlighted, measured, nurtured and taught in current schooling. Arguing that many recent approaches to character education understand character in exclusively amoral, instrumentalist terms, Kristjansson proposes a coherent, plausible and up-to-date concept, retaining the overall structure of Aristotelian character education. After discussing and debunking popular myths about Aristotelian character education, subsequent chapters focus on the practical ramifications and methodologies of character education. These include measuring virtue and morality, asking whether Aristotelian character education can salvage the effects of bad upbringing, and considering implications for teacher training and classroom practice. The book rejuvenates time-honoured principles of the development of virtues in young people, at a time when ‘character’ features prominently in educational agendas and parental concerns over school education systems. Offering an interdisciplinary perspective which draws from the disciplines of education, psychology, philosophy and sociology, this book will appeal to researchers, academics and students wanting a greater insight into character education.

169 citations


Proceedings ArticleDOI
01 Jan 2015
TL;DR: It is demonstrated that character n-grams that capture information about affixes and punctuation account for almost all of the power of character n-grams as features.
Abstract: Character n-grams have been identified as the most successful feature in both single-domain and cross-domain Authorship Attribution (AA), but the reasons for their discriminative value were not fully understood. We identify subgroups of character n-grams that correspond to linguistic aspects commonly claimed to be covered by these features: morphosyntax, thematic content and style. We evaluate the predictiveness of each of these groups in two AA settings: a single domain setting and a cross-domain setting where multiple topics are present. We demonstrate that character n-grams that capture information about affixes and punctuation account for almost all of the power of character n-grams as features. Our study contributes new insights into the use of n-grams for future AA work and other classification tasks.

165 citations


Journal ArticleDOI
TL;DR: The proposed method normalizes the written character images and then employs a CNN to classify individual characters; it shows satisfactory recognition accuracy and outperforms some other prominent existing methods.
Abstract: Handwritten character recognition complexity varies among different languages due to distinct shapes, strokes and number of characters. Far more work on handwritten character recognition is available for English than for other major languages such as Bangla. Existing methods use distinct feature extraction techniques and various classification tools in their recognition schemes. Recently, the Convolutional Neural Network (CNN) has been found efficient for English handwritten character recognition. In this paper, CNN-based Bangla handwritten character recognition is investigated. The proposed method normalizes the written character images and then employs a CNN to classify individual characters. Unlike other related works, it does not employ any feature extraction method. 20000 handwritten characters with different shapes and variations are used in this study. The proposed method shows satisfactory recognition accuracy and outperforms some other prominent existing methods.

105 citations


Proceedings ArticleDOI
23 Aug 2015
TL;DR: An unconstrained end-to-end text localization and recognition method that detects initial text hypothesis in a single pass by an efficient region-based method and refines the text hypothesis using a more robust local text model, which deviates from the common assumption of region-based methods that all characters are detected as connected components.
Abstract: An unconstrained end-to-end text localization and recognition method is presented. The method detects initial text hypothesis in a single pass by an efficient region-based method and subsequently refines the text hypothesis using a more robust local text model, which deviates from the common assumption of region-based methods that all characters are detected as connected components.

86 citations


Posted Content
TL;DR: This work develops two component-enhanced Chinese character embedding models and their bigram extensions that explore the compositions of Chinese characters, which often serve inherently as semantic indicators.
Abstract: Distributed word representations are very useful for capturing semantic information and have been successfully applied in a variety of NLP tasks, especially on English. In this work, we innovatively develop two component-enhanced Chinese character embedding models and their bigram extensions. Distinguished from English word embeddings, our models explore the compositions of Chinese characters, which often serve inherently as semantic indicators. The evaluations on both word similarity and text classification demonstrate the effectiveness of our models.

Journal ArticleDOI
TL;DR: This article addresses the link between sustainable behavior (SB) and the character strengths that constitute universal virtues.
Abstract: This article addresses the link existing between sustainable behavior (SB) and the character strengths that constitute universal virtues. Research was conducted to confirm the idea that SB comprise...

Journal ArticleDOI
TL;DR: It is argued that morphological character trees generated by phylogenetic analysis of transcriptomes provide a useful tool for identifying causal gene expression differences underlying the development and evolution of morphological characters and enable rigorous testing of different models of morphological character evolution and origination.
Abstract: We elaborate a framework for investigating the evolutionary history of morphological characters. We argue that morphological character trees generated by phylogenetic analysis of transcriptomes provide a useful tool for identifying causal gene expression differences underlying the development and evolution of morphological characters. They also enable rigorous testing of different models of morphological character evolution and origination, including the hypothesis that characters originate via divergence of repeated ancestral characters. Finally, morphological character trees provide evidence that character transcriptomes undergo concerted evolution. We argue that concerted evolution of transcriptomes can explain the so-called "species signal" found in several recent comparative transcriptome studies. The species signal is the phenomenon that transcriptomes cluster by species rather than character type, even though the characters are older than the respective species. We suggest the species signal is a natural consequence of concerted gene expression evolution resulting from mutations that alter gene regulatory network interactions shared by the characters under comparison. Thus, character trees generated from transcriptomes allow us to investigate the variational independence, or individuation, of morphological characters at the level of genetic programs.

Journal ArticleDOI
TL;DR: In this paper, the authors provide an explicit construction of the local Langlands correspondence for general tamely-ramified reductive p-adic groups and a class of wildly ramified Langlands parameters.
Abstract: We provide an explicit construction of the local Langlands correspondence for general tamely-ramified reductive p-adic groups and a class of wildly ramified Langlands parameters. Furthermore, we verify that our construction satisfies many expected properties of such a correspondence. More precisely, we show that each $L$-packet we construct admits a parameterization in terms of the Langlands dual group, contains a unique generic element for a fixed Whittaker datum, satisfies the formal degree conjecture, is compatible with central and cocentral characters, provides a stable virtual character, and satisfies the expected endoscopic character identities. Moreover, we show that in the case of $\mathrm{GL}_n$, our construction coincides with the established local Langlands correspondence. Our techniques provide a general approach to the construction of the local Langlands correspondence for tamely-ramified groups and regular supercuspidal parameters.

Proceedings ArticleDOI
01 Jan 2015
TL;DR: The authors develop two component-enhanced Chinese character embedding models and their bigram extensions, which explore the compositions of Chinese characters that often serve inherently as semantic indicators, and demonstrate the effectiveness of the models on word similarity and text classification.
Abstract: Distributed word representations are very useful for capturing semantic information and have been successfully applied in a variety of NLP tasks, especially on English. In this work, we innovatively develop two component-enhanced Chinese character embedding models and their bigram extensions. Distinguished from English word embeddings, our models explore the compositions of Chinese characters, which often serve inherently as semantic indicators. The evaluations on both word similarity and text classification demonstrate the effectiveness of our models.

Patent
18 Mar 2015
TL;DR: A character input device with a touch panel that integrates a display unit and an input unit: it associates a keyboard with the touch panel, displays the character at the touched position as an input target character, evaluates that character on the basis of the input manner, and displays it distinguishably when it is determined to be a correction target.
Abstract: A character input device, including: a touch panel which integrally includes a display unit and an input unit; a first control unit which displays a character input screen having a character display region on the display unit, associates a keyboard including a plurality of characters with the touch panel and displays a character in the keyboard corresponding to a position where a touch input is performed via the input unit in the character display region as an input target character; an evaluation unit which obtains an evaluation value for the input target character on basis of an input manner; a determination unit which determines whether the input target character is a correction target character on basis of the evaluation value; and a second control unit which displays the correction target character on the display unit so as to be distinguishable

Book ChapterDOI
14 Apr 2015
TL;DR: In this article, the authors propose using character n-grams as features for opinion spam detection, since they have been shown to capture lexical content as well as stylistic information, and evaluate the effectiveness of character n-grams while decreasing the training set size in order to simulate real training conditions.
Abstract: In this paper we consider the detection of opinion spam as a stylistic classification task because, given a particular domain, the deceptive and truthful opinions are similar in content but differ in the way opinions are written (style). Particularly, we propose using character n-grams as features since they have been shown to capture lexical content as well as stylistic information. We evaluated our approach on a standard corpus composed of 1600 hotel reviews, considering positive and negative reviews. We compared the results obtained with character n-grams against the ones with word n-grams. Moreover, we evaluated the effectiveness of character n-grams decreasing the training set size in order to simulate real training conditions. The results obtained show that character n-grams are good features for the detection of opinion spam; they seem to be able to capture better than word n-grams the content of deceptive opinions and the writing style of the deceiver. In particular, results show an improvement of 2.3% and 2.1% over the word-based representations in the detection of positive and negative deceptive opinions respectively. Furthermore, character n-grams allow good performance to be obtained even with a very small training corpus. Using only 25% of the training set, a Naive Bayes classifier showed F1 values up to 0.80 for both opinion polarities.

Journal ArticleDOI
27 Jul 2015
TL;DR: This work presents a space-time abstraction for the sketch-based design of character animation that allows animators to draft a full coordinated motion using a single stroke called the space-time curve (STC).
Abstract: We present a space-time abstraction for the sketch-based design of character animation. It allows animators to draft a full coordinated motion using a single stroke called the space-time curve (STC). From the STC we compute a dynamic line of action (DLOA) that drives the motion of a 3D character through projective constraints. Our dynamic models for the line's motion are entirely geometric, require no pre-existing data, and allow full artistic control. The resulting DLOA can be refined by over-sketching strokes along the space-time curve, or by composing another DLOA on top leading to control over complex motions with few strokes. Additionally, the resulting dynamic line of action can be applied to arbitrary body parts or characters. To match a 3D character to the 2D line over time, we introduce a robust matching algorithm based on closed-form solutions, yielding a tight match while allowing squash and stretch of the character's skeleton. Our experiments show that space-time sketching has the potential of bringing animation design within the reach of beginners while saving time for skilled artists.

Proceedings ArticleDOI
01 Sep 2015
TL;DR: Current approaches rely heavily on NER to identify characters, causing many to be overlooked; a novel technique for character detection is proposed, achieving significant improvements over the state of the art on multiple datasets.
Abstract: Characters are fundamental to literary analysis. Current approaches are heavily reliant on NER to identify characters, causing many to be overlooked. We propose a novel technique for character detection, achieving significant improvements over state of the art on multiple datasets.

Posted Content
TL;DR: This work proposes two alternative structural modifications to the classical RNN model, one of which consists of conditioning the character-level representation on the previous word representation, while the other uses the character history to condition the output probability.
Abstract: Recurrent neural networks are convenient and efficient models for language modeling. However, when applied on the level of characters instead of words, they suffer from several problems. In order to successfully model long-term dependencies, the hidden representation needs to be large. This in turn implies higher computational costs, which can become prohibitive in practice. We propose two alternative structural modifications to the classical RNN model. The first one consists of conditioning the character-level representation on the previous word representation. The other one uses the character history to condition the output probability. We evaluate the performance of the two proposed modifications on challenging, multi-lingual real world data.

Journal ArticleDOI
TL;DR: In this paper, the authors present an approach in which nonadditive systems can be described within a purely thermodynamic formalism, by considering a large ensemble of replicas of the system to which the standard formulation of thermodynamics can be naturally applied and from which the properties of a single system can be inferred.
Abstract: The usual formulation of thermodynamics is based on the additivity of macroscopic systems. However, there are numerous examples of macroscopic systems that are not additive, due to the long-range character of the interaction among the constituents. We present here an approach in which nonadditive systems can be described within a purely thermodynamic formalism. The basic concept is to consider a large ensemble of replicas of the system where the standard formulation of thermodynamics can be naturally applied and the properties of a single system can be consequently inferred. After presenting the approach, we show its implementation in systems where the interaction decays as 1/r^α in the interparticle distance r, with α smaller than the embedding dimension d, and in the Thirring model for gravitational systems.
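
A short worked estimate may help show why the condition α < d in the abstract above leads to nonadditivity. This is a standard textbook-style argument, not taken from the paper itself; the uniform density ρ, the short-distance cutoff δ, the coupling constant J, and the solid-angle factor Ω_d are all assumptions introduced for illustration. For a particle at the centre of a d-dimensional ball of radius R, the interaction energy per particle scales as

$$
\varepsilon \;\sim\; \rho\, J\, \Omega_d \int_{\delta}^{R} r^{-\alpha}\, r^{d-1}\, \mathrm{d}r
\;=\; \frac{\rho\, J\, \Omega_d}{d-\alpha}\left( R^{\,d-\alpha} - \delta^{\,d-\alpha} \right), \qquad \alpha \neq d .
$$

For α < d this grows without bound as R increases, so under these assumptions the total energy grows faster than the volume V ∝ R^d (roughly as V^{2-α/d}) and the energy of two subsystems is not the sum of their separate energies.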

Journal ArticleDOI
TL;DR: This article clarifies the concept of character traits, or virtues, provides a unified operational version of it for incorporation into management, and proposes a list of moral competencies that can be implemented in competency-based human resource management.
Abstract: In recent years, character traits in general and virtue-related concepts in particular have been of considerable interest to philosophers, psychological researchers, and practitioners in the business ethics field. Three approaches to character traits can be used to incorporate ethics into organizations: virtues (philosophical approach), character strengths (psychological approach), and competencies (management approach). The aim of this article is to clarify the concept of character traits, or virtues, and provide a unified operational version of it for incorporation into management. To this end, we first discuss the analogy among virtues, character strengths, and competencies. Then, we propose a list of moral competencies that can be implemented in competency-based human resource management.

Posted Content
TL;DR: In this paper, twisted versions of the wild character varieties are constructed.
Abstract: We will construct twisted versions of the wild character varieties.

Journal ArticleDOI
TL;DR: It is suggested here that the ultimate accuracy of DFT methods arises from the type of hybridization scheme followed, and a large number of exchange-correlation functionals against the AE6, G2/148, and S22 reference data sets are assessed.
Abstract: It is suggested here that the ultimate accuracy of DFT methods arises from the type of hybridization scheme followed. This idea can be cast into a mathematical formulation utilizing an integrand connecting the noninteracting and the interacting particle system. We consider two previously developed models for it, dubbed as HYB0 and QIDH, and assess a large number of exchange-correlation functionals against the AE6, G2/148, and S22 reference data sets. An interesting consequence of these hybridization schemes is that the error bars, including the standard deviation, are found to markedly decrease with respect to the density-based (nonhybrid) case. This improvement is substantially better than variations due to the underlying density functional used. We thus finally hypothesize about the universal character of the HYB0 and QIDH models.

Patent
19 Aug 2015
TL;DR: In this article, the authors describe a process of matching media characters, the process including: obtaining a plurality of character records, each character record including a trait vector specifying traits of the respective character; receiving a request from a user device to match characters in the character records.
Abstract: Provided is a process of matching media characters, the process including: obtaining a plurality of character records, each character record including a trait vector specifying traits of the respective character; receiving a request from a user device to match characters in the character records, the request identifying at least one reference character record; calculating, with one or more processors, matching scores indicative of similarity between the trait vector of the reference character record and trait vectors of other character records among the plurality of character records; selecting a responsive character record from among the plurality of character records based on the matching scores; and sending instructions to the user device to display information about a character of the responsive character record.
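
The matching step in the abstract above (scoring trait vectors against a reference record and selecting a responsive record) could look roughly like the sketch below. The patent listing does not say which similarity measure is used, so cosine similarity is an assumption here, and the names `cosine_similarity`, `best_match`, and the sample records are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length trait vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def best_match(reference_id, records):
    """Score every other record against the reference and pick the top one.

    `records` maps a character id to its trait vector; it is assumed to hold
    at least one record besides the reference.
    """
    reference = records[reference_id]
    scores = {
        character_id: cosine_similarity(reference, traits)
        for character_id, traits in records.items()
        if character_id != reference_id
    }
    return max(scores, key=scores.get), scores


# Hypothetical trait vectors, e.g. (bravery, humor, cynicism).
records = {
    "hero": [1.0, 0.0, 1.0],
    "sidekick": [1.0, 0.0, 0.9],
    "villain": [0.0, 1.0, 0.0],
}
match, scores = best_match("hero", records)
```

Cosine similarity ignores vector magnitude, which suits traits scored on a common scale; a distance-based score would be an equally plausible choice.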

Journal ArticleDOI
TL;DR: Although scholars recognize that institutionalized cooperation is central to both business and politics in many advanced, industrialized economies, they remain divided over, among other things, its origins and character.
Abstract: Despite recognizing that institutionalized cooperation is central to both business and politics in many advanced, industrialized economies, scholars remain divided over the origins, character, and ...

Journal ArticleDOI
TL;DR: A method for isolated handwritten or hand-printed character recognition that uses dynamic time warping (DTW) to match the non-linear multi-projection profiles produced by the Radon transform.
Abstract: In this paper, we study a method for isolated handwritten or hand-printed character recognition using dynamic programming for matching the non-linear multi-projection profiles that are produced from the Radon transform. The idea is to use the dynamic time warping (DTW) algorithm to match corresponding pairs of the Radon features for all possible projections. By using DTW, we can avoid compressing the feature matrix into a single vector, which may lose information. It can handle character images of different shapes and sizes, which commonly occur in natural handwriting, in addition to difficulties such as multi-class similarities, deformations and possible defects. Besides, a comprehensive study is made by taking a major set of state-of-the-art shape descriptors over several character and numeral datasets from different scripts such as Roman, Devanagari, Oriya, Bangla and Japanese-Katakana, including symbols. For all scripts, the method shows a generic behaviour by providing optimal recognition rates but with high computational cost.
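
The DTW matching mentioned in the abstract above can be illustrated with the classic dynamic-programming recurrence. This is a generic sketch, not the paper's implementation: profiles are assumed to be plain lists of numbers, the local cost is the absolute difference, and summing the per-projection distances in the hypothetical helper `profile_set_distance` is an assumption about how the projections are combined.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D profiles."""
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] and b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def profile_set_distance(profiles_a, profiles_b):
    """Sum DTW distances over corresponding projection profiles (assumed rule)."""
    return sum(dtw_distance(p, q) for p, q in zip(profiles_a, profiles_b))
```

Because DTW aligns profiles of different lengths, characters of different sizes can be compared without resampling; the quadratic table per profile pair is consistent with the high computational cost the abstract notes.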

Journal ArticleDOI
TL;DR: In this article, it was shown that the hypercohomology of most character twists of perverse sheaves on a complex abelian variety vanishes in all non-zero degrees.
Abstract: We show that the hypercohomology of most character twists of perverse sheaves on a complex abelian variety vanishes in all non-zero degrees. As a consequence we obtain a vanishing theorem for constructible sheaves and a relative vanishing theorem for a homomorphism between abelian varieties. Our proof relies on a Tannakian description for convolution products of perverse sheaves, and with future applications in mind we discuss the basic properties of the arising Tannaka groups.

Journal ArticleDOI
01 Sep 2015
TL;DR: Muller argues that populism is not just anti-elitist, but also necessarily anti-pluralist, and in this exclusive claim to representation lies its profoundly undemocratic character.
Abstract: Donald Trump is but Bernie Sanders isn't; Syriza is, sometimes. In analysing the state of contemporary populism, Jan-Werner Muller argues that it is not just anti-elitist, but also necessarily anti-pluralist, and in this exclusive claim to representation lies its profoundly undemocratic character.

Journal ArticleDOI
TL;DR: The results indicate that although the content of the verbal responses of both virtual characters was the same, participants showed different subjective and behavioral responses to the two different personalities.
Abstract: We introduce a novel technique for the study of human–virtual character interaction in immersive virtual reality. The human participants verbally administered a standard questionnaire about social anxiety to a virtual female character, which responded to each question through speech and body movements. The purpose was to study the extent to which participants responded differently to characters that exhibited different personalities, even though the verbal content of their answers was always the same. A separate online study provided evidence that our intention to create two different personality types had been successful. In the main between-groups experiment that utilized a Cave system there were 24 male participants, where 12 interacted with a female virtual character portrayed to exhibit shyness and the remaining 12 with an identical but more confident virtual character. Our results indicate that although the content of the verbal responses of both virtual characters was the same, participants showed different subjective and behavioral responses to the two different personalities. In particular participants evaluated the shy character more positively, for example, expressing willingness to spend more time with her. Participants evaluated the confident character more negatively and waited for a significantly longer time to call her back after she had left the scene in order to answer a telephone call. The method whereby participants interviewed the virtual character allowed naturalistic conversation while avoiding the necessity of speech processing and generation, and natural language understanding. It is therefore a useful method for the study of the impact of virtual character personality on participant responses.