Open Access Journal Article

Hindi to English Machine Transliteration of Named Entities using Conditional Random Fields

TLDR
This paper focuses on Hindi to English machine transliteration of Indian named entities such as proper nouns, place names and organization names using conditional random fields (CRF).
Abstract
Machine transliteration has received significant research attention in recent years. In most cases, the source language has been English and the target language has been an Asian language. This paper focuses on Hindi to English machine transliteration of Indian named entities such as proper nouns, place names and organization names using conditional random fields (CRF). Hindi is the national language of India and is spoken by more than 500 million Indians. Hindi is the world's fourth most commonly used language after Chinese, English and Spanish. The system takes an Indian place name written in Hindi in the Devanagari script as input and transliterates it into English. The input is provided in syllabified form so that n-gram techniques can be applied. Because more than 50% of named entities are formed from two or three syllabic units, an n-gram approach using unigrams, bigrams and trigrams of Hindi syllabic units is used to train the system on the corpus. The system performs satisfactorily with trigrams, compared with unigrams and bigrams.
General Terms: Machine Transliteration
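The n-gram step described in the abstract can be illustrated with a minimal Python sketch. It assumes the name has already been split into syllabic units; the function name `syllable_ngrams` and the hand-split example below are hypothetical and are not taken from the paper, whose syllabification rules and CRF feature templates are not given here.

```python
from typing import List, Tuple


def syllable_ngrams(syllables: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams over a sequence of syllabic units."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence shorter than n yields an empty list.
    return [tuple(syllables[i:i + n]) for i in range(len(syllables) - n + 1)]


# Hypothetical, hand-split input (an assumption for illustration only;
# the paper's own syllabification scheme may split names differently).
units = ["को", "ल", "का", "ता"]

unigrams = syllable_ngrams(units, 1)
bigrams = syllable_ngrams(units, 2)
trigrams = syllable_ngrams(units, 3)
```

In a CRF setting, n-grams like these would typically serve as context features for labeling each source unit with its target-language counterpart; how the paper encodes them is not specified in the abstract.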


Citations
Journal Article

Rule Based Transliteration Scheme for English to Punjabi

TL;DR: This paper performs machine transliteration for the English-Punjabi language pair using a rule-based approach, and constructs a set of rules for syllabification.
Journal Article

Rule Based Transliteration Scheme for English to Punjabi

TL;DR: In this paper, a rule-based approach is used for the English-Punjabi language pair, and the probabilities for named entities (proper names and locations) are calculated from relative frequencies using the statistical machine translation toolkit MOSES.
Journal Article

A Sequence-to-Sequence based Approach For the double Transliteration of Tunisian Dialect

TL;DR: A deep learning based Sequence-to-Sequence approach is proposed to perform word-level transliteration of user-generated Tunisian dialect on the social web, in both the Latin-to-Arabic and Arabic-to-Latin directions.
Journal Article

Romanized Tunisian dialect transliteration using sequence labelling techniques

TL;DR: This work addresses the automatic Latin-to-Arabic transliteration of Tunisian dialect (TD) text produced on the social web and proposes an approach that models transliteration as a sequence labeling task.
References
Proceedings Article

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.

An Introduction to Conditional Random Fields for Relational Learning

TL;DR: A solution to this problem is to directly model the conditional distribution p(y|x), which is sufficient for classification, and this is the approach taken by conditional random fields.
Journal Article

Machine transliteration

TL;DR: This paper presents and evaluates a method for performing backward transliteration by machine, from Japanese back to English, using a generative model that incorporates several distinct stages of the transliteration process.