Yujie Zhang
Researcher at National Institute of Information and Communications Technology
Publications - 24
Citations - 395
Yujie Zhang is an academic researcher from the National Institute of Information and Communications Technology. The author has contributed to research in topics: Machine translation & Parsing. The author has an h-index of 9 and has co-authored 24 publications receiving 383 citations. Previous affiliations of Yujie Zhang include Beijing Jiaotong University.
Papers
Proceedings Article
Improving Chinese Word Segmentation and POS Tagging with Semi-supervised Methods Using Large Auto-Analyzed Data
TL;DR: A simple yet effective semi-supervised method to improve Chinese word segmentation and POS tagging by introducing novel features derived from large auto-analyzed data to enhance a simple pipelined system.
Proceedings Article
Chinese Named Entity Recognition with Conditional Random Fields
TL;DR: A Chinese Named Entity Recognition (NER) system submitted to the closed track of SIGHAN Bakeoff 2006 is presented, which incorporates basic features and additional features based on Conditional Random Fields (CRFs) in order to correct inconsistent results.
Proceedings Article
An Empirical Study of Chinese Chunking
TL;DR: An empirical study of Chinese chunking on a corpus extracted from the UPenn Chinese Treebank-4 (CTB4) is presented, and two novel voting methods based on the characteristics of the chunking task are described.
Proceedings Article
Dependency Parsing with Short Dependency Relations in Unlabeled Data
TL;DR: This paper presents an effective dependency parsing approach that incorporates short dependency information from unlabeled data by training a second parser, which uses information on short dependency relations extracted from the output of the first parser.
Proceedings Article
Multilingual aligned parallel treebank corpus reflecting contextual information and its applications
TL;DR: This paper shows that, by using the aligned parallel treebank corpus, a framework for parallel translations whose source-language sentence is similar to a given sentence can be generated semi-automatically.