Yujie Zhang

Researcher at National Institute of Information and Communications Technology

Publications - 24
Citations - 395

Yujie Zhang is an academic researcher from National Institute of Information and Communications Technology. The author has contributed to research on the topics of machine translation and parsing. The author has an h-index of 9 and has co-authored 24 publications receiving 383 citations. Previous affiliations of Yujie Zhang include Beijing Jiaotong University.

Papers
Proceedings Article

Improving Chinese Word Segmentation and POS Tagging with Semi-supervised Methods Using Large Auto-Analyzed Data

TL;DR: A simple yet effective semi-supervised method to improve Chinese word segmentation and POS tagging by introducing novel features derived from large auto-analyzed data to enhance a simple pipelined system.
Proceedings Article

Chinese Named Entity Recognition with Conditional Random Fields

TL;DR: A Chinese Named Entity Recognition (NER) system submitted to the closed track of SIGHAN Bakeoff 2006 is presented, which incorporates basic features and additional features based on Conditional Random Fields (CRFs) in order to correct inconsistent results.
Proceedings Article

An Empirical Study of Chinese Chunking

TL;DR: An empirical study of Chinese chunking on a corpus extracted from the UPenn Chinese Treebank-4 (CTB4) is presented, and two novel voting methods based on the characteristics of the chunking task are described.
Proceedings Article

Dependency Parsing with Short Dependency Relations in Unlabeled Data

TL;DR: This paper presents an effective dependency parsing approach that incorporates short dependency information from unlabeled data by training a second parser, which uses information on short dependency relations extracted from the output of the first parser.
Proceedings Article

Multilingual aligned parallel treebank corpus reflecting contextual information and its applications

TL;DR: This paper shows that parallel translations whose source-language sentence is similar to a given sentence can be generated semi-automatically, and that a framework for doing so can be achieved by using the aligned parallel treebank corpus.