scispace - formally typeset
Topic

Shallow parsing

About: Shallow parsing is a research topic. Over its lifetime, 397 publications have been published within this topic, receiving 10,211 citations.


Papers
Proceedings Article
06 Nov 2007
TL;DR: The purpose of this paper is to characterize a chunk boundary parsing algorithm that uses a statistical method combined with adjustment rules, serving as a supplement to traditional statistics-based parsing methods.
Abstract: Natural language processing (NLP) is a very active research domain. One important branch of it is sentence analysis, including Chinese sentence analysis. However, no mature deep analysis theories and techniques are currently available. An alternative is to perform shallow parsing on sentences, which is very popular in the domain. Chunk identification is a fundamental task for shallow parsing. The purpose of this paper is to characterize a chunk boundary parsing algorithm that uses a statistical method combined with adjustment rules, serving as a supplement to traditional statistics-based parsing methods. The experimental results show that the model works well on the small dataset. It will contribute to subsequent processes such as chunk tagging and chunk collocation extraction.
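To make the chunk-identification task concrete, here is a minimal sketch of rule-based noun-phrase chunking over POS-tagged input. The tag set and grouping rule are illustrative assumptions, not the paper's statistical model, which learns chunk boundaries from data.

```python
# Minimal sketch of chunk identification (shallow parsing): group
# maximal runs of NP-eligible POS tags into noun-phrase chunks.
# The tag set and rule are illustrative, not the paper's method.

NP_TAGS = {"DT", "JJ", "NN", "NNS", "NNP"}  # tags that may belong to an NP

def chunk_noun_phrases(tagged):
    """Group maximal runs of NP-eligible tags into chunks.

    tagged: list of (word, pos) pairs
    returns: list of (chunk_label, [words]) with label "NP" or "O"
    """
    chunks = []
    current = []
    for word, pos in tagged:
        if pos in NP_TAGS:
            current.append(word)
        else:
            if current:
                chunks.append(("NP", current))
                current = []
            chunks.append(("O", [word]))
    if current:
        chunks.append(("NP", current))
    return chunks

sentence = [("The", "DT"), ("quick", "JJ"), ("fox", "NN"),
            ("jumps", "VBZ"), ("over", "IN"),
            ("the", "DT"), ("lazy", "JJ"), ("dog", "NN")]
print(chunk_noun_phrases(sentence))
# → [('NP', ['The', 'quick', 'fox']), ('O', ['jumps']), ('O', ['over']),
#    ('NP', ['the', 'lazy', 'dog'])]
```

A statistical chunker, like the one the paper describes, replaces the fixed tag set with boundary probabilities estimated from a corpus, with hand-written adjustment rules correcting systematic errors.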

2 citations

01 Jan 2001
TL;DR: A part-of-speech tagger for Czech is described that employs the DIS shallow parser for Czech, manually coded rules, and inductive logic programming.
Abstract: A part-of-speech tagger for Czech is described that employs the DIS shallow parser for Czech, manually coded rules, and inductive logic programming.

2 citations

01 May 2001
TL;DR: In this paper, well-known state-of-the-art data-driven algorithms are applied to part-of-speech tagging and shallow parsing of Swedish texts.
Abstract: In this paper, well-known state-of-the-art data-driven algorithms are applied to part-of-speech tagging and shallow parsing of Swedish texts.
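As a point of reference for what "data-driven" means here, the simplest such tagger is the most-frequent-tag baseline learned from a tagged corpus. The tiny Swedish corpus below is invented for illustration; the paper's algorithms are more sophisticated than this sketch.

```python
# Sketch of the simplest data-driven POS tagger, often used as a
# baseline: assign each word its most frequent tag in the training
# corpus, and a default tag to unknown words. Corpus is illustrative.

from collections import Counter, defaultdict

def train_unigram_tagger(tagged_corpus):
    """tagged_corpus: list of (word, tag) pairs -> word-to-tag model."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(model, words, default="NN"):
    """Tag each word with its learned tag, falling back to `default`."""
    return [(w, model.get(w, default)) for w in words]

corpus = [("hunden", "NN"), ("springer", "VB"),
          ("hunden", "NN"), ("snabbt", "AB")]
model = train_unigram_tagger(corpus)
print(tag(model, ["hunden", "springer"]))
# → [('hunden', 'NN'), ('springer', 'VB')]
```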

2 citations

01 Jan 2002
TL;DR: An efficient FPGA-based coprocessor for natural language syntactic analysis that can deal with inputs in the form of word lattices is proposed, and an interface is offered between the hardware tool and a potential natural language software application running on the desktop computer.
Abstract: This thesis is at the crossroad between Natural Language Processing (NLP) and digital circuit design. It aims at delivering a custom hardware coprocessor for accelerating natural language parsing. The coprocessor has to parse real-life natural language and is targeted to be useful in several NLP applications that are time constrained or need to process large amounts of data. More precisely, the three goals of this thesis are: (1) to propose an efficient FPGA-based coprocessor for natural language syntactic analysis that can deal with inputs in the form of word lattices, (2) to implement the coprocessor in a hardware tool ready for integration within an ordinary desktop computer and (3) to offer an interface (i.e. software library) between the hardware tool and a potential natural language software application, running on the desktop computer. The Field Programmable Gate Array (FPGA) technology has been chosen as the core of the coprocessor implementation due to its ability to efficiently exploit all levels of parallelism available in the implemented algorithms in a cost-effective solution. In addition, the FPGA technology makes it possible to efficiently design and test such a hardware coprocessor. A final reason is that the future general-purpose processors are expected to contain reconfigurable resources. In such a context, an IP core implementing an efficient context-free parser ready to be configured within the reconfigurable resources of the general-purpose processor would be a support for any application relying on context-free parsing and running on that general-purpose processor. The context-free grammar parsing algorithms that have been implemented are the standard CYK algorithm and an enhanced version of the CYK algorithm developed at the EPFL Artificial Intelligence Laboratory. 
These algorithms were selected (1) due to their intrinsic properties of regular data flow and data processing that make them well suited for a hardware implementation, (2) for their property of producing partial parse trees which makes them adapted for further shallow parsing and (3) for being able to parse word lattices.

2 citations

Journal Article
TL;DR: In this article, a linguistics-based system for word-to-word alignment is presented. Most existing systems are purely statistical and rest on hypotheses about the structure of texts that are often violated.
Abstract: This paper describes an algorithm which represents one of the few linguistics-based systems for word-to-word alignment. Most systems are purely statistical and assume some hypotheses about the structure of texts which are often violated. Our approach combines statistical methods with positional and linguistic ones so that it can be successfully applied to any kind of bitext as far as the internal structure of the texts is concerned. The linguistic part uses shallow parsing by regular expressions and relies on very general linguistic principles. However, a component of language-specific methods can be developed to improve results. Our word-alignment system was evaluated on a Romanian-English bitext.
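One common use of regex-based shallow parsing in alignment is to pick out language-independent "anchor" tokens (numbers, punctuation, capitalized names) that tend to align one-to-one across a bitext. The patterns below are a hedged sketch of that idea, not the paper's actual rules.

```python
# Sketch of regular-expression shallow parsing for word alignment:
# tag tokens that are likely one-to-one anchors across a bitext.
# Patterns are illustrative, not the paper's rule set.

import re

ANCHOR_PATTERNS = [
    ("NUM",   re.compile(r"^\d+([.,]\d+)*$")),  # numbers and dates
    ("PUNCT", re.compile(r"^[^\w\s]+$")),       # punctuation runs
    ("NAME",  re.compile(r"^[A-Z][a-z]+$")),    # capitalized word
]

def anchors(tokens):
    """Return (index, label, token) for every anchor-like token."""
    out = []
    for i, tok in enumerate(tokens):
        for label, pat in ANCHOR_PATTERNS:
            if pat.match(tok):
                out.append((i, label, tok))
                break
    return out

src = ["Maria", "bought", "3", "books", "."]
print(anchors(src))
# → [(0, 'NAME', 'Maria'), (2, 'NUM', '3'), (4, 'PUNCT', '.')]
```

Matched anchors on the source and target sides give fixed points; the positional and statistical components then only have to align the words between consecutive anchors.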

2 citations


Network Information
Related Topics (5)
Machine translation: 22.1K papers, 574.4K citations (81% related)
Natural language: 31.1K papers, 806.8K citations (79% related)
Language model: 17.5K papers, 545K citations (79% related)
Parsing: 21.5K papers, 545.4K citations (79% related)
Query language: 17.2K papers, 496.2K citations (74% related)
Performance Metrics
No. of papers in the topic in previous years:

Year  Papers
2021  7
2020  12
2019  6
2018  5
2017  11
2016  11