Topic

Shallow parsing

About: Shallow parsing is a research topic. Over its lifetime, 397 publications have been published on this topic, receiving 10,211 citations.


Papers
Book Chapter
01 Jan 2013
TL;DR: This paper presents an integrated feature extraction framework for Natural Language Processing that removes wasteful redundancy and helps in rapid prototyping.
Abstract: Feature extraction from text corpora is an important step in Natural Language Processing (NLP), especially for Machine Learning (ML) techniques. Various NLP tasks share many common steps, e.g. the low-level act of reading a corpus and obtaining text windows from it. Some high-level processing steps might also be shared, e.g. testing for morpho-syntactic constraints between words. An integrated feature extraction framework removes wasteful redundancy and helps in rapid prototyping.

20 citations
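
The "obtaining text windows" step named in the abstract above can be illustrated with a minimal sketch. This is not the framework from the paper; the function name `text_windows`, the padding token, and the window size are all assumptions chosen for illustration.

```python
from typing import Iterator, List, Tuple


def text_windows(
    tokens: List[str], size: int = 2, pad: str = "<PAD>"
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (focus token, window) pairs with `size` tokens of context per side.

    Hypothetical helper: the padding token and window layout are assumptions.
    """
    # Pad both ends so that edge tokens still get a full-width window.
    padded = [pad] * size + tokens + [pad] * size
    for i, token in enumerate(tokens):
        # In the padded list the focus token sits at index i + size,
        # so the window spans indices i .. i + 2 * size inclusive.
        yield token, padded[i : i + 2 * size + 1]


windows = list(text_windows("the cat sat on the mat".split(), size=1))
```

Each window can then be handed to whatever feature functions a task needs, which is the kind of shared low-level step the abstract says a common framework avoids reimplementing.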

Proceedings Article
01 Jan 2002
TL;DR: Among the novelties added to QUANTUM this year is a web module that finds exact answers using high-precision reformulation of the question to anticipate the expected context of.
Abstract: This year, we participated in the Question Answering task for the second time with the QUANTUM system. We entered 2 runs for the main task (one using the web, the other without) and 1 run for the list task (without the web). We essentially built on last year’s experience to enhance the system. The architecture of QUANTUM is mainly the same as last year: it uses patterns that rely on shallow parsing techniques and regular expressions to analyze the question and then select the most appropriate extraction function. This extraction function is then applied to one-paragraph long passages retrieved by Okapi to extract and score candidate answers. Among the novelties we added to QUANTUM this year is a web module that finds exact answers using high-precision reformulation of the question to anticipate the expected context of

20 citations
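
The idea in the abstract above of matching a question against patterns and then selecting an extraction function can be sketched roughly as follows. The patterns, the two extractor functions, and the name `select_extractor` are hypothetical and far simpler than anything QUANTUM is likely to use; the sketch covers only the regular-expression part, not shallow parsing.

```python
import re
from typing import Callable, List, Optional, Pattern, Tuple

Extractor = Callable[[str], List[str]]


def extract_person(passage: str) -> List[str]:
    # Toy heuristic (assumption): a run of two or more capitalized words.
    return re.findall(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b", passage)


def extract_year(passage: str) -> List[str]:
    # Toy heuristic (assumption): a four-digit year between 1000 and 2099.
    return re.findall(r"\b(?:1[0-9]{3}|20[0-9]{2})\b", passage)


# Hypothetical question patterns, each mapped to an extraction function.
QUESTION_PATTERNS: List[Tuple[Pattern[str], Extractor]] = [
    (re.compile(r"^who\b", re.IGNORECASE), extract_person),
    (re.compile(r"^(?:when|what year|in what year)\b", re.IGNORECASE), extract_year),
]


def select_extractor(question: str) -> Optional[Extractor]:
    """Return the extractor of the first pattern that matches, else None."""
    for pattern, extractor in QUESTION_PATTERNS:
        if pattern.search(question):
            return extractor
    return None
```

The selected function would then be applied to each retrieved passage, for example `select_extractor("Who wrote Hamlet?")("It was written by William Shakespeare in London.")`.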

Book Chapter
11 Sep 2001
TL;DR: This work presents a two-level stochastic model approach to the construction of the natural language understanding component of a dialog system in the domain of database queries, which answers queries about a railway timetable in Spanish.
Abstract: Over the last few years, stochastic models have been widely used in natural language understanding modeling. Almost all of these works are based on the definition of segments of words as basic semantic units for the stochastic semantic models. In this work, we present a two-level stochastic model approach to the construction of the natural language understanding component of a dialog system in the domain of database queries. This approach treats the problem in a way similar to the stochastic approach for the detection of syntactic structures (Shallow Parsing or Chunking) in natural language sentences; however, in this case, stochastic semantic language models are based on the detection of some semantic units from the user turns of the dialog. We give the results of the application of this approach to the construction of the understanding component of a dialog system, which answers queries about a railway timetable in Spanish.

19 citations

Proceedings Article
01 Jan 2001
TL;DR: This work makes extensive use of the Alembic named-entity tagger and the WordNet semantic network to extract candidate answers from one-paragraph-long passage retrieval and deals with the possibility of no-answer questions by looking for a significant score drop between the extracted candidate answers.
Abstract: We participated in the TREC-X QA main task and list task with a new system named QUANTUM, which analyzes questions with shallow parsing techniques and regular expressions. Instead of using a question classification based on entity types, we classify the questions according to generic mechanisms (which we call extraction functions) for the extraction of candidate answers. We take advantage of the Okapi information retrieval system for one-paragraph-long passage retrieval. We make extensive use of the Alembic named-entity tagger and the WordNet semantic network to extract candidate answers from those passages. We deal with the possibility of no-answer questions (NIL) by looking for a significant score drop between the extracted candidate answers.

18 citations
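
The abstract above does not say how a "significant score drop" is measured or what happens once one is found. One possible reading, sketched below under stated assumptions, is to rank the candidates and place NIL at the first position where a score falls below a fixed fraction of the previous one. The function name `insert_nil_at_drop`, the `max_ratio` threshold, and the truncation after NIL are all hypothetical.

```python
from typing import List, Tuple

NIL = "NIL"


def insert_nil_at_drop(
    candidates: List[Tuple[str, float]], max_ratio: float = 0.5
) -> List[str]:
    """Rank candidates by score and put NIL where the score first drops sharply.

    Assumptions: scores are non-negative, and a drop is "significant" when a
    score is below `max_ratio` times the score of the candidate ranked above it.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if not ranked:
        # No candidate at all: the only possible answer is NIL.
        return [NIL]
    answers = [ranked[0][0]]
    # Compare each candidate with the one ranked immediately above it.
    for (_, prev_score), (text, score) in zip(ranked, ranked[1:]):
        if score < max_ratio * prev_score:
            answers.append(NIL)  # NIL replaces everything after the drop
            break
        answers.append(text)
    return answers
```

With candidates scored 0.9, 0.8, and 0.2, this sketch keeps the first two and puts NIL in third place, since 0.2 is less than half of 0.8.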


Network Information
Related Topics (5)
Machine translation
22.1K papers, 574.4K citations
81% related
Natural language
31.1K papers, 806.8K citations
79% related
Language model
17.5K papers, 545K citations
79% related
Parsing
21.5K papers, 545.4K citations
79% related
Query language
17.2K papers, 496.2K citations
74% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2021    7
2020    12
2019    6
2018    5
2017    11
2016    11