Topic

Noisy text analytics

About: Noisy text analytics is a research topic. Over its lifetime, 700 publications have been published within this topic, receiving 28,759 citations.

Papers

Patent

Proofreading with text to speech feedback

Hsiao-Wuen Hon¹, Dong Li¹, Xuedong Huang¹, Yun-Chen Ju¹, Xianghui Sean Zhang¹

Microsoft¹

17 Aug 1998-Journal of the Acoustical Society of America

TL;DR: A computer-implemented system and method for proofreading text receives text from a user into a text editing module and converts at least a portion of it to an audio signal when an indicator is detected; the indicator defines a boundary in the text, either by being embodied in the text or by comprising delays in receiving text.

Abstract: A computer implemented system and method of proofreading text in a computer system includes receiving text from a user into a text editing module. At least a portion of the text is converted to an audio signal upon the detection of an indicator, the indicator defining a boundary in the text by either being embodied therein or comprising delays in receiving text. The audio signal is played through a speaker to the user to provide feedback.

224 citations

Posted Content

Scene Text Detection via Holistic, Multi-Channel Prediction

Cong Yao, Xiang Bai, Nong Sang, Xinyu Zhou, Shuchang Zhou, Zhimin Cao

29 Jun 2016-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work proposes to localize text in a holistic manner, by casting scene text detection as a semantic segmentation problem, and demonstrates that the proposed algorithm substantially outperforms previous state-of-the-art approaches.

Abstract: Recently, scene text detection has become an active research topic in computer vision and document analysis, because of its great importance and significant challenge. However, the vast majority of existing methods detect text within local regions, typically by extracting character-, word- or line-level candidates followed by candidate aggregation and false positive elimination, which potentially excludes the effect of wide-scope and long-range contextual cues in the scene. To take full advantage of the rich information available in the whole natural image, we propose to localize text in a holistic manner, by casting scene text detection as a semantic segmentation problem. The proposed algorithm runs directly on full images and produces global, pixel-wise prediction maps, from which detections are subsequently formed. To make better use of the properties of text, three types of information, regarding the text region, individual characters and their relationship, are estimated with a single Fully Convolutional Network (FCN) model. With such predictions of text properties, the proposed algorithm can simultaneously handle horizontal, multi-oriented and curved text in real-world natural images. The experiments on standard benchmarks, including ICDAR 2013, ICDAR 2015 and MSRA-TD500, demonstrate that the proposed algorithm substantially outperforms previous state-of-the-art approaches. Moreover, we report the first baseline result on the recently released, large-scale dataset COCO-Text.

220 citations

Proceedings Article

Automatic text decomposition using text segments and text themes

Gerard Salton¹, Amit Singhal¹, Chris Buckley¹, Mandar Mitra¹

Cornell University¹

01 Mar 1996

TL;DR: The interaction between text segments and text themes is used to characterize text structure, and to formulate specifications for information retrieval, text traversal, and text summarization.

Abstract: With the widespread use of full-text information retrieval, passage-retrieval techniques are becoming increasingly popular. Larger texts can then be replaced by important text excerpts, thereby simplifying the retrieval task and improving retrieval effectiveness. Passage-level evidence about the use of words in local contexts is also useful for resolving language ambiguities and improving retrieval output. Two main text decomposition strategies are introduced in this study, including a chronological decomposition into text segments, and semantic decomposition into text themes. The interaction between text segments and text themes is then used to characterize text structure, and to formulate specifications for information retrieval, text traversal, and text summarization.

213 citations

Patent

Leveraging markup language data for semantically labeling text strings and data and for providing actions based on semantically labeled text strings and data

Brian M. Jones¹, Jeff Reynar¹, Ziyi Wang¹

Microsoft¹

27 Jun 2003

TL;DR: Selected text or data, together with any associated markup language data, is passed to an action dynamically linked library (DLL) to obtain actions associated with the markup language elements applied to it; a recognizer DLL can use the same markup language data to help recognize and label certain data types.

Abstract: Markup language data applied to text or data is leveraged for providing helpful actions on certain types of text or data such as names, addresses, etc. Selected portions of text or data entered into a document and any associated markup language data are passed to an action dynamically linked library (DLL) for obtaining actions associated with markup language elements applied to the text or data. The text or data may be passed to a recognizer DLL for recognition of certain data types. The recognizer DLL utilizes markup language data associated with the text or data to assist recognition and labeling of text or data. After all applicable text and/or data is recognized and labeled, an action DLL is called for actions associated with the labeled text or data.

208 citations

Patent

Robust Information Extraction from Utterances

Jun Huang, Yookyung Kim, Youssef Billawala, Farzad Ehsani, Demitrios Master

27 Dec 2007

TL;DR: A predictive feature extraction method combines linguistic and statistical information to represent information embedded in a noisy source language, mitigating the performance loss that traditional speech recognition systems suffer with larger domain size, scarce training data, and noisy environmental conditions.

Abstract: The performance of traditional speech recognition systems (as applied to information extraction or translation) decreases significantly with larger domain size and scarce training data, as well as under noisy environmental conditions. This invention mitigates these problems through the introduction of a novel predictive feature extraction method which combines linguistic and statistical information for representation of information embedded in a noisy source language. The predictive features are combined with text classifiers to map the noisy text to one of the semantically or functionally similar groups. The features used by the classifier can be syntactic, semantic, and statistical.

199 citations

Performance Metrics

Papers: 715

Citations: 30,953

No. of papers in the topic in previous years
Year	Papers
2023	6
2022	8
2020	1
2019	1
2018	4
2017	23
