Patent

System and method for identifying and labeling fields of text associated with scanned business documents

TLDR
A network scanner scans a platen area holding a business document to create a bitmap; a network server then segments the bitmap, converts it to text, and parses the text into labeled fields using terminal symbols and a probable parsing of the resulting symbol string.
Abstract
A system for electronically distilling information from a business document uses a network scanner to electronically scan a platen area, having a business document thereon, to create a bitmap. A network server carries out a segmentation process to segment the scan-generated bitmap into a bitmap object, the bitmap object corresponding to the scanned business document; a bitmap-to-text conversion process to convert the bitmap object into a block of text; a semantic recognition process to generate a structured representation of semantic entities corresponding to the scanned business document; and a document generation process to convert the structured representation into a structured text file.

The semantic recognition process includes generating, for each line of text having a keyword therein, a terminal symbol corresponding to the keyword; generating, for each line of text having no keyword and no numeric characters, an alphabetic terminal symbol; generating, for each line of text having no keyword but at least one numeric character, an alphanumeric terminal symbol; generating a string of terminal symbols from the generated terminal symbols; determining a probable parsing of the generated string of terminal symbols; labeling each text line, according to a determined function, with non-terminal symbols; and parsing the business document information text into fields based upon the non-terminal symbol of each text line and the determined probable parsing of the generated string of terminal symbols.
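The three line-classification rules in the abstract can be illustrated with a minimal Python sketch. It covers only the generation of the terminal-symbol string, not the probable parsing or the non-terminal labeling. The keyword table, the symbol names (`t_phone`, `a_line`, `an_line`), and the function names are all hypothetical, since the abstract does not specify them.

```python
import re
from typing import List

# Hypothetical keyword table; the abstract does not list the actual keywords.
KEYWORD_TERMINALS = {
    "phone": "t_phone",
    "tel": "t_phone",
    "fax": "t_fax",
    "email": "t_email",
    "e-mail": "t_email",
}

# Hypothetical names for the two generic terminal symbols.
ALPHABETIC = "a_line"      # no keyword and no numeric characters
ALPHANUMERIC = "an_line"   # no keyword, at least one numeric character


def terminal_for_line(line: str) -> str:
    """Map one line of recognized text to a terminal symbol."""
    # Compare whole lowercase words, so "hotel" does not match "tel".
    words = re.findall(r"[a-z][a-z-]*", line.lower())
    for word in words:
        if word in KEYWORD_TERMINALS:
            return KEYWORD_TERMINALS[word]
    if any(ch.isdigit() for ch in line):
        return ALPHANUMERIC
    return ALPHABETIC


def terminal_string(text_block: str) -> List[str]:
    """Build the string of terminal symbols for a block of text, skipping blank lines."""
    return [terminal_for_line(ln) for ln in text_block.splitlines() if ln.strip()]


if __name__ == "__main__":
    card = "Jane Doe\nSenior Engineer\n123 Main Street\nPhone: 555-0100\nFax: 555-0101"
    print(terminal_string(card))
```

In this sketch the keyword check runs first, so a line such as `Phone: 555-0100` yields the keyword symbol even though it contains digits, matching the order of the rules in the abstract. The resulting list is what a later step would parse to find the most probable field structure.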
