Journal Article

Text and label reading using Raspberry Pi and Open CV

TL;DR: The aim is to obtain images from the real world, recognize them with little to no latency, and convert them into audio files that can be played; with the help of this model, people with blindness can be more independent and confident.
Abstract: In this digital age, with the help of our cognitive abilities, we can see, hear, and sense various new technologies and communicate with them, but this is not the case for visually impaired people. To communicate with the world, they need assistive technology and adaptive devices. The key idea of the project is to help sight-challenged people. The proposal involves capturing an image with a camera, recognizing it, and extracting the content of the image using various algorithmic techniques. With the help of this model, people with blindness can be more independent and confident. The major step in making this project a reality is to obtain images from the real world, recognize them with little to no latency, and convert them into audio files that can be played. This can be carried out with the help of OpenCV and the Raspberry Pi; the main advantages of the latter are its portability and compatibility, which can be achieved with a battery backup that can also serve future endeavors. The size of the Raspberry Pi SoC permits the user to carry it anywhere and use it.
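
As a rough illustration of the capture, recognize, and speak pipeline the abstract describes, here is a minimal sketch in Python. The abstract names only OpenCV and the Raspberry Pi, so pytesseract as the OCR engine, espeak as the audio back end, and the preprocessing choices are assumptions, not the paper's confirmed design:

```python
# Minimal sketch of the capture -> OCR -> speech pipeline, assuming
# pytesseract for OCR and espeak for audio output (the abstract names
# only OpenCV and the Raspberry Pi, not specific OCR/TTS engines).
import subprocess

import cv2
import pytesseract

def read_label_aloud(camera_index: int = 0) -> str:
    cap = cv2.VideoCapture(camera_index)  # USB or Raspberry Pi camera
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("camera capture failed")
    # Grayscale + Otsu thresholding: a common preprocessing step for OCR.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    text = pytesseract.image_to_string(binary).strip()
    if text:
        subprocess.run(["espeak", text])  # speak the recognized text aloud
    return text
```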


Citations
Journal Article
TL;DR: The design and implementation of a system is described that converts text information present in an image into speech and converts speech given by the user into text.
Abstract: Nowadays, real-time hardware implementations of text-to-speech and speech-to-text conversion systems play a crucial role in several real-time applications, such as reading aids for blind people, talking aids for vocally handicapped people, and robotics. This paper describes the design and implementation of a system that converts text information present in an image into speech and converts speech given by the user into text. In this context, the Raspberry Pi was chosen as the hardware platform for implementing the proposed method. For the implementation, a Logitech C170 camera module and a Bluetooth HC-05 module were interfaced to the Raspberry Pi. The concepts used in this project are tesseract OCR (Optical Character Recognition), the espeak TTS (Text to Speech) engine, and the AMR (Android Meets Robots) voice-to-text application software. The code for the proposed system is written in the Python programming language. The proposed system, implemented on the Raspberry Pi, can be used for many real-time applications.
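
The two conversion paths this abstract describes can be sketched as follows. Assumptions beyond the abstract: espeak is installed, pyserial is available, and the HC-05 module is bound to a serial device such as /dev/rfcomm0; the device path, baud rate, and helper names are illustrative:

```python
# Sketch of the two conversion paths: text-to-speech via espeak, and
# speech-to-text received over the Bluetooth HC-05 serial link from the
# AMR Android app. Device path and baud rate are illustrative assumptions.
import subprocess

import serial  # pyserial

def speak(text: str) -> None:
    """Text-to-speech via the espeak engine named in the abstract."""
    subprocess.run(["espeak", text])

def receive_transcribed_speech(port: str = "/dev/rfcomm0") -> str:
    """Speech-to-text path: the AMR Android app transcribes speech and
    sends the resulting text over the Bluetooth HC-05 serial link."""
    with serial.Serial(port, 9600, timeout=10) as link:
        return link.readline().decode("utf-8", errors="replace").strip()
```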

6 citations

References
Journal ArticleDOI
TL;DR: The combination of CAMSHIFT and SVMs produces both robust and efficient text detection, as time-consuming texture analyses for less relevant pixels are restricted, leaving only a small part of the input image to be texture-analyzed.
Abstract: The current paper presents a novel texture-based method for detecting texts in images. A support vector machine (SVM) is used to analyze the textural properties of texts. No external texture feature extraction module is used, but rather the intensities of the raw pixels that make up the textural pattern are fed directly to the SVM, which works well even in high-dimensional spaces. Next, text regions are identified by applying a continuously adaptive mean shift algorithm (CAMSHIFT) to the results of the texture analysis. The combination of CAMSHIFT and SVMs produces both robust and efficient text detection, as time-consuming texture analyses for less relevant pixels are restricted, leaving only a small part of the input image to be texture-analyzed.
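
A minimal sketch of the idea described above: raw pixel intensities of fixed-size windows are fed directly to an SVM, and CAMSHIFT (cv2.CamShift in OpenCV) refines the resulting text response map. The window size, training interface, and map construction are illustrative assumptions, not the paper's exact procedure:

```python
# Sketch: classify raw pixel windows with an SVM, then refine the
# detected region with CAMSHIFT. Window size and training interface
# are illustrative assumptions.
import cv2
import numpy as np
from sklearn.svm import SVC

WIN = 16  # side length of the square pixel window fed to the SVM

def train_text_classifier(patches: np.ndarray, labels: np.ndarray) -> SVC:
    # Raw intensities only, no separate texture-feature extraction,
    # matching the paper's key design choice.
    clf = SVC(kernel="rbf")
    clf.fit(patches.reshape(len(patches), -1) / 255.0, labels)
    return clf

def text_response_map(gray: np.ndarray, clf: SVC) -> np.ndarray:
    h, w = gray.shape
    response = np.zeros((h, w), np.uint8)
    for y in range(0, h - WIN + 1, WIN):
        for x in range(0, w - WIN + 1, WIN):
            patch = gray[y:y + WIN, x:x + WIN].reshape(1, -1) / 255.0
            if clf.predict(patch)[0] == 1:  # window classified as text
                response[y:y + WIN, x:x + WIN] = 255
    return response

def locate_text(response: np.ndarray, init_window) -> tuple:
    # CAMSHIFT adapts its search window to the dense text response,
    # so only a small part of the image needs full texture analysis.
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    _, window = cv2.CamShift(response, init_window, criteria)
    return window
```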

473 citations

01 Jan 1994
TL;DR: Simple functions to compute the discrete cosine transform and to compress images are developed, illustrating the use of Mathematica in image processing and providing the reader with the basic tools for further exploration of the subject.
Abstract: The discrete cosine transform (DCT) is a technique for converting a signal into elementary frequency components. It is widely used in image compression. Here we develop some simple functions to compute the DCT and to compress images. These functions illustrate the power of Mathematica in the prototyping of image processing algorithms. The rapid growth of digital imaging applications, including desktop publishing, multimedia, teleconferencing, and high-definition television (HDTV) has increased the need for effective and standardized image compression techniques. Among the emerging standards are JPEG, for compression of still images [Wallace 1991]; MPEG, for compression of motion video [Puri 1992]; and CCITT H.261 (also known as Px64), for compression of video telephony and teleconferencing. All three of these standards employ a basic technique known as the discrete cosine transform (DCT). Developed by Ahmed, Natarajan, and Rao [1974], the DCT is a close relative of the discrete Fourier transform (DFT). Its application to image compression was pioneered by Chen and Pratt [1984]. In this article, I will develop some simple functions to compute the DCT and show how it is used for image compression. We have used these functions in our laboratory to explore methods of optimizing image compression for the human viewer, using information about the human visual system [Watson 1993]. The goal of this paper is to illustrate the use of Mathematica in image processing and to provide the reader with the basic tools for further exploration of this subject.
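
A compact sketch of the block-DCT compression idea, written in Python rather than the article's Mathematica: transform an 8x8 block, zero the high-frequency coefficients, and invert. The keep-count k and the triangular low-frequency mask are illustrative assumptions:

```python
# Sketch of JPEG-style block compression with the DCT. The keep-count k
# and the triangular low-frequency mask are illustrative assumptions.
import cv2
import numpy as np

def compress_block(block: np.ndarray, k: int = 10) -> np.ndarray:
    """Transform an 8x8 block, keep only low-frequency coefficients,
    and reconstruct; larger k means higher quality, less compression."""
    coeffs = cv2.dct(block.astype(np.float32) - 128.0)
    mask = np.add.outer(np.arange(8), np.arange(8)) < k  # low-freq corner
    return cv2.idct(coeffs * mask) + 128.0
```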

364 citations

Journal ArticleDOI
TL;DR: The result of the proposed method is compared against the standard Mushaf al Madinah benchmark to find matches with the text of the Holy Quran; the obtained accuracy was superior to that of the tested K-nearest neighbor (kNN) algorithm and to published results in the literature.
Abstract: The detection, recognition, and conversion of the characters in an image into text is called optical character recognition (OCR). A distinctive type of OCR, namely Arabic OCR, is used to process Arabic characters. OCR is increasingly used in many applications where a process should be performed automatically, without human involvement. Quranic text contains two elements, namely, diacritics and characters; processing these elements can cause an OCR system to malfunction and reduce its accuracy. In this paper, a new method is proposed to check the similarity and originality of Quranic content. The method combines Quranic diacritic and character recognition techniques: diacritics are detected using a region-based algorithm, while characters are recognized using the projection method, and in both cases an optimization technique is applied to increase the recognition ratio. The result of the proposed method is compared with the standard Mushaf al Madinah benchmark to find matches with the text of the Holy Quran. The obtained accuracy was superior to that of the tested K-nearest neighbor (kNN) algorithm and to published results in the literature: the improved kNN algorithm reached accuracies of 96.4286% for diacritics and 92.3077% for characters, outperforming the standard kNN algorithm.
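
The region-based diacritic detection step can be sketched in Python under the common assumption that diacritics form much smaller connected components than character bodies; the area threshold below is illustrative, not the paper's value:

```python
# Sketch of region-based diacritic detection: connected components are
# split by size, since diacritics are typically much smaller than
# character bodies. The area threshold is an illustrative assumption.
import cv2
import numpy as np

def split_diacritics(binary: np.ndarray, max_diacritic_area: int = 40):
    """binary: white-on-black text image. Returns two lists of bounding
    boxes (x, y, w, h): diacritics and character bodies."""
    n, _, stats, _ = cv2.connectedComponentsWithStats(binary)
    diacritics, characters = [], []
    for i in range(1, n):  # label 0 is the background
        x, y, w, h, area = stats[i]
        if area <= max_diacritic_area:
            diacritics.append((x, y, w, h))
        else:
            characters.append((x, y, w, h))
    return diacritics, characters
```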

18 citations


"Text and label reading using Raspbe..." refers background in this paper

  • ...A greater piece of advancement worked for individuals with sight challenged and deprived of vision rely upon the two quintessential structure modules known as the Optical Character Recognition (OCR) programming and Text-to-Speech Engine (TTS) [1]....


12 Jan 2015
TL;DR: The proposed method is a camera-based assistive text reader that helps blind persons read the text on labels, printed notes, and products by extracting the text from an image and converting it to speech.
Abstract: Human communication today is mainly via speech and text. To access information in a text, a person needs vision; those who are deprived of vision can instead gather information through hearing. The proposed method is a camera-based assistive text reader that helps blind persons read the text present on labels, printed notes, and products [1]. The proposed project involves extracting text from an image and passing it to a text-to-speech converter, a process which enables blind persons to read the text. This is the first step in developing a prototype that lets blind people recognize products in the real world, where the text on a product is extracted and converted into speech. This is carried out using a Raspberry Pi, where portability, the main aim, is achieved by providing a battery backup, and the design can serve as a basis for future technology. The portability allows the user to carry the device anywhere and use it at any time.
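
Complementing the live playback sketch earlier, the audio-file step can be illustrated with espeak's -w option, which writes the spoken text to a WAV file for later playback; the output path is an illustrative assumption:

```python
# Sketch of the text-to-audio-file step: espeak's -w flag writes the
# spoken text to a WAV file that can be replayed later. The output
# path is an illustrative assumption.
import subprocess

def text_to_wav(text: str, path: str = "/tmp/label.wav") -> str:
    subprocess.run(["espeak", "-w", path, text], check=True)
    return path
```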

17 citations

Proceedings ArticleDOI
20 Aug 2006
TL;DR: The results of simulation experiments showed that recognition rates over 99% were obtained from the extracted cross ratio, even under heavy projective distortions.
Abstract: In order to realize accurate camera-based character recognition, machine-readable class information is embedded into each character image. Specifically, each character image is printed with a pattern which comprises five stripes and the cross ratio derived from the pattern represents class information. Since the cross ratio is a projective invariant, the class information is extracted correctly regardless of camera angle. The results of simulation experiments showed that recognition rates over 99% were obtained by the extracted cross ratio under heavy projective distortions.
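
The projective invariance the abstract relies on can be made concrete: the cross ratio of four collinear points is preserved under any perspective transformation, so stripe positions measured at an arbitrary camera angle decode the same class information. A small worked example with illustrative coordinates:

```python
# Worked example of the projective invariant behind the method: the
# cross ratio of four collinear points is unchanged by perspective.
# The coordinates and the projective map below are illustrative.
def cross_ratio(a: float, b: float, c: float, d: float) -> float:
    """Cross ratio (a, b; c, d) of four collinear point coordinates."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))

# Applying a projective map x -> (2x + 1) / (x + 3) to each coordinate
# leaves the cross ratio unchanged.
pts = [0.0, 1.0, 2.0, 4.0]
mapped = [(2 * x + 1) / (x + 3) for x in pts]
print(cross_ratio(*pts))     # 1.5
print(cross_ratio(*mapped))  # 1.5 again: invariant under the map
```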

13 citations