
Showing papers in "International Journal of Image, Graphics and Signal Processing in 2016"


Journal ArticleDOI
TL;DR: This paper extracts one of the most dominant and most heavily researched speech features, the Mel coefficients together with their first- and second-order derivatives, and compares performance based on the first Mel coefficient.
Abstract: Speech recognition technology can be embedded in various real-time applications in order to increase human-computer interaction. From robotics to health care and aerospace, from interactive voice response systems to mobile telephony and telematics, speech recognition technology has enhanced human-machine interaction. Gender recognition is an important component for applications embedding speech recognition, as it reduces the computational complexity of further processing in these applications. The paper involves the extraction of one of the most dominant and most heavily researched speech features, the Mel coefficients and their first- and second-order derivatives. We extracted 13 values for each of these from a data-set of 46 speech samples containing the Hindi vowels (आ, इ, ई, उ, ऊ, ऋ, ए, ऎ, ऒ, ऑ) and trained a combined model of SVM and neural network classifiers to determine gender using stacking. The results showed an accuracy of 93.48% after taking the first Mel coefficient into consideration. The purpose of this study was to extract the correct features and to compare performance based on the first Mel coefficient. Index Terms—Gender recognition, Hindi, mel-frequency, delta, delta-delta, neural network.
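A minimal sketch of the feature pipeline described above, assuming librosa for MFCC/delta extraction and scikit-learn's StackingClassifier for the SVM-plus-neural-network stack; the per-utterance averaging, the meta-learner and the exact stacking configuration are assumptions, not the authors' setup:

```python
import librosa
import numpy as np
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

def mfcc_features(path):
    """13 MFCCs plus delta and delta-delta, averaged over frames -> 39-dim vector."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    d1 = librosa.feature.delta(mfcc)            # first-order derivatives
    d2 = librosa.feature.delta(mfcc, order=2)   # second-order derivatives
    return np.concatenate([mfcc, d1, d2]).mean(axis=1)

# X: one 39-dim row per utterance; y: 0 = male, 1 = female (labels assumed)
# X = np.vstack([mfcc_features(p) for p in paths])
model = StackingClassifier(
    estimators=[("svm", SVC(probability=True)),
                ("nn", MLPClassifier(max_iter=1000))],
    final_estimator=LogisticRegression())
# model.fit(X, y); model.predict(X_test)
```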

36 citations


Journal ArticleDOI
TL;DR: A survey of applications, color spaces, methods and their performance, compensation techniques, and benchmarking datasets for human skin detection, covering related research from more than the last two decades.
Abstract: Human skin detection is one of the most widely used algorithms in the vision literature and has been exploited, both directly and indirectly, in multifarious applications. The topic has received a great deal of attention, specifically in face analysis and human detection/tracking/recognition systems. There are several challenges, mainly emanating from nonlinear illumination, camera characteristics, imaging conditions, and intra-personal features. During the last twenty years, researchers have struggled to overcome these challenges, resulting in hundreds of published papers. The aim of this paper is to survey applications, color spaces, methods and their performance, compensation techniques, and benchmarking datasets for human skin detection, covering related research from more than the last two decades. In this paper, different difficulties and challenges involved in the task of finding skin pixels are discussed. Skin segmentation algorithms are mainly based on color information; an in-depth discussion of the effectiveness of disparate color spaces is provided. In addition, standard evaluation metrics and datasets make the comparison of methods both possible and reasonable; these databases and metrics are investigated and suggested for future studies. Reviewing most existing techniques will not only ease future studies but also help in developing better methods. These methods are classified and illustrated in detail. A variety of applications in which skin detection has been either fully or partially used is also provided.

31 citations


Journal ArticleDOI
TL;DR: A non-format-compliant JPEG encryption algorithm based on a modification of the RSA encryption system is proposed, along with a variant that is faster than the original algorithm but expands the bit stream slightly.
Abstract: A non-format-compliant JPEG encryption algorithm is proposed which is based on a modification of the RSA encryption system. Firstly, an alternate form of entropy coding is described, which is more suited to the proposed algorithm than the zigzag coding scheme used in JPEG. The algorithm for the encryption and decryption process is then elaborated. A variant, also based on the RSA algorithm, is described as well; it is faster than the original algorithm but expands the bit stream slightly. Both algorithms are shown to be scalable and resistant to ‘sketch’ attacks. Finally, the encrypted file sizes for both algorithms are compared with the unencrypted JPEG-compressed image file size. The encrypted image is found to be moderately expanded, which is justified by the high security and, most importantly, the scalability of the algorithm.
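For orientation, RSA reduces to modular exponentiation over the data to be protected. The sketch below applies plain textbook RSA to a list of quantized DCT coefficients with a deliberately tiny, insecure demonstration key; the paper's modified RSA scheme and its custom entropy coder are not reproduced here:

```python
# Toy textbook RSA applied to a list of quantized DCT coefficients.
p, q = 61, 53
n, e, d = p * q, 17, 2753            # e*d = 1 mod lcm(p-1, q-1)

def rsa_encrypt(coeffs):
    # Shift coefficients to be non-negative before encryption (assumption).
    return [pow(c + 1024, e, n) for c in coeffs]

def rsa_decrypt(cipher):
    return [pow(c, d, n) - 1024 for c in cipher]

coeffs = [-26, -3, 0, 1, 5]          # example quantized coefficients
assert rsa_decrypt(rsa_encrypt(coeffs)) == coeffs
```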

30 citations


Journal ArticleDOI
TL;DR: The proposed idea of this research work is to develop robust image steganography for digital image signals using Least Significant Bit and Discrete Wavelet Transform techniques, to improve robustness and evaluate the performance of these algorithms.
Abstract: Steganography is the science of conveying secret information by embedding it invisibly into a cover object. In steganography, only the authorized party is aware of the existence of the hidden message, achieving secret communication. The image file is the most commonly used cover medium among digital files such as image, text, audio and video. The proposed idea of this research work is to develop robust image steganography. It is implemented using Least Significant Bit (LSB) and Discrete Wavelet Transform (DWT) techniques for the digital image signal, to improve robustness and evaluate the performance of these algorithms. Parameters such as mean square error (MSE), bit error rate (BER), peak signal-to-noise ratio (PSNR) and processing time are used to evaluate the performance of the proposed work. In the proposed system, the PSNR and MSE values range from 42 to 46 dB and from 1.5 to 3.5, respectively, for the LSB method. For the DWT method these results are further improved: it gives higher PSNR values between 49 and 57 dB and lower MSE values between 0.2 and 0.7.
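A minimal sketch of the spatial-domain LSB step, assuming a grayscale uint8 cover image and numpy; the paper's DWT variant embeds in wavelet coefficients instead:

```python
import numpy as np

def lsb_embed(cover, bits):
    """Hide a bit sequence in the least significant bit of each pixel."""
    stego = cover.copy().ravel()
    stego[:len(bits)] = (stego[:len(bits)] & 0xFE) | bits
    return stego.reshape(cover.shape)

def lsb_extract(stego, n_bits):
    return stego.ravel()[:n_bits] & 1

cover = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
secret = np.random.randint(0, 2, 100, dtype=np.uint8)
stego = lsb_embed(cover, secret)
assert np.array_equal(lsb_extract(stego, 100), secret)
# Each pixel changes by at most one gray level, hence the high PSNR.
mse = np.mean((cover.astype(float) - stego.astype(float)) ** 2)
```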

27 citations


Journal ArticleDOI
TL;DR: The proposed CNN-based Bengali handwritten numeral recognition scheme showed satisfactory recognition accuracy on the benchmark data set and outperformed other prominent existing methods for both Bengali and Bengali-English mixed cases.
Abstract: Recognition of handwritten numerals has gained much interest in recent years due to its various potential applications. Bengali ranks fifth among the spoken languages of the world. However, due to the inherent difficulties of Bengali numeral recognition, very few studies on handwritten Bengali numeral recognition are found compared with other major languages. Existing Bengali numeral recognition methods use distinct feature extraction techniques and various classification tools. Recently, the convolutional neural network (CNN) has been found efficient for image classification thanks to its distinct features. In this paper, we investigate a CNN-based Bengali handwritten numeral recognition scheme. Since English numerals are frequently used with Bengali numerals, handwritten Bengali-English mixed numerals are also investigated in this study. The proposed scheme uses a moderate pre-processing technique to generate patterns from images of handwritten numerals and then employs a CNN to classify individual numerals. It does not employ any feature extraction method like other related works. The proposed method showed satisfactory recognition accuracy on the benchmark data set and outperformed other prominent existing methods for both Bengali and Bengali-English mixed cases.
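A minimal Keras sketch of a CNN of this kind, taking raw numeral patches as input with no hand-crafted features; the 32x32 input size and layer widths are assumptions, not the authors' exact architecture:

```python
import tensorflow as tf

num_classes = 10                     # 20 for the Bengali-English mixed case
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 1)),
    tf.keras.layers.Conv2D(32, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 5, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=10, validation_split=0.1)
```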

25 citations


Journal ArticleDOI
TL;DR: A real-time drowning detection method based on HSV color space analysis is presented, which uses prior knowledge of the video sequences to set the best values for the color channels in each frame.
Abstract: Safety in swimming pools is a crucial issue. In this paper, a real-time drowning detection method based on HSV color space analysis is presented, which uses prior knowledge of the video sequences to set the best values for the color channels. Our method uses an HSV thresholding mechanism along with contour detection to detect the region of interest in each frame of the video sequences. The presented software can detect a drowning person in indoor swimming pools and sends an alarm to the lifeguard if a previously detected person is missing for a specific amount of time. The algorithm was tested on several video sequences recorded in swimming pools under real conditions, and the results are of high accuracy with a high capability of tracking individuals in real time. According to the evaluation results, the number of false alarms generated by the system is minimal, and the maximum alarm delay reported by the system is 2.6 s, which is reliable relative to the acceptable time for rescue and resuscitation.
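A sketch of the HSV-thresholding-plus-contour step using OpenCV; the channel bounds, which the paper tunes from prior knowledge of each pool's video, are placeholders here:

```python
import cv2
import numpy as np

def detect_swimmers(frame_bgr, lower, upper, min_area=500):
    """Threshold a frame in HSV and return bounding boxes of large contours."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, lower, upper)
    # Remove small specks before contour extraction.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) > min_area]

# Placeholder channel bounds; a real system derives them per pool:
# boxes = detect_swimmers(frame, np.array([0, 40, 60]),
#                         np.array([80, 255, 255]))
```

Tracking the boxes across frames and timing how long a previously detected person stays missing yields the alarm condition described above.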

21 citations


Journal ArticleDOI
TL;DR: Results show that the SVM feature selection method provides better emotional speech-recognition performance than CFS and the baseline feature set, and the new system achieves high accuracy with a minimum set of features.
Abstract: The aim of this paper is to utilize the Support Vector Machine (SVM) as a feature selection and classification technique for audio signals to identify human emotional states. One of the major bottlenecks of common speech emotion recognition techniques is the use of a huge number of features per utterance, which can significantly slow down the learning process and cause the problem known as "the curse of dimensionality". Consequently, to ease this challenge, this paper aims to achieve a high-accuracy system with a minimum set of features. The proposed model uses two methods, namely "SVM feature selection" and the common "Correlation-based Feature Subset Selection (CFS)", for the feature dimension reduction part. In addition, two different classifiers, a Support Vector Machine and a Neural Network, are separately adopted to identify the six emotional states of anger, disgust, fear, happiness, sadness and neutral. The method has been verified using the Persian (Persian ESD) and German (EMO-DB) emotional speech databases and yields high recognition rates on both. The results show that the SVM feature selection method provides better emotional speech-recognition performance than CFS and the baseline feature set. Moreover, the new system is able to achieve a recognition rate of 99.44% on the Persian ESD and 87.21% on the Berlin Emotion Database for speaker-dependent classification. Besides, a promising result of 76.12% is obtained for the speaker-independent case, which is among the best-known accuracies reported on this database relative to its small number of features.
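One common reading of "SVM feature selection" is recursive feature elimination driven by linear-SVM weights; the scikit-learn sketch below assumes that interpretation, and the feature count and elimination step size are placeholders:

```python
from sklearn.feature_selection import RFE
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC, SVC

# X: utterance-level acoustic feature vectors; y: one of six emotion labels.
# RFE repeatedly drops the features with the smallest linear-SVM weights.
selector = RFE(LinearSVC(C=1.0, dual=False),
               n_features_to_select=50, step=10)
model = make_pipeline(StandardScaler(), selector, SVC(kernel="rbf"))
# model.fit(X_train, y_train)
# accuracy = model.score(X_test, y_test)
```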

18 citations


Journal ArticleDOI
TL;DR: This work proposes hardware and software solutions which take images of Ethiopian currency from a scanner and camera as input, and designs a four-level classifier with a categorization component, responsible for sorting currency notes into their respective denominations, and a verification component, responsible for validating whether the currency is genuine.
Abstract: Currency recognition is a technology used to identify the currencies of various countries. The use of automatic methods of currency recognition has been increasing due to its importance in many sectors such as vending machines, railway ticket counters, banking systems, shopping malls, currency exchange services, etc. This paper describes the design of automatic recognition of Ethiopian currency. In this work, we propose hardware and software solutions which take images of Ethiopian currency from a scanner and camera as input. We combined characteristic features of the currency and local feature descriptors to design a four-level classifier. The design has a categorization component, which is responsible for sorting currency notes into their respective denominations, and a verification component, which is responsible for validating whether the currency is genuine. The system was tested using genuine Ethiopian currency, counterfeit Ethiopian currency and other countries' currencies. The denomination accuracy for genuine Ethiopian currency, counterfeit currency and other countries' currencies is found to be 90.42%, 83.3% and 100% respectively. The verification accuracy of our system is 96.13%.

17 citations


Journal ArticleDOI
TL;DR: This research investigates and reviews the performance of convolutional networks and their variant, convolutional autoencoder networks, when tasked with recognition problems involving invariances such as translation, rotation, and scale, and provides an extensive review of the architectural and learning paradigms of the considered networks, in view of how built-in invariance is learned.
Abstract: The ability of the human visual processing system to accommodate and retain clear understanding or identification of patterns irrespective of their orientations is quite remarkable. Conversely, pattern invariance, a common problem in intelligent recognition systems, is not one that can be overstated; obviously, one's definition of an intelligent system broadens considering the large variability with which the same patterns can occur. This research investigates and reviews the performance of convolutional networks, and their variant, convolutional autoencoder networks, when tasked with recognition problems involving invariances such as translation, rotation, and scale. While various patterns could be used to validate this query, handwritten Yoruba vowel characters are used in this research. Databases of images containing patterns with the constraints of interest are collected, processed, and used to train and simulate the designed networks. We provide an extensive review of the architectural and learning paradigms of the considered networks, in view of how built-in invariance is learned. Lastly, we provide a comparative analysis of the achieved error rates against back-propagation neural networks, denoising autoencoders, stacked denoising autoencoders, and deep belief networks.

17 citations


Journal ArticleDOI
TL;DR: The proposed scheme is compared with existing methods and it is observed that the performance of the proposed method is superior in terms of visual quality, PSNR and Image Quality Index (IQI).
Abstract: The main aim of image denoising is to improve visual quality in terms of the edges and textures of images. In Computed Tomography (CT), images are generated with a combination of hardware, software and radiation dose. Generally, CT images are noisy due to hardware/software faults, mathematical computation errors or low radiation dose. The analysis and extraction of medically relevant information from noisy CT images are challenging tasks for diagnosis. This paper presents a novel edge-preserving image denoising technique based on the wavelet transform. The proposed scheme is divided into two phases. In the first phase, the input CT image is separately denoised using different patch sizes, where denoising is performed based on thresholding and its method noise thresholding. The outcome of the first phase is more than one denoised image. In the second phase, block-wise variation-based aggregation is performed in the wavelet domain. The final outcomes of the proposed scheme are excellent in terms of noise suppression and structure preservation. The proposed scheme is compared with existing methods, and it is observed that its performance is superior in terms of visual quality, PSNR and Image Quality Index (IQI).
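A sketch of the core wavelet-thresholding step using PyWavelets; it shows only basic soft thresholding with the universal threshold, while the paper adds method noise thresholding and a variance-based aggregation over patch sizes:

```python
import numpy as np
import pywt

def wavelet_denoise(img, wavelet="db4", level=3):
    """Soft-threshold detail coefficients with the universal threshold."""
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=level)
    # Noise estimate from the finest diagonal subband (median rule).
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thr = sigma * np.sqrt(2 * np.log(img.size))
    out = [coeffs[0]]                       # keep the approximation band
    for cH, cV, cD in coeffs[1:]:
        out.append(tuple(pywt.threshold(c, thr, mode="soft")
                         for c in (cH, cV, cD)))
    return pywt.waverec2(out, wavelet)
```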

17 citations


Journal ArticleDOI
TL;DR: The experimental results show that the proposed method outperforms all the other state-of-the-art methods both visually and quantitatively in terms of standard deviation, mutual information, edge strength, fusion factor, sharpness and average gradient.
Abstract: Image fusion is a popular application of image processing which merges two or more images into one. The merged image is of improved visual quality and carries more information content. The present work introduces a new image fusion method in the complex wavelet domain. The proposed fusion rule is based on a level-dependent threshold, where the absolute difference of a wavelet coefficient from the threshold value is taken as the fusion criterion. This absolute difference represents variation in the image intensity that corresponds to the salient features of the image. Hence, for fusion, the coefficients that are far from the threshold value are selected. The motivation behind using the dual tree complex wavelet transform is the failure of the real-valued wavelet transform in many respects. Good directional selectivity, the availability of phase information and the approximately shift-invariant nature of the dual tree complex wavelet transform make it suitable for image fusion and help to produce a high-quality fused image. To prove the strength of the proposed method, it has been compared with several spatial, pyramidal, wavelet and new-generation wavelet based fusion methods. The experimental results show that the proposed method outperforms all the other state-of-the-art methods both visually and quantitatively in terms of standard deviation, mutual information, edge strength, fusion factor, sharpness and average gradient.
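A sketch of the fusion rule under stated assumptions: a real DWT (PyWavelets) stands in for the dual tree complex wavelet transform, and the level-dependent threshold below is an illustrative choice, not the paper's formula:

```python
import numpy as np
import pywt

def fuse(img_a, img_b, wavelet="db2", level=3):
    """At each subband, keep the coefficient whose absolute difference
    from a level-dependent threshold is larger."""
    ca = pywt.wavedec2(img_a.astype(float), wavelet, level=level)
    cb = pywt.wavedec2(img_b.astype(float), wavelet, level=level)
    fused = [(ca[0] + cb[0]) / 2.0]          # average approximation band
    for lev, (sa, sb) in enumerate(zip(ca[1:], cb[1:]), start=1):
        bands = []
        for a, b in zip(sa, sb):
            # Illustrative level-dependent threshold (assumption).
            thr = np.mean(np.abs(np.concatenate([a.ravel(), b.ravel()]))) / lev
            bands.append(np.where(np.abs(np.abs(a) - thr) >=
                                  np.abs(np.abs(b) - thr), a, b))
        fused.append(tuple(bands))
    return pywt.waverec2(fused, wavelet)
```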

Journal ArticleDOI
TL;DR: Various approaches for classifying moving objects, based on shape and motion, used in video surveillance systems are described.
Abstract: A visual surveillance system is used for the analysis and interpretation of object behavior. It involves object classification to understand the visual events in videos. In this review paper, various object classification methods are surveyed. Classification techniques play an important role in surveillance systems, where both static and moving objects must be classified reliably. Object classification methods extract meaningful information and the various features needed to represent the data. In this survey, we describe various approaches, based on shape and motion, used for classifying moving objects in video surveillance systems. Index Terms—Video surveillance, object classification, feature extraction, neural network, recognition.

Journal ArticleDOI
TL;DR: In this paper, the detection of rows in an open-field tomato crop by analyzing images acquired using remote sensing from an Unmanned Aerial Vehicle is proposed; spectral-spatial methods are applied in processing the images, and K-means clustering is used for spectral clustering.
Abstract: Detection of rows in crops planted in rows is fundamental to site-specific management of agricultural farms. Unmanned Aerial Vehicles are increasingly being used for agricultural applications. Images acquired using low-altitude remote sensing are analysed. In this paper we propose the detection of rows in an open-field tomato crop by analyzing images acquired using remote sensing from an Unmanned Aerial Vehicle. The Unmanned Aerial Vehicle used is a quadcopter fitted with an optical sensor, a visible-spectrum camera. Spectral-spatial methods are applied in processing the images. K-means clustering is used for spectral clustering, and the clustering result is further improved using spatial methods. Mathematical morphology and the geometric shape operations of Shape Index and Density Index are used for spatial segmentation. Six images acquired at different altitudes are analysed to validate the robustness of the proposed method. The performance of row detection is analysed using a confusion matrix. The results are comparable across the diverse image sets analyzed.
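A minimal sketch of the spectral step, assuming scikit-learn's KMeans over raw RGB values; the paper's spatial refinement (morphology, Shape Index, Density Index) would follow on the resulting label map:

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clusters(rgb_image, k=3):
    """Cluster pixels by color; in a visible-spectrum aerial image of a
    field, one cluster typically corresponds to vegetation (crop rows)."""
    h, w, _ = rgb_image.shape
    pixels = rgb_image.reshape(-1, 3).astype(float)
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(pixels)
    return labels.reshape(h, w)

# label_map = spectral_clusters(uav_image)
# Pick the cluster with the highest mean green value as the crop mask,
# then clean it up with morphological opening/closing.
```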

Journal ArticleDOI
TL;DR: Segmentation and counting of RBCs and WBCs from microscopic blood sample images using Otsu's thresholding, morphological operations and the Circular Hough Transform are presented.
Abstract: In the biomedical field, blood cell analysis is the first step in the diagnosis of many diseases. The first test requested by a doctor is the CBC (Complete Blood Cell Count). A microscopic image of the blood stream contains three types of blood cells: Red Blood Cells (RBCs), White Blood Cells (WBCs) and platelets. Earlier, counting of blood cells was done manually, which was inaccurate and dependent on the operator's skill. Counting blood cells using image processing provides cost-effective and more accurate results than manual counting. During the counting process, splitting clumped cells is the most challenging issue. This paper presents segmentation and counting of RBCs and WBCs from microscopic blood sample images. Segmentation is done using Otsu's thresholding and morphological operations. Counting of cells is done using the geometric features of cells. RBC images contain clumped cells, which makes accurate counting very challenging. For counting RBCs, two different methods are used: 1) watershed segmentation and 2) the Circular Hough Transform. A comparison of both methods is shown for randomly selected images. The performance of the counting methods is also analyzed by comparing their results with manual counts.
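A sketch of the Otsu-plus-Circular-Hough path using OpenCV; the file name, radius bounds and accumulator thresholds are placeholders that depend on magnification:

```python
import cv2
import numpy as np

gray = cv2.imread("blood_smear.png", cv2.IMREAD_GRAYSCALE)

# Otsu's thresholding plus morphological opening to remove specks.
_, mask = cv2.threshold(gray, 0, 255,
                        cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))

# Circular Hough Transform: round cells register even when they clump.
circles = cv2.HoughCircles(cv2.medianBlur(gray, 5), cv2.HOUGH_GRADIENT,
                           dp=1, minDist=18, param1=100, param2=25,
                           minRadius=8, maxRadius=20)
rbc_count = 0 if circles is None else circles.shape[1]
print("estimated RBC count:", rbc_count)
```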

Journal ArticleDOI
TL;DR: A comparison of five non-intrusive methods for eye blink detection on low-resolution eye images, using features such as mean intensity, Fisher faces and Histogram of Oriented Gradients (HOG), and classifiers such as Support Vector Machines (SVM) and Artificial Neural Networks (ANN).
Abstract: Eye blink detection has gained a lot of interest in recent years in the field of Human Computer Interaction (HCI). Research is being conducted all over the world on developing new Natural User Interfaces (NUI) that use eye blinks as an input. This paper presents a comparison of five non-intrusive methods for eye blink detection on low-resolution eye images, using features such as mean intensity, Fisher faces and Histogram of Oriented Gradients (HOG), and classifiers such as Support Vector Machines (SVM) and Artificial Neural Networks (ANN). A comparative study is performed by varying the number of training images, under uncontrolled lighting conditions and with low-resolution eye images. The results show that HOG features combined with an SVM classifier outperform all other methods, with an accuracy of 85.62% when tested on images taken from a totally unknown dataset.
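A minimal sketch of the best-performing combination, HOG features with an SVM, assuming scikit-image and scikit-learn; the patch size and HOG cell parameters are assumptions:

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def hog_features(eye_patch):
    """eye_patch: small grayscale image of the eye region (e.g. 24x24)."""
    return hog(eye_patch, orientations=9, pixels_per_cell=(6, 6),
               cells_per_block=(2, 2))

# X: HOG vectors for open/closed eye patches; y: 1 = closed, 0 = open.
# X = np.vstack([hog_features(p) for p in patches])
clf = SVC(kernel="linear")
# clf.fit(X_train, y_train)
# A blink is then an open -> closed -> open transition across frames.
```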

Journal ArticleDOI
TL;DR: The experimental results from both the single (axial) projection and the combination of all projections (axial, coronal and sagittal) demonstrated better classification performance than an existing method, so the methodology could be used as a diagnostic measure for the detection of Alzheimer's disease.
Abstract: The aim of this research is to propose a methodology to classify subjects into Alzheimer's disease and normal control on the basis of visual features from the hippocampus region. All three-dimensional MRI images were spatially normalized to the MNI/ICBM atlas space. Then, the hippocampus region was extracted from the structural brain MRI images, followed by the application of a two-dimensional Gabor filter in three scales and eight orientations for texture computation. Texture features were represented on a slice-by-slice basis by the mean and standard deviation of the magnitude of the Gabor response. Classification between Alzheimer's disease and normal control was performed with a linear support vector machine. This study analyzes the performance of the Gabor texture feature along each projection (axial, coronal and sagittal) separately as well as the combination of all projections. The experimental results from both the single (axial) projection and the combination of all projections (axial, coronal and sagittal) demonstrated better classification performance than an existing method. Hence, this methodology could be used as a diagnostic measure for the detection of Alzheimer's disease.
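A sketch of the Gabor feature step, assuming OpenCV's getGaborKernel for the 3-scale, 8-orientation bank; the kernel sizes, wavelengths and other parameters are placeholder choices:

```python
import cv2
import numpy as np

def gabor_features(roi, scales=(7, 11, 15), n_orient=8):
    """Mean and std of absolute Gabor responses over 3 scales x 8
    orientations (48 values per slice)."""
    feats = []
    for ksize in scales:
        for i in range(n_orient):
            theta = i * np.pi / n_orient
            kern = cv2.getGaborKernel((ksize, ksize), sigma=ksize / 3.0,
                                      theta=theta, lambd=ksize / 2.0,
                                      gamma=0.5, psi=0)
            resp = cv2.filter2D(roi.astype(np.float32), cv2.CV_32F, kern)
            feats += [np.abs(resp).mean(), np.abs(resp).std()]
    return np.array(feats)

# One such vector per hippocampus slice feeds a linear SVM
# (e.g. sklearn.svm.LinearSVC) for AD vs. normal-control classification.
```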

Journal ArticleDOI
TL;DR: A hybrid methodology is suggested which extracts multilingual text from natural scene images with cluttered backgrounds and can be used as an efficient method for text recognition in natural scene images.
Abstract: The objective of this study is to propose a new method for text region localization and character extraction in natural scene images with complex backgrounds. In this paper, a hybrid methodology is suggested which extracts multilingual text from natural scene images with cluttered backgrounds. The proposed approach involves four steps. First, potential text regions in an image are extracted based on edge features using the Contourlet transform. In the second step, potential text regions are tested for text or non-text content using GLCM features and an SVM classifier. In the third step, multiple lines in localized text regions are detected and line segmentation is performed using horizontal profiles. In the last step, each character of the segmented line is extracted using vertical profiles. The experimentation has been done using images drawn from our own dataset and the ICDAR dataset. The performance is measured in terms of precision and recall. The results demonstrate the effectiveness of the proposed method, which can be used as an efficient method for text recognition in natural scene images.
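A sketch of the second step's texture features, assuming scikit-image's GLCM utilities; the distances, angles and chosen properties are assumptions, and the resulting vectors would feed the SVM text/non-text classifier:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(region):
    """Texture descriptors for a candidate text region (grayscale uint8)."""
    glcm = graycomatrix(region, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

# X = np.vstack([glcm_features(r) for r in candidate_regions])
# then e.g. sklearn.svm.SVC().fit(X, text_or_not_labels)
```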

Journal ArticleDOI
TL;DR: An efficient region-based image retrieval method using the shape adaptive discrete wavelet transform (SA-DWT) with multi-feature color, texture and edge descriptors is proposed.
Abstract: In this paper, we present an efficient region-based image retrieval method which uses multi-feature color, texture and edge descriptors. In contrast to recent image retrieval methods, which use the discrete wavelet transform (DWT), we propose using the shape adaptive discrete wavelet transform (SA-DWT). The advantage of this method is that the number of coefficients after transformation is identical to the number of pixels in the original region. Since image data is often stored in compressed formats (JPEG 2000, MPEG 4, …), constructing image histograms directly in the compressed domain accelerates the retrieval operation and reduces computational complexity. Moreover, SA-DWT represents the best way to exploit the coefficients' characteristics and properties such as correlation. Characterizing image regions without any conversion or modification is addressed first. Using an edge descriptor to complement image region characterization is then introduced. Experimental results show that the proposed method outperforms content-based image retrieval methods and recent region-based image retrieval methods.

Journal ArticleDOI
TL;DR: A survey of notable shadow removal techniques for single images available in the literature is presented, with algorithms classified into five categories: reintegration methods, relighting methods, patch-based methods, color transfer methods, and interactive methods.
Abstract: Shadows are physical phenomena that appear on a surface when direct light from a source is unable to reach the surface due to the presence of an object between the source and the surface. The formation of shadows and their various features has evolved into a topic of discussion among researchers. Though the presence of shadows can aid in understanding the scene model, it may impair the performance of applications such as object detection. Hence, the removal of shadows from videos and images is required for the faultless working of certain image processing tasks. This paper presents a survey of notable shadow removal techniques for single images available in the literature. For the purpose of the survey, the various shadow removal algorithms are classified under five categories, namely reintegration methods, relighting methods, patch-based methods, color transfer methods, and interactive methods. A comparative study of the qualitative and quantitative performance of these works is also included, and the pros and cons of the various approaches are highlighted. The survey concludes with the following observations: (i) shadow removal should be performed in real time since it is usually considered a preprocessing task; (ii) the texture and color information of the regions underlying the shadow must be recovered; (iii) there should be no hard transition between shadow and non-shadow regions after removing the shadows.

Journal ArticleDOI
TL;DR: A multi-channel peanut sorting algorithm running on the Raspberry Pi ARM platform is discussed for peanut quality segregation, sorting out foreign material as well as defective peanuts, such as those with aflatoxin contamination or fungal content, from good-quality peanuts.
Abstract: Sorting of finished products or agricultural food uses different methods for ultra-high-speed quality inspection. Optical sorting is an important application of image processing, used in industry to replace manual methods of verifying the quality of finished products or raw food. Most systems use a computer as the main processing device to run the image processing algorithms; such systems have limitations like higher cost, bigger size and long initial boot-up time. This type of design is not suitable for ultra-fast, high-capacity sorting of small agricultural products like nuts, grains and pulses. Standalone embedded image processing platforms can overcome the limitations of computer-based systems to a certain extent. As peanuts (Arachis hypogaea) come from the farm, they are mixed with foreign material like rocks, moist soil particles and the outer shells of raw peanuts, and these must be separated with a high level of accuracy and precision. Here we discuss a multi-channel peanut sorting algorithm running on the Raspberry Pi ARM platform for peanut quality segregation, which sorts out foreign material as well as defective peanuts, such as those with aflatoxin contamination or fungal content, from good-quality peanuts. In this paper we discuss the implementation of such a system using a conveyor belt and an image processing algorithm. The algorithm considers the color and size of the peanuts for the optical sorting process.

Journal ArticleDOI
TL;DR: This technique is a hybrid of two audio steganography techniques, the Least Significant Bit (LSB) technique and a modification of phase coding, aimed at improving the performance of phase coding, which is otherwise very low.
Abstract: Steganography is the art or science used in secret communication. It means that a secret message is hidden within another cover medium. The cover medium may be image, video or audio, and the secret message may be any type of digital message. The hidden message has no relationship with the cover medium; the cover medium merely protects the secret message from discovery by an unauthorized receiver. An audio cover is used in this paper because of the higher sensitivity of the human auditory system (HAS) compared with the human visual system (HVS). In this paper, we propose a hybrid technique for audio steganography. This technique combines two audio steganography techniques: the Least Significant Bit (LSB) technique and a modification of phase coding. The hybrid aims to improve the performance of phase coding, which is otherwise very low. Audio steganography performance is measured by several factors, the most important of which is the signal-to-noise ratio (SNR), used here to compare the performance of our technique with some known techniques.

Journal ArticleDOI
TL;DR: The proposed chaos-based digital image watermarking algorithm, based on the redundant discrete wavelet transform (RDWT) and singular value decomposition (SVD), is shown via computer simulations to be robust against both geometrical and image processing attacks and to provide better watermark concealment.
Abstract: In recent years, chaos has received a great deal of attention from researchers specializing in communications, signal and image processing. The complexity of chaotic signals raised the idea of using such signals in secure communications. Digital image watermarking is a technique mainly developed for copyright protection and image authentication, and it can be considered one application area of secure communication. In this study, a chaos-based digital image watermarking algorithm based on the redundant discrete wavelet transform (RDWT) and singular value decomposition (SVD) is proposed. To the best of our knowledge, no digital watermarking scheme combining RDWT, SVD and chaos exists. The robustness and invisibility of the proposed method are improved by using the logistic mapping function to generate a chaotic image matrix serving as the watermark, which is used to modify the singular values of the low-frequency sub-band of the cover image obtained by applying the RDWT. The method is shown via computer simulations to be robust against both geometrical and image processing attacks and to provide better watermark concealment. Using a chaotic signal as the watermark also allows the proposed scheme to meet security requirements.
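A sketch of the chaotic-watermark idea: the logistic map generates the key-dependent watermark, which then perturbs the singular values of a low-frequency subband. The RDWT step that produces the subband is omitted, and the embedding form and strength are simplifications:

```python
import numpy as np

def logistic_watermark(n, x0=0.7, r=3.99):
    """Chaotic sequence from the logistic map x_{k+1} = r*x_k*(1 - x_k).
    The initial value x0 and parameter r act as the secret key."""
    x, out = x0, np.empty(n)
    for i in range(n):
        x = r * x * (1 - x)
        out[i] = x
    return out

def embed_in_singular_values(subband, alpha=0.05):
    """Perturb the singular values of a low-frequency subband with the
    chaotic watermark (simplified embedding; strength alpha is assumed)."""
    U, S, Vt = np.linalg.svd(subband, full_matrices=False)
    w = logistic_watermark(S.size)
    return U @ np.diag(S * (1.0 + alpha * w)) @ Vt
```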

Journal ArticleDOI
TL;DR: This work proposes a novel method for the preprocessing of MR brain images for improved segmentation of brain tumors based on mathematical morphology operations, and implements an algorithm for the contrast enhancement of MR brain images using morphological operations.
Abstract: The human brain is a complex system made up of neurons and glial cells. Nothing in the universe compares with the functioning of the human brain. Due to its complex nature, the diseases affecting the brain are also very complex. Brain imaging is the widely used method for diagnosing such diseases. A brain tumor is an abnormal mass of tissue in which cells grow and multiply uncontrollably, seemingly unchecked by the mechanisms that control normal cells. Magnetic Resonance Imaging (MRI) is a commonly used modality for detecting brain diseases. In this work we propose a novel method for the preprocessing of MR brain images for improved segmentation of brain tumors based on mathematical morphology operations. The first part of this paper proposes an efficient method for the skull stripping of brain MR images based on mathematical morphology. One of the main disadvantages of MRI technology is its low contrast, so the second part of this paper implements an algorithm for the contrast enhancement of MR brain images using morphological operations. The outputs of these algorithms are evaluated using standard measures. The experimental part shows that the proposed method produces very prominent and efficient results.
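A sketch of morphology-based skull stripping for a single axial slice, assuming OpenCV; the threshold choice, kernel size and iteration counts are assumptions rather than the authors' parameters:

```python
import cv2
import numpy as np

def strip_skull(slice_gray):
    """Threshold, erode to break the brain/skull bridge, keep the largest
    component, dilate back, and mask the original slice."""
    _, bw = cv2.threshold(slice_gray, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    k = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (9, 9))
    eroded = cv2.erode(bw, k, iterations=2)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(eroded)
    biggest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])  # skip background
    brain = np.uint8(labels == biggest) * 255
    brain = cv2.dilate(brain, k, iterations=2)
    return cv2.bitwise_and(slice_gray, slice_gray, mask=brain)
```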

Journal ArticleDOI
TL;DR: The proposed work constructs the brain tumor boundary using bi-modal fuzzy histogram thresholding and an edge indication map (EIM) generated by a hybrid approach combining the results of existing edge operators with a maximum voting scheme.
Abstract: Tumor boundary detection is one of the challenging tasks in the medical diagnosis field. The proposed work constructs the brain tumor boundary using bi-modal fuzzy histogram thresholding and an edge indication map (EIM). The work has two major steps. Step 1 aims to enhance the contrast in order to sharpen the edges: an intensity transformation is used for contrast enhancement, with an automatic threshold value produced by the bi-modal fuzzy histogram thresholding technique. In step 2, the EIM is generated by a hybrid approach combining the results of existing edge operators with a maximum voting scheme. The edge indication map produces a continuous tumor boundary along with the brain border and substructures (cerebrospinal fluid (CSF), sulcal CSF (SCSF) and the interhemispheric fissure), making the tumor location easy to reach. The experimental results are compared with the gold standard using several evaluation parameters. The results show better values and quality for the proposed method than for traditional edge detection techniques. The 3D volume constructed from the edge indication map is very useful for analyzing the brain tumor location during surgical planning.

Journal ArticleDOI
TL;DR: The proposed method analyzes the performance of various wavelet types, such as Haar, Daubechies, Coiflet, Morlet and Symlet, on MRI scans to obtain better performance in terms of both quantitative measures and visual appearance.
Abstract: Fully automatic brain tumor detection is one of the critical tasks in medical image processing. The proposed study discusses a tumor segmentation process based on wavelet transformation and a clustering technique. Initially, MRI brain images are preprocessed by various wavelet transformations to sharpen the images and enhance the tumor region. This speeds up the clustering technique, since the tumor region appears clearly in the sharpened CSF region. Finally, a wavelet decomposition method is applied to the CSF region to extract the tumor portion. The proposed method analyzes the performance of various wavelet types such as Haar, Daubechies (db1, db2, db3, db4 and db5), Coiflet, Morlet and Symlet on MRI scans. Experiments with the proposed method were done on 5 volume datasets collected from the popular brain tumor pools BRATS2012 and the Whole Brain Atlas. The quantitative results were compared using the metrics false alarm (FA) and missed alarm (MA). The results demonstrate that the proposed method obtains better performance in terms of both quantitative measures and visual appearance.

Journal ArticleDOI
TL;DR: The approach is compared with the five base classifiers by calculating the average classification accuracy, and experiments on five UCI data sets and remote sensing image data sets are performed to verify the effectiveness of the proposed method.
Abstract: Remote sensing textural image classification has been one of the hottest topics in the field of remote sensing. Texture is the most helpful cue for image classification. Commonly, terrain types are complex and multiple texture features are extracted for classification; in addition, there is noise in remote sensing images, and a single classifier can hardly obtain optimal classification results. Integrating multiple classifiers makes good use of the characteristics of the different classifiers and improves classification accuracy to the largest extent. In this paper, based on diversity measurements of the base classifiers, the J48 classifier, IBk classifier, sequential minimal optimization (SMO) classifier, Naive Bayes classifier and multilayer perceptron (MLP) classifier are selected for ensemble learning. In order to evaluate the influence of our proposed method, our approach is compared with the five base classifiers by calculating the average classification accuracy. Experiments on five UCI data sets and remote sensing image data sets are performed to verify the effectiveness of the proposed method.
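A sketch of the ensemble using scikit-learn stand-ins for the WEKA classifiers named above (J48 ≈ C4.5 decision tree, IBk ≈ k-NN, SMO ≈ SVM); the paper's diversity-based selection and combination rule may differ from plain soft voting:

```python
from sklearn.ensemble import VotingClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

ensemble = VotingClassifier(
    estimators=[("j48", DecisionTreeClassifier()),
                ("ibk", KNeighborsClassifier(n_neighbors=3)),
                ("smo", SVC(probability=True)),
                ("nb", GaussianNB()),
                ("mlp", MLPClassifier(max_iter=1000))],
    voting="soft")
# ensemble.fit(X_train, y_train)
# print("accuracy:", ensemble.score(X_test, y_test))
```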

Journal ArticleDOI
TL;DR: In this article, a local-statistics-based filter is applied in the non-homogeneous regions of the output of an edge-preserving filter, and an edge map is used to retain the original edges.
Abstract: Speckle is a multiplicative noise that degrades the quality of ultrasound images, and its presence makes visual inspection difficult. In addition, it limits the professional application of image processing techniques such as automatic lesion segmentation. Speckle reduction is therefore an essential step before further processing of ultrasonic images. Numerous techniques have been developed to preserve edges while reducing speckle noise, but these filters avoid smoothing near the edges in order to preserve fine details. The objective of this work is to suggest a new technique that enhances B-scan breast ultrasound images by increasing the speckle reduction capability of an edge-sensitive filter. In the proposed technique, a local-statistics-based filter is applied in the non-homogeneous regions of the output of an edge-preserving filter, and an edge map is used to retain the original edges. Experiments are conducted using a synthetic test image and real ultrasound images. The effectiveness of the proposed technique is evaluated qualitatively by experts and quantitatively in terms of various quality metrics. Results indicate that the proposed method can reduce more noise while simultaneously preserving important diagnostic edge information in breast ultrasound images.

Journal ArticleDOI
TL;DR: The top-level description of a complete automated video surveillance system is presented, along with an elaboration of the different challenges/issues involved in its design and implementation, a comparative analysis of design methodologies and existing FPGA platforms, and details of the various primary input/output interfaces required for designing smart automated video surveillance systems of the future.
Abstract: Because of increasing terrorist activities, the resolution of video cameras and the number of cameras deployed for surveillance are increasing exponentially, producing huge amounts of video data. Manual analysis of this large volume of video data by human operators for crime scene and forensic analysis is neither reliable nor scalable. This has generated enormous interest in research on the automation of video surveillance systems, which allows real-time automatic extraction and analysis of information from live incoming video streams and enables automatic detection and tracking of targets without human intervention. To meet the real-time requirements of automated video surveillance systems, very different technologies and design methodologies have been used in the literature. These range from General Purpose Processors (GPPs), special-purpose Digital Signal Processors (DSPs) and Graphics Processing Units (GPUs) to Application Specific Integrated Circuits (ASICs), Application Specific Instruction Set Processors (ASIPs), and programmable logic devices like Field Programmable Gate Arrays (FPGAs). FPGAs provide real-time performance that is hard to achieve with GPPs/DSPs, limit the extensive design work, time, and cost required for ASICs, and allow algorithmic changes in later stages of system development. Due to these features, FPGAs are being increasingly used for quickly prototyping automated video surveillance systems. In this paper we present the top-level description of a complete automated video surveillance system, along with an elaboration of the different challenges/issues involved in its design and implementation, a comparative analysis of design methodologies and existing FPGA platforms, a complete design flow for prototyping the FPGA-based automated video surveillance system, and details of the various primary input/output interfaces required for designing smart automated video surveillance systems of the future.

Journal ArticleDOI
TL;DR: An efficient deblocking filter to reduce block artifacts is proposed; simulation results indicate that the maximum increase in peak signal-to-noise ratio (PSNR) of the proposed method is 0.09 dB compared with other deblocking filter algorithms.
Abstract: The international High Efficiency Video Coding (HEVC) standard improves the compression ratio by over 50% compared with previous standards such as H.264/AVC while maintaining the same perceptual quality. HEVC achieves significant coding efficiency improvement beyond existing video coding standards by employing several new coding tools. The deblocking filter, Adaptive Loop Filter (ALF) and Sample Adaptive Offset (SAO) have been introduced for the HEVC standard. The deblocking filter detects artifacts at coded block boundaries and attenuates them by employing a selected filter. However, it has been shown that the HEVC encoder may still produce visible block artifacts on some sequences. In this paper, we propose an efficient deblocking filter to reduce block artifacts. The simulation results indicate that the maximum increase in peak signal-to-noise ratio (PSNR) of the proposed method is 0.09 dB compared with other deblocking filter algorithms.

Journal ArticleDOI
TL;DR: A new method for constructing a pseudo-hexagonal structure from square pixels is presented, which preserves the important property of hexagonal architecture that each pixel has exactly six surrounding neighbors, and also preserves the equidistance property of hexagonal pixels.
Abstract: The hexagonal structure is a different approach to representing an image, as opposed to the traditional square structure. Hexagon-shaped pixels are used in the hexagonal structure representation of images. The hexagonal structure closely resembles the structure of the human visual system (HVS), because the photoreceptors in the human retina are arranged in a hexagonal manner; curved structures are also well represented in a hexagonal structure. So if we could represent images in the hexagonal domain, computer vision would be closer to human vision. In the present scenario, however, no hardware is available to capture or display hexagonal images, so a hexagonal grid must be simulated on a regular square-pixel image for further processing in the hexagonal domain. In this paper, a new method for constructing a pseudo-hexagonal structure using square pixels is presented. This method preserves the important property of hexagonal architecture that each pixel has exactly six surrounding neighbors, and it also preserves the equidistance property of hexagonal pixels.
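For intuition, the six-neighbor property can be illustrated with "odd-r" offset addressing on a square grid, where odd rows are treated as shifted half a pixel to the right; this shows the neighborhood structure only, not the paper's construction of the pseudo-hexagonal pixels themselves:

```python
def hex_neighbors(row, col):
    """Six neighbors of a pseudo-hexagonal pixel in odd-r offset
    coordinates: the offset pattern depends on the row's parity."""
    if row % 2 == 0:   # even row
        offsets = [(-1, -1), (-1, 0), (0, -1), (0, 1), (1, -1), (1, 0)]
    else:              # odd row (shifted right by half a pixel)
        offsets = [(-1, 0), (-1, 1), (0, -1), (0, 1), (1, 0), (1, 1)]
    return [(row + dr, col + dc) for dr, dc in offsets]

print(hex_neighbors(2, 2))  # exactly six surrounding neighbors
```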