Bio: Alireza Behrad is an academic researcher from Shahed University. The author has contributed to research in topics: feature extraction & support vector machines. The author has an h-index of 13 and has co-authored 84 publications receiving 710 citations. Previous affiliations of Alireza Behrad include Amirkabir University of Technology & University of Tehran.
TL;DR: A new automatic system for liver segmentation in abdominal MRI images that combines MLP neural networks with the watershed algorithm, using trained neural networks to extract features of the liver region.
Abstract: Precise liver segmentation in abdominal MRI images is one of the most important steps for the computer-aided diagnosis of liver pathology. The first and essential step for diagnosis is automatic liver segmentation, and this process remains challenging. Extensive research has examined liver segmentation; however, it is difficult to determine which algorithm produces more precise segmentation results applicable to various medical imaging techniques. In this paper, we present a new automatic system for liver segmentation in abdominal MRI images. The system includes several successive steps. Preprocessing is applied to enhance the image (edge-preserved noise reduction) by using mathematical morphology. The proposed algorithm for liver region extraction is a combined algorithm that utilizes MLP neural networks and the watershed algorithm. The traditional watershed transformation generally results in oversegmentation when directly applied to medical image segmentation. Therefore, we use trained neural networks to extract features of the liver region. The extracted features are used to monitor the quality of the segmentation obtained with the watershed transform and to adjust the required parameters automatically. The process of adjusting parameters is performed sequentially over several iterations. The proposed algorithm extracts the liver region in one slice of the MRI images; a boundary tracking algorithm is suggested to extract the liver region in the other slices, which is left as future work. This system was applied to a series of test images to extract the liver region. Experimental results were positive for the proposed algorithm.
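The feedback loop described in the abstract — segment, score the result against learned liver features, and adjust the segmentation parameters automatically over several iterations — can be sketched in miniature. This is an illustrative toy, not the paper's implementation: `segment` stands in for the watershed step as a simple threshold, and `quality` stands in for the neural-network quality monitor as a Dice overlap against a feature-derived reference mask.

```python
def segment(image, threshold):
    """Toy stand-in for marker-based watershed segmentation: returns the
    set of pixel indices whose value exceeds the marker threshold."""
    return {i for i, v in enumerate(image) if v >= threshold}

def quality(region, reference):
    """Toy stand-in for the neural-network quality monitor: Dice overlap
    between the segmented region and a feature-derived reference mask."""
    if not region and not reference:
        return 1.0
    return 2 * len(region & reference) / (len(region) + len(reference))

def segment_with_feedback(image, reference, threshold, step, iters=10):
    """Iteratively nudge the segmentation parameter toward higher quality,
    mirroring the sequential parameter-adjustment loop in the paper."""
    best_t = threshold
    best_q = quality(segment(image, threshold), reference)
    for _ in range(iters):
        for t in (best_t - step, best_t + step):
            q = quality(segment(image, t), reference)
            if q > best_q:
                best_t, best_q = t, q
    return segment(image, best_t), best_t

# Hypothetical 1-D "image" and a reference mask of "liver" pixels:
image = [0.1, 0.4, 0.55, 0.8, 0.9, 0.3, 0.7]
reference = {3, 4, 6}
region, t = segment_with_feedback(image, reference, threshold=0.4, step=0.1)
print(region, t)
```

The loop converges once neither neighboring parameter value improves the quality score, which is the stopping behavior the abstract's iterative adjustment implies.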
01 Jan 2001
TL;DR: Experimental results have shown that the algorithm is reliable and can successfully detect and track targets in most cases and local and regional computations have made the algorithm suitable for real-time applications.
Abstract: In this paper we present a new algorithm for real-time detection and tracking of moving targets in terrestrial scenes using a mobile camera. Our algorithm consists of two modes: detection and tracking. In the detection mode, background motion is estimated and compensated using an affine transformation. The resultant motion-rectified image is used for detection of the target location using a split-and-merge algorithm. We also checked other features for precise detection of the target location. When the target is identified, the algorithm switches to the tracking mode. A modified Moravec operator is applied to the target to identify feature points. The feature points are matched with points in the region of interest in the current frame. The corresponding points are further refined using disparity vectors. The tracking system is capable of target shape recovery and therefore it can successfully track targets with varying distance from the camera or while the camera is zooming. Local and regional computations have made the algorithm suitable for real-time applications. The refined points define the new position of the target in the current frame. Experimental results have shown that the algorithm is reliable and can successfully detect and track targets in most cases.
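The abstract's feature-point step builds on the classic Moravec operator; the paper uses a modified variant whose details the abstract does not give, so the sketch below shows only the standard form. For each pixel, it takes the minimum over four shift directions of the sum of squared window differences: edges score low in the direction along the edge, while corners score high in every direction.

```python
def moravec(img, window=1, threshold=50000):
    """Classic Moravec interest operator on a 2-D list of intensities.
    Returns (y, x) positions whose minimum directional SSD exceeds the
    threshold, i.e. corner-like feature points."""
    h, w = len(img), len(img[0])
    shifts = [(1, 0), (0, 1), (1, 1), (1, -1)]
    points = []
    for y in range(window + 1, h - window - 1):
        for x in range(window + 1, w - window - 1):
            # Minimum over shift directions of the windowed SSD.
            score = min(
                sum((img[y + dy + v][x + dx + u] - img[y + v][x + u]) ** 2
                    for v in range(-window, window + 1)
                    for u in range(-window, window + 1))
                for dy, dx in shifts)
            if score >= threshold:
                points.append((y, x))
    return points

# Synthetic 12x12 image with a bright square; corners should respond,
# edge midpoints and flat regions should not.
img = [[200 if 3 <= y <= 8 and 3 <= x <= 8 else 10 for x in range(12)]
       for y in range(12)]
corners = moravec(img)
print(corners)
```

Because the minimum is taken over only four shift directions, the basic operator is anisotropic, which is one motivation for modified variants like the one the paper applies.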
TL;DR: A new passive approach is proposed for tampering detection and localization in MPEGx coded videos that can detect frame insertion or deletion and double compression with different GOP structures and lengths and reduce the effect of motion on residual errors of P frames.
Abstract: In this paper, a new passive approach is proposed for tampering detection and localization in MPEGx coded videos. The proposed algorithm can detect frame insertion or deletion and double compression with different GOP structures and lengths. To devise the proposed algorithm, the traces of quantization error on residual errors of P frames are mathematically studied. Then, based on the obtained guidelines, a new algorithm is proposed to detect quantization-error-rich areas in the P frames and reduce the effect of motion on residual errors of P frames. Subsequently, a wavelet-based algorithm is proposed to enrich the traces of quantization error in the frequency domain. Finally, the processed and spatially constrained residual errors of P frames are employed to detect and localize video forgery in the temporal domain. Experimental results and a comparison of the proposed method with an existing approach show the efficiency of the proposed algorithm especially for videos with high compression rates.
Highlights: A new algorithm for inter-frame video forgery detection and localization is proposed. The traces of quantization error on residual errors of P frames are studied. The algorithm can detect inter-frame forgeries in videos with different GOP lengths. We detect quantization-error-rich areas in P frames to reduce the effect of motion.
TL;DR: A novel technique to discover double JPEG compression traces, which discriminates single compressed images from double counterparts, estimates the first quantization in double compression, and localizes tampered regions in a forgery examination is presented.
Abstract: This paper presents a novel technique to discover double JPEG compression traces. Existing detectors only operate in a scenario where the image under investigation is explicitly available in JPEG format. Consequently, if the quantization information of JPEG files is unknown, their performance dramatically degrades. Our method addresses both forensic scenarios, resulting in a fresh perceptual detection pipeline. We suggest a dimensionality reduction algorithm to visualize the behavior of a large database containing various single- and double-compressed images. Based on the intuitions gained from visualization, three learning strategies are proposed: bottom-up, top-down and combined top-down/bottom-up. Our tool discriminates single compressed images from their double-compressed counterparts, estimates the first quantization step in double compression, and localizes tampered regions in a forgery examination. Extensive experiments on three databases demonstrate that the results are robust across different quality levels. The F1-measure improvement over the best state-of-the-art approach reaches up to 26.32%. An implementation of the algorithms is available upon request.
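The statistical footprint that double-compression detectors like this one exploit can be shown with a toy example (illustrative only, not the paper's method): requantizing already-quantized coefficients with a different step leaves periodic empty or over-populated bins in the coefficient histogram, from which the first quantization step can be inferred.

```python
from collections import Counter

def quantize(values, q):
    """Quantize and dequantize with step q, as a JPEG coder does
    per DCT coefficient."""
    return [round(v / q) * q for v in values]

# Toy coefficient values, uniformly spread:
coeffs = list(range(-50, 51))
single = quantize(coeffs, 3)               # compressed once with q2 = 3
double = quantize(quantize(coeffs, 5), 3)  # q1 = 5, then q2 = 3

hist_single = Counter(single)
hist_double = Counter(double)
# Single compression populates every multiple of 3 in range; double
# compression leaves some multiples of 3 empty (the telltale periodicity).
empty_bins = sorted(b for b in hist_single if b not in hist_double)
print(empty_bins)
```

The spacing of the empty bins depends on the ratio q1/q2, which is why a detector can both flag double compression and estimate the first quantization step, as the abstract describes.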
TL;DR: A new boosted and parallel architecture is proposed for video captioning using Long Short-Term Memory (LSTM) networks that considerably improves the accuracy of the generated sentence.
Abstract: Video captioning and its integration with deep learning is one of the most challenging issues in the field of machine vision and artificial intelligence. In this paper, a new boosted and parallel architecture is proposed for video captioning using Long Short-Term Memory (LSTM) networks. The proposed architecture comprises two LSTM layers and a word selection module. The first LSTM layer has the responsibility of encoding frame features extracted by a pre-trained deep Convolutional Neural Network (CNN). In the second LSTM layer, a novel architecture is used for video captioning by leveraging several decoding LSTMs in a parallel and boosting architecture. This layer, which is called Boosted and Parallel LSTM (BP-LSTM) layer, is constructed by iteratively training LSTM networks using a special kind of AdaBoost algorithm during the training phase. During the testing phase, the outputs of BP-LSTMs are concurrently combined using the maximum probability criterion and word selection module. We tested the proposed algorithm with two well-known video captioning datasets and compared the results with state-of-the-art algorithms. The results show that the proposed architecture considerably improves the accuracy of the generated sentence.
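The word selection step described above — parallel decoders whose outputs are combined using the maximum probability criterion — can be sketched as follows. This is a toy with assumed details (the decoder distributions are hypothetical, not trained BP-LSTM outputs): each decoder emits a probability distribution over the vocabulary, the combination takes the per-word maximum across decoders, and the argmax word is selected.

```python
def select_word(distributions):
    """Combine per-decoder word distributions with the max-probability
    rule and return the highest-scoring word."""
    vocab = set().union(*distributions)
    combined = {w: max(d.get(w, 0.0) for d in distributions) for w in vocab}
    return max(combined, key=combined.get)

# Three hypothetical parallel decoders scoring the next caption word:
decoders = [
    {"dog": 0.5, "cat": 0.3, "runs": 0.2},
    {"dog": 0.2, "cat": 0.6, "runs": 0.2},
    {"dog": 0.4, "cat": 0.1, "runs": 0.5},
]
print(select_word(decoders))
```

Taking the per-word maximum lets a word win as long as any one boosted decoder is confident about it, which is the intuition behind combining weak learners in parallel.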
01 Jan 2006
TL;DR: Probability distributions and linear models for regression and classification are presented, along with a discussion of combining models in the context of machine learning.
Abstract: Contents: Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.
TL;DR: An analysis of comparative surveys in the field of gesture-based HCI, together with an analysis of existing literature on gesture recognition systems for human-computer interaction, categorized under different key parameters.
Abstract: As computers become more pervasive in society, facilitating natural human-computer interaction (HCI) will have a positive impact on their use. Hence, there has been growing interest in the development of new approaches and technologies for bridging the human-computer barrier. The ultimate aim is to bring HCI to a regime where interactions with computers will be as natural as an interaction between humans, and to this end, incorporating gestures in HCI is an important research area. Gestures have long been considered an interaction technique that can potentially deliver more natural, creative and intuitive methods for communicating with our computers. This paper provides an analysis of comparative surveys done in this area. The use of hand gestures as a natural interface serves as a motivating force for research in gesture taxonomies, their representations and recognition techniques, and software platforms and frameworks, which are discussed briefly in this paper. It focuses on the three main phases of hand gesture recognition, i.e. detection, tracking and recognition. Different applications that employ hand gestures for efficient interaction are discussed under core and advanced application domains. This paper also provides an analysis of existing literature related to gesture recognition systems for human-computer interaction by categorizing it under different key parameters. It further discusses the advances that are needed to improve present hand gesture recognition systems so that they can be widely used for efficient human-computer interaction. The main goal of this survey is to provide researchers in the field of gesture-based HCI with a summary of progress achieved to date and to help identify areas where further research is needed.
01 Mar 2012
TL;DR: This paper provides an overview of MHI-based human motion recognition techniques and applications and points some areas for further research based on the MHI method and its variants.
Abstract: The motion history image (MHI) approach is a view-based temporal template method which is simple but robust in representing movements and is widely employed by various research groups for action recognition, motion analysis and other related applications. In this paper, we provide an overview of MHI-based human motion recognition techniques and applications. Since the inception of the MHI template for motion representation, various approaches have been adopted to improve this basic MHI technique. We present all important variants of the MHI method. This paper also points out some areas for further research based on the MHI method and its variants.
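The basic MHI template the survey builds on is a simple per-pixel update rule: pixels where motion is detected are set to the maximal timestamp value tau, and all other pixels decay by a fixed amount each frame, so brighter values encode more recent motion. A minimal sketch (illustrative, using toy binary motion masks):

```python
def update_mhi(mhi, motion_mask, tau=255, delta=1):
    """One MHI update step: set moving pixels to tau, decay the rest."""
    return [[tau if m else max(0, h - delta)
             for h, m in zip(row_h, row_m)]
            for row_h, row_m in zip(mhi, motion_mask)]

mhi = [[0, 0, 0], [0, 0, 0]]
frames = [
    [[1, 0, 0], [0, 0, 0]],  # motion at (0, 0)
    [[0, 1, 0], [0, 0, 0]],  # motion moves to (0, 1)
    [[0, 0, 1], [0, 0, 0]],  # then to (0, 2)
]
for mask in frames:
    mhi = update_mhi(mhi, mask)
print(mhi)
```

After the three frames, the top row forms an intensity ramp toward the most recent motion, which is exactly the directional trace that MHI-based recognizers exploit.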
TL;DR: A classification of existing data types, analytical methods, visualization techniques and tools, with a particular emphasis placed on surveying the evolution of visualization methodology over the past years is provided, and disadvantages of existing visualization methods are revealed.
Abstract: This paper provides a multi-disciplinary overview of the research issues and achievements in the field of Big Data and its visualization techniques and tools. The main aim is to summarize challenges in visualization methods for existing Big Data, as well as to offer novel solutions for issues related to the current state of Big Data Visualization. This paper provides a classification of existing data types, analytical methods, visualization techniques and tools, with a particular emphasis placed on surveying the evolution of visualization methodology over the past years. Based on the results, we reveal disadvantages of existing visualization methods. Despite the technological development of the modern world, human involvement (interaction), judgment and logical thinking remain necessary while working with Big Data. Therefore, the role of human perceptual limitations when working with large amounts of information is evaluated. Based on the results, a non-traditional approach is proposed: we discuss how the capabilities of Augmented Reality and Virtual Reality could be applied to the field of Big Data Visualization. We discuss the promising utility of Mixed Reality technology integration with applications in Big Data Visualization. Placing the most essential data in the central area of the human visual field in Mixed Reality would allow one to obtain the presented information in a short period of time without significant data losses due to human perceptual issues. Furthermore, we discuss the impacts of new technologies, such as Virtual Reality displays and Augmented Reality helmets, on Big Data visualization, as well as the classification of the main challenges of integrating the technology.
TL;DR: The main focus of this survey is the application of deep learning techniques to detecting the exact count, the persons involved and the activity taking place in a large crowd under all climate conditions.
Abstract: Big data applications are consuming most of the space in industry and research areas. Among the widespread examples of big data, the role of video streams from CCTV cameras is equally important as other sources like social media data, sensor data, agriculture data, medical data and data evolved from space research. Surveillance videos have a major contribution in unstructured big data. CCTV cameras are implemented in all places where security is of much importance. Manual surveillance is tedious and time-consuming. Security can be defined in different terms in different contexts, like theft identification, violence detection, chances of explosion etc. In crowded public places, the term security covers almost all types of abnormal events. Among them, violence detection is difficult to handle since it involves group activity. The anomalous or abnormal activity analysis in a crowd video scene is very difficult due to several real-world constraints. The paper includes a deep-rooted survey which starts from object recognition, action recognition and crowd analysis, and finally covers violence detection in a crowd environment. The majority of the papers reviewed in this survey are based on deep learning techniques. Various deep learning methods are compared in terms of their algorithms and models. The main focus of this survey is the application of deep learning techniques in detecting the exact count, the persons involved and the activity taking place in a large crowd under all climate conditions. The paper discusses the underlying deep learning implementation technology involved in various crowd video analysis methods. Real-time processing, an important issue which is yet to be explored further in this field, is also considered. Not many methods handle all these issues simultaneously. The issues recognized in existing methods are identified and summarized. Future directions are also given to reduce the obstacles identified.
The survey provides a bibliographic summary of papers from ScienceDirect, IEEE Xplore and ACM digital library.