scispace - formally typeset
Search or ask a question
Author

Ausif Mahmood

Bio: Ausif Mahmood is an academic researcher from University of Bridgeport. The author has contributed to research in topics: Facial recognition system & Deep learning. The author has an hindex of 17, co-authored 78 publications receiving 1554 citations.


Papers
More filters
Journal ArticleDOI
TL;DR: This paper reviews several optimization methods to improve the accuracy of the training and to reduce training time, and delve into the math behind training algorithms used in recent deep networks.
Abstract: Deep learning (DL) is playing an increasingly important role in our lives. It has already made a huge impact in areas, such as cancer diagnosis, precision medicine, self-driving cars, predictive forecasting, and speech recognition. The painstakingly handcrafted feature extractors used in traditional learning, classification, and pattern recognition systems are not scalable for large-sized data sets. In many cases, depending on the problem complexity, DL can also overcome the limitations of earlier shallow networks that prevented efficient training and abstractions of hierarchical representations of multi-dimensional training data. Deep neural network (DNN) uses multiple (deep) layers of units with highly optimized algorithms and architectures. This paper reviews several optimization methods to improve the accuracy of the training and to reduce training time. We delve into the math behind training algorithms used in recent deep networks. We describe current shortcomings, enhancements, and implementations. The review also covers different types of deep architectures, such as deep convolution networks, deep residual networks, recurrent neural networks, reinforcement learning, variational autoencoders, and others.

907 citations

Journal ArticleDOI
24 May 2017-Entropy
TL;DR: This framework introduces a new optimization objective function that combines the error rate and the information learnt by a set of feature maps using deconvolutional networks (deconvnet) and enhances the performance by guiding the CNN through better visualization of learnt features via deconvnet.
Abstract: Recent advances in Convolutional Neural Networks (CNNs) have obtained promising results in difficult deep learning tasks. However, the success of a CNN depends on finding an architecture to fit a given problem. A hand-crafted architecture is a challenging, time-consuming process that requires expert knowledge and effort, due to a large number of architectural design choices. In this article, we present an efficient framework that automatically designs a high-performing CNN architecture for a given problem. In this framework, we introduce a new optimization objective function that combines the error rate and the information learnt by a set of feature maps using deconvolutional networks (deconvnet). The new objective function allows the hyperparameters of the CNN architecture to be optimized in a way that enhances the performance by guiding the CNN through better visualization of learnt features via deconvnet. The actual optimization of the objective function is carried out via the Nelder-Mead Method (NMM). Further, our new objective function results in much faster convergence towards a better architecture. The proposed framework has the ability to explore a CNN architecture’s numerous design choices in an efficient way and also allows effective, distributed execution and synchronization via web services. Empirically, we demonstrate that the CNN architecture designed with our approach outperforms several existing approaches in terms of its error rate. Our results are also competitive with state-of-the-art results on the MNIST dataset and perform reasonably against the state-of-the-art results on CIFAR-10 and CIFAR-100 datasets. Our approach has a significant role in increasing the depth, reducing the size of strides, and constraining some convolutional layers not followed by pooling layers in order to find a CNN architecture that produces a high recognition performance.

203 citations

Journal ArticleDOI
TL;DR: This paper uses an unsupervised neural language model to train initial word embeddings that are further tuned by the authors' deep learning network, then, the pre-trained parameters of the network are used to initialize the model and a joint CNN and RNN framework is described to overcome the problem of loss of detailed, local information.
Abstract: As the amount of unstructured text data that humanity produces overall and on the Internet grows, so does the need to intelligently to process it and extract different types of knowledge from it. Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been applied to natural language processing systems with comparative, remarkable results. The CNN is a noble approach to extract higher level features that are invariant to local translation. However, it requires stacking multiple convolutional layers in order to capture long-term dependencies, due to the locality of the convolutional and pooling layers. In this paper, we describe a joint CNN and RNN framework to overcome this problem. Briefly, we use an unsupervised neural language model to train initial word embeddings that are further tuned by our deep learning network, then, the pre-trained parameters of the network are used to initialize the model. At a final stage, the proposed framework combines former information with a set of feature maps learned by a convolutional layer with long-term dependencies learned via long-short-term memory. Empirically, we show that our approach, with slight hyperparameter tuning and static vectors, achieves outstanding results on multiple sentiment analysis benchmarks. Our approach outperforms several existing approaches in term of accuracy; our results are also competitive with the state-of-the-art results on the Stanford Large Movie Review data set with 93.3% accuracy, and the Stanford Sentiment Treebank data set with 48.8% fine-grained and 89.2% binary accuracy, respectively. Our approach has a significant role in reducing the number of parameters and constructing the convolutional layer followed by the recurrent layer as a substitute for the pooling layer. Our results show that we were able to reduce the loss of detailed, local information and capture long-term dependencies with an efficient framework that has fewer parameters and a high level of performance.

179 citations

Journal ArticleDOI
TL;DR: A specialized deep convolutional neural network architecture for gait recognition that is less sensitive to several cases of the common variations and occlusions that affect and degrade gait Recognition performance.

113 citations

Proceedings ArticleDOI
24 Apr 2017
TL;DR: Empirical results show that ConvLstm achieved comparable performances with less parameters on sentiment analysis tasks, and exploit LSTM as a substitute of pooling layer in CNN to reduce the loss of detailed local information and capture long term dependencies in sequence of sentences.
Abstract: Unstructured text data produced on the internet grows rapidly, and sentiment analysis for short texts becomes a challenge because of the limit of the contextual information they usually contain. Learning good vector representations for sentences is a challenging task and an ongoing research area. Moreover, learning long-term dependencies with gradient descent is difficult in neural network language model because of the vanishing gradients problem. Natural Language Processing (NLP) systems traditionally treat words as discrete atomic symbols; the model can leverage small amounts of information regarding the relationship between the individual symbols. In this paper, we propose ConvLstm, neural network architecture that employs Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) on top of pre-trained word vectors. In our experiments, ConvLstm exploit LSTM as a substitute of pooling layer in CNN to reduce the loss of detailed local information and capture long term dependencies in sequence of sentences. We validate the proposed model on two sentiment datasets IMDB, and Stanford Sentiment Treebank (SSTb). Empirical results show that ConvLstm achieved comparable performances with less parameters on sentiment analysis tasks.

112 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: In this paper, a comprehensive survey of the most important aspects of DL and including those enhancements recently added to the field is provided, and the challenges and suggested solutions to help researchers understand the existing research gaps.
Abstract: In the last few years, the deep learning (DL) computing paradigm has been deemed the Gold Standard in the machine learning (ML) community. Moreover, it has gradually become the most widely used computational approach in the field of ML, thus achieving outstanding results on several complex cognitive tasks, matching or even beating those provided by human performance. One of the benefits of DL is the ability to learn massive amounts of data. The DL field has grown fast in the last few years and it has been extensively used to successfully address a wide range of traditional applications. More importantly, DL has outperformed well-known ML techniques in many domains, e.g., cybersecurity, natural language processing, bioinformatics, robotics and control, and medical information processing, among many others. Despite it has been contributed several works reviewing the State-of-the-Art on DL, all of them only tackled one aspect of the DL, which leads to an overall lack of knowledge about it. Therefore, in this contribution, we propose using a more holistic approach in order to provide a more suitable starting point from which to develop a full understanding of DL. Specifically, this review attempts to provide a more comprehensive survey of the most important aspects of DL and including those enhancements recently added to the field. In particular, this paper outlines the importance of DL, presents the types of DL techniques and networks. It then presents convolutional neural networks (CNNs) which the most utilized DL network type and describes the development of CNNs architectures together with their main features, e.g., starting with the AlexNet network and closing with the High-Resolution network (HR.Net). Finally, we further present the challenges and suggested solutions to help researchers understand the existing research gaps. It is followed by a list of the major DL applications. Computational tools including FPGA, GPU, and CPU are summarized along with a description of their influence on DL. The paper ends with the evolution matrix, benchmark datasets, and summary and conclusion.

1,084 citations

Journal ArticleDOI
TL;DR: This paper reviews several optimization methods to improve the accuracy of the training and to reduce training time, and delve into the math behind training algorithms used in recent deep networks.
Abstract: Deep learning (DL) is playing an increasingly important role in our lives. It has already made a huge impact in areas, such as cancer diagnosis, precision medicine, self-driving cars, predictive forecasting, and speech recognition. The painstakingly handcrafted feature extractors used in traditional learning, classification, and pattern recognition systems are not scalable for large-sized data sets. In many cases, depending on the problem complexity, DL can also overcome the limitations of earlier shallow networks that prevented efficient training and abstractions of hierarchical representations of multi-dimensional training data. Deep neural network (DNN) uses multiple (deep) layers of units with highly optimized algorithms and architectures. This paper reviews several optimization methods to improve the accuracy of the training and to reduce training time. We delve into the math behind training algorithms used in recent deep networks. We describe current shortcomings, enhancements, and implementations. The review also covers different types of deep architectures, such as deep convolution networks, deep residual networks, recurrent neural networks, reinforcement learning, variational autoencoders, and others.

907 citations

Journal ArticleDOI
TL;DR: In this paper, the authors analyzed the incompletely perceived link between Industry 4.0 and lean manufacturing, and investigated whether Industry4.0 is capable of implementing lean manufacturing and provided an important insight into manufacturers' dilemma as to whether they can commit into Industry 4-0, considering the investment required and unperceived benefits.
Abstract: Purpose: Lean Manufacturing is widely regarded as a potential methodology to improve productivity and decrease costs in manufacturing organisations. The success of lean manufacturing demands consistent and conscious efforts from the organisation, and has to overcome several hindrances. Industry 4.0 makes a factory smart by applying advanced information and communication systems and future-oriented technologies. This paper analyses the incompletely perceived link between Industry 4.0 and lean manufacturing, and investigates whether Industry 4.0 is capable of implementing lean. Executing Industry 4.0 is a cost-intensive operation, and is met with reluctance from several manufacturers. This research also provides an important insight into manufacturers’ dilemma as to whether they can commit into Industry 4.0, considering the investment required and unperceived benefits. Design/methodology/approach: Lean manufacturing is first defined and different dimensions of lean are presented. Then Industry 4.0 is defined followed by representing its current status in Germany. The barriers for implementation of lean are analysed from the perspective of integration of resources. Literatures associated with Industry 4.0 are studied and suitable solution principles are identified to solve the abovementioned barriers of implementing lean. Findings: It is identified that researches and publications in the field of Industry 4.0 held answers to overcome the barriers of implementation of lean manufacturing. These potential solution principles prove the hypothesis that Industry 4.0 is indeed capable of implementing lean. It uncovers the fact that committing into Industry 4.0 makes a factory lean besides being smart. Originality/value: Individual researches have been done in various technologies allied with Industry 4.0, but the potential to execute lean manufacturing was not completely perceived. This paper bridges the gap between these two realms, and identifies exactly which aspects of Industry 4.0 contribute towards respective dimensions of lean manufacturing.

566 citations

Journal ArticleDOI
01 Apr 2020-Symmetry
TL;DR: The main idea is to collect all the possible images for COVID-19 that exists until the writing of this research and use the GAN network to generate more images to help in the detection of this virus from the available X-rays images with the highest accuracy possible.
Abstract: The coronavirus (COVID-19) pandemic is putting healthcare systems across the world under unprecedented and increasing pressure according to the World Health Organization (WHO). With the advances in computer algorithms and especially Artificial Intelligence, the detection of this type of virus in the early stages will help in fast recovery and help in releasing the pressure off healthcare systems. In this paper, a GAN with deep transfer learning for coronavirus detection in chest X-ray images is presented. The lack of datasets for COVID-19 especially in chest X-rays images is the main motivation of this scientific study. The main idea is to collect all the possible images for COVID-19 that exists until the writing of this research and use the GAN network to generate more images to help in the detection of this virus from the available X-rays images with the highest accuracy possible. The dataset used in this research was collected from different sources and it is available for researchers to download and use it. The number of images in the collected dataset is 307 images for four different types of classes. The classes are the COVID-19, normal, pneumonia bacterial, and pneumonia virus. Three deep transfer models are selected in this research for investigation. The models are the Alexnet, Googlenet, and Restnet18. Those models are selected for investigation through this research as it contains a small number of layers on their architectures, this will result in reducing the complexity, the consumed memory and the execution time for the proposed model. Three case scenarios are tested through the paper, the first scenario includes four classes from the dataset, while the second scenario includes 3 classes and the third scenario includes two classes. All the scenarios include the COVID-19 class as it is the main target of this research to be detected. In the first scenario, the Googlenet is selected to be the main deep transfer model as it achieves 80.6% in testing accuracy. In the second scenario, the Alexnet is selected to be the main deep transfer model as it achieves 85.2% in testing accuracy, while in the third scenario which includes two classes (COVID-19, and normal), Googlenet is selected to be the main deep transfer model as it achieves 100% in testing accuracy and 99.9% in the validation accuracy. All the performance measurement strengthens the obtained results through the research.

391 citations

Journal ArticleDOI
TL;DR: This article aims to provide a comparative review of deep learning for aspect-based sentiment analysis to place different approaches in context.
Abstract: The increasing volume of user-generated content on the web has made sentiment analysis an important tool for the extraction of information about the human emotional state. A current research focus for sentiment analysis is the improvement of granularity at aspect level, representing two distinct aims: aspect extraction and sentiment classification of product reviews and sentiment classification of target-dependent tweets. Deep learning approaches have emerged as a prospect for achieving these aims with their ability to capture both syntactic and semantic features of text without requirements for high-level feature engineering, as is the case in earlier methods. In this article, we aim to provide a comparative review of deep learning for aspect-based sentiment analysis to place different approaches in context.

388 citations