Showing papers in "Multimedia Tools and Applications in 2020"
TL;DR: This review covers 135 proposals for serious games in immersive VR-environments that are combinations of both VR and serious games and that offer end-user validation, providing recommendations for the improvement of these tools and their successful application for the enhancement of both learning and training tasks.
Abstract: The merger of game-based approaches and Virtual Reality (VR) environments that can enhance learning and training methodologies has a very promising future, reinforced by the widespread market-availability of affordable software and hardware tools for VR-environments. Rather than passive observers, users engage in those learning environments as active participants, permitting the development of exploration-based learning paradigms. There are separate reviews of VR technologies and serious games for educational and training purposes with a focus on only one knowledge area. However, this review covers 135 proposals for serious games in immersive VR-environments that are combinations of both VR and serious games and that offer end-user validation. First, an analysis of the forum, nationality, and date of publication of the articles is conducted. Then, the application domains, the target audience, the design of the game and its technological implementation, the performance evaluation procedure, and the results are analyzed. The aim here is to identify the factual standards of the proposed solutions and the differences between training and learning applications. Finally, the study lays the basis for future research lines that will develop serious games in immersive VR-environments, providing recommendations for the improvement of these tools and their successful application for the enhancement of both learning and training tasks.
TL;DR: The empirical study quantified the increase in training time when dropout and batch normalization are used, as well as the increase in prediction time (important for constrained environments, such as smartphones and low-powered IoT devices), and showed that a non-adaptive optimizer can outperform adaptive optimizers, but only at the cost of a significant amount of training time to perform hyperparameter tuning.
Abstract: Overfitting and long training time are two fundamental challenges in multilayered neural network learning and deep learning in particular. Dropout and batch normalization are two well-recognized approaches to tackle these challenges. While both approaches share overlapping design principles, numerous research results have shown that they have unique strengths to improve deep learning. Many tools simplify these two approaches as a simple function call, allowing flexible stacking to form deep learning architectures. Although usage guidelines are available, unfortunately there is no well-defined set of rules or comprehensive study that investigates them with respect to data input, network configurations, learning efficiency, and accuracy. It is not clear when users should consider using dropout and/or batch normalization, and how they should be combined (or used alternatively) to achieve optimized deep learning outcomes. In this paper we conduct an empirical study to investigate the effect of dropout and batch normalization on training deep learning models. We use multilayered dense neural networks and convolutional neural networks (CNN) as the deep learning models, and mix dropout and batch normalization to design different architectures and subsequently observe their performance in terms of training and test CPU time, number of parameters in the model (as a proxy for model size), and classification accuracy. The interplay between network structures, dropout, and batch normalization allows us to conclude when and how dropout and batch normalization should be considered in deep learning. The empirical study quantified the increase in training time when dropout and batch normalization are used, as well as the increase in prediction time (important for constrained environments, such as smartphones and low-powered IoT devices). It showed that a non-adaptive optimizer (e.g. SGD) can outperform adaptive optimizers, but only at the cost of a significant amount of training time to perform hyperparameter tuning, while an adaptive optimizer (e.g. RMSProp) performs well without much tuning. Finally, it showed that dropout and batch normalization should be used in CNNs only with caution and experimentation (when in doubt and short on time to experiment, use only batch normalization).
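For readers unfamiliar with the two techniques compared in this abstract, the following is a minimal pure-Python sketch of what each one computes during training. It is illustrative only and is not the paper's experimental code: the function names `batch_norm` and `dropout` are hypothetical, the learnable scale and shift of batch normalization are omitted, and the "inverted dropout" scaling is an assumption about the variant used.

```python
import math
import random


def batch_norm(batch, eps=1e-5):
    """Normalize each feature (column) to zero mean and unit variance
    across the batch. The learnable scale/shift is omitted for brevity."""
    n = len(batch)
    columns = list(zip(*batch))
    means = [sum(col) / n for col in columns]
    variances = [
        sum((x - m) ** 2 for x in col) / n for col, m in zip(columns, means)
    ]
    return [
        [(x - m) / math.sqrt(v + eps) for x, m, v in zip(row, means, variances)]
        for row in batch
    ]


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each activation with probability `rate` and
    rescale the survivors so the expected value is unchanged."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)  # dropout is a no-op at prediction time
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in activations]
```

Both operations add per-batch work during training; batch normalization also remains active at prediction time, whereas dropout is a pass-through there.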
TL;DR: Most computer vision applications such as human computer interaction, virtual reality, security, video surveillance and home monitoring are highly correlated to HAR tasks, which establishes a new trend and milestone in the development cycle of HAR systems.
Abstract: Human activity recognition (HAR) systems attempt to automatically identify and analyze human activities using acquired information from various types of sensors. Although several extensive review papers have already been published on general HAR topics, the growing technologies in the field as well as the multi-disciplinary nature of HAR prompt the need for constant updates in the field. In this respect, this paper attempts to review and summarize the progress of HAR systems from the computer vision perspective. Indeed, most computer vision applications such as human computer interaction, virtual reality, security, video surveillance and home monitoring are highly correlated to HAR tasks. This establishes a new trend and milestone in the development cycle of HAR systems. Therefore, the current survey aims to provide the reader with an up-to-date analysis of vision-based HAR related literature and recent progress in the field. At the same time, it will highlight the main challenges and future directions.
TL;DR: This review paper focuses on the object detection algorithms based on deep convolutional neural networks, while the traditional object detection algorithms will be simply introduced as well.
Abstract: With the rapid development of deep learning techniques, deep convolutional neural networks (DCNNs) have become more important for object detection. Compared with traditional handcrafted feature-based methods, the deep learning-based object detection methods can learn both low-level and high-level image features. The image features learned through deep learning techniques are more representative than the handcrafted features. Therefore, this review paper focuses on the object detection algorithms based on deep convolutional neural networks, while the traditional object detection algorithms will be simply introduced as well. Through the review and analysis of deep learning-based object detection techniques in recent years, this work includes the following parts: backbone networks, loss functions and training strategies, classical object detection architectures, complex problems, datasets and evaluation metrics, applications and future development directions. We hope this review paper will be helpful for researchers in the field of object detection.
TL;DR: A novel model updating strategy is presented, and peak sidelobe ratio (PSR) and skewness are exploited to measure the comprehensive fluctuation of response map for efficient tracking performance.
Abstract: Robust and accurate visual tracking is a challenging problem in computer vision. In this paper, we exploit spatial and semantic convolutional features extracted from convolutional neural networks in continuous object tracking. The spatial features retain higher resolution for precise localization and semantic features capture more semantic information and less fine-grained spatial details. Therefore, we localize the target by fusing these different features, which improves the tracking accuracy. Besides, we construct the multi-scale pyramid correlation filter of the target and extract its spatial features. This filter determines the scale level effectively and tackles target scale estimation. Finally, we further present a novel model updating strategy, and exploit peak sidelobe ratio (PSR) and skewness to measure the comprehensive fluctuation of response map for efficient tracking performance. Each contribution above is validated on 50 image sequences of tracking benchmark OTB-2013. The experimental comparison shows that our algorithm performs favorably against 12 state-of-the-art trackers.
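The two response-map statistics named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the sidelobe is taken as everything outside a small square window around the peak (the `exclude` half-width is an assumed parameter), and how the paper combines the two values into an update decision is not specified here.

```python
import math


def peak_sidelobe_ratio(response, exclude=1):
    """PSR = (peak - mean(sidelobe)) / std(sidelobe), where the sidelobe is
    every cell outside a (2*exclude+1)-wide window centred on the peak."""
    rows, cols = len(response), len(response[0])
    peak, pr, pc = max(
        (response[r][c], r, c) for r in range(rows) for c in range(cols)
    )
    sidelobe = [
        response[r][c]
        for r in range(rows)
        for c in range(cols)
        if abs(r - pr) > exclude or abs(c - pc) > exclude
    ]
    if not sidelobe:
        raise ValueError("response map too small for the exclusion window")
    mean = sum(sidelobe) / len(sidelobe)
    std = math.sqrt(sum((v - mean) ** 2 for v in sidelobe) / len(sidelobe))
    return (peak - mean) / std if std > 0 else float("inf")


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0
```

A sharp, isolated peak gives a high PSR, while a flat or multi-peaked map gives a low one, so a tracker could plausibly skip model updates when the PSR drops (for example under occlusion).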
TL;DR: A security framework of healthcare multimedia data through blockchain technique is provided by generating the hash of each data so that any change or alteration in data or breaching of medicines may be reflected in entire blockchain network users.
Abstract: Through the propagation of technology in recent years, people communicate in a range of ways via multimedia. The use of multimedia techniques in healthcare systems also makes it possible to store, process and transfer patient data presented in a variety of forms, such as images, text and audio, online using various smart objects. Healthcare organizations around the world are transforming themselves into more efficient, coordinated and user-centered systems through various multimedia techniques. However, managing huge amounts of data, such as the reports and images of every person, increases human effort and security risks. To overcome these issues, IoT in healthcare enhances the quality of patient care and reduces cost by allocating medical resources efficiently. However, a number of threats initiated by various intruders can occur in IoT devices. Sometimes, for personal profit, doctors force patients to have lab tests done at, or buy medicines from, particular medical shops or pathology labs even though those organizations are not of good reputation. Therefore, security should be at the core of IoT solutions. To prevent these issues, blockchain technology has emerged as the best technique, providing secrecy and protection of the control system in real-time conditions. In this manuscript, we provide a security framework for healthcare multimedia data through the blockchain technique by generating the hash of each data item so that any change or alteration in data or breaching of medicines may be reflected to all users of the blockchain network. The results have been analyzed against the conventional approach and validated with improved simulated results that offer an 86% success rate over product drop ratio, falsification attack, wormhole attack and probabilistic authentication scenarios because of the blockchain technique.
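The core idea of hashing each data item so that alterations become visible can be sketched with a simple hash chain. This is an assumption-laden illustration, not the framework from the paper: the function names, the block layout, and the choice of SHA-256 are all hypothetical, and consensus and network distribution are left out entirely.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def record_hash(payload: bytes, prev_hash: str) -> str:
    """Hash a record together with the hash of the preceding block."""
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def build_chain(records):
    """Link records so each block commits to everything before it."""
    chain, prev = [], GENESIS
    for payload in records:
        digest = record_hash(payload, prev)
        chain.append({"data": payload, "prev": prev, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain) -> bool:
    """Recompute every hash; any altered record breaks the chain."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev:
            return False
        if record_hash(block["data"], prev) != block["hash"]:
            return False
        prev = block["hash"]
    return True
```

Because each hash depends on the previous one, changing a single stored image or report invalidates that block and every later block, which is what lets other participants notice the tampering.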
TL;DR: An attempt has been made to classify the MR images of healthy control and Parkinson's disease subjects using a deep learning neural network, with the Convolutional Neural Network architecture AlexNet used to refine the diagnosis of Parkinson’s disease.
Abstract: Parkinson’s disease is the second most common degenerative disease caused by loss of dopamine-producing neurons. The substantia nigra region is deprived of its neuronal functions, causing striatal dopamine deficiency, which remains a hallmark of Parkinson’s disease. Clinical diagnosis reveals a range of motor to non-motor symptoms in these patients. Magnetic Resonance (MR) Imaging is able to capture the structural changes in the brain due to dopamine deficiency in Parkinson’s disease subjects. In this work, an attempt has been made to classify the MR images of healthy control and Parkinson’s disease subjects using a deep learning neural network. The Convolutional Neural Network architecture AlexNet is used to refine the diagnosis of Parkinson’s disease. The transfer-learned network is trained on the MR images and tested to give the accuracy measures. An accuracy of 88.9% is achieved with the proposed system. Deep learning models are able to help clinicians in the diagnosis of Parkinson’s disease and yield an objective and better patient group classification in the near future.
TL;DR: This work uses feature extraction techniques such as Histogram of oriented Gradients (HoG), wavelet transform-based features, Local Binary Pattern (LBP), Scale Invariant Feature Transform (SIFT) and Zernike Moment to detect the cancerous lung nodules from the given input lung image and to classify the lung cancer and its severity.
Abstract: Lung cancer is one of the main reasons for death in the world among both men and women, with an impressive rate of about five million deadly cases per year. Computed Tomography (CT) scan can provide valuable information in the diagnosis of lung diseases. The main objective of this work is to detect the cancerous lung nodules from the given input lung image and to classify the lung cancer and its severity. To detect the location of the cancerous lung nodules, this work uses novel Deep learning methods. This work uses best feature extraction techniques such as Histogram of oriented Gradients (HoG), wavelet transform-based features, Local Binary Pattern (LBP), Scale Invariant Feature Transform (SIFT) and Zernike Moment. After extracting texture, geometric, volumetric and intensity features, Fuzzy Particle Swarm Optimization (FPSO) algorithm is applied for selecting the best feature. Finally, these features are classified using Deep learning. A novel FPSOCNN reduces computational complexity of CNN. An additional valuation is performed on another dataset coming from Arthi Scan Hospital which is a real-time data set. From the experimental results, it is shown that novel FPSOCNN performs better than other techniques.
TL;DR: The proposed BPR system, based on statistical dependencies between behaviors and respective signal data, has been used to extract statistical features along with acoustic signal features like zero crossing rate to maximize the possibility of getting optimal feature values.
Abstract: Human behavior pattern recognition (BPR) from accelerometer signals is a challenging problem due to variations in signal durations of different behaviors. Analysis of human behaviors provides in depth observations of subject’s routines, energy consumption and muscular stress. Such observations hold key importance for the athletes and physically ailing humans, who are highly sensitive to even minor injuries. A novel idea having variant of genetic algorithm is proposed in this paper to solve complex feature selection and classification problems using sensor data. The proposed BPR system, based on statistical dependencies between behaviors and respective signal data, has been used to extract statistical features along with acoustic signal features like zero crossing rate to maximize the possibility of getting optimal feature values. Then, reweighting of features is introduced in a feature selection phase to facilitate the segregation of behaviors. These reweighted features are further processed by biological operations of crossover and mutation to adapt varying signal patterns for significant accuracy results. Experiments on wearable sensors benchmark datasets HMP, WISDM and self-annotated IMSB datasets have been demonstrated to testify the efficacy of the proposed work over state-of-the-art methods.
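As a small illustration of the kind of per-window features mentioned in this abstract, the sketch below computes the zero crossing rate alongside two basic statistical features for one accelerometer axis. The function names and the exact feature set are assumptions for illustration; the paper's feature list, reweighting step and genetic operators are not reproduced.

```python
import math


def zero_crossing_rate(signal):
    """Fraction of consecutive sample pairs whose signs differ."""
    if len(signal) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(signal) - 1)


def window_features(signal):
    """Mean, standard deviation and zero crossing rate of one window."""
    n = len(signal)
    mean = sum(signal) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)
    # ZCR is computed on the mean-removed signal so that a constant offset
    # (e.g. gravity on an accelerometer axis) does not hide the oscillation.
    centred = [x - mean for x in signal]
    return {"mean": mean, "std": std, "zcr": zero_crossing_rate(centred)}
```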
TL;DR: A novel framework based on computer supported diagnosis and IoT to detect and monitor heart failure infected patients, where the data are attained from various other sources and the proposed model is validated by numerical examples on real case studies.
Abstract: In a developed society, people are more concerned about their health. Thus, the improvement of medical applications has been one of the most active study areas. Medical statistics show that heart disease is the main cause of morbidity and death in the world. The physician’s job is difficult because there are too many factors to analyze in the diagnosis of heart disease. Besides, the data and information gained by the physician for diagnosis are often partial and immersed. Recently, health care applications with the Internet of Things (IoT) have offered different dimensions and other online services. These applications have provided a new platform for millions of people to benefit from regular health tips for living a healthy life. In this paper, we propose a novel framework based on computer supported diagnosis and IoT to detect and monitor patients with heart failure, where the data are obtained from various other sources. The proposed healthcare system aims at obtaining better precision of diagnosis with ambiguous information. We suggest a neutrosophic multi criteria decision making (NMCDM) technique to help the patient and physician know whether the patient is suffering from heart failure. Furthermore, by dealing with the uncertainty of imprecision and vagueness resulting from the symmetrical priority scales of different symptoms of the disease, users learn how dangerous the disease is in their body. The proposed model is validated by numerical examples on real case studies. The experimental results indicate that the proposed system provides a viable solution that can work at wide range, a new platform for millions of people to benefit from decreased mortality and cost of clinical treatment related to heart failure.
TL;DR: An automated computer-aided diagnosis system for multi-class skin (MCS) cancer classification with an exceptionally high accuracy is proposed, which outperformed both expert dermatologists and contemporary deep learning methods for MCS cancer classification.
Abstract: Skin cancer accounts for one-third of all diagnosed cancers worldwide. The prevalence of skin cancers has been rising over the past decades. In recent years, the use of dermoscopy has enhanced the diagnostic capability for skin cancer. The accurate diagnosis of skin cancer is challenging for dermatologists, as multiple skin cancer types may appear similar. Dermatologists have an average accuracy of 62% to 80% in skin cancer diagnosis. The research community has made significant progress in developing automated tools to assist dermatologists in decision making. In this work, we propose an automated computer-aided diagnosis system for multi-class skin (MCS) cancer classification with an exceptionally high accuracy. The proposed method outperformed both expert dermatologists and contemporary deep learning methods for MCS cancer classification. We performed fine-tuning over seven classes of the HAM10000 dataset and conducted a comparative study to analyse the performance of five pre-trained convolutional neural networks (CNNs) and four ensemble models. A maximum accuracy of 93.20% for an individual model and a maximum accuracy of 92.83% for an ensemble model are reported in this paper. We propose the use of ResNeXt101 for MCS cancer classification owing to its optimized architecture and ability to gain higher accuracy.
TL;DR: A new fully automated scheme is proposed for Human action recognition by fusion of deep neural network (DNN) and multiview features and shows that the proposed scheme outperforms the state of the art methods.
Abstract: Human Action Recognition (HAR) has become one of the most active research areas in the domain of artificial intelligence, due to various applications such as video surveillance. The wide range of variations among human actions in daily life makes the recognition process more difficult. In this article, a new fully automated scheme is proposed for human action recognition by fusion of deep neural network (DNN) and multiview features. The DNN features are initially extracted by employing a pre-trained CNN model named VGG19. Subsequently, multiview features are computed from horizontal and vertical gradients, along with vertical directional features. Afterwards, all features are combined in order to select the best features. The best features are selected by employing three parameters, i.e. relative entropy, mutual information, and strong correlation coefficient (SCC). Furthermore, these parameters are used for selection of the best subset of features through a higher probability based threshold function. The final selected features are provided to a Naive Bayes classifier for final recognition. The proposed scheme is tested on five datasets, namely HMDB51, UCF Sports, YouTube, IXMAS, and KTH, and the achieved accuracies were 93.7%, 98%, 99.4%, 95.2%, and 97%, respectively. Lastly, the proposed method in this article is compared with existing techniques. The results show that the proposed scheme outperforms the state of the art methods.
TL;DR: A novel deep neural network model is introduced for identifying infected falciparum malaria parasite using transfer learning approach, which shows the potential of transfer learning in the field of medical image analysis, especially malaria diagnosis.
Abstract: Malaria is an infectious disease which is caused by plasmodium parasite. Several image processing and machine learning based techniques have been employed to diagnose malaria, using its spatial features extracted from microscopic images. In this work, a novel deep neural network model is introduced for identifying infected falciparum malaria parasite using transfer learning approach. This proposed transfer learning approach can be achieved by unifying existing Visual Geometry Group (VGG) network and Support Vector Machine (SVM). Implementation of this unification is carried out by using “Train top layers and freeze out rest of the layers” strategy. Here, the pre-trained VGG facilitates the role of expert learning model and SVM as domain specific learning model. Initial ‘k’ layers of pre-trained VGG are retained and (n-k) layers are replaced with SVM. To evaluate the proposed VGG-SVM model, a malaria digital corpus has been generated by acquiring blood smear images of infected and non-infected malaria patients and compared with state-of-the-art Convolutional Neural Network (CNN) models. Malaria digital corpus images were used to analyse the performance of VGG19-SVM, resulting in classification accuracy of 93.1% in identification of infected falciparum malaria. Unification of VGG19-SVM shows superiority over the existing CNN models in all performance indicators such as accuracy, sensitivity, specificity, precision and F-Score. The obtained result shows the potential of transfer learning in the field of medical image analysis, especially malaria diagnosis.
TL;DR: The experimental results and security analyses indicate that the proposed image encryption scheme not only has good encryption effect and is able to resist the known attacks, but also is sufficiently fast for practical applications.
Abstract: In this paper, we propose a novel image encryption scheme based on a hybrid model of DNA computing, chaotic systems and hash functions. The significant advantage of the proposed scheme is high efficiency. The proposed scheme consists of the DNA level permutation and diffusion. In the DNA level permutation, a mapping function based on the logistic map is applied on the DNA image to randomly change the position of elements in the DNA image. In the DNA level diffusion, not only do we define two new algebraic DNA operators, called the DNA left-circular shift and DNA right-circular shift, but we also use a variety of DNA operators to diffuse the permutated DNA image with the key DNA image. The experimental results and security analyses indicate that the proposed image encryption scheme not only has good encryption effect and is able to resist the known attacks, but also is sufficiently fast for practical applications. The MATLAB source code of the proposed image encryption scheme is publicly available at the URL: https://github.com/EbrahimZarei64/ImageEncryption
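The DNA-level permutation step can be illustrated with a short sketch: bytes are encoded as DNA bases, and a logistic-map sequence seeded by the key decides the new position of each base. Everything specific here is an assumption for illustration (the function names, the 00/01/10/11 to A/C/G/T encoding rule, the burn-in length, and sorting the chaotic sequence to obtain a permutation); the authors' MATLAB code should be consulted for the actual mapping function, and the paper's circular-shift operators are not reproduced.

```python
BASES = "ACGT"  # assumed encoding rule: 00->A, 01->C, 10->G, 11->T


def dna_encode(data: bytes) -> str:
    """Encode each byte as four DNA bases, most significant bits first."""
    return "".join(BASES[(b >> s) & 3] for b in data for s in (6, 4, 2, 0))


def logistic_sequence(x0, r, n, burn_in=100):
    """Iterate x <- r*x*(1-x), discarding the first `burn_in` values."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    out = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(x)
    return out


def key_permutation(x0, r, n):
    """Derive a permutation of range(n) by ranking the chaotic sequence."""
    seq = logistic_sequence(x0, r, n)
    return sorted(range(n), key=lambda i: seq[i])


def permute(items, perm):
    return [items[src] for src in perm]


def unpermute(items, perm):
    out = [None] * len(items)
    for dst, src in enumerate(perm):
        out[src] = items[dst]
    return out
```

The pair `(x0, r)` plays the role of the secret key: the receiver regenerates the same permutation from it and calls `unpermute` to restore the original base order.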
TL;DR: An industry-oriented Canonical Particle Swarm (CPS) optimization data delivery framework is introduced for multimedia routing in IIoT and its facilities during and/or after operational hours and results show that the proposed method is better.
Abstract: Industrial Internet of Things (IIoTs) is the fast growing network of interconnected things that collect and exchange data using embedded sensors planted everywhere. It is an interconnection of several things through a diverse communication system capable of monitoring, collecting, exchanging, analysing, and delivering valuable and challenging amount of information. Given their ability to be operated autonomously, their high mobility, and their communication and processing power, Unmanned Aerial Vehicles (UAVs) are expected to be involved in numerous IIoT-related applications, where multimedia and video streaming plays a key role. Our main interest is the multimedia routing in IIoT and its facilities during and/or after operational hours. For recovering, constructing and selecting k-disjoint paths, capable of putting up with failure of the parameters but satisfying the quality of service, we introduce an industry-oriented Canonical Particle Swarm (CPS) optimization data delivery framework. During communication with the UAV, multi-swarm strategy is used to determine the optimal direction while performing a multipath routing. Authenticity of the proposed approach has been tested and results show that, compared to the ordinary Canonical Particle Multi-path Swarm (CPMS) optimization and Fully Multi-path Particle Swarm (FMPS) optimization, the proposed method is better.
TL;DR: A novel local region model based on adaptive bilateral filter is presented for segmenting noisy images and is more efficient and robust to noise than the state-of-art region-based models.
Abstract: Image segmentation plays an important role in computer vision. However, it is extremely challenging due to low resolution, high noise and blurry boundaries. Recently, region-based models have been widely used to segment such images. The existing models often utilize Gaussian filtering to filter images, which causes the loss of edge gradient information. Accordingly, in this paper, a novel local region model based on an adaptive bilateral filter is presented for segmenting noisy images. Specifically, we first construct a range-based adaptive bilateral filter, which preserves the edge structures of an image well while resisting noise. Secondly, we present a data-driven energy model, which utilizes local information of regions centered at each pixel of the image to approximate intensities inside and outside of the circular contour. The estimation approach improves the accuracy of noisy image segmentation. Thirdly, under the premise of keeping the original shape of the image, a regularization function is used to accelerate convergence and smooth the segmentation contour. Experimental results on both synthetic and real images demonstrate that the proposed model is more efficient and robust to noise than the state-of-the-art region-based models.
TL;DR: This paper reviews and discusses existing emotion categorization models for emotion analysis and proposes methods that enhance existing emotion research.
Abstract: Sentiment analysis consists in the identification of the sentiment polarity associated with a target object, such as a book, a movie or a phone. Sentiments reflect feelings and attitudes, while emotions provide a finer characterization of the sentiments involved. With the huge number of comments generated daily on the Internet, besides sentiment analysis, emotion identification has drawn keen interest from different researchers, businessmen and politicians for polling public opinions and attitudes. This paper reviews and discusses existing emotion categorization models for emotion analysis and proposes methods that enhance existing emotion research. We carried out emotion analysis by inviting experts from different research areas to produce comprehensive results. Moreover, a computational emotion sensing model is proposed, and future improvements are discussed in this paper.
TL;DR: The performance of the proposed OPBS-SSHC method was found to be better than that of the other classification techniques considered for comparison, namely Relevance Vector Machine (RVM), Probabilistic Neural Network (PNN), and Support Vector Machine (SVM), on the metrics and coefficients used throughout this extensive comparative study.
Abstract: Liver cancer is among the types of cancer that cause death throughout the world. GLOBOCAN12 was an initiative for simultaneously estimating the expected prevalence, mortality and incidence of cancer across the globe. It reported about 782,000 new cases of liver cancer, with around 745,000 people losing their lives to this kind of disease worldwide. Some traditional algorithms were found to be widely used in liver segmentation. However, they had limitations, such as less effective segmentation outcomes, and tumor segmentation was very difficult to apply, especially for tumor regions of greater severity, which usually gave rise to high computational cost. The performance of those algorithms also needed to improve for diagnosing even the tiniest parts of the liver and for handling the misclassification of tumors near the liver boundaries. To address these issues, an efficient method is proposed in this work. Its main contribution lies in three stages, namely preprocessing, segmentation and classification. In preprocessing, noise is removed from the image and the edges of the input image are then sharpened using a frequency-based edge sharpening technique, which prepares the image pixels for the subsequent segmentation step. The segmentation phase takes the preprocessed images as input and segments them with the Outline Preservation Based Segmentation (OPBS) algorithm. Feature extraction algorithms are then applied to extract the corresponding features from each image, so the features present in the segmented image serve as the necessary information for classification. Next, the features are classified in the classification phase using a novel similarity search based hybrid classification technique. The Outline Preservation Based Segmentation and Search Based Hybrid Classification (OPBS-SSHC) method was evaluated on the 3D IR CAD dataset using parameters such as accuracy, precision, recall, and F-measure. Volumetric Overlap Error (VOE), Jaccard, Dice, and Kappa are determined to quantify the errors in the segmentation process. The performance of the proposed OPBS-SSHC method was found to be better than that of the other classification techniques considered for comparison, namely Relevance Vector Machine (RVM), Probabilistic Neural Network (PNN), and Support Vector Machine (SVM), on the above metrics and coefficients.
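Three of the segmentation measures named in this abstract have simple set-based definitions, sketched below for masks given as collections of voxel coordinates. The function name and the coordinate-set representation are assumptions for illustration; Kappa and the classification metrics are omitted.

```python
def overlap_metrics(predicted, truth):
    """Dice, Jaccard and Volumetric Overlap Error for two segmentations,
    each given as an iterable of voxel coordinates (e.g. (z, y, x) tuples)."""
    p, t = set(predicted), set(truth)
    intersection = len(p & t)
    union = len(p | t)
    if union == 0:  # both masks empty: treat as perfect agreement
        return {"dice": 1.0, "jaccard": 1.0, "voe": 0.0}
    jaccard = intersection / union
    dice = 2 * intersection / (len(p) + len(t))
    # VOE is the complement of the Jaccard index: 0 means perfect overlap.
    return {"dice": dice, "jaccard": jaccard, "voe": 1.0 - jaccard}
```

For example, two three-voxel masks that share two voxels have an intersection of 2 and a union of 4, giving Jaccard 0.5, Dice 2/3 and VOE 0.5.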
TL;DR: A decision based asymmetrically trimmed Winsorized median for the removal of salt and pepper noise in images and videos was found to exhibit excellent noise suppression capabilities by preserving the fine information of the image even at higher noise densities.
Abstract: A decision based asymmetrically trimmed Winsorized median for the removal of salt and pepper noise in images and videos is proposed. The proposed filter initially classifies the pixels as noisy and non-noisy and later replaces the noisy pixels with the asymmetrically trimmed modified Winsorized median, leaving the non-noisy pixels unaltered. Exhaustive experiments were conducted on a standard image database, and the performance of the proposed filter was evaluated both quantitatively and qualitatively against existing algorithms. The proposed algorithm was found to exhibit excellent noise suppression capabilities by preserving the fine information of the image even at higher noise densities. The performance of the proposed filter was also found to be good for videos.
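The decision-based structure described in this abstract (classify first, then replace only noisy pixels) can be sketched as below. Note the substitution: this sketch uses a plain trimmed median over a 3x3 window, not the paper's asymmetrically trimmed modified Winsorized median, whose exact definition is not given here. The function name, the window size, the 0/255 noise test and the mean fallback are all assumptions.

```python
import statistics


def denoise_salt_pepper(image, low=0, high=255):
    """Replace pixels equal to `low` or `high` (assumed noisy) with the median
    of the non-noisy pixels in their 3x3 neighbourhood; leave others as-is."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]  # work on a copy, read from the original
    for r in range(rows):
        for c in range(cols):
            if image[r][c] not in (low, high):
                continue  # decision step: pixel is treated as uncorrupted
            window = [
                image[i][j]
                for i in range(max(0, r - 1), min(rows, r + 2))
                for j in range(max(0, c - 1), min(cols, c + 2))
            ]
            # Trimming step: discard the extreme (noisy) values.
            trimmed = [v for v in window if v not in (low, high)]
            if trimmed:
                # median_low always returns a value present in the window.
                out[r][c] = statistics.median_low(trimmed)
            else:
                # Every neighbour is noisy: fall back to the window mean.
                out[r][c] = sum(window) // len(window)
    return out
```

Skipping uncorrupted pixels is what preserves fine detail: an ordinary median filter would smooth every pixel, whereas this one only touches those flagged as noise.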
TL;DR: A secure (n, n)-Multi-Secret-Sharing (MSS) scheme is proposed that uses an image scrambling algorithm based on a logistic chaotic sequence; the sequence is generated from a secret key retrieved from a spirograph, a geometric pattern drawn by the users with their private values.
Abstract: The Secret Sharing Scheme plays a vital role in cryptography, allowing secret digital information (image, video, audio, handwriting, etc.) to be transmitted over a communication channel. This cryptographic technique involves encrypting the secret images into noisy shares, which are then transmitted. The transmitted image shares are reconstructed using simple logical computation. In this paper, we propose a secure (n, n)-Multi-Secret-Sharing (MSS) scheme using an image scrambling algorithm based on a logistic chaotic sequence. The sequence is generated from a secret key retrieved from a spirograph, a geometric pattern drawn by the users with their private values. In addition, decomposition and recombination of image pixels are used to change both the positions and the values of the pixels. The experimental results on the standard metrics NPCR, UACI, entropy, and correlation coefficient demonstrate the robustness of the implemented algorithm.
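To illustrate the general idea of logistic-map scrambling mentioned in this abstract, the sketch below changes pixel positions with a permutation obtained by sorting a chaotic sequence, and changes pixel values by XOR with a keystream from the same sequence. The key (`x0`, `r`), the burn-in length, and all function names are assumptions; the spirograph-based key derivation and the share generation of the paper are not modelled, and the sketch has no claim to cryptographic strength.

```python
def logistic_sequence(x0, r, n, burn_in=100):
    """Iterate the logistic map x -> r * x * (1 - x) and return n values."""
    x = x0
    for _ in range(burn_in):  # discard the transient
        x = r * x * (1 - x)
    seq = []
    for _ in range(n):
        x = r * x * (1 - x)
        seq.append(x)
    return seq


def _permutation_and_keystream(n, x0, r):
    seq = logistic_sequence(x0, r, n)
    perm = sorted(range(n), key=seq.__getitem__)   # position scrambling
    keystream = [int(v * 256) % 256 for v in seq]  # value masking
    return perm, keystream


def scramble(pixels, x0, r=3.99):
    """Permute 8-bit pixel values and XOR them with the chaotic keystream."""
    perm, keystream = _permutation_and_keystream(len(pixels), x0, r)
    return [pixels[p] ^ k for p, k in zip(perm, keystream)]


def unscramble(cipher, x0, r=3.99):
    """Invert scramble() given the same key (x0, r)."""
    perm, keystream = _permutation_and_keystream(len(cipher), x0, r)
    out = [0] * len(cipher)
    for i, (p, k) in enumerate(zip(perm, keystream)):
        out[p] = cipher[i] ^ k
    return out


pixels = [12, 200, 7, 255, 0, 64, 33, 91]   # toy flattened image
cipher = scramble(pixels, x0=0.3141)
recovered = unscramble(cipher, x0=0.3141)   # equals pixels
```

Because both directions regenerate the same sequence from the key, only the key needs to be shared, not the permutation itself.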
TL;DR: A novel WSN-based Disaster Rescue Telemedicine Scheme is proposed to minimize energy consumption and maximize network lifetime; simulation results show that the proposed method extends the WSN topology lifetime significantly compared with earlier methods.
Abstract: A wireless sensor network (WSN) can be used to construct a telemedicine scheme that gathers patient data and extends medical facilities when a disaster occurs. The Remote Medical Monitoring (RMM) scheme for the disaster period can be constructed using a Health care center (CC), wireless sensor nodes, and a few Primary health care centers (PHC). The sensor nodes are capable of relaying communication between patients and PHCs. This type of WSN suffers from a limited lifetime because of limited battery energy and the transmission of large quantities of medical data. This paper proposes a novel WSN-based Disaster Rescue Telemedicine Scheme to minimize energy consumption and maximize network lifetime. The proposed method achieves this using three novel algorithms, namely ‘Network clustering using Non-border CH oriented Genetic algorithm, Fuzzy rules and Kernel FCM (NCNBGF)’, ‘High gain MDC algorithm (HGMDC)’ and ‘Critical node handling using job limiting and job shifting (CJLS)’. The principal techniques used in this paper are network node clustering, medical image compression, and critical-state node energy management, which together extend the lifetime of the WSN. The simulation results show that the proposed method extends the WSN topology lifetime significantly compared with earlier methods: the existing methods compared in this paper retain only 20% of their energy at round 80, whereas the proposed method retains 43%.
TL;DR: Experimental evaluation shows that using a combination of NSCT, RDWT, SVD and chaotic encryption makes the approach robust, imperceptible, secure and suitable for medical applications.
Abstract: In this paper, a chaos-based secure medical image watermarking approach is proposed. The method uses the non-subsampled contourlet transform (NSCT), redundant discrete wavelet transform (RDWT), and singular value decomposition (SVD) to provide significant improvement in imperceptibility and robustness. Further, the security of the approach is ensured by applying 2-D logistic map based chaotic encryption to the watermarked medical image. In our approach, the cover image is initially divided into sub-images and NSCT is applied to the sub-image having maximum entropy. Subsequently, RDWT is applied to the NSCT image and the singular vector of the RDWT coefficient is calculated. A similar procedure is followed for both watermark images. The singular values of both watermarks are embedded into the singular matrix of the cover. Experimental evaluation under attacks shows that the combination of NSCT, RDWT, SVD, and chaotic encryption makes the approach robust, imperceptible, secure, and suitable for medical applications.
TL;DR: The proposed WHITE STAG model and kernel sliding perceptron outperformed the existing well known statistical state-of-the-art methods by achieving a weighted average recognition rate of 87.48% over UT-interaction, 87.5% over BIT-Interaction and 85.7% over IM-IntensityInteractive7 datasets.
Abstract: To understand human-to-human interactions accurately, human interaction recognition (HIR) systems require robust feature extraction and selection methods based on vision sensors. In this paper, we propose the WHITE STAG model to wisely track human interactions using space-time methods as well as shape-based angular-geometric sequential approaches over full-body silhouettes and skeleton joints, respectively. After feature extraction, the feature space is reduced by employing codebook generation and linear discriminant analysis (LDA). Finally, a kernel sliding perceptron is used to recognize multiple classes of human interactions. The proposed WHITE STAG model is validated using two publicly available RGB datasets and one novel self-annotated intensity interactive dataset. For evaluation, four experiments are performed using leave-one-out and cross-validation testing schemes. Our WHITE STAG model and kernel sliding perceptron outperformed existing well-known statistical state-of-the-art methods, achieving a weighted average recognition rate of 87.48% on UT-Interaction, 87.5% on BIT-Interaction, and 85.7% on the proposed IM-IntensityInteractive7 dataset. The proposed system should be applicable to various multimedia content and security applications such as surveillance systems, video-based learning, medical futurists, service cobots, and interactive gaming.
TL;DR: This article presents a detailed discussion of different prospects of digital image watermarking and performance comparisons of the discussed techniques are presented in tabular format.
Abstract: This article presents a detailed discussion of different prospects of digital image watermarking. The discussion covers a brief comparison of similar information security techniques, the concept of the watermark embedding and extraction process, watermark characteristics and applications, common types of watermarking techniques, the major classification of watermarking attacks, and a brief summary of various secure watermarking techniques. Further, potential issues and some existing solutions are provided. Furthermore, performance comparisons of the discussed techniques are presented in tabular format. The authors believe that this article will serve as a catalyst for potential researchers to implement efficient watermarking systems.
TL;DR: In this paper, an effective computer-aided diagnosis (CAD) system is presented to detect MI signals using the convolution neural network (CNN) for urban healthcare in smart cities.
Abstract: One of the common cardiac disorders is a cardiac attack called Myocardial infarction (MI), which occurs due to the blockage of one or more coronary arteries. Timely treatment of MI is important, and even a slight delay can have severe consequences. The electrocardiogram (ECG) is the main diagnostic tool to monitor and reveal MI signals. The complex nature of MI signals, along with noise, makes accurate and quick diagnosis challenging for doctors. Manually studying large amounts of ECG data can be tedious and time-consuming. Therefore, there is a need for methods that automatically analyze the ECG data and make a diagnosis. A number of studies have addressed MI detection, but most of these methods are computationally expensive and face the problem of overfitting when dealing with real data. In this paper, an effective computer-aided diagnosis (CAD) system is presented to detect MI signals using the convolution neural network (CNN) for urban healthcare in smart cities. Two types of transfer learning techniques are employed to retrain the pre-trained VGG-Net (fine-tuning and VGG-Net as a fixed feature extractor), yielding two new networks, VGG-MI1 and VGG-MI2. In the VGG-MI1 model, the last layer of the VGG-Net model is replaced with a specific layer according to our requirements, and various functions are optimized to reduce overfitting. In the VGG-MI2 model, one layer of the VGG-Net model is selected as a feature descriptor of the ECG images to describe them with informative features. Considering the limited availability of data, the ECG data is augmented, which increased the classification performance. The standard, well-known Physikalisch-Technische Bundesanstalt (PTB) Diagnostic ECG database is used for the validation of the proposed framework. The experimental results show that the proposed framework achieves high accuracy that surpasses the existing methods. In terms of accuracy, sensitivity, and specificity, VGG-MI1 achieved 99.02%, 98.76%, and 99.17%, respectively, while VGG-MI2 achieved an accuracy of 99.22%, a sensitivity of 99.15%, and a specificity of 99.49%.
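For readers unfamiliar with the three figures reported here, the following sketch shows how accuracy, sensitivity, and specificity are conventionally derived from a binary confusion matrix. The function name is hypothetical and the counts in the example are made up for illustration; they are not taken from the paper.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts.

    tp/fn: diseased cases detected/missed; tn/fp: healthy cases
    correctly cleared/wrongly flagged.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    return accuracy, sensitivity, specificity


# Illustrative counts only: 100 MI records and 100 healthy records.
acc, sens, spec = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
```

Sensitivity and specificity are reported alongside accuracy because accuracy alone can look high on an imbalanced dataset even when one class is poorly detected.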
TL;DR: A novel Deep Convolutional Neural Network, DFU_QUTNet, is proposed for the automatic classification of normal skin (healthy skin) class versus abnormal skin (DFU) class, and outperformed the state-of-the-art CNN networks by achieving the F1-score of 94.5%.
Abstract: Diabetic Foot Ulcer (DFU) is the main complication of Diabetes, which, if not properly treated, may lead to amputation. One of the approaches to DFU treatment depends on the attentiveness of clinicians and patients. This treatment approach has drawbacks such as the high cost of the diagnosis as well as the length of treatment. Although this approach gives powerful results, the need for a remote, cost-effective, and convenient DFU diagnosis approach is urgent. In this paper, we introduce a new dataset of 754 foot images, which contain healthy skin and skin with a diabetic ulcer from different patients. A novel Deep Convolutional Neural Network, DFU_QUTNet, is also proposed for the automatic classification of normal skin (healthy skin) class versus abnormal skin (DFU) class. Stacking more layers onto a traditional Convolutional Neural Network to make it very deep does not lead to better performance; instead, it leads to worse performance because of gradient propagation problems. Therefore, our proposed DFU_QUTNet network is designed based on the idea of increasing the width of the network while keeping the depth compared to the state-of-the-art networks. Our network has been shown to be very beneficial for gradient propagation, as the error can be back-propagated through multiple paths. It also helps to combine different levels of features at each step of the network. Features extracted by the DFU_QUTNet network are used to train Support Vector Machine (SVM) and K-Nearest Neighbors (KNN) classifiers. For the sake of comparison, we have fine-tuned, re-trained, and tested three pre-trained deep learning networks (GoogleNet, VGG16, and AlexNet) for the same task. The proposed DFU_QUTNet network outperformed the state-of-the-art CNN networks by achieving an F1-score of 94.5%.
TL;DR: A deep learning strategy, the Convolutional Neural Network (CNN) model, is used to predict cancer in images of the pancreas; it is embedded with a Gaussian Mixture Model fitted with the EM algorithm to identify the essential features from the CT scan and to predict the percentage of cancer spread in the pancreas, with threshold parameters taken as markers.
Abstract: The extensive research on medical health systems gives computing systems ample scope to emerge with the latest innovations. These innovations are leading to efficient implementations of medical systems that support the automatic diagnosis of health-related problems. Some of the most important health research concerns cancer prediction; cancer takes different forms and can affect different parts of the body. One of the cancers most often predicted to be incurable is pancreatic cancer, which cannot be treated efficiently once identified; in most cases it is hard to detect because the pancreas lies in the abdominal region below the stomach. Therefore, advancements in medical research are trending towards automated systems that identify the stage of the cancer, if present, and support better diagnosis and treatment. Deep learning is one such area that has extended its research towards medical imaging, automating the diagnosis of patients' problems when coupled with machines such as CT/PET scan systems. In this paper, a deep learning strategy, the Convolutional Neural Network (CNN) model, is used to predict cancer in images of the pancreas; it is embedded with a Gaussian Mixture Model fitted with the EM algorithm to identify the essential features from the CT scan and to predict the percentage of cancer spread in the pancreas, with threshold parameters taken as markers. The experimentation is carried out on a dataset of pancreas CT scan images collected from The Cancer Imaging Archive (TCIA), consisting of approximately 19,000 images supported by the National Institutes of Health Clinical Center, to analyze the performance of the model.
TL;DR: A quadtree based approach to capture the spatial information of medical images for explaining nonlinear SVM prediction and finds ROIs which contain the discriminative regions behind the prediction.
Abstract: In this paper, we propose a quadtree based approach to capture the spatial information of medical images for explaining nonlinear SVM prediction. In medical image classification, interpretability becomes important for understanding why the adopted model works. Explaining an SVM prediction is difficult because the implicit mapping done in kernel classification is uninformative about the position of data points in the feature space and the nature of the separating hyperplane in the original space. The proposed method finds ROIs which contain the discriminative regions behind the prediction. Localization of the discriminative region in small boxes can help in interpreting the prediction by the SVM. Quadtree decomposition is applied recursively before applying SVMs to the sub-images, and the model-identified ROIs are highlighted. Pictorial results of experiments on various medical image datasets demonstrate the effectiveness of this approach. We validate the correctness of our method by applying occlusion methods.
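The recursive quadtree localization described in this abstract can be sketched roughly as below. The `score` callable is a hypothetical stand-in for whatever discriminative value a trained SVM would assign to a sub-image; here it is replaced by a simple intensity sum so the example runs on its own. The threshold, minimum box size, and function names are likewise assumptions.

```python
def quadtree_rois(x, y, size, score, min_size, threshold):
    """Return (x, y, size) boxes whose score exceeds the threshold.

    A box is split into four quadrants until min_size is reached; if a box
    is discriminative but none of its quadrants is, the box itself is kept.
    """
    if score(x, y, size) <= threshold:
        return []  # not discriminative: prune this branch
    if size <= min_size:
        return [(x, y, size)]
    half = size // 2
    rois = []
    for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
        rois += quadtree_rois(x + dx, y + dy, half, score, min_size, threshold)
    return rois or [(x, y, size)]


# Toy 8x8 "image" with a single salient pixel at column 5, row 6.
img = [[0] * 8 for _ in range(8)]
img[6][5] = 1


def region_sum(x, y, size):
    # Stand-in for an SVM decision value on the sub-image (assumption).
    return sum(img[j][i] for j in range(y, y + size) for i in range(x, x + size))


rois = quadtree_rois(0, 0, 8, region_sum, min_size=2, threshold=0)
print(rois)  # [(4, 6, 2)]
```

Pruning non-discriminative quadrants early keeps the number of classifier evaluations small compared with scoring every small box in the image.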
TL;DR: By introducing this kind of synthetic data into the training process, the overall accuracy of a state-of-the-art Convolutional/Deconvolutional Neural Network for melanoma skin lesion segmentation is increased.
Abstract: This paper presents a novel strategy that employs Generative Adversarial Networks (GANs) to augment data in the skin lesion segmentation task, which is a fundamental first step in the automated melanoma detection process. The proposed framework generates both skin lesion images and their segmentation masks, making the data augmentation process extremely straightforward. In order to thoroughly analyze how the quality and diversity of synthetic images impact the efficiency of the method, we remodel two different well-known GANs: a Deep Convolutional GAN (DCGAN) and a Laplacian GAN (LAPGAN). Experimental results reveal that, by introducing this kind of synthetic data into the training process, the overall accuracy of a state-of-the-art Convolutional/Deconvolutional Neural Network for melanoma skin lesion segmentation is increased.
TL;DR: This work proposes an ensemble of three classification models, namely CNN-Net, Encoded-Net, and CNN-LSTM, named EnsemConvNet, and compares it with some existing deep learning models such as Multi Headed CNN, hybrid of CNN, and Long Short Term Memory models.
Abstract: Human Activity Recognition (HAR) can be defined as the automatic prediction of the regular human activities performed in day-to-day life, such as walking, running, cooking, performing office work, etc. It is truly beneficial in the field of medical care services, for example, personal health care assistants, old-age care services, maintaining patient records for future help, etc. Input data to a HAR system can be (a) videos or still images capturing human activities, or (b) time-series data of human body movements during the activities, taken from sensors in smart devices such as accelerometers and gyroscopes. In this work, we mainly focus on the second category of input data. Here, we propose an ensemble of three classification models, namely CNN-Net, Encoded-Net, and CNN-LSTM, named EnsemConvNet. Each of these classification models is built upon a simple 1D Convolutional Neural Network (CNN) but differs in the number of dense layers and the kernel size used, along with other key differences in the architecture. Each model accepts the time-series data as a 2D matrix by taking a window of data at a time in order to infer information, which ultimately predicts the type of human activity. The classification outcome of the EnsemConvNet model is decided using various classifier combination methods that include majority voting, sum rule, product rule, and a score fusion approach called the adaptive weighted approach. Three benchmark datasets, namely WISDM activity prediction, UniMiB SHAR, and MobiAct, are used for evaluating our proposed model. We have compared our EnsemConvNet model with some existing deep learning models such as Multi Headed CNN, hybrid of CNN, and Long Short Term Memory (LSTM) models. The results obtained here establish the superiority of the EnsemConvNet model over the other mentioned models.
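Three of the classifier combination methods named in this abstract (majority voting, sum rule, product rule) have well-known textbook forms, sketched below for per-model class-probability vectors. The function names and the example scores are assumptions for illustration; the adaptive weighted approach is specific to the paper and is not reproduced here.

```python
from collections import Counter
from math import prod


def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def majority_vote(probs):
    """Each model votes for its top class; the most voted class wins."""
    votes = Counter(argmax(p) for p in probs)
    return votes.most_common(1)[0][0]


def sum_rule(probs):
    """Pick the class with the largest summed probability across models."""
    return argmax([sum(col) for col in zip(*probs)])


def product_rule(probs):
    """Pick the class with the largest product of probabilities."""
    return argmax([prod(col) for col in zip(*probs)])


# Made-up softmax outputs of three models over three activity classes.
scores = [
    [0.50, 0.40, 0.10],
    [0.10, 0.80, 0.10],
    [0.45, 0.40, 0.15],
]
print(majority_vote(scores), sum_rule(scores), product_rule(scores))  # 0 1 1
```

The example shows why the choice of rule matters: two models narrowly prefer class 0, so voting picks it, while the sum and product rules follow the one model that is highly confident in class 1.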