Showing papers in "Multimedia Tools and Applications in 2017"
TL;DR: A manga-specific image retrieval system that consists of efficient margin labeling, edge orientation histogram feature description with screen tone removal, and approximate nearest-neighbor search using product quantization is proposed.
Abstract: Manga (Japanese comics) are popular worldwide. However, current e-manga archives offer very limited search support, i.e., keyword-based search by title or author. To make the manga search experience more intuitive, efficient, and enjoyable, we propose a manga-specific image retrieval system. The proposed system consists of efficient margin labeling, edge orientation histogram feature description with screen tone removal, and approximate nearest-neighbor search using product quantization. For querying, the system provides a sketch-based interface. Based on the interface, two interactive reranking schemes are presented: relevance feedback and query retouch. For evaluation, we built a novel dataset of manga images, Manga109, which consists of 109 comic books of 21,142 pages drawn by professional manga artists. To the best of our knowledge, Manga109 is currently the biggest dataset of manga images available for research. Experimental results showed that the proposed framework is efficient and scalable (a query over 21,142 pages takes 70 ms on a single computer using 204 MB of RAM).
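The approximate nearest-neighbor step named above can be illustrated with a minimal product-quantization sketch; the dimensions, codebook size, and untrained random codebooks below are assumptions for illustration, not the paper's configuration.

```python
import numpy as np

# A minimal sketch of product quantization (PQ) for approximate
# nearest-neighbor search; codebooks here are random and untrained.
rng = np.random.default_rng(0)

D, M, K = 8, 2, 4          # vector dim, subspaces, centroids per subspace
codebooks = rng.normal(size=(M, K, D // M))   # toy codebooks (assumption)

def pq_encode(x):
    """Quantize each subvector to its nearest centroid index."""
    codes = []
    for m in range(M):
        sub = x[m * (D // M):(m + 1) * (D // M)]
        dists = np.linalg.norm(codebooks[m] - sub, axis=1)
        codes.append(int(np.argmin(dists)))
    return codes

def pq_asymmetric_distance(query, codes):
    """Approximate distance between a raw query and an encoded vector."""
    total = 0.0
    for m in range(M):
        sub = query[m * (D // M):(m + 1) * (D // M)]
        # Per-subspace lookup table of squared distances to all centroids.
        table = np.sum((codebooks[m] - sub) ** 2, axis=1)
        total += table[codes[m]]
    return np.sqrt(total)

x = rng.normal(size=D)
codes = pq_encode(x)
approx = pq_asymmetric_distance(x, codes)
```

In a real index, the per-subspace lookup tables are computed once per query, so scanning millions of codes costs only table lookups and additions.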
TL;DR: The saliency map is introduced to develop a new method in which the ROIs (regions of interest) of the secret image can be revealed progressively; to the best of the authors' knowledge, this is the first SSIS that employs a meaningful shadow.
Abstract: Scalable secret image sharing (SSIS) is a new secret image sharing technique. The feature of scalability refers to the fact that the revealed secret information is proportional to the number of gathered shadows. Once all of the valid shadows are collected, the complete secret can be revealed easily. The kernel of the secret information, however, may be leaked out with only a few shadows collected in the existing SSIS mechanisms, because researchers have seldom considered how the secret is distributed. Thus, we introduce the saliency map to develop a new method, in which the ROIs (regions of interest) of the secret image can be revealed progressively. Additionally, we introduce the concepts of meaningful shadows and verification to SSIS. To the best of our knowledge, this is the first SSIS that employs a meaningful shadow. The former concept greatly helps reduce the attention of attackers and thus enhances security, while the latter prevents malicious behavior by outside attackers or dishonest members.
TL;DR: The thrust of this survey is on the utilization of depth cameras and inertial sensors, as these two types of sensors are cost-effective, commercially available, and, more significantly, they both provide 3D human action data.
Abstract: A number of review or survey articles have previously appeared on human action recognition where either vision sensors or inertial sensors are used individually. Considering that each sensor modality has its own limitations, in a number of previously published papers, it has been shown that the fusion of vision and inertial sensor data improves the accuracy of recognition. This survey article provides an overview of the recent investigations where both vision and inertial sensors are used together and simultaneously to perform human action recognition more effectively. The thrust of this survey is on the utilization of depth cameras and inertial sensors as these two types of sensors are cost-effective, commercially available, and more significantly they both provide 3D human action data. An overview of the components necessary to achieve fusion of data from depth and inertial sensors is provided. In addition, a review of the publicly available datasets that include depth and inertial data which are simultaneously captured via depth and inertial sensors is presented.
TL;DR: This work employs an unsupervised method (MCODE) for recognizing physical activities from features extracted from the raw acceleration data collected by smartphone accelerometers, and finds that the method outperforms other existing methods.
Abstract: The development of smartphones equipped with accelerometers offers a promising way for researchers to accurately recognize an individual's physical activity in order to better understand the relationship between physical activity and health. However, a huge challenge for such sensor-based activity recognition tasks is the collection of annotated or labelled training data. In this work, we employ an unsupervised method for recognizing physical activities using smartphone accelerometers. Features are extracted from the raw acceleration data collected by smartphones, and an unsupervised classification method called MCODE is then used for activity recognition. We evaluate the effectiveness of our method on three real-world datasets, i.e., a public dataset of daily living activities and two self-collected datasets of sports activities (race walking and basketball playing), and find that our method outperforms other existing methods. The results show that our method is viable for recognizing physical activities using smartphone accelerometers.
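The feature-extraction step described above can be sketched as simple windowed statistics over a 3-axis accelerometer trace; the window length and the feature set below are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np

# A minimal sketch of window-based feature extraction from raw 3-axis
# accelerometer data; window/step sizes and features are assumptions.
def extract_features(signal, window=128, step=64):
    """signal: (n_samples, 3) array; returns one feature row per window."""
    rows = []
    for start in range(0, len(signal) - window + 1, step):
        w = signal[start:start + window]
        feats = np.concatenate([
            w.mean(axis=0),            # per-axis mean
            w.std(axis=0),             # per-axis standard deviation
            (w ** 2).mean(axis=0),     # per-axis energy
        ])
        rows.append(feats)
    return np.array(rows)

rng = np.random.default_rng(1)
acc = rng.normal(size=(512, 3))        # synthetic accelerometer trace
features = extract_features(acc)       # rows feed the (unsupervised) classifier
```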
TL;DR: The experiments performed on a contactless palmprint database confirm that dual-source DPA, which is designed for multi-instance palmprint feature fusion recognition, outperforms single-source DPA.
Abstract: Due to the benefits of palmprint recognition and the advantages of biometric fusion systems, it is necessary to study multi-source palmprint fusion systems. Unfortunately, research on multi-instance palmprint feature fusion has been absent until now. In this paper, we extract the features of left and right palmprints with the two-dimensional discrete cosine transform (2DDCT) to constitute a dual-source space. Normalization is utilized in the dual-source space to avoid the disturbance caused by coefficients with large absolute values. Thus, complicated pre-masking is unnecessary, and the arbitrary removal of discriminative coefficients is avoided. Since more discriminative coefficients can be preserved and retrieved with discrimination power analysis (DPA) from the dual-source space, the accuracy performance is improved. The experiments performed on a contactless palmprint database confirm that dual-source DPA, which is designed for multi-instance palmprint feature fusion recognition, outperforms single-source DPA.
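The 2DDCT feature extraction the abstract relies on can be sketched with an orthonormal DCT-II; the 8 × 8 block and the synthetic data are illustrative, not a real palmprint.

```python
import numpy as np

# A minimal sketch of the 2D discrete cosine transform (2DDCT) used to
# build the dual-source feature space; input data here is synthetic.
def dct_matrix(n):
    """Orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0, :] *= 1.0 / np.sqrt(2)
    return C * np.sqrt(2.0 / n)

def dct2(block):
    """Separable 2D DCT: transform rows, then columns."""
    C = dct_matrix(block.shape[0])
    return C @ block @ C.T

rng = np.random.default_rng(2)
palm = rng.normal(size=(8, 8))         # stand-in for a palmprint block
coeffs = dct2(palm)

# The orthonormal transform preserves energy (Parseval), so selecting
# large-magnitude coefficients keeps most of the discriminative content.
energy_in = np.sum(palm ** 2)
energy_out = np.sum(coeffs ** 2)
```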
TL;DR: Deep learning for stock prediction has been introduced in this paper and its performance is evaluated on Google stock price multimedia data from NASDAQ.
Abstract: The stock market is considered chaotic, complex, volatile and dynamic. Undoubtedly, its prediction is one of the most challenging tasks in time series forecasting. Moreover, existing Artificial Neural Network (ANN) approaches fail to provide encouraging results. Meanwhile, advances in machine learning have presented favourable results for speech recognition, image classification and language processing. Methods applied in digital signal processing can be applied to stock data, as both are time series. Similarly, the learning outcome of this paper can be applied to speech time series data. Deep learning for stock prediction is introduced in this paper and its performance is evaluated on Google stock price multimedia data (chart) from NASDAQ. The objective of this paper is to demonstrate that deep learning can improve stock market forecasting accuracy. For this, the (2D)2PCA + Deep Neural Network (DNN) method is compared with the state-of-the-art method 2-Directional 2-Dimensional Principal Component Analysis ((2D)2PCA) + Radial Basis Function Neural Network (RBFNN). It is found that the proposed method performs better than the existing RBFNN method, improving hit-rate accuracy by 4.8% with a window size of 20. The results of the proposed model are also compared with a Recurrent Neural Network (RNN), and the accuracy for hit rate is improved by 15.6%. The correlation coefficient between the actual and predicted return for DNN is 17.1% higher than for RBFNN and 43.4% higher than for RNN.
TL;DR: A new method of separable data hiding in encrypted images is proposed using CS and the discrete Fourier transform, which takes full advantage of both real and imaginary coefficients to ensure good recovery and provide a flexible payload.
Abstract: Reversible data hiding in encrypted images has become an effective and popular way to preserve the security and privacy of users’ personal images. Recently, Xiao et al. first presented reversible data hiding in encrypted images using the modern signal processing technique of compressive sensing (CS). However, the quality of the decrypted image is not high enough. In this paper, a new method of separable data hiding in encrypted images is proposed using CS and the discrete Fourier transform, which takes full advantage of both real and imaginary coefficients to ensure good recovery and provide a flexible payload. Compared with the original work, the proposed method obtains better image quality at the same embedding capacity. Furthermore, image decryption and data extraction are separable in the proposed method, and the secret data can be extracted relatively accurately.
TL;DR: This paper presents automatic vehicle detection and recognition: Haar-like features and AdaBoost algorithms locate the vehicle, while a Gabor wavelet transform and a local binary pattern operator extract multi-scale, multi-orientation vehicle features that account for outside interference on the image and the random position of the vehicle.
Abstract: Vehicle detection and type recognition based on static images is highly practical and directly applicable to various operations in a traffic surveillance system. This paper introduces the processing of automatic vehicle detection and recognition. First, Haar-like features and AdaBoost algorithms are applied for feature extraction and classifier construction, which are used to locate the vehicle in the input image. Then, the Gabor wavelet transform and a local binary pattern operator are used to extract multi-scale and multi-orientation vehicle features, accounting for outside interference on the image and the random position of the vehicle. Finally, the image is divided into small regions, from which histogram sequences are extracted and concatenated to represent the vehicle features. Principal component analysis is adopted to obtain a low-dimensional histogram feature, which is used to measure the similarity of different vehicles in Euclidean space, and nearest-neighbor classification is exploited for the final decision. Experiments show that our detection rate is over 97 %, with a false rate of only 3 %, and that the vehicle recognition rate is over 91 %, while maintaining a fast processing time. This exhibits promising potential for implementation in real-world applications.
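The local binary pattern operator mentioned above can be sketched in its basic 3 × 3 form; the neighbor ordering below is an assumption, as implementations vary.

```python
import numpy as np

# A minimal sketch of the basic 3x3 local binary pattern (LBP) operator:
# threshold the 8 neighbors against the center pixel and read them as one
# byte. Clockwise-from-top-left ordering is an assumption.
def lbp_code(patch):
    """patch: 3x3 array; returns the 8-bit LBP code of its center."""
    center = patch[1, 1]
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    code = 0
    for bit, v in enumerate(neighbors):
        if v >= center:
            code |= 1 << bit
    return code

patch = np.array([[9, 1, 8],
                  [2, 5, 7],
                  [3, 6, 4]])
code = lbp_code(patch)
```

Histograms of these codes over small regions, concatenated across regions and scales, give the texture feature the abstract describes.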
TL;DR: The developed encryption algorithm has a higher avalanche effect (for instance, AES in the proposed system has an avalanche effect of 52.50 %); therefore, such a system is able to secure multimedia big data against real-time attacks.
Abstract: Nowadays, multimedia is considered to be the biggest big data as it dominates the traffic in the Internet and mobile phones. Currently, symmetric encryption algorithms are used in IoT, but when considering multimedia big data in IoT, symmetric encryption algorithms incur more computational cost. In this paper, we have designed and developed a resource-efficient encryption system for encrypting multimedia big data in IoT. The proposed system takes advantage of the Feistel encryption scheme, the Advanced Encryption Standard (AES), and genetic algorithms. To satisfy high throughput, the GPU has also been used in the proposed system. This system is evaluated on real IoT medical multimedia data against benchmark encryption algorithms such as MARS, RC6, 3-DES, DES, and Blowfish in terms of computational running time and throughput for both encryption and decryption processes, as well as the avalanche effect. The results show that the proposed system has the lowest running time, the highest throughput for both encryption and decryption processes, and the highest avalanche effect compared with the existing encryption algorithms. To satisfy the security objective, the developed algorithm has a better avalanche effect than any of the other existing algorithms and hence can be incorporated into the encryption/decryption of any plain multimedia big data. It is also shown that the classical and modern ciphers have a much lower avalanche effect and hence cannot be used for encryption of confidential multimedia messages or confidential big data. The developed encryption algorithm has a higher avalanche effect; for instance, AES in the proposed system has an avalanche effect of 52.50 %. Therefore, such a system is able to secure multimedia big data against real-time attacks.
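The avalanche effect cited above can be measured by flipping one plaintext bit and counting how many ciphertext bits change; here a SHA-256 digest stands in for the proposed cipher, purely for illustration.

```python
import hashlib

# A minimal sketch of measuring the avalanche effect. A hash function is
# used as a stand-in for the encryption algorithm (an assumption made
# only so the sketch is self-contained).
def bit_diff_percent(a: bytes, b: bytes) -> float:
    """Percentage of differing bits between two equal-length byte strings."""
    diff = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return 100.0 * diff / (8 * len(a))

plain = bytearray(b"multimedia big data block")
c1 = hashlib.sha256(bytes(plain)).digest()
plain[0] ^= 0x01                       # flip a single plaintext bit
c2 = hashlib.sha256(bytes(plain)).digest()

avalanche = bit_diff_percent(c1, c2)   # a strong cipher targets ~50 %
```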
TL;DR: By extracting primary additional error values, a novel fast fractal encoding method is presented, and it is found that the distribution of these values distinguishes different parts of images.
Abstract: Today, fractal image encoding is an effective lossy compression method for multimedia that is independent of resolution, but its drawback is high computational complexity. Many approximate methods have therefore been proposed to decrease the computation time, so the distribution of error points is worth investigating. In this paper, by extracting primary additional error values, we first present a novel fast fractal encoding method. Then, from the extracted primary additional error values, we characterize the distribution of these values. We find that different distributions of values correspond to different parts of images. Finally, we analyze the experimental results and find some properties of these values. The experimental results also show the effectiveness of the method.
TL;DR: Robustness of the scheme is better than existing schemes for a similar set of medical images in terms of normalized correlation coefficient (NCC) and bit error rate (BER), and performance comparison with existing schemes shows the proposed scheme has better robustness against different types of attacks.
Abstract: In this paper, a blind image watermarking scheme based on discrete wavelet transform (DWT) and singular value decomposition (SVD) is proposed. In this scheme, DWT is applied on the ROI (region of interest) of the medical image to get different frequency subbands of its wavelet decomposition. On the low-frequency subband LL of the ROI, block-SVD is applied to get different singular matrices. A pair of elements with similar values is identified from the left singular value matrix of these selected blocks. The values of these pairs are modified using a certain threshold to embed a bit of watermark content. An appropriate threshold is chosen to achieve the imperceptibility and robustness of the medical image and watermark contents, respectively. For authentication and identification of the original medical image, one watermark image (logo) and one text watermark have been used. The watermark image provides authentication, whereas the text data represents the electronic patient record (EPR) for identification. At the receiving end, blind recovery of both watermark contents is performed by a comparison scheme similar to the one used during the embedding process. The proposed algorithm is applied on various groups of medical images like X-ray, CT scan and mammography. This scheme offers better visibility of the watermarked image and recovery of watermark content due to the DWT-SVD combination. Moreover, the use of the Hamming error correcting code (ECC) on EPR text bits reduces the BER and thus provides better recovery of the EPR. The performance of the proposed algorithm with EPR data coded by the Hamming code is compared with the BCH error correcting code, and it is found that the latter performs better. Result analysis shows that the imperceptibility of the watermarked image is good, as PSNR is above 43 dB and WPSNR is above 52 dB for all sets of images.
In addition, the robustness of the scheme is better than existing schemes for a similar set of medical images in terms of normalized correlation coefficient (NCC) and bit error rate (BER). An analysis is also carried out to verify the performance of the proposed scheme for different sizes of watermark content (image and EPR data). It is observed from the analysis that the proposed scheme is also appropriate for watermarking of color images. Using the proposed scheme, watermark contents are extracted successfully under various attacks like JPEG compression, filtering, Gaussian noise, salt-and-pepper noise, cropping and rotation. Performance comparison with existing schemes shows that the proposed scheme has better robustness against different types of attacks. Moreover, the proposed scheme is also robust under the set of benchmark attacks known as checkmark attacks.
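The two robustness metrics used above, NCC and BER, can be sketched directly; the watermark and the simulated channel errors below are synthetic.

```python
import numpy as np

# A minimal sketch of normalized correlation coefficient (NCC) between an
# original and an extracted watermark, and bit-error-rate (BER) for the
# EPR bit stream; data is synthetic, with a few simulated bit errors.
def ncc(w, w_ext):
    """Assumes both inputs are nonzero arrays of the same shape."""
    return float(np.sum(w * w_ext) /
                 np.sqrt(np.sum(w ** 2) * np.sum(w_ext ** 2)))

def ber(bits, bits_ext):
    return float(np.mean(bits != bits_ext))

rng = np.random.default_rng(3)
wm = rng.integers(0, 2, size=(32, 32)).astype(float)   # binary watermark
noisy = wm.copy()
flips = rng.integers(0, wm.size, size=20)              # simulated errors
noisy.flat[flips] = 1.0 - noisy.flat[flips]

score_ncc = ncc(wm, noisy)               # near 1.0 for a good extraction
score_ber = ber(wm.astype(int), noisy.astype(int))     # near 0.0
```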
TL;DR: Comparison results vis-à-vis payload and robustness show that the proposed techniques perform better than some existing state-of-the-art techniques and could be useful for e-healthcare systems.
Abstract: Electronic transmission of medical images is one of the primary requirements in a typical Electronic-Healthcare (E-Healthcare) system. However, this transmission could be exposed to attackers who may modify the whole medical image or only a part of it during transit. To guarantee the integrity of a medical image, digital watermarking is used. This paper presents two different watermarking algorithms for medical images in the transform domain. In the first technique, a digital watermark and an Electronic Patient Record (EPR) have been embedded in both regions: the Region of Interest (ROI) and the Region of Non-Interest (RONI). In the second technique, the Region of Interest (ROI) is kept untouched for tele-diagnosis purposes, and the Region of Non-Interest (RONI) is used to hide the digital watermark and EPR. In both algorithms, an 8 × 8 block-based Discrete Cosine Transform (DCT) has been used. In each 8 × 8 block, two DCT coefficients are selected and their magnitudes are compared for embedding the watermark/EPR. The selected coefficients are modified using a threshold to embed a bit `0' or bit `1' of the watermark/EPR. The proposed techniques have been found robust not only to singular attacks but also to hybrid attacks. Comparison results vis-à-vis payload and robustness show that the proposed techniques perform better than some existing state-of-the-art techniques. As such, the proposed algorithms could be useful for e-healthcare systems.
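The coefficient-comparison embedding described above can be sketched on a single pair of DCT coefficients; the threshold value and the swap-then-separate logic are illustrative assumptions, not the paper's exact rule.

```python
# A minimal sketch of coefficient-comparison embedding: in each block,
# two selected DCT coefficients are compared, and their order encodes one
# bit; the threshold keeps the order stable under mild attacks.
def embed_bit(c1, c2, bit, threshold=5.0):
    """Return an adjusted (c1, c2) whose order encodes `bit`."""
    if bit == 1 and c1 <= c2:
        c1, c2 = c2, c1                 # ensure c1 > c2 for bit 1
    if bit == 0 and c1 >= c2:
        c1, c2 = c2, c1                 # ensure c1 < c2 for bit 0
    # Push the pair apart so mild distortion cannot flip the order.
    if abs(c1 - c2) < threshold:
        mid = (c1 + c2) / 2.0
        half = threshold / 2.0
        c1, c2 = (mid + half, mid - half) if bit == 1 else (mid - half, mid + half)
    return c1, c2

def extract_bit(c1, c2):
    return 1 if c1 > c2 else 0

c1, c2 = embed_bit(3.0, 4.0, 1)   # embed a `1' into one coefficient pair
bit = extract_bit(c1, c2)
```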
TL;DR: Wang et al. propose a change detection method based on an improved MRF, in which linear weights divide the difference image into unchanged, uncertain and changed pixels, and a spatial attraction model refines the spatial neighborhood relations, aiming to enhance the accuracy of spatial information in the MRF.
Abstract: Fixed weights between the center pixel and neighboring pixels are used in the traditional Markov random field for change detection, which easily causes the overuse of spatial neighborhood information. Besides, the traditional label field cannot accurately identify the spatial relations between neighborhood pixels. To solve these problems, this study proposes a change detection method based on an improved MRF. Linear weights are designed for dividing the difference image into unchanged, uncertain and changed pixels, and a spatial attraction model is introduced to refine the spatial neighborhood relations, which aims to enhance the accuracy of spatial information in the MRF. The experimental results indicate that the proposed method can effectively enhance the accuracy of change detection.
TL;DR: A novel set of features based on the Quaternion Wavelet Transform (QWT) is proposed for digital image forensics, providing more valuable information to distinguish photographic images from computer generated (CG) images.
Abstract: In this paper, a novel set of features based on the Quaternion Wavelet Transform (QWT) is proposed for digital image forensics. Compared with the Discrete Wavelet Transform (DWT) and the Contourlet Wavelet Transform (CWT), QWT produces the parameters, i.e., one magnitude and three angles, which provide more valuable information to distinguish photographic (PG) images from computer generated (CG) images. Theoretical analysis is presented and comparative experiments are made. The corresponding results show that the proposed scheme achieves an 18 % improvement in detection accuracy over Farid’s scheme and 12 % over Ozparlak’s scheme. To our knowledge, this is the first time QWT has been introduced to image forensics, and the improvements are encouraging.
TL;DR: Simulation results and security analysis show that the new image encryption algorithm based on Brownian motion and a new 1D chaotic system has a large key space, high key sensitivity, and strong resistance to statistical and differential attacks, giving it high security and important practical applications in image transmission and image encryption.
Abstract: In this paper, a new image encryption algorithm based on Brownian motion and a new 1D chaotic system is introduced. Firstly, the SHA-256 hash value of the plain image is used to generate the initial values and system parameters of the chaotic systems for the confusion and diffusion processes. Then, the 8 bitplanes of the plain image are scrambled based on Brownian motion, and the position and value of all pixels are changed simultaneously. After the confusion process, a two-directional diffusion process is carried out, made up of row diffusion (RD) and column diffusion (CD). The whole process can be repeated for many rounds in order to get a better encryption effect. Simulation results and security analysis show that our scheme has a large key space, high sensitivity to the key, and strong resistance to statistical and differential attacks. So, it has high security and important practical applications in image transmission and image encryption.
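The keying step described above can be sketched as follows: the SHA-256 digest of the plain image seeds a logistic map whose keystream drives a simple XOR diffusion. The map parameters and the XOR step are illustrative stand-ins for the paper's Brownian-motion design, not a reproduction of it.

```python
import hashlib
import numpy as np

# A minimal sketch: plain-image hash -> chaotic initial value ->
# keystream -> reversible diffusion. Parameters are assumptions.
def logistic_stream(x0, n, r=3.99):
    """Iterate x -> r*x*(1-x) and quantize each state to a byte."""
    xs, x = [], x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(int(x * 256) % 256)
    return np.array(xs, dtype=np.uint8)

img = np.arange(64, dtype=np.uint8).reshape(8, 8)      # toy "plain image"
digest = hashlib.sha256(img.tobytes()).digest()
x0 = (int.from_bytes(digest[:8], "big") % (10 ** 8)) / (10 ** 8 + 1.0)

stream = logistic_stream(0.1 + 0.8 * x0, img.size).reshape(8, 8)
cipher = img ^ stream                                   # diffusion by XOR
recovered = cipher ^ stream                             # XOR is involutive
```

Because the keystream is derived from the plain image's hash, any change to the plaintext changes the whole keystream, which is what gives such schemes their differential-attack resistance.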
TL;DR: A predictive modeling framework to understand consumer choice towards E-commerce products in terms of "likes" and "dislikes" by analyzing EEG signals is proposed, and the framework can be used for a better business model.
Abstract: Marketing and promotion of various consumer products through advertisement campaigns is a well-known practice to increase sales and awareness amongst consumers. This essentially leads to increased profit for a manufacturing unit. Re-production of products usually depends on various factors, including consumption in the market, reviewers' comments, ratings, etc. However, knowing consumer preference for decision making and predicting behavior for effective utilization of a product using unconscious processes is called "Neuromarketing". This field is emerging fast due to its inherent potential. Therefore, research work in this direction is in high demand, yet has not reached a satisfactory level. In this paper, we propose a predictive modeling framework to understand consumer choice towards E-commerce products in terms of "likes" and "dislikes" by analyzing EEG signals. The EEG signals of volunteers of varying age and gender were recorded while they browsed through various consumer products. The experiments were performed on a dataset comprising various consumer products. The accuracy of choice prediction was recorded using a user-independent testing approach with the help of a Hidden Markov Model (HMM) classifier. We have observed that the prediction results are promising and the framework can be used for a better business model.
TL;DR: The method has been extensively tested and analyzed against known attacks and is found to give superior performance in robustness and capacity, with reduced storage and bandwidth requirements, compared to techniques reported by other authors.
Abstract: This paper presents a new robust hybrid multiple watermarking technique using a fusion of discrete wavelet transform (DWT), discrete cosine transform (DCT), and singular value decomposition (SVD), instead of applying DWT, DCT and SVD individually or in DWT-SVD / DCT-SVD combinations. For identity authentication purposes, multiple watermarks are embedded into the same medical image / multimedia object simultaneously, which provides an extra level of security with acceptable performance in terms of robustness and imperceptibility. In the embedding process, the cover image is decomposed by a first-level discrete wavelet transform, where the A (approximation/lower-frequency) sub-band is transformed by DCT and SVD. The watermark image is also transformed by DWT, DCT and SVD. The S vector of the watermark information is embedded in the S component of the cover image. The watermarked image is generated by inverse SVD on the modified S vector and original U, V vectors, followed by inverse DCT and inverse DWT. The watermark is extracted using an extraction algorithm. Furthermore, the text watermark is embedded at the second level of the D (diagonal) sub-band of the cover image. The security of the text watermark, considered as EPR (Electronic Patient Record) data, is enhanced by using an encryption method before embedding into the cover. The results are obtained by varying the gain factor, the size of the text watermark, and the cover medical images. The method has been extensively tested and analyzed against known attacks and is found to give superior performance in robustness and capacity, with reduced storage and bandwidth requirements, compared to techniques reported by other authors.
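The singular-value embedding step described above can be sketched as blending the watermark's S vector into the cover's S vector with a gain factor; the gain value and block size below are assumptions, and the DWT/DCT stages are omitted for brevity.

```python
import numpy as np

# A minimal sketch of S-vector embedding: embed the watermark's singular
# values into the cover's, rebuild by inverse SVD, then extract given the
# original S vector and the gain factor (a non-blind sketch).
rng = np.random.default_rng(4)
cover = rng.normal(size=(8, 8))
wmark = rng.normal(size=(8, 8))

U, S, Vt = np.linalg.svd(cover)
Sw = np.linalg.svd(wmark, compute_uv=False)

alpha = 0.05                               # gain factor (assumption)
S_marked = S + alpha * Sw                  # embed watermark singular values
watermarked = U @ np.diag(S_marked) @ Vt   # inverse SVD

# Extraction: recover the watermark's singular values.
S_extracted = (np.linalg.svd(watermarked, compute_uv=False) - S) / alpha
```

The extraction works because `watermarked` has exactly `S_marked` as its singular values (both summands are sorted in descending order, so their sum is too).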
TL;DR: This paper introduces the concept of multimedia protection into a method based on role access control and adopts a scheme based on the combination of multimedia data state and role access control that can be used to resist known attacks.
Abstract: For the issues of large space requirements and storage security of multimedia files, we analyzed the impact of access control and cloud storage on multimedia files, and proposed a mixed security cloud storage framework based on the Internet of Things. This paper introduces the concept of multimedia protection into the method based on role access control. Moreover, we also adopt a scheme based on the combination of multimedia data state and role access control. At the same time, all input and output devices are connected to this system. The Internet of Things is used to judge whether circuits are connected and whether the devices are operating normally, so as to improve access efficiency. On this basis, we also describe in detail the complete process of registration, role assignment, the multimedia file owner's request for data encryption, and user login and access to multimedia files. According to the results, this scheme can be used to resist known attacks and guarantees the security of multimedia files.
TL;DR: PWLCM and a logistic map are applied to generate all the parameters the presented algorithm needs, with DNA encoding technology functioning as an auxiliary tool; the algorithm is capable of withstanding typical attacks and has good security properties.
Abstract: In this paper, we propose a novel and effective image encryption algorithm based on chaos and DNA encoding rules. The Piecewise Linear Chaotic Map (PWLCM) and the logistic map are applied to generate all the parameters the presented algorithm needs, and DNA encoding technology functions as an auxiliary tool. The proposed algorithm consists of the following parts: firstly, PWLCM is used to produce a key image, whose pixels are generated by chaos; secondly, the plain image and the key image are encoded with DNA rules row by row, with different rows encoded according to various rules decided by the logistic map; after that, the encoded key image is used to conduct DNA operations with the encoded plain image row by row to obtain an intermediate image, with the specific operation for every row chosen by the logistic map; then, the intermediate image is decoded and used as the plain image of the next step; finally, the steps above are repeated by columns to get the ultimate cipher image. The experimental results and analysis indicate that the proposed algorithm is capable of withstanding typical attacks and has good security properties.
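The DNA encoding step can be sketched with the common eight complement-consistent rule tables; the rule indices below are an assumption (in the scheme described above, the rule for each row would be selected by the logistic map).

```python
# A minimal sketch of DNA encoding for image pixels: each 2-bit pair maps
# to a base under one of the eight rules in which complementary bit pairs
# map to complementary bases (A-T, C-G). The table order is an assumption.
RULES = [
    "ACGT", "AGCT", "CATG", "CTAG", "GATC", "GTAC", "TCGA", "TGCA",
]

def dna_encode(pixel, rule):
    """Encode one 8-bit pixel as four DNA bases, MSB pair first."""
    bases = []
    for shift in (6, 4, 2, 0):
        bases.append(RULES[rule][(pixel >> shift) & 0b11])
    return "".join(bases)

def dna_decode(bases, rule):
    pixel = 0
    for b in bases:
        pixel = (pixel << 2) | RULES[rule].index(b)
    return pixel

encoded = dna_encode(0b10011100, rule=0)   # bit pairs 10, 01, 11, 00
decoded = dna_decode(encoded, rule=0)
```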
TL;DR: This work presents the first exhaustive evaluation of today’s state-of-the-art algorithms for splicing localization, that is, algorithms attempting to detect which pixels in an image have been tampered with as the result of such a forgery.
Abstract: With the proliferation of smartphones and social media, journalistic practices are increasingly dependent on information and images contributed by local bystanders through Internet-based applications and platforms. Verifying the images produced by these sources is integral to forming accurate news reports, given that there is very little or no control over the type of user-contributed content, and hence, images found on the Web are always likely to be the result of image tampering. In particular, image splicing, i.e. the process of taking an area from one image and placing it in another is a typical such tampering practice, often used with the goal of misinforming or manipulating Internet users. Currently, the localization of splicing traces in images found on the Web is a challenging task. In this work, we present the first, to our knowledge, exhaustive evaluation of today's state-of-the-art algorithms for splicing localization, that is, algorithms attempting to detect which pixels in an image have been tampered with as the result of such a forgery. As our aim is the application of splicing localization on images found on the Web and social media environments, we evaluate a large number of algorithms aimed at this problem on datasets that match this use case, while also evaluating algorithm robustness in the face of image degradation due to JPEG recompressions. We then extend our evaluations to a large dataset we formed by collecting real-world forgeries that have circulated the Web during the past years. We review the performance of the implemented algorithms and attempt to draw broader conclusions with respect to the robustness of splicing localization algorithms for application in Web environments, their current weaknesses, and the future of the field. Finally, we openly share the framework and the corresponding algorithm implementations to allow for further evaluations and experimentation.
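Pixel-level splicing-localization evaluation of the kind described above typically scores a predicted tamper mask against the ground-truth mask; a minimal F1 sketch on synthetic masks follows (real evaluations also sweep detection thresholds and aggregate over whole datasets).

```python
import numpy as np

# A minimal sketch of pixel-level evaluation for splicing localization:
# compare a predicted binary tamper mask against ground truth with F1.
def f1_score(gt, pred):
    tp = np.sum(gt & pred)
    fp = np.sum(~gt & pred)
    fn = np.sum(gt & ~pred)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

gt = np.zeros((16, 16), dtype=bool)
gt[4:12, 4:12] = True                  # ground-truth spliced region
pred = np.zeros((16, 16), dtype=bool)
pred[6:14, 6:14] = True                # detector's prediction, offset

score = f1_score(gt, pred)
```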
TL;DR: The term serious storytelling is introduced as a new potential media genre – defining serious storytelling as storytelling with a purpose beyond entertainment – and several application areas are predicted, including wellbeing and health, medicine, psychology, education and online communication.
Abstract: In human culture, storytelling is a long-established tradition. The reasons people tell stories are manifold: to entertain, to transfer knowledge between generations, to maintain cultural heritage, or to warn others of dangers. With the emergence of the digitisation of media, many new possibilities to tell stories in serious and non-entertainment contexts emerged. A very simple example is the idea of serious gaming, that is, digital games without the primary purpose of entertainment. In this paper, we introduce the term serious storytelling as a new potential media genre --- defining serious storytelling as storytelling with a purpose beyond entertainment. We also put forward a review of existing potential application areas, and develop a framework for serious storytelling. We foresee several application areas for this fundamental concept, including wellbeing and health, medicine, psychology, education, ethical problem solving, e-leadership and management, qualitative journalism, serious digital games, simulations and virtual training, user experience studies, and online communication.
TL;DR: The experimental results demonstrate that the proposed DWT-SVD and DCT with Arnold Cat Map encryption based robust and blind watermarking scheme is robust, imperceptible and secure to several attacks and common signal processing operations.
Abstract: In this article, a new DWT-SVD and DCT with Arnold Cat Map encryption based robust and blind watermarking scheme is proposed for copyright protection. The proposed scheme solves the most frequently occurring watermarking security problems in Singular Value Decomposition (SVD) based schemes, namely unauthorized reading and false-positive detection. This scheme also optimizes fidelity and robustness characteristics. The grey-scale watermark image is split into two parts using the four MSBs and the four LSBs of each pixel. Discrete Cosine Transform (DCT) coefficients of these MSB and LSB values are embedded into the middle singular value of each 4 × 4 block of the host image's one-level Discrete Wavelet Transform (DWT) sub-bands. The reason for incorporating the Arnold Cat Map in the proposed scheme is to encode the watermark image before embedding it in the host image. The proposed scheme is blind and does not require the choice of a scaling factor. Thus, the proposed scheme is secure as well as free from the false-positive detection problem. The proposed watermarking scheme is tested for various malicious and non-malicious attacks. The experimental results demonstrate that the scheme is robust, imperceptible and secure against several attacks and common signal processing operations.
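The Arnold Cat Map scrambling step is well defined independently of the rest of the scheme. A minimal sketch in Python, assuming a square grey-scale image stored as a NumPy array (the iteration count can play the role of a secret key; this is an illustration, not the paper's implementation):

```python
import numpy as np

def arnold_cat_map(img, iterations=1):
    """Scramble a square image with the Arnold cat map.

    Each pixel at (x, y) moves to ((x + y) mod N, (x + 2y) mod N).
    The transform matrix has determinant 1, so the map is a bijection
    and is periodic: applying it enough times restores the image.
    """
    n = img.shape[0]
    assert img.shape[0] == img.shape[1], "Arnold cat map needs a square image"
    out = img
    for _ in range(iterations):
        scrambled = np.empty_like(out)
        for x in range(n):
            for y in range(n):
                scrambled[(x + y) % n, (x + 2 * y) % n] = out[x, y]
        out = scrambled
    return out
```

Because the map is periodic, descrambling can be done by simply iterating forward until the period completes, which is why it is a convenient lightweight encryption step before embedding.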
TL;DR: Results show that the proposed algorithm is secure against the powerful universal steganalyzer “ensemble classifier” and the histogram attack, and robust against different image processing attacks such as compression, added noise, and cropping.
Abstract: This paper presents a new efficient embedding algorithm in the wavelet domain of digital images based on the diamond encoding (DE) scheme. Current discrete wavelet transform (DWT) steganography adds an unacceptable distortion to the images and is considered ineffective in terms of security. Applying the DE scheme to current DWT steganographic methods solves these problems, reduces the distortion added to the images, and thus improves the embedding efficiency. The proposed algorithm first converts the secret image into a sequence of base-5 digits. After that, the cover image is transformed into the DWT domain and segmented into 2 × 1 coefficient pairs. The DE scheme is then used to change at most one coefficient of each pair to embed the base-5 digits. Experimental results show that the proposed algorithm is more efficient in embedding than other methods in terms of embedding payload and image quality. Moreover, the proposed algorithm is attacked by well-known steganalysis software. Results show that the proposed algorithm is secure against the powerful universal steganalyzer “ensemble classifier” and the histogram attack. The results also reveal that the proposed algorithm is robust against different image processing attacks such as compression, added noise, and cropping attacks.
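The diamond encoding step for the five-ary case described above can be sketched as follows. The characteristic function f(x, y) = (3x + y) mod 5 follows the standard diamond-encoding formulation for k = 1; the base-5 conversion helper is a simplified stand-in for the paper's secret-image conversion:

```python
def bytes_to_base5(data):
    """Convert a byte string into a list of base-5 digits (simplified)."""
    n = int.from_bytes(data, "big")
    digits = []
    while n:
        n, r = divmod(n, 5)
        digits.append(r)
    return digits[::-1] or [0]

def de_value(x, y):
    # Diamond characteristic value for k = 1: f(x, y) = (3x + y) mod 5
    return (3 * x + y) % 5

def de_embed(x, y, digit):
    """Embed one base-5 digit into a coefficient pair (x, y).

    At most one of the two coefficients changes, and by at most 1, because
    the diamond neighbourhood {(x,y), (x±1,y), (x,y±1)} covers all five
    residues of f.
    """
    for dx, dy in [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]:
        if de_value(x + dx, y + dy) == digit:
            return x + dx, y + dy
    raise AssertionError("unreachable: the neighbourhood covers all residues")

def de_extract(x, y):
    """Recover the embedded base-5 digit from a coefficient pair."""
    return de_value(x, y)
```

The ±1 bound on the modification of a single coefficient per pair is what keeps the added distortion low compared to direct LSB-style changes in the wavelet domain.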
TL;DR: This paper systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D-simultaneous localization and mapping, and pose estimation to guide researchers in the selection of suitable datasets for evaluating their algorithms.
Abstract: RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It takes the advantages of the color image that provides appearance information of an object and also the depth image that is immune to the variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance to benchmark the state-of-the-art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D-simultaneous localization and mapping, and pose estimation. We provide the insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description about the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.
TL;DR: Discriminant information was introduced into SPP to arrive at a novel supervised feature extraction method named the Uncorrelated Discriminant SPP (UDSPP) algorithm, which can effectively express discriminant information while preserving the local neighbour relationship.
Abstract: Feature extraction has always been an important step in face recognition, the quality of which directly determines the recognition result. Building on the advantages of Sparse Preserving Projection (SPP) for feature extraction, discriminant information is introduced into SPP to arrive at a novel supervised feature extraction method named the Uncorrelated Discriminant SPP (UDSPP) algorithm. The projection obtained by sparse-preserving the intra-class structure while maximizing the inter-class distance can effectively express discriminant information while preserving the local neighbour relationship. Moreover, a statistically uncorrelated constraint is added to decrease redundancy among feature vectors, so as to obtain as much information as possible with as few vectors as possible. The experimental results show that the recognition rate is improved compared with SPP. The method is also superior to recognition methods based on the Euclidean distance when processing face databases under varying illumination.
TL;DR: A new method for the recognition of facial expressions from single image frame that uses combination of appearance and geometric features with support vector machines classification and has been validated on publicly available extended Cohn-Kanade (CK+) facial expression data sets.
Abstract: Facial expressions are one of the most powerful, natural and immediate means for human beings to communicate their emotions and intentions. Recognition of facial expressions has many applications, including human-computer interaction, cognitive science, human emotion analysis, and personality development. In this paper, we propose a new method for the recognition of facial expressions from a single image frame that uses a combination of appearance and geometric features with support vector machine classification. In general, appearance features for the recognition of facial expressions are computed by dividing the face region into a regular grid (holistic representation). In this paper, however, we extract region-specific appearance features by dividing the whole face region into domain-specific local regions. Geometric features are also extracted from the corresponding domain-specific regions. In addition, important local regions are determined by an incremental search approach, which reduces the feature dimension and improves recognition accuracy. The results of facial expression recognition using features from domain-specific regions are also compared with the results obtained using the holistic representation. The performance of the proposed facial expression recognition system has been validated on the publicly available extended Cohn-Kanade (CK+) facial expression data sets.
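The incremental search over local regions lends itself to a greedy forward-selection sketch. The `evaluate` callback (e.g. a cross-validation accuracy from an SVM trained on the selected regions' features) and the region identifiers below are assumptions for illustration, not the paper's exact procedure:

```python
def incremental_region_search(regions, evaluate):
    """Greedy forward selection of face regions.

    `regions` is a list of region identifiers; `evaluate(subset)` returns a
    recognition-accuracy estimate for the feature vector built from that
    subset. Regions are added one at a time while accuracy keeps improving,
    which both reduces the feature dimension and discards uninformative
    regions.
    """
    selected, best = [], float("-inf")
    remaining = list(regions)
    while remaining:
        cand = max(remaining, key=lambda r: evaluate(selected + [r]))
        score = evaluate(selected + [cand])
        if score <= best:
            break  # no remaining region improves accuracy any further
        selected.append(cand)
        best = score
        remaining.remove(cand)
    return selected, best
```

With an SVM-based `evaluate`, the selected subset is exactly the set of domain-specific regions whose combined appearance and geometric features maximize the validation accuracy.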
TL;DR: The architecture presented is effective and reduces the proliferation of information, and it is suggested that a person suffering from any of the diseases mentioned can defer the onset of complications by doing regular physical activities.
Abstract: Diabetes, blood pressure, heart, and kidney diseases, some of the most common across the world, are termed 'silent killers'. More than 50 % of the world's population are affected by these diseases, and if suitable steps are not taken during the early stages, severe complications occur. In this work, we discuss how an Internet-of-Things based, Cloud-centric architecture is used for predictive analysis of the physical activities of users in sustainable health centers. The proposed architecture relies on sensors embedded in the exercise equipment, rather than wearable or smartphone sensors, to store the values of basic health-related parameters. The Cloud-centric architecture is composed of a Cloud data center, a public cloud, and a private cloud, and uses XML Web services for secure and fast communication of information. The architecture is evaluated for its adoption, prediction analysis of physical activities, efficiency, and security. The results show that the overall response time between the local database server and the Cloud data center remains almost constant as the number of users rises. For prediction analysis, if the results collected in real time for the analysis of physical activities exceed any of the defined threshold values, an alert is sent to the health care personnel. Security analysis also shows effective encryption and decryption of information. The architecture presented is effective and reduces the proliferation of information. It is also suggested that a person suffering from any of the diseases mentioned above can defer the onset of complications by doing regular physical activities.
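The threshold-based alerting rule can be sketched as below; the parameter names and limit values are hypothetical, since the abstract does not list the actual thresholds used:

```python
# Hypothetical per-parameter limits (low, high); the paper's actual
# threshold values are not given in the abstract.
THRESHOLDS = {
    "heart_rate": (50, 120),    # beats per minute
    "systolic_bp": (90, 140),   # mmHg
    "glucose": (70, 180),       # mg/dL
}

def check_readings(readings):
    """Return the parameters whose real-time values fall outside their limits.

    A non-empty result would trigger an alert to health care personnel.
    """
    alerts = []
    for name, value in readings.items():
        low, high = THRESHOLDS[name]
        if not (low <= value <= high):
            alerts.append(name)
    return alerts
```

In the described architecture this check would run against values streamed from the equipment's embedded sensors into the Cloud data center.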
TL;DR: The quantitative and qualitative experimental results indicate that the proposed framework maintains a better balance between image quality and security, achieving a reasonable payload with relatively less computational complexity, which confirms its effectiveness compared to other state-of-the-art techniques.
Abstract: Information hiding is an active area of research where secret information is embedded in innocent-looking carriers such as images and videos, hiding its existence while maintaining their visual quality. Researchers have presented various image steganographic techniques over the last decade, focusing on payload and image quality. However, there is a trade-off between these two metrics, and keeping a better balance between them is still a challenging issue. In addition, the existing methods fail to achieve better security due to direct embedding of secret data inside images without encryption consideration, making data extraction relatively easy for adversaries. Therefore, in this work, we propose a secure image steganographic framework based on a stego key-directed adaptive least significant bit (SKA-LSB) substitution method and multi-level cryptography. In the proposed scheme, the stego key is encrypted using a two-level encryption algorithm (TLEA); the secret data is encrypted using a multi-level encryption algorithm (MLEA), and the encrypted information is then embedded in the host image using an adaptive LSB substitution method, depending on the secret key, red channel, MLEA, and sensitive contents. The quantitative and qualitative experimental results indicate that the proposed framework maintains a better balance between image quality and security, achieving a reasonable payload with relatively less computational complexity, which confirms its effectiveness compared to other state-of-the-art techniques.
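A greatly simplified sketch of key-directed LSB substitution is shown below. The PRNG keystream stands in for the paper's TLEA/MLEA encryption and the flat pixel list stands in for the red channel, so this illustrates the general idea rather than the SKA-LSB method itself:

```python
import random

def embed_lsb(pixels, bits, key):
    """Embed payload bits into pixel LSBs at key-directed positions.

    The stego key seeds a PRNG that both permutes the embedding positions
    and generates a keystream XORed with the payload bits, so neither the
    locations nor the plain bits are exposed without the key.
    """
    rng = random.Random(key)
    positions = rng.sample(range(len(pixels)), len(bits))
    stream = [rng.getrandbits(1) for _ in bits]
    out = list(pixels)
    for pos, b, s in zip(positions, bits, stream):
        out[pos] = (out[pos] & ~1) | (b ^ s)
    return out

def extract_lsb(pixels, n_bits, key):
    """Recover the payload bits; requires the same key and payload length."""
    rng = random.Random(key)
    positions = rng.sample(range(len(pixels)), n_bits)
    stream = [rng.getrandbits(1) for _ in range(n_bits)]
    return [(pixels[p] & 1) ^ s for p, s in zip(positions, stream)]
```

Because extraction replays the same PRNG sequence, an adversary without the key sees neither which pixels carry data nor the decrypted bit values.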
TL;DR: The experimental results demonstrate that the proposed quantization and Discrete Cosine Transform based self-embedding fragile watermarking scheme not only achieves high-quality restoration, but also removes blocking artifacts and improves the accuracy of tamper localization due to the use of very small blocks.
Abstract: Due to the rapid development of Internet and computer technology, image authentication and restoration are very essential, especially in forensic science, medical imaging, and court evidence. A quantization and Discrete Cosine Transform (DCT) based self-embedding fragile watermarking scheme with effective image authentication and restoration quality is proposed in this paper. In this scheme, the cover image is divided into non-overlapping blocks of size 2×2. For each block, a twelve-bit watermark is generated from the five most significant bits (MSBs) of each pixel and embedded into the three least significant bits (LSBs) of the pixels of the corresponding mapped block. The proposed scheme uses two-level encoding to generate the content restoration bits. Restoration is achievable with high PSNR and NCC up to a 50 % tampering rate. The experimental results demonstrate that the proposed scheme not only achieves high-quality restoration, but also removes blocking artifacts and improves the accuracy of tamper localization due to the use of very small blocks.
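The per-block watermark generation and embedding can be sketched as follows. The split into 5 restoration bits and 7 check bits, and the simple weighted checksum, are assumptions standing in for the paper's two-level encoding:

```python
def block_watermark(block):
    """Build a 12-bit watermark for one 2x2 block (a sketch, not the exact coding).

    The 5 MSBs of the block average act as restoration bits; a 7-bit
    checksum over the pixels' 5-MSB values acts as authentication bits.
    Only MSBs feed the checksum, since the LSBs get overwritten by another
    block's watermark.
    """
    msbs = [p >> 3 for p in block]              # 5 MSBs of each pixel
    restore = (sum(block) // 4) >> 3            # 5 MSBs of the block average
    auth = (3 * msbs[0] + 5 * msbs[1] + 7 * msbs[2] + 11 * msbs[3]) % 128
    return (restore << 7) | auth                # 12 bits total

def embed_in_lsbs(mapped_block, wm12):
    """Spread the 12 watermark bits over the 3 LSBs of the mapped block's pixels."""
    out = []
    for i, p in enumerate(mapped_block):
        chunk = (wm12 >> (3 * (3 - i))) & 0b111
        out.append((p & ~0b111) | chunk)
    return out
```

On verification, recomputing the checksum from a block's MSBs and comparing it against the 7 bits recovered from the mapped block localizes tampering at 2×2 granularity, while the 5 restoration bits approximate the block for repair.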
TL;DR: Research and technology in visual tracking of moving targets, one of the most important applications in computer vision and a current research highlight, is reviewed.
Abstract: Recently, computer vision and multimedia understanding have become important research domains in computer science. Meanwhile, visual tracking of moving targets, one of the most important applications of computer vision, has become a research highlight. This paper therefore reviews research and technology in this domain. First, the background and applications of visual tracking are introduced. Then, visual tracking methods are classified by their underlying ideas and technologies, and their strengths, weaknesses, and possible improvements are analyzed in depth. Finally, the difficulties in this domain are summarized and the future prospects of related fields are presented.