Showing papers in "Multimedia Tools and Applications in 2013"


Journal ArticleDOI
TL;DR: The potential of finger tracking for gesture-based interaction for augmented reality on mobile phones is investigated, and two experiments evaluating canonical operations such as translation, rotation, and scaling of virtual objects with respect to performance and engagement are presented.
Abstract: The goal of this research is to explore new interaction metaphors for augmented reality on mobile phones, i.e. applications where users look at the live image of the device’s video camera and 3D virtual objects enrich the scene that they see. Common interaction concepts for such applications are often limited to pure 2D pointing and clicking on the device’s touch screen. Such an interaction with virtual objects is not only restrictive but also difficult, for example, due to the small form factor. In this article, we investigate the potential of finger tracking for gesture-based interaction. We present two experiments evaluating canonical operations such as translation, rotation, and scaling of virtual objects with respect to performance (time and accuracy) and engagement (subjective user feedback). Our results indicate a high entertainment value, but low accuracy if objects are manipulated in midair, suggesting great possibilities for leisure applications but limited usage for serious tasks.

187 citations


Journal ArticleDOI
TL;DR: An extensive set of experiments has been performed to show the benefits of using the proposed framework, using data from the real life of a significant number of users over almost a year of natural phone usage.
Abstract: In this paper, a new framework to discover places-of-interest from multimodal mobile phone data is presented. Mobile phones have been used as sensors to obtain location information from users’ real lives. A place-of-interest is defined as a location where the user usually goes and stays for a while. Two levels of clustering are used to obtain places of interest. First, user location points are grouped using a time-based clustering technique which discovers stay points while dealing with missing location data. The second level performs clustering on the stay points to obtain stay regions. A grid-based clustering algorithm has been used for this purpose. To obtain more user location points, a client-server system has been installed on the mobile phones, which is able to obtain location information by integrating GPS, Wifi, GSM and accelerometer sensors, among others. An extensive set of experiments has been performed to show the benefits of using the proposed framework, using data from the real life of a significant number of users over almost a year of natural phone usage.

133 citations
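
The two clustering levels in the place-of-interest abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the distance and duration thresholds, and the grid cell size are all assumptions chosen for illustration, and the sketch does not handle missing location data or fuse several sensors as the paper's framework does.

```python
import math
from collections import defaultdict

def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(a))

def stay_points(track, max_dist_m=200.0, min_stay_s=600.0):
    """Level 1 (time-based): track is a list of (timestamp_s, lat, lon)
    sorted by time. Returns (mean_lat, mean_lon, arrival_s, departure_s)
    tuples. Both thresholds are hypothetical defaults."""
    result = []
    i, n = 0, len(track)
    while i < n:
        j = i + 1
        # Extend the window while points stay close to the anchor point i.
        while j < n and haversine_m(track[i][1:], track[j][1:]) <= max_dist_m:
            j += 1
        window = track[i:j]
        if window[-1][0] - window[0][0] >= min_stay_s:
            lat = sum(p[1] for p in window) / len(window)
            lon = sum(p[2] for p in window) / len(window)
            result.append((lat, lon, window[0][0], window[-1][0]))
            i = j
        else:
            i += 1
    return result

def stay_regions(points, cell_deg=0.002):
    """Level 2 (grid-based): bucket stay points into square grid cells.
    The cell size in degrees is an assumption."""
    cells = defaultdict(list)
    for sp in points:
        key = (math.floor(sp[0] / cell_deg), math.floor(sp[1] / cell_deg))
        cells[key].append(sp)
    return dict(cells)
```

Calling stay_points on a time-ordered track and passing the result to stay_regions yields one list of stay points per occupied grid cell; a fuller grid-based method would also merge neighbouring cells.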


Journal ArticleDOI
TL;DR: This paper first presents a four-layer architecture for internet of things (IoT), and employs the second generation RFID technology to propose a novel intelligent system for M2M communications.
Abstract: Recent advances in wireless technology have shown strong potential for improving human life by means of ubiquitous communications devices that enable smart, distributed services. In fact, traditional human to human (H2H) communications are gradually falling behind the scale of necessity. Consequently, machine to machine (M2M) communications have surpassed H2H, thus drawing significant interest from industry and the research community recently. This paper first presents a four-layer architecture for the internet of things (IoT). Based on this architecture, we employ second-generation RFID technology to propose a novel intelligent system for M2M communications.

104 citations


Journal ArticleDOI
TL;DR: A cryptanalysis shows that Tsai's scheme is vulnerable to the password guessing attack and the stolen-verifier attack, and a novel, secure mutual authentication scheme for SIP based on the elliptic curve discrete logarithm problem is proposed which is immune to the presented attacks.
Abstract: The Session Initiation Protocol (SIP) is the most widely used signaling protocol for controlling communication on the internet, establishing, maintaining, and terminating the sessions. The services that are enabled by SIP are equally applicable in the world of multimedia communication. Recently, Tsai proposed an efficient nonce-based authentication scheme for SIP. In this paper, we do a cryptanalysis of Tsai's scheme and show that Tsai's scheme is vulnerable to the password guessing attack and stolen-verifier attack. Furthermore, Tsai's scheme does not provide known-key secrecy and perfect forward secrecy. We also propose a novel and secure mutual authentication scheme based on elliptic curve discrete logarithm problem for SIP which is immune to the presented attacks.

90 citations


Journal ArticleDOI
TL;DR: The Detection of Malicious Vehicles (DMV) algorithm uses monitoring to detect malicious nodes that drop or duplicate received packets and to isolate them from honest vehicles; each vehicle is monitored by some of its more trusted neighbors, called verifier nodes.
Abstract: Vehicular Ad Hoc Networks (VANETs) are appropriate networks that can be applied to intelligent transportation systems. In a VANET, messages exchanged among vehicles may be damaged by attacker nodes. Therefore, security in message forwarding is an important factor. We propose the Detection of Malicious Vehicles (DMV) algorithm, which uses monitoring to detect malicious nodes that drop or duplicate received packets and to isolate them from honest vehicles; each vehicle is monitored by some of its more trusted neighbors, called verifier nodes. If a verifier vehicle observes abnormal behavior from vehicle V, it increases the distrust value of vehicle V. The ID of vehicle V is then reported to its relevant Certificate Authority (CA) as a malicious node when its distrust value is higher than a threshold value. Performance evaluation shows that DMV can detect most existing abnormal and malicious vehicles even at high speeds.

89 citations
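
The distrust-value mechanism in the DMV abstract above can be sketched as follows. This is only an illustration under stated assumptions: the class name, the fixed penalty per abnormal observation, and the threshold value are hypothetical, and the paper's verifier selection and CA reporting protocol are not modelled.

```python
class DistrustTable:
    """Bookkeeping a verifier node might keep for the vehicles it monitors."""

    def __init__(self, threshold=3.0, penalty=1.0):
        self.threshold = threshold  # assumed report threshold
        self.penalty = penalty      # assumed increment per abnormal interval
        self.distrust = {}          # vehicle ID -> accumulated distrust value
        self.reported = set()       # vehicle IDs already reported to the CA

    def observe(self, vehicle_id, received, forwarded):
        """Record one monitoring interval.

        A vehicle behaves abnormally if it forwarded fewer packets than it
        received (dropping) or more (duplicating). Returns True when the
        vehicle should be reported to its CA on this call.
        """
        if forwarded != received:
            self.distrust[vehicle_id] = (
                self.distrust.get(vehicle_id, 0.0) + self.penalty
            )
        over = self.distrust.get(vehicle_id, 0.0) > self.threshold
        if over and vehicle_id not in self.reported:
            self.reported.add(vehicle_id)
            return True
        return False
```

A vehicle is reported once, on the first observation that pushes its distrust value strictly above the threshold, which matches the "higher than a threshold value" wording in the abstract.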



Journal ArticleDOI
TL;DR: A reality-oriented augmentation approach to support training activities that aims at adding new value and playful features to traditional training environments while keeping their original look-and-feel.
Abstract: In this paper, we propose a reality-oriented augmentation approach to support training activities. The approach aims at adding new value and playful features to traditional training environments while keeping their original look-and-feel. For example, a game monitoring service automatically records game events so that players can review the course of a game and their strategy for self-reflection, or replay the most impressive scenes to share the experience with others after the game finishes. Even though several services run in the background, digital devices and services are seamlessly integrated into the game environment in an unobtrusive way so that players can concentrate on training as usual. The concept can be applied to both traditional games (e.g., poker and the game of Go) and non-gaming activities (e.g., calligraphy and drumming). We developed four case studies on the concept: Augmented Reality Go, EmoPoker, Augmented Calligraphy and AR Drum Kit. We discuss design issues in the reality-oriented augmentation process based on user study results.

80 citations


Journal ArticleDOI
TL;DR: The retrieval performance of the algorithm CBR-ZFDR is evidently improved and the result is better than that achieved by the state-of-the-art method on each database in terms of most of the commonly used performance metrics.
Abstract: To improve the retrieval performance on a classified 3D model database, we propose a 3D model retrieval algorithm based on a hybrid 3D shape descriptor ZFDR and a class-based retrieval approach CBR utilizing the existing class information of the database. The hybrid 3D shape descriptor ZFDR comprises four features that depict a 3D model from different aspects, and on its own it is already comparable to or better than several related shape descriptors. To compute the distance between a query model and a target model within a class of a database, we define an integrated distance metric which takes into account the class information. It scales the distance between the query model and the target model according to the distance between the query model and the class. Our class-based retrieval approach CBR is general: it can be used with any shape descriptor to improve its retrieval performance. Extensive generic and partial 3D model retrieval experiments on seven standard databases demonstrate that after we employ CBR, the retrieval performance of our algorithm CBR-ZFDR is evidently improved and the result is better than that achieved by the state-of-the-art method on each database in terms of most of the commonly used performance metrics.

79 citations


Journal ArticleDOI
TL;DR: This survey proposes a reference framework to identify key functionalities of context-awareness and investigates the state-of-the-art advances in every functionality of context-awareness, pointing out potential directions in context-awareness research and tools for building and measuring context-aware ubiquitous media systems.
Abstract: Context-awareness assists ubiquitous media applications in discovering changing contextual information and adapting their behaviors accordingly. A wide spectrum of context-aware schemes has been proposed over the last decade. However, most of them provide only partial functionalities of context-awareness in ubiquitous media applications. They are specific to a certain task, and systematic research on context-awareness is lacking. To this end, this survey aims at answering how close we are to developing context-aware applications in ubiquitous media in a systematic manner. This survey proposes a reference framework to identify key functionalities of context-awareness. Then, it investigates the state-of-the-art advances in every functionality of context-awareness. Finally, it points out potential directions in context-awareness research and tools for building and measuring context-aware ubiquitous media systems.

72 citations


Journal ArticleDOI
TL;DR: The Semantics-Based Pipeline for Economic Event Detection (SPEED), focusing on extracting financial events from news articles and annotating these with meta-data at a speed that enables real-time use, is proposed.
Abstract: As today's financial markets are sensitive to breaking news on economic events, accurate and timely automatic identification of events in news items is crucial. Unstructured news items originating from many heterogeneous sources have to be mined in order to extract knowledge useful for guiding decision making processes. Hence, we propose the Semantics-Based Pipeline for Economic Event Detection (SPEED), focusing on extracting financial events from news articles and annotating these with meta-data at a speed that enables real-time use. In our implementation, we use some components of an existing framework as well as new components, e.g., a high-performance Ontology Gazetteer, a Word Group Look-Up component, a Word Sense Disambiguator, and components for detecting economic events. Through their interaction with a domain-specific ontology, our novel, semantically enabled components constitute a feedback loop which fosters future reuse of acquired knowledge in the event detection process.

64 citations


Journal ArticleDOI
TL;DR: This paper proposes a novel image classification approach, based on local descriptors and the KNN algorithm, that uses both supervised and unsupervised classification techniques and improves the effectiveness of local feature vector classification.
Abstract: The KNN classification algorithm is particularly suited to be used when classifying images described by local features. In this paper, we propose a novel image classification approach, based on local descriptors and the KNN algorithm. The proposed scheme is based on a hierarchical categorization tree that uses both supervised and unsupervised classification techniques. The unsupervised one is based on a hierarchical lattice vector quantization algorithm, while the supervised one is based on both feature vector labelling and a supervised feature selection method. The proposed tree improves the effectiveness of local feature vector classification and outperforms the exact KNN algorithm in terms of categorization accuracy.

Journal ArticleDOI
TL;DR: This paper proposes a new method to analyze the results of paired comparison-based subjective tests, assuming that ties convey information about the significance of quality score differences between two stimuli, and describes the complete test procedure using the proposed method.
Abstract: As 3D image and video content has gained significant popularity, subjective 3D quality assessment has become an important issue for the creation, processing, and distribution of high quality 3D content. Reliable subjective quality assessment of 3D content is often difficult due to the subjects’ limited 3D experience, the interaction of multiple quality factors, minor quality differences between stimuli, etc. Among subjective evaluation methodologies, paired comparison has the advantage of improved simplicity and reliability, which can be useful to tackle the aforementioned difficulties. In this paper, we propose a new method to analyze the results of paired comparison-based subjective tests. We assume that ties convey information about the significance of quality score differences between two stimuli. Then, a maximum likelihood estimation is performed to obtain confidence intervals providing intuitive measures of significance of the quality differences. We describe the complete test procedure using the proposed method, from subjective experiment design to outlier detection and score analysis for 3D image quality assessment. Especially, we design the test procedure in a way that quality comparison across different contents is enabled while the number of pair-wise comparisons is minimized. Experimental results on a stereoscopic image database with varying camera distances demonstrate the usefulness of the proposed method and enhanced quality discriminability of paired comparison in comparison to the conventional single stimulus methodology.

Journal ArticleDOI
TL;DR: This paper proposes a novel approach to generating a sequence of dance motions using music similarity as a criterion to find the appropriate motions given a new musical input, and evaluates the system’s performance using a user study.
Abstract: In this paper, we propose a novel approach to generating a sequence of dance motions using music similarity as a criterion to find the appropriate motions given a new musical input. Based on the observation that dance motions used in similar musical pieces can be a good reference in choreographing a new dance, we first construct a music-motion database that comprises a number of segment-wise music-motion pairs. When a new musical input is given, it is divided into short segments and for each segment our system suggests the dance motion candidates by finding from the database the music cluster that is most similar to the input. After a user selects the best motion segment, we perform music-dance synchronization by means of cross-correlation between the two music segments using the novelty functions as an input. We evaluate our system's performance using a user study, and the results show that the dance motion sequence generated by our system achieves significantly higher ratings than the one generated randomly.
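
The synchronization step mentioned in the abstract above (cross-correlation between two music segments, with novelty functions as input) can be sketched in a few lines. The function name, the frame-based lag convention, and the search window are assumptions for illustration; how the novelty curves are computed from audio is outside this sketch.

```python
def best_lag(novelty_a, novelty_b, max_lag):
    """Return the lag in [-max_lag, max_lag] that maximises the
    cross-correlation sum of novelty_a[i] * novelty_b[i - lag].

    A positive result means novelty_b must be delayed by that many
    frames to line up with novelty_a.
    """
    def corr_at(lag):
        total = 0.0
        for i, a in enumerate(novelty_a):
            j = i - lag
            if 0 <= j < len(novelty_b):
                total += a * novelty_b[j]
        return total

    # max() keeps the first maximiser, so ties resolve to the smallest lag.
    return max(range(-max_lag, max_lag + 1), key=corr_at)
```

For example, with an onset peak at frame 3 in the first curve and at frame 1 in the second, best_lag([0, 0, 0, 1, 0], [0, 1, 0, 0, 0], 3) evaluates to 2, so the second segment would be shifted two frames later.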

Journal ArticleDOI
TL;DR: A robust and invisible watermarking scheme based on polylines and polygons for the copyright protection of a GIS digital map that is more robust against geometric attacks, such as rotation, scaling, and translation (RST) transformations, data addition, cropping, breaking, and filleting attacks, and layer attacks with rearrangement and cropping.
Abstract: Geographical information services (GIS) can be provided on the basis of a digital map, which is the fundamental form of data representation in a GIS. Because the process of producing a digital map is considerably complex and the maintenance of a digital map requires substantial monetary and human resources, a digital map is very valuable and requires copyright protection. A digital map consists of a number of layers that are categorized in terms of topographical features and landmarks. Therefore, any unauthorized person can forge either an entire digital map or the feature layers of the map. In this paper, we present a robust and invisible watermarking scheme based on polylines and polygons for the copyright protection of a GIS digital map. The proposed scheme clusters all polylines and polygons in the feature layers of the map on the basis of the polyline length and the polygon area. A watermark is then embedded in the GIS vector data on the basis of the distribution of polyline length and polygon area in each group by moving all vertices in polylines and polygons within a specified tolerance. Experimental results confirm that the proposed scheme is more robust against geometric attacks, such as rotation, scaling, and translation (RST) transformations, data addition, cropping, breaking, and filleting attacks, and layer attacks with rearrangement and cropping, when compared with conventional schemes. Moreover, the scheme also satisfies data position accuracy.

Journal ArticleDOI
TL;DR: A real-time eye-gaze estimation system using a general low-resolution webcam is proposed, which can estimate eye-gaze accurately without expensive or specific equipment, and also without an intrusive detection process.
Abstract: Eye detection and gaze estimation play an important role in many applications, e.g., the eye-controlled mouse in assistive systems for disabled or elderly persons, eye fixation and saccade in psychological analysis, or iris recognition in security systems. Traditional research usually achieves eye tracking by employing intrusive infrared-based techniques or expensive eye trackers. Nowadays, there is a growing need to analyze user behavior by tracking eye attention in general applications, in which users usually use a consumer-grade computer or even a laptop with an inexpensive webcam. To satisfy the requirements of the rapid development of such applications and reduce the cost, it is no longer practical to apply intrusive techniques or use expensive/specific equipment. In this paper, we propose a real-time eye-gaze estimation system using a general low-resolution webcam, which can estimate eye-gaze accurately without expensive or specific equipment, and also without an intrusive detection process. An illuminance filtering approach is designed to remove the influence of light changes so that the eyes can be detected correctly from the low-resolution webcam video frames. A hybrid model combining the position criterion and an angle-based eye detection strategy is also derived to locate the eyes accurately and efficiently. In the eye-gaze estimation stage, we employ the Fourier Descriptor to describe the appearance-based features of eyes compactly. The determination of eye-gaze position is then carried out by the Support Vector Machine. The proposed algorithms achieve high performance with low computational complexity. The experimental results also show the feasibility of the proposed methodology.

Journal ArticleDOI
TL;DR: A lightweight approach called RESTdesc is suggested that expresses the semantics of Web services by pre- and postconditions in simple N3 rules, and integrates existing standards and conventions such as Link headers, HTTP OPTIONS, and URI templates for discovery and interaction.
Abstract: Many have left their footprints on the field of semantic RESTful Web service description. Although some of the propositions are even W3C Recommendations, none of the proposed standards could gain significant adoption with Web service providers. Some approaches were supposedly too complex and verbose, others were considered not RESTful, and some failed to reach a significant majority of API providers for a combination of the reasons above. While we do not have a silver bullet for universal Web service description either, with this paper we want to suggest a lightweight approach called RESTdesc. It expresses the semantics of Web services by pre- and postconditions in simple N3 rules, and integrates existing standards and conventions such as Link headers, HTTP OPTIONS, and URI templates for discovery and interaction. This approach keeps the complexity to a minimum, yet still enables service descriptions with full semantic expressiveness. A sample implementation on the topic of multimedia Web services verifies the effectiveness of our approach.

Journal ArticleDOI
TL;DR: In this paper, humans in images are extracted and recognized using contexts and profiles and the proposed method is compared with a single face detector system and it shows better performance in terms of precision and speed.
Abstract: This study proposes a system for extracting and tracking objects for a multimedia system and addresses how to extract the head feature from an object area. In images taken from real-time recordings such as video, there is always variance in human behavior, such as the position and size of the person being tracked or recorded. This study discusses how to extract and track multiple objects based on context as opposed to a single object. Via cascade extraction, the proposed system allows tracking of more than one human at a time. For this process, an extraction method based on internal and external contexts, which defines features to distinguish a human, is proposed. The proposed method defines shapes of the shoulder and head area to recognize the head shape of a human, and creates an extractor according to its edge information and geometrical shape context. In this paper, humans in images are extracted and recognized using contexts and profiles. The proposed method is compared with a single face detector system and shows better performance in terms of precision and speed. This trace information can be applied in safety care systems. Extractions can be improved by validating the image using a context-based detector when there are duplicated images.

Journal ArticleDOI
TL;DR: An intelligent movie recommender system with a social trust model based on a social network for analyzing social relationships between users and generated group affinity values with user profiles is proposed.
Abstract: As many researchers have taken an interest in social networks with the development of the user-generated web, trust management and its application have come into the spotlight. User information that is extracted by behavior patterns and user profiles provides the essential relationship between individuals. In this paper, we propose an intelligent movie recommender system with a social trust model. The proposed system is based on a social network for analyzing social relationships between users and generated group affinity values with user profiles. In experiments, the performance of this system is evaluated with precision-recall and F-measures.
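
The evaluation measures named at the end of the abstract above (precision, recall and F-measure) follow standard definitions, sketched here for a single user's recommendation list. The function name and the set-based input format are assumptions; the paper's own evaluation protocol is not described in the abstract.

```python
def precision_recall_f(recommended, relevant, beta=1.0):
    """Return (precision, recall, F_beta) for one recommendation list.

    recommended: items the system suggested
    relevant:    items the user actually liked (ground truth)
    """
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_score
```

With beta set to 1 the third value is the usual F1 score, the harmonic mean of precision and recall.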

Journal ArticleDOI
TL;DR: A sketch-based retrieval algorithm is proposed based on a 3D model feature named View Context and 2D relative shape context matching to enhance the accuracy of 2D sketch-3D model correspondence and to speed up retrieval.
Abstract: Sketch-based 3D model retrieval is very important for applications such as 3D modeling and recognition. In this paper, a sketch-based retrieval algorithm is proposed based on a 3D model feature named View Context and 2D relative shape context matching. To enhance the accuracy of 2D sketch-3D model correspondence as well as the retrieval performance, we propose to align a 3D model with a query 2D sketch before measuring their distance. First, we efficiently select some candidate views from a set of densely sampled views of the 3D model to align the sketch and the model based on their View Context similarities. Then, we compute the more accurate relative shape context distance between the sketch and every candidate view, and regard the minimum one as the sketch-model distance. To speed up retrieval, we precompute the View Context and relative shape context features of the sample views of all the 3D models in the database. Comparative and evaluative experiments based on hand-drawn and standard line drawing sketches demonstrate the effectiveness and robustness of our approach, which significantly outperforms several recent sketch-based retrieval algorithms.

Journal ArticleDOI
TL;DR: A reversible fragile watermarking scheme that detects and locates tampered blocks with high accuracy while ensuring recovery of the original content and could detect and locate malicious attacks such as vertex/feature modification, vertex/feature addition, and vertex/feature deletion.
Abstract: For 2D vector maps, obtaining good tamper localization performance and original content recovery with existing reversible fragile watermarking schemes is a technically challenging problem. Using an improved reversible watermarking method and a fragile watermarking algorithm based on vertex insertion, we propose a reversible fragile watermarking scheme that detects and locates tampered blocks with high accuracy while ensuring recovery of the original content. In particular, we propose dividing the features of the vector map into different blocks, calculating the block authentication watermarks and embedding the watermarks with different watermarking schemes. While the block division ensures superior accuracy of tamper localization, the reversible watermarking method and the fragile watermarking algorithm based on vertex insertion provide recovery of the original content. Experimental results show that the proposed scheme could detect and locate malicious attacks such as vertex/feature modification, vertex/feature addition, and vertex/feature deletion.

Journal ArticleDOI
TL;DR: A novel rate control algorithm that takes into account visual attention is proposed that improves the coding quality in frames with strong local motion, and reduces PSNR fluctuation across frames by up to 22.15%.
Abstract: In video coding, a well-designed rate control scheme should be concerned with both the objective and subjective quality. However, the existing H.264 rate control algorithms mainly aim at improving the objective quality without considering the human visual system. In this paper, we propose a novel rate control algorithm that takes into account visual attention. In a group of pictures, the bits allocated to each frame are related to the local motion attention in it, and more bits are allocated to the frames with strong local motion attention. Similarly, in each frame, more bits are assigned to visually significant macroblocks (MBs), and fewer to visually insignificant MBs. Experimental results show that the proposed algorithm improves the coding quality in frames with strong local motion, and reduces PSNR fluctuation across frames by up to 22.15%. In addition, PSNR in visually important regions is increased by up to 1.45 dB compared with the standard H.264 rate control scheme, which improves the subjective quality. The increase in computational complexity of the proposed algorithm is less than 4%, which is negligible.
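
The idea in the abstract above, giving more bits to frames or macroblocks with stronger attention, can be sketched as a weighted split of a bit budget. This is not the paper's rate control model: the function name, the proportional rule, and the floor_fraction parameter (a guaranteed equal share so that low-attention units are not starved) are assumptions made for illustration.

```python
def allocate_bits(total_bits, attention, floor_fraction=0.5):
    """Split total_bits across units (frames in a GOP, or MBs in a frame).

    attention:      non-negative attention weight per unit
    floor_fraction: share of the budget divided equally among all units
                    (hypothetical parameter); the remainder is divided
                    in proportion to the attention weights.
    """
    n = len(attention)
    if n == 0:
        return []
    weight_sum = sum(attention)
    if weight_sum == 0:
        # No attention information: fall back to a uniform split.
        return [total_bits / n] * n
    base = total_bits * floor_fraction / n
    extra = total_bits * (1 - floor_fraction)
    return [base + extra * a / weight_sum for a in attention]
```

For a budget of 1000 bits and attention weights [1, 3], the default settings give 375 and 625 bits, so the allocations still sum to the budget while favouring the more salient unit.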

Journal ArticleDOI
TL;DR: An automatic 3D face reconstruction method based on a hierarchical dense deformable model is proposed and experimental results indicate that the proposed method has good performance for 3D face reconstruction from skull.
Abstract: 3D face reconstruction from skull has been investigated deeply by computer scientists in the past two decades because it is important for identification. The dominant methods construct the 3D face from the soft tissue thickness measured at a set of landmarks on the skull. The quantity and position of the landmarks are vital for 3D face reconstruction, but there is no uniform standard for the selection of the landmarks. Additionally, the acquisition of the landmarks on the skull is difficult without manual assistance. In this paper, an automatic 3D face reconstruction method based on a hierarchical dense deformable model is proposed. To construct the model, the skull and face samples are acquired by CT scanner and represented as dense triangle meshes. Then a non-rigid dense mesh registration algorithm is presented to align all the samples in point-to-point correspondence. Based on the aligned samples, a global deformable model is constructed, and three local models are constructed from the segmented patches of the eye, nose and mouth. For a given skull, the global and local deformable models are iteratively matched with it, and the reconstructed facial surface is obtained by fusing the global and local reconstruction results. To validate the presented method, a measurement in the coefficient domain of a face deformable model is defined. The experimental results indicate that the proposed method has good performance for 3D face reconstruction from skull.

Journal ArticleDOI
TL;DR: The security analysis of the ECC-based authentication scheme for SIP shows that the scheme is suitable for applications with higher security requirements, and it only needs to compute four elliptic curve scalar multiplications and two hash-to-point operations.
Abstract: Session Initiation Protocol (SIP) has been widely used in the current Internet protocols such as Hypertext Transfer Protocol (HTTP) and Simple Mail Transfer Protocol (SMTP). However, the original SIP authentication scheme was insecure and many researchers tried to propose schemes to overcome the flaws. In 2011, Arshad et al. proposed a SIP authentication protocol using elliptic curve cryptography (ECC), but their scheme suffered from an off-line password guessing attack along with password change pitfalls. To overcome these weaknesses, we propose an ECC-based authentication scheme for SIP. Our scheme only needs to compute four elliptic curve scalar multiplications and two hash-to-point operations, and maintains high efficiency. The security analysis of the ECC-based protocol shows that our scheme is suitable for applications with higher security requirements.

Journal ArticleDOI
TL;DR: This paper proposes two important design factors that impact on user immersion in serious heritage games: user interface space volume and subsystem sequence.
Abstract: Modern digital technologies support the preservation and transfer of cultural heritage information via devices and applications such as digital storage systems, electronic books and virtual museums. Advances in virtual and augmented reality, real-time computer graphics and computer games have made it possible to construct large virtual environments in which users may experience cultural heritage through a variety of interactions and immersions. Thus, an emerging problem is to implement an appropriate systematic design method for achieving various types of entertainment, learning and information transfer. This paper proposes two important design factors that impact on user immersion in serious heritage games: user interface space volume and subsystem sequence. The impact of the two factors on proposed systematic design methods was investigated through comparative studies by implementing a serious heritage game system on three different platforms.

Journal ArticleDOI
TL;DR: This paper presents two practical examples of optimizations based on the sensing relevancies of source nodes that transmit still images of the monitored field, addressing issues such as energy-efficient data transmission and packet prioritization in intermediate nodes.
Abstract: Wireless ad-hoc networks composed of resource-constrained camera-enabled sensors can provide visual information for a series of monitoring applications, enriching the understanding of the physical world. In many cases, source nodes may have different sensing relevancies for the monitoring functions of the applications, according to the importance of the visual information retrieved from the monitored field. As a direct result, high quality is only required for the most relevant information and, as it is expected that many visual monitoring applications can tolerate some quality loss in the data received from the least relevant source nodes, the network operation can be optimized by exploiting this innovative concept. As a novel global QoS parameter, we envisage that the sensing relevancies of source nodes can be considered for a series of optimizations in different aspects of the wireless sensor network operation, achieving energy saving or assuring high quality transmission for the most relevant data. In this paper we discuss some approaches for the establishment of the sensing relevancies of the nodes and propose a protocol to support them. Moreover, we present two practical examples of optimizations based on the sensing relevancies of source nodes that transmit still images of the monitored field, addressing issues such as energy-efficient data transmission and packet prioritization in intermediate nodes.

Journal ArticleDOI
TL;DR: Experiments show that the approach using high-level semantic modeling achieves better key-frame extraction as compared with its counterparts using low-level features.
Abstract: There is growing evidence that visual saliency can be better modeled using top-down mechanisms that incorporate object semantics. This suggests a new direction for image and video analysis, where semantics extraction can be effectively utilized to improve video summarization, indexing and retrieval. This paper presents a framework that models semantic contexts for key-frame extraction. Semantic context of video frames is extracted and its sequential changes are monitored so that significant novelties are located using a one-class classifier. Working with wildlife video frames, the framework undergoes image segmentation, feature extraction and matching of image blocks, and then a co-occurrence matrix of semantic labels is constructed to represent the semantic context within the scene. Experiments show that our approach using high-level semantic modeling achieves better key-frame extraction as compared with its counterparts using low-level features.

Journal ArticleDOI
TL;DR: A new color laser printer forensic algorithm based on noisy texture analysis and a support vector machine classifier that can detect which color laser printer was used to print the unknown images is presented.
Abstract: Digital forensics in the ubiquitous era can enhance and protect the reliability of multimedia content where this content is accessed, manipulated, and distributed using high quality computer devices. Color laser printer forensics is a kind of digital forensics which identifies the printing source of color printed materials such as fine arts, money, and documents, and helps to catch criminals. This paper presents a new color laser printer forensic algorithm based on noisy texture analysis and a support vector machine classifier that can detect which color laser printer was used to print unknown images. Since each printer vendor uses its own printing process, printed documents from different vendors show slight invisible differences that look like noise. In our identification scheme, the invisible noise is estimated with the Wiener filter and the 2D Discrete Wavelet Transform (DWT) filter. Then, a gray level co-occurrence matrix (GLCM) is calculated to analyze the texture of the noise. From the GLCM, 384 statistical features are extracted and applied to train and test the support vector machine classifier for identifying the color laser printers. In the experiment, a total of 4,800 images from 8 color laser printer models were used, where half of the images are for training and the other half are for classification. Results show that the presented algorithm performs well, achieving 99.3%, 97.4% and 88.7% accuracy for brand, toner and model identification, respectively.
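
The GLCM step named in this abstract can be sketched as follows. This is a minimal sketch, not the authors' feature extractor: the abstract does not list the pixel offsets, the number of gray levels or the 384 features, so the single horizontal offset, the 3-level toy "noise" image and the two statistics (contrast and energy) are illustrative assumptions.

```python
from collections import Counter


def glcm(image, levels, dx=1, dy=0):
    """Return a normalized gray-level co-occurrence matrix as a list of lists.

    image  : 2D list of integer gray levels in [0, levels)
    dx, dy : offset from each reference pixel to its neighbor (assumed values)
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[counts[(i, j)] / total for j in range(levels)] for i in range(levels)]


def contrast(p):
    """Weight each co-occurrence probability by the squared gray-level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def energy(p):
    """Sum of squared probabilities; higher for more uniform textures."""
    return sum(v * v for row in p for v in row)


# Toy quantized noise residual with three gray levels (hypothetical data).
noise = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
p = glcm(noise, levels=3)
features = [contrast(p), energy(p)]
print(features)
```

A feature vector of this kind, computed for several offsets and statistics, is the sort of input that could then be passed to a support vector machine classifier.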

Journal ArticleDOI
TL;DR: A better use of computer technology is proposed in the form of integrating augmented reality (AR) with the regular educational process, where a newly designed AR-Fitness system combines physical exercises with academic lessons and associated tests.
Abstract: Excessive use of non-physical entertainment and a lack of adequate indoor physical activity, in conjunction with pressure from parents for higher academic performance, are creating serious health concerns amongst young students throughout the world. This problem is more severe in industrial societies such as Taiwan, with warnings from the government. In order to solve this acute problem, we propose a better use of computer technology in the form of integrating augmented reality (AR) with the regular educational process, where a newly designed AR-Fitness system combines physical exercises with academic lessons and associated tests. The new combined learning environment implements four standard physical fitness training schemes with cognitive learning in five categories of physical education (PE) knowledge including 'Cardiopulmonary Endurance', 'Flexibility', 'Explosiveness', 'Muscular Endurance' and 'Sport Injury' to test the new system.

Journal ArticleDOI
TL;DR: A comparative study of the most well-known ontologies related to multimedia aspects, based on a framework proposed in this paper and called FRAMECOMMON, from which some conclusions are derived concerning a decade of the state of the art in multimedia ontologies.
Abstract: Many efforts have been made in the area of multimedia to bridge the so-called "semantic-gap" with the implementation of ontologies from 2001 to the present. In this paper, we provide a comparative study of the most well-known ontologies related to multimedia aspects. This comparative study has been done based on a framework proposed in this paper and called FRAMECOMMON. This framework takes into account process-oriented dimensions, such as the methodological one, and outcome-oriented dimensions, like multimedia aspects, understandability, and evaluation criteria. Finally, we derive some conclusions concerning this decade of the state of the art in multimedia ontologies.

Journal ArticleDOI
TL;DR: A new device-free user interface for TV viewing that uses a human gesture recognition technique that recognizes a large enough variety of gestures and is useful because it does not require any contact-type devices.
Abstract: We developed a new device-free user interface for TV viewing that uses a human gesture recognition technique. Although many motion recognition technologies have been reported, no man-machine interface that recognizes a large enough variety of gestures has been developed. The difficulty was the lack of spatial information that could be acquired from normal video sequences. We overcame the difficulty by using a time-of-flight camera and novel action recognition techniques. The main functions of this system are gesture recognition and posture measurement. The former is performed using the bag-of-features approach, which uses key-point trajectories as features. The use of 4-D spatiotemporal trajectory features is the main technical contribution of the proposed system. The latter is obtained through face detection and object tracking technology. The interface is useful because it does not require any contact-type devices. Several experiments proved the effectiveness of our proposed method and the usefulness of the system.
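
The bag-of-features approach mentioned in this abstract can be sketched as a nearest-codeword histogram. This is a generic illustration under stated assumptions rather than the system's actual pipeline: the two-entry codebook, the toy 4-D descriptors and the function names are hypothetical, and the abstract does not describe how the trajectory features or the codebook are built.

```python
import math


def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(descriptor, codebook[k]))


def bag_of_features(descriptors, codebook):
    """Normalized histogram of codeword assignments for one gesture clip."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_codeword(d, codebook)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * len(codebook)


# Hypothetical codebook of two 4-D "visual words" (x, y, z, t).
codebook = [(0.0, 0.0, 0.0, 0.0), (1.0, 1.0, 1.0, 1.0)]

# Toy 4-D trajectory descriptors extracted from one clip (made-up values).
descriptors = [
    (0.1, 0.0, 0.0, 0.2),
    (0.9, 1.0, 1.0, 0.8),
    (1.0, 1.0, 0.9, 1.0),
    (0.8, 0.9, 1.0, 1.0),
]

hist = bag_of_features(descriptors, codebook)
print(hist)
```

The resulting fixed-length histogram is what a classifier would consume, regardless of how many key-point trajectories a given clip produces.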