
Showing papers in "IEEE MultiMedia in 2016"


Journal ArticleDOI
TL;DR: In discussing the rationale behind the vision for JPEG Pleno and how the new standardization initiative aims to reinvent the future of imaging, the authors review plenoptic representation and its underlying practical implications and challenges in implementing real-world applications with an enhanced quality of experience.
Abstract: In discussing the rationale behind the vision for JPEG Pleno and how the new standardization initiative aims to reinvent the future of imaging, the authors review plenoptic representation and its underlying practical implications and challenges in implementing real-world applications with an enhanced quality of experience.

158 citations


Journal ArticleDOI
TL;DR: As Rosalind W. Picard reflects on the events that moved her from research in the lab to a real-world application, who would have expected that efforts to develop algorithms to perceive multimodal inputs would lead to a wearable that detects signals related to deep brain activation and issues potentially life-saving alerts?
Abstract: As Rosalind W. Picard reflects on the events that moved her from research in the lab to a real-world application, she can't help but think... who would have expected efforts to develop algorithms to perceive multimodal inputs would lead to a wearable that detects signals related to deep brain activation and issues potentially life-saving alerts?

60 citations


Journal ArticleDOI
TL;DR: A collaborative sparse coding framework is proposed that integrates the classifier training process and sparse coding process into a unified collaborative filtering framework that lets more discriminative sparse video representations and classifiers be learned by optimizing the dictionary and classifier jointly.
Abstract: Multiview action recognition has received increasing attention over the past decade. Various approaches have been proposed to extract view-invariant features; among them, self-similarity matrices (SSMs) have shown outstanding performance. However, SSMs become sensitive when there's a very large view change. To make SSMs more robust to viewpoint changes, the authors propose a collaborative sparse coding framework. They integrate the classifier training process and sparse coding process into a unified collaborative filtering framework; this lets more discriminative sparse video representations and classifiers be learned by optimizing the dictionary and classifier jointly. Experimental results demonstrate the effectiveness of the framework.

58 citations


Journal ArticleDOI
TL;DR: A novel image encryption algorithm is designed based on autoblocking and a medical electrocardiography (ECG) signal, using the Wolf algorithm to generate initial conditions for the chaotic maps.
Abstract: A novel image encryption algorithm is designed based on autoblocking and a medical electrocardiography (ECG) signal. The method uses a chaotic logistic map and a generalized Arnold map. To solve deterministic input problems, the method uses an ECG signal and the Wolf algorithm to generate initial conditions for the chaotic maps. Compared with traditional cryptoarchitectures, this system performs the autoblocking diffusion operation only in the encryption process. The keystream is generated by a control parameter produced from the plain-image, making the system secure against chosen-plaintext and known-plaintext attacks. Experimental results show that the proposed algorithm can achieve high security with good performance.

45 citations
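The keystream idea described in the abstract, iterating a chaotic map from a signal-derived seed and diffusing pixel data with the result, can be illustrated with a minimal sketch. The seed value standing in for the ECG-derived initial condition, the map parameter, and the byte-extraction step are all hypothetical here; the actual algorithm additionally couples a generalized Arnold map with autoblocking diffusion.

```python
def logistic_keystream(x0, length, r=3.99):
    """Illustrative chaotic keystream from a logistic map.

    x0 stands in for an initial condition derived from an ECG signal
    (hypothetical here); r is a control parameter chosen in the chaotic
    regime of the map x -> r * x * (1 - x).
    """
    stream = []
    x = x0
    for _ in range(length):
        x = r * x * (1.0 - x)              # iterate the logistic map
        stream.append(int(x * 256) % 256)  # quantize the state to one byte
    return stream

def encrypt_block(pixels, x0):
    # XOR-diffuse a block of pixel bytes with the chaotic keystream
    ks = logistic_keystream(x0, len(pixels))
    return [p ^ k for p, k in zip(pixels, ks)]
```

Because XOR is its own inverse, running `encrypt_block` again with the same seed recovers the original block, which is what makes the seed (here, the ECG-derived value) act as the key.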


Journal ArticleDOI
TL;DR: The authors shed some light on the recent developments of the JPEG committee and discuss both the current status of JPEG XT and its future plans.
Abstract: The Joint Photographic Experts Group recently produced a new standard, JPEG XT (JPEG eXTension). JPEG XT is backward-compatible with legacy JPEG and offers the ability to encode images of higher precision and higher dynamic range, in lossy or lossless modes. Here, the authors shed some light on the recent developments of the JPEG committee and discuss both the current status of JPEG XT and its future plans.

43 citations


Journal ArticleDOI
TL;DR: Experimental results show that this in-loop filter design can significantly improve the compression performance of the High Efficiency Video Coding (HEVC) standard, leading us in a new direction for improving compression efficiency.
Abstract: In-loop filtering has emerged as an essential coding tool since H.264/AVC, due to its delicate design, which reduces different kinds of compression artifacts. However, existing in-loop filters rely only on local image correlations, largely ignoring nonlocal similarities. In this article, the authors explore the design philosophy of in-loop filters and discuss their vision for the future of in-loop filter research by examining the potential of nonlocal similarities. Specifically, the group-based sparse representation, which jointly exploits an image's local and nonlocal self-similarities, lays a novel and meaningful groundwork for in-loop filter design. Hard- and soft-thresholding filtering operations are applied to derive the sparse parameters that are appropriate for compression artifact reduction. Experimental results show that this in-loop filter design can significantly improve the compression performance of the High Efficiency Video Coding (HEVC) standard, leading us in a new direction for improving compression efficiency.

41 citations
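The hard- and soft-thresholding operations named in the abstract are standard sparse-coding primitives for suppressing small (noise-dominated) transform coefficients. A minimal sketch follows; the threshold value and coefficient layout are illustrative, not the HEVC in-loop filter itself.

```python
def hard_threshold(c, t):
    # keep a transform coefficient only if its magnitude exceeds t
    return c if abs(c) > t else 0.0

def soft_threshold(c, t):
    # shrink the coefficient toward zero by t; small ones become zero
    if c > t:
        return c - t
    if c < -t:
        return c + t
    return 0.0

def denoise_coefficients(coeffs, t, mode="soft"):
    # apply the chosen thresholding rule to a list of coefficients
    f = soft_threshold if mode == "soft" else hard_threshold
    return [f(c, t) for c in coeffs]
```

Hard thresholding keeps surviving coefficients unchanged, while soft thresholding also shrinks them, which typically yields smoother reconstructions in artifact-reduction settings.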


Journal ArticleDOI
TL;DR: The authors present a framework to generate incidental word learning tasks via load-based profiles, measured through the involvement load hypothesis, and topic-based profiles obtained from social media, and find that the proposed framework promotes more effective and enjoyable word learning than intentional word learning.
Abstract: Compared to intentional word learning, incidental word learning better motivates learners, integrates development of more language skills, and provides richer contexts. The effectiveness of incidental word learning tasks can also be increased by employing materials that learners are more familiar with or interested in. Here, the authors present a framework to generate incidental word learning tasks via load-based profiles measured through the involvement load hypothesis, and topic-based profiles obtained from social media. They also conduct an experiment on real participants and find that the proposed framework promotes more effective and enjoyable word learning than intentional word learning. This article is part of a special issue on social media for learning.

35 citations


Journal ArticleDOI
TL;DR: This extended guided filtering approach for depth map upsampling outperforms other state-of-the-art approaches by using a high-resolution color image as a guide and applying an onion-peeling filtering procedure that exploits local gradient information of depth images.
Abstract: The authors address the problem of depth map upsampling using a corresponding high-resolution color image. The depth map is captured by low-resolution time-of-flight cameras paired with a high-resolution RGB camera. Inspired by guided image filtering, the proposed method not only uses the structure of the high-resolution color image as guidance, it also exploits local gradient information of depth images to suppress potential texture-copying artifacts. In addition, the authors introduce onion-peel-order filtering that predicts depth values from outside inward in a concentric-layer order, which avoids depth bleeding during the propagation process. Quantitative and qualitative experimental results demonstrate the effectiveness and robustness of this approach over prior depth map upsampling methods.

33 citations
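The guidance idea, weighting low-resolution depth samples by color similarity in the high-resolution image, can be sketched as joint bilateral upsampling. This is a simplified stand-in for illustration, not the authors' onion-peel-order filter; the window size and the two sigmas are assumed values.

```python
import math

def guided_upsample(depth_lr, color_hr, scale, sigma_s=1.0, sigma_c=10.0):
    """Joint bilateral upsampling sketch: each high-res pixel takes a
    weighted average of nearby low-res depth samples, with weights
    combining spatial distance and color similarity in the guide image."""
    H, W = len(color_hr), len(color_hr[0])
    h, w = len(depth_lr), len(depth_lr[0])
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            ci, cj = i / scale, j / scale   # position in low-res grid
            num = den = 0.0
            for a in range(max(0, int(ci) - 1), min(h, int(ci) + 2)):
                for b in range(max(0, int(cj) - 1), min(w, int(cj) + 2)):
                    ds = (a - ci) ** 2 + (b - cj) ** 2
                    # guide-color difference at the corresponding positions
                    dc = (color_hr[i][j]
                          - color_hr[min(H - 1, int(a * scale))]
                                    [min(W - 1, int(b * scale))]) ** 2
                    wgt = (math.exp(-ds / (2 * sigma_s ** 2))
                           * math.exp(-dc / (2 * sigma_c ** 2)))
                    num += wgt * depth_lr[a][b]
                    den += wgt
            out[i][j] = num / den
    return out
```

The color term is what suppresses depth bleeding across object boundaries: samples whose guide color differs from the target pixel contribute little weight.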


Journal ArticleDOI
TL;DR: The authors consider specific key performance indicators (KPIs) related to quality of service and propose using neural networks to automatically classify these KPIs with respect to quality of experience (QoE). Adopting the neural network ensures replicability of QoE estimation regardless of user involvement and simplifies QoE analysis for future communications systems.
Abstract: High data rates are usually envisaged by operators to satisfy subscribers using multimedia services. However, due to the increasing number of tablets, smartphones, and push applications, user needs can also involve low throughput. A new analysis of user satisfaction is necessary--the so-called quality of experience (QoE). The authors consider specific key performance indicators (KPIs) and propose using neural networks to provide an automatic classification among these KPIs (related to quality of service) and QoE. The adoption of the neural network ensures replicability of QoE estimation regardless of user involvement and simplifies QoE analysis for future communications systems.

31 citations
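The mapping from measured KPIs to a QoE class can be illustrated with a single linear unit, the simplest stand-in for the neural network the authors propose. The KPI features (throughput and delay, pre-scaled to comparable ranges), the labels, and the learning parameters are all hypothetical.

```python
def train_perceptron(samples, labels, epochs=100, lr=0.1):
    """Tiny linear classifier mapping KPI vectors to a QoE class.

    samples: (throughput, delay) pairs in hypothetical scaled units;
    labels: +1 for good QoE, -1 for poor QoE.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified: nudge the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def classify(w, b, x):
    # sign of the linear score decides the QoE class
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
```

A real deployment would use a multilayer network over many KPIs, but the principle is the same: learn a decision function from KPI measurements so QoE estimation no longer requires user involvement.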


Journal ArticleDOI
TL;DR: The authors have developed a gaze-based control paradigm to investigate how eye-based interaction techniques can be made precise and fast enough to let disabled people easily interact with multimedia information.
Abstract: The EU-funded MAMEM project (Multimedia Authoring and Management using your Eyes and Mind) aims to propose a framework for natural interaction with multimedia information for users who lack fine motor skills. As part of this project, the authors have developed a gaze-based control paradigm. Here, they outline the challenges of eye-controlled interaction with multimedia information and present initial project results. Their objective is to investigate how eye-based interaction techniques can be made precise and fast enough to let disabled people easily interact with multimedia information.

24 citations


Journal ArticleDOI
TL;DR: This department discusses multimedia hashing and networking and presents one paradigm of leveraging MINets to incorporate both visual and textual information to reach a sensible event coreference resolution.
Abstract: This department discusses multimedia hashing and networking. The authors summarize shallow-learning-based hashing and deep-learning-based hashing. By exploiting successful shallow-learning algorithms, state-of-the-art hashing techniques have been widely used in high-efficiency multimedia storage, indexing, and retrieval, especially in multimedia search applications on smartphone devices. The authors also introduce Multimedia Information Networks (MINets) and present one paradigm of leveraging MINets to incorporate both visual and textual information to reach a sensible event coreference resolution. The goal is to make deep learning practical in realistic multimedia applications.

Journal ArticleDOI
TL;DR: The authors propose the social-sensed multimedia computing paradigm, advocating the organic integration of social network and social media data with multimedia computing tasks and arguing that more researchers in the multimedia community should focus on the user dimension.
Abstract: The authors propose the social-sensed multimedia computing paradigm and advocate for the need to organically integrate social network and social media data with multimedia computing tasks. More researchers in the multimedia community should be focusing on the user dimension to quickly advance this line of research.

Journal ArticleDOI
TL;DR: This study investigated the relationships among the use of functions in computer-supported collaborative learning (CSCL), psychological factors, and learning behaviors related to applying the Community of Inquiry (CoI) framework to increase active interaction among learners.
Abstract: This study investigated, through both formative and practical evaluation, the relationships among the use of functions in computer-supported collaborative learning (CSCL), psychological factors, and learning behaviors related to applying the Community of Inquiry (CoI) framework. The goal was to increase active interaction among learners. In two experiments inside and outside the classroom, the authors examined an online discussion and collected data using questionnaires that assessed perceived psychological factors, as well as communication logs related to the efficacy of CoI. The results of a path analysis showed two points. First, cognitive learning tools support the enhancement of expressive cognitive presence that promotes the perception of CoI as formative evaluation. Second, the frequency of the use of the functions fostered expressive social and cognitive presence (which enhanced the perception of both), perceived contribution, and satisfaction with online discussion. This article is part of a special issue on social media for learning.

Journal ArticleDOI
TL;DR: The authors describe their efforts in the EU Recall project to extract memory cues from multimedia records to augment human memory beyond simple memory prosthetics.
Abstract: Technology has always had a direct impact on what humans remember. In the era of smartphones and wearable devices, people easily capture information, such as pictures and videos, on a daily basis. The so-called "quantified self" movement focuses on using such captured multimedia information, often in combination with additional contextual data (such as GPS traces or social media posts), with the goal of extracting and providing better insights into people's everyday actions (for example, fitness tracking, work productivity, and dieting). However, a more interesting use of such captured data might be to directly support human memory. Here, the authors describe their efforts in the EU Recall project to extract memory cues from multimedia records to augment human memory beyond simple memory prosthetics.

Journal ArticleDOI
TL;DR: This article presents an introduction to visual attention retargeting, its connection to visual saliency, the challenges associated with it, and ideas for how it can be approached.
Abstract: This article presents an introduction to visual attention retargeting, its connection to visual saliency, the challenges associated with it, and ideas for how it can be approached. The difficulty of attention retargeting as a saliency inversion problem lies in the lack of one-to-one mapping between saliency and the image domain, in addition to the possible negative impact of saliency alterations on image aesthetics. A few approaches from recent literature to solve this challenging problem are reviewed, and several suggestions for future development are presented.

Journal ArticleDOI
TL;DR: This paper presents a method to construct such a resume and illustrates the framework with current Semantic Web technologies, such as RDF and SPARQL for representing and querying semantic metadata, showing the benefits of indexing and retrieving multimedia contents without centralizing the contents or their associated metadata.
Abstract: Currently, many multimedia contents are acquired and stored in real time and at different locations. To efficiently retrieve the desired information and avoid centralizing all metadata, we propose to compute a centralized metadata resume, i.e., a concise version of the whole metadata, which locates desired multimedia contents on remote servers. The originality of this resume is that it is automatically constructed from the extracted metadata. In this paper, we present a method to construct such a resume and illustrate our framework with current Semantic Web technologies, such as RDF and SPARQL for representing and querying semantic metadata. Experimental results are provided to show the benefits of indexing and retrieving multimedia contents without centralizing the contents or their associated metadata, and to demonstrate the efficiency of a metadata resume.

Journal ArticleDOI
TL;DR: A novel approach for fast summarization of user-generated videos (UGVs) by selecting a few representative segments based on segment-level semantic and emotional recognition results, which are generally sufficient for a summary.
Abstract: This article introduces a novel approach for fast summarization of user-generated videos (UGVs). Different from other types of videos where the semantic content might vary greatly over time, most UGVs contain only a single shot with relatively consistent high-level semantics and emotional content. Therefore, a few representative segments, which can be selected based on segment-level semantic and emotional recognition results, are generally sufficient for a summary. In addition, due to the poor shooting quality of many UGVs, factors such as camera shaking and lighting conditions are also considered to achieve more pleasant summaries. This article is part of a special issue on quality modeling.

Journal ArticleDOI
TL;DR: In this article, the authors explore open social learner modeling (OSLM), a social extension of open learner modeling (OLM), by embedding visualization of both a learner's own model and other learning peers' models into different parts of the learning content.
Abstract: This article explores open social learner modeling (OSLM)--a social extension of open learner modeling (OLM). A specific implementation of this approach is presented by which learners' self-direction and self-determination in a social e-learning context could be potentially promoted. Unlike previous work, the proposed approach, multifaceted OSLM, lets the system seamlessly and adaptively embed visualization of both a learner's own model and other learning peers' models into different parts of the learning content, for multiple axes of context, at any time during the learning process. It also demonstrates the advantages of visualizing both learners' performance and their contribution to a learning community. An experimental study shows that, contrary to previous research, the richness and complexity of this new approach positively affected the learning experience in terms of perceived effectiveness, efficiency, and satisfaction. This article is part of a special issue on social media for learning.

Journal ArticleDOI
TL;DR: Experimental results on the University of Illinois at Urbana-Champaign Image Sense Discrimination dataset and the Google-MM dataset show that the authors' ensemble fusion model outperforms approaches using only a single modality for disambiguation and retrieval.
Abstract: In this article, the authors identify the correlative and complementary relations among multiple modalities. They then propose a multimodal ensemble fusion model to capture the complementary relation and correlative relation between two modalities (images and text) and explain why this ensemble fusion model works. Experimental results on the University of Illinois at Urbana-Champaign Image Sense Discrimination (UIUC-ISD) dataset and the Google-MM dataset show that their ensemble fusion model outperforms approaches using only a single modality for disambiguation and retrieval. Word sense disambiguation and information retrieval are the use cases they studied to demonstrate the effectiveness of their ensemble fusion model.

Journal ArticleDOI
TL;DR: A fusion method for incomplete traffic data is proposed to estimate traffic state accurately by extracting data correlations and applying incomplete data fusion, implementing the two approaches in parallel.
Abstract: Today, data-driven intelligent transportation systems must address data quality challenges, such as the missing data problem. For example, is it possible to improve the performance of traffic state estimation using incomplete data? In this article, an incomplete traffic data fusing method is proposed to estimate traffic state accurately. It improves missing data estimation by extracting data correlations and applying incomplete data fusion, implementing the two approaches in parallel. The main research focus is on extracting the inherent spatio-temporal correlations of traffic states data from road segments based on a multiple linear regression (MLR) model. Computational experiments for accuracy and efficiency demonstrate that this method can use data correlations to implement accurate and real-time traffic state estimation. This article is part of a special issue on quality modeling.
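The multiple linear regression step, predicting one road segment's traffic state from correlated neighboring segments, can be sketched with ordinary least squares via the normal equations. The feature layout and the data are hypothetical; the paper's full method additionally fuses incomplete data in parallel with this correlation model.

```python
def fit_mlr(X, y):
    """Ordinary least squares for y ~ intercept + w . x,
    solved via the normal equations with Gaussian elimination."""
    rows = [[1.0] + list(x) for x in X]  # prepend intercept column
    p = len(rows[0])
    # A w = b, where A = X^T X and b = X^T y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    for col in range(p):  # forward elimination with partial pivoting
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    w = [0.0] * p
    for i in range(p - 1, -1, -1):  # back substitution
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, p))) / A[i][i]
    return w

def predict(w, x):
    # fill in a missing traffic-state value from correlated segments
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
```

Here each row of X would hold the observed states of correlated segments (or time lags) and y the state of the segment whose reading is missing, so the fitted model imputes the gap.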

Journal ArticleDOI
TL;DR: The following three main topics are examined here: ubiquitous learning via social media services, intelligent tutoring in adaptive e-learning, and multimedia-enabled social learning.
Abstract: The popularity and influence of social media have been continuously expanding worldwide, and a similar trend is visible in educational settings. Social media tools have been reportedly used for a variety of educational purposes and in wide-ranging contexts, bridging formal and informal, as well as individual and collaborative learning. Currently, the trend is toward the integration of social media services with mobile and ubiquitous learning and adaptive educational technologies. To provide an overview of this interplay by reviewing the applicable techniques and relevant works in the area, we supplement this editorial with a tutorial survey. In particular, the following three main topics are examined here: ubiquitous learning via social media services, intelligent tutoring in adaptive e-learning, and multimedia-enabled social learning. The authors also introduce the articles featured in this special issue on social media for learning.

Journal ArticleDOI
TL;DR: Researchers at the Casa Paganini-InfoMus Research Centre work to combine scientific research in information and communications technology with artistic and humanistic research, showing how collaboration with artists informed work on analyzing nonverbal expressive and social behavior and contributed to tools that support both artistic and scientific developments.
Abstract: As art influences science and technology, science and technology can in turn inspire art. Recognizing this mutually beneficial relationship, researchers at the Casa Paganini-InfoMus Research Centre work to combine scientific research in information and communications technology (ICT) with artistic and humanistic research. Here, the authors discuss some of their work, showing how their collaboration with artists informed work on analyzing nonverbal expressive and social behavior and contributed to tools, such as the EyesWeb XMI hardware and software platform, that support both artistic and scientific developments. They also sketch out how art-informed multimedia and multimodal technologies find application beyond the arts, in areas including education, cultural heritage, social inclusion, therapy, rehabilitation, and wellness.

Journal ArticleDOI
TL;DR: A scale-aware spatially guided mapping (SaSGM) model is proposed, which formulates and combines multiple spatial influences of simple edge responses under different levels of detail and is thus more sensitive to image patterns at a large scale.
Abstract: The scale information in images is important for guiding image-filtering configuration. The authors propose a scale-aware spatially guided mapping (SaSGM) model, which formulates and combines multiple spatial influences of simple edge responses under different levels of detail. The SaSGM model is thus more sensitive to image patterns at a large scale. The authors further incorporate the SaSGM into several image processing models, such as detail enhancement and image stylization models. Experiments show that by inheriting the characteristics of the SaSGM, the extended models are able to differentiate image contents in terms of their scales and thus generate more natural or diversified visual effects. This article is part of a special issue on quality modeling.

Journal ArticleDOI
TL;DR: The authors propose an approach to encode contents and build advanced multimodal interfaces that protect intellectual property, using IEEE 1599, an international standard for music description, as a case study.
Abstract: With the advent of the digital age and the spread of portable digital audio players, interest in software and hardware tools that can help producers and distributors enhance and revive their catalogue of music has progressively increased. One of the main concerns of major labels is how to prevent file sharing. An innovative approach that couples reviving catalogues with support for rights management could provide an experience of multimedia content in which users select multiple media streams on the fly in a fully synchronized environment. Because this kind of user experience can't be reconstructed from the single original streams, illegal copying would be intrinsically discouraged. In this article, the authors propose an approach to encode contents and build advanced multimodal interfaces that protect intellectual property. As a case study, they use IEEE 1599, an international standard for music description.

Journal ArticleDOI
TL;DR: The authors present the person-centered multimedia computing approach inspired by assistive and rehabilitative applications, where the emphasis is on understanding the individual user's preferences and expectations toward designing, developing, and deploying effective solutions.
Abstract: Human-centered multimedia computing (HCMC) focuses on a tight engagement of humans in the design, development, and deployment of multimedia solutions. However, people's abilities change over time due to a variety of reasons, including age, context, and geographical location. To address this challenge, the authors recently introduced the concept of person-centered multimedia computing, where the emphasis is on understanding the individual user's preferences and expectations toward designing, developing, and deploying effective solutions. Today's multimedia technology is largely geared toward the "able" population; individuals with disabilities have largely been absent in the design process and thus must adapt themselves (often unsuccessfully) to available solutions. Further, individuals with disabilities have specific and individualized requirements that necessitate a person-centered, adaptive approach to multimedia computing. Here, the authors present the person-centered multimedia computing approach inspired by assistive and rehabilitative applications.

Journal ArticleDOI
TL;DR: The authors propose a method for automatic planogram compliance checking in retail chains that does not require product template images for training; the product layout is extracted from an input image by means of unsupervised recurring pattern detection and matched via graph matching.
Abstract: In this article, the authors propose a novel method for automatic planogram compliance checking in retail chains that doesn't require product template images for training. Product layout is extracted from an input image by means of unsupervised recurring pattern detection and matched via graph matching, with the expected product layout specified by a planogram to measure the level of compliance. A divide-and-conquer strategy is employed to improve the speed. Specifically, the input image is divided into several regions based on the planogram. Recurring patterns are detected in each region, respectively, and then merged together to estimate the product layout.

Journal ArticleDOI
TL;DR: This special issue provides another forum for the authors of the top symposium papers to further present their research results to the community.
Abstract: The wide-ranging applications and big data of ubiquitous multimedia present both unprecedented challenges and unique opportunities for multimedia computing research. This was the main theme of the 2015 IEEE International Symposium on Multimedia (ISM 2015), and this special issue provides another forum for the authors of the top symposium papers to further present their research results to the community.

Journal ArticleDOI
TL;DR: This article proposes an unsupervised method that labels speakers using just the information available in the news video, without external information; it uses face recognition, face clustering, face landmarking, natural language processing tools, and speaker diarization.
Abstract: Identifying the speakers in TV news would help listeners analyze and understand news content, but doing so in news videos is challenging because new faces often appear. Previous research has identified speakers using pretrained faces for TV shows and movies. This article proposes an unsupervised method for labeling speakers using just the information available in the news video, without external information. The proposed framework segments the audio by speaker, parses closed captions for speaker names, identifies who is speaking, and performs optical character recognition for speaker names. The framework uses face recognition, face clustering, face landmarking, natural language processing tools, and speaker diarization. Results indicate 63.6 percent accuracy for identifying speakers on CNN News.

Journal ArticleDOI
TL;DR: Following a feature extraction stage in which spatial domain statistics are utilized as features, a two-stage nonparametric NR-IQA framework is proposed, which requires no training phase and enables prediction of the image distortion type as well as local regions' quality.
Abstract: In this article, the authors explore an alternative way to perform no-reference image quality assessment (NR-IQA). Following a feature extraction stage in which spatial domain statistics are utilized as features, a two-stage nonparametric NR-IQA framework is proposed. This approach requires no training phase, and it enables prediction of the image distortion type as well as local regions' quality, which is not available in most current algorithms. Experimental results on IQA databases show that the proposed framework achieves high correlation to human perception of image quality and delivers competitive performance to state-of-the-art NR-IQA algorithms.
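Spatial-domain statistics for NR-IQA are commonly built from locally normalized luminance (MSCN-style) coefficients, whose distribution deviates from that of pristine images under distortion. A minimal sketch follows, assuming a 3x3 window and a stabilizing constant; the authors' exact feature set may differ.

```python
def mscn_coefficients(img, C=1.0):
    """Mean-subtracted, contrast-normalized (MSCN) coefficients over a
    3x3 neighborhood; C stabilizes the divisor in flat regions."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # 3x3 window, clamped at the image borders
            vals = [img[a][b]
                    for a in range(max(0, i - 1), min(h, i + 2))
                    for b in range(max(0, j - 1), min(w, j + 2))]
            mu = sum(vals) / len(vals)
            var = sum((v - mu) ** 2 for v in vals) / len(vals)
            out[i][j] = (img[i][j] - mu) / (var ** 0.5 + C)
    return out
```

Statistics of these coefficients (for instance, their variance or tail behavior per region) can then serve as the features feeding a nonparametric quality predictor, which is also what makes per-region quality maps possible.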

Journal ArticleDOI
TL;DR: A novel framework for automatically creating cinemagraphs from video sequences is presented, with specific emphasis on determining the composition of masks and layers to achieve aesthetically pleasing results.
Abstract: A cinemagraph is a novel medium that infuses a static image with the dynamics of one or several particular image regions. It is in many ways intermediate between a photograph and video, and it has numerous potential applications, such as in the creation of dynamic scenes in computer games and interactive environments. However, creating cinemagraphs is a time-consuming process requiring high proficiency in photo-editing techniques. This article presents a novel framework for automatically creating cinemagraphs from video sequences, with specific emphasis on determining the composition of masks and layers in creating aesthetically pleasing cinemagraphs. Treating video as a spatiotemporal data volume, the problem is considered a type of constrained optimization problem involving the discovery of a connected subgraph in video frames with maximal cumulative interestingness scores. The proposed framework accommodates multiple criteria describing qualities of interest in local image patches based on appearance and motion. Furthermore, the selected regions are not limited to certain shapes--the proposed approach facilitates capturing arbitrary objects. Experiments demonstrate the performance of the proposed approach. The findings of this study provide valuable information regarding various design choices for developing an easy and versatile authoring tool for cinemagraphs.