Journal Article

Online Data Organizer: Micro-Video Categorization by Structure-Guided Multimodal Dictionary Learning

TLDR
A structure-guided multi-modal dictionary learning model is built to learn concept-level micro-video representations by jointly considering venue structure and modality relatedness, and an online learning algorithm is developed to strengthen this model incrementally and efficiently.
Abstract
Micro-videos have rapidly become one of the most dominant trends in the era of social media, which raises the question of how to organize them. Unlike traditional long videos, which may span scenes at multiple sites and tolerate hysteresis, a micro-video has two distinguishing properties. 1) It usually records content at one specific venue within a few seconds. Venues are structured hierarchically according to their category granularity, which motivates us to organize micro-videos via their venue structure. 2) It circulates over social networks in a timely manner, so the timeliness of micro-videos calls for effective online processing. However, only 1.22% of micro-videos are labeled with venue information when uploaded from the mobile end. To address this problem, we present a framework to organize micro-videos online. In particular, we first build a structure-guided multi-modal dictionary learning model to learn concept-level micro-video representations by jointly considering venue structure and modality relatedness. We then develop an online learning algorithm that incrementally and efficiently strengthens our model and categorizes the micro-videos into a tree structure. Extensive experiments on a real-world data set validate our model. In addition, we have released the code to facilitate research in the community.
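
The abstract does not spell out the online algorithm, so the following is only a minimal sketch of generic online dictionary learning for a single modality, in the spirit of the "Online dictionary learning for sparse coding" entry in the reference list. It is not the authors' structure-guided multi-modal model: the venue hierarchy and modality relatedness terms are omitted, and all names here (`sparse_code`, `online_dictionary_learning`, `n_atoms`, `lam`) are hypothetical, chosen for illustration. It shows how a dictionary can be refined one sample at a time from accumulated statistics rather than by revisiting the whole data set.

```python
import numpy as np


def soft_threshold(x, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def sparse_code(D, x, lam=0.1, n_iter=100):
    """ISTA for min_a 0.5 * ||x - D a||^2 + lam * ||a||_1 (assumed solver)."""
    # Step size from the Lipschitz constant of the smooth term's gradient.
    lipschitz = np.linalg.norm(D, 2) ** 2 + 1e-12
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / lipschitz, lam / lipschitz)
    return a


def online_dictionary_learning(X, n_atoms=8, lam=0.1, seed=0):
    """Process the rows of X one at a time and return a (features x atoms) dictionary."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    D = rng.standard_normal((n_features, n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)

    # Sufficient statistics accumulated over the stream.
    A = np.zeros((n_atoms, n_atoms))      # running sum of a a^T
    B = np.zeros((n_features, n_atoms))   # running sum of x a^T

    for x in X:
        a = sparse_code(D, x, lam)
        A += np.outer(a, a)
        B += np.outer(x, a)
        # One pass of block coordinate descent over the atoms.
        for j in range(n_atoms):
            if A[j, j] < 1e-10:
                continue  # atom not used so far; leave it unchanged
            u = D[:, j] + (B[:, j] - D @ A[:, j]) / A[j, j]
            D[:, j] = u / max(np.linalg.norm(u), 1.0)  # project onto the unit ball
    return D


if __name__ == "__main__":
    # Synthetic stand-in for per-video feature vectors (an assumption, not real data).
    features = np.random.default_rng(1).standard_normal((200, 16))
    dictionary = online_dictionary_learning(features, n_atoms=8)
    print(dictionary.shape)
```

Because only the matrices `A` and `B` are carried between samples, the memory cost does not grow with the number of videos seen, which is the property that makes this family of methods attractive for streaming data.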

Citations
Proceedings Article

MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video

TL;DR: A Multi-modal Graph Convolution Network (MMGCN) framework is designed that builds upon the message-passing idea of graph neural networks and yields modal-specific representations of users and micro-videos to better capture user preferences.
Journal Article

MGAT: Multimodal Graph Attention Network for Recommendation

TL;DR: A new Multimodal Graph Attention Network, short for MGAT, is proposed, which disentangles personal interests at the granularity of modality and is able to capture more complex interaction patterns hidden in user behaviors and provide a more accurate recommendation.
Proceedings Article

Graph-Refined Convolutional Network for Multimedia Recommendation with Implicit Feedback

TL;DR: A new GCN-based recommender model, the Graph-Refined Convolutional Network (GRCN), is devised; it adaptively adjusts the structure of the interaction graph based on the status of model training instead of keeping the structure fixed.
Proceedings Article

Personalized Hashtag Recommendation for Micro-videos

TL;DR: A Graph Convolution Network based Personalized Hashtag Recommendation (GCN-PHR) model is proposed, which leverages recent GCN techniques to model the complicated interactions among users, hashtags, and micro-videos and to learn their representations.
Journal Article

Infrared and visible image fusion using dual discriminators generative adversarial networks with Wasserstein distance

TL;DR: D2WGAN, a framework that extends GAN to a dual-discriminator Wasserstein generative adversarial network and defines a novel LBP (local binary pattern) loss, can generate better results than other state-of-the-art methods.
References
Journal Article

Image Super-Resolution Using Deep Convolutional Networks

TL;DR: This paper proposes a deep learning method for single image super-resolution (SR) that directly learns an end-to-end mapping between low- and high-resolution images.
Journal Article

Image Denoising Via Sparse and Redundant Representations Over Learned Dictionaries

TL;DR: This work addresses the image denoising problem, where zero-mean white and homogeneous Gaussian additive noise is to be removed from a given image, and uses the K-SVD algorithm to obtain a dictionary that describes the image content effectively.
Journal Article

Image Super-Resolution Via Sparse Representation

TL;DR: This paper presents a new approach to single-image superresolution, based upon sparse signal representation, which generates high-resolution images that are competitive or even superior in quality to images produced by other similar SR methods.
Proceedings Article

Online dictionary learning for sparse coding

TL;DR: A new online optimization algorithm for dictionary learning is proposed, based on stochastic approximations, which scales up gracefully to large datasets with millions of training samples, and leads to faster performance and better dictionaries than classical batch algorithms for both small and large datasets.
Journal Article

Sparse Representation for Color Image Restoration

TL;DR: This work puts forward ways for handling nonhomogeneous noise and missing information, paving the way to state-of-the-art results in applications such as color image denoising, demosaicing, and inpainting, as demonstrated in this paper.