Online Data Organizer: Micro-Video Categorization by Structure-Guided Multimodal Dictionary Learning

doi:10.1109/TIP.2018.2875363

Journal Article

Meng Liu, +4 more

- 01 Mar 2019 -

IEEE Transactions on Image Processing

- Vol. 28, Iss: 3, pp 1235-1247

TLDR

A structure-guided multi-modal dictionary learning model is built to learn concept-level micro-video representations by jointly considering venue structure and modality relatedness, and an online learning algorithm is developed to strengthen this model incrementally and efficiently.

Abstract:

Micro-videos have rapidly become one of the most dominant trends in the era of social media, so how to organize them deserves attention. Unlike traditional long videos, which may span multiple scenes and can tolerate delayed processing, a micro-video: 1) usually records content at one specific venue within a few seconds. Venues are structured hierarchically according to their category granularity, which motivates us to organize micro-videos by their venue structure; and 2) circulates quickly over social networks, so its timeliness calls for effective online processing. However, only 1.22% of micro-videos are labeled with venue information when uploaded from mobile devices. To address this problem, we present a framework that organizes micro-videos online. In particular, we first build a structure-guided multi-modal dictionary learning model to learn concept-level micro-video representations by jointly considering venue structure and modality relatedness. We then develop an online learning algorithm that incrementally and efficiently strengthens our model and categorizes micro-videos into a tree structure. Extensive experiments on a real-world data set validate our model. In addition, we have released the code to facilitate research in the community.

Citations

Proceedings Article

MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video

Yinwei Wei, +5 more

TL;DR: A Multi-modal Graph Convolution Network (MMGCN) framework is designed. Built upon the message-passing idea of graph neural networks, it yields modal-specific representations of users and micro-videos to better capture user preferences.

Journal Article

MGAT: Multimodal Graph Attention Network for Recommendation

Zhulin Tao, +5 more

- 01 Sep 2020 -

Information Processing and Management

TL;DR: A new Multimodal Graph Attention Network, short for MGAT, is proposed, which disentangles personal interests at the granularity of modality and is able to capture more complex interaction patterns hidden in user behaviors and provide a more accurate recommendation.

Proceedings Article

Graph-Refined Convolutional Network for Multimedia Recommendation with Implicit Feedback

Yinwei Wei, +4 more

TL;DR: A new GCN-based recommender model, Graph-Refined Convolutional Network (GRCN), is devised. It adaptively adjusts the structure of the interaction graph based on the status of model training instead of keeping the structure fixed.

Proceedings Article

Personalized Hashtag Recommendation for Micro-videos

Yinwei Wei, +5 more

TL;DR: A Graph Convolution Network based Personalized Hashtag Recommendation (GCN-PHR) model is presented, which leverages recent GCN techniques to model the complicated interactions among users, hashtags, and micro-videos and to learn their representations.

Journal Article

Infrared and visible image fusion using dual discriminators generative adversarial networks with Wasserstein distance

Jing Li, +3 more

- 01 Aug 2020 -

Information Sciences

TL;DR: D2WGAN, a framework that extends GAN to a dual-discriminator Wasserstein generative adversarial network, is reported to generate better results than other state-of-the-art methods; a novel LBP (local binary pattern) loss is also defined.

References

Journal Article

Image Super-Resolution Using Deep Convolutional Networks

Chao Dong, +3 more

- 01 Feb 2016 -

IEEE Transactions on Pattern Analysis an...

TL;DR: A deep learning method for single image super-resolution (SR) is proposed, which directly learns an end-to-end mapping between low-resolution and high-resolution images.

Journal Article

Image Denoising Via Sparse and Redundant Representations Over Learned Dictionaries

Michael Elad, +1 more

- 01 Dec 2006 -

IEEE Transactions on Image Processing

TL;DR: This work addresses the image denoising problem, where zero-mean white and homogeneous Gaussian additive noise is to be removed from a given image, and uses the K-SVD algorithm to obtain a dictionary that describes the image content effectively.

Journal Article

Image Super-Resolution Via Sparse Representation

Jianchao Yang, +3 more

- 01 Nov 2010 -

IEEE Transactions on Image Processing

TL;DR: This paper presents a new approach to single-image superresolution, based upon sparse signal representation, which generates high-resolution images that are competitive or even superior in quality to images produced by other similar SR methods.

Proceedings Article

Online dictionary learning for sparse coding

Julien Mairal, +3 more

TL;DR: A new online optimization algorithm for dictionary learning is proposed, based on stochastic approximations, which scales up gracefully to large datasets with millions of training samples, and leads to faster performance and better dictionaries than classical batch algorithms for both small and large datasets.
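
To make the idea of online dictionary learning concrete, the following is a minimal sketch, not the code released with any paper listed here. It assumes a stream of single-modality feature vectors and a plain L1 sparsity penalty; the venue-structure and multi-modal terms of the main article are not modeled. The function names (soft_threshold, sparse_code, online_dictionary_learning) and the parameters n_atoms and lam are hypothetical, and the sparse coding step uses ISTA as a simple stand-in for whatever solver the original work uses.

```python
import numpy as np


def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def sparse_code(x, D, lam, n_iter=100):
    """ISTA for min_a 0.5 * ||x - D a||^2 + lam * ||a||_1 (assumed solver)."""
    # Squared spectral norm of D bounds the Lipschitz constant of the gradient.
    lipschitz = np.linalg.norm(D, 2) ** 2 + 1e-12
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / lipschitz, lam / lipschitz)
    return a


def online_dictionary_learning(stream, n_atoms, lam=0.1, seed=0):
    """Process samples one at a time, updating the dictionary after each.

    Past samples are summarized by the accumulated statistics A and B, so
    the samples themselves never need to be stored or revisited.
    """
    rng = np.random.default_rng(seed)
    D = A = B = None
    for x in stream:
        x = np.asarray(x, dtype=float)
        if D is None:
            # Hypothetical initialization: random unit-norm atoms.
            D = rng.standard_normal((x.size, n_atoms))
            D /= np.linalg.norm(D, axis=0, keepdims=True)
            A = np.zeros((n_atoms, n_atoms))
            B = np.zeros((x.size, n_atoms))
        a = sparse_code(x, D, lam)
        A += np.outer(a, a)  # running sum of a a^T
        B += np.outer(x, a)  # running sum of x a^T
        # One pass of block coordinate descent over the dictionary atoms.
        for j in range(n_atoms):
            if A[j, j] < 1e-10:
                continue  # atom not used so far; leave it unchanged
            u = D[:, j] + (B[:, j] - D @ A[:, j]) / A[j, j]
            D[:, j] = u / max(np.linalg.norm(u), 1.0)  # keep norm <= 1
    return D
```

Calling online_dictionary_learning(X, n_atoms=5) on an array X with one sample per row returns a dictionary with one column per atom. Because only A and B grow with the stream, the cost of each update does not depend on how many samples have already been seen.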

Journal Article

Sparse Representation for Color Image Restoration

Julien Mairal, +2 more

- 01 Jan 2008 -

IEEE Transactions on Image Processing

TL;DR: This work puts forward ways for handling nonhomogeneous noise and missing information, paving the way to state-of-the-art results in applications such as color image denoising, demosaicing, and inpainting, as demonstrated in this paper.

Related Papers (5)

Deep Residual Learning for Image Recognition

BPR: Bayesian personalized ranking from implicit feedback

Inductive Representation Learning on Large Graphs

Semi-Supervised Classification with Graph Convolutional Networks

Learning Spatiotemporal Features with 3D Convolutional Networks