Journal Article
Online Data Organizer: Micro-Video Categorization by Structure-Guided Multimodal Dictionary Learning
TL;DR: A structure-guided multi-modal dictionary learning model is built to learn concept-level micro-video representations by jointly considering venue structure and modality relatedness, and an online learning algorithm is developed to incrementally and efficiently strengthen this model.
Abstract:
Micro-videos have rapidly become one of the most dominant trends in the era of social media, and how to organize them has accordingly drawn our attention. Distinct from traditional long videos, which may span multiple scenes and tolerate delayed processing, a micro-video 1) usually records content at one specific venue within a few seconds, and 2) circulates over social networks in a timely manner. Venues are structured hierarchically according to their category granularity, which motivates us to organize micro-videos via their venue structure; the timeliness of micro-videos, in turn, calls for effective online processing. However, only 1.22% of micro-videos are labeled with venue information when uploaded at the mobile end. To address this problem, we present a framework to organize micro-videos online. In particular, we first build a structure-guided multi-modal dictionary learning model to learn the concept-level micro-video representation by jointly considering their venue structure and modality relatedness. We then develop an online learning algorithm to incrementally and efficiently strengthen our model, as well as to categorize the micro-videos into a tree structure. Extensive experiments on a real-world data set validate our model. In addition, we have released the code to facilitate research in the community.
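To make the idea of incrementally strengthening a dictionary concrete, here is a minimal sketch of generic online dictionary learning for sparse coding, in the style of the stochastic-approximation approach listed under References. It is not the authors' model: it ignores venue structure and multiple modalities, and the function names, the ISTA sparse coder, and the parameters `lam` and `n_iter` are assumptions chosen for illustration only.

```python
import numpy as np


def soft_threshold(x, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def sparse_code(D, x, lam=0.1, n_iter=100):
    """ISTA for min_a 0.5 * ||x - D a||^2 + lam * ||a||_1 (illustrative solver)."""
    L = np.linalg.norm(D, 2) ** 2 + 1e-12  # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a


def online_dictionary_learning(stream, dim, n_atoms, lam=0.1, seed=0):
    """Update a dictionary one sample at a time from an iterable of vectors.

    A and B accumulate sufficient statistics of past sparse codes, so each
    new sample refines the dictionary without revisiting earlier samples.
    """
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((dim, n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)  # start with unit-norm atoms
    A = np.zeros((n_atoms, n_atoms))
    B = np.zeros((dim, n_atoms))
    for x in stream:
        a = sparse_code(D, x, lam)
        A += np.outer(a, a)
        B += np.outer(x, a)
        # Block-coordinate descent over atoms, skipping atoms never used so far.
        for j in range(n_atoms):
            if A[j, j] < 1e-10:
                continue
            u = D[:, j] + (B[:, j] - D @ A[:, j]) / A[j, j]
            D[:, j] = u / max(np.linalg.norm(u), 1.0)  # project onto the unit ball
    return D
```

A stream of feature vectors (here assumed to be fixed-length descriptors, one per incoming video) can be passed as any iterable, for example a generator, so the dictionary is refined as new items arrive rather than retrained from scratch.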
Citations
Proceedings Article
MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video
TL;DR: A Multi-modal Graph Convolution Network (MMGCN) framework is designed upon the message-passing idea of graph neural networks; it yields modal-specific representations of users and micro-videos to better capture user preferences.
Journal Article
MGAT: Multimodal Graph Attention Network for Recommendation
TL;DR: A new Multimodal Graph Attention Network, short for MGAT, is proposed, which disentangles personal interests at the granularity of modality and is able to capture more complex interaction patterns hidden in user behaviors and provide a more accurate recommendation.
Proceedings Article
Graph-Refined Convolutional Network for Multimedia Recommendation with Implicit Feedback
TL;DR: A new GCN-based recommender model, Graph-Refined Convolutional Network (GRCN), is devised; it adaptively adjusts the structure of the interaction graph based on the status of model training instead of keeping the structure fixed.
Proceedings Article
Personalized Hashtag Recommendation for Micro-videos
TL;DR: A Graph Convolution Network based Personalized Hashtag Recommendation (GCN-PHR) model is proposed, which leverages recently advanced GCN techniques to model the complicated interactions among users, hashtags, and micro-videos and to learn their representations.
Journal Article
Infrared and visible image fusion using dual discriminators generative adversarial networks with Wasserstein distance
TL;DR: D2WGAN, a framework that extends GAN to a dual-discriminator Wasserstein generative adversarial network, can generate better results than other state-of-the-art methods; a novel LBP (local binary pattern) loss is also defined.
References
Journal Article
Image Super-Resolution Using Deep Convolutional Networks
TL;DR: This work proposes a deep learning method for single image super-resolution (SR) that directly learns an end-to-end mapping between low- and high-resolution images.
Journal Article
Image Denoising Via Sparse and Redundant Representations Over Learned Dictionaries
Michael Elad, Michal Aharon
TL;DR: This work addresses the image denoising problem, where zero-mean white and homogeneous Gaussian additive noise is to be removed from a given image, and uses the K-SVD algorithm to obtain a dictionary that describes the image content effectively.
Journal Article
Image Super-Resolution Via Sparse Representation
TL;DR: This paper presents a new approach to single-image superresolution, based upon sparse signal representation, which generates high-resolution images that are competitive or even superior in quality to images produced by other similar SR methods.
Proceedings Article
Online dictionary learning for sparse coding
TL;DR: A new online optimization algorithm for dictionary learning is proposed, based on stochastic approximations, which scales up gracefully to large datasets with millions of training samples, and leads to faster performance and better dictionaries than classical batch algorithms for both small and large datasets.
Journal Article
Sparse Representation for Color Image Restoration
TL;DR: This work puts forward ways for handling nonhomogeneous noise and missing information, paving the way to state-of-the-art results in applications such as color image denoising, demosaicing, and inpainting, as demonstrated in this paper.