Journal ISSN: 1551-6857

ACM Transactions on Multimedia Computing, Communications, and Applications 

Association for Computing Machinery
About: ACM Transactions on Multimedia Computing, Communications, and Applications is an academic journal published by the Association for Computing Machinery. The journal publishes mainly in the areas of computer science and pattern recognition. It has the ISSN identifier 1551-6857. Over its lifetime, 1,234 publications have appeared, receiving 26,329 citations. The journal is also known as ACM Transactions on Multimedia Computing Communications and Applications, abbreviated TOMCCAP.


Papers
Journal Article · DOI
TL;DR: This survey reviews 100+ recent articles on content-based multimedia information retrieval and discusses their role in current research directions which include browsing and search paradigms, user studies, affective computing, learning, semantic queries, new features and media types, high performance indexing, and evaluation techniques.
Abstract: Extending beyond the boundaries of science, art, and culture, content-based multimedia information retrieval provides new paradigms and methods for searching through the myriad variety of media all over the world. This survey reviews 100+ recent articles on content-based multimedia information retrieval and discusses their role in current research directions which include browsing and search paradigms, user studies, affective computing, learning, semantic queries, new features and media types, high performance indexing, and evaluation techniques. Based on the current state of the art, we discuss the major challenges for the future.

1,652 citations

Journal Article · DOI
TL;DR: The purpose of this article is to provide a systematic classification of various ideas and techniques proposed towards the effective abstraction of video contents, and identify and detail, for each approach, the underlying components and how they are addressed in specific works.
Abstract: The demand for various multimedia applications is rapidly increasing due to the recent advance in the computing and network infrastructure, together with the widespread use of digital video technology. Among the key elements for the success of these applications is how to effectively and efficiently manage and store a huge amount of audio visual information, while at the same time providing user-friendly access to the stored data. This has fueled a quickly evolving research area known as video abstraction. As the name implies, video abstraction is a mechanism for generating a short summary of a video, which can either be a sequence of stationary images (keyframes) or moving images (video skims). In terms of browsing and navigation, a good video abstract will enable the user to gain maximum information about the target video sequence in a specified time constraint or sufficient information in the minimum time. Over past years, various ideas and techniques have been proposed towards the effective abstraction of video contents. The purpose of this article is to provide a systematic classification of these works. We identify and detail, for each approach, the underlying components and how they are addressed in specific works.
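The keyframe form of video abstraction described above can be illustrated with a minimal greedy selection rule: keep a frame whenever it differs enough from the last kept keyframe. This is only a sketch of the general idea, not any specific method from the survey; the threshold, the L2 distance, and the function name are all illustrative choices.

```python
import numpy as np

def select_keyframes(frames, threshold):
    """Greedy keyframe selection: keep a frame whenever its L2 distance
    from the last kept keyframe exceeds `threshold`. The first frame is
    always kept as the initial keyframe."""
    keyframes = [0]
    last = frames[0]
    for i in range(1, len(frames)):
        if np.linalg.norm(frames[i] - last) > threshold:
            keyframes.append(i)
            last = frames[i]
    return keyframes

# Toy "video": three near-constant segments with abrupt changes between them.
rng = np.random.default_rng(0)
video = np.concatenate([
    np.full((5, 8), 0.0), np.full((5, 8), 1.0), np.full((5, 8), 2.0)
]) + rng.normal(0, 0.01, (15, 8))
print(select_keyframes(video, threshold=1.0))  # → [0, 5, 10]
```

Real systems replace raw frame differences with shot-boundary detection, clustering, or learned features, but the output is the same kind of object: a short sequence of representative stills.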

879 citations

Journal Article · DOI
TL;DR: A Siamese network that simultaneously computes the identification loss and the verification loss is proposed, so that the network learns a discriminative embedding and a similarity measurement at the same time.
Abstract: In this article, we revisit two popular convolutional neural networks in person re-identification (re-ID): verification and identification models. The two models have their respective advantages and limitations due to different loss functions. Here, we shed light on how to combine the two models to learn more discriminative pedestrian descriptors. Specifically, we propose a Siamese network that simultaneously computes the identification loss and verification loss. Given a pair of training images, the network predicts the identities of the two input images and whether they belong to the same identity. Our network learns a discriminative embedding and a similarity measurement at the same time, thus taking full usage of the re-ID annotations. Our method can be easily applied on different pretrained networks. Albeit simple, the learned embedding improves the state-of-the-art performance on two public person re-ID benchmarks. Further, we show that our architecture can also be applied to image retrieval. The code is available at https://github.com/layumi/2016_person_re-ID.
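The loss combination described above can be sketched numerically: two per-image identification (multi-class cross-entropy) terms plus one binary verification term over the image pair. This is a hedged NumPy illustration of the objective only, not the authors' released training code; all logits, names, and values are made up for the example.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def cross_entropy(logits, label):
    return -np.log(softmax(logits)[label])

def joint_loss(logits_a, logits_b, verif_logits, id_a, id_b):
    """Sum of two identification losses (one per input image) and one
    binary verification loss (same identity or not), mirroring a Siamese
    network trained with both objectives."""
    same = 1 if id_a == id_b else 0
    ident = cross_entropy(logits_a, id_a) + cross_entropy(logits_b, id_b)
    verif = cross_entropy(verif_logits, same)
    return ident + verif

# Toy pair: both images confidently predicted as identity 2 (same person).
la = np.array([0.1, 0.2, 3.0])   # identity logits for image A
lb = np.array([0.0, 0.1, 2.5])   # identity logits for image B
v  = np.array([0.2, 2.0])        # verification logits: [different, same]
print(round(joint_loss(la, lb, v, id_a=2, id_b=2), 3))
```

In training, both terms would be backpropagated through the shared backbone, so the embedding is shaped by identity prediction and pairwise similarity at once.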

662 citations

Journal Article · DOI
TL;DR: A progressive unsupervised learning (PUL) method to transfer pretrained deep representations to unseen domains and demonstrates that PUL outputs discriminative features that improve the re-ID accuracy.
Abstract: The superiority of deeply learned pedestrian representations has been reported in very recent literature of person re-identification (re-ID). In this article, we consider the more pragmatic issue of learning a deep feature with no or only a few labels. We propose a progressive unsupervised learning (PUL) method to transfer pretrained deep representations to unseen domains. Our method is easy to implement and can be viewed as an effective baseline for unsupervised re-ID feature learning. Specifically, PUL iterates between (1) pedestrian clustering and (2) fine-tuning of the convolutional neural network (CNN) to improve the initialization model trained on the irrelevant labeled dataset. Since the clustering results can be very noisy, we add a selection operation between the clustering and fine-tuning. At the beginning, when the model is weak, CNN is fine-tuned on a small amount of reliable examples that locate near to cluster centroids in the feature space. As the model becomes stronger, in subsequent iterations, more images are being adaptively selected as CNN training samples. Progressively, pedestrian clustering and the CNN model are improved simultaneously until algorithm convergence. This process is naturally formulated as self-paced learning. We then point out promising directions that may lead to further improvement. Extensive experiments on three large-scale re-ID datasets demonstrate that PUL outputs discriminative features that improve the re-ID accuracy. Our code has been released at https://github.com/hehefan/Unsupervised-Person-Re-identification-Clustering-and-Fine-tuning.
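The selection step described above, which keeps only cluster members close to their centroid for the next fine-tuning round, can be sketched as follows. This is an illustrative NumPy fragment under assumed names, not the code released at the linked repository; in PUL the surviving samples would then be used to fine-tune the CNN before re-clustering.

```python
import numpy as np

def select_reliable(features, centroids, assignments, max_dist):
    """PUL-style selection: keep only samples whose distance to their
    assigned cluster centroid is below `max_dist`, so noisy cluster
    members far from the centroid are excluded from fine-tuning."""
    dists = np.linalg.norm(features - centroids[assignments], axis=1)
    return np.where(dists < max_dist)[0]

feats = np.array([[0.0, 0.0], [0.1, 0.0], [3.0, 3.0],   # cluster 0 + outlier
                  [5.0, 5.0], [5.1, 5.0]])              # cluster 1
cents = np.array([[0.05, 0.0], [5.05, 5.0]])
assign = np.array([0, 0, 0, 1, 1])
print(select_reliable(feats, cents, assign, max_dist=1.0))  # keeps 0, 1, 3, 4
```

Raising `max_dist` across iterations corresponds to the self-paced schedule in the abstract: as the model strengthens, more samples are adaptively admitted into training.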

488 citations

Journal Article · DOI
TL;DR: This is the first comprehensive survey of the field of PCG-G, and introduces a comprehensive, six-layered taxonomy of game content: bits, space, systems, scenarios, design, and derived.
Abstract: Hundreds of millions of people play computer games every day. For them, game content—from 3D objects to abstract puzzles—plays a major entertainment role. Manual labor has so far ensured that the quality and quantity of game content matched the demands of the playing community, but it is facing new scalability challenges due to the exponential growth over the last decade of both the gamer population and the production costs. Procedural Content Generation for Games (PCG-G) may address these challenges by automating, or aiding in, game content generation. PCG-G is difficult, since the generator has to create the content, satisfy constraints imposed by the artist, and return interesting instances for gamers. Despite a large body of research focusing on PCG-G, particularly over the past decade, ours is the first comprehensive survey of the field of PCG-G. We first introduce a comprehensive, six-layered taxonomy of game content: bits, space, systems, scenarios, design, and derived. Second, we survey the methods used across the whole field of PCG-G from a large research body. Third, we map PCG-G methods to game content layers; it turns out that many of the methods used to generate game content from one layer can be used to generate content from another. We also survey the use of methods in practice, that is, in commercial or prototype games. Fourth and last, we discuss several directions for future research in PCG-G, which we believe deserve close attention in the near future.

453 citations

Performance Metrics
No. of papers from the Journal in previous years:

Year  Papers
2023  106
2022  315
2021  114
2020  116
2019  101
2018  75