Author
Zhe-Ming Lu
Bio: Zhe-Ming Lu is an academic researcher from Zhejiang University. The author has contributed to research in topics: Filter (signal processing) & Standard test image. The author has an h-index of 2 and has co-authored 2 publications receiving 183 citations.
Papers
TL;DR: Experiments on TRECVID 2001 test data and other video materials show that the proposed scheme can achieve a high detection speed and excellent accuracy compared with recent SBD schemes.
Abstract: Video shot boundary detection (SBD) is the first and essential step for content-based video management and structural analysis. Considerable effort has gone into developing SBD algorithms over the years. However, the high computational cost of SBD is an obstacle to further applications such as video indexing, browsing, retrieval, and representation. Motivated by the requirements of real-time interactive applications, a unified fast SBD scheme is proposed in this paper. We adopt candidate segment selection and singular value decomposition (SVD) to speed up the SBD. Initially, the positions of the shot boundaries and the lengths of gradual transitions are predicted using adaptive thresholds, and most non-boundary frames are discarded at the same time. Only the candidate segments that may contain shot boundaries are preserved for further detection. Then, for all frames in each candidate segment, their color histograms in the hue-saturation-value (HSV) space are extracted, forming a frame-feature matrix. The SVD is then performed on the frame-feature matrices of all candidate segments to reduce the feature dimension. The refined feature vector of each frame in the candidate segments is obtained as a new metric for boundary detection. Finally, cut and gradual transitions are identified using our pattern matching method based on a new similarity measurement. Experiments on TRECVID 2001 test data and other video materials show that the proposed scheme can achieve a high detection speed and excellent accuracy compared with recent SBD schemes.
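The frame-feature matrix described in the abstract can be illustrated with a minimal sketch. It assumes each frame is given as a list of (r, g, b) pixels and that the HSV space is quantized into 8 x 4 x 4 bins; the bin counts and the function names `hsv_histogram` and `frame_feature_matrix` are illustrative choices, not details taken from the paper. The SVD step is not shown.

```python
import colorsys

# Assumed quantization of the HSV space (not specified in the abstract).
H_BINS, S_BINS, V_BINS = 8, 4, 4


def hsv_histogram(frame):
    """Return a normalized HSV histogram for one frame.

    frame: iterable of (r, g, b) pixels with components in 0..255.
    """
    hist = [0.0] * (H_BINS * S_BINS * V_BINS)
    n = 0
    for r, g, b in frame:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a component equal to 1.0 falls in the last bin.
        hb = min(int(h * H_BINS), H_BINS - 1)
        sb = min(int(s * S_BINS), S_BINS - 1)
        vb = min(int(v * V_BINS), V_BINS - 1)
        hist[(hb * S_BINS + sb) * V_BINS + vb] += 1.0
        n += 1
    return [c / n for c in hist] if n else hist


def frame_feature_matrix(frames):
    """Stack one histogram per frame: rows are frames, columns are bins."""
    return [hsv_histogram(f) for f in frames]
```

Each row of the returned matrix is one frame's feature vector; a dimensionality reduction such as the SVD would then operate on this matrix.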
133 citations
TL;DR: The proposed hashing algorithm shows superior robustness and discrimination performance compared with other state-of-the-art algorithms, particularly in the robustness against rotations (of large degrees).
Abstract: In this paper, we propose a robust-hash function based on random Gabor filtering and dithered lattice vector quantization (LVQ). In order to enhance the robustness against rotation manipulations, the conventional Gabor filter is adapted to be rotation invariant, and the rotation-invariant filter is randomized to facilitate secure feature extraction. Particularly, a novel dithered-LVQ-based quantization scheme is proposed for robust hashing. The dithered-LVQ-based quantization scheme is well suited for robust hashing with several desirable features, including better tradeoff between robustness and discrimination, higher randomness, and secrecy, which are validated by analytical and experimental results. The performance of the proposed hashing algorithm is evaluated over a test image database under various content-preserving manipulations. The proposed hashing algorithm shows superior robustness and discrimination performance compared with other state-of-the-art algorithms, particularly in the robustness against rotations (of large degrees).
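As a rough illustration of dithered quantization for hashing, the sketch below adds a key-seeded pseudo-random dither to each feature and rounds to the scaled integer lattice. The integer lattice is a deliberately simple stand-in for the lattice vector quantizer in the paper, and the names `dithered_quantize` and `hash_distance`, the `step` parameter, and the use of Python's `random` module as the keyed generator are all assumptions made for this example.

```python
import math
import random


def dithered_quantize(features, key, step=1.0):
    """Quantize features to the lattice step*Z^n after adding a keyed dither.

    The dither is uniform in [-step/2, step/2] and is reproducible only
    with the same key, so the lattice indices depend on the key.
    """
    rng = random.Random(key)
    dither = [rng.uniform(-step / 2, step / 2) for _ in features]
    # floor(v + 0.5) rounds to the nearest integer without banker's rounding.
    return [math.floor((x + d) / step + 0.5) for x, d in zip(features, dither)]


def hash_distance(h1, h2):
    """Fraction of positions at which two hashes differ."""
    return sum(a != b for a, b in zip(h1, h2)) / len(h1)
```

With the same key, two feature vectors that differ only slightly mostly land in the same lattice cells, so `hash_distance` stays small; without the key, the cell boundaries are unknown.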
81 citations
Cited by
1,584 citations
TL;DR: This work incorporates ring partition and invariant vector distance into an image hashing algorithm to enhance rotation robustness and discriminative capability, and demonstrates that the proposed hashing algorithm is robust to commonly used digital image operations.
Abstract: Robustness and discrimination are two of the most important objectives in image hashing. We incorporate ring partition and invariant vector distance into an image hashing algorithm to enhance rotation robustness and discriminative capability. As ring partition is unrelated to image rotation, the statistical features that are extracted from image rings in a perceptually uniform color space, i.e., the CIE L*a*b* color space, are rotation invariant and stable. In particular, the Euclidean distance between vectors of these perceptual features is invariant to commonly used digital image operations (e.g., JPEG compression, gamma correction, and brightness/contrast adjustment), which helps make the image hash compact and discriminative. We conduct experiments to evaluate the efficiency with 250 color images, and demonstrate that the proposed hashing algorithm is robust to commonly used digital image operations. In addition, with the receiver operating characteristics curve, we illustrate that our hashing is much better than the existing popular hashing algorithms in robustness and discrimination.
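The claim that ring statistics do not depend on rotation can be checked with a small sketch. It assumes a single-channel image stored as a list of rows, equal-area concentric rings inside the inscribed circle, and the ring mean as the only statistic; the paper works in CIE L*a*b* and may use other statistics, and `ring_means` and `rotate90` are illustrative names.

```python
def ring_means(image, n_rings=4):
    """Mean pixel value in each of n_rings equal-area concentric rings."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    r_max_sq = (min(h, w) / 2) ** 2
    sums = [0.0] * n_rings
    counts = [0] * n_rings
    for y in range(h):
        for x in range(w):
            r_sq = (y - cy) ** 2 + (x - cx) ** 2
            if r_sq >= r_max_sq:
                continue  # ignore corners outside the inscribed circle
            # Equal-area rings: the ring index grows linearly with r^2.
            k = min(int(n_rings * r_sq / r_max_sq), n_rings - 1)
            sums[k] += image[y][x]
            counts[k] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def rotate90(image):
    """Rotate a list-of-rows image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]
```

Because a rotation about the image center keeps every pixel at the same radius, each pixel stays in its ring and the per-ring means are unchanged. For arbitrary angles the match is only approximate, since resampling alters pixel values.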
192 citations
TL;DR: An efficient image hashing with a ring partition and a nonnegative matrix factorization (NMF) is designed, which has both the rotation robustness and good discriminative capability.
Abstract: This paper designs an efficient image hashing with a ring partition and a nonnegative matrix factorization (NMF), which has both the rotation robustness and good discriminative capability. The key contribution is a novel construction of rotation-invariant secondary image, which is used for the first time in image hashing and helps to make image hash resistant to rotation. In addition, NMF coefficients are approximately linearly changed by content-preserving manipulations, so as to measure hash similarity with correlation coefficient. We conduct experiments for illustrating the efficiency with 346 images. Our experiments show that the proposed hashing is robust against content-preserving operations, such as image rotation, JPEG compression, watermark embedding, Gaussian low-pass filtering, gamma correction, brightness adjustment, contrast adjustment, and image scaling. Receiver operating characteristics (ROC) curve comparisons are also conducted with the state-of-the-art algorithms, and demonstrate that the proposed hashing is much better than all these algorithms in classification performances with respect to robustness and discrimination.
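The reason a correlation coefficient suits hashes that change approximately linearly can be shown in a few lines. This is a minimal sketch of the Pearson correlation between two hash vectors; the function name `correlation` and the convention of returning 0.0 for a constant vector are assumptions for this example, and the NMF step itself is not shown.

```python
import math


def correlation(h1, h2):
    """Pearson correlation coefficient between two equal-length hashes."""
    n = len(h1)
    m1, m2 = sum(h1) / n, sum(h2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(h1, h2))
    v1 = sum((a - m1) ** 2 for a in h1)
    v2 = sum((b - m2) ** 2 for b in h2)
    if v1 == 0 or v2 == 0:
        return 0.0  # assumed convention: a constant hash carries no signal
    return cov / math.sqrt(v1 * v2)
```

If a manipulation maps every coefficient `c` to roughly `a * c + b` with `a > 0`, the correlation between the original and the manipulated hash stays close to 1, whereas hashes of unrelated images have no such linear relationship.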
181 citations
TL;DR: This survey conducts a comprehensive overview of the state-of-the-art research work on multimedia big data analytics, and aims to bridge the gap between multimedia challenges and big data solutions by providing the current big data frameworks, their applications in multimedia analyses, the strengths and limitations of the existing methods, and the potential future directions in multimedia big data analytics.
Abstract: With the proliferation of online services and mobile technologies, the world has stepped into a multimedia big data era. A vast amount of research work has been done in the multimedia area, targeting different aspects of big data analytics, such as the capture, storage, indexing, mining, and retrieval of multimedia big data. However, very few research works provide a complete survey of the whole pipeline of multimedia big data analytics, including the management and analysis of the large amount of data, the challenges and opportunities, and the promising research directions. To serve this purpose, we present this survey, which conducts a comprehensive overview of the state-of-the-art research work on multimedia big data analytics. It also aims to bridge the gap between multimedia challenges and big data solutions by providing the current big data frameworks, their applications in multimedia analyses, the strengths and limitations of the existing methods, and the potential future directions in multimedia big data analytics. To the best of our knowledge, this is the first survey that targets the most recent multimedia management techniques for very large-scale data and also provides the research studies and technologies advancing the multimedia analyses in this big data era.
168 citations
TL;DR: The experimental results show that the proposed method can efficiently detect shot boundaries under both abrupt and gradual transitions, and even under different levels of illumination, motion effects and camera operations (zoom in, zoom out and camera rotation).
Abstract: In today’s digital era, large volumes of long-duration videos from movies, documentaries, sports, and surveillance cameras circulate on the internet and in video databases (e.g., YouTube). Since manual processing of these videos is difficult, time-consuming, and expensive, an automatic technique for abstracting them is highly desirable. Against this backdrop, this paper presents a novel and efficient approach to video shot boundary detection and keyframe extraction, which subsequently leads to a summarized and compact video. The proposed method detects video shot boundaries by extracting the SIFT-point distribution histogram (SIFT-PDH) from the frames as a combination of local and global features. In the subsequent step, video shot boundaries are detected using the distance between the SIFT-PDHs of consecutive frames and an adaptive threshold. Further, the keyframes representing the salient content of each segmented shot are extracted using an entropy-based singular values measure. The summarized video is then generated by combining the extracted keyframes. The experimental results show that our method can efficiently detect shot boundaries under both abrupt and gradual transitions, and even under different levels of illumination, motion effects and camera operations (zoom in, zoom out and camera rotation). With the proposed method, the computational complexity is comparatively low and the video summary is very compact.
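The boundary-detection step (consecutive-frame distance plus an adaptive threshold) can be sketched as follows. The sketch takes any per-frame histogram as input, so SIFT-PDH extraction is not shown; the L1 distance, the threshold rule of mean plus `k` standard deviations, and the name `detect_boundaries` are assumptions for illustration rather than the paper's exact formulation.

```python
from statistics import mean, pstdev


def detect_boundaries(histograms, k=2.0):
    """Return indices of frames that start a new shot.

    histograms: one feature histogram per frame, in temporal order.
    k: how many standard deviations above the mean a distance must be.
    """
    # L1 distance between each pair of consecutive frames.
    dists = [
        sum(abs(a - b) for a, b in zip(h1, h2))
        for h1, h2 in zip(histograms, histograms[1:])
    ]
    if not dists:
        return []
    # The threshold adapts to the statistics of this particular video.
    threshold = mean(dists) + k * pstdev(dists)
    return [i + 1 for i, d in enumerate(dists) if d > threshold]
```

A global threshold of this kind responds to abrupt cuts; gradual transitions spread the change over many frames and would typically need a windowed or multi-frame comparison instead.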
104 citations