Author

Jun Sun

Bio: Jun Sun is an academic researcher from Shanghai Jiao Tong University. The author has contributed to research in topics: Transcoding & Multiple description coding. The author has an h-index of 20 and has co-authored 134 publications receiving 1,887 citations.


Papers
Proceedings ArticleDOI
20 Sep 2009
TL;DR: This paper presents the first complete design to apply compressive sampling theory to sensor data gathering for large-scale wireless sensor networks and shows the efficiency and robustness of the proposed scheme.
Abstract: This paper presents the first complete design to apply compressive sampling theory to sensor data gathering for large-scale wireless sensor networks. The scheme developed in this research is expected to offer a fresh perspective for research in both compressive sampling applications and large-scale wireless sensor networks. We consider the scenario in which a large number of sensor nodes are densely deployed and sensor readings are spatially correlated. The proposed compressive data gathering is able to reduce global-scale communication cost without introducing intensive computation or complicated transmission control. The load-balancing characteristic is capable of extending the lifetime of the entire sensor network as well as individual sensors. Furthermore, the proposed scheme can cope with abnormal sensor readings gracefully. We also carry out an analysis of the network capacity of the proposed compressive data gathering and validate the analysis through ns-2 simulations. More importantly, this novel compressive data gathering has been tested on real sensor data, and the results show the efficiency and robustness of the proposed scheme.
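The paper's full encoding and routing design is not reproduced here, but the compressive-sampling idea it builds on can be sketched in a few lines. The following is a minimal, illustrative sketch (my own, not the authors' code), assuming readings that are sparse in the DCT domain and a dense Gaussian measurement matrix; recovery at the sink uses a simple orthogonal matching pursuit.

```python
import numpy as np
from scipy.fft import idct

rng = np.random.default_rng(0)
N, M, K = 256, 64, 8                      # sensors, measurements, sparsity level

# Spatially correlated readings, modeled here as K-sparse in the DCT domain.
coeffs = np.zeros(N)
coeffs[rng.choice(N, K, replace=False)] = rng.normal(0, 10, K)
basis = idct(np.eye(N), axis=0, norm='ortho')      # inverse-DCT basis matrix
readings = basis @ coeffs

# In-network encoding: every node forwards M running weighted sums, so each
# node transmits O(M) values regardless of its depth in the routing tree.
Phi = rng.normal(0, 1 / np.sqrt(M), (M, N))
y = Phi @ readings                                 # what the sink receives

# Recovery at the sink via orthogonal matching pursuit in the DCT domain.
A = Phi @ basis
residual, support = y.copy(), []
for _ in range(K):
    support.append(int(np.argmax(np.abs(A.T @ residual))))
    sol, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    residual = y - A[:, support] @ sol
est = np.zeros(N)
est[support] = sol
recovered = basis @ est
print("relative recovery error:",
      np.linalg.norm(recovered - readings) / np.linalg.norm(readings))
```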

631 citations

Journal ArticleDOI
TL;DR: This paper investigates how to generate RIP (restricted isometry property) preserving measurements of sensor readings by taking multi-hop communication cost into account, and discovers that a simple form of measurement matrix has good RIP; the data gathering scheme that realizes this measurement matrix can further reduce the communication cost of CDG for both chain-type and tree-type topologies.
Abstract: We proposed compressive data gathering (CDG), which leverages the compressive sampling (CS) principle to efficiently reduce communication cost and prolong network lifetime for large-scale monitoring sensor networks. The network capacity has been proven to increase proportionally to the sparsity of sensor readings. In this paper, we further address two key problems in the CDG framework. First, we investigate how to generate RIP (restricted isometry property) preserving measurements of sensor readings by taking multi-hop communication cost into account. Excitingly, we discover that a simple form of measurement matrix [I R] has good RIP, and the data gathering scheme that realizes this measurement matrix can further reduce the communication cost of CDG for both chain-type and tree-type topologies. Second, although the sparsity of sensor readings is pervasive, it might be rather complicated to fully exploit it. Owing to the inherent flexibility of the CS principle, the proposed CDG framework is able to utilize various sparsity patterns despite a simple and unified data gathering process. In particular, we present approaches for adapting the CS decoder to utilize cross-domain sparsity (e.g., temporal-frequency and spatial-frequency). We carry out simulation experiments over both synthesized and real sensor data. The results confirm that CDG can preserve sensor data fidelity at a reduced communication cost.
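The shape of such an [I R] matrix is easy to picture. Below is a small, hedged illustration (my own, not taken from the paper): the left block is an identity, so the first M nodes forward raw readings, while the remaining readings enter only as random weighted sums; the matrix sizes and the crude submatrix sampling used to eyeball the isometry constant are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, K = 200, 60, 6                       # readings, measurements, sparsity

# Hypothetical [I R] construction: identity block plus a Gaussian block. Only
# the last N - M readings require mixing, which is where transmissions are saved.
R = rng.normal(0, 1 / np.sqrt(M), (M, N - M))
Phi = np.hstack([np.eye(M), R])

# Crude empirical look at the restricted isometry property: for random
# K-column submatrices, squared singular values should not stray far from 1.
deviations = []
for _ in range(500):
    cols = rng.choice(N, K, replace=False)
    s = np.linalg.svd(Phi[:, cols], compute_uv=False)
    deviations.append(max(abs(s.max() ** 2 - 1), abs(s.min() ** 2 - 1)))
print("largest RIP deviation over sampled supports:", round(max(deviations), 3))
```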

209 citations

Journal ArticleDOI
TL;DR: Among the hundreds of operation modes to support multi-program SDTV/HDTV terrestrial services in the DTTB standard, the key features and major applications of a series of modes named "PN595 + C1" are described in detail.
Abstract: A digital terrestrial television broadcasting (DTTB) standard named "Frame structure, channel coding and modulation for digital television terrestrial broadcasting system" was published in China in August 2006. This is the first paper of a series that provides a complete and in-depth description of the standard, including laboratory and field measurement results, detailed analysis of the technologies for achieving stable fixed reception and fast mobile reception, as well as methodologies for spectrum allocation and the principles and technologies of single frequency network operation. Among the hundreds of operation modes that support multi-program SDTV/HDTV terrestrial services in the standard, the key features and major applications of a series of modes named "PN595 + C1" are described in detail. Measurement results of PN595 + C1 are also presented to demonstrate the satisfactory performance in fixed reception and high-speed mobile reception.

93 citations

Journal ArticleDOI
TL;DR: The proposed metric has demonstrated the state-of-the-art performance for predicting the subjective point cloud quality compared with multiple full-reference and no-reference models, e.g., the weighted peak signal-to-noise ratio (PSNR), structural similarity (SSIM), feature similarity (FSIM) and natural image quality evaluator (NIQE).
Abstract: The point cloud has emerged as a promising media format for representing realistic 3D objects or scenes in applications such as virtual reality, teleportation, etc. How to accurately quantify the subjective point cloud quality for application-driven optimization, however, is still a challenging and open problem. In this paper, we attempt to tackle this problem in a systematic manner. First, we produce a fairly large point cloud dataset in which ten popular point clouds are augmented with seven types of impairments (e.g., compression, photometry/color noise, geometry noise, scaling) at six distortion levels, and organize a formal subjective assessment with tens of subjects to collect mean opinion scores (MOS) for all 420 processed point cloud samples (PPCS). We then develop an objective metric that can accurately estimate the subjective quality. Towards this goal, we project the 3D point cloud onto the six perpendicular image planes of a cube to obtain a color texture image and a corresponding depth image, and aggregate image-based global features (e.g., Jensen-Shannon (JS) divergence) and local features (e.g., edge, depth, pixel-wise similarity, complexity) across all projected planes into a final objective index. Model parameters are fixed constants after performing regression on a small and independent previously published dataset. The proposed metric demonstrates state-of-the-art performance in predicting subjective point cloud quality compared with multiple full-reference and no-reference models, e.g., the weighted peak signal-to-noise ratio (PSNR), structural similarity (SSIM), feature similarity (FSIM) and natural image quality evaluator (NIQE). The dataset is made publicly accessible at http://smt.sjtu.edu.cn or http://vision.nju.edu.cn for all interested audiences.
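To make the projection step concrete, here is a toy, hedged sketch (an approximation for illustration, not the published metric or its parameters): random points stand in for a real point cloud, only depth maps are built (no color texture), and a Jensen-Shannon divergence over per-face depth histograms stands in for the full global/local feature aggregation.

```python
import numpy as np
from scipy.stats import entropy

def face_depth_maps(points, res=64):
    """Project a point cloud onto the six axis-aligned faces of its bounding cube."""
    lo, hi = points.min(0), points.max(0)
    p = (points - lo) / (hi - lo + 1e-9)             # normalize to the unit cube
    maps = []
    for axis in range(3):                            # x, y, z projection axes
        uv = np.delete(p, axis, axis=1)
        ij = np.clip((uv * (res - 1)).astype(int), 0, res - 1)
        for near_face in (True, False):              # two opposite faces per axis
            d = p[:, axis] if near_face else 1.0 - p[:, axis]
            depth = np.zeros((res, res))
            order = np.argsort(-d)                   # nearest point written last, so it wins
            depth[ij[order, 0], ij[order, 1]] = d[order]
            maps.append(depth)
    return maps

def js_divergence(a, b, bins=32):
    """Jensen-Shannon divergence between the depth histograms of two maps."""
    ha, _ = np.histogram(a, bins=bins, range=(0, 1), density=True)
    hb, _ = np.histogram(b, bins=bins, range=(0, 1), density=True)
    ha, hb = ha / (ha.sum() + 1e-12), hb / (hb.sum() + 1e-12)
    m = 0.5 * (ha + hb)
    return 0.5 * entropy(ha, m) + 0.5 * entropy(hb, m)

rng = np.random.default_rng(2)
reference = rng.random((5000, 3))
distorted = reference + rng.normal(0, 0.02, reference.shape)   # synthetic geometry noise
score = np.mean([js_divergence(a, b)
                 for a, b in zip(face_depth_maps(reference), face_depth_maps(distorted))])
print("projection-based JS distortion index:", round(float(score), 4))
```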

82 citations

Journal ArticleDOI
TL;DR: A probabilistic model is proposed that explicitly introduces an extra variable, termed the quality variable, to represent the trustworthiness of noisy labels; it effectively minimizes the influence of label noise and outperforms state-of-the-art deep learning approaches.
Abstract: There is an emerging trend to leverage noisy image datasets in many visual recognition tasks. However, label noise in such datasets severely degrades the performance of deep learning approaches. Recently, one mainstream approach has been to introduce a latent label to handle label noise, which has shown promising improvement in network designs. Nevertheless, the mismatch between latent labels and noisy labels still affects the predictions of such methods. To address this issue, we propose a probabilistic model that explicitly introduces an extra variable to represent the trustworthiness of noisy labels, termed the quality variable. Our key idea is to identify the mismatch between the latent and noisy labels by embedding the quality variables into different subspaces, which effectively minimizes the influence of label noise. At the same time, reliable labels can still be used for training. To instantiate the model, we further propose a contrastive-additive noise network (CAN), which consists of two important layers: 1) the contrastive layer, which estimates the quality variable in the embedding space to reduce the influence of noisy labels, and 2) the additive layer, which aggregates the prior prediction and noisy labels as the posterior to train the classifier. Moreover, to tackle the challenges in optimization, we derive an SGD algorithm with reparameterization tricks, which makes our method scalable to big data. We validate the proposed method on a range of noisy image datasets. Comprehensive results demonstrate that CAN outperforms state-of-the-art deep learning approaches.
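A heavily simplified, hedged reading of the quality-variable idea follows (my own toy numpy version, not the CAN layers or their training procedure): a per-sample quality score decides how much the noisy label, versus the network's own prediction, should shape the training target.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(3)
B, C, D = 8, 5, 16                                   # batch size, classes, embedding dim
embeddings = rng.normal(size=(B, D))                 # sample embeddings
logits = rng.normal(size=(B, C))                     # classifier output (the prior)
noisy_labels = np.eye(C)[rng.integers(0, C, B)]      # one-hot noisy labels

# Stand-in "contrastive" quality estimate: agreement between each embedding and
# a per-class prototype (random here) selected by the noisy label.
prototypes = rng.normal(size=(C, D))
agreement = np.einsum('bd,bd->b', embeddings, noisy_labels @ prototypes)
quality = 1.0 / (1.0 + np.exp(-agreement))           # q in (0, 1)

# Stand-in "additive" posterior: trust the noisy label in proportion to its
# estimated quality, otherwise fall back on the model's own prediction.
prior = softmax(logits)
posterior = quality[:, None] * noisy_labels + (1 - quality[:, None]) * prior
loss = -np.mean(np.sum(posterior * np.log(prior + 1e-12), axis=1))
print("quality-weighted cross-entropy on the toy batch:", round(float(loss), 4))
```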

81 citations


Cited by
01 Jan 2006

3,012 citations

Proceedings Article
01 Jan 1994
TL;DR: The main focus in MUCKE is on cleaning large scale Web image corpora and on proposing image representations which are closer to the human interpretation of images.
Abstract: MUCKE aims to mine a large volume of images, to structure them conceptually, and to use this conceptual structuring to improve large-scale image retrieval. The last decade witnessed important progress concerning low-level image representations. However, there are a number of problems that need to be solved in order to unleash the full potential of image mining in applications. The central problem with low-level representations is the mismatch between them and the human interpretation of image content. This problem can be instantiated, for instance, by the inability of existing descriptors to capture spatial relationships between the concepts represented, or by their inability to convey an explanation of why two images are similar in a content-based image retrieval framework. We start by assessing existing local descriptors for image classification and by proposing to use co-occurrence matrices to better capture spatial relationships in images. The main focus in MUCKE is on cleaning large-scale Web image corpora and on proposing image representations which are closer to the human interpretation of images. Consequently, we introduce methods which tackle these two problems and compare results to state-of-the-art methods. Note: some aspects of this deliverable are withheld at this time as they are pending review. Please contact the authors for a preview.
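For readers unfamiliar with co-occurrence matrices, here is a minimal, hedged sketch (illustration only, unrelated to the MUCKE pipeline itself): count how often pairs of quantized intensity levels occur at a fixed spatial offset, which is one simple way to encode the spatial relationships that plain descriptor histograms discard.

```python
import numpy as np

def cooccurrence(img, levels=8, offset=(0, 1)):
    """Normalized gray-level co-occurrence matrix for one spatial offset."""
    q = np.clip((img * levels).astype(int), 0, levels - 1)    # quantize intensities
    dy, dx = offset
    a = q[max(0, -dy):q.shape[0] - max(0, dy), max(0, -dx):q.shape[1] - max(0, dx)]
    b = q[max(0, dy):, max(0, dx):][:a.shape[0], :a.shape[1]]
    m = np.zeros((levels, levels))
    np.add.at(m, (a.ravel(), b.ravel()), 1)                   # count level pairs
    return m / m.sum()

rng = np.random.default_rng(4)
image = rng.random((32, 32))
glcm = cooccurrence(image, offset=(0, 1))                     # right-neighbor pairs
levels = np.arange(glcm.shape[0])
contrast = float(np.sum(glcm * np.subtract.outer(levels, levels) ** 2))
print("GLCM shape:", glcm.shape, " contrast:", round(contrast, 3))
```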

2,134 citations

Journal ArticleDOI
TL;DR: The Rotation Region Proposal Networks are designed to generate inclined proposals with text orientation angle information; the angle information is then adapted for bounding box regression so the proposals fit the text region more accurately in terms of orientation.
Abstract: This paper introduces a novel rotation-based framework for arbitrary-oriented text detection in natural scene images. We present the Rotation Region Proposal Networks, which are designed to generate inclined proposals with text orientation angle information. The angle information is then adapted for bounding box regression to make the proposals more accurately fit into the text region in terms of the orientation. The Rotation Region-of-Interest pooling layer is proposed to project arbitrary-oriented proposals to a feature map for a text region classifier. The whole framework is built upon a region-proposal-based architecture, which ensures the computational efficiency of arbitrary-oriented text detection compared with previous text detection systems. We conduct experiments using the rotation-based framework on three real-world scene text detection datasets and demonstrate its superiority in terms of effectiveness and efficiency over previous approaches.
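To illustrate what "inclined proposals with angle information" means in practice, here is a small, hedged sketch (my own, not the RRPN implementation): a rotated box is parameterized as (cx, cy, w, h, theta), the familiar anchor-relative regression targets gain an angle term, and corner coordinates are recovered by rotating the half-extent offsets.

```python
import numpy as np

def encode(anchor, truth):
    """Anchor-relative regression targets for a rotated box (cx, cy, w, h, theta)."""
    ax, ay, aw, ah, atheta = anchor
    gx, gy, gw, gh, gtheta = truth
    return np.array([(gx - ax) / aw, (gy - ay) / ah,
                     np.log(gw / aw), np.log(gh / ah), gtheta - atheta])

def corners(box):
    """Four corner points of a rotated box, e.g., for visualization or cropping."""
    cx, cy, w, h, theta = box
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    offsets = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0
    return offsets @ rot.T + np.array([cx, cy])

anchor = np.array([50.0, 40.0, 60.0, 20.0, 0.0])          # axis-aligned anchor
truth = np.array([55.0, 42.0, 70.0, 22.0, np.pi / 6])     # slanted text box
print("targets (tx, ty, tw, th, t_theta):", np.round(encode(anchor, truth), 3))
print("ground-truth corners:\n", np.round(corners(truth), 1))
```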

1,002 citations

Journal ArticleDOI
TL;DR: This survey provides a comprehensive overview of a variety of object detection methods in a systematic manner, covering the one-stage and two-stage detectors, and lists the traditional and new applications.
Abstract: Object detection is one of the most important and challenging branches of computer vision. It has been widely applied in everyday life, such as security monitoring and autonomous driving, with the purpose of locating instances of semantic objects of a certain class. With the rapid development of deep learning algorithms for detection tasks, the performance of object detectors has been greatly improved. In order to understand the main development status of the object detection pipeline thoroughly and deeply, in this survey we first analyze the methods of existing typical detection models and describe the benchmark datasets. Afterwards, and primarily, we provide a comprehensive overview of a variety of object detection methods in a systematic manner, covering the one-stage and two-stage detectors. Moreover, we list the traditional and new applications. Some representative branches of object detection are analyzed as well. Finally, we discuss the architecture of exploiting these object detection methods to build an effective and efficient system, and point out a set of development trends to better follow the state-of-the-art algorithms and guide further research.

749 citations

Journal ArticleDOI
TL;DR: The purpose of this paper is to provide a complete survey of the traditional and recent approaches to background modeling for foreground detection, and categorize the different approaches in terms of the mathematical models used.

664 citations