Showing papers by "National University of Defense Technology published in 2017"
••
25 Jun 2017
TL;DR: An overview of blockchain architecture is provided, some typical consensus algorithms used in different blockchains are compared, and possible future trends for blockchain are laid out.
Abstract: Blockchain, the foundation of Bitcoin, has received extensive attention recently. Blockchain serves as an immutable ledger which allows transactions to take place in a decentralized manner. Blockchain-based applications are springing up, covering numerous fields including financial services, reputation systems, and the Internet of Things (IoT). However, blockchain technology still faces many challenges, such as scalability and security problems, that remain to be overcome. This paper presents a comprehensive overview of blockchain technology. We first provide an overview of blockchain architecture and compare some typical consensus algorithms used in different blockchains. Furthermore, technical challenges and recent advances are briefly listed. We also lay out possible future trends for blockchain.
2,642 citations
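As a rough illustration of the "immutable ledger" idea in the abstract above, the following minimal sketch chains blocks by hash so that editing an earlier block is detectable. It is not taken from the paper: the function names, the JSON block layout, and the use of SHA-256 are assumptions made for illustration, and consensus is left out entirely.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block (assumption)


def block_hash(index, prev_hash, transactions):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transactions):
    """Append a block that commits to the hash of its predecessor."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    })


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS_PREV
        if block["prev_hash"] != expected_prev:
            return False
        recomputed = block_hash(block["index"], block["prev_hash"], block["transactions"])
        if block["hash"] != recomputed:
            return False
    return True
```

Changing the transactions of any stored block makes `is_valid` return `False`, because the stored hash no longer matches the recomputed one.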
••
TL;DR: This overview reviews theoretical underpinnings of multi-view learning and attempts to identify promising venues and point out some specific challenges which can hopefully promote further research in this rapidly developing field.
679 citations
••
01 Aug 2017
TL;DR: The Improved Deep Embedded Clustering (IDEC) algorithm is proposed, which manipulates feature space to scatter data points using a clustering loss as guidance and can jointly optimize cluster label assignment and learn features that are suitable for clustering with local structure preservation.
Abstract: Deep clustering learns deep feature representations that favor clustering tasks using neural networks. Some pioneering work proposes to simultaneously learn embedded features and perform clustering by explicitly defining a clustering-oriented loss. Though promising performance has been demonstrated in various applications, we observe that a vital ingredient has been overlooked by these works: the defined clustering loss may corrupt feature space, which leads to non-representative, meaningless features, and this in turn hurts clustering performance. To address this issue, in this paper, we propose the Improved Deep Embedded Clustering (IDEC) algorithm to take care of data structure preservation. Specifically, we manipulate feature space to scatter data points using a clustering loss as guidance. To constrain the manipulation and maintain the local structure of the data-generating distribution, an under-complete autoencoder is applied. By integrating the clustering loss and the autoencoder’s reconstruction loss, IDEC can jointly optimize cluster label assignment and learn features that are suitable for clustering with local structure preservation. The resultant optimization problem can be effectively solved by mini-batch stochastic gradient descent and backpropagation. Experiments on image and text datasets empirically validate the importance of local structure preservation and the effectiveness of our algorithm.
566 citations
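The combination of a clustering loss with a reconstruction loss described above can be sketched as follows. The abstract does not spell out the formulas, so this sketch assumes the Student's t soft assignment and sharpened target distribution commonly used in deep embedded clustering, a mean squared reconstruction error, and a hypothetical weight `gamma`. It uses plain Python lists rather than a deep learning framework, and it recomputes the targets on every call, whereas a training loop would typically hold them fixed between periodic updates.

```python
import math


def soft_assign(z, centers):
    """Student's t kernel soft assignment q[i][j] of embedded point i to cluster j."""
    q = []
    for zi in z:
        row = [1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(zi, c))) for c in centers]
        total = sum(row)
        q.append([v / total for v in row])
    return q


def target_distribution(q):
    """Sharpened targets p[i][j], proportional to q[i][j]^2 / soft cluster frequency."""
    freq = [sum(row[j] for row in q) for j in range(len(q[0]))]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(len(row))]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def clustering_loss(p, q):
    """KL(P || Q) summed over all points and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
    )


def reconstruction_loss(x, x_rec):
    """Mean over samples of the squared reconstruction error."""
    per_sample = [sum((a - b) ** 2 for a, b in zip(xi, ri)) for xi, ri in zip(x, x_rec)]
    return sum(per_sample) / len(per_sample)


def joint_loss(x, x_rec, z, centers, gamma=0.1):
    """Reconstruction loss plus gamma times clustering loss (gamma is a hypothetical weight)."""
    q = soft_assign(z, centers)
    p = target_distribution(q)
    return reconstruction_loss(x, x_rec) + gamma * clustering_loss(p, q)
```

Here `x` and `x_rec` stand for inputs and autoencoder reconstructions, and `z` for the embedded features; keeping the reconstruction term in the objective is what constrains how far the clustering term can distort the embedding.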
••
University of Ljubljana (1); University of Birmingham (2); Czech Technical University in Prague (3); Linköping University (4); Austrian Institute of Technology (5); Autonomous University of Madrid (6); Parthenope University of Naples (7); University of Isfahan (8); University of Oxford (9); Superior National School of Advanced Techniques (10); Middle East Technical University (11); Dalian University of Technology (12); Chinese Academy of Sciences (13); ASELSAN (14); United States Naval Research Laboratory (15); National University of Defense Technology (16); University of Science and Technology of China (17); Electronics and Telecommunications Research Institute (18); Zhejiang University (19); Beijing University of Posts and Telecommunications (20); Huazhong University of Science and Technology (21); University of Missouri (22); Carnegie Mellon University (23); General Electric (24); King Abdullah University of Science and Technology (25); University of California, Merced (26); University of Surrey (27); University at Albany, SUNY (28)
TL;DR: The Visual Object Tracking challenge VOT2017 is the fifth annual tracker benchmarking activity organized by the VOT initiative; results of 51 trackers are presented; many are state-of-the-art published at major computer vision conferences or journals in recent years.
Abstract: The Visual Object Tracking challenge VOT2017 is the fifth annual tracker benchmarking activity organized by the VOT initiative. Results of 51 trackers are presented; many are state-of-the-art published at major computer vision conferences or journals in recent years. The evaluation included the standard VOT and other popular methodologies and a new "real-time" experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. Performance of the tested trackers typically by far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. The VOT2017 goes beyond its predecessors by (i) improving the VOT public dataset and introducing a separate VOT2017 sequestered dataset, (ii) introducing a realtime tracking experiment and (iii) releasing a redesigned toolkit that supports complex experiments. The dataset, the evaluation kit and the results are publicly available at the challenge website.
485 citations
•
[...]
06 Jul 2017
TL;DR: In this article, a dual path network (DPN) is proposed for image classification, which shares common features while maintaining the flexibility to explore new features through dual path architectures, achieving state-of-the-art performance on the ImageNet-1k, Places365 and PASCAL VOC datasets.
Abstract: In this work, we present a simple, highly efficient and modularized Dual Path Network (DPN) for image classification which presents a new topology of connection paths internally. By revealing the equivalence of the state-of-the-art Residual Network (ResNet) and Densely Convolutional Network (DenseNet) within the HORNN framework, we find that ResNet enables feature re-usage while DenseNet enables new feature exploration, both of which are important for learning good representations. To enjoy the benefits of both path topologies, our proposed Dual Path Network shares common features while maintaining the flexibility to explore new features through dual path architectures. Extensive experiments on three benchmark datasets, ImageNet-1k, Places365 and PASCAL VOC, clearly demonstrate superior performance of the proposed DPN over the state of the art. In particular, on the ImageNet-1k dataset, a shallow DPN surpasses the best ResNeXt-101(64x4d) with 26% smaller model size, 25% less computational cost and 8% lower memory consumption, and a deeper DPN (DPN-131) further pushes the state-of-the-art single model performance with about 2 times faster training speed. Experiments on the Places365 large-scale scene dataset, PASCAL VOC detection dataset, and PASCAL VOC segmentation dataset also demonstrate its consistently better performance than DenseNet, ResNet and the latest ResNeXt model over various applications.
475 citations
••
TL;DR: A novel neural network architecture for encoding and synthesis of 3D shapes, particularly their structures, is introduced and it is demonstrated that without supervision, the network learns meaningful structural hierarchies adhering to perceptual grouping principles, produces compact codes which enable applications such as shape classification and partial matching, and supports shape synthesis and interpolation with significant variations in topology and geometry.
Abstract: We introduce a novel neural network architecture for encoding and synthesis of 3D shapes, particularly their structures. Our key insight is that 3D shapes are effectively characterized by their hierarchical organization of parts, which reflects fundamental intra-shape relationships such as adjacency and symmetry. We develop a recursive neural net (RvNN) based autoencoder to map a flat, unlabeled, arbitrary part layout to a compact code. The code effectively captures hierarchical structures of man-made 3D objects of varying structural complexities despite being fixed-dimensional: an associated decoder maps a code back to a full hierarchy. The learned bidirectional mapping is further tuned using an adversarial setup to yield a generative model of plausible structures, from which novel structures can be sampled. Finally, our structure synthesis framework is augmented by a second trained module that produces fine-grained part geometry, conditioned on global and local structural context, leading to a full generative pipeline for 3D shapes. We demonstrate that without supervision, our network learns meaningful structural hierarchies adhering to perceptual grouping principles, produces compact codes which enable applications such as shape classification and partial matching, and supports shape synthesis and interpolation with significant variations in topology and geometry.
403 citations
••
TL;DR: A novel convolutional neural network framework is proposed for time series classification that can discover and extract the suitable internal structure to generate deep features of the input time series automatically by using convolution and pooling operations.
398 citations
••
14 Nov 2017
TL;DR: A convolutional autoencoder structure is developed to learn embedded features in an end-to-end way, and a clustering-oriented loss is directly built on embedded features to jointly perform feature refinement and cluster assignment.
Abstract: Deep clustering utilizes deep neural networks to learn feature representations that are suitable for clustering tasks. Though demonstrating promising performance in various applications, we observe that existing deep clustering algorithms either do not take full advantage of convolutional neural networks or do not adequately preserve the local structure of the data-generating distribution in the learned feature space. To address this issue, we propose a deep convolutional embedded clustering algorithm in this paper. Specifically, we develop a convolutional autoencoder structure to learn embedded features in an end-to-end way. Then, a clustering-oriented loss is directly built on embedded features to jointly perform feature refinement and cluster assignment. To avoid the feature space being distorted by the clustering loss, we retain the decoder, which can preserve the local structure of data in feature space. In sum, we simultaneously minimize the reconstruction loss of the convolutional autoencoder and the clustering loss. The resultant optimization problem can be effectively solved by mini-batch stochastic gradient descent and back-propagation. Experiments on benchmark datasets empirically validate the power of convolutional autoencoders for feature learning and the effectiveness of local structure preservation.
377 citations
•
[...]
TL;DR: This work reveals the equivalence of the state-of-the-art Residual Network (ResNet) and Densely Convolutional Network (DenseNet) within the HORNN framework, and finds that ResNet enables feature re-usage while DenseNet enables new features exploration which are both important for learning good representations.
342 citations
•
TL;DR: A novel monocular visual odometry system called UnDeepVO, able to estimate the 6-DoF pose of a monocular camera and the depth of its view by using deep neural networks, is proposed.
Abstract: We propose a novel monocular visual odometry (VO) system called UnDeepVO in this paper. UnDeepVO is able to estimate the 6-DoF pose of a monocular camera and the depth of its view by using deep neural networks. There are two salient features of the proposed UnDeepVO: one is the unsupervised deep learning scheme, and the other is the absolute scale recovery. Specifically, we train UnDeepVO by using stereo image pairs to recover the scale but test it by using consecutive monocular images. Thus, UnDeepVO is a monocular system. The loss function defined for training the networks is based on spatial and temporal dense information. A system overview is shown in Fig. 1. The experiments on KITTI dataset show our UnDeepVO achieves good performance in terms of pose accuracy.
324 citations
••
TL;DR: A large scale performance evaluation for texture classification, empirically assessing forty texture features including thirty two recent most promising LBP variants and eight non-LBP descriptors based on deep convolutional networks on thirteen widely-used texture datasets.
••
TL;DR: A new diversified DBN is developed through regularizing pretraining and fine-tuning procedures by a diversity promoting prior over latent factors that obtain much better results than original DBNs and comparable or even better performances compared with other recent hyperspectral image classification methods.
Abstract: In the literature of remote sensing, deep models with multiple layers have demonstrated their potentials in learning the abstract and invariant features for better representation and classification of hyperspectral images. The usual supervised deep models, such as convolutional neural networks, need a large number of labeled training samples to learn their model parameters. However, the real-world hyperspectral image classification task provides only a limited number of training samples. This paper adopts another popular deep model, i.e., deep belief networks (DBNs), to deal with this problem. The DBNs allow unsupervised pretraining over unlabeled samples at first and then a supervised fine-tuning over labeled samples. But the usual pretraining and fine-tuning method would make many hidden units in the learned DBNs tend to behave very similarly or perform as “dead” (never responding) or “potential over-tolerant” (always responding) latent factors. These results could negatively affect description ability and thus classification performance of DBNs. To further improve DBN’s performance, this paper develops a new diversified DBN through regularizing pretraining and fine-tuning procedures by a diversity promoting prior over latent factors. Moreover, the regularized pretraining and fine-tuning can be efficiently implemented through usual recursive greedy and back-propagation learning framework. The experiments over real-world hyperspectral images demonstrated that the diversity promoting prior in both pretraining and fine-tuning procedure lead to the learned DBNs with more diverse latent factors, which directly make the diversified DBNs obtain much better results than original DBNs and comparable or even better performances compared with other recent hyperspectral image classification methods.
••
TL;DR: This study demonstrates how public and private data sources that are commonly available for LMICs can be used to provide novel insight into the spatial distribution of poverty, indicating the possibility to estimate and continually monitor poverty rates at high spatial resolution in countries with limited capacity to support traditional methods of data collection.
Abstract: Poverty is one of the most important determinants of adverse health outcomes globally, a major cause of societal instability and one of the largest causes of lost human potential. Traditional approaches to measuring and targeting poverty rely heavily on census data, which in most low- and middle-income countries (LMICs) are unavailable or out-of-date. Alternate measures are needed to complement and update estimates between censuses. This study demonstrates how public and private data sources that are commonly available for LMICs can be used to provide novel insight into the spatial distribution of poverty. We evaluate the relative value of modelling three traditional poverty measures using aggregate data from mobile operators and widely available geospatial data. Taken together, models combining these data sources provide the best predictive power (highest r² = 0.78) and lowest error, but generally models employing mobile data only yield comparable results, offering the potential to measure poverty more frequently and at finer granularity. Stratifying models into urban and rural areas highlights the advantage of using mobile data in urban areas and different data in different contexts. The findings indicate the possibility to estimate and continually monitor poverty rates at high spatial resolution in countries with limited capacity to support traditional methods of data collection.
••
TL;DR: In this paper, an integrated local trajectory planning and tracking control (ILTPTC) framework for autonomous vehicles driving along a reference path with obstacles avoidance is proposed, where an efficient state-space sampling-based trajectory planning scheme is employed to smoothly follow the reference path.
••
TL;DR: In this paper, a frequency-selective surface (FSS) with high in-band transmission at high frequency and wideband absorption at low frequency is presented; its resistive element embeds a strip-type parallel LC (PLC) structure in a lumped-resistor-loaded metallic dipole.
Abstract: This communication presents a novel frequency-selective surface (FSS) with high in-band transmission at high frequency and wideband absorption at low frequency. It consists of a resistive sheet and a metallic bandpass FSS separated by a foam spacer. The resistive element is realized by inserting a strip-type parallel $LC$ (PLC) structure into the center of a lumped-resistor-loaded metallic dipole. The PLC resonates at the passband of the bandpass FSS and exhibits an infinite impedance, which splits the resistive dipole into two short sections per the surface current; this setup allows for high in-band transmission at high frequency. Below the resonance frequency, the PLC becomes finite inductive and the entire FSS performs as an absorber with the metallic FSS as a ground plane. The surface current distribution on the resistive element can be controlled at various frequencies via the PLC structure. The wideband absorption and high in-band transmission of the proposed design are verified by both numerical simulation and experimental measurements. The potential extension to polarization-insensitive designs is also discussed.
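The role of the parallel LC section can be illustrated with the textbook impedance of an ideal, lossless parallel LC circuit: it grows without bound at resonance and is inductive below it. The sketch below is a circuit-level illustration only, not the paper's full-wave model; the function names and the component values in the usage note are hypothetical.

```python
import math


def resonance_frequency(inductance, capacitance):
    """Resonant frequency in Hz of an ideal LC tank: f0 = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance * capacitance))


def parallel_lc_impedance(freq, inductance, capacitance):
    """Impedance in ohms of an ideal lossless parallel LC at freq (Hz).

    Exactly at resonance the admittance is zero and the impedance is
    unbounded, so evaluate slightly off resonance to avoid division by zero.
    """
    omega = 2.0 * math.pi * freq
    admittance = 1.0 / (1j * omega * inductance) + 1j * omega * capacitance
    return 1.0 / admittance
```

With, say, a hypothetical 2 nH inductor and 0.5 pF capacitor, `parallel_lc_impedance` returns a positive imaginary (inductive) value at half the resonance frequency and a much larger magnitude just below resonance, which matches the qualitative behavior the abstract describes.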
••
TL;DR: An improved detection method based on Faster R-CNN is proposed, which employs a hyper region proposal network (HRPN) to extract vehicle-like targets with a combination of hierarchical feature maps and replaces the classifier after RPN by a cascade of boosted classifiers to verify the candidate regions.
Abstract: Detecting vehicles in aerial imagery plays an important role in a wide range of applications. Current vehicle detection methods are mostly based on sliding-window search and handcrafted or shallow-learning-based features, which have limited description capability and heavy computational costs. Recently, owing to their powerful feature representations, detection methods based on region convolutional neural networks (CNNs), especially Faster R-CNN, have achieved state-of-the-art performance in computer vision. However, directly using Faster R-CNN for vehicle detection in aerial images has many limitations: (1) the region proposal network (RPN) in Faster R-CNN performs poorly at accurately locating small-sized vehicles, due to the relatively coarse feature maps; and (2) the classifier after the RPN cannot distinguish vehicles and complex backgrounds well. In this study, an improved detection method based on Faster R-CNN is proposed to address the two challenges mentioned above. Firstly, to improve the recall, we employ a hyper region proposal network (HRPN) to extract vehicle-like targets with a combination of hierarchical feature maps. Then, we replace the classifier after the RPN with a cascade of boosted classifiers to verify the candidate regions, aiming at reducing false detections by negative example mining. We evaluate our method on the Munich vehicle dataset and the collected vehicle dataset, with improvements in accuracy and robustness compared to existing methods.
••
TL;DR: This work focused on miR-24-1 and found that this miRNA unconventionally activates gene transcription by targeting enhancers, and demonstrates a novel mechanism of miRNA as an enhancer trigger.
Abstract: MicroRNAs (miRNAs) are small non-coding RNAs that function as negative gene expression regulators. Emerging evidence shows that, in addition to functioning in the cytoplasm, miRNAs are also present in the nucleus. However, the functional significance of nuclear miRNAs remains largely undetermined. By screening a miRNA database, we have identified a subset of miRNAs that function as enhancer regulators. Here, we found that a set of miRNAs show gene-activation function. We focused on miR-24-1 and found that this miRNA unconventionally activates gene transcription by targeting enhancers. Consistently, the activation was completely abolished when the enhancer sequence was deleted by TALEN. Furthermore, we found that miR-24-1 activates enhancer RNA (eRNA) expression, alters histone modification, and increases the enrichment of p300 and RNA Pol II at the enhancer locus. Our results demonstrate a novel mechanism of miRNA as an enhancer trigger.
••
01 Oct 2017
TL;DR: In this article, a fast-to-train two-streamed CNN is proposed to predict depth and depth gradients, which are then fused together into an accurate and detailed depth map.
Abstract: Estimating depth from a single RGB image is an ill-posed and inherently ambiguous problem. State-of-the-art deep learning methods can now estimate accurate 2D depth maps, but when the maps are projected into 3D, they lack local detail and are often highly distorted. We propose a fast-to-train two-streamed CNN that predicts depth and depth gradients, which are then fused together into an accurate and detailed depth map. We also define a novel set loss over multiple images; by regularizing the estimation between a common set of images, the network is less prone to overfitting and achieves better accuracy than competing methods. Experiments on the NYU Depth v2 dataset show that our depth predictions are competitive with the state of the art and lead to faithful 3D projections.
••
TL;DR: An optimized artificial potential field algorithm for multi-UAV operation in 3-D dynamic space with a distance factor and jump strategy to solve common problems, such as unreachable targets, and ensure that the UAV will not collide with any obstacles is presented.
Abstract: Unmanned aerial vehicle (UAV) systems are one of the most rapidly developing, highest level and most practical applied unmanned aerial systems. Collision avoidance and trajectory planning are the core areas of any UAV system. However, there are theoretical and practical problems associated with the existing methods. To manage these problems, this paper presents an optimized artificial potential field (APF) algorithm for multi-UAV operation in 3-D dynamic space. The classic APF algorithm is restricted to single UAV trajectory planning and usually fails to guarantee the avoidance of collisions. To overcome this challenge, a method is proposed with a distance factor and jump strategy to solve common problems, such as unreachable targets, and ensure that the UAV will not collide with any obstacles. The method considers the UAV companions as dynamic obstacles to realize collaborative trajectory planning. Furthermore, the jitter problem is solved using the dynamic step adjustment method. Several resolution scenarios are illustrated. The method has been validated in quantitative test simulation models and satisfactory results were obtained in a simulated urban environment.
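For orientation, the classic artificial potential field that the abstract takes as its starting point can be sketched in a few lines: the goal attracts, and each obstacle within an influence radius repels. This is the textbook formulation only. It does not include the paper's distance factor, jump strategy, or dynamic step adjustment, and the gains `k_att`, `k_rep`, the influence radius `rho0`, and the step size are hypothetical values. Other UAVs can be treated as dynamic obstacles by passing their current positions in `obstacles` at every step.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3-D points."""
    return [x - y for x, y in zip(a, b)]


def _norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def apf_force(pos, goal, obstacles, k_att=1.0, k_rep=1.0, rho0=2.0):
    """Classic APF force: attraction to the goal plus repulsion from nearby obstacles."""
    # Attractive force: negative gradient of 0.5 * k_att * |pos - goal|^2.
    force = [k_att * d for d in _sub(goal, pos)]
    for obs in obstacles:
        diff = _sub(pos, obs)
        rho = _norm(diff)
        if 0.0 < rho <= rho0:
            # Negative gradient of 0.5 * k_rep * (1/rho - 1/rho0)^2, pointing away from the obstacle.
            magnitude = k_rep * (1.0 / rho - 1.0 / rho0) / rho ** 2
            force = [f + magnitude * d / rho for f, d in zip(force, diff)]
    return force


def apf_step(pos, goal, obstacles, step_size=0.1):
    """Move a fixed distance along the direction of the net force."""
    force = apf_force(pos, goal, obstacles)
    length = _norm(force)
    if length == 0.0:
        return list(pos)  # local minimum or goal reached: no preferred direction
    return [p + step_size * f / length for p, f in zip(pos, force)]
```

The zero-force branch in `apf_step` is where the classic method can stall in a local minimum, which is one of the failure modes that motivates modified variants such as the one the abstract proposes.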
••
TL;DR: Experimental results on the moving and stationary target acquisition and recognition data set indicate that the branched ensemble model based on the unit architecture can achieve 99% classification accuracy with all training data.
Abstract: The deep convolutional neural network (CNN) has been widely used for target classification, because it can learn highly useful representations from data. However, it is difficult to apply a CNN for synthetic aperture radar (SAR) target classification directly, for it often requires a large volume of labeled training data, which is impractical for SAR applications. The highway network is a newly proposed architecture based on CNN that can be trained with smaller data sets. This letter proposes a novel architecture called the convolutional highway unit to train deeper networks with limited SAR data. The unit architecture is formed by modified convolutional highway layers, a maxpool layer, and a dropout layer. Then, the networks can be flexibly formed by stacking the unit architecture to extract deep feature representations for classification. Experimental results on the moving and stationary target acquisition and recognition data set indicate that the branched ensemble model based on the unit architecture can achieve 99% classification accuracy with all training data. When the training data are reduced to 30%, the classification accuracy of the ensemble model can still reach 94.97%.
••
TL;DR: While measurements of the evolution in the IBD spectrum show general agreement with predictions from recent reactor models, the measured evolution in total IBD yield disagrees with recent predictions at 3.1σ, indicating that an overall deficit in the measured flux with respect to predictions does not result from equal fractional deficits from the primary fission isotopes.
Abstract: The Daya Bay experiment has observed correlations between reactor core fuel evolution and changes in the reactor antineutrino flux and energy spectrum. Four antineutrino detectors in two experimental halls were used to identify 2.2 million inverse beta decays (IBDs) over 1230 days spanning multiple fuel cycles for each of six 2.9 GWth reactor cores at the Daya Bay and Ling Ao nuclear power plants. Using detector data spanning effective ^(239)Pu fission fractions F_(239) from 0.25 to 0.35, Daya Bay measures an average IBD yield σ_f of (5.90±0.13)×10^(-43) cm^2/fission and a fuel-dependent variation in the IBD yield, dσ_f/dF_(239), of (-1.86±0.18)×10^(-43) cm^2/fission. This observation rejects the hypothesis of a constant antineutrino flux as a function of the ^(239)Pu fission fraction at 10 standard deviations. The variation in IBD yield is found to be energy dependent, rejecting the hypothesis of a constant antineutrino energy spectrum at 5.1 standard deviations. While measurements of the evolution in the IBD spectrum show general agreement with predictions from recent reactor models, the measured evolution in total IBD yield disagrees with recent predictions at 3.1σ. This discrepancy indicates that an overall deficit in the measured flux with respect to predictions does not result from equal fractional deficits from the primary fission isotopes ^(235)U, ^(239)Pu, ^(238)U, and ^(241)Pu. Based on measured IBD yield variations, yields of (6.17±0.17) and (4.27±0.26)×10^(-43) cm^2/fission have been determined for the two dominant fission parent isotopes ^(235)U and ^(239)Pu. A 7.8% discrepancy between the observed and predicted ^(235)U yields suggests that this isotope may be the primary contributor to the reactor antineutrino anomaly.
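As a quick check on the scale of the fuel-dependent effect, the quoted slope can be multiplied by the quoted range of effective ^(239)Pu fission fractions. This is back-of-the-envelope arithmetic using only the numbers in the abstract, assuming the yield varies linearly across the whole range:

```latex
\Delta\sigma_f \approx \frac{d\sigma_f}{dF_{239}}\,\Delta F_{239}
  = \left(-1.86\times10^{-43}\right)\times(0.35-0.25)
  = -0.186\times10^{-43}\ \mathrm{cm^2/fission},
\qquad
\frac{|\Delta\sigma_f|}{\bar{\sigma}_f} \approx \frac{0.186}{5.90} \approx 3.2\%.
```

Under that assumption, the IDEC yield change across the sampled fuel compositions is roughly 3% of the average IBD yield.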
••
TL;DR: This paper explored the use of deep convolutional neural network methodology for the automatic classification of diabetic retinopathy using color fundus image, and obtained an accuracy of 94.5% on the authors' dataset, outperforming the results obtained by using classical approaches.
Abstract: The automatic detection of diabetic retinopathy is of vital importance, as it is the main cause of irreversible vision loss in the working-age population in the developed world. The early detection of diabetic retinopathy occurrence can be very helpful for clinical treatment; although several different feature extraction approaches have been proposed, the classification task for retinal images is still tedious even for those trained clinicians. Recently, deep convolutional neural networks have manifested superior performance in image classification compared to previous handcrafted feature-based image classification methods. Thus, in this paper, we explored the use of deep convolutional neural network methodology for the automatic classification of diabetic retinopathy using color fundus image, and obtained an accuracy of 94.5% on our dataset, outperforming the results obtained by using classical approaches.
••
18 May 2017
TL;DR: Taking the object proposals generated by Faster R-CNN as the guard windows of the CFAR algorithm, this method picks up small-sized targets by reevaluating the bounding boxes that have relatively low classification scores in the detection network, to gain better detection performance.
Abstract: SAR ship detection is essential to marine monitoring. Recently, with the development of deep neural networks and the growing availability of SAR images, SAR ship detection based on deep neural networks has become a trend. However, multi-scale ships in SAR images cause undesirable differences in features, which decrease the accuracy of ship detection based on deep learning methods. To address this problem, this paper modifies Faster R-CNN, a state-of-the-art object detection network, with the traditional constant false alarm rate (CFAR) detector. Taking the object proposals generated by Faster R-CNN as the guard windows of the CFAR algorithm, this method picks up small-sized targets. By reevaluating the bounding boxes that have relatively low classification scores in the detection network, this method gains better detection performance.
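The CFAR idea referred to above can be illustrated with a generic one-dimensional cell-averaging detector: each cell is compared against a threshold derived from surrounding training cells, with guard cells excluded so that the target's own energy does not inflate the noise estimate. This is a textbook sketch, not the paper's method, which operates on 2-D SAR images and takes its guard windows from Faster R-CNN proposals; the parameter names and default values are hypothetical.

```python
def ca_cfar(signal, num_train=4, num_guard=1, scale=3.0):
    """Cell-averaging CFAR on a 1-D sequence.

    For each cell under test, the noise level is the mean of num_train
    training cells on each side, skipping num_guard guard cells next to
    the cell. A detection is declared when the cell exceeds scale * noise.
    Cells too close to either end to have a full window are skipped.
    """
    detections = []
    half = num_train + num_guard
    for i in range(half, len(signal) - half):
        left = signal[i - half : i - num_guard]
        right = signal[i + num_guard + 1 : i + half + 1]
        noise = (sum(left) + sum(right)) / (len(left) + len(right))
        if signal[i] > scale * noise:
            detections.append(i)
    return detections
```

Because the threshold adapts to the local background, the false alarm rate stays roughly constant as clutter levels change, which is the property that makes CFAR useful for reevaluating low-scoring candidate boxes.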
••
TL;DR: Fang et al. fabricate one- and two-dimensional nonlinear acoustic metamaterials with a broadband, low-frequency, response—greatly suppressing low frequency noise.
Abstract: Linear acoustic metamaterials (LAMs) are widely used to manipulate sound; however, it is challenging to obtain bandgaps with a generalized width (ratio of the bandgap width to its start frequency) >1 through linear mechanisms. Here we adopt both theoretical and experimental approaches to describe the nonlinear chaotic mechanism in both one-dimensional (1D) and two-dimensional (2D) nonlinear acoustic metamaterials (NAMs). This mechanism enables NAMs to reduce wave transmissions by as much as 20–40 dB in an ultra-low and ultra-broad band that consists of bandgaps and chaotic bands. With subwavelength cells, the generalized width reaches 21 in a 1D NAM and it goes up to 39 in a 2D NAM, which overcomes the bandwidth limit for wave suppression in current LAMs. This work enables further progress in elucidating the dynamics of NAMs and opens new avenues in double-ultra acoustic manipulation.
•
TL;DR: In this paper, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency.
Abstract: In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions that are temporally memorized in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency. Second, a new optimization approach, called dynamic-critical reinforcement learning, is introduced to extend the standard supervised method. It always encourages the model to predict a more acceptable answer, so as to address the convergence suppression problem that occurs in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Meanwhile, our model outperforms previous systems by over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD datasets.
••
TL;DR: Experimental results show that the proposed LRTR method outperforms other denoising algorithms on real corrupted hyperspectral data and can preserve the global structure of HSIs and simultaneously remove Gaussian noise and sparse noise.
Abstract: This paper studies the hyperspectral image (HSI) denoising problem under the assumption that the signal is low in rank. In this paper, a mixture of Gaussian noise and sparse noise is considered. The sparse noise includes stripes, impulse noise, and dead pixels. The denoising task is formulated as a low-rank tensor recovery (LRTR) problem from Gaussian noise and sparse noise. Traditional low-rank tensor decomposition methods are generally NP-hard to compute. Besides, these tensor decomposition based methods are sensitive to sparse noise. In contrast, the proposed LRTR method can preserve the global structure of HSIs and simultaneously remove Gaussian noise and sparse noise. The proposed method is based on a new tensor singular value decomposition and tensor nuclear norm. The NP-hard tensor recovery task is well accomplished by polynomial time algorithms. The convergence of the algorithm and the parameter settings are also described in detail. Preliminary numerical experiments have demonstrated that the proposed method is effective for low-rank tensor recovery from Gaussian noise and sparse noise. Experimental results also show that the proposed LRTR method outperforms other denoising algorithms on real corrupted hyperspectral data.
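The low-rank plus sparse formulation above is usually solved with two proximal steps. The sketch below shows the matrix analogues of those steps, which is an assumption made for illustration: the paper works with its own tensor singular value decomposition and tensor nuclear norm, which are not reproduced here. The names `soft_threshold` and `singular_value_threshold` are hypothetical.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal step for the l1 norm: shrinks entries toward zero (sparse part)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def singular_value_threshold(m, tau):
    """Proximal step for the matrix nuclear norm: shrinks singular values (low-rank part)."""
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt

# Usage: an HSI cube of shape (height, width, bands) could be unfolded to a
# (height * width, bands) matrix before applying the matrix operators.
cube = np.random.default_rng(0).normal(size=(4, 5, 6))
unfolded = cube.reshape(-1, cube.shape[-1])
low_rank_estimate = singular_value_threshold(unfolded, tau=1.0)
```

Shrinking singular values lowers the rank of the estimate, while shrinking individual entries isolates sparse outliers such as stripes or dead pixels. An iterative solver would alternate the two steps.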
••
01 Oct 2017
TL;DR: This work introduces a novel bridge between the modality-specific representations by creating a co-embedding space based on a recurrent residual fusion (RRF) block that adapts the recurrent mechanism to residual learning, so that it can recursively improve feature embeddings while retaining the shared parameters.
Abstract: A major challenge in matching between vision and language is that they typically have completely different features and representations. In this work, we introduce a novel bridge between the modality-specific representations by creating a co-embedding space based on a recurrent residual fusion (RRF) block. Specifically, RRF adapts the recurrent mechanism to residual learning, so that it can recursively improve feature embeddings while retaining the shared parameters. Then, a fusion module is used to integrate the intermediate recurrent outputs and generate a more powerful representation. In the matching network, RRF acts as a feature enhancement component that gathers visual and textual representations into a more discriminative embedding space, narrowing the cross-modal gap between vision and language. Moreover, we employ a bi-rank loss function to enforce separability of the two modalities in the embedding space. In the experiments, we evaluate the proposed RRF-Net using two multi-modal datasets, where it achieves state-of-the-art results.
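The idea of applying one residual unit recurrently with shared parameters and then fusing the intermediate outputs can be sketched as below. This is only a rough illustration under stated assumptions: the residual unit is taken to be a single linear map followed by ReLU, and fusion is taken to be concatenation followed by a linear projection. The names `rrf_block`, `w_res`, and `w_fuse` are hypothetical and are not from the paper.

```python
import numpy as np

def rrf_block(x, w_res, w_fuse, steps=3):
    """Recurrent residual refinement with shared weights, then fusion (illustrative)."""
    outputs = []
    h = x
    for _ in range(steps):
        # The same w_res is reused at every step: recurrence over a residual unit.
        h = h + np.maximum(h @ w_res, 0.0)
        outputs.append(h)
    # Assumed fusion: concatenate the intermediate outputs and project back.
    stacked = np.concatenate(outputs, axis=-1)
    return stacked @ w_fuse

# Usage with random weights: a batch of 2 embeddings of dimension 4.
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4))
w_res = rng.normal(scale=0.1, size=(4, 4))
w_fuse = rng.normal(scale=0.1, size=(12, 4))  # 3 steps * 4 dims -> 4 dims
fused = rrf_block(x, w_res, w_fuse, steps=3)
```

Because `w_res` is shared, adding more steps deepens the refinement without adding parameters to the residual unit. Only the fusion projection grows with the number of steps.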
••
TL;DR: To accurately extract vehicle-like targets, an accurate-vehicle-proposal-network (AVPN) based on hyper feature map which combines hierarchical feature maps that are more accurate for small object detection is developed and a coupled R-CNN method is proposed, which combines an AVPN and a vehicle attribute learning network to extract the vehicle's location and attributes simultaneously.
Abstract: Vehicle detection in aerial images, being an interesting but challenging problem, plays an important role in a wide range of applications. Traditional methods are based on sliding-window search and handcrafted or shallow-learning-based features with heavy computational costs and limited representation power. Recently, deep learning algorithms, especially region-based convolutional neural networks (R-CNNs), have achieved state-of-the-art detection performance in computer vision. However, several challenges limit the applications of R-CNNs in vehicle detection from aerial images: 1) vehicles in large-scale aerial images are relatively small in size, and R-CNNs have poor localization performance with small objects; 2) R-CNNs are designed specifically for detecting the bounding box of the targets without extracting attributes; 3) manual annotation is generally expensive, and the available manual annotations of vehicles for training R-CNNs are not sufficient in number. To address these problems, this paper proposes a fast and accurate vehicle detection framework. On one hand, to accurately extract vehicle-like targets, we develop an accurate-vehicle-proposal-network (AVPN) based on a hyper feature map that combines hierarchical feature maps and is more accurate for small object detection. On the other hand, we propose a coupled R-CNN method, which combines an AVPN and a vehicle attribute learning network to extract the vehicle's location and attributes simultaneously. For original large-scale aerial images with limited manual annotations, we use cropped image blocks for training with data augmentation to avoid overfitting. Comprehensive evaluations on the public Munich vehicle dataset and the collected vehicle dataset demonstrate the accuracy and effectiveness of the proposed method.
••
TL;DR: The results suggest that Hrd1 forms a retro-translocation channel for the movement of misfolded polypeptides through the endoplasmic reticulum membrane.
Abstract: Misfolded endoplasmic reticulum proteins are retro-translocated through the membrane into the cytosol, where they are poly-ubiquitinated, extracted from the membrane, and degraded by the proteasome, a pathway termed endoplasmic reticulum-associated protein degradation (ERAD). Proteins with misfolded domains in the endoplasmic reticulum lumen or membrane are discarded through the ERAD-L and ERAD-M pathways, respectively. In Saccharomyces cerevisiae, both pathways require the ubiquitin ligase Hrd1, a multi-spanning membrane protein with a cytosolic RING finger domain. Hrd1 is the crucial membrane component for retro-translocation, but it is unclear whether it forms a protein-conducting channel. Here we present a cryo-electron microscopy structure of S. cerevisiae Hrd1 in complex with its endoplasmic reticulum luminal binding partner, Hrd3. Hrd1 forms a dimer within the membrane with one or two Hrd3 molecules associated at its luminal side. Each Hrd1 molecule has eight transmembrane segments, five of which form an aqueous cavity extending from the cytosol almost to the endoplasmic reticulum lumen, while a segment of the neighbouring Hrd1 molecule forms a lateral seal. The aqueous cavity and lateral gate are reminiscent of features of protein-conducting conduits that facilitate polypeptide movement in the opposite direction, from the cytosol into or across membranes. Our results suggest that Hrd1 forms a retro-translocation channel for the movement of misfolded polypeptides through the endoplasmic reticulum membrane.
••
TL;DR: A heuristic approach to controlling the iteration number in the fusion process of a cross-view fusion algorithm that leads to a similarity metric for multiview data by systematically fusing multiple similarity measures is proposed.
Abstract: Learning an ideal metric is crucial to many tasks in computer vision. Diverse feature representations may address this problem from different aspects: visual data objects described by multiple features can be decomposed into multiple views, which often provide complementary information. In this paper, we propose a cross-view fusion algorithm that leads to a similarity metric for multiview data by systematically fusing multiple similarity measures. Unlike existing paradigms, we focus on learning a distance measure by exploiting a graph structure of data samples, where an input similarity matrix can be improved through the propagation of a graph random walk. In particular, we construct multiple graphs, each corresponding to an individual view, and present a cross-view fusion approach based on graph random walk to derive an optimal distance measure by fusing multiple metrics. Our method scales to a large amount of data by enforcing sparsity through an anchor graph representation. To adaptively control the effects of different views, we dynamically learn view-specific coefficients, which are incorporated into the graph random walk to balance the views. However, such a strategy may lead to an over-smoothed similarity metric, in which affinities between dissimilar samples are enlarged by excessive cross-view fusion. Thus, we devise a heuristic approach to control the number of iterations in the fusion process in order to avoid over-smoothing. Extensive experiments conducted on real-world data sets validate the effectiveness and efficiency of our approach.
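To make the random-walk fusion idea more concrete, here is a minimal sketch of a generic cross-diffusion scheme over per-view similarity matrices. It is not the authors' algorithm: the update rule, the fixed view weights, and the names `row_normalize` and `cross_view_fusion` are assumptions, and the anchor-graph sparsification and learned view coefficients described in the abstract are omitted. The small default `n_iter` merely illustrates capping the iteration count.

```python
import numpy as np

def row_normalize(w):
    """Turn a nonnegative similarity matrix into a random-walk transition matrix."""
    w = np.asarray(w, dtype=float)
    return w / w.sum(axis=1, keepdims=True)

def cross_view_fusion(similarities, weights=None, n_iter=2):
    """Generic cross-diffusion over two or more views (illustrative only)."""
    views = [row_normalize(w) for w in similarities]
    k = len(views)
    weights = np.full(k, 1.0 / k) if weights is None else np.asarray(weights, dtype=float)
    current = list(views)
    for _ in range(n_iter):  # few iterations, to limit over-smoothing
        updated = []
        for v in range(k):
            others = [current[u] for u in range(k) if u != v]
            mean_others = sum(others) / len(others)
            # Propagate the other views' state through this view's graph.
            updated.append(row_normalize(views[v] @ mean_others @ views[v].T))
        current = updated
    # Weighted combination of the diffused views.
    return sum(wt * p for wt, p in zip(weights, current))

# Usage: two views over three samples.
view_a = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.1], [0.1, 0.1, 1.0]]
view_b = [[1.0, 0.6, 0.2], [0.6, 1.0, 0.3], [0.2, 0.3, 1.0]]
fused = cross_view_fusion([view_a, view_b])
```

Each additional iteration spreads affinity further along the graphs, which is why running too many of them can inflate similarities between unrelated samples.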