
Showing papers in "ACM Transactions on Intelligent Systems and Technology in 2020"


Journal ArticleDOI
TL;DR: A survey of single-source, typically homogeneous unsupervised deep domain adaptation approaches, which combine the powerful, hierarchical representations from deep learning with domain adaptation to reduce reliance on potentially costly target data labels.
Abstract: Deep learning has produced state-of-the-art results for a variety of tasks. While such approaches for supervised learning have performed well, they assume that training and testing data are drawn from the same distribution, which may not always be the case. As a complement to this challenge, single-source unsupervised domain adaptation can handle situations where a network is trained on labeled data from a source domain and unlabeled data from a related but different target domain with the goal of performing well at test-time on the target domain. Many single-source and typically homogeneous unsupervised deep domain adaptation approaches have thus been developed, combining the powerful, hierarchical representations from deep learning with domain adaptation to reduce reliance on potentially costly target data labels. This survey will compare these approaches by examining alternative methods, the unique and common elements, results, and theoretical insights. We follow this with a look at application areas and open research directions.
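
Many of the surveyed approaches are discrepancy-based: they train on labeled source data while penalizing a statistical distance between source and target feature distributions. The sketch below is an illustrative example of that family, not a method from the survey itself; the RBF bandwidth gamma and the trade_off weight are placeholder values.

```python
import torch

def gaussian_kernel(x, y, gamma=1.0):
    # Pairwise RBF kernel values between the rows of x and the rows of y.
    dists = torch.cdist(x, y) ** 2
    return torch.exp(-gamma * dists)

def mmd_loss(source_feats, target_feats, gamma=1.0):
    # Biased estimate of the squared maximum mean discrepancy (MMD).
    k_ss = gaussian_kernel(source_feats, source_feats, gamma).mean()
    k_tt = gaussian_kernel(target_feats, target_feats, gamma).mean()
    k_st = gaussian_kernel(source_feats, target_feats, gamma).mean()
    return k_ss + k_tt - 2 * k_st

def adaptation_objective(classification_loss, source_feats, target_feats, trade_off=0.5):
    # Supervised loss on labeled source batches plus a domain-alignment
    # penalty computed on (labeled source, unlabeled target) features.
    return classification_loss + trade_off * mmd_loss(source_feats, target_feats)
```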

496 citations


Journal ArticleDOI
TL;DR: A systematic survey of adversarial examples against deep neural networks (DNNs) for NLP applications, selecting and analyzing 40 representative works on generating adversarial samples against DNNs.
Abstract: With the development of high computational devices, deep neural networks (DNNs), in recent years, have gained significant popularity in many Artificial Intelligence (AI) applications. However, previous efforts have shown that DNNs are vulnerable to strategically modified samples, named adversarial examples. These samples are generated with some imperceptible perturbations, but can fool the DNNs to give false predictions. Inspired by the popularity of generating adversarial examples against DNNs in Computer Vision (CV), research efforts on attacking DNNs for Natural Language Processing (NLP) applications have emerged in recent years. However, the intrinsic difference between image (CV) and text (NLP) renders challenges to directly apply attacking methods in CV to NLP. Various methods are proposed addressing this difference and attack a wide range of NLP applications. In this article, we present a systematic survey on these works. We collect all related academic works since the first appearance in 2017. We then select, summarize, discuss, and analyze 40 representative works in a comprehensive way. To make the article self-contained, we cover preliminary knowledge of NLP and discuss related seminal works in computer vision. We conclude our survey with a discussion on open issues to bridge the gap between the existing progress and more robust adversarial attacks on NLP DNNs.
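
Most text attacks of the kind surveyed here perturb words under semantic constraints until the victim model's prediction degrades. The following is a schematic greedy word-substitution loop under assumed interfaces: predict_proba (returns class probabilities for a token list) and get_synonyms are hypothetical placeholders, not functions from any specific paper.

```python
def greedy_substitution_attack(tokens, true_label, predict_proba, get_synonyms,
                               max_changes=3):
    """Try to lower the classifier's confidence in the true label by
    replacing at most `max_changes` words with synonyms."""
    adversarial = list(tokens)
    changes = 0
    for i, word in enumerate(tokens):
        if changes >= max_changes:
            break
        best_word = word
        best_score = predict_proba(adversarial)[true_label]
        for candidate in get_synonyms(word):
            trial = adversarial[:i] + [candidate] + adversarial[i + 1:]
            score = predict_proba(trial)[true_label]  # confidence in the true label
            if score < best_score:                    # lower is better for the attacker
                best_word, best_score = candidate, score
        if best_word != word:
            adversarial[i] = best_word
            changes += 1
    return adversarial
```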

226 citations


Journal ArticleDOI
TL;DR: This survey aims to provide a comprehensive review of the state-of-the-art VOST methods, classify these methods into different categories, and identify new trends.
Abstract: Object segmentation and object tracking are fundamental research areas in the computer vision community. Both topics must handle common challenges such as occlusion, deformation, motion blur, and scale variation. The former additionally faces heterogeneous objects, interacting objects, edge ambiguity, and shape complexity; the latter suffers from difficulties in handling fast motion, out-of-view targets, and real-time processing. Combining the two problems into Video Object Segmentation and Tracking (VOST) can overcome their respective difficulties and improve performance. VOST can be widely applied to many practical applications such as video summarization, high-definition video compression, human-computer interaction, and autonomous vehicles. This survey aims to provide a comprehensive review of the state-of-the-art VOST methods, classify these methods into different categories, and identify new trends. First, we broadly categorize VOST methods into Video Object Segmentation (VOS) and Segmentation-based Object Tracking (SOT). Each category is further classified into various types based on the segmentation and tracking mechanism. Moreover, we present representative VOS and SOT methods at each point in time. Second, we provide a detailed discussion and overview of the technical characteristics of the different methods. Third, we summarize the characteristics of the related video datasets and provide a variety of evaluation metrics. Finally, we point out a set of interesting future works and draw our own conclusions.

85 citations


Journal ArticleDOI
TL;DR: This article proposes a novel concept called Dynamic Distribution Adaptation (DDA), which quantitatively evaluates the relative importance of the marginal and conditional distributions and can be easily incorporated into the structural risk minimization framework to solve transfer learning problems.
Abstract: Transfer learning aims to learn robust classifiers for the target domain by leveraging knowledge from a source domain. Since the source and the target domains are usually from different distributions, existing methods mainly focus on adapting the cross-domain marginal or conditional distributions. However, in real applications, the marginal and conditional distributions usually have different contributions to the domain discrepancy. Existing methods fail to quantitatively evaluate the different importance of these two distributions, which will result in unsatisfactory transfer performance. In this article, we propose a novel concept called Dynamic Distribution Adaptation (DDA), which is capable of quantitatively evaluating the relative importance of each distribution. DDA can be easily incorporated into the framework of structural risk minimization to solve transfer learning problems. On the basis of DDA, we propose two novel learning algorithms: (1) Manifold Dynamic Distribution Adaptation (MDDA) for traditional transfer learning, and (2) Dynamic Distribution Adaptation Network (DDAN) for deep transfer learning. Extensive experiments demonstrate that MDDA and DDAN significantly improve the transfer learning performance and set up a strong baseline over the latest deep and adversarial methods on digits recognition, sentiment analysis, and image classification. More importantly, it is shown that marginal and conditional distributions have different contributions to the domain divergence, and our DDA is able to provide good quantitative evaluation of their relative importance, which leads to better performance. We believe this observation can be helpful for future research in transfer learning.
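
The core idea of DDA is to weight the marginal and class-conditional discrepancies by a single factor. The sketch below illustrates that weighting with a linear-kernel MMD as the per-distribution distance; both the distance and the fixed mu are simplifications for illustration, not the paper's exact estimator.

```python
import numpy as np

def linear_mmd(xs, xt):
    # Linear-kernel MMD: squared distance between the feature means.
    return float(np.sum((xs.mean(axis=0) - xt.mean(axis=0)) ** 2))

def dynamic_distribution_distance(xs, ys, xt, yt_pseudo, mu):
    """(1 - mu) * marginal discrepancy + mu * mean conditional discrepancy.

    yt_pseudo are pseudo-labels for the unlabeled target data; mu in [0, 1]
    controls the relative importance of the two distributions.
    """
    marginal = linear_mmd(xs, xt)
    per_class = [
        linear_mmd(xs[ys == c], xt[yt_pseudo == c])
        for c in np.unique(ys)
        if np.any(yt_pseudo == c)
    ]
    conditional = float(np.mean(per_class)) if per_class else 0.0
    return (1 - mu) * marginal + mu * conditional
```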

80 citations


Journal ArticleDOI
TL;DR: The objective of this survey is to synthesize and present two decades of research on web tables into six main categories of information access tasks: table extraction, table interpretation, table search, question answering, knowledge base augmentation, and table augmentation.
Abstract: Tables are powerful and popular tools for organizing and manipulating data. A vast number of tables can be found on the Web, which represent a valuable knowledge resource. The objective of this survey is to synthesize and present two decades of research on web tables. In particular, we organize existing literature into six main categories of information access tasks: table extraction, table interpretation, table search, question answering, knowledge base augmentation, and table augmentation. For each of these tasks, we identify and describe seminal approaches, present relevant resources, and point out interdependencies among the different tasks.

66 citations


Journal ArticleDOI
TL;DR: In this article, a self-weighted robust linear discriminant analysis (LDA) with an l2,1-norm-based pairwise between-class distance criterion, called SWRLDA, is proposed for multi-class classification, especially in the presence of edge classes.
Abstract: Linear discriminant analysis (LDA) is a popular technique to learn the most discriminative features for multi-class classification. A vast majority of existing LDA algorithms are prone to be dominated by the class with very large deviation from the others, i.e., edge class, which occurs frequently in multi-class classification. First, the existence of edge classes often makes the total mean biased in the calculation of between-class scatter matrix. Second, the exploitation of l2-norm based between-class distance criterion magnifies the extremely large distance corresponding to edge class. In this regard, a novel self-weighted robust LDA with l2,1-norm based pairwise between-class distance criterion, called SWRLDA, is proposed for multi-class classification especially with edge classes. SWRLDA can automatically avoid the optimal mean calculation and simultaneously learn adaptive weights for each class pair without setting any additional parameter. An efficient re-weighted algorithm is exploited to derive the global optimum of the challenging l2,1-norm maximization problem. The proposed SWRLDA is easy to implement and converges fast in practice. Extensive experiments demonstrate that SWRLDA performs favorably against other compared methods on both synthetic and real-world datasets while presenting superior computational efficiency in comparison with other techniques.

49 citations


Journal ArticleDOI
TL;DR: A novel Flexible Multi-modal Hashing (FMH) method that learns multiple modality-specific hash codes and multi-modal collaborative hash codes simultaneously within a single model, addressing the problem of binarizing queries when only one or a subset of modalities is provided.
Abstract: Multi-modal hashing methods could support efficient multimedia retrieval by combining multi-modal features for binary hash learning at both the offline training and online query stages. However, existing multi-modal methods cannot binarize queries when only one or a subset of modalities is provided. In this article, we propose a novel Flexible Multi-modal Hashing (FMH) method to address this problem. FMH learns multiple modality-specific hash codes and multi-modal collaborative hash codes simultaneously within a single model. The hash codes are flexibly generated according to newly arriving queries, which may provide any single modality feature or combination of modality features. Besides, the hashing learning procedure is efficiently supervised by the pair-wise semantic matrix to enhance the discriminative capability. It successfully avoids the challenging symmetric semantic matrix factorization and the O(n^2) storage cost of the semantic matrix. Finally, we design a fast discrete optimization to learn hash codes directly with simple operations. Experiments validate the superiority of the proposed approach.

47 citations


Journal ArticleDOI
TL;DR: A new generalizable face authentication CNN (GFA-CNN) model with three novelties, including a simple yet effective total pairwise confusion loss for CNN training that properly balances the contributions of all spoofing patterns when recognizing spoofing faces.
Abstract: Face anti-spoofing aims to detect presentation attacks on face recognition-based authentication systems. It has drawn growing attention due to the high security demand. The widely adopted CNN-based methods usually recognize spoofing faces well when training and testing spoofing samples display similar patterns, but their performance drops drastically on testing spoofing faces of novel patterns or unseen scenes, leading to poor generalization performance. Furthermore, almost all current methods treat face anti-spoofing as a prior step to face recognition, which prolongs the response time and makes face authentication inefficient. In this article, we try to boost the generalizability and applicability of face anti-spoofing methods by designing a new generalizable face authentication CNN (GFA-CNN) model with three novelties. First, GFA-CNN introduces a simple yet effective total pairwise confusion loss for CNN training that properly balances the contributions of all spoofing patterns for recognizing spoofing faces. Second, it incorporates a fast domain adaptation component to alleviate negative effects brought by domain variation. Third, it deploys filter diversification learning to make the learned representations more adaptable to new scenes. In addition, the proposed GFA-CNN works in a multi-task manner: it performs face anti-spoofing and face recognition simultaneously. Experimental results on five popular face anti-spoofing and face recognition benchmarks show that GFA-CNN significantly outperforms previous face anti-spoofing methods on cross-test protocols and also well preserves the identity information of input face images.

42 citations


Journal ArticleDOI
TL;DR: This article proposes a novel Privacy-preserving POI Recommendation (PriRec) framework that keeps users' private raw data and models in users' own hands and thus protects user privacy to a large extent; comprehensive experiments on real-world datasets demonstrate that PriRec achieves comparable or even better recommendation accuracy.
Abstract: Point-of-Interest (POI) recommendation has been extensively studied and successfully applied in industry recently. However, most existing approaches build centralized models on the basis of collecting users’ data. Both private data and models are held by the recommender, which causes serious privacy concerns. In this article, we propose a novel Privacy preserving POI Recommendation (PriRec) framework. First, to protect data privacy, users’ private data (features and actions) are kept on their own side, e.g., cellphone or pad. Meanwhile, the public data that need to be accessed by all the users are kept by the recommender to reduce the storage costs of users’ devices. Those public data include: (1) static data only related to the status of POI, such as POI categories, and (2) dynamic data dependent on user-POI actions such as visited counts. The dynamic data could be sensitive, and we develop local differential privacy techniques to release such data to the public with privacy guarantees. Second, PriRec follows the representations of Factorization Machine (FM) that consists of a linear model and the feature interaction model. To protect the model privacy, the linear models are saved on the users’ side, and we propose a secure decentralized gradient descent protocol for users to learn it collaboratively. The feature interaction model is kept by the recommender since there is no privacy risk, and we adopt a secure aggregation strategy in a federated learning paradigm to learn it. To this end, PriRec keeps users’ private raw data and models in users’ own hands, and protects user privacy to a large extent. We apply PriRec in real-world datasets, and comprehensive experiments demonstrate that, compared with FM, PriRec achieves comparable or even better recommendation accuracy.
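
As a concrete illustration of the local differential privacy component, the sketch below applies randomized response to a single per-user bit (e.g., "visited this POI") and debiases the aggregate on the recommender side. This is a standard LDP primitive given as an assumed example, not necessarily PriRec's exact mechanism.

```python
import math
import random

def randomized_response(visited: bool, epsilon: float) -> bool:
    # Each user perturbs their own bit locally before sending it out.
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return visited if random.random() < p_truth else not visited

def estimate_visit_count(reports, epsilon):
    # Recommender-side unbiased estimate of the true count from noisy bits.
    p = math.exp(epsilon) / (math.exp(epsilon) + 1)
    n = len(reports)
    observed = sum(reports)
    return (observed - n * (1 - p)) / (2 * p - 1)

# Example: 1,000 users, 300 of whom actually visited the POI.
epsilon = 1.0
truth = [i < 300 for i in range(1000)]
noisy = [randomized_response(v, epsilon) for v in truth]
print(round(estimate_visit_count(noisy, epsilon)))  # close to 300 in expectation
```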

36 citations


Journal ArticleDOI
TL;DR: A novel representation learning framework, TRajectory EMBedding via Road networks (Trembr), learns trajectory embeddings for use in a variety of trajectory applications and soundly outperforms the state-of-the-art trajectory representation learning models trajectory2vec and t2vec.
Abstract: In this article, we propose a novel representation learning framework, namely TRajectory EMBedding via Road networks (Trembr), to learn trajectory embeddings (low-dimensional feature vectors) for use in a variety of trajectory applications. The novelty of Trembr lies in (1) the design of a recurrent neural network (RNN)-based encoder-decoder model, namely Traj2Vec, that encodes spatial and temporal properties inherent in trajectories into trajectory embeddings by exploiting the underlying road networks to constrain the learning process in accordance with the matched road segments obtained using road network matching techniques (e.g., Barefoot [24, 27]), and (2) the design of a neural network-based model, namely Road2Vec, to learn road segment embeddings in road networks, capturing various relationships amongst road segments in preparation for trajectory representation learning. In addition to model design, several unique technical issues arising in Trembr, including data preparation in Road2Vec, the road segment relevance-aware loss, and the network topology constraint in Traj2Vec, are examined. To validate our ideas, we learn trajectory embeddings using multiple large-scale real-world trajectory datasets and use them in three tasks, including trajectory similarity measure, travel time prediction, and destination prediction. Empirical results show that Trembr soundly outperforms the state-of-the-art trajectory representation learning models, trajectory2vec and t2vec, by at least one order of magnitude in terms of mean rank in trajectory similarity measure, 23.3% to 41.7% in terms of mean absolute error (MAE) in travel time prediction, and 39.6% to 52.4% in terms of MAE in destination prediction.
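
To make the road-segment-embedding step concrete, the sketch below trains skip-gram embeddings over map-matched trajectories, treating each sequence of road segment IDs as a sentence. Gensim's Word2Vec is used here only as a stand-in for Road2Vec, which is the paper's own neural model with a relevance-aware loss; the segment IDs are made up.

```python
from gensim.models import Word2Vec

# Each trajectory is assumed to be already map-matched (e.g., with Barefoot)
# to a sequence of road segment IDs.
matched_trajectories = [
    ["seg_12", "seg_13", "seg_47", "seg_48"],
    ["seg_47", "seg_48", "seg_90", "seg_91"],
]

model = Word2Vec(
    sentences=matched_trajectories,
    vector_size=64,   # embedding dimensionality
    window=3,         # co-occurrence window along the trajectory
    min_count=1,
    sg=1,             # skip-gram
)
print(model.wv["seg_47"].shape)   # 64-d embedding for one road segment
```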

29 citations


Journal ArticleDOI
TL;DR: In this article, a novel multi-attribute tensor correlation neural network (MTCN) is used to predict face attributes and achieves the best performance compared with the latest multi-attribute recognition algorithms under the same settings.
Abstract: Multi-task learning plays an important role in face multi-attribute prediction. At present, most studies mine the shared information between attributes by sharing all convolutional layers. However, it is not appropriate to treat the low-level and high-level features of face attributes equally, because the high-level features are more biased toward the specific content of the category. In this article, a novel multi-attribute tensor correlation neural network (MTCN) is used to predict face attributes. MTCN shares all attribute features at the low-level layers, and then distinguishes each attribute feature at the high-level layers. To better excavate the correlations among high-level attribute features, each sub-network explores useful information from other networks to enhance its original information. Then a tensor canonical correlation analysis method is used to seek the correlations among the highest-level attributes, which enhances the original information of each attribute. After that, these features are mapped into a highly correlated space through the correlation matrix. Finally, extensive experiments on the CelebA and LFWA datasets verify the performance of MTCN, which achieves the best performance compared with the latest multi-attribute recognition algorithms under the same settings.

Journal ArticleDOI
TL;DR: This work proposes to learn discriminative features from microblog posts by following their non-sequential propagation structure and generate more powerful representations for identifying rumors, and reveals that effective rumor detection is highly related to finding evidential posts.
Abstract: Rumor spread in social media severely jeopardizes the credibility of online content. Thus, automatic debunking of rumors is of great importance to keep social media a healthy environment. While facing a dubious claim, people often dispute its truthfulness sporadically in their posts containing various cues, which can form useful evidence with long-distance dependencies. In this work, we propose to learn discriminative features from microblog posts by following their non-sequential propagation structure and generate more powerful representations for identifying rumors. For modeling non-sequential structure, we first represent the diffusion of microblog posts with propagation trees, which provide valuable clues on how a claim in the original post is transmitted and developed over time. We then present a bottom-up and a top-down tree-structured models based on Recursive Neural Networks (RvNN) for rumor representation learning and classification, which naturally conform to the message propagation process in microblogs. To enhance the rumor representation learning, we reveal that effective rumor detection is highly related to finding evidential posts, e.g., the posts expressing specific attitude towards the veracity of a claim, as an extension of the previous RvNN-based detection models that treat every post equally. For this reason, we design discriminative attention mechanisms for the RvNN-based models to selectively attend on the subset of evidential posts during the bottom-up/top-down recursive composition. Experimental results on four datasets collected from real-world microblog platforms confirm that (1) our RvNN-based models achieve much better rumor detection and classification performance than state-of-the-art approaches; (2) the attention mechanisms for focusing on evidential posts can further improve the performance of our RvNN-based method; and (3) our approach possesses superior capacity on detecting rumors at a very early stage.
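
A hedged, toy rendering of the bottom-up variant is sketched below: each post's state combines its own text features with its children's states, and attention then pools all node states so that evidential posts can dominate the claim representation. The tree encoding, GRU cell, and dimensions are illustrative choices, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class BottomUpTreeEncoder(nn.Module):
    """Toy bottom-up recursive composition over a propagation tree with
    attention pooling over all post states."""
    def __init__(self, feat_dim=32, hidden_dim=64, num_classes=4):
        super().__init__()
        self.cell = nn.GRUCell(feat_dim, hidden_dim)
        self.attention = nn.Linear(hidden_dim, 1)
        self.classifier = nn.Linear(hidden_dim, num_classes)
        self.hidden_dim = hidden_dim

    def encode(self, node, features, states):
        post_id, children = node                      # node = (id, [child nodes])
        child_states = [self.encode(c, features, states) for c in children]
        context = torch.stack(child_states).sum(0) if child_states \
            else torch.zeros(self.hidden_dim)
        state = self.cell(features[post_id].unsqueeze(0), context.unsqueeze(0))[0]
        states.append(state)
        return state

    def forward(self, root, features):
        states = []
        self.encode(root, features, states)
        stacked = torch.stack(states)                     # one state per post
        weights = torch.softmax(self.attention(stacked), dim=0)
        return self.classifier((weights * stacked).sum(0))  # rumor class scores

# Toy tree: source post 0 with replies 1 and 2; reply 2 has reply 3.
tree = (0, [(1, []), (2, [(3, [])])])
features = torch.randn(4, 32)
print(BottomUpTreeEncoder()(tree, features))
```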

Journal ArticleDOI
TL;DR: This article proposes a thermal image translation method, called IR2VI, which translates thermal/infrared (IR) images into color visible (VI) images. IR2VI consists of two cascaded steps: translation from nighttime thermal IR images to gray-scale visible images (GVI), called IR-GVI, and translation from GVI to color visible images (CVI), called GVI-CVI.
Abstract: Context enhancement is critical for the environmental perception in night vision applications, especially for the dark night situation without sufficient illumination. In this article, we propose a thermal image translation method, which can translate thermal/infrared (IR) images into color visible (VI) images, called IR2VI. The IR2VI consists of two cascaded steps: translation from nighttime thermal IR images to gray-scale visible images (GVI), which is called IR-GVI; and the translation from GVI to color visible images (CVI), which is known as GVI-CVI in this article. For the first step, we develop the Texture-Net, a novel unsupervised image translation neural network based on generative adversarial networks. Texture-Net can learn the intrinsic characteristics from the GVI and integrate them into the IR image. In comparison with the state-of-the-art unsupervised image translation methods, the proposed Texture-Net is able to address some common challenges, e.g., incorrect mapping and lack of fine details, with a structure connection module and a region-of-interest focal loss. For the second step, we investigated the state-of-the-art gray-scale image colorization methods and integrate the deep convolutional neural network into the IR2VI framework. The results of the comprehensive evaluation experiments demonstrate the effectiveness of the proposed IR2VI image translation method. This solution will contribute to the environmental perception and understanding in varied night vision applications.

Journal ArticleDOI
TL;DR: This work proposes a novel pair-based active learning for Re-ID that selects pairs instead of instances from the entire dataset for annotation, and takes into account the uncertainty and the diversity in terms of pairwise relations.
Abstract: The effective training of supervised Person Re-identification (Re-ID) models requires sufficient pairwise labeled data. However, when there is limited annotation resource, it is difficult to collect pairwise labeled data. We consider a challenging and practical problem called Early Active Learning, which is applied to the early stage of experiments when there is no pre-labeled sample available as references for human annotating. Previous early active learning methods suffer from two limitations for Re-ID. First, these instance-based algorithms select instances rather than pairs, which can result in missing optimal pairs for Re-ID. Second, most of these methods only consider the representativeness of instances, which can result in selecting less diverse and less informative pairs. To overcome these limitations, we propose a novel pair-based active learning for Re-ID. Our algorithm selects pairs instead of instances from the entire dataset for annotation. Besides representativeness, we further take into account the uncertainty and the diversity in terms of pairwise relations. Therefore, our algorithm can produce the most representative, informative, and diverse pairs for Re-ID data annotation. Extensive experimental results on five benchmark Re-ID datasets have demonstrated the superiority of the proposed pair-based early active learning algorithm.

Journal ArticleDOI
TL;DR: WiSign is proposed to recognize continuous sentences of American Sign Language (ASL) with existing WiFi infrastructure; it also integrates an N-gram language model that uses the grammar rules of ASL to calibrate the recognized sign words.
Abstract: In this article, we propose WiSign that recognizes the continuous sentences of American Sign Language (ASL) with existing WiFi infrastructure. Instead of identifying the individual ASL words from the manually segmented ASL sentence in existing works, WiSign can automatically segment the original channel state information (CSI) based on the power spectral density (PSD) segmentation method. WiSign constructs a five-layer Deep Belief Network (DBN) to automatically extract the features of isolated fragments, and then uses the Hidden Markov Model (HMM) with Gaussian mixture and Forward-Backward algorithm to recognize sign words. In order to further improve the accuracy, WiSign also integrates the language model N-gram, which uses the grammar rules of ASL to calibrate the recognized results of sign words. We implement a prototype of WiSign with commercial WiFi devices and evaluate its performance in real indoor environments. The results show that WiSign achieves satisfactory accuracy when recognizing ASL sentences that involve the movements of the head, arms, hands, and fingers.
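
As a minimal sketch of PSD-based segmentation, the snippet below computes Welch power spectral density over sliding windows of a 1-D CSI amplitude stream and marks windows whose motion-band energy exceeds a simple adaptive threshold. The band limits and thresholding rule are illustrative assumptions rather than WiSign's exact procedure.

```python
import numpy as np
from scipy.signal import welch

def segment_by_psd(csi_amplitude, fs=100, win=128, step=64, threshold=None):
    """Return a boolean flag per sliding window: True = likely sign activity."""
    energies = []
    for start in range(0, len(csi_amplitude) - win, step):
        freqs, pxx = welch(csi_amplitude[start:start + win], fs=fs, nperseg=win)
        band = (freqs >= 1) & (freqs <= 20)       # rough band for hand/arm motion
        energies.append(pxx[band].sum())
    energies = np.array(energies)
    if threshold is None:
        threshold = energies.mean() + energies.std()   # simple adaptive rule
    return energies > threshold

# Synthetic stream: noise with a burst of 5 Hz "activity" in the middle.
t = np.arange(0, 20, 1 / 100)
stream = 0.1 * np.random.randn(t.size)
stream[800:1200] += np.sin(2 * np.pi * 5 * t[800:1200])
print(segment_by_psd(stream).astype(int))
```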

Journal ArticleDOI
TL;DR: A Probabilistic Matrix Factorization with Multi-Auxiliary Information (PMF-MAI) model is proposed for travel-product recommendation, fusing probabilistic matrix factorization on the user-item interaction matrix with linear regression on features constructed from multiple sources of auxiliary information.
Abstract: As an e-commerce feature, the personalized recommendation is invariably highly-valued by both consumers and merchants. The e-tourism has become one of the hottest industries with the adoption of recommendation systems. Several lines of evidence have confirmed the travel-product recommendation is quite different from traditional recommendations. Travel products are usually browsed and purchased relatively infrequently compared with other traditional products (e.g., books and food), which gives rise to the extreme sparsity of travel data. Meanwhile, the choice of a suitable travel product is affected by an army of factors such as departure, destination, and financial and time budgets. To address these challenging problems, in this article, we propose a Probabilistic Matrix Factorization with Multi-Auxiliary Information (PMF-MAI) model in the context of the travel-product recommendation. In particular, PMF-MAI is able to fuse the probabilistic matrix factorization on the user-item interaction matrix with the linear regression on a suite of features constructed by the multiple auxiliary information. In order to fit the sparse data, PMF-MAI is built by a whole-data based learning approach that utilizes unobserved data to increase the coupling between probabilistic matrix factorization and linear regression. Extensive experiments are conducted on a real-world dataset provided by a large tourism e-commerce company. PMF-MAI shows an overwhelming superiority over all competitive baselines on the recommendation performance. Also, the importance of features is examined to reveal the crucial auxiliary information having a great impact on the adoption of travel products.
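
The fused prediction described above can be written as a latent-factor inner product plus a linear term over the auxiliary features. The sketch below shows that scoring rule with made-up dimensions and random values standing in for learned parameters.

```python
import numpy as np

def pmf_mai_score(user_vec, item_vec, aux_features, weights, bias=0.0):
    """Predicted preference = latent inner product (PMF part)
    + linear regression on auxiliary features (MAI part)."""
    return float(user_vec @ item_vec + weights @ aux_features + bias)

rng = np.random.default_rng(0)
u, v = rng.normal(size=16), rng.normal(size=16)   # user / travel-product factors
x = rng.normal(size=5)                            # departure, destination, budgets, ...
w = rng.normal(size=5)                            # regression weights
print(pmf_mai_score(u, v, x, w))
```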

Journal ArticleDOI
TL;DR: The preliminary results demonstrate that DeepKey is feasible, shows consistently superior performance compared to a set of comparison methods, and has the potential to be applied to authentication deployments in real-world settings.
Abstract: Biometric authentication involves various technologies to identify individuals by exploiting their unique, measurable physiological and behavioral characteristics. However, traditional biometric authentication systems (e.g., face recognition, iris, retina, voice, and fingerprint) are at increasing risks of being tricked by biometric tools such as anti-surveillance masks, contact lenses, vocoder, or fingerprint films. In this article, we design a multimodal biometric authentication system named DeepKey, which uses both Electroencephalography (EEG) and gait signals to better protect against such risk. DeepKey consists of two key components: an Invalid ID Filter Model to block unauthorized subjects, and an identification model based on attention-based Recurrent Neural Network (RNN) to identify a subject’s EEG IDs and gait IDs in parallel. The subject can only be granted access while all the components produce consistent affirmations to match the user’s proclaimed identity. We implement DeepKey with a live deployment in our university and conduct extensive empirical experiments to study its technical feasibility in practice. DeepKey achieves the False Acceptance Rate (FAR) and the False Rejection Rate (FRR) of 0 and 1.0%, respectively. The preliminary results demonstrate that DeepKey is feasible, shows consistent superior performance compared to a set of methods, and has the potential to be applied to the authentication deployment in real-world settings.

Journal ArticleDOI
TL;DR: Multi-source domain adaptation has received considerable attention due to its effectiveness in leveraging knowledge from multiple related sources with different distributions to enhance learning performance on the target domain.
Abstract: Multi-source domain adaptation has received considerable attention due to its effectiveness of leveraging the knowledge from multiple related sources with different distributions to enhance the learning performance. One of the fundamental challenges in multi-source domain adaptation is how to determine the amount of knowledge transferred from each source domain to the target domain. To address this issue, we propose a new algorithm, called Domain-attention Conditional Wasserstein Distance (DCWD), to learn transferred weights for evaluating the relatedness across the source and target domains. In DCWD, we design a new conditional Wasserstein distance objective function by taking the label information into consideration to measure the distance between a given source domain and the target domain. We also develop an attention scheme to compute the transferred weights of different source domains based on their conditional Wasserstein distances to the target domain. After that, the transferred weights can be used to reweight the source data to determine their importance in knowledge transfer. We conduct comprehensive experiments on several real-world data sets, and the results demonstrate the effectiveness and efficiency of the proposed method.
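
The attention scheme boils down to turning per-source distances into normalized transfer weights and using them to reweight source samples. The sketch below shows one simple instantiation, a softmax over negative distances; the computation of the conditional Wasserstein distances themselves is omitted, and the temperature is an assumed parameter.

```python
import numpy as np

def source_transfer_weights(distances, temperature=1.0):
    """Softmax over negative source-to-target distances: sources with a
    smaller conditional Wasserstein distance receive larger weights."""
    scores = -np.asarray(distances, dtype=float) / temperature
    scores -= scores.max()            # numerical stability
    w = np.exp(scores)
    return w / w.sum()

# Example: three source domains with hypothetical distances to the target.
print(source_transfer_weights([0.8, 0.3, 1.5]))   # the second source dominates
```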

Journal ArticleDOI
TL;DR: This article designs busCharging, a pricing-aware real-time charging scheduling system based on Markov Decision Process to reduce the overall charging and operating costs for city-scale electric bus fleets, taking the time-variant electricity pricing into account.
Abstract: We are witnessing a rapid growth of electrified vehicles due to the ever-increasing concerns on urban air quality and energy security. Compared to other types of electric vehicles, electric buses have not yet been prevailingly adopted worldwide due to their high owning and operating costs, long charging time, and the uneven spatial distribution of charging facilities. Moreover, the highly dynamic environment factors such as unpredictable traffic congestion, different passenger demands, and even the changing weather can significantly affect electric bus charging efficiency and potentially hinder the further promotion of large-scale electric bus fleets. To address these issues, in this article, we first analyze a real-world dataset including massive data from 16,359 electric buses, 1,400 bus lines, and 5,562 bus stops. Then, we investigate the electric bus network to understand its operating and charging patterns, and further verify the necessity and feasibility of a real-time charging scheduling. With such understanding, we design busCharging, a pricing-aware real-time charging scheduling system based on Markov Decision Process to reduce the overall charging and operating costs for city-scale electric bus fleets, taking the time-variant electricity pricing into account. To show the effectiveness of busCharging, we implement it with the real-world data from Shenzhen, which includes GPS data of electric buses, the metadata of all bus lines and bus stops, combined with data of 376 charging stations for electric buses. The evaluation results show that busCharging dramatically reduces the charging cost by 23.7% and 12.8% of electricity usage simultaneously. Finally, we design a scheduling-based charging station expansion strategy to verify our busCharging is also effective during the charging station expansion process.

Journal ArticleDOI
TL;DR: This work inversely learns taxi drivers' preferences from data and characterizes their dynamics over time, extracting two types of features to model the drivers' decision space and learning the drivers' preferences with respect to these features via inverse reinforcement learning.
Abstract: Many real-world human behaviors can be modeled and characterized as sequential decision-making processes, such as a taxi driver’s choices of working regions and times. Each driver possesses unique preferences over these sequential choices, which evolve over time and shape the driver’s working efficiency. Understanding the dynamics of such preferences helps accelerate the learning process of taxi drivers. Prior works on taxi operation management mostly focus on finding optimal driving strategies or routes, lacking in-depth analysis of what drivers learn during the process and how this affects their performance. In this work, we make the first attempt to establish Dynamic Human Preference Analytics. We inversely learn the taxi drivers’ preferences from data and characterize the dynamics of such preferences over time. We extract two types of features (i.e., profile features and habit features) to model the decision space of drivers. Then, through inverse reinforcement learning, we learn the preferences of drivers with respect to these features. The results illustrate that self-improving drivers tend to keep adjusting their preferences for habit features to increase their earning efficiency, while keeping their preferences for profile features invariant. In contrast, experienced drivers have stable preferences over time, and exploring drivers tend to adjust their preferences randomly over time.

Journal ArticleDOI
TL;DR: In this paper, the authors propose FMC_TA, a novel incomplete task allocation algorithm that allows tasks to be easily sequenced to yield high-quality solutions in realistic multi-agent team applications.
Abstract: Realistic multi-agent team applications often feature dynamic environments with soft deadlines that penalize late execution of tasks. This puts a premium on quickly allocating tasks to agents. However, when such problems include temporal and spatial constraints that require tasks to be executed sequentially by agents, they are NP-hard, and thus are commonly solved using general and specifically designed incomplete heuristic algorithms. We propose FMC_TA, a novel such incomplete task allocation algorithm that allows tasks to be easily sequenced to yield high-quality solutions. FMC_TA first finds allocations that are fair (envy-free), balancing the load and sharing important tasks among agents, and efficient (Pareto optimal) in a simplified version of the problem. It computes such allocations in polynomial or pseudo-polynomial time (centrally or distributedly, respectively) using a Fisher market with agents as buyers and tasks as goods. It then heuristically schedules the allocations, taking into account inter-agent constraints on shared tasks. We empirically compare our algorithm to state-of-the-art incomplete methods, both centralized and distributed, on law enforcement problems inspired by real police logs. We present a novel formalization of the law enforcement problem, which we use to perform our empirical study. The results show a clear advantage for FMC_TA in total utility and in measures in which law enforcement authorities measure their own performance. Besides problems with realistic properties, the algorithms were compared on synthetic problems in which we increased the size of different elements of the problem to investigate the algorithm’s behavior when the problem scales. The domination of the proposed algorithm was found to be consistent.

Journal ArticleDOI
TL;DR: Two approaches, DUP and RNNPlanner, are proposed to discover target plans based on vector representations of actions learned from plan corpora; both are capable of discovering underlying plans that are not from plan libraries, without requiring action models to be provided.
Abstract: Plan recognition aims to discover target plans (i.e., sequences of actions) behind observed actions, with history plan libraries or action models in hand. Previous approaches either discover plans by maximally “matching” observed actions to plan libraries, assuming target plans are from plan libraries, or infer plans by executing action models to best explain the observed actions, assuming that complete action models are available. In real-world applications, however, target plans are often not from plan libraries, and complete action models are often not available, since building complete sets of plans and complete action models are often difficult or expensive. In this article, we view plan libraries as corpora and learn vector representations of actions using the corpora; we then discover target plans based on the vector representations. Specifically, we propose two approaches, DUP and RNNPlanner, to discover target plans based on vector representations of actions. DUP explores the EM-style (Expectation Maximization) framework to capture local contexts of actions and discover target plans by optimizing the probability of target plans, while RNNPlanner aims to leverage long-short term contexts of actions based on RNNs (Recurrent Neural Networks) framework to help recognize target plans. In the experiments, we empirically show that our approaches are capable of discovering underlying plans that are not from plan libraries without requiring action models provided. We demonstrate the effectiveness of our approaches by comparing its performance to traditional plan recognition approaches in three planning domains. We also compare DUP and RNNPlanner to see their advantages and disadvantages.
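
To illustrate the "plan libraries as corpora" idea, the sketch below learns action embeddings with a skip-gram model over plan traces and ranks candidate actions for an unobserved step by similarity to the observed context. Gensim's Word2Vec is a convenient stand-in for DUP's EM-style model, and the toy plans are invented.

```python
from gensim.models import Word2Vec

# Plan libraries viewed as corpora: each plan is a sequence of action names.
plans = [
    ["pick-up", "stack", "pick-up", "stack"],
    ["unstack", "put-down", "pick-up", "stack"],
    ["unstack", "stack", "unstack", "put-down"],
]
model = Word2Vec(plans, vector_size=32, window=2, min_count=1, sg=1)

def suggest_missing_action(context_actions, model, topn=3):
    """Rank candidates for an unobserved step by similarity to the
    averaged embeddings of the observed context actions."""
    return model.wv.most_similar(positive=context_actions, topn=topn)

print(suggest_missing_action(["pick-up", "stack"], model))
```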

Journal ArticleDOI
TL;DR: This article investigates blackmarket customers engaged in collusive retweeting activities and adopts Weighted Generalized Canonical Correlation Analysis (WGCCA) to combine multiple user representations into embeddings that effectively classify users as genuine users, bots, promotional customers, or normal customers.
Abstract: With the rise in popularity of social media platforms like Twitter, having higher influence on these platforms has a greater value attached to it, since it has the power to influence many decisions in the form of brand promotions and shaping opinions. However, blackmarket services that allow users to inorganically gain influence are a threat to the credibility of these social networking platforms. Twitter users can gain inorganic appraisals in the form of likes, retweets, and follows through these blackmarket services either by paying for them or by joining syndicates wherein they gain such appraisals by providing similar appraisals to other users. These customers tend to exhibit a mix of organic and inorganic retweeting behavior, making it tougher to detect them. In this article, we investigate these blackmarket customers engaged in collusive retweeting activities. We collect and annotate a novel dataset containing various types of information about blackmarket customers and use these sources of information to construct multiple user representations. We adopt Weighted Generalized Canonical Correlation Analysis (WGCCA) to combine these individual representations to derive user embeddings that allow us to effectively classify users as: genuine users, bots, promotional customers, and normal customers. Our method significantly outperforms state-of-the-art approaches (32.95% better macro F1-score than the best baseline).

Journal ArticleDOI
TL;DR: Three social science theories, Emotional Information, Diffusion of Innovations, and Individual Personality, are used to guide signed link analysis and to address the problem of data sparsity, i.e., that only a small percentage of signed links is given.
Abstract: Many real-world relations can be represented by signed networks with positive links (e.g., friendships and trust) and negative links (e.g., foes and distrust). Link prediction helps advance tasks in social network analysis such as recommendation systems. Most existing work on link analysis focuses on unsigned social networks. The existence of negative links piques research interests in investigating whether properties and principles of signed networks differ from those of unsigned networks and mandates dedicated efforts on link analysis for signed social networks. Recent findings suggest that properties of signed networks substantially differ from those of unsigned networks and negative links can be of significant help in signed link analysis in complementary ways. In this article, we center our discussion on a challenging problem of signed link analysis. Signed link analysis faces the problem of data sparsity, i.e., only a small percentage of signed links are given. This problem can even get worse when negative links are much sparser than positive ones as users are inclined more toward positive disposition rather than negative. We investigate how we can take advantage of other sources of information for signed link analysis. This research is mainly guided by three social science theories, Emotional Information, Diffusion of Innovations, and Individual Personality. Guided by these, we extract three categories of related features and leverage them for signed link analysis. Experiments show the significance of the features gleaned from social theories for signed link prediction and addressing the data sparsity challenge.

Journal ArticleDOI
TL;DR: Results show that supervised rank aggregation methods improve the recommended rankings in six out of seven datasets and remain robust even in the presence of a large set of weak recommendation rankings.
Abstract: Recommender Systems are tools designed to help users find relevant information from the myriad of content available online. They work by actively suggesting items that are relevant to users according to their historical preferences or observed actions. Among recommender systems, top-N recommenders work by suggesting a ranking of N items that can be of interest to a user. Although a significant number of top-N recommenders have been proposed in the literature, they often disagree in their returned rankings, offering an opportunity for improving the final recommendation ranking by aggregating the outputs of different algorithms. Rank aggregation was successfully used in a significant number of areas, but only a few rank aggregation methods have been proposed in the recommender systems literature. Furthermore, there is a lack of studies regarding rankings’ characteristics and their possible impacts on the improvements achieved through rank aggregation. This work presents an extensive two-phase experimental analysis of rank aggregation in recommender systems. In the first phase, we investigate the characteristics of rankings recommended by 15 different top-N recommender algorithms regarding agreement and diversity. In the second phase, we look at the results of 19 rank aggregation methods and identify different scenarios where they perform best or worst according to the input rankings’ characteristics. Our results show that supervised rank aggregation methods provide improvements in the results of the recommended rankings in six out of seven datasets. These methods provide robustness even in the presence of a big set of weak recommendation rankings. However, in cases where there was a set of non-diverse high-quality input rankings, supervised and unsupervised algorithms produced similar results. In these cases, we can avoid the cost of the former in favor of the latter.
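
For readers unfamiliar with rank aggregation, a Borda-style count is one of the simplest unsupervised schemes of the kind compared in such studies: each input ranking votes for its items by position and the votes are summed. The sketch below is a generic illustration, not one of the 19 methods evaluated in the article.

```python
from collections import defaultdict

def borda_aggregate(rankings, top_n=None):
    """Aggregate several top-N lists: an item at position p in a list of
    length L earns L - p points; items are re-ranked by total points."""
    scores = defaultdict(float)
    for ranking in rankings:
        length = len(ranking)
        for pos, item in enumerate(ranking):
            scores[item] += length - pos
    merged = sorted(scores, key=scores.get, reverse=True)
    return merged[:top_n] if top_n else merged

# Three recommenders disagree; aggregation promotes broadly supported items.
lists = [["a", "b", "c"], ["b", "a", "d"], ["c", "b", "a"]]
print(borda_aggregate(lists, top_n=3))   # ['b', 'a', 'c']
```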

Journal ArticleDOI
TL;DR: This article proposes a novel end-to-end approach for visual similarity modeling, called deep neighborhood component analysis, which discriminatively trains deep neural networks to jointly learn visual features and similarities.
Abstract: Learning effective visual similarity is an essential problem in multimedia research. Despite the promising progress made in recent years, most existing approaches learn visual features and similarities in two separate stages, which inevitably limits their performance. Once useful information has been lost in the feature extraction stage, it can hardly be recovered later. This article proposes a novel end-to-end approach for visual similarity modeling, called deep neighborhood component analysis, which discriminatively trains deep neural networks to jointly learn visual features and similarities. Specifically, we first formulate a metric learning objective that maximizes the intra-class correlations and minimizes the inter-class correlations under the neighborhood component analysis criterion, and then train deep convolutional neural networks to learn a nonlinear mapping that projects visual instances from original feature space to a discriminative and neighborhood-structure-preserving embedding space, thus resulting in better performance. We conducted extensive evaluations on several widely used and challenging datasets, and the impressive results demonstrate the effectiveness of our proposed approach.
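
For reference, the classical neighborhood component analysis criterion on a batch of embeddings can be sketched as below: each point should, with high probability, select a same-class point as its stochastic nearest neighbor under a softmax over negative squared distances. This is a simplified, assumed rendering of the criterion, not the paper's exact correlation-based formulation.

```python
import torch

def nca_loss(embeddings, labels):
    """Negative log-probability of picking a same-class neighbor."""
    n = embeddings.size(0)
    dists = torch.cdist(embeddings, embeddings) ** 2
    eye = torch.eye(n, dtype=torch.bool)
    dists = dists.masked_fill(eye, float("inf"))      # a point never picks itself
    log_p = torch.log_softmax(-dists, dim=1)          # neighbor-selection log-probs
    same_class = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    p_correct = (log_p.exp() * same_class).sum(dim=1) # mass on same-class neighbors
    return -torch.log(p_correct + 1e-12).mean()

# Toy usage with random vectors standing in for CNN features.
z = torch.randn(8, 16, requires_grad=True)
y = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
loss = nca_loss(z, y)
loss.backward()
```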

Journal ArticleDOI
TL;DR: HERA integrates the strengths of both the pairwise ranking loss and the pointwise reconstruction loss to provide informative label ranking and reconstruction information for label identification, whereas the embedded sparse and low-rank scheme constrains the sparsity of the ground-truth label matrix and the low rank of the noise label matrix to explore the global label relevance among the whole training data.
Abstract: Partial label learning (PLL) aims to learn from the data where each training instance is associated with a set of candidate labels, among which only one is correct. Most existing methods deal with this type of problem by either treating each candidate label equally or identifying the ground-truth label iteratively. In this article, we propose a novel PLL approach named HERA, which simultaneously incorporates the HeterogEneous Loss and the SpaRse and Low-rAnk procedure to estimate the labeling confidence for each instance while training the desired model. Specifically, the heterogeneous loss integrates the strengths of both the pairwise ranking loss and the pointwise reconstruction loss to provide informative label ranking and reconstruction information for label identification, whereas the embedded sparse and low-rank scheme constrains the sparsity of ground-truth label matrix and the low rank of noise label matrix to explore the global label relevance among the whole training data, for improving the learning model. Comprehensive ablation study demonstrates the effectiveness of our employed heterogeneous loss, and extensive experiments on both artificial and real-world datasets demonstrate that our method achieves superior or comparable performance against state-of-the-art methods.

Journal ArticleDOI
TL;DR: A multi-task learning framework with deep neural networks (DNNs) to jointly learn and optimize two companion tasks in Web search engines: entity recommendation and document ranking, which can be easily trained in an end-to-end manner.
Abstract: Entity recommendation, providing users with an improved search experience by proactively recommending related entities to a given query, has become an indispensable feature of today’s Web search engine. Existing studies typically only consider the query issued at the current timestep while ignoring the in-session user search behavior (short-term search history) or historical user search behavior across all sessions (long-term search history) when generating entity recommendations. As a consequence, they may fail to recommend entities of interest relevant to a user’s actual information need. In this work, we believe that both short-term and long-term search history convey valuable evidence that could help understand the user’s search intent behind a query, and take both of them into consideration for entity recommendation. Furthermore, there has been little work on exploring whether the use of other companion tasks in Web search such as document ranking as auxiliary tasks could improve the performance of entity recommendation. To this end, we propose a multi-task learning framework with deep neural networks (DNNs) to jointly learn and optimize two companion tasks in Web search engines: entity recommendation and document ranking, which can be easily trained in an end-to-end manner. Specifically, we regard document ranking as an auxiliary task to improve the main task of entity recommendation, where the representations of queries, sessions, and users are shared across all tasks and optimized by the multi-task objective during training. We evaluate our approach using large-scale, real-world search logs of a widely-used commercial Web search engine. We also performed extensive ablation experiments over a number of facets of the proposed multi-task DNN model to figure out their relative importance. The experimental results show that both short-term and long-term search history can bring significant improvements in recommendation effectiveness, and the combination of both outperforms using either of them individually. In addition, the experiments show that the performance of both entity recommendation and document ranking can be significantly improved, which demonstrates the effectiveness of using multi-task learning to jointly optimize the two companion tasks in Web search.
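
A schematic version of the hard-parameter-sharing setup described above is sketched below: a shared encoder over query/session/user features feeds an entity-recommendation head and an auxiliary document-ranking head, and the two losses are combined. Layer sizes, the auxiliary loss, and the 0.5 weight are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class MultiTaskSearchModel(nn.Module):
    """Shared encoder with two heads: entity recommendation (main task)
    and document ranking (auxiliary task)."""
    def __init__(self, input_dim=256, hidden_dim=128, num_entities=1000):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Linear(input_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.entity_head = nn.Linear(hidden_dim, num_entities)   # scores over entities
        self.ranking_head = nn.Linear(hidden_dim, 1)              # document relevance

    def forward(self, features):
        shared = self.shared(features)
        return self.entity_head(shared), self.ranking_head(shared).squeeze(-1)

model = MultiTaskSearchModel()
x = torch.randn(4, 256)                        # stand-in query/session/user features
entity_scores, doc_scores = model(x)
main_loss = nn.functional.cross_entropy(entity_scores, torch.tensor([1, 2, 3, 4]))
aux_loss = nn.functional.mse_loss(doc_scores, torch.rand(4))   # toy relevance targets
(main_loss + 0.5 * aux_loss).backward()
```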

Journal ArticleDOI
TL;DR: This article proposes a new learning framework, called Representation Learning for Road Networks (RLRN), which explores various intrinsic properties of road networks to learn embeddings of intersections and road segments in road networks.
Abstract: Informative representation of road networks is essential to a wide variety of applications on intelligent transportation systems. In this article, we design a new learning framework, called Representation Learning for Road Networks (RLRN), which explores various intrinsic properties of road networks to learn embeddings of intersections and road segments in road networks. To implement the RLRN framework, we propose a new neural network model, namely Road Network to Vector (RN2Vec), to learn embeddings of intersections and road segments jointly by exploring geo-locality and homogeneity of them, topological structure of the road networks, and moving behaviors of road users. In addition to model design, issues involving data preparation for model training are examined. We evaluate the learned embeddings via extensive experiments on several real-world datasets using different downstream test cases, including node/edge classification and travel time estimation. Experimental results show that the proposed RN2Vec robustly outperforms existing methods, including (i) Feature-based methods: raw features and principal components analysis (PCA); (ii) Network embedding methods: DeepWalk, LINE, and Node2vec; and (iii) Features + Network structure-based methods: network embeddings and PCA, graph convolutional networks, and graph attention networks. RN2Vec significantly outperforms all of them in terms of F1-score in classifying traffic signals (11.96% to 16.86%) and crossings (11.36% to 16.67%) on intersections and in classifying avenue (10.56% to 15.43%) and street (11.54% to 16.07%) on road segments, as well as in terms of Mean Absolute Error in travel time estimation (17.01% to 23.58%).

Journal ArticleDOI
TL;DR: A novel single-image snow removal framework composed of a sparse image approximation module and an adaptive tolerance optimization module, which achieves better snow removal efficacy than other state-of-the-art techniques in both objective and subjective evaluations.
Abstract: Images are often corrupted by natural obscuration (e.g., snow, rain, and haze) during acquisition in bad weather conditions. The removal of snowflakes from only a single image is a challenging task due to situational variety and has been investigated only rarely. In this article, we propose a novel snow removal framework for a single image, which can be separated into a sparse image approximation module and an adaptive tolerance optimization module. The first proposed module takes the advantage of sparsity-based regularization to reconstruct a potential snow-free image. An auto-tuning mechanism for this framework is then proposed to seek a better reconstruction of a snow-free image via the time-varying inertia weight particle swarm optimizers in the second proposed module. Through collaboration of these two modules iteratively, the number of snowflakes in the reconstructed image is reduced as generations progress. By the experimental results, the proposed method achieves a better efficacy of snow removal than do other state-of-the-art techniques via both objective and subjective evaluations. As a result, the proposed method is able to remove snowflakes successfully from only a single image while preserving most original object structure information.