Showing papers in "ACM Transactions on Intelligent Systems and Technology in 2021"


Journal ArticleDOI
TL;DR: Wang et al. proposed a Bicycle Station Dynamic Planning (BSDP) system that dynamically provides the optimal bicycle station layout for the DL-PBS network. It consists of four modules: bicycle drop-off location clustering, bicycle-station graph modeling, bicycle-station location prediction, and bicycle-station layout recommendation.
Abstract: Benefiting from convenient cycling and flexible parking locations, the Dockless Public Bicycle-sharing (DL-PBS) network becomes increasingly popular in many countries. However, redundant and low-utility stations waste public urban space and maintenance costs of DL-PBS vendors. In this article, we propose a Bicycle Station Dynamic Planning (BSDP) system to dynamically provide the optimal bicycle station layout for the DL-PBS network. The BSDP system contains four modules: bicycle drop-off location clustering, bicycle-station graph modeling, bicycle-station location prediction, and bicycle-station layout recommendation. In the bicycle drop-off location clustering module, candidate bicycle stations are clustered from each spatio-temporal subset of the large-scale cycling trajectory records. In the bicycle-station graph modeling module, a weighted digraph model is built based on the clustering results and inferior stations with low station revenue and utility are filtered. Then, graph models across time periods are combined to create a graph sequence model. In the bicycle-station location prediction module, the GGNN model is used to train the graph sequence data and dynamically predict bicycle stations in the next period. In the bicycle-station layout recommendation module, the predicted bicycle stations are fine-tuned according to the government urban management plan, which ensures that the recommended station layout is conducive to city management, vendor revenue, and user convenience. Experiments on actual DL-PBS networks verify the effectiveness, accuracy, and feasibility of the proposed BSDP system.

51 citations


Journal ArticleDOI
TL;DR: The current state of play in VANETs development is surveyed, including the key technologies critical to the field, the resource-management and safety applications needed for smooth operations, the communications and data transmission protocols that support networking, and the theoretical and environmental constructs underpinning research and development.
Abstract: Vehicular ad hoc networks (VANETs) and the services they support are an essential part of intelligent transportation. Through physical technologies, applications, protocols, and standards, they hel...

33 citations


Journal ArticleDOI
TL;DR: In this article, a feature map is proposed to represent a node with respect to a specific attribute, using all attributes of its h-hop neighbors; different classifiers are then learned on these feature vectors to predict the value of that attribute.
Abstract: In many graphs such as social networks, nodes have associated attributes representing their behavior. Predicting node attributes in such graphs is an important task with applications in many domains like recommendation systems, privacy preservation, and targeted advertisement. Attribute values can be predicted by treating each node as a data point described by attributes and employing classification/regression algorithms. However, in social networks, there is complex interdependence between node attributes and pairwise interaction. For instance, attributes of nodes are influenced by their neighbors (social influence), and neighborhoods (friendships) between nodes are established based on pairwise (dis)similarity between their attributes (social selection). In this article, we establish that information in network topology is extremely useful in determining node attributes. In particular, we use self- and cross-proclivity measures (quantitative measures of how much a node attribute depends on the same and other attributes of its neighbors) to predict node attributes. We propose a feature map to represent a node with respect to a specific attribute a, using all attributes of its h-hop neighbors. Different classifiers are then learned on these feature vectors to predict the value of attribute a. We perform extensive experimentation on 10 real-world datasets and show that the proposed method significantly outperforms known approaches in terms of prediction accuracy.
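The idea of describing a node by the attributes of its h-hop neighbors can be illustrated with a rough sketch. The function names, the breadth-first neighborhood search, and the "fraction of neighbors holding each attribute value" encoding are all assumptions made here for illustration; the abstract does not specify the paper's actual feature map or its proclivity measures.

```python
from collections import Counter, deque


def h_hop_neighbors(adj, node, h):
    """Return every node reachable from `node` in at most h hops (excluding it)."""
    seen = {node}
    found = set()
    frontier = deque([(node, 0)])
    while frontier:
        current, depth = frontier.popleft()
        if depth == h:
            continue  # do not expand beyond h hops
        for neighbor in adj.get(current, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                found.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return found


def neighbor_feature_vector(adj, attrs, node, h, vocab):
    """Hypothetical encoding: for each (attribute, value) pair in `vocab`,
    the fraction of h-hop neighbors that carry that pair."""
    neighbors = h_hop_neighbors(adj, node, h)
    counts = Counter()
    for neighbor in neighbors:
        for attribute, value in attrs.get(neighbor, {}).items():
            counts[(attribute, value)] += 1
    total = len(neighbors)
    return [counts[key] / total if total else 0.0 for key in vocab]
```

Vectors built this way could be passed to any off-the-shelf classifier to predict the target attribute of the central node, which is the general workflow the abstract describes.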

30 citations


Journal ArticleDOI
TL;DR: The authors note that Artificial Intelligence (AI) will transform the way we live and work over the next couple of decades, and they address shortcomings of existing AI approaches for urban computing.
Abstract: From facial recognition to autonomous driving, Artificial Intelligence (AI) will transform the way we live and work over the next couple of decades. Existing AI approaches for urban computing suffe...

26 citations


Journal ArticleDOI
TL;DR: The increase in scale and complexity of distributed computing systems challenges the O&M teams that perform daily monitoring of these systems.
Abstract: Modern society is increasingly moving toward complex and distributed computing systems. The increase in scale and complexity of these systems challenges O&M teams that perform daily monitoring and ...

23 citations


Journal ArticleDOI
TL;DR: Isudra, an indirectly supervised anomaly detector for time series data, employs Bayesian optimization to select time scales, features, base detector algorithms, and hyperparameters that increase true positive and decrease false positive detection.
Abstract: Anomaly detection techniques can extract a wealth of information about unusual events. Unfortunately, these methods yield an abundance of findings that are not of interest, obscuring relevant anomalies. In this work, we improve upon traditional anomaly detection methods by introducing Isudra, an Indirectly Supervised Detector of Relevant Anomalies from time series data. Isudra employs Bayesian optimization to select time scales, features, base detector algorithms, and algorithm hyperparameters that increase true positive and decrease false positive detection. This optimization is driven by a small amount of example anomalies, driving an indirectly supervised approach to anomaly detection. Additionally, we enhance the approach by introducing a warm-start method that reduces optimization time between similar problems. We validate the feasibility of Isudra to detect clinically relevant behavior anomalies from over 2M sensor readings collected in five smart homes, reflecting 26 health events. Results indicate that indirectly supervised anomaly detection outperforms both supervised and unsupervised algorithms at detecting instances of health-related anomalies such as falls, nocturia, depression, and weakness.

22 citations


Journal ArticleDOI
TL;DR: Wang et al. proposed an intelligent information system that calculates the influence of the insertion time of each batch in a large-scale stream database by applying the sliding window model and mines recent high utility patterns without generating candidate patterns.
Abstract: Databases that deal with the real world have various characteristics. New data is continuously inserted over time without limiting the length of the database, and a variety of information about the items constituting the database is contained. Recently generated data has a greater influence than the previously generated data. These are called the time-sensitive non-binary stream databases, and they include databases such as web-server click data, market sales data, data from sensor networks, and network traffic measurement. Many high utility pattern mining and stream pattern mining methods have been proposed so far. However, they have a limitation that they are not suitable to analyze these databases, because they find valid patterns by analyzing a database with only some of the features described above. Therefore, knowledge-based software about how to find meaningful information efficiently by analyzing databases with these characteristics is required. In this article, we propose an intelligent information system that calculates the influence of the insertion time of each batch in a large-scale stream database by applying the sliding window model and mines recent high utility patterns without generating candidate patterns. In addition, a novel list-based data structure is suggested for a fast and efficient management of the time-sensitive stream databases. Moreover, our technique is compared with state-of-the-art algorithms through various experiments using real datasets and synthetic datasets. The experimental results show that our approach outperforms the previously proposed methods in terms of runtime, memory usage, and scalability.

17 citations


Journal ArticleDOI
TL;DR: A comprehensive review of the new research trends of CTG can be found in this article, where the authors summarize several key techniques and illustrate the technical evolution route in the field of neural text generation.
Abstract: In recent years, with the development of deep learning, text-generation technology has undergone great changes and provided many kinds of services for human beings, such as restaurant reservation and daily communication. The automatically generated text is becoming more and more fluent so researchers begin to consider more anthropomorphic text-generation technology, that is, the conditional text generation, including emotional text generation, personalized text generation, and so on. Conditional Text Generation (CTG) has thus become a research hotspot. As a promising research field, we find that much attention has been paid to exploring it. Therefore, we aim to give a comprehensive review of the new research trends of CTG. We first summarize several key techniques and illustrate the technical evolution route in the field of neural text generation, based on the concept model of CTG. We further make an investigation of existing CTG fields and propose several general learning models for CTG. Finally, we discuss the open issues and promising research directions of CTG.

16 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed an industrial federated topic modeling (iFTM) framework, in which multiple parties collaboratively train a high-quality topic model by simultaneously alleviating data scarcity and maintaining immunity to privacy adversaries.
Abstract: Probabilistic topic modeling has been applied in a variety of industrial applications. Training a high-quality model usually requires a massive amount of data to provide comprehensive co-occurrence information for the model to learn. However, industrial data such as medical or financial records are often proprietary or sensitive, which precludes uploading to data centers. Hence, training topic models in industrial scenarios using conventional approaches faces a dilemma: A party (i.e., a company or institute) has to either tolerate data scarcity or sacrifice data privacy. In this article, we propose a framework named Industrial Federated Topic Modeling (iFTM), in which multiple parties collaboratively train a high-quality topic model by simultaneously alleviating data scarcity and maintaining immunity to privacy adversaries. iFTM is inspired by federated learning, supports two representative topic models (i.e., Latent Dirichlet Allocation and SentenceLDA) in industrial applications, and consists of novel techniques such as private Metropolis-Hastings, topic-wise normalization, and heterogeneous model integration. We conduct quantitative evaluations to verify the effectiveness of iFTM and deploy iFTM in two real-life applications to demonstrate its utility. Experimental results verify iFTM’s superiority over conventional topic modeling.

13 citations


Journal ArticleDOI
TL;DR: A task-adaptative model-agnostic meta-learning framework to learn city-specific prior initializations from multiple cities, capable of handling the multimodal data distribution and accelerating the adaptation in new cities compared to other methods is proposed.
Abstract: Optimal store placement aims to identify the optimal location for a new brick-and-mortar store that can maximize its sale by analyzing and mining users’ preferences from large-scale urban data. In ...

12 citations



Journal ArticleDOI
TL;DR: The authors propose a Mixed Aspect Sampling (MAS) framework that samples instances capturing different semantic aspects of the dataset and uses an ensemble classifier to improve classification performance. Experiments show that MAS performs better than random sampling and active learning models on abuse detection tasks, where it is hard to collect the labelled data needed to build an accurate classifier.
Abstract: Language model (LM) has become a common method of transfer learning in Natural Language Processing (NLP) tasks when working with small labeled datasets. An LM is pretrained using an easily available large unlabelled text corpus and is fine-tuned with the labelled data to apply to the target (i.e., downstream) task. As an LM is designed to capture the linguistic aspects of semantics, it can be biased to linguistic features. We argue that exposing an LM model during fine-tuning to instances that capture diverse semantic aspects (e.g., topical, linguistic, semantic relations) present in the dataset will improve its performance on the underlying task. We propose a Mixed Aspect Sampling (MAS) framework to sample instances that capture different semantic aspects of the dataset and use the ensemble classifier to improve the classification performance. Experimental results show that MAS performs better than random sampling as well as the state-of-the-art active learning models to abuse detection tasks where it is hard to collect the labelled data for building an accurate classifier.

Journal ArticleDOI
TL;DR: This work proposes a method called Graph Transformer based Auto Encoder (GTAE), which models a sentence as a linguistic graph and performs feature extraction and style transfer at the graph level, to maximally retain the content and the linguistic structure of original sentences.
Abstract: Non-parallel text style transfer has attracted increasing research interests in recent years. Despite successes in transferring the style based on the encoder-decoder framework, current approaches ...


Journal ArticleDOI
TL;DR: An IoT-inspired framework has been proposed for real-time analysis of athlete performance and acquired enhanced performance values in terms of Temporal Delay, Classification Efficiency, Statistical Efficacy, Correlation Analysis, and Reliability.
Abstract: Internet of Things (IoT) technology backed by Artificial Intelligence (AI) techniques has been increasingly utilized for the realization of the Industry 4.0 vision. Conspicuously, this work provide...

Journal ArticleDOI
TL;DR: This article addresses the specification of normative multiagent systems (nMAS), which is challenging because different agents often have conflicting requirements; existing approaches can resolve only clear-cut conflicts.
Abstract: Specifying a normative multiagent system (nMAS) is challenging, because different agents often have conflicting requirements. Whereas existing approaches can resolve clear-cut conflicts, tradeoffs ...

Journal ArticleDOI
TL;DR: Through large-scale quantitative experiments, it is shown that with TFE, the clients can enjoy far better ASR solutions than the “one-size-fits-all” counterpart, and the vendors can exploit the abundance of clients’ data to effectively refine their own ASR products.
Abstract: Automatic Speech Recognition (ASR) is playing a vital role in a wide range of real-world applications. However, Commercial ASR solutions are typically “one-size-fits-all” products and clients are i...

Journal ArticleDOI
TL;DR: In this article, the authors focus on the issue of outlier detection over distributed trajectory streams, where the outliers refer to a few entities whose motion behaviors are significantly different from their local neighbors.
Abstract: Owing to a wide variety of deployment of GPS-enabled devices, tremendous amounts of trajectories have been generated in distributed stream manner. It opens up new opportunities to track and analyze the moving behaviors of the entities. In this work, we focus on the issue of outlier detection over distributed trajectory streams, where the outliers refer to a few entities whose motion behaviors are significantly different from their local neighbors. In view of skewed distribution property and evolving nature of trajectory data, and on-the-fly detection requirement over distributed streams, we first design a high-efficiency outlier detection solution. It consists of identifying abnormal trajectory fragment and exceptional fragment cluster at the remote sites and then detecting abnormal evolving object at the coordinator site. Further, given that outlier detection accuracy would be damaged due to using inappropriate proximity thresholds or a few trajectory data not having sufficient neighbors at the remote sites, we extract proximity thresholds of different regions and spatial context relationship of each region from historical data to improve the precision. Built upon this is an improved version consisting of off-line modeling phase and on-line detection phase. During the on-line phase, the proximity thresholds that are derived from historical trajectories during the off-line phase are leveraged to assist in detecting abnormal trajectory fragments and exceptional fragment clusters at the remote sites. Additionally, at the coordinator site, the detection results of some remote sites can be refined by incorporating those of other remote sites with neighborhood relationship. Extensive experimental results on real data demonstrate that our proposed methods own high detection validity, less communication cost and linear scalability for online identifying outliers over distributed trajectory streams.

Journal ArticleDOI
TL;DR: Multi-view subspace clustering (MVSC) finds a shared structure in latent low-dimensional subspaces of multi-view data to enhance clustering performance.
Abstract: Multi-view subspace clustering (MVSC) finds a shared structure in latent low-dimensional subspaces of multi-view data to enhance clustering performance. Nonetheless, we observe that most existing M...

Journal ArticleDOI
TL;DR: Li et al. proposed a Causal Mechanism Transfer Network (CMTN) for time series domain adaptation, which lets data-driven models exploit existing data and labels from similar systems so that the resulting model on a new system is highly reliable even with limited data.
Abstract: Data-driven models are becoming essential parts in modern mechanical systems, commonly used to capture the behavior of various equipment and varying environmental characteristics. Despite the advantages of these data-driven models on excellent adaptivity to high dynamics and aging equipment, they are usually hungry for massive labels, mostly contributed by human engineers at a high cost. Fortunately, domain adaptation enhances the model generalization by utilizing the labeled source data and the unlabeled target data. However, the mainstream domain adaptation methods cannot achieve ideal performance on time series data, since they assume that the conditional distributions are equal. This assumption works well in the static data but is inapplicable for the time series data. Even the first-order Markov dependence assumption requires the dependence between any two consecutive time steps. In this article, we assume that the causal mechanism is invariant and present our Causal Mechanism Transfer Network (CMTN) for time series domain adaptation. By capturing causal mechanisms of time series data, CMTN allows the data-driven models to exploit existing data and labels from similar systems, such that the resulting model on a new system is highly reliable even with limited data. We report our empirical results and lessons learned from two real-world case studies, on chiller plant energy optimization and boiler fault detection, which outperform the existing state-of-the-art method.

Journal ArticleDOI
TL;DR: This article proposed a rating transformation model that compensates for skew in the rating distribution as well as its central tendency by converting ratings into percentile values as a pre-processing step before recommendation generation.
Abstract: It is well known that explicit user ratings in recommender systems are biased toward high ratings and that users differ significantly in their usage of the rating scale. Implementers usually compensate for these issues through rating normalization or the inclusion of a user bias term in factorization models. However, these methods adjust only for the central tendency of users’ distributions. In this work, we demonstrate that a lack of flatness in rating distributions is negatively correlated with recommendation performance. We propose a rating transformation model that compensates for skew in the rating distribution as well as its central tendency by converting ratings into percentile values as a pre-processing step before recommendation generation. This transformation flattens the rating distribution, better compensates for differences in rating distributions, and improves recommendation performance. We also show that a smoothed version of this transformation can yield more intuitive results for users with very narrow rating distributions. A comprehensive set of experiments, with state-of-the-art recommendation algorithms in four real-world datasets, show improved ranking performance for these percentile transformations.
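The percentile conversion described in this abstract can be illustrated with a minimal sketch. The function name and the mid-rank convention for tied ratings are assumptions made here for illustration; the abstract does not state which percentile definition the paper uses, and the smoothed variant it mentions is not shown.

```python
from bisect import bisect_left, bisect_right


def percentile_transform(user_ratings):
    """Map one user's ratings to percentiles within that user's own history.

    `user_ratings` is a dict of item -> rating. Ties share the mid-rank
    percentile (an assumed convention, not taken from the paper).
    """
    ordered = sorted(user_ratings.values())
    n = len(ordered)
    transformed = {}
    for item, rating in user_ratings.items():
        below = bisect_left(ordered, rating)           # ratings strictly lower
        equal = bisect_right(ordered, rating) - below  # ratings tied with this one
        transformed[item] = 100.0 * (below + 0.5 * equal) / n
    return transformed


# A user who mostly gives high ratings: the two 5s no longer sit at the scale maximum.
print(percentile_transform({"a": 5, "b": 5, "c": 4, "d": 3}))
# → {'a': 75.0, 'b': 75.0, 'c': 37.5, 'd': 12.5}
```

Under this convention, a user who gives every item the same rating has all ratings mapped to 50, which may be the kind of narrow-distribution case that motivates the smoothed version mentioned in the abstract.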

Journal ArticleDOI
TL;DR: A novel ensemble model, called Multiple Kernel Ensemble Learning (MKEL), is developed by introducing a unified ensemble loss.
Abstract: In this article, a novel ensemble model, called Multiple Kernel Ensemble Learning (MKEL), is developed by introducing a unified ensemble loss. Different from the previous multiple kernel learning (...

Journal ArticleDOI
TL;DR: Real-time segmentation and understanding of driving scenes are crucial in autonomous driving and traditional pixel-wise approaches extract scene information by segmenting all pixels in a frame.
Abstract: Real-time segmentation and understanding of driving scenes are crucial in autonomous driving. Traditional pixel-wise approaches extract scene information by segmenting all pixels in a frame, and he...

Journal ArticleDOI
TL;DR: Unsupervised domain adaptation for person re-identification (re-ID) is a challenging task due to large variations in human classes, illuminations, camera views, and so on.
Abstract: Unsupervised domain adaptation (UDA) for person re-identification (re-ID) is a challenging task due to large variations in human classes, illuminations, camera views, and so on. Currently, existing...

Journal ArticleDOI
TL;DR: Given a collection of data points, the authors address cluster detection, which is widely used in applications including public health, public safety, and transportation.
Abstract: Cluster detection is important and widely used in a variety of applications, including public health, public safety, transportation, and so on. Given a collection of data points, we aim to detect d...

Journal ArticleDOI
TL;DR: Graph edge partitioning, which is essential for the efficiency of distributed graph computation systems, divides a graph into several balanced partitions within a given size to minimize the number of disconnected edges.
Abstract: Graph edge partitioning, which is essential for the efficiency of distributed graph computation systems, divides a graph into several balanced partitions within a given size to minimize the number ...

Journal ArticleDOI
TL;DR: Three variations of novel vector-quantization-based topic models (VQ-TMs) are proposed that capitalize on vector quantization techniques, embedded input documents, and viewing words as mixtures of topics.
Abstract: With the purpose of learning and utilizing explicit and dense topic embeddings, we propose three variations of novel vector-quantization-based topic models (VQ-TMs): (1) Hard VQ-TM, (2) Soft VQ-TM,...


Journal ArticleDOI
TL;DR: Existing topic modeling and text segmentation methods generally require large training datasets, which limits their capabilities when only small collections of text are available; this work targets that small-data setting.
Abstract: Existing topic modeling and text segmentation methodologies generally require large datasets for training, limiting their capabilities when only small collections of text are available. In this wor...

Journal ArticleDOI
TL;DR: In this paper, the task of aspect controlled response generation in a multimodal task-oriented dialog system is addressed by employing a multimodal hierarchical memory network that generates responses using information from both text and images.
Abstract: Multimodality in dialogue systems has opened up new frontiers for the creation of robust conversational agents. Any multimodal system aims at bridging the gap between language and vision by leveraging diverse and often complementary information from image, audio, and video, as well as text. For every task-oriented dialog system, different aspects of the product or service are crucial for satisfying the user’s demands. Based upon the aspect, the user decides upon selecting the product or service. The ability to generate responses with the specified aspects in a goal-oriented dialogue setup facilitates user satisfaction by fulfilling the user’s goals. Therefore, in our current work, we propose the task of aspect controlled response generation in a multimodal task-oriented dialog system. We employ a multimodal hierarchical memory network for generating responses that utilize information from both text and images. As there was no readily available data for building such multimodal systems, we create a Multi-Domain Multi-Modal Dialog (MDMMD++) dataset. The dataset comprises the conversations having both text and images belonging to the four different domains, such as hotels, restaurants, electronics, and furniture. Quantitative and qualitative analysis on the newly created MDMMD++ dataset shows that the proposed methodology outperforms the baseline models for the proposed task of aspect controlled response generation.