
Showing papers by "Zhenhui Li published in 2021"


Journal ArticleDOI
TL;DR: In this paper, the authors focus on investigating the recent advances in using reinforcement learning (RL) techniques to solve the traffic signal control problem and classify the known approaches based on the RL techniques they use and provide a review of existing models with analysis on their advantages and disadvantages.
Abstract: Traffic signal control is an important and challenging real-world problem that has recently received a large amount of interest from both the transportation and computer science communities. In this survey, we focus on investigating the recent advances in using reinforcement learning (RL) techniques to solve the traffic signal control problem. We classify the known approaches based on the RL techniques they use and provide a review of existing models with analysis of their advantages and disadvantages. Moreover, we give an overview of the simulation environments and experimental settings that have been developed to evaluate traffic signal control methods. Finally, we explore future directions in the area of RL-based traffic signal control methods. We hope this survey can provide insights to researchers dealing with real-world applications in intelligent transportation systems.
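
As a deliberately toy illustration of the surveyed setting, the sketch below applies tabular Q-learning to a single simulated intersection: the state is the discretized queue on each approach plus the current phase, the action is the next green phase, and the reward is the negative total queue. The dynamics, discretization, and reward here are illustrative assumptions, not any particular method from the survey.

```python
import numpy as np

rng = np.random.default_rng(0)

N_PHASES = 2          # e.g., NS-green vs. EW-green (toy assumption)
MAX_QUEUE = 10        # queues are clipped/discretized for a tabular state
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

# Q-table indexed by (ns_queue, ew_queue, current_phase, action)
Q = np.zeros((MAX_QUEUE + 1, MAX_QUEUE + 1, N_PHASES, N_PHASES))

def step(ns, ew, phase, action):
    """Toy intersection dynamics: both approaches receive random arrivals,
    the green approach discharges vehicles. Reward = -(total queue)."""
    ns, ew = ns + rng.poisson(1.0), ew + rng.poisson(1.0)
    if action == 0:
        ns = max(ns - 3, 0)   # NS green discharges up to 3 vehicles
    else:
        ew = max(ew - 3, 0)
    ns, ew = min(ns, MAX_QUEUE), min(ew, MAX_QUEUE)
    return ns, ew, action, -(ns + ew)

ns, ew, phase = 0, 0, 0
for t in range(50_000):
    # epsilon-greedy selection over signal phases
    if rng.random() < EPS:
        a = int(rng.integers(N_PHASES))
    else:
        a = int(np.argmax(Q[ns, ew, phase]))
    ns2, ew2, phase2, r = step(ns, ew, phase, a)
    # standard one-step Q-learning update
    td_target = r + GAMMA * Q[ns2, ew2, phase2].max()
    Q[ns, ew, phase, a] += ALPHA * (td_target - Q[ns, ew, phase, a])
    ns, ew, phase = ns2, ew2, phase2
```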

78 citations


Journal ArticleDOI
TL;DR: The findings suggest that connected communities can influence each other from a distance and that connectivity to less disadvantaged work hubs may decrease local crime – with implications for advancing knowledge on the relational ecology of crime, social isolation, and ecological networks.
Abstract: Research on communities and crime has predominantly focused on social conditions within an area or in its immediate proximity. However, a growing body of research shows that people often travel to ...

17 citations


Proceedings ArticleDOI
19 Apr 2021
TL;DR: In this paper, the authors propose to use pervasive speed data to recover the temporal origin-destination (TOD) of vehicles, with other mobility data used as auxiliary information.
Abstract: Understanding city-wide traffic problems may benefit many downstream applications, such as city planning and public transportation development. One key step in understanding traffic is to reveal how many people travel from one location to another during a given period (we call this TOD, short for temporal origin-destination). With TOD, we can rebuild city-wide traffic by simulating the volume and speed on each road segment. Frequently used mobility data, e.g., GPS trajectories and surveillance cameras, can only cover a subset of vehicles or selected regions of the city. Hence, we propose to use pervasive speed data to recover TOD, and to use other mobility data as auxiliary data. To the best of our knowledge, we are the first to work on this challenging problem. It is highly challenging because speed observations are generated from TOD through a complex process, and there exist multiple TOD distributions that may generate similar city-wide road speed observations. We propose a new method that models the complex process via separate modules and takes in auxiliary data to eliminate infeasible solutions. Extensive experiments on synthetic and real datasets have shown the superior performance of our model over baselines.
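
A minimal sketch of the inverse-problem framing, under a strong simplifying assumption: the forward process from TOD to road speeds is reduced to a fixed linear model (matrix A), whereas the paper models it with separate learned modules. Auxiliary mobility data is likewise reduced to a prior on total demand that regularizes away infeasible solutions; all numbers are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
n_od, n_roads = 20, 30

# Assumed toy forward model: speed drops linearly with OD-induced volume.
# A is a stand-in route-incidence matrix; the paper's forward process
# (route choice, flow propagation) is far richer.
A = rng.random((n_roads, n_od)) * (rng.random((n_roads, n_od)) < 0.3)
free_flow = 60.0

def speeds_from_tod(tod):
    return free_flow - A @ tod      # toy: speed = free flow minus load

true_tod = rng.uniform(0, 5, n_od)
observed = speeds_from_tod(true_tod) + rng.normal(0, 0.1, n_roads)

# Auxiliary data stand-in: a coarse prior on total demand.
total_demand_prior = true_tod.sum()

tod = np.ones(n_od)
lr, lam = 1e-3, 1.0
for _ in range(5000):
    resid = speeds_from_tod(tod) - observed
    # gradient of 0.5*||resid||^2 plus a soft total-demand penalty
    grad = -A.T @ resid + lam * (tod.sum() - total_demand_prior)
    tod = np.clip(tod - lr * grad, 0, None)   # demand is non-negative
print("mean absolute recovery error:", np.abs(tod - true_tod).mean())
```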

4 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present a method to understand when and where certain apps are used by users, a problem that has been challenging due to the highly skewed distribution of the data.
Abstract: Both app developers and service providers have strong motivations to understand when and where certain apps are used by users. However, it has been a challenging problem due to the highly skewed an...

3 citations


Posted Content
Hua Wei, Deheng Ye, Zhao Liu, Hao Wu, Bo Yuan, Qiang Fu, Wei Yang, Zhenhui Li
TL;DR: In this paper, a residual generative model is proposed to reduce policy approximation error for offline RL, which can learn more accurate policy approximations in different benchmark datasets and can learn competitive AI agents in complex control tasks under the multiplayer online battle arena (MOBA) game Honor of Kings.
Abstract: Offline reinforcement learning (RL) tries to learn a near-optimal policy from recorded offline experience without online exploration. Current offline RL research includes: 1) generative modeling, i.e., approximating a policy using fixed data; and 2) learning the state-action value function. While most research focuses on the value function part by reducing the bootstrapping error in value function approximation induced by the distribution shift of the training data, the effects of error propagation in generative modeling have been neglected. In this paper, we analyze the error in generative modeling. We propose AQL (action-conditioned Q-learning), a residual generative model that reduces policy approximation error for offline RL. We show that our method can learn more accurate policy approximations on different benchmark datasets. In addition, we show that the proposed offline RL method can learn more competitive AI agents in complex control tasks in the multiplayer online battle arena (MOBA) game Honor of Kings.
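
A sketch of the residual-generative-model idea in isolation: a base policy proposes an action from the state and a residual head corrects it, with both fitted to logged state-action pairs. The architecture, dimensions, and data below are placeholders; this is not the exact AQL design, whose action-conditioned Q-learning component is omitted here.

```python
import torch
import torch.nn as nn

STATE_DIM, ACT_DIM = 8, 2

class ResidualPolicy(nn.Module):
    """Base generative policy plus a residual head; the residual is meant
    to absorb the approximation error left by the base model (a sketch of
    the residual-generative idea, not the paper's AQL architecture)."""
    def __init__(self):
        super().__init__()
        self.base = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                  nn.Linear(64, ACT_DIM))
        self.residual = nn.Sequential(nn.Linear(STATE_DIM + ACT_DIM, 64),
                                      nn.ReLU(), nn.Linear(64, ACT_DIM))

    def forward(self, s):
        a0 = self.base(s)                      # coarse action proposal
        return a0 + self.residual(torch.cat([s, a0], dim=-1))

# Fit to a logged offline batch (random placeholders for real data).
states = torch.randn(256, STATE_DIM)
actions = torch.randn(256, ACT_DIM)
policy = ResidualPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for _ in range(200):
    loss = nn.functional.mse_loss(policy(states), actions)
    opt.zero_grad(); loss.backward(); opt.step()
```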

2 citations


DOI
02 Nov 2021
TL;DR: In this paper, a probabilistic spatial demand simulator (PSD-sim) is proposed to learn important spatial patterns in offline retail, which can be used to train policy estimators that discover intelligent, spatial allocation strategies.
Abstract: Connecting consumers with relevant products is a very important problem in both online and offline commerce. In many offline retail settings, product distributors extend bids to place and manage product displays within a retail outlet. The distributor aims to choose a spatial allocation strategy that maximizes revenue given a preset budget constraint. Prior work shows that carefully selecting product locations within a store can minimize search costs and induce consumers to make "impulse" purchases. Such impulse purchases are influenced by the spatial configuration of the store. However, learning important spatial patterns in offline retail is challenging due to the scarcity of data and the high cost of exploration and experimentation in the physical world. To address these challenges, we propose a stochastic model of spatial demand in physical retail, which we call the Probabilistic Spatial Demand Simulator (PSD-sim). PSD-sim is an effective mirror of the real environment because it exploits the structure of common retail datasets through a hierarchical parameter-sharing structure, and it can incorporate spatial and economic knowledge through informative priors. We show that PSD-sim can both recover ground-truth test data better than baselines and generate new data for unseen states. The simulator can naturally be used to train policy estimators that discover intelligent spatial allocation strategies. Finally, we perform a preliminary study of different optimization techniques and find that Deep Q-Learning can learn an effective spatial allocation policy.
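
A numpy sketch of the two ingredients the abstract highlights, with invented numbers throughout: hierarchical parameter sharing (product demand rates drawn around a shared store-level rate, pooling sparse data) and an informative spatial prior (location effects on impulse purchases). PSD-sim's actual model is substantially richer than this.

```python
import numpy as np

rng = np.random.default_rng(0)
N_PRODUCTS, N_LOCATIONS = 5, 4

# Hierarchical parameter sharing (sketch): product-level log-rates are
# drawn around a shared store-level mean. All values are illustrative.
store_log_rate = rng.normal(1.0, 0.3)
product_log_rate = rng.normal(store_log_rate, 0.2, N_PRODUCTS)

# Informative spatial prior: some display locations boost impulse buys.
location_effect = np.array([0.4, 0.1, -0.1, -0.3])

def simulate_sales(allocation):
    """allocation[i] = display location assigned to product i.
    Demand is Poisson with a rate depending on product and placement."""
    rate = np.exp(product_log_rate + location_effect[allocation])
    return rng.poisson(rate)

alloc = rng.integers(N_LOCATIONS, size=N_PRODUCTS)
print("simulated unit sales:", simulate_sales(alloc))
```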

1 citation


Posted Content
TL;DR: In this article, the authors propose a novel framework ImInGAIL to address the problem of learning to simulate the driving behavior from sparse real-world data, which incorporates data interpolation with the behavior learning process of imitation learning.
Abstract: Simulation of real-world traffic can help validate transportation policies. A good simulator produces traffic similar to real-world traffic, which often requires dense traffic trajectories (i.e., with a high sampling rate) to cover dynamic situations in the real world. However, in most cases, real-world trajectories are sparse, which makes simulation challenging. In this paper, we present a novel framework, ImInGAIL, to address the problem of learning to simulate driving behavior from sparse real-world data. The proposed architecture incorporates data interpolation into the behavior learning process of imitation learning. To the best of our knowledge, we are the first to tackle the data sparsity issue for behavior learning problems. We evaluate our framework on both synthetic and real-world trajectory datasets of driving vehicles, showing that our method outperforms various baselines and state-of-the-art methods.
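
The sketch below isolates only the interpolation half of the problem, assuming simple linear interpolation of sparse position samples onto a regular time grid; ImInGAIL instead couples interpolation with the adversarial imitation objective, which is omitted here, and all sample values are invented.

```python
import numpy as np

def densify(timestamps, positions, hz=1.0):
    """Linearly interpolate a sparse (t, x) trajectory onto a regular grid.
    A fixed-preprocessing stand-in for ImInGAIL's interpolation, which is
    learned jointly with the imitation objective."""
    t_dense = np.arange(timestamps[0], timestamps[-1], 1.0 / hz)
    x_dense = np.interp(t_dense, timestamps, positions)
    return t_dense, x_dense

# Sparse GPS-like samples: position along a road, seconds apart.
t = np.array([0.0, 7.0, 15.0, 30.0])
x = np.array([0.0, 90.0, 210.0, 400.0])
t_d, x_d = densify(t, x)
speeds = np.gradient(x_d, t_d)   # derived features (e.g., speed) for imitation
```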

1 citation


Proceedings Article
18 May 2021
TL;DR: In this article, the authors propose Neural Utility Functions, which directly optimize the gradients of a neural network so that they are more consistent with utility theory, a mathematical framework for modeling choice among items.
Abstract: Current neural network architectures have no mechanism for explicitly reasoning about item trade-offs. Such trade-offs are important for popular tasks such as recommendation. The main idea of this work is to give neural networks inductive biases inspired by economic theories. To this end, we propose Neural Utility Functions, which directly optimize the gradients of a neural network so that they are more consistent with utility theory, a mathematical framework for modeling choice among items. We demonstrate that Neural Utility Functions can recover theoretical item relationships better than vanilla neural networks, show analytically that existing neural networks are not quasi-concave and do not inherently reason about trade-offs, and show that augmenting existing models with a utility loss function improves recommendation results. The Neural Utility Functions we propose are theoretically motivated and yield strong empirical results.
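
One way to read "optimize the gradients of a neural network so that they are more consistent with utility theory" is as a gradient penalty computed with autograd. The sketch below enforces a single utility-theoretic property, non-negative marginal utility (monotonicity), as an illustrative stand-in; the paper's actual losses and properties differ.

```python
import torch
import torch.nn as nn

N_ITEMS = 6

utility = nn.Sequential(nn.Linear(N_ITEMS, 32), nn.Tanh(), nn.Linear(32, 1))

def utility_gradient_penalty(bundles):
    """Penalize negative marginal utility dU/dx_i, one utility-theoretic
    property ("more is weakly better"). A sketch of regularizing network
    gradients toward utility theory, not the paper's exact loss."""
    bundles = bundles.requires_grad_(True)
    u = utility(bundles).sum()
    (grads,) = torch.autograd.grad(u, bundles, create_graph=True)
    return torch.relu(-grads).mean()   # zero when all marginal utilities >= 0

bundles = torch.rand(128, N_ITEMS)     # placeholder item bundles
opt = torch.optim.Adam(utility.parameters(), lr=1e-3)
for _ in range(100):
    loss = utility_gradient_penalty(bundles.clone())
    opt.zero_grad(); loss.backward(); opt.step()
```

In practice such a penalty would be added to a task loss (e.g., a recommendation objective) rather than optimized alone.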

Posted Content
TL;DR: In this paper, the authors propose a theory-guided residual network model, where the theoretical part emphasizes the general principles of human routing decisions and the residual part captures driving-condition preferences (e.g., local road or highway).
Abstract: Heavy traffic and related issues have always been concerns for modern cities. With the help of deep learning and reinforcement learning, people have proposed various policies to address these traffic-related problems, such as smart traffic signal control systems and taxi dispatching systems. These policies are usually validated in a city simulator, since directly applying them in a real city incurs real cost. However, policies validated in a city simulator may fail in the real city if the simulator differs significantly from the real world. To tackle this problem, we need to build a realistic traffic simulation system. Therefore, in this paper, we propose to learn the human routing model, which is one of the most essential parts of a traffic simulator. This problem has two major challenges. First, human routing decisions are determined by multiple factors beyond the common factors of time and distance. Second, historical route data usually cover just a small portion of vehicles, due to privacy and device availability issues. To address these problems, we propose a theory-guided residual network model, where the theoretical part emphasizes the general principles of human routing decisions (e.g., fastest route), and the residual part captures driving-condition preferences (e.g., local road or highway). Since the theoretical part is composed of traditional shortest-path algorithms that do not need data to train, our residual network can learn human routing models from limited data. We have conducted extensive experiments on multiple real-world datasets to show the superior performance of our model, especially with small data. We have also used case studies to illustrate why our model is better at recovering real routes.
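
A sketch of the theory-plus-residual decomposition on a toy network: the theoretical part is an ordinary shortest-time path (no training data needed), and the residual part is collapsed to a single hand-set highway-preference weight, whereas the paper learns it with a network. Uses networkx; all edge values are invented.

```python
import networkx as nx

# Toy road network: edges carry travel time and a road-type feature.
G = nx.DiGraph()
edges = [("A", "B", 5.0, 1), ("B", "D", 5.0, 1),   # highway-ish route
         ("A", "C", 6.0, 0), ("C", "D", 6.0, 0)]   # local-road route
for u, v, t, highway in edges:
    G.add_edge(u, v, time=t, highway=highway)

# Theoretical part: the fastest path requires no training data.
theory_route = nx.shortest_path(G, "A", "D", weight="time")

def residual_score(route, w_highway=0.8):
    """Residual preference term (hand-set placeholder weight) capturing
    factors beyond travel time, e.g., favoring highways."""
    return w_highway * sum(G[u][v]["highway"]
                           for u, v in zip(route, route[1:]))

def route_cost(route):
    time = sum(G[u][v]["time"] for u, v in zip(route, route[1:]))
    return time - residual_score(route)   # lower combined cost is better

candidates = list(nx.all_simple_paths(G, "A", "D"))
best = min(candidates, key=route_cost)
print("theory-only route:", theory_route, "| theory+residual route:", best)
```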



Posted Content
TL;DR: In this article, the authors formulate traffic simulation as an inverse reinforcement learning problem, and propose a parameter sharing adversarial inverse RL model for dynamics-robust simulation learning, which is able to imitate a vehicle's trajectories in the real world while simultaneously recovering the reward function that reveals the vehicle's true objective which is invariant to different dynamics.
Abstract: Traffic simulators are an essential component in the operation and planning of transportation systems. Conventional traffic simulators usually employ a calibrated physical car-following model to describe vehicles' behaviors and their interactions with the traffic environment. However, there is no universal physical model that can accurately predict the pattern of vehicles' behaviors in different situations. A fixed physical model tends to be less effective in a complicated environment given the non-stationary nature of traffic dynamics. In this paper, we formulate traffic simulation as an inverse reinforcement learning problem, and propose a parameter-sharing adversarial inverse reinforcement learning model for dynamics-robust simulation learning. Our proposed model is able to imitate a vehicle's trajectories in the real world while simultaneously recovering the reward function that reveals the vehicle's true objective, which is invariant to different dynamics. Extensive experiments on synthetic and real-world datasets show the superior performance of our approach compared to state-of-the-art methods and its robustness to varying traffic dynamics.
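
A sketch of the discriminator at the core of adversarial IRL, with a single reward network shared across all vehicles to reflect the parameter-sharing idea. The standard AIRL discriminator D = exp(f(s,a)) / (exp(f(s,a)) + pi(a|s)) simplifies to a sigmoid in log space; the policy update and full training loop are omitted, and all dimensions are placeholders.

```python
import torch
import torch.nn as nn

STATE_DIM, ACT_DIM = 6, 2

# One reward net shared by all vehicles: the "parameter sharing" idea,
# so every agent's discriminator reuses the same f(s, a).
reward_f = nn.Sequential(nn.Linear(STATE_DIM + ACT_DIM, 64), nn.ReLU(),
                         nn.Linear(64, 1))

def airl_discriminator(s, a, log_pi):
    """AIRL-style discriminator D = exp(f) / (exp(f) + pi(a|s)),
    computed in log space for numerical stability. log_pi comes from
    the current policy, whose optimization is omitted in this sketch."""
    f = reward_f(torch.cat([s, a], dim=-1)).squeeze(-1)
    return torch.sigmoid(f - log_pi)   # equals exp(f) / (exp(f) + pi)

s = torch.randn(32, STATE_DIM)         # placeholder states
a = torch.randn(32, ACT_DIM)           # placeholder actions
log_pi = torch.randn(32)               # placeholder policy log-probs
d = airl_discriminator(s, a, log_pi)   # expert-vs-policy probabilities
```

Because f(s, a) survives training as an explicit reward estimate, this formulation is what lets the recovered objective stay invariant when the underlying dynamics change.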