
Showing papers in "Proceedings of the ... AAAI Conference on Artificial Intelligence in 2023"


Journal ArticleDOI
TL;DR: In this paper, the authors evaluate the use of GPT-3, a 175B parameter transformer model recently released by OpenAI, for three related challenges pertaining to math word problems corresponding to systems of two linear equations.
Abstract: Researchers have been interested in developing AI tools to help students learn various mathematical subjects. One challenging set of tasks for school students is learning to solve math word problems. We explore how recent advances in natural language processing, specifically the rise of powerful transformer based models, can be applied to help math learners with such problems. Concretely, we evaluate the use of GPT-3, a 175B parameter transformer model recently released by OpenAI, for three related challenges pertaining to math word problems corresponding to systems of two linear equations. The three challenges are classifying word problems, extracting equations from word problems, and generating word problems. For the first challenge, we define a set of problem classes and find that GPT-3 has generally very high accuracy in classifying word problems (80%-100%), for all but one of these classes. For the second challenge, we find the accuracy for extracting equations improves with the number of examples provided to the model, ranging from an accuracy of 31% for zero-shot learning to about 69% using 3-shot learning, which is further improved to a high value of 80% with fine-tuning. For the third challenge, we find that GPT-3 is able to generate problems with accuracy ranging from 33% to 93%, depending on the problem type.

9 citations
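
The few-shot equation-extraction setting described above can be illustrated with a small prompt-building sketch. The example problems, the separator format, and the build_prompt helper are hypothetical illustrations rather than the authors' actual prompts; a prompt of this form would then be sent to a GPT-3 completion endpoint.

```python
# Hypothetical sketch of building a k-shot prompt for extracting a system of
# two linear equations from a math word problem (examples are illustrative only).

EXAMPLES = [
    ("The sum of two numbers is 10 and their difference is 2. Find the numbers.",
     "x + y = 10; x - y = 2"),
    ("Three apples and two bananas cost 8 dollars; one apple and one banana cost 3 dollars.",
     "3x + 2y = 8; x + y = 3"),
    ("A rectangle's perimeter is 24 and its length is twice its width.",
     "2x + 2y = 24; x = 2y"),
]

def build_prompt(problem: str, k: int = 3) -> str:
    """Assemble a k-shot prompt: k worked (problem, equations) pairs, then the query."""
    shots = []
    for text, eqs in EXAMPLES[:k]:
        shots.append(f"Problem: {text}\nEquations: {eqs}")
    shots.append(f"Problem: {problem}\nEquations:")
    return "\n\n".join(shots)

if __name__ == "__main__":
    print(build_prompt("Two numbers add to 30 and one is 4 more than the other."))
```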


Journal ArticleDOI
TL;DR: Zhang et al. decompose an input into several fine-grained elements and construct a graph structure per sample to measure and utilize element-granularity relations within each sample.
Abstract: Zero-shot learning (ZSL) is an extreme case of transfer learning that aims to recognize samples (e.g., images) of unseen classes relying on a train-set covering only seen classes and a set of auxiliary knowledge (e.g., semantic descriptors). Existing methods usually resort to constructing a visual-to-semantics mapping based on features extracted from each whole sample. However, since the visual and semantic spaces are inherently independent and may exist in different manifolds, these methods may easily suffer from the domain bias problem due to the knowledge transfer from seen to unseen classes. Unlike existing works, this paper investigates the fine-grained ZSL from a novel perspective of sample-level graph. Specifically, we decompose an input into several fine-grained elements and construct a graph structure per sample to measure and utilize element-granularity relations within each sample. Taking advantage of recently developed graph neural networks (GNNs), we formulate the ZSL problem to a graph-to-semantics mapping task, which can better exploit element-semantics correlation and local sub-structural information in samples. Experimental results on the widely used benchmark datasets demonstrate that the proposed method can mitigate the domain bias problem and achieve competitive performance against other representative methods.

9 citations


Journal ArticleDOI
TL;DR: Wang et al. propose Flora, a real-time monocular 3D video reconstruction approach for reconstructing delicate and complete 3D scenes from RGB video sequences in an end-to-end manner.
Abstract: In this work, we propose a real-time monocular 3D video reconstruction approach named Flora for reconstructing delicate and complete 3D scenes from RGB video sequences in an end-to-end manner. Specifically, we introduce a novel method with two main contributions. Firstly, the proposed feature aggregation module retains both color and reliability in a dual-frequency form. Secondly, the loss compensation module solves missing structure by correcting losses for falsely pruned voxels. The dual-frequency feature aggregation module enhances reconstruction quality in both precision and recall, and the loss compensation module benefits the recall. Notably, both proposed contributions achieve great results with negligible inferencing overhead. Our state-of-the-art experimental results on real-world datasets demonstrate Flora's leading performance in both effectiveness and efficiency. The code is available at https://github.com/NoOneUST/Flora.

8 citations


Journal ArticleDOI
TL;DR: In this paper, the authors design an online learning method that adaptively optimizes the stability-plasticity tradeoff (i.e., keeping CIL models stable enough to retain old knowledge yet plastic enough to absorb new knowledge) without knowing the data-receiving setting a priori.
Abstract: Class-incremental learning (CIL) aims to train a classification model while the number of classes increases phase-by-phase. An inherent challenge of CIL is the stability-plasticity tradeoff, i.e., CIL models should keep stable to retain old knowledge and keep plastic to absorb new knowledge. However, none of the existing CIL models can achieve the optimal tradeoff in different data-receiving settings—where typically the training-from-half (TFH) setting needs more stability, but the training-from-scratch (TFS) needs more plasticity. To this end, we design an online learning method that can adaptively optimize the tradeoff without knowing the setting a priori. Specifically, we first introduce the key hyperparameters that influence the tradeoff, e.g., knowledge distillation (KD) loss weights, learning rates, and classifier types. Then, we formulate the hyperparameter optimization process as an online Markov Decision Process (MDP) problem and propose a specific algorithm to solve it. We apply local estimated rewards and a classic bandit algorithm, Exp3, to address the issues when applying online MDP methods to the CIL protocol. Our method consistently improves top-performing CIL methods in both TFH and TFS settings, e.g., boosting the average accuracy of TFH and TFS by 2.2 percentage points on ImageNet-Full, compared to the state-of-the-art. Code is provided at https://class-il.mpi-inf.mpg.de/online/

7 citations
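
Exp3, the classic adversarial-bandit algorithm the abstract relies on for online hyperparameter selection, can be sketched as follows. The arm set and reward scaling are illustrative placeholders, not the authors' CIL-specific setup.

```python
import math
import random

class Exp3:
    """Minimal Exp3 sketch: exponential weights with gamma-uniform exploration."""

    def __init__(self, n_arms: int, gamma: float = 0.1):
        self.n = n_arms
        self.gamma = gamma
        self.weights = [1.0] * n_arms

    def probabilities(self):
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n for w in self.weights]

    def select(self) -> int:
        return random.choices(range(self.n), weights=self.probabilities())[0]

    def update(self, arm: int, reward: float):
        # reward assumed in [0, 1]; importance-weighted estimate for the pulled arm
        p = self.probabilities()[arm]
        x_hat = reward / p
        self.weights[arm] *= math.exp(self.gamma * x_hat / self.n)

# Usage sketch: each arm could index one hyperparameter configuration
# (e.g., a KD-loss weight / learning-rate / classifier-type combination).
bandit = Exp3(n_arms=4)
arm = bandit.select()
bandit.update(arm, reward=0.7)  # reward = a locally estimated accuracy gain, rescaled to [0, 1]
```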


Journal ArticleDOI
TL;DR: In this paper, a transfer learning method is proposed to learn knowledge-enhanced features for emotion recognition in multi-party conversations by explicitly modeling the discourse relations between utterances and incorporating symbolic knowledge.
Abstract: Emotion recognition in conversation (ERC) has received increasing attention from the research community. However, the ERC task is challenging, largely due to the complex and unstructured properties of multi-party conversations. Besides, the majority of daily dialogues take place in a specific context or circumstance, which requires rich external knowledge to understand the background of a certain dialogue. In this paper, we address these challenges by explicitly modeling the discourse relations between utterances and incorporating symbolic knowledge into multi-party conversations. We first introduce a dialogue parsing algorithm into ERC and further improve the algorithm through a transfer learning method. Moreover, we leverage different symbolic knowledge graph relations to learn knowledge-enhanced features for the ERC task. Extensive experiments on three benchmarks demonstrate that both dialogue structure graphs and symbolic knowledge are beneficial to the model performance on the task. Additionally, experimental results indicate that the proposed model surpasses baseline models on several indices.

6 citations


Journal ArticleDOI
TL;DR: Substructure Aware Graph Neural Networks (SAGNN) introduce a Cut subgraph, which can be obtained from the original graph by continuously and selectively removing edges, and extend the random-walk encoding paradigm to the return probability of the rooted node on the subgraph to capture structural information and use it as a node feature.
Abstract: Despite the great achievements of Graph Neural Networks (GNNs) in graph learning, conventional GNNs struggle to break through the upper limit of the expressiveness of the first-order Weisfeiler-Leman graph isomorphism test algorithm (1-WL) due to the consistency of the propagation paradigm of GNNs with the 1-WL. Based on the fact that it is easier to distinguish the original graph through subgraphs, we propose a novel neural network framework called Substructure Aware Graph Neural Networks (SAGNN) to address these issues. We first propose a Cut subgraph which can be obtained from the original graph by continuously and selectively removing edges. Then we extend the random walk encoding paradigm to the return probability of the rooted node on the subgraph to capture the structural information and use it as a node feature to improve the expressiveness of GNNs. We theoretically prove that our framework is more powerful than 1-WL, and is superior in structure perception. Our extensive experiments demonstrate the effectiveness of our framework, achieving state-of-the-art performance on a variety of well-proven graph tasks, and GNNs equipped with our framework perform flawlessly even on graphs where 3-WL fails. Specifically, our framework achieves a maximum performance improvement of 83% compared to the base models and 32% compared to the previous state-of-the-art methods.

5 citations
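
The return-probability encoding referred to above can be sketched with a plain random-walk computation; the toy graph and the number of walk steps are illustrative, and the Cut-subgraph construction itself is omitted.

```python
import numpy as np

def return_probabilities(adj: np.ndarray, root: int, steps: int = 4) -> np.ndarray:
    """Probability that a random walk started at `root` is back at `root` after t = 1..steps."""
    deg = adj.sum(axis=1, keepdims=True)
    P = adj / np.clip(deg, 1e-12, None)        # row-stochastic transition matrix
    probs, Pt = [], np.eye(adj.shape[0])
    for _ in range(steps):
        Pt = Pt @ P
        probs.append(Pt[root, root])
    return np.array(probs)                      # usable as a structural node feature

# Toy example: a 4-cycle; the root returns with probability 0 at odd steps.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
print(return_probabilities(A, root=0))
```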


Journal ArticleDOI
TL;DR: The authors take a critical focus on fair-AI and survey issues, simplifications, and mistakes that researchers and practitioners often underestimate, which in turn can undermine trust in fair-AI and limit its contribution to society.
Abstract: There is a fast-growing literature addressing the fairness of AI models (fair-AI), with a continuous stream of new conceptual frameworks, methods, and tools. How much can we trust them? How much do they actually impact society? We take a critical focus on fair-AI and survey issues, simplifications, and mistakes that researchers and practitioners often underestimate, which in turn can undermine trust in fair-AI and limit its contribution to society. In particular, we discuss the hyper-focus on fairness metrics and on optimizing their average performance. We instantiate this observation by discussing Yule's effect of fair-AI tools: being fair on average does not imply being fair in contexts that matter. We conclude that the use of fair-AI methods should be complemented with the design, development, and verification practices that are commonly summarized under the umbrella of trustworthy AI.

5 citations
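
The Yule's-effect point, that being fair on average does not imply fairness in the contexts that matter, can be made concrete with a small worked example; the numbers below are invented purely for illustration.

```python
# Illustrative numbers only: a classifier can look fair in aggregate (equal positive rates
# for groups A and B) while being far from parity inside each context.
contexts = {
    # context: (positives_A, total_A, positives_B, total_B)
    "context_1": (50, 100, 18, 20),   # within-context rates: 0.50 vs 0.90
    "context_2": (10, 20, 42, 100),   # within-context rates: 0.50 vs 0.42
}

rate_A = sum(v[0] for v in contexts.values()) / sum(v[1] for v in contexts.values())
rate_B = sum(v[2] for v in contexts.values()) / sum(v[3] for v in contexts.values())
print(f"aggregate positive rate: A={rate_A:.2f}, B={rate_B:.2f}")  # 0.50 vs 0.50: "fair on average"

for name, (pa, na, pb, nb) in contexts.items():
    print(f"{name}: A={pa/na:.2f}, B={pb/nb:.2f}")                 # clearly unequal in each context
```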


Journal ArticleDOI
TL;DR: In this paper, the co-evolution of punishment and cooperation in one-shot public goods games is studied, and conditions for their co-presence, co-dominance, and co-extinction are derived.
Abstract: Altruistic punishment (or punishment) has been extensively shown as an important mechanism for promoting cooperation in human societies. In AI, the emergence of punishment has received much recent interest. In this paper, we contribute with a novel evolutionary game theoretic model to study the impacts of environmental feedback. Whereas a population of agents plays public goods games, there exists a third-party population whose payoffs depend not only on whether to punish or not, but also on the state of the environment (e.g., how cooperative the agents in a social dilemma are). Focusing on one-shot public goods games, we show that environmental feedback, by itself, can lead to the emergence of punishment. We analyze the co-evolution of punishment and cooperation, and derive conditions for their co-presence, co-dominance and co-extinction. Moreover, we show that the system can exhibit bistability as well as cyclic dynamics. Our findings provide a new explanation for the emergence of punishment. On the other hand, our results also alert the need for careful design of implementing punishment in multi-agent systems, as the resulting evolutionary dynamics can be somewhat complex.

4 citations
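
Evolutionary game-theoretic models of this kind are typically analyzed with replicator dynamics; a generic discrete-time replicator step is sketched below, with a placeholder payoff matrix rather than the paper's actual public-goods and punishment payoff structure.

```python
import numpy as np

def replicator_step(x: np.ndarray, payoff: np.ndarray, dt: float = 0.01) -> np.ndarray:
    """One Euler step of replicator dynamics: x_i grows in proportion to (f_i - mean fitness).

    x: strategy frequencies (sums to 1); payoff: matrix such that fitness f = payoff @ x.
    """
    f = payoff @ x
    x_new = x + dt * x * (f - x @ f)
    return x_new / x_new.sum()

# Toy 2-strategy example (e.g., "punish" vs "not punish"); the payoff entries are placeholders.
A = np.array([[1.0, 0.2],
              [1.1, 0.3]])
x = np.array([0.5, 0.5])
for _ in range(1000):
    x = replicator_step(x, A)
print(x)  # long-run strategy frequencies under this toy payoff matrix
```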


Journal ArticleDOI
TL;DR: Zhou et al. develop a transformer module that models the temporal correlations among the input features within pedestrian video sequences and a deep evidential learning model to capture the AI uncertainty under scene complexities.
Abstract: With rapid development in hardware (sensors and processors) and AI algorithms, automated driving techniques have entered the public’s daily life and achieved great success in supporting human driving performance. However, due to the high contextual variations and temporal dynamics in pedestrian behaviors, the interaction between autonomous-driving cars and pedestrians remains challenging, impeding the development of fully autonomous driving systems. This paper focuses on predicting pedestrian intention with a novel transformer-based evidential prediction (TrEP) algorithm. We develop a transformer module towards the temporal correlations among the input features within pedestrian video sequences and a deep evidential learning model to capture the AI uncertainty under scene complexities. Experimental results on three popular pedestrian intent benchmarks have verified the effectiveness of our proposed model over the state-of-the-art. The algorithm performance can be further boosted by controlling the uncertainty level. We systematically compare human disagreements with AI uncertainty to further evaluate AI performance in confusing scenes. The code is released at https://github.com/zzmonlyyou/TrEP.git.

4 citations
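
Deep evidential classification of the kind the abstract mentions typically places a Dirichlet distribution over class probabilities; the sketch below follows the standard subjective-logic formulation (evidence, belief, uncertainty mass) and is not necessarily TrEP's exact output head.

```python
import numpy as np

def evidential_outputs(evidence: np.ndarray):
    """evidence: non-negative per-class evidence (e.g., softplus of logits)."""
    K = evidence.shape[-1]
    alpha = evidence + 1.0                 # Dirichlet concentration parameters
    S = alpha.sum(axis=-1, keepdims=True)  # Dirichlet strength
    belief = evidence / S                  # per-class belief mass
    uncertainty = K / S                    # total uncertainty mass in (0, 1]
    prob = alpha / S                       # expected class probabilities
    return belief, uncertainty.squeeze(-1), prob

# Example for a binary crossing / not-crossing decision (illustrative logits).
logits = np.array([[2.0, 0.1]])
evidence = np.log1p(np.exp(logits))        # softplus
b, u, p = evidential_outputs(evidence)
print(p, u)                                # low u = confident scene, high u = confusing scene
```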


Journal ArticleDOI
TL;DR: In this article, a fusion network is proposed to predict the 3D offsets between radar returns and object centers, enabling radar depths to enhance the accuracy of monocular 3D detection; it shows significant improvements in mean average precision and translation error on the nuScenes dataset over monocular counterparts.
Abstract: As a direct depth sensor, radar holds promise as a tool to improve monocular 3D object detection, which suffers from depth errors, due in part to the depth-scale ambiguity. On the other hand, leveraging radar depths is hampered by difficulties in precisely associating radar returns with 3D estimates from monocular methods, effectively erasing its benefits. This paper proposes a fusion network that addresses this radar-camera association challenge. We train our network to predict the 3D offsets between radar returns and object centers, enabling radar depths to enhance the accuracy of 3D monocular detection. By using parallel radar and camera backbones, our network fuses information at both the feature level and detection level, while at the same time leveraging a state-of-the-art monocular detection technique without retraining it. Experimental results show significant improvement in mean average precision and translation error on the nuScenes dataset over monocular counterparts. Our source code is available at https://github.com/longyunf/radiant.

3 citations


Journal ArticleDOI
TL;DR: Wang et al. propose a Hierarchical ConViT with an Attention-based Relational Reasoner (HCV-ARR) to tackle the challenges of visual perception and logical reasoning on RPMs.
Abstract: Raven’s Progressive Matrices (RPMs) have been widely used to evaluate the visual reasoning ability of humans. To tackle the challenges of visual perception and logic reasoning on RPMs, we propose a Hierarchical ConViT with Attention-based Relational Reasoner (HCV-ARR). Traditional solution methods often apply relatively shallow convolution networks to visually perceive shape patterns in RPM images, which may not fully model the long-range dependencies of complex pattern combinations in RPMs. The proposed ConViT consists of a convolutional block to capture the low-level attributes of visual patterns, and a transformer block to capture the high-level image semantics such as pattern formations. Furthermore, the proposed hierarchical ConViT captures visual features from multiple receptive fields, where the shallow layers focus on the image fine details while the deeper layers focus on the image semantics. To better model the underlying reasoning rules embedded in RPM images, an Attention-based Relational Reasoner (ARR) is proposed to establish the underlying relations among images. The proposed ARR well exploits the hidden relations among question images through the developed element-wise attentive reasoner. Experimental results on three RPM datasets demonstrate that the proposed HCV-ARR achieves a significant performance gain compared with the state-of-the-art models. The source code is available at: https://github.com/wentaoheunnc/HCV-ARR.

Journal ArticleDOI
TL;DR: In this paper, the authors discuss the benefits and challenges of shared tasks as a teaching method and derive a domain-neutral process model that captures the tutorial structure used in information retrieval courses at two universities.
Abstract: In this paper, we discuss the benefits and challenges of shared tasks as a teaching method. A shared task is a scientific event and a friendly competition to solve a research problem, the task. In terms of linking research and teaching, shared-task-based tutorials fulfill several faculty desires: they leverage students' interdisciplinary and heterogeneous skills, foster teamwork, and engage them in creative work that has the potential to produce original research contributions. Based on ten information retrieval (IR) courses at two universities since 2019 with shared tasks as tutorials, we derive a domain-neutral process model to capture the respective tutorial structure. Meanwhile, our teaching method has been adopted by other universities in IR courses, but also in other areas of AI such as natural language processing and robotics.

Journal ArticleDOI
TL;DR: Wang et al. propose a Mask Graph Convolutional Network (Mask-GCN) to learn action-specific skeleton joints that mainly convey action information while masking action-agnostic skeleton joints.
Abstract: Most skeleton-based action recognition methods assume that the same type of action samples in the training set and the test set share similar motion patterns. However, action samples in real scenarios usually contain novel motion patterns which are not involved in the training set. As it is laborious to collect sufficient training samples to enumerate various types of novel motion patterns, this paper presents a practical skeleton-based action recognition task where the training set contains common motion patterns of action samples and the test set contains action samples that suffer from novel motion patterns. For this task, we present a Mask Graph Convolutional Network (Mask-GCN) to focus on learning action-specific skeleton joints that mainly convey action information meanwhile masking action-agnostic skeleton joints that convey rare action information and suffer more from novel motion patterns. Specifically, we design a policy network to learn layer-wise body masks to construct masked adjacency matrices, which guide a GCN-based backbone to learn stable yet informative action features from dynamic graph structure. Extensive experiments on our newly collected dataset verify that Mask-GCN outperforms most GCN-based methods when testing with various novel motion patterns.
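
A minimal sketch of the masked-adjacency idea, applying a learned per-joint mask to the skeleton graph before a graph-convolution step, is shown below; the mask values, joint count, and normalization are illustrative assumptions rather than Mask-GCN's actual policy network.

```python
import numpy as np

def masked_gcn_layer(X, A, joint_mask, W):
    """One graph-convolution step on a masked skeleton graph.

    X: (N, F) joint features; A: (N, N) skeleton adjacency; joint_mask: (N,) values in [0, 1]
    down-weighting action-agnostic joints; W: (F, F') layer weights.
    """
    M = np.outer(joint_mask, joint_mask)          # edge mask induced by the joint mask
    A_masked = A * M + np.eye(A.shape[0])         # mask edges, keep self-loops
    deg = A_masked.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    A_hat = D_inv_sqrt @ A_masked @ D_inv_sqrt    # symmetric normalization
    return np.maximum(A_hat @ X @ W, 0.0)         # ReLU(A_hat X W)

# Toy 3-joint chain with the last joint down-weighted as action-agnostic (illustrative).
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = np.random.randn(3, 4)
W = np.random.randn(4, 8)
out = masked_gcn_layer(X, A, joint_mask=np.array([1.0, 1.0, 0.1]), W=W)
print(out.shape)  # (3, 8)
```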

Journal ArticleDOI
TL;DR: In this article, the authors study individual unfairness in the presence of censorship on three benchmark tasks and provide the first known results on individual fairness guarantees in the analysis of censored data, where the availability of class labels is not always guaranteed.
Abstract: There has been increasing concern within the machine learning community and beyond that Artificial Intelligence (AI) faces a bias and discrimination crisis which needs AI fairness with urgency. As many have begun to work on this problem, most existing work depends on the availability of class label for the given fairness definition and algorithm which may not align with real-world usage. In this work, we study an AI fairness problem that stems from the gap between the design of a "fair" model in the lab and its deployment in the real-world. Specifically, we consider defining and mitigating individual unfairness amidst censorship, where the availability of class label is not always guaranteed due to censorship, which is broadly applicable in a diversity of real-world socially sensitive applications. We show that our method is able to quantify and mitigate individual unfairness in the presence of censorship across three benchmark tasks, which provides the first known results on individual fairness guarantee in analysis of censored data.

Journal ArticleDOI
TL;DR: This paper studies compositional generalization of a reinforcement learning agent following natural language instructions in an embodied environment, developing a set of tasks in a photo-realistic simulated kitchen environment that allow the degree to which a behavioral policy captures the systematicity of language to be studied via its zero-shot generalization performance on held-out natural language instructions.
Abstract: The ability to combine learned knowledge and skills to solve novel tasks is a key aspect of generalization in humans that allows us to understand and perform tasks described by novel language utterances. While progress has been made in supervised learning settings, no work has yet studied compositional generalization of a reinforcement learning agent following natural language instructions in an embodied environment. We develop a set of tasks in a photo-realistic simulated kitchen environment that allow us to study the degree to which a behavioral policy captures the systematicity in language by studying its zero-shot generalization performance on held out natural language instructions. We show that our agent which leverages a novel additive action-value decomposition in tandem with attention based subgoal prediction is able to exploit composition in text instructions to generalize to unseen tasks.
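
The additive action-value decomposition mentioned above can be written compactly; the form below is a generic illustration over predicted subgoals, not necessarily the paper's exact parameterization.

```latex
% Generic additive decomposition over K predicted subgoals g_1, ..., g_K (illustrative form):
% the agent acts greedily with respect to the sum of per-subgoal action values.
Q\bigl(s, a \mid g_{1:K}\bigr) \;=\; \sum_{k=1}^{K} Q_{\theta}\bigl(s, a, g_k\bigr),
\qquad
\pi(s) \;=\; \arg\max_{a} \; Q\bigl(s, a \mid g_{1:K}\bigr)
```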

Journal ArticleDOI
TL;DR: Wang et al. formulate puzzle reassembly as a combinatorial optimization problem and propose Siamese-Discriminant Deep Reinforcement Learning (SD2RL) to solve it.
Abstract: Jigsaw puzzle solving has recently become an emerging research area. The developed techniques have been widely used in applications beyond puzzle solving. This paper focuses on solving Jigsaw Puzzles with Large Eroded Gaps (JPwLEG). We formulate the puzzle reassembly as a combinatorial optimization problem and propose a Siamese-Discriminant Deep Reinforcement Learning (SD2RL) to solve it. A Deep Q-network (DQN) is designed to visually understand the puzzles, which consists of two sets of Siamese Discriminant Networks, one set to perceive the pairwise relations between vertical neighbors and another set for horizontal neighbors. The proposed DQN considers not only the evidence from the incumbent fragment but also the support from its four neighbors. The DQN is trained using replay experience with carefully designed rewards to guide the search for a sequence of fragment swaps to reach the correct puzzle solution. Two JPwLEG datasets are constructed to evaluate the proposed method, and the experimental results show that the proposed SD2RL significantly outperforms state-of-the-art methods.

Journal ArticleDOI
TL;DR: In this article, a novel algorithm for the feature relevancy problem is proposed, which is applicable to any ML classifier that meets minor requirements and is shown to be efficient in practice.
Abstract: Trustable explanations of machine learning (ML) models are vital in high-risk uses of artificial intelligence (AI). Apart from the computation of trustable explanations, a number of explainability queries have been identified and studied in recent work. Some of these queries involve solving quantification problems, either in propositional or in more expressive logics. This paper investigates one of these quantification problems, namely the feature relevancy problem (FRP), i.e., to decide whether a (possibly sensitive) feature can occur in some explanation of a prediction. In contrast with earlier work, which studied FRP for specific classifiers, this paper proposes a novel algorithm for the FRP quantification problem which is applicable to any ML classifier that meets minor requirements. Furthermore, the paper shows that the novel algorithm is efficient in practice. The experimental results, obtained using random forests (RFs) induced from well-known publicly available datasets, demonstrate that the proposed solution outperforms existing state-of-the-art solvers for Quantified Boolean Formulas (QBF) by orders of magnitude. Finally, the paper also identifies a novel family of formulas that are challenging for current state-of-the-art QBF solvers.

Journal ArticleDOI
TL;DR: NL2LTL leverages the latest in natural language understanding (NLU) and large language models (LLMs) to translate natural language instructions into linear temporal logic (LTL) formulas.
Abstract: This is a demonstration of our newly released Python package NL2LTL which leverages the latest in natural language understanding (NLU) and large language models (LLMs) to translate natural language instructions to linear temporal logic (LTL) formulas. This allows direct translation to formal languages that a reasoning system can use, while at the same time, allowing the end-user to provide inputs in natural language without having to understand any details of an underlying formal language. The package comes with support for a set of default LTL patterns, corresponding to popular DECLARE templates, but is also fully extensible to new formulas and user inputs. The package is open-source and is free to use for the AI community under the MIT license. Open Source: https://github.com/IBM/nl2ltl. Video Link: https://bit.ly/3dHW5b1
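
For readers unfamiliar with DECLARE templates, two common patterns and their usual LTL readings are shown below (G = always, F = eventually); the concrete pattern set shipped with NL2LTL may differ.

```latex
% Two standard DECLARE templates and their usual LTL semantics:
\mathrm{Existence}(a) \;\equiv\; \mathbf{F}\, a
\qquad\qquad
\mathrm{Response}(a, b) \;\equiv\; \mathbf{G}\bigl(a \rightarrow \mathbf{F}\, b\bigr)
```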

Journal ArticleDOI
TL;DR: CogitoErgoSumm infuses graphs from two complementary semantic parsing techniques with different goals and granularities, and designs a reward signal to maximize information content preservation through reinforcement learning.
Abstract: The automatic synthesis of biomedical publications catalyzes a profound research interest elicited by literature congestion. Current sequence-to-sequence models mainly rely on the lexical surface and seldom consider the deep semantic interconnections between the entities mentioned in the source document. Such superficiality translates into fabricated, poorly informative, redundant, and near-extractive summaries that severely restrict their real-world application in biomedicine, where the specialized jargon and the convoluted facts further emphasize task complexity. To fill this gap, we argue that the summarizer should acquire semantic interpretation over input, exploiting structured and unambiguous representations to capture and conserve the most relevant parts of the text content. This paper presents CogitoErgoSumm, the first framework for biomedical abstractive summarization equipping large pre-trained language models with rich semantic graphs. Precisely, we infuse graphs from two complementary semantic parsing techniques with different goals and granularities—Event Extraction and Abstract Meaning Representation, also designing a reward signal to maximize information content preservation through reinforcement learning. Extensive quantitative and qualitative evaluations on the CDSR dataset show that our solution achieves competitive performance according to multiple metrics, despite using 2.5x fewer parameters. Results and ablation studies indicate that our joint text-graph model generates more enlightening, readable, and consistent summaries. Code available at: https://github.com/disi-unibo-nlp/cogito-ergo-summ.

Journal ArticleDOI
TL;DR: In this paper, the authors propose Carburacy, a carbon-aware accuracy measure that captures both model effectiveness and eco-sustainability, and compare multiple state-of-the-art quadratic and linear transformers on several datasets under eco-sustainable regimes.
Abstract: Generative transformer-based models have reached cutting-edge performance in long document summarization. Nevertheless, this task is witnessing a paradigm shift in developing ever-increasingly computationally-hungry solutions, focusing on effectiveness while ignoring the economic, environmental, and social costs of yielding such results. Accordingly, such extensive resources impact climate change and raise barriers to small and medium organizations distinguished by low-resource regimes of hardware and data. As a result, this unsustainable trend has lifted many concerns in the community, which directs the primary efforts on the proposal of tools to monitor models' energy costs. Despite their importance, no evaluation measure considering models' eco-sustainability exists yet. In this work, we propose Carburacy, the first carbon-aware accuracy measure that captures both model effectiveness and eco-sustainability. We perform a comprehensive benchmark for long document summarization, comparing multiple state-of-the-art quadratic and linear transformers on several datasets under eco-sustainable regimes. Finally, thanks to Carburacy, we found optimal combinations of hyperparameters that let models be competitive in effectiveness with significantly lower costs.

Journal ArticleDOI
TL;DR: In this article, a modification to the authors' deployed Bayesian ensembling framework for case time-series forecasting is proposed to forecast epidemic dynamics across the phases (waves) of a pandemic.
Abstract: Despite hundreds of methods published in the literature, forecasting epidemic dynamics remains challenging yet important. The challenges stem from multiple sources, including: the need for timely data, co-evolution of epidemic dynamics with behavioral and immunological adaptations, and the evolution of new pathogen strains. The ongoing COVID-19 pandemic highlighted these challenges; in an important article, Reich et al. did a comprehensive analysis highlighting many of these challenges. In this paper, we take another step in critically evaluating existing epidemic forecasting methods. Our methods are based on a simple yet crucial observation - epidemic dynamics go through a number of phases (waves). Armed with this understanding, we propose a modification to our deployed Bayesian ensembling case time series forecasting framework. We show that ensembling methods employing the phase information and using different weighting schemes for each phase can produce improved forecasts. We evaluate our proposed method with both the currently deployed model and the COVID-19 forecasthub models. The overall performance of the proposed model is consistent across the pandemic but more importantly, it is ranked third and first during two critical rapid growth phases in cases, regimes where the performance of most models from the CDC forecasting hub dropped significantly.
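
A phase-aware weighted ensemble of the kind described can be sketched as follows; the phase labels and weight values are illustrative stand-ins, not the deployed Bayesian ensembling framework.

```python
import numpy as np

def phase_weighted_ensemble(forecasts: np.ndarray, phase: str, phase_weights: dict) -> np.ndarray:
    """Combine member forecasts with weights chosen for the current epidemic phase.

    forecasts: (n_models, horizon) member forecasts; phase_weights[phase]: (n_models,) weights.
    """
    w = np.asarray(phase_weights[phase], dtype=float)
    w = w / w.sum()
    return w @ forecasts

# Illustrative: upweight trend-following members during rapid growth, flatter weights otherwise.
phase_weights = {
    "growth":  [0.5, 0.3, 0.2],
    "plateau": [0.3, 0.3, 0.4],
    "decline": [0.2, 0.4, 0.4],
}
member_forecasts = np.array([[100, 120, 150],
                             [ 90, 100, 110],
                             [ 95, 105, 120]], dtype=float)
print(phase_weighted_ensemble(member_forecasts, "growth", phase_weights))
```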

Journal ArticleDOI
TL;DR: In this paper, a constrained entropy maximization (CEM) algorithm is proposed to solve task-agnostic safe exploration problems, which naturally require a finite horizon and undiscounted constraints on safety costs.
Abstract: In the absence of assigned tasks, a learning agent typically seeks to explore its environment efficiently. However, the pursuit of exploration will bring more safety risks. An under-explored aspect of reinforcement learning is how to achieve safe efficient exploration when the task is unknown. In this paper, we propose a practical Constrained Entropy Maximization (CEM) algorithm to solve task-agnostic safe exploration problems, which naturally require a finite horizon and undiscounted constraints on safety costs. The CEM algorithm aims to learn a policy that maximizes state entropy under the premise of safety. To avoid approximating the state density in complex domains, CEM leverages a k-nearest neighbor entropy estimator to evaluate the efficiency of exploration. In terms of safety, CEM minimizes the safety costs, and adaptively trades off safety and exploration based on the current constraint satisfaction. The empirical analysis shows that CEM enables the acquisition of a safe exploration policy in complex environments, resulting in improved performance in both safety and sample efficiency for target tasks.
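
The k-nearest-neighbor entropy estimate used to score exploration can be sketched with a particle-based proxy; the constants of the full Kozachenko-Leonenko estimator are dropped, and the batch size, k, and reward form are illustrative.

```python
import numpy as np

def knn_entropy_reward(states: np.ndarray, k: int = 5) -> np.ndarray:
    """Particle-based state-entropy proxy: a larger distance to the k-th nearest
    neighbor indicates a less-visited region, hence a larger exploration reward."""
    dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)               # exclude self-distance
    kth = np.sort(dists, axis=1)[:, k - 1]        # distance to the k-th nearest neighbor
    return np.log(1.0 + kth)                      # per-state intrinsic reward

batch = np.random.randn(256, 8)                    # a batch of visited states (illustrative)
r_explore = knn_entropy_reward(batch)
# In a constrained setup such as CEM, this exploration objective would be traded off
# against the estimated safety cost, e.g., via a Lagrangian-style weight.
print(r_explore.mean())
```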

Journal ArticleDOI
TL;DR: NHITS uses hierarchical interpolation and multi-rate data sampling techniques to assemble long-horizon forecasts sequentially, emphasizing components with different frequencies and scales while decomposing the input signal and synthesizing the forecast.
Abstract: Recent progress in neural forecasting accelerated improvements in the performance of large-scale forecasting systems. Yet, long-horizon forecasting remains a very difficult task. Two common challenges afflicting the task are the volatility of the predictions and their computational complexity. We introduce NHITS, a model which addresses both challenges by incorporating novel hierarchical interpolation and multi-rate data sampling techniques. These techniques enable the proposed method to assemble its predictions sequentially, emphasizing components with different frequencies and scales while decomposing the input signal and synthesizing the forecast. We prove that the hierarchical interpolation technique can efficiently approximate arbitrarily long horizons in the presence of smoothness. Additionally, we conduct extensive large-scale dataset experiments from the long-horizon forecasting literature, demonstrating the advantages of our method over the state-of-the-art methods, where NHITS provides an average accuracy improvement of almost 20% over the latest Transformer architectures while reducing the computation time by an order of magnitude (50 times). Our code is available at https://github.com/Nixtla/neuralforecast.
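
The multi-rate sampling plus hierarchical interpolation idea can be illustrated with a toy sketch: each block sees a pooled, lower-rate view of the input, emits a short coarse forecast, and the coarse forecasts are interpolated back to the full horizon and summed. The pool sizes, block count, and random linear "blocks" are illustrative, not the NHITS architecture; the released implementation is in the linked neuralforecast repository.

```python
import numpy as np

def toy_nhits_forecast(y: np.ndarray, horizon: int, pool_sizes=(8, 4, 1), coarse_len=4):
    """Sum of per-block forecasts, each produced at a coarse rate and interpolated to `horizon`."""
    rng = np.random.default_rng(0)
    forecast = np.zeros(horizon)
    for p in pool_sizes:
        pooled = y[: len(y) // p * p].reshape(-1, p).mean(axis=1)   # multi-rate (average-pooled) input
        W = rng.normal(scale=0.1, size=(coarse_len, pooled.size))   # stand-in for a learned MLP block
        coarse = W @ pooled                                          # low-dimensional coarse forecast
        fine = np.interp(np.linspace(0, coarse_len - 1, horizon),    # hierarchical interpolation
                         np.arange(coarse_len), coarse)
        forecast += fine
    return forecast

history = np.sin(np.linspace(0, 20, 240))
print(toy_nhits_forecast(history, horizon=24).shape)  # (24,)
```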

Journal ArticleDOI
TL;DR: PateGail is a privacy-preserving imitation learning model for generating mobility trajectories, which utilizes the powerful generative adversarial imitation learning (GAIL) framework to simulate the decision-making process of humans.
Abstract: Generating human mobility trajectories is of great importance to solve the lack of large-scale trajectory data in numerous applications, which is caused by privacy concerns. However, existing mobility trajectory generation methods still require real-world human trajectories centrally collected as the training data, where there exists an inescapable risk of privacy leakage. To overcome this limitation, in this paper, we propose PateGail, a privacy-preserving imitation learning model to generate mobility trajectories, which utilizes the powerful generative adversarial imitation learning model to simulate the decision-making process of humans. Further, in order to protect user privacy, we train this model collectively based on decentralized mobility data stored in user devices, where personal discriminators are trained locally to distinguish and reward the real and generated human trajectories. In the training process, only the generated trajectories and their rewards obtained based on personal discriminators are shared between the server and devices, whose privacy is further preserved by our proposed perturbation mechanisms with theoretical proof to satisfy differential privacy. Further, to better model the human decision-making process, we propose a novel aggregation mechanism of the rewards obtained from personal discriminators. We theoretically prove that under the reward obtained based on the aggregation mechanism, our proposed model maximizes the lower bound of the discounted total rewards of users. Extensive experiments show that the trajectories generated by our model are able to resemble real-world trajectories in terms of five key statistical metrics, outperforming state-of-the-art algorithms by over 48.03%. Furthermore, we demonstrate that the synthetic trajectories are able to efficiently support practical applications, including mobility prediction and location recommendation.

Journal ArticleDOI
TL;DR: In this paper, a Transformer-based Conditional Mixture of Bernoulli Network (TCMBN) is proposed for multi-label prediction in real-world data.
Abstract: Streams of irregularly occurring events are commonly modeled as a marked temporal point process. Many real-world datasets such as e-commerce transactions and electronic health records often involve events where multiple event types co-occur, e.g. multiple items purchased or multiple diseases diagnosed simultaneously. In this paper, we tackle multi-label prediction in such a problem setting, and propose a novel Transformer-based Conditional Mixture of Bernoulli Network (TCMBN) that leverages neural density estimation to capture complex temporal dependence as well as probabilistic dependence between concurrent event types. We also propose potentially incorporating domain knowledge in the objective by regularizing the predicted probability. To represent probabilistic dependence of concurrent event types graphically, we design a two-step approach that first learns the mixture of Bernoulli network and then solves a least-squares semi-definite constrained program to numerically approximate the sparse precision matrix from a learned covariance matrix. This approach proves to be effective for event prediction while also providing an interpretable and possibly non-stationary structure for insights into event co-occurrence. We demonstrate the superior performance of our approach compared to existing baselines on multiple synthetic and real benchmarks.
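
The conditional mixture-of-Bernoulli output described in the abstract has the following generic form, written here without the Transformer conditioning details:

```latex
% Probability of a multi-label outcome y in {0,1}^D given the event history H, modeled as a
% mixture of M Bernoulli components; the mixing weights pi_m(H) and the per-type success
% probabilities mu_{m,d}(H) are produced by the (Transformer-conditioned) network.
p\bigl(y \mid H\bigr) \;=\; \sum_{m=1}^{M} \pi_m(H) \prod_{d=1}^{D}
\mu_{m,d}(H)^{\,y_d} \bigl(1 - \mu_{m,d}(H)\bigr)^{1 - y_d},
\qquad
\sum_{m=1}^{M} \pi_m(H) = 1
```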

Journal ArticleDOI
TL;DR: In this article, a decomposition of the SHAP contribution function based on decision paths is introduced, which allows a more comprehensible formulation of SHAP algorithms for tree-based models and can also be readily applied to computing SHAP interaction values of these models.
Abstract: In recent years, game-theoretic Shapley values have gained increasing attention with respect to local model explanation by feature attributions. While the approach using Shapley values is model-independent, their (exact) computation is usually intractable, so efficient model-specific algorithms have been devised including approaches for decision trees or their ensembles in general. Our work goes further in this direction by extending the interventional TreeSHAP algorithm to piecewise linear regression trees, which gained more attention in the past few years. To this end, we introduce a decomposition of the contribution function based on decision paths, which allows a more comprehensible formulation of SHAP algorithms for tree-based models. Our algorithm can also be readily applied to computing SHAP interaction values of these models. In particular, as the main contribution of this paper, we provide a more efficient approach of interventional SHAP for tree-based models by precomputing statistics of the background data based on the tree structure.

Journal ArticleDOI
TL;DR: In this paper, the authors define two classes of perpetual voting rules that are particularly easy to explain to voters, explore the bounds imposed by this simplicity, study proportionality in the perpetual setting, and identify two rules with strong proportionality guarantees.
Abstract: Perpetual voting is a framework for long-term collective decision making. In this framework, we consider a sequence of subsequent approval-based elections and try to achieve a fair overall outcome. To achieve fairness over time, perpetual voting rules take the history of previous decisions into account and identify voters that were dissatisfied with previous decisions. In this paper, we look at perpetual voting rules from an axiomatic perspective. First, we define two classes of perpetual voting rules that are particularly easy to explain to voters and explore the bounds imposed by this simplicity. Second, we study proportionality in the perpetual setting and identify two rules with strong proportionality guarantees. However, both rules yield different guarantees and we prove them to be incompatible with each other.

Journal ArticleDOI
TL;DR: In this paper, the hidden spatial relations are decomposed into a prior part, which applies across all samples, and a dynamic part, which varies between samples, and building different graphs is shown to be necessary to model these relations.
Abstract: Modeling multi-variate time-series (MVTS) data is a long-standing research subject and has found wide applications. Recently, there is a surge of interest in modeling spatial relations between variables as graphs, i.e., first learning one static graph for each dataset and then exploiting the graph structure via graph neural networks. However, as spatial relations may differ substantially across samples, building one static graph for all the samples inherently limits flexibility and severely degrades the performance in practice. To address this issue, we propose a framework for fine-grained modeling and utilization of spatial correlation between variables. By analyzing the statistical properties of real-world datasets, a universal decomposition of spatial correlation graphs is first identified. Specifically, the hidden spatial relations can be decomposed into a prior part, which applies across all the samples, and a dynamic part, which varies between samples, and building different graphs is necessary to model these relations. To better coordinate the learning of the two relational graphs, we propose a min-max learning paradigm that not only regulates the common part of different dynamic graphs but also guarantees spatial distinguishability among samples. The experimental results show that our proposed model outperforms the state-of-the-art baseline methods on both time-series forecasting and time-series point prediction tasks.

Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of partitioning n agents in an undirected social network into k groups of almost equal size (differing by at most one), where the utility of an agent for a group is the number of her neighbors in the group.
Abstract: We consider the problem of partitioning n agents in an undirected social network into k almost equal in size (differing by at most one) groups, where the utility of an agent for a group is the number of her neighbors in the group. The core and envy-freeness are two compelling axiomatic fairness guarantees in such settings. The former demands that there be no coalition of agents such that each agent in the coalition has more utility for that coalition than for her own group, while the latter demands that no agent envy another agent for the group they are in. We provide (often tight) approximations to both fairness guarantees, and many of our positive results are obtained via efficient algorithms.
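
The utility model and the envy-freeness condition described above are easy to make concrete; the check below uses the swap reading of envy (agent i imagines taking agent j's place, keeping group sizes fixed), and the graph and partition are illustrative.

```python
import itertools

def utility(agent, group, edges):
    """Utility of an agent for a group = number of her neighbors in that group."""
    return sum(1 for other in group if other != agent and frozenset((agent, other)) in edges)

def envious_pairs(partition, edges):
    """Pairs (i, j) where i would strictly prefer j's group (with j removed, i added)."""
    group_of = {a: g for g in partition for a in g}
    pairs = []
    for i, j in itertools.permutations(group_of, 2):
        if group_of[i] is group_of[j]:
            continue
        swapped = (group_of[j] - {j}) | {i}
        if utility(i, swapped, edges) > utility(i, group_of[i], edges):
            pairs.append((i, j))
    return pairs

# Toy instance: 6 agents, k = 2 groups of size 3 (sizes differ by at most one).
edges = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (4, 5), (3, 4), (5, 6)]}
partition = [{1, 2, 4}, {3, 5, 6}]
print(envious_pairs(partition, edges))  # a non-empty list means the partition is not envy-free
```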

Journal ArticleDOI
TL;DR: Zhang et al. propose ConvMatch, which uses a convolutional neural network (CNN) as the backbone to capture better context, thus avoiding the complex design of extra blocks.
Abstract: Multilayer perceptron (MLP) has been widely used in two-view correspondence learning for only unordered correspondences provided, and it extracts deep features from individual correspondence effectively. However, the problem of lacking context information limits its performance and hence, many extra complex blocks are designed to capture such information in the follow-up studies. In this paper, from a novel perspective, we design a correspondence learning network called ConvMatch that for the first time can leverage convolutional neural network (CNN) as the backbone to capture better context, thus avoiding the complex design of extra blocks. Specifically, with the observation that sparse motion vectors and dense motion field can be converted into each other with interpolating and sampling, we regularize the putative motion vectors by estimating dense motion field implicitly, then rectify the errors caused by outliers in local areas with CNN, and finally obtain correct motion vectors from the rectified motion field. Extensive experiments reveal that ConvMatch with a simple CNN backbone consistently outperforms state-of-the-arts including MLP-based methods for relative pose estimation and homography estimation, and shows promising generalization ability to different datasets and descriptors. Our code is publicly available at https://github.com/SuhZhang/ConvMatch.