
Showing papers in "Proceedings of the ... AAAI Conference on Artificial Intelligence in 2023"


Journal ArticleDOI
TL;DR: In this paper, the authors evaluate the use of GPT-3, a 175B parameter transformer model recently released by OpenAI, for three related challenges pertaining to math word problems corresponding to systems of two linear equations.
Abstract: Researchers have been interested in developing AI tools to help students learn various mathematical subjects. One challenging set of tasks for school students is learning to solve math word problems. We explore how recent advances in natural language processing, specifically the rise of powerful transformer based models, can be applied to help math learners with such problems. Concretely, we evaluate the use of GPT-3, a 175B parameter transformer model recently released by OpenAI, for three related challenges pertaining to math word problems corresponding to systems of two linear equations. The three challenges are classifying word problems, extracting equations from word problems, and generating word problems. For the first challenge, we define a set of problem classes and find that GPT-3 has generally very high accuracy in classifying word problems (80%-100%), for all but one of these classes. For the second challenge, we find the accuracy for extracting equations improves with the number of examples provided to the model, ranging from an accuracy of 31% for zero-shot learning to about 69% using 3-shot learning, which is further improved to a high value of 80% with fine-tuning. For the third challenge, we find that GPT-3 is able to generate problems with accuracy ranging from 33% to 93%, depending on the problem type.

9 citations
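
The few-shot equation-extraction setting described above can be illustrated with a small prompt-building sketch. The example problems, the separator format, and the build_prompt helper are hypothetical illustrations rather than the authors' actual prompts; a prompt of this form would then be sent to a GPT-3 completion endpoint.

```python
# Hypothetical sketch of building a k-shot prompt for extracting a system of
# two linear equations from a math word problem (examples are illustrative only).

EXAMPLES = [
    ("The sum of two numbers is 10 and their difference is 2. Find the numbers.",
     "x + y = 10; x - y = 2"),
    ("Three apples and two bananas cost 8 dollars; one apple and one banana cost 3 dollars.",
     "3x + 2y = 8; x + y = 3"),
    ("A rectangle's perimeter is 24 and its length is twice its width.",
     "2x + 2y = 24; x = 2y"),
]

def build_prompt(problem: str, k: int = 3) -> str:
    """Assemble a k-shot prompt: k worked (problem, equations) pairs, then the query."""
    shots = []
    for text, eqs in EXAMPLES[:k]:
        shots.append(f"Problem: {text}\nEquations: {eqs}")
    shots.append(f"Problem: {problem}\nEquations:")
    return "\n\n".join(shots)

if __name__ == "__main__":
    print(build_prompt("Two numbers add to 30 and one is 4 more than the other."))
```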


Journal ArticleDOI
TL;DR: Zhang et al. decompose an input into several fine-grained elements and construct a graph structure per sample to measure and utilize element-granularity relations within each sample.
Abstract: Zero-shot learning (ZSL) is an extreme case of transfer learning that aims to recognize samples (e.g., images) of unseen classes relying on a train-set covering only seen classes and a set of auxiliary knowledge (e.g., semantic descriptors). Existing methods usually resort to constructing a visual-to-semantics mapping based on features extracted from each whole sample. However, since the visual and semantic spaces are inherently independent and may exist in different manifolds, these methods may easily suffer from the domain bias problem due to the knowledge transfer from seen to unseen classes. Unlike existing works, this paper investigates the fine-grained ZSL from a novel perspective of sample-level graph. Specifically, we decompose an input into several fine-grained elements and construct a graph structure per sample to measure and utilize element-granularity relations within each sample. Taking advantage of recently developed graph neural networks (GNNs), we formulate the ZSL problem to a graph-to-semantics mapping task, which can better exploit element-semantics correlation and local sub-structural information in samples. Experimental results on the widely used benchmark datasets demonstrate that the proposed method can mitigate the domain bias problem and achieve competitive performance against other representative methods.

9 citations


Journal ArticleDOI
TL;DR: Wang et al. propose Flora, a real-time monocular 3D video reconstruction approach for reconstructing delicate and complete 3D scenes from RGB video sequences in an end-to-end manner.
Abstract: In this work, we propose a real-time monocular 3D video reconstruction approach named Flora for reconstructing delicate and complete 3D scenes from RGB video sequences in an end-to-end manner. Specifically, we introduce a novel method with two main contributions. Firstly, the proposed feature aggregation module retains both color and reliability in a dual-frequency form. Secondly, the loss compensation module solves missing structure by correcting losses for falsely pruned voxels. The dual-frequency feature aggregation module enhances reconstruction quality in both precision and recall, and the loss compensation module benefits the recall. Notably, both proposed contributions achieve great results with negligible inferencing overhead. Our state-of-the-art experimental results on real-world datasets demonstrate Flora's leading performance in both effectiveness and efficiency. The code is available at https://github.com/NoOneUST/Flora.

8 citations


Journal ArticleDOI
TL;DR: In this paper, the authors design an online learning method that adaptively optimizes the stability-plasticity tradeoff (i.e., keeping CIL models stable enough to retain old knowledge yet plastic enough to absorb new knowledge) without knowing the data-receiving setting a priori.
Abstract: Class-incremental learning (CIL) aims to train a classification model while the number of classes increases phase-by-phase. An inherent challenge of CIL is the stability-plasticity tradeoff, i.e., CIL models should keep stable to retain old knowledge and keep plastic to absorb new knowledge. However, none of the existing CIL models can achieve the optimal tradeoff in different data-receiving settings—where typically the training-from-half (TFH) setting needs more stability, but the training-from-scratch (TFS) needs more plasticity. To this end, we design an online learning method that can adaptively optimize the tradeoff without knowing the setting a priori. Specifically, we first introduce the key hyperparameters that influence the tradeoff, e.g., knowledge distillation (KD) loss weights, learning rates, and classifier types. Then, we formulate the hyperparameter optimization process as an online Markov Decision Process (MDP) problem and propose a specific algorithm to solve it. We apply local estimated rewards and a classic bandit algorithm, Exp3, to address the issues when applying online MDP methods to the CIL protocol. Our method consistently improves top-performing CIL methods in both TFH and TFS settings, e.g., boosting the average accuracy of TFH and TFS by 2.2 percentage points on ImageNet-Full, compared to the state-of-the-art. Code is provided at https://class-il.mpi-inf.mpg.de/online/

7 citations
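
Exp3, the classic adversarial-bandit algorithm the abstract relies on for online hyperparameter selection, can be sketched as follows. The arm set and reward scaling are illustrative placeholders, not the authors' CIL-specific setup.

```python
import math
import random

class Exp3:
    """Minimal Exp3 sketch: exponential weights with gamma-uniform exploration."""

    def __init__(self, n_arms: int, gamma: float = 0.1):
        self.n = n_arms
        self.gamma = gamma
        self.weights = [1.0] * n_arms

    def probabilities(self):
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n for w in self.weights]

    def select(self) -> int:
        return random.choices(range(self.n), weights=self.probabilities())[0]

    def update(self, arm: int, reward: float):
        # reward assumed in [0, 1]; importance-weighted estimate for the pulled arm
        p = self.probabilities()[arm]
        x_hat = reward / p
        self.weights[arm] *= math.exp(self.gamma * x_hat / self.n)

# Usage sketch: each arm could index one hyperparameter configuration
# (e.g., a KD-loss weight / learning-rate / classifier-type combination).
bandit = Exp3(n_arms=4)
arm = bandit.select()
bandit.update(arm, reward=0.7)  # reward = a locally estimated accuracy gain, rescaled to [0, 1]
```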


Journal ArticleDOI
TL;DR: In this paper, a transfer learning method is proposed to learn knowledge-enhanced features for emotion recognition in multi-party conversations by explicitly modeling the discourse relations between utterances and incorporating symbolic knowledge.
Abstract: Emotion recognition in conversation (ERC) has received increasing attention from the research community. However, the ERC task is challenging, largely due to the complex and unstructured properties of multi-party conversations. Besides, the majority of daily dialogues take place in a specific context or circumstance, which requires rich external knowledge to understand the background of a certain dialogue. In this paper, we address these challenges by explicitly modeling the discourse relations between utterances and incorporating symbolic knowledge into multi-party conversations. We first introduce a dialogue parsing algorithm into ERC and further improve the algorithm through a transfer learning method. Moreover, we leverage different symbolic knowledge graph relations to learn knowledge-enhanced features for the ERC task. Extensive experiments on three benchmarks demonstrate that both dialogue structure graphs and symbolic knowledge are beneficial to the model performance on the task. Additionally, experimental results indicate that the proposed model surpasses baseline models on several indices.

6 citations


Journal ArticleDOI
TL;DR: Substructure Aware Graph Neural Networks (SAGNN) introduce a Cut subgraph, which can be obtained from the original graph by continuously and selectively removing edges, and extend the random-walk encoding paradigm to the return probability of the rooted node on the subgraph to capture structural information and use it as a node feature.
Abstract: Despite the great achievements of Graph Neural Networks (GNNs) in graph learning, conventional GNNs struggle to break through the upper limit of the expressiveness of the first-order Weisfeiler-Leman graph isomorphism test algorithm (1-WL) due to the consistency of the propagation paradigm of GNNs with the 1-WL. Based on the fact that it is easier to distinguish the original graph through subgraphs, we propose a novel neural network framework called Substructure Aware Graph Neural Networks (SAGNN) to address these issues. We first propose a Cut subgraph which can be obtained from the original graph by continuously and selectively removing edges. Then we extend the random walk encoding paradigm to the return probability of the rooted node on the subgraph to capture the structural information and use it as a node feature to improve the expressiveness of GNNs. We theoretically prove that our framework is more powerful than 1-WL, and is superior in structure perception. Our extensive experiments demonstrate the effectiveness of our framework, achieving state-of-the-art performance on a variety of well-proven graph tasks, and GNNs equipped with our framework perform flawlessly even on graphs where 3-WL fails. Specifically, our framework achieves a maximum performance improvement of 83% compared to the base models and 32% compared to the previous state-of-the-art methods.

5 citations
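
The return-probability encoding referred to above can be sketched with a plain random-walk computation; the toy graph and the number of walk steps are illustrative, and the Cut-subgraph construction itself is omitted.

```python
import numpy as np

def return_probabilities(adj: np.ndarray, root: int, steps: int = 4) -> np.ndarray:
    """Probability that a random walk started at `root` is back at `root` after t = 1..steps."""
    deg = adj.sum(axis=1, keepdims=True)
    P = adj / np.clip(deg, 1e-12, None)        # row-stochastic transition matrix
    probs, Pt = [], np.eye(adj.shape[0])
    for _ in range(steps):
        Pt = Pt @ P
        probs.append(Pt[root, root])
    return np.array(probs)                      # usable as a structural node feature

# Toy example: a 4-cycle; the root returns with probability 0 at odd steps.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
print(return_probabilities(A, root=0))
```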


Journal ArticleDOI
TL;DR: The authors take a critical focus on fair-AI and survey issues, simplifications, and mistakes that researchers and practitioners often underestimate, which in turn can undermine trust in fair-AI and limit its contribution to society.
Abstract: There is a fast-growing literature addressing the fairness of AI models (fair-AI), with a continuous stream of new conceptual frameworks, methods, and tools. How much can we trust them? How much do they actually impact society? We take a critical focus on fair-AI and survey issues, simplifications, and mistakes that researchers and practitioners often underestimate, which in turn can undermine trust in fair-AI and limit its contribution to society. In particular, we discuss the hyper-focus on fairness metrics and on optimizing their average performance. We instantiate this observation by discussing Yule's effect of fair-AI tools: being fair on average does not imply being fair in contexts that matter. We conclude that the use of fair-AI methods should be complemented with the design, development, and verification practices that are commonly summarized under the umbrella of trustworthy AI.

5 citations
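
The Yule's-effect point, that being fair on average does not imply fairness in the contexts that matter, can be made concrete with a small worked example; the numbers below are invented purely for illustration.

```python
# Illustrative numbers only: a classifier can look fair in aggregate (equal positive rates
# for groups A and B) while being far from parity inside each context.
contexts = {
    # context: (positives_A, total_A, positives_B, total_B)
    "context_1": (50, 100, 18, 20),   # within-context rates: 0.50 vs 0.90
    "context_2": (10, 20, 42, 100),   # within-context rates: 0.50 vs 0.42
}

rate_A = sum(v[0] for v in contexts.values()) / sum(v[1] for v in contexts.values())
rate_B = sum(v[2] for v in contexts.values()) / sum(v[3] for v in contexts.values())
print(f"aggregate positive rate: A={rate_A:.2f}, B={rate_B:.2f}")  # 0.50 vs 0.50: "fair on average"

for name, (pa, na, pb, nb) in contexts.items():
    print(f"{name}: A={pa/na:.2f}, B={pb/nb:.2f}")                 # clearly unequal in each context
```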


Journal ArticleDOI
TL;DR: In this paper, the co-evolution of punishment and cooperation in one-shot public goods games is studied, and conditions for their co-presence, co-dominance, and co-extinction are derived.
Abstract: Altruistic punishment (or punishment) has been extensively shown as an important mechanism for promoting cooperation in human societies. In AI, the emergence of punishment has received much recent interest. In this paper, we contribute with a novel evolutionary game theoretic model to study the impacts of environmental feedback. Whereas a population of agents plays public goods games, there exists a third-party population whose payoffs depend not only on whether to punish or not, but also on the state of the environment (e.g., how cooperative the agents in a social dilemma are). Focusing on one-shot public goods games, we show that environmental feedback, by itself, can lead to the emergence of punishment. We analyze the co-evolution of punishment and cooperation, and derive conditions for their co-presence, co-dominance and co-extinction. Moreover, we show that the system can exhibit bistability as well as cyclic dynamics. Our findings provide a new explanation for the emergence of punishment. On the other hand, our results also alert the need for careful design of implementing punishment in multi-agent systems, as the resulting evolutionary dynamics can be somewhat complex.

4 citations
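
Evolutionary game-theoretic models of this kind are typically analyzed with replicator dynamics; a generic discrete-time replicator step is sketched below, with a placeholder payoff matrix rather than the paper's actual public-goods and punishment payoff structure.

```python
import numpy as np

def replicator_step(x: np.ndarray, payoff: np.ndarray, dt: float = 0.01) -> np.ndarray:
    """One Euler step of replicator dynamics: x_i grows in proportion to (f_i - mean fitness).

    x: strategy frequencies (sums to 1); payoff: matrix such that fitness f = payoff @ x.
    """
    f = payoff @ x
    x_new = x + dt * x * (f - x @ f)
    return x_new / x_new.sum()

# Toy 2-strategy example (e.g., "punish" vs "not punish"); the payoff entries are placeholders.
A = np.array([[1.0, 0.2],
              [1.1, 0.3]])
x = np.array([0.5, 0.5])
for _ in range(1000):
    x = replicator_step(x, A)
print(x)  # long-run strategy frequencies under this toy payoff matrix
```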


Journal ArticleDOI
TL;DR: Zhou et al. develop a transformer module that models the temporal correlations among the input features within pedestrian video sequences and a deep evidential learning model to capture the AI uncertainty under scene complexities.
Abstract: With rapid development in hardware (sensors and processors) and AI algorithms, automated driving techniques have entered the public’s daily life and achieved great success in supporting human driving performance. However, due to the high contextual variations and temporal dynamics in pedestrian behaviors, the interaction between autonomous-driving cars and pedestrians remains challenging, impeding the development of fully autonomous driving systems. This paper focuses on predicting pedestrian intention with a novel transformer-based evidential prediction (TrEP) algorithm. We develop a transformer module towards the temporal correlations among the input features within pedestrian video sequences and a deep evidential learning model to capture the AI uncertainty under scene complexities. Experimental results on three popular pedestrian intent benchmarks have verified the effectiveness of our proposed model over the state-of-the-art. The algorithm performance can be further boosted by controlling the uncertainty level. We systematically compare human disagreements with AI uncertainty to further evaluate AI performance in confusing scenes. The code is released at https://github.com/zzmonlyyou/TrEP.git.

4 citations
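
Deep evidential classification of the kind the abstract mentions typically places a Dirichlet distribution over class probabilities; the sketch below follows the standard subjective-logic formulation (evidence, belief, uncertainty mass) and is not necessarily TrEP's exact output head.

```python
import numpy as np

def evidential_outputs(evidence: np.ndarray):
    """evidence: non-negative per-class evidence (e.g., softplus of logits)."""
    K = evidence.shape[-1]
    alpha = evidence + 1.0                 # Dirichlet concentration parameters
    S = alpha.sum(axis=-1, keepdims=True)  # Dirichlet strength
    belief = evidence / S                  # per-class belief mass
    uncertainty = K / S                    # total uncertainty mass in (0, 1]
    prob = alpha / S                       # expected class probabilities
    return belief, uncertainty.squeeze(-1), prob

# Example for a binary crossing / not-crossing decision (illustrative logits).
logits = np.array([[2.0, 0.1]])
evidence = np.log1p(np.exp(logits))        # softplus
b, u, p = evidential_outputs(evidence)
print(p, u)                                # low u = confident scene, high u = confusing scene
```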


Journal ArticleDOI
TL;DR: In this article, a fusion network is proposed to predict the 3D offsets between radar returns and object centers, enabling radar depths to enhance the accuracy of monocular 3D detection; it shows significant improvements in mean average precision and translation error on the nuScenes dataset over monocular counterparts.
Abstract: As a direct depth sensor, radar holds promise as a tool to improve monocular 3D object detection, which suffers from depth errors, due in part to the depth-scale ambiguity. On the other hand, leveraging radar depths is hampered by difficulties in precisely associating radar returns with 3D estimates from monocular methods, effectively erasing its benefits. This paper proposes a fusion network that addresses this radar-camera association challenge. We train our network to predict the 3D offsets between radar returns and object centers, enabling radar depths to enhance the accuracy of 3D monocular detection. By using parallel radar and camera backbones, our network fuses information at both the feature level and detection level, while at the same time leveraging a state-of-the-art monocular detection technique without retraining it. Experimental results show significant improvement in mean average precision and translation error on the nuScenes dataset over monocular counterparts. Our source code is available at https://github.com/longyunf/radiant.

3 citations


Journal ArticleDOI
TL;DR: Wang et al. propose a Hierarchical ConViT with an Attention-based Relational Reasoner (HCV-ARR) to tackle the challenges of visual perception and logical reasoning on RPMs.
Abstract: Raven’s Progressive Matrices (RPMs) have been widely used to evaluate the visual reasoning ability of humans. To tackle the challenges of visual perception and logic reasoning on RPMs, we propose a Hierarchical ConViT with Attention-based Relational Reasoner (HCV-ARR). Traditional solution methods often apply relatively shallow convolution networks to visually perceive shape patterns in RPM images, which may not fully model the long-range dependencies of complex pattern combinations in RPMs. The proposed ConViT consists of a convolutional block to capture the low-level attributes of visual patterns, and a transformer block to capture the high-level image semantics such as pattern formations. Furthermore, the proposed hierarchical ConViT captures visual features from multiple receptive fields, where the shallow layers focus on the image fine details while the deeper layers focus on the image semantics. To better model the underlying reasoning rules embedded in RPM images, an Attention-based Relational Reasoner (ARR) is proposed to establish the underlying relations among images. The proposed ARR well exploits the hidden relations among question images through the developed element-wise attentive reasoner. Experimental results on three RPM datasets demonstrate that the proposed HCV-ARR achieves a significant performance gain compared with the state-of-the-art models. The source code is available at: https://github.com/wentaoheunnc/HCV-ARR.

Journal ArticleDOI
TL;DR: In this paper, the authors discuss the benefits and challenges of shared tasks as a teaching method and derive a domain-neutral process model that captures the tutorial structure used in information retrieval courses at two universities.
Abstract: In this paper, we discuss the benefits and challenges of shared tasks as a teaching method. A shared task is a scientific event and a friendly competition to solve a research problem, the task. In terms of linking research and teaching, shared-task-based tutorials fulfill several faculty desires: they leverage students' interdisciplinary and heterogeneous skills, foster teamwork, and engage them in creative work that has the potential to produce original research contributions. Based on ten information retrieval (IR) courses at two universities since 2019 with shared tasks as tutorials, we derive a domain-neutral process model to capture the respective tutorial structure. Meanwhile, our teaching method has been adopted by other universities in IR courses, but also in other areas of AI such as natural language processing and robotics.

Journal ArticleDOI
TL;DR: Wang et al. propose a Mask Graph Convolutional Network (Mask-GCN) to learn action-specific skeleton joints that mainly convey action information while masking action-agnostic skeleton joints.
Abstract: Most skeleton-based action recognition methods assume that the same type of action samples in the training set and the test set share similar motion patterns. However, action samples in real scenarios usually contain novel motion patterns which are not involved in the training set. As it is laborious to collect sufficient training samples to enumerate various types of novel motion patterns, this paper presents a practical skeleton-based action recognition task where the training set contains common motion patterns of action samples and the test set contains action samples that suffer from novel motion patterns. For this task, we present a Mask Graph Convolutional Network (Mask-GCN) to focus on learning action-specific skeleton joints that mainly convey action information meanwhile masking action-agnostic skeleton joints that convey rare action information and suffer more from novel motion patterns. Specifically, we design a policy network to learn layer-wise body masks to construct masked adjacency matrices, which guide a GCN-based backbone to learn stable yet informative action features from dynamic graph structure. Extensive experiments on our newly collected dataset verify that Mask-GCN outperforms most GCN-based methods when testing with various novel motion patterns.
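
A minimal sketch of the masked-adjacency idea, applying a learned per-joint mask to the skeleton graph before a graph-convolution step, is shown below; the mask values, joint count, and normalization are illustrative assumptions rather than Mask-GCN's actual policy network.

```python
import numpy as np

def masked_gcn_layer(X, A, joint_mask, W):
    """One graph-convolution step on a masked skeleton graph.

    X: (N, F) joint features; A: (N, N) skeleton adjacency; joint_mask: (N,) values in [0, 1]
    down-weighting action-agnostic joints; W: (F, F') layer weights.
    """
    M = np.outer(joint_mask, joint_mask)          # edge mask induced by the joint mask
    A_masked = A * M + np.eye(A.shape[0])         # mask edges, keep self-loops
    deg = A_masked.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    A_hat = D_inv_sqrt @ A_masked @ D_inv_sqrt    # symmetric normalization
    return np.maximum(A_hat @ X @ W, 0.0)         # ReLU(A_hat X W)

# Toy 3-joint chain with the last joint down-weighted as action-agnostic (illustrative).
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = np.random.randn(3, 4)
W = np.random.randn(4, 8)
out = masked_gcn_layer(X, A, joint_mask=np.array([1.0, 1.0, 0.1]), W=W)
print(out.shape)  # (3, 8)
```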

Journal ArticleDOI
TL;DR: In this article, the authors study individual unfairness in the presence of censorship on three benchmark tasks and provide the first known results on individual fairness guarantees in the analysis of censored data, where the availability of class labels is not always guaranteed.
Abstract: There has been increasing concern within the machine learning community and beyond that Artificial Intelligence (AI) faces a bias and discrimination crisis which needs AI fairness with urgency. As many have begun to work on this problem, most existing work depends on the availability of class label for the given fairness definition and algorithm which may not align with real-world usage. In this work, we study an AI fairness problem that stems from the gap between the design of a "fair" model in the lab and its deployment in the real-world. Specifically, we consider defining and mitigating individual unfairness amidst censorship, where the availability of class label is not always guaranteed due to censorship, which is broadly applicable in a diversity of real-world socially sensitive applications. We show that our method is able to quantify and mitigate individual unfairness in the presence of censorship across three benchmark tasks, which provides the first known results on individual fairness guarantee in analysis of censored data.

Journal ArticleDOI
TL;DR: This paper studies compositional generalization of a reinforcement learning agent following natural language instructions in an embodied environment, developing a set of tasks in a photo-realistic simulated kitchen environment that allow the degree to which a behavioral policy captures the systematicity of language to be studied via its zero-shot generalization performance on held-out natural language instructions.
Abstract: The ability to combine learned knowledge and skills to solve novel tasks is a key aspect of generalization in humans that allows us to understand and perform tasks described by novel language utterances. While progress has been made in supervised learning settings, no work has yet studied compositional generalization of a reinforcement learning agent following natural language instructions in an embodied environment. We develop a set of tasks in a photo-realistic simulated kitchen environment that allow us to study the degree to which a behavioral policy captures the systematicity in language by studying its zero-shot generalization performance on held out natural language instructions. We show that our agent which leverages a novel additive action-value decomposition in tandem with attention based subgoal prediction is able to exploit composition in text instructions to generalize to unseen tasks.
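
The additive action-value decomposition mentioned above can be written compactly; the form below is a generic illustration over predicted subgoals, not necessarily the paper's exact parameterization.

```latex
% Generic additive decomposition over K predicted subgoals g_1, ..., g_K (illustrative form):
% the agent acts greedily with respect to the sum of per-subgoal action values.
Q\bigl(s, a \mid g_{1:K}\bigr) \;=\; \sum_{k=1}^{K} Q_{\theta}\bigl(s, a, g_k\bigr),
\qquad
\pi(s) \;=\; \arg\max_{a} \; Q\bigl(s, a \mid g_{1:K}\bigr)
```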

Journal ArticleDOI
TL;DR: Wang et al. formulate puzzle reassembly as a combinatorial optimization problem and propose Siamese-Discriminant Deep Reinforcement Learning (SD2RL) to solve it.
Abstract: Jigsaw puzzle solving has recently become an emerging research area. The developed techniques have been widely used in applications beyond puzzle solving. This paper focuses on solving Jigsaw Puzzles with Large Eroded Gaps (JPwLEG). We formulate the puzzle reassembly as a combinatorial optimization problem and propose a Siamese-Discriminant Deep Reinforcement Learning (SD2RL) to solve it. A Deep Q-network (DQN) is designed to visually understand the puzzles, which consists of two sets of Siamese Discriminant Networks, one set to perceive the pairwise relations between vertical neighbors and another set for horizontal neighbors. The proposed DQN considers not only the evidence from the incumbent fragment but also the support from its four neighbors. The DQN is trained using replay experience with carefully designed rewards to guide the search for a sequence of fragment swaps to reach the correct puzzle solution. Two JPwLEG datasets are constructed to evaluate the proposed method, and the experimental results show that the proposed SD2RL significantly outperforms state-of-the-art methods.

Journal ArticleDOI
TL;DR: In this article, a novel algorithm for the feature relevancy problem is proposed, which is applicable to any ML classifier that meets minor requirements and is shown to be efficient in practice.
Abstract: Trustable explanations of machine learning (ML) models are vital in high-risk uses of artificial intelligence (AI). Apart from the computation of trustable explanations, a number of explainability queries have been identified and studied in recent work. Some of these queries involve solving quantification problems, either in propositional or in more expressive logics. This paper investigates one of these quantification problems, namely the feature relevancy problem (FRP), i.e., to decide whether a (possibly sensitive) feature can occur in some explanation of a prediction. In contrast with earlier work, which studied FRP for specific classifiers, this paper proposes a novel algorithm for the FRP quantification problem which is applicable to any ML classifier that meets minor requirements. Furthermore, the paper shows that the novel algorithm is efficient in practice. The experimental results, obtained using random forests (RFs) induced from well-known publicly available datasets, demonstrate that the proposed solution outperforms existing state-of-the-art solvers for Quantified Boolean Formulas (QBF) by orders of magnitude. Finally, the paper also identifies a novel family of formulas that are challenging for current state-of-the-art QBF solvers.

Journal ArticleDOI
TL;DR: NL2LTL leverages the latest in natural language understanding (NLU) and large language models (LLMs) to translate natural language instructions into linear temporal logic (LTL) formulas.
Abstract: This is a demonstration of our newly released Python package NL2LTL which leverages the latest in natural language understanding (NLU) and large language models (LLMs) to translate natural language instructions to linear temporal logic (LTL) formulas. This allows direct translation to formal languages that a reasoning system can use, while at the same time, allowing the end-user to provide inputs in natural language without having to understand any details of an underlying formal language. The package comes with support for a set of default LTL patterns, corresponding to popular DECLARE templates, but is also fully extensible to new formulas and user inputs. The package is open-source and is free to use for the AI community under the MIT license. Open Source: https://github.com/IBM/nl2ltl. Video Link: https://bit.ly/3dHW5b1
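
For readers unfamiliar with DECLARE templates, two common patterns and their usual LTL readings are shown below (G = always, F = eventually); the concrete pattern set shipped with NL2LTL may differ.

```latex
% Two standard DECLARE templates and their usual LTL semantics:
\mathrm{Existence}(a) \;\equiv\; \mathbf{F}\, a
\qquad\qquad
\mathrm{Response}(a, b) \;\equiv\; \mathbf{G}\bigl(a \rightarrow \mathbf{F}\, b\bigr)
```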

Journal ArticleDOI
TL;DR: CogitoErgoSumm infuses graphs from two complementary semantic parsing techniques with different goals and granularities, and designs a reward signal to maximize information content preservation through reinforcement learning.
Abstract: The automatic synthesis of biomedical publications catalyzes a profound research interest elicited by literature congestion. Current sequence-to-sequence models mainly rely on the lexical surface and seldom consider the deep semantic interconnections between the entities mentioned in the source document. Such superficiality translates into fabricated, poorly informative, redundant, and near-extractive summaries that severely restrict their real-world application in biomedicine, where the specialized jargon and the convoluted facts further emphasize task complexity. To fill this gap, we argue that the summarizer should acquire semantic interpretation over input, exploiting structured and unambiguous representations to capture and conserve the most relevant parts of the text content. This paper presents CogitoErgoSumm, the first framework for biomedical abstractive summarization equipping large pre-trained language models with rich semantic graphs. Precisely, we infuse graphs from two complementary semantic parsing techniques with different goals and granularities—Event Extraction and Abstract Meaning Representation, also designing a reward signal to maximize information content preservation through reinforcement learning. Extensive quantitative and qualitative evaluations on the CDSR dataset show that our solution achieves competitive performance according to multiple metrics, despite using 2.5x fewer parameters. Results and ablation studies indicate that our joint text-graph model generates more enlightening, readable, and consistent summaries. Code available at: https://github.com/disi-unibo-nlp/cogito-ergo-summ.

Journal ArticleDOI
TL;DR: In this paper, the authors propose Carburacy, a carbon-aware accuracy measure that captures both model effectiveness and eco-sustainability, and compare multiple state-of-the-art quadratic and linear transformers on several datasets under eco-sustainable regimes.
Abstract: Generative transformer-based models have reached cutting-edge performance in long document summarization. Nevertheless, this task is witnessing a paradigm shift in developing ever-increasingly computationally-hungry solutions, focusing on effectiveness while ignoring the economic, environmental, and social costs of yielding such results. Accordingly, such extensive resources impact climate change and raise barriers to small and medium organizations distinguished by low-resource regimes of hardware and data. As a result, this unsustainable trend has lifted many concerns in the community, which directs the primary efforts on the proposal of tools to monitor models' energy costs. Despite their importance, no evaluation measure considering models' eco-sustainability exists yet. In this work, we propose Carburacy, the first carbon-aware accuracy measure that captures both model effectiveness and eco-sustainability. We perform a comprehensive benchmark for long document summarization, comparing multiple state-of-the-art quadratic and linear transformers on several datasets under eco-sustainable regimes. Finally, thanks to Carburacy, we found optimal combinations of hyperparameters that let models be competitive in effectiveness with significantly lower costs.

Journal ArticleDOI
TL;DR: In this article, a modification to the authors' deployed Bayesian ensembling framework for case time-series forecasting is proposed to forecast epidemic dynamics across the phases (waves) of a pandemic.
Abstract: Despite hundreds of methods published in the literature, forecasting epidemic dynamics remains challenging yet important. The challenges stem from multiple sources, including: the need for timely data, co-evolution of epidemic dynamics with behavioral and immunological adaptations, and the evolution of new pathogen strains. The ongoing COVID-19 pandemic highlighted these challenges; in an important article, Reich et al. did a comprehensive analysis highlighting many of these challenges. In this paper, we take another step in critically evaluating existing epidemic forecasting methods. Our methods are based on a simple yet crucial observation - epidemic dynamics go through a number of phases (waves). Armed with this understanding, we propose a modification to our deployed Bayesian ensembling case time series forecasting framework. We show that ensembling methods employing the phase information and using different weighting schemes for each phase can produce improved forecasts. We evaluate our proposed method with both the currently deployed model and the COVID-19 forecasthub models. The overall performance of the proposed model is consistent across the pandemic but more importantly, it is ranked third and first during two critical rapid growth phases in cases, regimes where the performance of most models from the CDC forecasting hub dropped significantly.
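
A phase-aware weighted ensemble of the kind described can be sketched as follows; the phase labels and weight values are illustrative stand-ins, not the deployed Bayesian ensembling framework.

```python
import numpy as np

def phase_weighted_ensemble(forecasts: np.ndarray, phase: str, phase_weights: dict) -> np.ndarray:
    """Combine member forecasts with weights chosen for the current epidemic phase.

    forecasts: (n_models, horizon) member forecasts; phase_weights[phase]: (n_models,) weights.
    """
    w = np.asarray(phase_weights[phase], dtype=float)
    w = w / w.sum()
    return w @ forecasts

# Illustrative: upweight trend-following members during rapid growth, flatter weights otherwise.
phase_weights = {
    "growth":  [0.5, 0.3, 0.2],
    "plateau": [0.3, 0.3, 0.4],
    "decline": [0.2, 0.4, 0.4],
}
member_forecasts = np.array([[100, 120, 150],
                             [ 90, 100, 110],
                             [ 95, 105, 120]], dtype=float)
print(phase_weighted_ensemble(member_forecasts, "growth", phase_weights))
```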

Journal ArticleDOI
TL;DR: In this paper, a constrained entropy maximization (CEM) algorithm is proposed to solve task-agnostic safe exploration problems, which naturally require a finite horizon and undiscounted constraints on safety costs.
Abstract: In the absence of assigned tasks, a learning agent typically seeks to explore its environment efficiently. However, the pursuit of exploration will bring more safety risks. An under-explored aspect of reinforcement learning is how to achieve safe efficient exploration when the task is unknown. In this paper, we propose a practical Constrained Entropy Maximization (CEM) algorithm to solve task-agnostic safe exploration problems, which naturally require a finite horizon and undiscounted constraints on safety costs. The CEM algorithm aims to learn a policy that maximizes state entropy under the premise of safety. To avoid approximating the state density in complex domains, CEM leverages a k-nearest neighbor entropy estimator to evaluate the efficiency of exploration. In terms of safety, CEM minimizes the safety costs, and adaptively trades off safety and exploration based on the current constraint satisfaction. The empirical analysis shows that CEM enables the acquisition of a safe exploration policy in complex environments, resulting in improved performance in both safety and sample efficiency for target tasks.
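
The k-nearest-neighbor entropy estimate used to score exploration can be sketched with a particle-based proxy; the constants of the full Kozachenko-Leonenko estimator are dropped, and the batch size, k, and reward form are illustrative.

```python
import numpy as np

def knn_entropy_reward(states: np.ndarray, k: int = 5) -> np.ndarray:
    """Particle-based state-entropy proxy: a larger distance to the k-th nearest
    neighbor indicates a less-visited region, hence a larger exploration reward."""
    dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)               # exclude self-distance
    kth = np.sort(dists, axis=1)[:, k - 1]        # distance to the k-th nearest neighbor
    return np.log(1.0 + kth)                      # per-state intrinsic reward

batch = np.random.randn(256, 8)                    # a batch of visited states (illustrative)
r_explore = knn_entropy_reward(batch)
# In a constrained setup such as CEM, this exploration objective would be traded off
# against the estimated safety cost, e.g., via a Lagrangian-style weight.
print(r_explore.mean())
```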

Journal ArticleDOI
TL;DR: NHITS uses hierarchical interpolation and multi-rate data sampling techniques to assemble long-horizon forecasts sequentially, emphasizing components with different frequencies and scales while decomposing the input signal and synthesizing the forecast.
Abstract: Recent progress in neural forecasting accelerated improvements in the performance of large-scale forecasting systems. Yet, long-horizon forecasting remains a very difficult task. Two common challenges afflicting the task are the volatility of the predictions and their computational complexity. We introduce NHITS, a model which addresses both challenges by incorporating novel hierarchical interpolation and multi-rate data sampling techniques. These techniques enable the proposed method to assemble its predictions sequentially, emphasizing components with different frequencies and scales while decomposing the input signal and synthesizing the forecast. We prove that the hierarchical interpolation technique can efficiently approximate arbitrarily long horizons in the presence of smoothness. Additionally, we conduct extensive large-scale dataset experiments from the long-horizon forecasting literature, demonstrating the advantages of our method over the state-of-the-art methods, where NHITS provides an average accuracy improvement of almost 20% over the latest Transformer architectures while reducing the computation time by an order of magnitude (50 times). Our code is available at https://github.com/Nixtla/neuralforecast.
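
The multi-rate sampling plus hierarchical interpolation idea can be illustrated with a toy sketch: each block sees a pooled, lower-rate view of the input, emits a short coarse forecast, and the coarse forecasts are interpolated back to the full horizon and summed. The pool sizes, block count, and random linear "blocks" are illustrative, not the NHITS architecture; the released implementation is in the linked neuralforecast repository.

```python
import numpy as np

def toy_nhits_forecast(y: np.ndarray, horizon: int, pool_sizes=(8, 4, 1), coarse_len=4):
    """Sum of per-block forecasts, each produced at a coarse rate and interpolated to `horizon`."""
    rng = np.random.default_rng(0)
    forecast = np.zeros(horizon)
    for p in pool_sizes:
        pooled = y[: len(y) // p * p].reshape(-1, p).mean(axis=1)   # multi-rate (average-pooled) input
        W = rng.normal(scale=0.1, size=(coarse_len, pooled.size))   # stand-in for a learned MLP block
        coarse = W @ pooled                                          # low-dimensional coarse forecast
        fine = np.interp(np.linspace(0, coarse_len - 1, horizon),    # hierarchical interpolation
                         np.arange(coarse_len), coarse)
        forecast += fine
    return forecast

history = np.sin(np.linspace(0, 20, 240))
print(toy_nhits_forecast(history, horizon=24).shape)  # (24,)
```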

Journal ArticleDOI
TL;DR: PateGail is a privacy-preserving imitation learning model for generating mobility trajectories, which utilizes the powerful generative adversarial imitation learning (GAIL) framework to simulate the decision-making process of humans.
Abstract: Generating human mobility trajectories is of great importance to solve the lack of large-scale trajectory data in numerous applications, which is caused by privacy concerns. However, existing mobility trajectory generation methods still require real-world human trajectories centrally collected as the training data, where there exists an inescapable risk of privacy leakage. To overcome this limitation, in this paper, we propose PateGail, a privacy-preserving imitation learning model to generate mobility trajectories, which utilizes the powerful generative adversarial imitation learning model to simulate the decision-making process of humans. Further, in order to protect user privacy, we train this model collectively based on decentralized mobility data stored in user devices, where personal discriminators are trained locally to distinguish and reward the real and generated human trajectories. In the training process, only the generated trajectories and their rewards obtained based on personal discriminators are shared between the server and devices, whose privacy is further preserved by our proposed perturbation mechanisms with theoretical proof to satisfy differential privacy. Further, to better model the human decision-making process, we propose a novel aggregation mechanism of the rewards obtained from personal discriminators. We theoretically prove that under the reward obtained based on the aggregation mechanism, our proposed model maximizes the lower bound of the discounted total rewards of users. Extensive experiments show that the trajectories generated by our model are able to resemble real-world trajectories in terms of five key statistical metrics, outperforming state-of-the-art algorithms by over 48.03%. Furthermore, we demonstrate that the synthetic trajectories are able to efficiently support practical applications, including mobility prediction and location recommendation.

Journal ArticleDOI
TL;DR: In this paper, a Transformer-based Conditional Mixture of Bernoulli Network (TCMBN) is proposed for multi-label prediction in real-world data.
Abstract: Streams of irregularly occurring events are commonly modeled as a marked temporal point process. Many real-world datasets such as e-commerce transactions and electronic health records often involve events where multiple event types co-occur, e.g. multiple items purchased or multiple diseases diagnosed simultaneously. In this paper, we tackle multi-label prediction in such a problem setting, and propose a novel Transformer-based Conditional Mixture of Bernoulli Network (TCMBN) that leverages neural density estimation to capture complex temporal dependence as well as probabilistic dependence between concurrent event types. We also propose potentially incorporating domain knowledge in the objective by regularizing the predicted probability. To represent probabilistic dependence of concurrent event types graphically, we design a two-step approach that first learns the mixture of Bernoulli network and then solves a least-squares semi-definite constrained program to numerically approximate the sparse precision matrix from a learned covariance matrix. This approach proves to be effective for event prediction while also providing an interpretable and possibly non-stationary structure for insights into event co-occurrence. We demonstrate the superior performance of our approach compared to existing baselines on multiple synthetic and real benchmarks.
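
The conditional mixture-of-Bernoulli output described in the abstract has the following generic form, written here without the Transformer conditioning details:

```latex
% Probability of a multi-label outcome y in {0,1}^D given the event history H, modeled as a
% mixture of M Bernoulli components; the mixing weights pi_m(H) and the per-type success
% probabilities mu_{m,d}(H) are produced by the (Transformer-conditioned) network.
p\bigl(y \mid H\bigr) \;=\; \sum_{m=1}^{M} \pi_m(H) \prod_{d=1}^{D}
\mu_{m,d}(H)^{\,y_d} \bigl(1 - \mu_{m,d}(H)\bigr)^{1 - y_d},
\qquad
\sum_{m=1}^{M} \pi_m(H) = 1
```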

Journal ArticleDOI
TL;DR: In this article, a decomposition of the SHAP contribution function based on decision paths is introduced, which allows a more comprehensible formulation of SHAP algorithms for tree-based models and can also be readily applied to computing SHAP interaction values of these models.
Abstract: In recent years, game-theoretic Shapley values have gained increasing attention with respect to local model explanation by feature attributions. While the approach using Shapley values is model-independent, their (exact) computation is usually intractable, so efficient model-specific algorithms have been devised including approaches for decision trees or their ensembles in general. Our work goes further in this direction by extending the interventional TreeSHAP algorithm to piecewise linear regression trees, which gained more attention in the past few years. To this end, we introduce a decomposition of the contribution function based on decision paths, which allows a more comprehensible formulation of SHAP algorithms for tree-based models. Our algorithm can also be readily applied to computing SHAP interaction values of these models. In particular, as the main contribution of this paper, we provide a more efficient approach of interventional SHAP for tree-based models by precomputing statistics of the background data based on the tree structure.

Journal ArticleDOI
TL;DR: In this paper, the authors define two classes of perpetual voting rules that are particularly easy to explain to voters, explore the bounds imposed by this simplicity, study proportionality in the perpetual setting, and identify two rules with strong proportionality guarantees.
Abstract: Perpetual voting is a framework for long-term collective decision making. In this framework, we consider a sequence of subsequent approval-based elections and try to achieve a fair overall outcome. To achieve fairness over time, perpetual voting rules take the history of previous decisions into account and identify voters that were dissatisfied with previous decisions. In this paper, we look at perpetual voting rules from an axiomatic perspective. First, we define two classes of perpetual voting rules that are particularly easy to explain to voters and explore the bounds imposed by this simplicity. Second, we study proportionality in the perpetual setting and identify two rules with strong proportionality guarantees. However, both rules yield different guarantees and we prove them to be incompatible with each other.

Journal ArticleDOI
TL;DR: In this paper, the hidden spatial relations are decomposed into a prior part, which applies across all samples, and a dynamic part, which varies between samples, and building different graphs is shown to be necessary to model these relations.
Abstract: Modeling multi-variate time-series (MVTS) data is a long-standing research subject and has found wide applications. Recently, there is a surge of interest in modeling spatial relations between variables as graphs, i.e., first learning one static graph for each dataset and then exploiting the graph structure via graph neural networks. However, as spatial relations may differ substantially across samples, building one static graph for all the samples inherently limits flexibility and severely degrades the performance in practice. To address this issue, we propose a framework for fine-grained modeling and utilization of spatial correlation between variables. By analyzing the statistical properties of real-world datasets, a universal decomposition of spatial correlation graphs is first identified. Specifically, the hidden spatial relations can be decomposed into a prior part, which applies across all the samples, and a dynamic part, which varies between samples, and building different graphs is necessary to model these relations. To better coordinate the learning of the two relational graphs, we propose a min-max learning paradigm that not only regulates the common part of different dynamic graphs but also guarantees spatial distinguishability among samples. The experimental results show that our proposed model outperforms the state-of-the-art baseline methods on both time-series forecasting and time-series point prediction tasks.

Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of partitioning n agents in an undirected social network into k groups of almost equal size (differing by at most one), where the utility of an agent for a group is the number of her neighbors in the group.
Abstract: We consider the problem of partitioning n agents in an undirected social network into k almost equal in size (differing by at most one) groups, where the utility of an agent for a group is the number of her neighbors in the group. The core and envy-freeness are two compelling axiomatic fairness guarantees in such settings. The former demands that there be no coalition of agents such that each agent in the coalition has more utility for that coalition than for her own group, while the latter demands that no agent envy another agent for the group they are in. We provide (often tight) approximations to both fairness guarantees, and many of our positive results are obtained via efficient algorithms.
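
The utility model and the envy-freeness condition described above are easy to make concrete; the check below uses the swap reading of envy (agent i imagines taking agent j's place, keeping group sizes fixed), and the graph and partition are illustrative.

```python
import itertools

def utility(agent, group, edges):
    """Utility of an agent for a group = number of her neighbors in that group."""
    return sum(1 for other in group if other != agent and frozenset((agent, other)) in edges)

def envious_pairs(partition, edges):
    """Pairs (i, j) where i would strictly prefer j's group (with j removed, i added)."""
    group_of = {a: g for g in partition for a in g}
    pairs = []
    for i, j in itertools.permutations(group_of, 2):
        if group_of[i] is group_of[j]:
            continue
        swapped = (group_of[j] - {j}) | {i}
        if utility(i, swapped, edges) > utility(i, group_of[i], edges):
            pairs.append((i, j))
    return pairs

# Toy instance: 6 agents, k = 2 groups of size 3 (sizes differ by at most one).
edges = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (4, 5), (3, 4), (5, 6)]}
partition = [{1, 2, 4}, {3, 5, 6}]
print(envious_pairs(partition, edges))  # a non-empty list means the partition is not envy-free
```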

Journal ArticleDOI
TL;DR: Zhang et al. propose ConvMatch, which uses a convolutional neural network (CNN) as the backbone to capture better context, thus avoiding the complex design of extra blocks.
Abstract: Multilayer perceptron (MLP) has been widely used in two-view correspondence learning for only unordered correspondences provided, and it extracts deep features from individual correspondence effectively. However, the problem of lacking context information limits its performance and hence, many extra complex blocks are designed to capture such information in the follow-up studies. In this paper, from a novel perspective, we design a correspondence learning network called ConvMatch that for the first time can leverage convolutional neural network (CNN) as the backbone to capture better context, thus avoiding the complex design of extra blocks. Specifically, with the observation that sparse motion vectors and dense motion field can be converted into each other with interpolating and sampling, we regularize the putative motion vectors by estimating dense motion field implicitly, then rectify the errors caused by outliers in local areas with CNN, and finally obtain correct motion vectors from the rectified motion field. Extensive experiments reveal that ConvMatch with a simple CNN backbone consistently outperforms state-of-the-arts including MLP-based methods for relative pose estimation and homography estimation, and shows promising generalization ability to different datasets and descriptors. Our code is publicly available at https://github.com/SuhZhang/ConvMatch.