Author

J. Daniel Griffith

Bio: J. Daniel Griffith is an academic researcher from the Massachusetts Institute of Technology. The author has contributed to research in the topics of reinforcement learning and collision avoidance. The author has an h-index of 7 and has co-authored 12 publications receiving 277 citations.

Papers
Journal ArticleDOI
TL;DR: A methodology for encounter model construction based on a Bayesian statistical framework connected to an extensive set of national radar data is described and examples of using several such high-fidelity models to evaluate the safety of collision avoidance systems for manned and unmanned aircraft are provided.
Abstract: Airspace encounter models, providing a statistical representation of geometries and aircraft behavior during a close encounter, are required to estimate the safety and robustness of collision avoidance systems. Prior encounter models, developed to certify the Traffic Alert and Collision Avoidance System, have been limited in their ability to capture important characteristics of encounters as revealed by recorded surveillance data, do not capture the current mix of aircraft types or noncooperative aircraft, and do not represent more recent airspace procedures. This paper describes a methodology for encounter model construction based on a Bayesian statistical framework connected to an extensive set of national radar data. In addition, this paper provides examples of using several such high-fidelity models to evaluate the safety of collision avoidance systems for manned and unmanned aircraft.

150 citations
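The Bayesian-network construction described above can be illustrated with a toy sampler: discretized encounter variables are drawn in topological order, each conditioned on its parents. This is a minimal sketch only; the variable names, dependency structure, and probabilities below are invented for illustration and are not taken from the Lincoln Laboratory models.

```python
import random

# Toy conditional probability tables (illustrative values only).
# Each variable is sampled given its parents, in topological order,
# mimicking how a Bayesian-network encounter model generates an
# initial encounter geometry.
P_airspace = {"terminal": 0.4, "enroute": 0.6}
P_altitude_given_airspace = {
    "terminal": {"low": 0.7, "high": 0.3},
    "enroute":  {"low": 0.2, "high": 0.8},
}
P_turn_given_altitude = {
    "low":  {"level": 0.6, "turning": 0.4},
    "high": {"level": 0.8, "turning": 0.2},
}

def sample(dist, rng):
    """Draw one value from a {value: probability} table."""
    r, acc = rng.random(), 0.0
    for value, p in dist.items():
        acc += p
        if r < acc:
            return value
    return value  # guard against floating-point rounding

def sample_encounter(rng):
    """Sample one encounter by walking the toy network root-to-leaf."""
    airspace = sample(P_airspace, rng)
    altitude = sample(P_altitude_given_airspace[airspace], rng)
    turn = sample(P_turn_given_altitude[altitude], rng)
    return {"airspace": airspace, "altitude": altitude, "turn": turn}

rng = random.Random(0)
encounters = [sample_encounter(rng) for _ in range(10000)]
```

Drawing many such samples yields a population of encounter geometries whose statistics match the fitted distributions, which is the role the real models play in fast-time safety simulation.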

01 Jan 2008
TL;DR: Lincoln Laboratory’s newer encounter models provide a higher-fidelity representation of encounters, are based on substantially more radar data, leverage a theoretical framework for finding optimal model structures, and reflect recent changes in the airspace.
Abstract: Collision avoidance systems play an important role in the future of aviation safety. Before new technologies on board manned or unmanned aircraft are deployed, rigorous analysis using encounter simulations is required to prove system robustness. These simulations rely on models that accurately reflect the geometries and dynamics of aircraft encounters at close range. These types of encounter models have been developed by several organizations since the early 1980s. Lincoln Laboratory’s newer encounter models, however, provide a higher-fidelity representation of encounters, are based on substantially more radar data, leverage a theoretical framework for finding optimal model structures, and reflect recent changes in the airspace.

41 citations

Journal ArticleDOI
TL;DR: This paper adapts MCTS and RHO to two problems, one inspired by tactical wildfire management and a classical problem involving the control of queueing networks, and undertakes an extensive computational study comparing the two methods on instances of both problems that are large in both the state and the action spaces.

29 citations

Proceedings Article
12 Feb 2016
TL;DR: A policy-based reinforcement learning approach that learns agent policies based solely on trajectories generated by previous interaction with the environment; the approach generates valid macro-action controllers via an expectation-maximization algorithm (Policy-based EM, or PoEM) with convergence guarantees for batch learning.
Abstract: Decentralized partially observable Markov decision processes (Dec-POMDPs) provide a general framework for multiagent sequential decision-making under uncertainty. Although Dec-POMDPs are typically intractable to solve for real-world problems, recent research on macro-actions (i.e., temporally-extended actions) has significantly increased the size of problems that can be solved. However, current methods assume the underlying Dec-POMDP model is known a priori or a full simulator is available during planning time. To accommodate more realistic scenarios, when such information is not available, this paper presents a policy-based reinforcement learning approach, which learns the agent policies based solely on trajectories generated by previous interaction with the environment (e.g., demonstrations). We show that our approach is able to generate valid macro-action controllers and develop an expectation-maximization (EM) algorithm (called Policy-based EM or PoEM), which has convergence guarantees for batch learning. Our experiments show PoEM is a scalable learning method that can learn optimal policies and improve upon hand-coded "expert" solutions.

21 citations

Proceedings ArticleDOI
13 Sep 2010
TL;DR: This paper explains how to use existing encounter models, a flight simulation framework, three-dimensional aircraft wireframe models, and surveillance data to estimate mid-air collision risk, and shows that 0.1 is an overly conservative estimate and that the true rate is likely to be an order of magnitude lower.
Abstract: Many aviation safety studies involve estimating near mid-air collision (NMAC) rate. In the past, it has been assumed that the probability that an NMAC leads to a mid-air collision is 0.1, but there has not yet been a comprehensive study to serve as a basis for this estimate. This paper explains how to use existing encounter models, a flight simulation framework, three-dimensional aircraft wireframe models, and surveillance data to estimate mid-air collision risk. The results show that 0.1 is an overly conservative estimate and that the true rate is likely to be an order of magnitude lower.

19 citations
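The estimation approach, simulating many near mid-air encounters and counting how often the aircraft actually collide, can be sketched with a crude Monte Carlo model that replaces the paper's wireframe geometry and radar-derived trajectories with uniform miss distances inside the NMAC volume and spherical aircraft. All numbers below are illustrative assumptions, not the paper's results.

```python
import random, math

def simulate_nmac_miss_distance(rng):
    """Sample a miss distance uniformly inside the NMAC cylinder
    (500 ft horizontal, 100 ft vertical): a crude stand-in for
    trajectories drawn from an encounter model."""
    h = 500.0 * math.sqrt(rng.random())  # uniform over the horizontal disk
    v = 100.0 * rng.random()
    return h, v

def is_collision(h, v, combined_radius_ft=30.0):
    """Treat both aircraft as a single sphere of combined radius
    (an illustrative simplification of wireframe intersection)."""
    return math.hypot(h, v) < combined_radius_ft

rng = random.Random(1)
n = 100_000
hits = sum(is_collision(*simulate_nmac_miss_distance(rng)) for _ in range(n))
p_collision_given_nmac = hits / n
```

Even this toy model makes the paper's qualitative point visible: only a small fraction of the NMAC volume corresponds to actual airframe contact, so P(collision | NMAC) comes out well below the traditional 0.1.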


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: Probability distributions and linear models for regression and classification are presented, along with neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, and a discussion of combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Book ChapterDOI
24 Jul 2017
TL;DR: In this paper, the authors present a scalable and efficient technique for verifying properties of deep neural networks (or providing counter-examples) based on the simplex method, extended to handle the non-convex Rectified Linear Unit (ReLU) activation function.
Abstract: Deep neural networks have emerged as a widely used and effective means for tackling complex, real-world problems. However, a major obstacle in applying them to safety-critical systems is the great difficulty in providing formal guarantees about their behavior. We present a novel, scalable, and efficient technique for verifying properties of deep neural networks (or providing counter-examples). The technique is based on the simplex method, extended to handle the non-convex Rectified Linear Unit (ReLU) activation function, which is a crucial ingredient in many modern neural networks. The verification procedure tackles neural networks as a whole, without making any simplifying assumptions. We evaluated our technique on a prototype deep neural network implementation of the next-generation airborne collision avoidance system for unmanned aircraft (ACAS Xu). Results show that our technique can successfully prove properties of networks that are an order of magnitude larger than the largest networks verified using existing methods.

1,332 citations
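A much simpler (and looser) technique than the paper's simplex-based method, interval bound propagation, illustrates how a property such as "the output stays below a threshold on an input box" can be proved soundly for a ReLU network. This is a sketch only: the tiny network and its weights are invented for the example, and the algorithm shown is not Reluplex itself.

```python
import numpy as np

def affine_bounds(lo, hi, W, b):
    """Propagate elementwise interval bounds through x -> W @ x + b."""
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    new_lo = W_pos @ lo + W_neg @ hi + b
    new_hi = W_pos @ hi + W_neg @ lo + b
    return new_lo, new_hi

def relu_bounds(lo, hi):
    """ReLU is monotone, so it maps interval endpoints directly."""
    return np.maximum(lo, 0), np.maximum(hi, 0)

def verify_output_below(layers, in_lo, in_hi, threshold):
    """Return True if the scalar network output is provably below
    `threshold` for every input in the box [in_lo, in_hi]."""
    lo, hi = in_lo, in_hi
    for i, (W, b) in enumerate(layers):
        lo, hi = affine_bounds(lo, hi, W, b)
        if i < len(layers) - 1:  # ReLU on hidden layers only
            lo, hi = relu_bounds(lo, hi)
    return bool(hi[0] < threshold)

# Tiny illustrative network: 2 inputs -> 2 hidden (ReLU) -> 1 output.
layers = [
    (np.array([[1.0, -1.0], [0.5, 0.5]]), np.array([0.0, 0.0])),
    (np.array([[1.0, 1.0]]), np.array([0.0])),
]
ok = verify_output_below(layers,
                         np.array([-1.0, -1.0]), np.array([1.0, 1.0]),
                         threshold=4.0)
```

Bound propagation can fail to prove properties that actually hold (the intervals are conservative); the paper's contribution is an exact procedure that splits ReLU cases within a simplex-style search, scaling to the ACAS Xu networks.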

Proceedings Article
06 Aug 2017
TL;DR: A decentralized single-task learning approach that is robust to concurrent interactions of teammates is introduced, and an approach for distilling single-task policies into a unified policy that performs well across multiple related tasks, without explicit provision of task identity, is presented.
Abstract: Many real-world tasks involve multiple agents with partial observability and limited communication. Learning is challenging in these settings due to local viewpoints of agents, which perceive the world as non-stationary due to concurrently-exploring teammates. Approaches that learn specialized policies for individual tasks face problems when applied to the real world: not only do agents have to learn and store distinct policies for each task, but in practice identities of tasks are often non-observable, making these approaches inapplicable. This paper formalizes and addresses the problem of multi-task multi-agent reinforcement learning under partial observability. We introduce a decentralized single-task learning approach that is robust to concurrent interactions of teammates, and present an approach for distilling single-task policies into a unified policy that performs well across multiple related tasks, without explicit provision of task identity.

358 citations
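The distillation step described above can be sketched as minimizing the divergence from each specialist (single-task) policy's action distribution to the unified policy's distribution over sampled states. This is a generic, hypothetical formulation for illustration, not the paper's exact Dec-POMDP training objective.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between two discrete action distributions,
    with a small epsilon for numerical safety."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    return float(np.sum(p * np.log(p / q)))

def distill_loss(teacher_probs, student_probs):
    """Average KL from each specialist (teacher) policy's action
    distribution to the unified (student) policy's distribution,
    taken over a batch of sampled states."""
    return float(np.mean([kl(t, s)
                          for t, s in zip(teacher_probs, student_probs)]))
```

Minimizing this loss with respect to the student's parameters pushes the single unified policy to reproduce each specialist's behavior, so that task identity never needs to be observed at execution time.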

Proceedings ArticleDOI
01 Sep 2016
TL;DR: A deep neural network is used to learn a complex non-linear function approximation of the lookup table, which reduces the required storage space by a factor of 1000 and surpasses the original table on the performance metrics and encounter sets evaluated here.
Abstract: One approach to designing the decision making logic for an aircraft collision avoidance system is to frame the problem as Markov decision process and optimize the system using dynamic programming. The resulting strategy can be represented as a numeric table. This methodology has been used in the development of the ACAS X family of collision avoidance systems for manned and unmanned aircraft. However, due to the high dimensionality of the state space, discretizing the state variables can lead to very large tables. To improve storage efficiency, we propose two approaches for compressing the lookup table. The first approach exploits redundancy in the table. The table is decomposed into a set of lower-dimensional tables, some of which can be represented by single tables in areas where the lower-dimensional tables are identical or nearly identical with respect to a similarity metric. The second approach uses a deep neural network to learn a complex non-linear function approximation of the table. With the use of an asymmetric loss function and a gradient descent algorithm, the parameters for this network can be trained to provide very accurate estimates of values while preserving the relative preferences of the possible advisories for each state. As a result, the table can be approximately represented by only the parameters of the network, which reduces the required storage space by a factor of 1000. Simulation studies show that system performance is very similar using either compressed table representation in place of the original table. Even though the neural network was trained directly on the original table, the network surpasses the original table on the performance metrics and encounter sets evaluated here.

244 citations
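The asymmetric-loss idea, penalizing the network more for erring on one side of a table value than the other so that the relative ranking of advisories is better preserved, can be sketched in a few lines. The weighting scheme and coefficients below are illustrative assumptions, not the loss actually used to train the ACAS X networks.

```python
import numpy as np

def asymmetric_loss(pred, target, under_weight=4.0, over_weight=1.0):
    """Squared error weighted more heavily when the network
    under-predicts the table value than when it over-predicts
    (the direction and magnitude of the asymmetry are tunable)."""
    err = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    w = np.where(err < 0, under_weight, over_weight)
    return float(np.mean(w * err ** 2))
```

With a symmetric squared error, small regression mistakes can flip which advisory looks best in a given state; weighting one error direction more steers the fit so the network's argmax tends to agree with the original table's.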

Journal ArticleDOI
TL;DR: This article provides an overview of the theory, algorithms, and applications of sensor management as it has developed over the past decades and as it stands today.
Abstract: Sensor systems typically operate under resource constraints that prevent the simultaneous use of all resources all of the time. Sensor management becomes relevant when the sensing system has the capability of actively managing these resources; i.e., changing its operating configuration during deployment in reaction to previous measurements. Examples of systems in which sensor management is currently used or is likely to be used in the near future include autonomous robots, surveillance and reconnaissance networks, and waveform-agile radars. This paper provides an overview of the theory, algorithms, and applications of sensor management as it has developed over the past decades and as it stands today.

209 citations