Learning to navigate through crowded environments
Citations
Human-aware robot navigation: A survey
Learning Social Etiquette: Human Trajectory Understanding In Crowded Scenes
Unfreezing the robot: Navigation in dense, interacting crowds
Human motion trajectory prediction: a survey
References
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
Gaussian Processes for Machine Learning
Statistics for spatial data
Frequently Asked Questions (19)
Q2. What have the authors stated for future works in "Learning to navigate through crowded environments" ?
The authors showed how to extend inverse reinforcement learning to deal with dynamic and partially observed features. With only a local sensor model, the authors use Gaussian processes to extend local information beyond the sensor range in a principled manner. In future work the authors hope to learn weights based on real-world crowds and to implement the system on an actual robot. The authors conjecture that their work produces more socially acceptable motion that will allow robots to perform tasks seamlessly in crowded environments.
Q3. What is the focus of other work?
Other work [10], [11] is focused on mimicking group crowd behavior by matching into a database of observed examples to generate plausible overall animations.
Q4. What is the effect of the Gaussian process model on the robots?
In a scenario with multiple robots distributed throughout the environment, the Gaussian process model will allow their local sensor readings to be combined across the map to give more accurate density and flow information, enabling better path planning for each robot individually.
Q5. How does the probability of a path depend on the cost of the path?
The probability of a path depends on the cost of the path, which is obtained by adding the weighted features of the partial paths from each time horizon.
Q6. How can the authors compute the gradients of the example paths?
By assuming that the person plans a path based on this information and sticks to this path for the next H time steps, the gradient can be computed based on the difference between observed and expected feature over the next H steps only, instead of the complete path.
Q7. What are the simulations of the crowd simulator?
These simulations also provide their mean density and velocity models, which the authors can use as initial estimates of dynamic feature values in unobserved areas of the grid for both training and planning.
Q8. What is the key idea behind inverse reinforcement learning?
In order for a robot to learn how to navigate a crowded space as humans do, the authors employ techniques based on maximum entropy inverse reinforcement learning (MaxEnt IRL) [22].
Q9. What is the purpose of the crowd simulator?
So-called “swarms” of people were given starting locations and goal regions instigating natural crowd flows through a simulated 3D space.
Q10. What is the probability mass of all paths from $s_{\text{start}}$ to $s_i$?
After executing the forward/backward algorithm, the authors now possess values $Z_{s_i}$ and $Z'_{s_i}$ for every state, where $Z_{s_i}$ is the accumulated probability mass of all paths from $s_{\text{start}}$ to $s_i$, and $Z'_{s_i}$ is the probability mass of all paths from $s_i$ to $s_{\text{goal}}$.
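The two quantities described above can be illustrated on a toy graph. The following is a minimal sketch, not the authors' implementation: it assumes an acyclic state graph given in topological order, and the names `forward_backward`, `edges`, and `order` are hypothetical. Each edge contributes a factor `exp(-cost)`, and mass is accumulated from the start (forward) and from the goal (backward).

```python
import math

def forward_backward(edges, order):
    """Accumulate path mass on a DAG (assumption: the graph is acyclic).

    edges: dict mapping (state, next_state) -> non-negative cost
    order: states in topological order, start first and goal last
    Returns (z_fwd, z_bwd): mass of all paths start -> s and s -> goal.
    """
    start, goal = order[0], order[-1]
    z_fwd = {s: 0.0 for s in order}
    z_bwd = {s: 0.0 for s in order}
    z_fwd[start] = 1.0
    z_bwd[goal] = 1.0
    # Forward pass: push mass along outgoing edges in topological order.
    for s in order:
        for (u, v), cost in edges.items():
            if u == s:
                z_fwd[v] += z_fwd[s] * math.exp(-cost)
    # Backward pass: pull mass along incoming edges in reverse order.
    for s in reversed(order):
        for (u, v), cost in edges.items():
            if v == s:
                z_bwd[u] += math.exp(-cost) * z_bwd[s]
    return z_fwd, z_bwd

# Hypothetical four-state example with two routes from A to D.
edges = {("A", "B"): 1.0, ("A", "C"): 2.0, ("B", "D"): 1.0, ("C", "D"): 0.5}
order = ["A", "B", "C", "D"]
z_fwd, z_bwd = forward_backward(edges, order)

# Probability that a sampled path uses each edge.
z_total = z_fwd["D"]
edge_prob = {
    (u, v): z_fwd[u] * math.exp(-c) * z_bwd[v] / z_total
    for (u, v), c in edges.items()
}
```

The forward mass at the goal and the backward mass at the start both equal the total mass of all start-to-goal paths, which makes a convenient sanity check. On a grid with cycles, an iterative variant would be needed instead of a single topological sweep.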
Q11. What is the local gradient at timestep t?
Their local gradient at timestep t is defined as:
$$\nabla F^t = \tilde{f}^t - \sum_{a_{i,j} \in H} D^t_{a_{i,j}} \, f^t_{a_{i,j}} \tag{8}$$
where all terms are now indexed by the timestep t.
Q12. How can the authors obtain the expected feature counts?
The expected feature counts can be obtained by multiplying the probability of each action by the features for that action, summed over all actions.
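As a small worked illustration of that sum, the sketch below multiplies each action's probability by its feature vector and adds the results. The action names, probabilities, and feature values are made up for the example, and `expected_feature_counts` is a hypothetical helper name.

```python
def expected_feature_counts(action_probs, action_features):
    """Return sum over actions of P(action) * features(action).

    action_probs:    dict action -> probability of taking that action
    action_features: dict action -> list of feature values for that action
    """
    n_features = len(next(iter(action_features.values())))
    expected = [0.0] * n_features
    for action, prob in action_probs.items():
        for k, value in enumerate(action_features[action]):
            expected[k] += prob * value
    return expected

# Hypothetical example: two features per action (e.g. density, flow speed).
action_probs = {"left": 0.25, "forward": 0.5, "right": 0.25}
action_features = {
    "left": [1.0, 0.0],
    "forward": [0.0, 2.0],
    "right": [1.0, 4.0],
}
expected = expected_feature_counts(action_probs, action_features)
```

Here the first feature averages to 0.25 + 0.25 = 0.5 and the second to 1.0 + 1.0 = 2.0.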
Q13. Where have mobile robots been deployed successfully in crowded environments?
Over the last decade, several mobile robots have been deployed successfully in crowded environments such as museums [3], railway stations [15], and exhibits [18], [19].
Q14. What is the robot's representation in the crowd simulator?
The robot is represented in the crowd simulator as a single person, which allows the simulated crowds to react naturally to the presence and motion of the robot.
Q15. What are the traces of the crowd simulator?
For training data, the authors extract demonstration traces from the crowd simulator, where the traces consist of grid cells, directions, and local observations of dynamic flow features.
Q16. How do the authors evaluate the GP over the entire environment?
The authors can then evaluate the GP over the entire environment and the model will produce estimates of mean and variance, nicely integrating areas of dense and sparse data coverage.
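To make the mean-and-variance behaviour concrete, here is a minimal Gaussian process regression sketch using NumPy. It is an illustration under stated assumptions, not the authors' model: the squared-exponential kernel, the constant prior mean, the noise level, and all numeric values are assumptions chosen for the example, and `rbf_kernel` and `gp_predict` are hypothetical helper names.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two sets of 2D points."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(x_obs, y_obs, x_query, prior_mean=0.0, noise_var=1e-2,
               length_scale=1.0, signal_var=1.0):
    """Posterior mean and variance of a GP at the query points."""
    k_oo = rbf_kernel(x_obs, x_obs, length_scale, signal_var)
    k_oo = k_oo + noise_var * np.eye(len(x_obs))
    k_qo = rbf_kernel(x_query, x_obs, length_scale, signal_var)
    # Mean: prior mean plus a correction driven by nearby observations.
    alpha = np.linalg.solve(k_oo, y_obs - prior_mean)
    mean = prior_mean + k_qo @ alpha
    # Variance: prior variance minus what the observations explain.
    v = np.linalg.solve(k_oo, k_qo.T)
    var = signal_var - np.einsum("ij,ji->i", k_qo, v)
    return mean, var

# Hypothetical density readings from cells near the robot.
x_obs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y_obs = np.array([0.8, 0.6, 0.7])
# Query an observed cell, a nearby cell, and a cell far outside sensor range.
x_query = np.array([[0.0, 0.0], [2.0, 2.0], [10.0, 10.0]])
mean, var = gp_predict(x_obs, y_obs, x_query, prior_mean=0.3)
```

In this sketch the variance is small at the observed cell and grows with distance from the data, while the mean far from any observation falls back to the prior mean. That is one way a prior density model could serve as the estimate in unobserved areas.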
Q17. What is the gradient used to update the weights?
The gradient $\nabla F^t$ is used to update the weights as in equation (5), but now the weights θ are updated t times for each training path.
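The per-timestep update can be sketched as a loop over the timesteps of one training path. Equation (5) is not reproduced on this page, so the update rule below is an assumption: a plain additive step with a hypothetical learning rate `lr`, with the sign chosen so that features the demonstrator uses more than the model expects become cheaper under a cost formulation. The callable `expected_fn` is a hypothetical placeholder for whatever procedure yields the expected feature counts over the next H steps.

```python
def local_gradient(observed, expected):
    """Observed minus expected feature counts over the horizon."""
    return [o - e for o, e in zip(observed, expected)]

def update_weights(theta, grad, lr=0.1):
    """Assumed additive step; lowers the cost weight of under-predicted features."""
    return [w - lr * g for w, g in zip(theta, grad)]

def train_on_path(theta, observed_per_t, expected_fn, lr=0.1):
    """Apply one weight update per timestep of a single training path."""
    for t, observed in enumerate(observed_per_t):
        expected = expected_fn(theta, t)  # hypothetical: expected counts at t
        theta = update_weights(theta, local_gradient(observed, expected), lr)
    return theta

# Hypothetical one-step example with a fixed expected-count function.
theta0 = [1.0, 1.0]
observed_per_t = [[2.0, 0.0]]
theta1 = train_on_path(theta0, observed_per_t, lambda theta, t: [1.0, 1.0])
```

With these made-up numbers the gradient is [1.0, -1.0], so the first weight drops to 0.9 and the second rises to 1.1.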
Q18. What is the probability of a path at timestep t?
Letting $a_t$ represent the action of τ at timestep t, the probability of a path τ is now given as:
$$P(\tau \mid \theta) = \frac{1}{Z(\theta)} \, e^{\sum_t \sum_{0 \le h < H} -\theta \cdot f^t_{a_{t+h}}} \tag{7}$$
Because the features are dynamic, the authors compute a gradient at every timestep of an example path and only H timesteps into the future, as compared with (4) in the original IRL formulation, which computes a single gradient for the entire path.
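The double sum in the exponent can be computed directly. The sketch below is illustrative only: the layout `features[t][k]` (the feature vector, as estimated at timestep t, of the action taken at path step k) is an assumed data structure, and truncating the horizon at the end of the path is an assumption the text does not state. The function `horizon_cost` is a hypothetical name; the unnormalized log-probability of the path is its negative.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def horizon_cost(theta, features, horizon):
    """Sum over t and 0 <= h < horizon of theta . f^t_{a_{t+h}}.

    features[t][k]: feature vector, as estimated at timestep t,
                    of the action taken at path step k (assumed layout).
    """
    n_steps = len(features)
    total = 0.0
    for t in range(n_steps):
        for h in range(horizon):
            k = t + h
            if k >= n_steps:  # assumption: cut the horizon at the path end
                break
            total += dot(theta, features[t][k])
    return total

# Hypothetical 3-step path with one constant feature and horizon H = 2.
theta = [2.0]
features = [[[1.0] for _ in range(3)] for _ in range(3)]
cost = horizon_cost(theta, features, horizon=2)
log_unnormalized_prob = -cost
```

For this toy path the (t, t+h) pairs are (0,0), (0,1), (1,1), (1,2), (2,2), so five terms of 2.0 each are summed.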
Q19. What does the lower panel show about the robot's path?
In the lower panel, the authors can see that the robot has continued to move with the correct, rightward moving flow, as it has obtained additional dynamic feature information as it moves forward.