
Showing papers on "Representation (systemics)" published in 2018


Book ChapterDOI
08 Sep 2018
TL;DR: In this paper, the authors argue that the visual and audio components of a video signal should be modeled jointly using a fused multisensory representation, and they propose to learn such a representation in a self-supervised way, by training a neural network to predict whether video frames and audio are temporally aligned.
Abstract: The thud of a bouncing ball, the onset of speech as lips open—when visual and audio events occur together, it suggests that there might be a common, underlying event that produced both signals. In this paper, we argue that the visual and audio components of a video signal should be modeled jointly using a fused multisensory representation. We propose to learn such a representation in a self-supervised way, by training a neural network to predict whether video frames and audio are temporally aligned. We use this learned representation for three applications: (a) sound source localization, i.e. visualizing the source of sound in a video; (b) audio-visual action recognition; and (c) on/off-screen audio source separation, e.g. removing the off-screen translator’s voice from a foreign official’s speech. Code, models, and video results are available on our webpage: http://andrewowens.com/multisensory.
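
A minimal sketch of the self-supervised objective described above (not the authors' architecture): a small fused network is trained to predict whether a video clip and an audio clip are temporally aligned, with misaligned negatives produced by pairing each clip with another clip's audio. Tensor shapes and layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AlignmentNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.video = nn.Sequential(            # frames: (B, 3, T, H, W)
            nn.Conv3d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1))
        self.audio = nn.Sequential(            # log-mel spectrogram: (B, 1, T_a, F)
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(32, 1)           # fused multisensory feature -> aligned?

    def forward(self, frames, spec):
        v = self.video(frames).flatten(1)
        a = self.audio(spec).flatten(1)
        return self.head(torch.cat([v, a], dim=1)).squeeze(1)

# Self-supervised labels: real pairs are positives, audio rolled within the batch
# gives "misaligned" negatives -- no human annotation required.
net, loss_fn = AlignmentNet(), nn.BCEWithLogitsLoss()
frames = torch.randn(4, 3, 8, 64, 64)
spec = torch.randn(4, 1, 32, 40)
logits_pos = net(frames, spec)
logits_neg = net(frames, spec.roll(1, dims=0))   # pair each clip with another clip's audio
loss = loss_fn(logits_pos, torch.ones(4)) + loss_fn(logits_neg, torch.zeros(4))
loss.backward()
```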

652 citations


Book ChapterDOI
08 Sep 2018
TL;DR: In this paper, a simple integral operation is shown to relate and unify the heat map representation and joint regression, thus avoiding the non-differentiable post-processing and quantization error of heat map based human pose estimation.
Abstract: State-of-the-art human pose estimation methods are based on heat map representation. In spite of the good performance, the representation has a few issues in nature, such as non-differentiable post-processing and quantization error. This work shows that a simple integral operation relates and unifies the heat map representation and joint regression, thus avoiding the above issues. It is differentiable, efficient, and compatible with any heat map based methods. Its effectiveness is convincingly validated via comprehensive ablation experiments under various settings, specifically on 3D pose estimation, for the first time.
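
A minimal sketch of the integral ("soft-argmax") operation the abstract describes: the heat map is normalized with a softmax and the joint coordinate is recovered as the expected value over pixel locations, which is differentiable and avoids the quantization error of a hard argmax. Shapes are illustrative.

```python
import torch

def integral_regression(heatmaps):
    """heatmaps: (B, J, H, W) -> joint coordinates (B, J, 2) in pixel units."""
    b, j, h, w = heatmaps.shape
    probs = torch.softmax(heatmaps.reshape(b, j, -1), dim=-1).reshape(b, j, h, w)
    ys = torch.arange(h, dtype=probs.dtype)            # row indices
    xs = torch.arange(w, dtype=probs.dtype)            # column indices
    x = (probs.sum(dim=2) * xs).sum(dim=-1)            # expectation over columns
    y = (probs.sum(dim=3) * ys).sum(dim=-1)            # expectation over rows
    return torch.stack([x, y], dim=-1)

coords = integral_regression(torch.randn(2, 17, 64, 48))   # e.g. 17 COCO joints
print(coords.shape)   # torch.Size([2, 17, 2])
```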

536 citations


BookDOI
01 Sep 2018
TL;DR: The Oxford Handbook of 4E Cognition, as mentioned in this paper, provides a systematic overview of the state of the art in the field of 4E cognition, including recent trends such as Bayesian inference and predictive coding, new insights and findings regarding social understanding, and new theoretical paradigms for understanding emotions and conceptualizing the interaction between cognition, language and culture.
Abstract: The Oxford Handbook of 4E Cognition provides a systematic overview of the state of the art in the field of 4E cognition: it includes chapters on hotly debated topics, for example, on the nature of cognition and the relation between cognition, perception and action; it discusses recent trends such as Bayesian inference and predictive coding; it presents new insights and findings regarding social understanding, including the development of false belief understanding, and introduces new theoretical paradigms for understanding emotions and conceptualizing the interaction between cognition, language and culture. Each thematic section ends with a critical note to foster fruitful discussion. In addition, the final section of the book is dedicated to applications of 4E cognition approaches in disciplines such as psychiatry and robotics. This is a book with high relevance for philosophers, psychologists, psychiatrists, and neuroscientists, as well as for anyone with an interest in the study of cognition and a wider audience interested in 4E cognition approaches.

391 citations


Posted Content
TL;DR: An in-depth review of recent advances in representation learning with a focus on autoencoder-based models, organized around meta-priors believed useful for downstream tasks, such as disentanglement and hierarchical organization of features.
Abstract: Learning useful representations with little or no supervision is a key challenge in artificial intelligence. We provide an in-depth review of recent advances in representation learning with a focus on autoencoder-based models. To organize these results we make use of meta-priors believed useful for downstream tasks, such as disentanglement and hierarchical organization of features. In particular, we uncover three main mechanisms to enforce such properties, namely (i) regularizing the (approximate or aggregate) posterior distribution, (ii) factorizing the encoding and decoding distribution, or (iii) introducing a structured prior distribution. While there are some promising results, implicit or explicit supervision remains a key enabler and all current methods use strong inductive biases and modeling assumptions. Finally, we provide an analysis of autoencoder-based representation learning through the lens of rate-distortion theory and identify a clear tradeoff between the amount of prior knowledge available about the downstream tasks, and how useful the representation is for this task.
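
A minimal sketch of mechanism (i) above, regularizing the approximate posterior of an autoencoder, in the style of a beta-VAE. The architecture and the beta value are illustrative assumptions, not a specific model from the survey.

```python
import torch
import torch.nn as nn

class SmallVAE(nn.Module):
    def __init__(self, d_in=784, d_z=10):
        super().__init__()
        self.enc = nn.Linear(d_in, 2 * d_z)     # outputs mean and log-variance
        self.dec = nn.Linear(d_z, d_in)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization
        return self.dec(z), mu, logvar

def beta_vae_loss(x, recon, mu, logvar, beta=4.0):
    recon_term = ((recon - x) ** 2).sum(dim=-1).mean()
    # KL(q(z|x) || N(0, I)) -- the posterior regularizer that encourages disentanglement
    kl = (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp())).sum(dim=-1).mean()
    return recon_term + beta * kl

model = SmallVAE()
x = torch.rand(32, 784)
loss = beta_vae_loss(x, *model(x))
loss.backward()
```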

307 citations


Journal ArticleDOI
TL;DR: This survey covers the recent state of the art in state representation learning (SRL), reviewing SRL methods that involve interaction with the environment, their implementations, and their applications to robotics control tasks (simulated or real).

274 citations


Posted Content
TL;DR: It is argued that the visual and audio components of a video signal should be modeled jointly using a fused multisensory representation, and it is proposed to learn such a representation in a self-supervised way, by training a neural network to predict whether video frames and audio are temporally aligned.
Abstract: The thud of a bouncing ball, the onset of speech as lips open -- when visual and audio events occur together, it suggests that there might be a common, underlying event that produced both signals. In this paper, we argue that the visual and audio components of a video signal should be modeled jointly using a fused multisensory representation. We propose to learn such a representation in a self-supervised way, by training a neural network to predict whether video frames and audio are temporally aligned. We use this learned representation for three applications: (a) sound source localization, i.e. visualizing the source of sound in a video; (b) audio-visual action recognition; and (c) on/off-screen audio source separation, e.g. removing the off-screen translator's voice from a foreign official's speech. Code, models, and video results are available on our webpage: http://andrewowens.com/multisensory

252 citations


Journal ArticleDOI
TL;DR: The results establish a principled link between high-level actions and abstract representations, a concrete theoretical foundation for constructing abstract representations with provable properties, and a practical mechanism for autonomously learning abstract high- level representations.
Abstract: We consider the problem of constructing abstract representations for planning in high-dimensional, continuous environments. We assume an agent equipped with a collection of high-level actions, and construct representations provably capable of evaluating plans composed of sequences of those actions. We first consider the deterministic planning case, and show that the relevant computation involves set operations performed over sets of states. We define the specific collection of sets that is necessary and sufficient for planning, and use them to construct a grounded abstract symbolic representation that is provably suitable for deterministic planning. The resulting representation can be expressed in PDDL, a canonical high-level planning domain language; we construct such a representation for the Playroom domain and solve it in milliseconds using an off-the-shelf planner. We then consider probabilistic planning, which we show requires generalizing from sets of states to distributions over states. We identify the specific distributions required for planning, and use them to construct a grounded abstract symbolic representation that correctly estimates the expected reward and probability of success of any plan. In addition, we show that learning the relevant probability distributions corresponds to specific instances of probabilistic density estimation and probabilistic classification. We construct an agent that autonomously learns the correct abstract representation of a computer game domain, and rapidly solves it. Finally, we apply these techniques to create a physical robot system that autonomously learns its own symbolic representation of a mobile manipulation task directly from sensorimotor data---point clouds, map locations, and joint angles---and then plans using that representation. Together, these results establish a principled link between high-level actions and abstract representations, a concrete theoretical foundation for constructing abstract representations with provable properties, and a practical mechanism for autonomously learning abstract high-level representations.
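
A toy sketch of the set-operation view of deterministic planning described above: each high-level action is summarized by the set of states in which it can run (precondition) and the set of states it can produce (image). A plan is feasible from a start set if, chaining these sets, every step's precondition is reachable. The states, actions, and sets here are illustrative toy assumptions, not the paper's domains.

```python
def plan_is_feasible(start_states, plan, actions):
    reachable = set(start_states)
    for name in plan:
        precondition, image = actions[name]
        if not (reachable & precondition):      # no reachable state satisfies the precondition
            return False
        reachable = set(image)                  # after the action, only its image is reachable
    return True

# Toy domain: states are labels; "open_door" requires being at the door.
actions = {
    "walk_to_door": ({"start"}, {"at_door"}),
    "open_door":    ({"at_door"}, {"door_open"}),
}
print(plan_is_feasible({"start"}, ["walk_to_door", "open_door"], actions))  # True
print(plan_is_feasible({"start"}, ["open_door"], actions))                  # False
```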

234 citations


Book ChapterDOI
08 Sep 2018
TL;DR: In this article, a weakly supervised method for 3D human pose estimation is proposed that learns a geometry-aware body representation from multi-view images without annotations, reducing the number of samples with 3D annotations needed for learning to succeed.
Abstract: Modern 3D human pose estimation techniques rely on deep networks, which require large amounts of training data. While weakly-supervised methods require less supervision, by utilizing 2D poses or multi-view imagery without annotations, they still need a sufficiently large set of samples with 3D annotations for learning to succeed. In this paper, we propose to overcome this problem by learning a geometry-aware body representation from multi-view images without annotations.

211 citations


23 Oct 2018
TL;DR: In this article, a self-supervised dense object representation for visual understanding and manipulation is proposed, which can be trained in approximately 20 minutes for a wide variety of previously unseen and potentially non-rigid objects.
Abstract: What is the right object representation for manipulation? We would like robots to visually perceive scenes and learn an understanding of the objects in them that (i) is task-agnostic and can be used as a building block for a variety of manipulation tasks, (ii) is generally applicable to both rigid and non-rigid objects, (iii) takes advantage of the strong priors provided by 3D vision, and (iv) is entirely learned from self-supervision. This is hard to achieve with previous methods: much recent work in grasping does not extend to grasping specific objects or other tasks, whereas task-specific learning may require many trials to generalize well across object configurations or other tasks. In this paper we present Dense Object Nets, which build on recent developments in self-supervised dense descriptor learning, as a consistent object representation for visual understanding and manipulation. We demonstrate they can be trained quickly (approximately 20 minutes) for a wide variety of previously unseen and potentially non-rigid objects. We additionally present novel contributions to enable multi-object descriptor learning, and show that by modifying our training procedure, we can either acquire descriptors which generalize across classes of objects, or descriptors that are distinct for each object instance. Finally, we demonstrate the novel application of learned dense descriptors to robotic manipulation. We demonstrate grasping of specific points on an object across potentially deformed object configurations, and demonstrate using class general descriptors to transfer specific grasps across objects in a class.
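
A minimal sketch of how a dense descriptor image (as produced by Dense Object Nets) can be used for correspondence: given a pixel in image A, find the pixel in image B whose descriptor is closest. The descriptor tensors here are random stand-ins; in practice they would come from the trained fully-convolutional descriptor network.

```python
import numpy as np

def best_match(desc_a, desc_b, u, v):
    """desc_*: (H, W, D) descriptor images; (u, v) = (row, col) query pixel in image A."""
    query = desc_a[u, v]                                    # (D,) descriptor at the query pixel
    dists = np.linalg.norm(desc_b - query, axis=-1)         # (H, W) distance field over image B
    return np.unravel_index(np.argmin(dists), dists.shape)  # best-matching pixel in image B

desc_a = np.random.rand(60, 80, 3).astype(np.float32)
desc_b = np.random.rand(60, 80, 3).astype(np.float32)
print(best_match(desc_a, desc_b, u=10, v=20))
```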

184 citations



Proceedings ArticleDOI
Kyle Gao, Achille Fokoue, Heng Luo, Arun Iyengar, Sanjoy Dey, Ping Zhang
01 Jul 2018
TL;DR: This work proposes an end-to-end neural network model that predicts DTIs directly from low-level representations and provides biological interpretation using a two-way attention mechanism.
Abstract: The identification of drug-target interactions (DTIs) is a key task in drug discovery, where drugs are chemical compounds and targets are proteins. Traditional DTI prediction methods are either time consuming (simulation-based methods) or heavily dependent on domain expertise (similarity-based and feature-based methods). In this work, we propose an end-to-end neural network model that predicts DTIs directly from low-level representations. In addition to making predictions, our model provides biological interpretation using a two-way attention mechanism. Instead of using simplified settings where a dataset is evaluated as a whole, we designed an evaluation dataset from BindingDB following more realistic settings where predictions of unobserved examples (proteins and drugs) have to be made. We experimentally compared our model with matrix factorization, similarity-based methods, and a previous deep learning approach. Overall, the results show that our model outperforms other approaches without requiring domain knowledge and feature engineering. In a case study, we illustrated the ability of our approach to provide biological insights to interpret the predictions.
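
A minimal sketch of the end-to-end idea described above: embed the raw drug (e.g. SMILES tokens) and protein (residue) sequences, let each attend over the other ("two-way" attention), and predict interaction from the pooled result. Vocabulary sizes, dimensions, and layers are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class TwoWayDTI(nn.Module):
    def __init__(self, n_drug_tokens=64, n_residues=26, d=32):
        super().__init__()
        self.drug_emb = nn.Embedding(n_drug_tokens, d)
        self.prot_emb = nn.Embedding(n_residues, d)
        self.drug_to_prot = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
        self.prot_to_drug = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
        self.out = nn.Linear(2 * d, 1)

    def forward(self, drug_ids, prot_ids):
        dr, pr = self.drug_emb(drug_ids), self.prot_emb(prot_ids)
        d_ctx, d_attn = self.drug_to_prot(dr, pr, pr)   # drug tokens attend to residues
        p_ctx, p_attn = self.prot_to_drug(pr, dr, dr)   # residues attend to drug tokens
        pooled = torch.cat([d_ctx.mean(1), p_ctx.mean(1)], dim=-1)
        # the attention maps give per-token/per-residue interpretability
        return self.out(pooled).squeeze(-1), (d_attn, p_attn)

logits, attn = TwoWayDTI()(torch.randint(0, 64, (8, 40)), torch.randint(0, 26, (8, 200)))
```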

Journal ArticleDOI
TL;DR: This work provides the first evidence for separable scales of representation along the human hippocampal anteroposterior axis by showing greater similarity among voxel time courses and higher temporal autocorrelation in anterior hippocampus (aHPC) relative to posterior hippocampus (pHPC), the human homologs of ventral and dorsal rodent hippocampus.

Journal ArticleDOI
TL;DR: A review of different neuro-fuzzy systems based on the classification of research articles from 2000 to 2017 is proposed to help readers gain a general overview of the state of the art of neuro-fuzzy systems and easily identify suitable methods according to their research interests.
Abstract: Neuro-fuzzy systems have attracted the growing interest of researchers in various scientific and engineering areas due to their effective learning and reasoning capabilities. Neuro-fuzzy systems combine the learning power of artificial neural networks with the explicit knowledge representation of fuzzy inference systems. This paper proposes a review of different neuro-fuzzy systems based on the classification of research articles from 2000 to 2017. The main purpose of this survey is to help readers gain a general overview of the state of the art of neuro-fuzzy systems and easily identify suitable methods according to their research interests. Different neuro-fuzzy models are compared, and a table is presented summarizing the different learning structures and learning criteria with their applications.
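
A minimal sketch of the combination the abstract describes: fuzzy rules with learnable parameters trained by gradient descent, here as a tiny Takagi-Sugeno-style layer. The number of rules and the Gaussian memberships are illustrative assumptions, not any specific system from the review.

```python
import torch
import torch.nn as nn

class TinyNeuroFuzzy(nn.Module):
    def __init__(self, n_inputs=2, n_rules=4):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_rules, n_inputs))   # membership centers
        self.widths = nn.Parameter(torch.ones(n_rules, n_inputs))     # membership widths
        self.consequents = nn.Linear(n_inputs, n_rules)               # one linear output per rule

    def forward(self, x):                                    # x: (B, n_inputs)
        diff = x.unsqueeze(1) - self.centers                 # (B, n_rules, n_inputs)
        membership = torch.exp(-(diff / self.widths) ** 2)   # Gaussian membership per input
        firing = membership.prod(dim=-1)                     # rule firing strengths (B, n_rules)
        weights = firing / firing.sum(dim=-1, keepdim=True).clamp_min(1e-9)
        return (weights * self.consequents(x)).sum(dim=-1)   # weighted rule outputs

model = TinyNeuroFuzzy()
y = model(torch.randn(16, 2))   # trainable end to end with any regression loss
```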

Proceedings Article
26 Apr 2018
TL;DR: Results show that the proposed reinforcement learning method can learn task-friendly representations by identifying important words or task-relevant structures without explicit structure annotations, and thus yields competitive performance.
Abstract: Representation learning is a fundamental problem in natural language processing. This paper studies how to learn a structured representation for text classification. Unlike most existing representation models that either use no structure or rely on pre-specified structures, we propose a reinforcement learning (RL) method to learn sentence representation by discovering optimized structures automatically. We demonstrate two attempts to build structured representation: Information Distilled LSTM (ID-LSTM) and Hierarchically Structured LSTM (HS-LSTM). ID-LSTM selects only important, task-relevant words, and HS-LSTM discovers phrase structures in a sentence. Structure discovery in the two representation models is formulated as a sequential decision problem: current decision of structure discovery affects following decisions, which can be addressed by policy gradient RL. Results show that our method can learn task-friendly representations by identifying important words or task-relevant structures without explicit structure annotations, and thus yields competitive performance.
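
A minimal sketch of the ID-LSTM idea as described above: a policy decides which words to keep, the kept words form the sentence representation, and the classification reward trains the policy with a REINFORCE-style update. For brevity the policy scores word embeddings directly instead of running an LSTM; all sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

emb = nn.Embedding(1000, 32)
policy = nn.Linear(32, 1)          # keep/drop logit per word
classifier = nn.Linear(32, 2)      # sentence representation -> class logits

tokens = torch.randint(0, 1000, (1, 12))            # one 12-word sentence
label = torch.tensor([1])

e = emb(tokens)                                      # (1, 12, 32)
keep_prob = torch.sigmoid(policy(e)).squeeze(-1)     # (1, 12)
keep = torch.bernoulli(keep_prob)                    # sampled structure: which words survive
rep = (e * keep.unsqueeze(-1)).sum(1) / keep.sum(1, keepdim=True).clamp_min(1.0)

log_probs = torch.log(keep * keep_prob + (1 - keep) * (1 - keep_prob) + 1e-9).sum(1)
reward = -nn.functional.cross_entropy(classifier(rep), label)    # better classification = higher reward
loss = -(reward.detach() * log_probs).mean() + nn.functional.cross_entropy(classifier(rep), label)
loss.backward()    # policy-gradient update for the structure policy + supervised classifier update
```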

Journal ArticleDOI
TL;DR: The model accounts for executive influences on semantics by including a controlled retrieval mechanism that provides top-down input to amplify weak semantic relationships and successfully codes knowledge for abstract and concrete words, associative and taxonomic relationships, and the multiple meanings of homonyms, within a single representational space.
Abstract: Semantic cognition requires conceptual representations shaped by verbal and nonverbal experience and executive control processes that regulate activation of knowledge to meet current situational demands. A complete model must also account for the representation of concrete and abstract words, of taxonomic and associative relationships, and for the role of context in shaping meaning. We present the first major attempt to assimilate all of these elements within a unified, implemented computational framework. Our model combines a hub-and-spoke architecture with a buffer that allows its state to be influenced by prior context. This hybrid structure integrates the view, from cognitive neuroscience, that concepts are grounded in sensory-motor representation with the view, from computational linguistics, that knowledge is shaped by patterns of lexical co-occurrence. The model successfully codes knowledge for abstract and concrete words, associative and taxonomic relationships, and the multiple meanings of homonyms, within a single representational space. Knowledge of abstract words is acquired through (a) their patterns of co-occurrence with other words and (b) acquired embodiment, whereby they become indirectly associated with the perceptual features of co-occurring concrete words. The model accounts for executive influences on semantics by including a controlled retrieval mechanism that provides top-down input to amplify weak semantic relationships. The representational and control elements of the model can be damaged independently, and the consequences of such damage closely replicate effects seen in neuropsychological patients with loss of semantic representation versus control processes. Thus, the model provides a wide-ranging and neurally plausible account of normal and impaired semantic cognition.

Proceedings ArticleDOI
15 Oct 2018
TL;DR: A novel Attribute-Aware Attention Model is proposed, which can learn local attribute representations and a global category representation simultaneously in an end-to-end manner, yielding features that contain more intrinsic information for image recognition instead of noisy and irrelevant features.
Abstract: How to learn a discriminative fine-grained representation is a key point in many computer vision applications, such as person re-identification, fine-grained classification, fine-grained image retrieval, etc. Most of the previous methods focus on learning metrics or ensembles to derive a better global representation, which usually lacks local information. Based on the considerations above, we propose a novel Attribute-Aware Attention Model ($A^3M$), which can learn local attribute representation and global category representation simultaneously in an end-to-end manner. The proposed model contains two attention models: the attribute-guided attention module uses attribute information to help select category features in different regions, while the category-guided attention module selects local features of different attributes with the help of category cues. Through this attribute-category reciprocal process, local and global features benefit from each other. Finally, the resulting feature contains more intrinsic information for image recognition instead of noisy and irrelevant features. Extensive experiments conducted on Market-1501, CompCars, CUB-200-2011 and CARS196 demonstrate the effectiveness of our $A^3M$.
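
A minimal sketch of the attribute-guided half of the attention scheme described above: an attribute embedding produces spatial attention over a convolutional feature map, and the attended feature feeds the category classifier; the reverse (category-guided) branch would mirror this. All dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttributeGuidedAttention(nn.Module):
    def __init__(self, n_attributes=30, channels=256, n_classes=100):
        super().__init__()
        self.attr_emb = nn.Embedding(n_attributes, channels)
        self.classify = nn.Linear(channels, n_classes)

    def forward(self, feat_map, attr_id):
        # feat_map: (B, C, H, W) from a CNN backbone; attr_id: (B,) attribute index
        q = self.attr_emb(attr_id)                          # (B, C) attribute query
        scores = torch.einsum("bc,bchw->bhw", q, feat_map)  # attribute/region affinity
        attn = torch.softmax(scores.flatten(1), dim=1).reshape_as(scores)
        attended = torch.einsum("bhw,bchw->bc", attn, feat_map)
        return self.classify(attended)

logits = AttributeGuidedAttention()(torch.randn(4, 256, 7, 7), torch.randint(0, 30, (4,)))
```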

Journal ArticleDOI
04 Sep 2018-eLife
TL;DR: This account shows how previously reported neural responses can map onto higher cognitive function in a modular way, predicts new cell types (egocentric and head-direction-modulated boundary- and object-vector cells), and predicts how these neural populations should interact across multiple brain regions to support spatial memory.
Abstract: We present a model of how neural representations of egocentric spatial experiences in parietal cortex interface with viewpoint-independent representations in medial temporal areas, via retrosplenial cortex, to enable many key aspects of spatial cognition. This account shows how previously reported neural responses (place, head-direction and grid cells, allocentric boundary- and object-vector cells, gain-field neurons) can map onto higher cognitive function in a modular way, and predicts new cell types (egocentric and head-direction-modulated boundary- and object-vector cells). The model predicts how these neural populations should interact across multiple brain regions to support spatial memory, scene construction, novelty-detection, 'trace cells', and mental navigation. Simulated behavior and firing rate maps are compared to experimental data, for example showing how object-vector cells allow items to be remembered within a contextual representation based on environmental boundaries, and how grid cells could update the viewpoint in imagery during planning and short-cutting by driving sequential place cell activity.


Proceedings ArticleDOI
18 Jun 2018
TL;DR: Charades-Ego is introduced, a large-scale dataset of paired first-person and third-person videos involving 112 people and 4000 paired videos, which enables learning the link between the actor and observer perspectives and addresses one of the biggest bottlenecks facing egocentric vision research.
Abstract: Several theories in cognitive neuroscience suggest that when people interact with the world, or simulate interactions, they do so from a first-person egocentric perspective, and seamlessly transfer knowledge between third-person (observer) and first-person (actor). Despite this, learning such models for human action recognition has not been achievable due to the lack of data. This paper takes a step in this direction, with the introduction of Charades-Ego, a large-scale dataset of paired first-person and third-person videos, involving 112 people, with 4000 paired videos. This enables learning the link between the two, actor and observer perspectives. Thereby, we address one of the biggest bottlenecks facing egocentric vision research, providing a link from first-person to the abundant third-person data on the web. We use this data to learn a joint representation of first and third-person videos, with only weak supervision, and show its effectiveness for transferring knowledge from the third-person to the first-person domain.

Posted Content
TL;DR: In this paper, a geometry-aware body representation is learned from multi-view images without annotations, using an encoder-decoder that predicts an image from one viewpoint given an image from another viewpoint.
Abstract: Modern 3D human pose estimation techniques rely on deep networks, which require large amounts of training data. While weakly-supervised methods require less supervision, by utilizing 2D poses or multi-view imagery without annotations, they still need a sufficiently large set of samples with 3D annotations for learning to succeed. In this paper, we propose to overcome this problem by learning a geometry-aware body representation from multi-view images without annotations. To this end, we use an encoder-decoder that predicts an image from one viewpoint given an image from another viewpoint. Because this representation encodes 3D geometry, using it in a semi-supervised setting makes it easier to learn a mapping from it to 3D human pose. As evidenced by our experiments, our approach significantly outperforms fully-supervised methods given the same amount of labeled data, and improves over other semi-supervised methods while using as little as 1% of the labeled data.
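
A minimal sketch of the training signal described above: an encoder maps an image from one viewpoint to a latent body representation, and a decoder must render the same moment from another viewpoint, so the latent is forced to encode 3D geometry. The network sizes and image resolution are illustrative assumptions, and the paper's additional conditioning of the decoder on the relative camera rotation is omitted for brevity.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(64 * 16 * 16, 128))          # 128-d latent body representation
decoder = nn.Sequential(
    nn.Linear(128, 64 * 16 * 16), nn.Unflatten(1, (64, 16, 16)), nn.ReLU(),
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1))

view_a = torch.rand(8, 3, 64, 64)     # image from camera A
view_b = torch.rand(8, 3, 64, 64)     # same instant seen from camera B (no pose labels needed)
loss = ((decoder(encoder(view_a)) - view_b) ** 2).mean()
loss.backward()
# For semi-supervised pose estimation, a small regressor is then trained on the
# 128-d representation using only the few images that do carry 3D annotations.
```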

Proceedings Article
26 Apr 2018
TL;DR: An unsupervised representation learning approach is explored, for the first time, to capture the long-term global motion dynamics in skeleton sequences, using a conditional skeleton inpainting architecture to learn a fixed-dimensional representation.
Abstract: In recent years, skeleton based action recognition is becoming an increasingly attractive alternative to existing video-based approaches, beneficial from its robust and comprehensive 3D information. In this paper, we explore an unsupervised representation learning approach for the first time to capture the long-term global motion dynamics in skeleton sequences. We design a conditional skeleton inpainting architecture for learning a fixed-dimensional representation, guided by additional adversarial training strategies. We quantitatively evaluate the effectiveness of our learning approach on three well-established action recognition datasets. Experimental results show that our learned representation is discriminative for classifying actions and can substantially reduce the sequence inpainting errors.
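
A minimal sketch of the inpainting pretext task described above: frames of a skeleton sequence are masked and an encoder-decoder must reconstruct them, so the fixed-dimensional encoding has to capture long-term motion dynamics. The GRUs, mask rate, and joint count are illustrative assumptions, and the adversarial training strategies used in the paper are omitted.

```python
import torch
import torch.nn as nn

n_joints, d_in, d_rep = 25, 25 * 3, 128           # 25 joints with (x, y, z) each
encoder = nn.GRU(d_in, d_rep, batch_first=True)
decoder = nn.GRU(d_in, d_rep, batch_first=True)
readout = nn.Linear(d_rep, d_in)

seq = torch.randn(16, 60, d_in)                   # 60-frame skeleton sequences
mask = (torch.rand(16, 60, 1) > 0.3).float()      # drop roughly 30% of frames
_, h = encoder(seq * mask)                        # h: (1, 16, d_rep) fixed-dimensional representation
out, _ = decoder(seq * mask, h)                   # decode conditioned on that representation
recon = readout(out)
loss = (((recon - seq) ** 2) * (1 - mask)).mean() # penalize errors on the masked frames
loss.backward()
# After pretraining, h.squeeze(0) serves as the action feature for a simple classifier.
```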

Journal ArticleDOI
31 Jan 2018
TL;DR: Using autonomous racing tests in the TORCS simulator, it is shown how the integrated methods quickly learn policies that generalize to new environments much better than deep reinforcement learning without state representation learning.
Abstract: Most deep reinforcement learning techniques are unsuitable for robotics, as they require too much interaction time to learn useful, general control policies. This problem can be largely attributed to the fact that a state representation needs to be learned as a part of learning control policies, which can only be done through fitting expected returns based on observed rewards. While the reward function provides information on the desirability of the state of the world, it does not necessarily provide information on how to distill a good, general representation of that state from the sensory observations. State representation learning objectives can be used to help learn such a representation. While many of these objectives have been proposed, they are typically not directly combined with reinforcement learning algorithms. We investigate several methods for integrating state representation learning into reinforcement learning. In these methods, the state representation learning objectives help regularize the state representation during the reinforcement learning, and the reinforcement learning itself is viewed as a crucial state representation learning objective and allowed to help shape the representation. Using autonomous racing tests in the TORCS simulator, we show how the integrated methods quickly learn policies that generalize to new environments much better than deep reinforcement learning without state representation learning.
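
A minimal sketch of the integration described above: the same encoder feeds both the reinforcement-learning head and an auxiliary state-representation objective (here, observation reconstruction), and the two losses are optimized jointly so the auxiliary term regularizes the representation. The architecture, the choice of reconstruction as the SRL objective, and the loss weight are illustrative assumptions.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 32))
q_head = nn.Linear(32, 4)            # Q-values for 4 discrete actions
recon_head = nn.Linear(32, 64)       # auxiliary decoder back to the observation

obs = torch.randn(32, 64)
q_target = torch.randn(32, 4)        # stand-in for bootstrapped TD targets

state = encoder(obs)
rl_loss = ((q_head(state) - q_target) ** 2).mean()     # the usual RL objective
srl_loss = ((recon_head(state) - obs) ** 2).mean()     # state representation learning objective
(rl_loss + 0.1 * srl_loss).backward()                  # joint update shapes the shared encoder
```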

Posted Content
TL;DR: A notion of sub-optimality of a representation is developed, defined in terms of the expected reward of the optimal hierarchical policy using this representation; results on a number of difficult continuous-control tasks show that the derived representation learning objectives yield qualitatively better representations as well as quantitatively better hierarchical policies compared to existing methods.
Abstract: We study the problem of representation learning in goal-conditioned hierarchical reinforcement learning. In such hierarchical structures, a higher-level controller solves tasks by iteratively communicating goals which a lower-level policy is trained to reach. Accordingly, the choice of representation -- the mapping of observation space to goal space -- is crucial. To study this problem, we develop a notion of sub-optimality of a representation, defined in terms of expected reward of the optimal hierarchical policy using this representation. We derive expressions which bound the sub-optimality and show how these expressions can be translated to representation learning objectives which may be optimized in practice. Results on a number of difficult continuous-control tasks show that our approach to representation learning yields qualitatively better representations as well as quantitatively better hierarchical policies, compared to existing methods (see videos at this https URL).
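
A minimal sketch of the setup the abstract describes: a representation function maps observations to a goal space, the high-level controller emits goals in that space, and the low-level policy is rewarded for reaching them. The networks and the distance-based intrinsic reward are illustrative assumptions; the paper's contribution, the sub-optimality bound used to train the representation, is not reproduced here.

```python
import torch
import torch.nn as nn

repr_fn = nn.Linear(20, 4)                        # observation -> learned goal space
high_level = nn.Linear(20, 4)                     # observation -> goal (in goal space)
low_level = nn.Linear(20 + 4, 6)                  # (observation, goal) -> action logits

obs = torch.randn(1, 20)
goal = high_level(obs)                            # high level picks a goal every k steps
action_logits = low_level(torch.cat([obs, goal], dim=-1))

next_obs = torch.randn(1, 20)                     # stand-in for an environment step
intrinsic_reward = -torch.norm(repr_fn(next_obs) - goal, dim=-1)   # low-level reward for reaching the goal
```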

Posted Content
TL;DR: As mentioned in this paper, more than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics.
Abstract: Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in the field of generic object detection. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics. We finish the survey by identifying promising directions for future research.

Posted Content
TL;DR: A neural network based representation for open set recognition is presented in which instances from the same class are close to each other while instances from different classes are further apart, resulting in statistically significant improvement over other approaches on three datasets from two different domains.
Abstract: Open set recognition problems exist in many domains. For example in security, new malware classes emerge regularly; therefore malware classification systems need to identify instances from unknown classes in addition to discriminating between known classes. In this paper we present a neural network based representation for addressing the open set recognition problem. In this representation instances from the same class are close to each other while instances from different classes are further apart, resulting in statistically significant improvement when compared to other approaches on three datasets from two different domains.
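
A minimal sketch of how a distance-preserving representation like the one described can be used for open set recognition: embed the training data, keep one centroid per known class, and flag a test instance as "unknown" when it is far from every centroid. The embedding, its training loss, and the threshold are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def predict_open_set(embedding, centroids, threshold=1.0):
    """embedding: (D,) test vector; centroids: dict class -> (D,) mean embedding of that class."""
    dists = {c: np.linalg.norm(embedding - mu) for c, mu in centroids.items()}
    best = min(dists, key=dists.get)
    return best if dists[best] <= threshold else "unknown"

centroids = {"benign": np.array([0.0, 0.0]), "trojan": np.array([3.0, 3.0])}
print(predict_open_set(np.array([0.2, -0.1]), centroids))   # -> benign
print(predict_open_set(np.array([8.0, -7.0]), centroids))   # -> unknown
```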

Proceedings ArticleDOI
01 Jan 2018
TL;DR: A Privacy-Preserving Representation-Learning Variational Generative Adversarial Network (PPRL-VGAN) is proposed to learn an image representation that is explicitly disentangled from identity information while still allowing expression-equivalent face image synthesis.
Abstract: Reliable facial expression recognition plays a critical role in human-machine interactions. However, most of the facial expression analysis methodologies proposed to date pay little or no attention to the protection of a user's privacy. In this paper, we propose a Privacy-Preserving Representation-Learning Variational Generative Adversarial Network (PPRL-VGAN) to learn an image representation that is explicitly disentangled from the identity information. At the same time, this representation is discriminative from the standpoint of facial expression recognition and generative as it allows expression-equivalent face image synthesis. We evaluate the proposed model on two public datasets under various threat scenarios. Quantitative and qualitative results demonstrate that our approach strikes a balance between the preservation of privacy and data utility. We further demonstrate that our model can be effectively applied to other tasks such as expression morphing and image completion.
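
A simplified sketch of one standard way to obtain an identity-disentangled representation: an adversarial identity classifier whose gradient is reversed into the encoder. This is shown only to illustrate the disentanglement idea; the paper's full variational-GAN setup with expression-equivalent image synthesis is not reproduced here, and all sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, g):
        return -g                          # flip the gradient flowing back into the encoder

encoder = nn.Linear(2048, 128)             # face feature -> representation
expr_head = nn.Linear(128, 7)              # 7 basic expressions
id_head = nn.Linear(128, 50)               # adversary tries to recover identity

feat = torch.randn(16, 2048)
expr_y = torch.randint(0, 7, (16,))
id_y = torch.randint(0, 50, (16,))

z = encoder(feat)
loss = (nn.functional.cross_entropy(expr_head(z), expr_y)                     # keep expression information
        + nn.functional.cross_entropy(id_head(GradReverse.apply(z)), id_y))   # remove identity information
loss.backward()
```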

Journal ArticleDOI
TL;DR: A new remaining useful life prediction approach based on deep feature representation and a long short-term memory neural network is proposed, along with a new criterion, the support vector data normalized correlation coefficient, to automatically divide the whole bearing life into a normal state and a fast degradation state.
Abstract: For the bearing remaining useful life prediction problem, traditional machine-learning-based methods are generally short of feature representation ability and incapable of adaptive feature extraction...


Journal ArticleDOI
TL;DR: Through critical discourse analysis of over 450 media texts produced between 2009 and 2017, the authors report the conceptual understanding of digital-free tourism, trace how its media representation has changed over time, and explore the broader social context and debates in which the concept is embedded.

Journal ArticleDOI
TL;DR: In this paper, gender differences in online gamers' experiences with feedback from other players and spectators during online play were analyzed, and the findings suggest a mixed experience for women that includes more sexual harassment in online gaming than men encounter.
Abstract: Despite the growing popularity of eSports, the poor representation of women players points to a need to understand the experiences of female players during competitive gaming online. The present study focuses on female gamers’ experiences with positive and negative feedback and sexual harassment in the male-dominated space of eSports. In Study 1, gender differences were analyzed in online gamers’ experience with feedback from other players and spectators during online play. In Study 2, gender differences were analyzed in observations of real gameplay that focused on the types of comments spectators directed toward female and male gamers on Twitch (a popular video game streaming website). The findings suggest a mixed experience for women that includes more sexual harassment in online gaming compared with men.