
Showing papers on "Commonsense reasoning published in 2015"


Posted Content
Oriol Vinyals1, Quoc V. Le
TL;DR: A simple approach to conversational modeling which uses the recently proposed sequence to sequence framework, and is able to extract knowledge from both a domain specific dataset, and from a large, noisy, and general domain dataset of movie subtitles.
Abstract: Conversational modeling is an important task in natural language understanding and machine intelligence. Although previous approaches exist, they are often restricted to specific domains (e.g., booking an airline ticket) and require hand-crafted rules. In this paper, we present a simple approach for this task which uses the recently proposed sequence to sequence framework. Our model converses by predicting the next sentence given the previous sentence or sentences in a conversation. The strength of our model is that it can be trained end-to-end and thus requires far fewer hand-crafted rules. We find that this straightforward model can generate simple conversations given a large conversational training dataset. Our preliminary results suggest that, despite optimizing the wrong objective function, the model is able to converse well. It is able to extract knowledge from both a domain-specific dataset, and from a large, noisy, and general domain dataset of movie subtitles. On a domain-specific IT helpdesk dataset, the model can find a solution to a technical problem via conversations. On a noisy open-domain movie transcript dataset, the model can perform simple forms of common sense reasoning. As expected, we also find that the lack of consistency is a common failure mode of our model.
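The paper's model is an LSTM-based sequence-to-sequence network, which is too heavy to reproduce here; as a minimal sketch of the same training objective (predict the reply given the previous utterance), the toy below substitutes a simple frequency table over invented (prompt, reply) pairs:

```python
from collections import Counter, defaultdict

# Invented (prompt, reply) pairs standing in for a dialogue corpus.
dialogue_pairs = [
    ("hi", "hello"), ("hi", "hello"), ("hi", "hey"),
    ("how are you", "fine thanks"),
    ("bye", "goodbye"),
]

# "Training": count which replies follow each prompt.
reply_counts = defaultdict(Counter)
for prompt, reply in dialogue_pairs:
    reply_counts[prompt][reply] += 1

def respond(prompt):
    """Return the most frequently observed reply for this prompt, if any."""
    counts = reply_counts.get(prompt)
    return counts.most_common(1)[0][0] if counts else None

print(respond("hi"))  # hello
```

A real implementation would replace the count table with an encoder-decoder trained by maximum likelihood on a large dialogue corpus, which is what lets the model generalize to prompts it has never seen.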

1,774 citations


Journal ArticleDOI
TL;DR: AI has seen great advances of many kinds recently, but there is one critical area where progress has been extremely slow: ordinary commonsense.
Abstract: AI has seen great advances of many kinds recently, but there is one critical area where progress has been extremely slow: ordinary commonsense.

362 citations


Book ChapterDOI
01 Jan 2015
TL;DR: This chapter provides an overview of possibility theory, emphasizing its historical roots, recent developments, and close connections with random set theory and confidence intervals.
Abstract: This chapter provides an overview of possibility theory, emphasizing its historical roots and its recent developments. Possibility theory lies at the crossroads between fuzzy sets, probability, and nonmonotonic reasoning. Possibility theory can be cast either in an ordinal or in a numerical setting. Qualitative possibility theory is closely related to belief revision theory, and commonsense reasoning with exception-tainted knowledge in artificial intelligence. Possibilistic logic provides a rich representation setting, which enables the handling of lower bounds of possibility theory measures, while remaining close to classical logic. Qualitative possibility theory has been axiomatically justified in a decision-theoretic framework in the style of Savage, thus providing a foundation for qualitative decision theory. Quantitative possibility theory is the simplest framework for statistical reasoning with imprecise probabilities. As such, it has close connections with random set theory and confidence intervals, and can provide a tool for uncertainty propagation with limited statistical or subjective information.
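The two basic measures the chapter builds on can be stated in a few lines: given a possibility distribution pi over a finite set of worlds, the possibility of an event A is the maximum of pi over A, and its necessity is one minus the possibility of the complement. A sketch with an invented weather example:

```python
# Minimal sketch of possibility and necessity measures over a finite frame.
# pi maps each world to its possibility degree in [0, 1]. The worlds and
# degrees below are invented for illustration.
pi = {"sunny": 1.0, "cloudy": 0.7, "rainy": 0.3, "snowy": 0.0}

def possibility(event):
    # Pi(A) = max of pi over the worlds in A.
    return max(pi[w] for w in event)

def necessity(event):
    # N(A) = 1 - Pi(complement of A); an event is certain only when
    # every world outside it is impossible.
    complement = set(pi) - set(event)
    return 1.0 - possibility(complement) if complement else 1.0

wet = {"rainy", "snowy"}
print(possibility(wet))  # 0.3
print(necessity(wet))    # 0.0: a non-wet world ("sunny") is fully possible
```

The asymmetry visible here (an event can be somewhat possible yet not at all necessary) is exactly what lets the theory model epistemic states that a single probability value cannot.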

163 citations


01 Jan 2015
TL;DR: An overview of possibility theory is provided, emphasizing its historical roots, its recent developments, and its close connections with belief revision, qualitative decision theory, random set theory, and confidence intervals.

126 citations


Proceedings ArticleDOI
07 Dec 2015
TL;DR: The use of human-generated abstract scenes made from clipart for learning common sense is explored and it is shown that the commonsense knowledge the authors learn is complementary to what can be learnt from sources of text.
Abstract: Common sense is essential for building intelligent machines. While some commonsense knowledge is explicitly stated in human-generated text and can be learnt by mining the web, much of it is unwritten. It is often unnecessary and even unnatural to write about commonsense facts. While unwritten, this commonsense knowledge is not unseen! The visual world around us is full of structure modeled by commonsense knowledge. Can machines learn common sense simply by observing our visual world? Unfortunately, this requires automatic and accurate detection of objects, their attributes, poses, and interactions between objects, which remain challenging problems. Our key insight is that while visual common sense is depicted in visual content, it is the semantic features that are relevant and not low-level pixel information. In other words, photorealism is not necessary to learn common sense. We explore the use of human-generated abstract scenes made from clipart for learning common sense. In particular, we reason about the plausibility of an interaction or relation between a pair of nouns by measuring the similarity of the relation and nouns with other relations and nouns we have seen in abstract scenes. We show that the commonsense knowledge we learn is complementary to what can be learnt from sources of text.
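The scoring idea is easy to sketch: the plausibility of an unseen (noun, relation, noun) tuple is estimated from its similarity to tuples observed in abstract scenes. The vectors, tuples, and similarity measure (a per-slot cosine average) below are all invented stand-ins for the paper's learned features:

```python
import math

# Toy plausibility scorer: compare a query (noun, relation, noun) tuple
# against tuples "observed" in abstract scenes. The 3-d semantic vectors
# are invented purely for illustration.
vec = {
    "dog":    [1.0, 0.2, 0.0], "cat":  [0.9, 0.3, 0.0],
    "boy":    [0.1, 1.0, 0.1], "girl": [0.1, 0.9, 0.2],
    "chases": [0.5, 0.5, 1.0], "feeds": [0.2, 0.8, 0.9],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def tuple_sim(t1, t2):
    # Average the similarities of the two nouns and the relation.
    return sum(cosine(vec[a], vec[b]) for a, b in zip(t1, t2)) / 3

seen = [("boy", "feeds", "dog")]  # tuples observed in abstract scenes

def plausibility(query):
    return max(tuple_sim(query, s) for s in seen)

# A tuple close to something seen scores higher than a dissimilar one.
print(plausibility(("girl", "feeds", "cat")) >
      plausibility(("dog", "chases", "boy")))  # True
```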

90 citations


Journal ArticleDOI
TL;DR: This paper first induces a conceptual space from the text documents, then relies on the key insight that the required semantic relations correspond to qualitative spatial relations in this conceptual space, and experimentally shows that these classifiers can outperform standard approaches, while being able to provide intuitive explanations of classification decisions.

74 citations


Proceedings Article
25 Jan 2015
TL;DR: The CORPP algorithm is introduced which combines P-log, a probabilistic extension of ASP, with POMDPs to integrate commonsense reasoning with planning under uncertainty and observes significant improvements in both efficiency and accuracy.
Abstract: In order to be fully robust and responsive to a dynamically changing real-world environment, intelligent robots will need to engage in a variety of simultaneous reasoning modalities. In particular, in this paper we consider their need to i) reason with commonsense knowledge, ii) model their nondeterministic action outcomes and partial observability, and iii) plan toward maximizing long-term rewards. On one hand, Answer Set Programming (ASP) is good at representing and reasoning with commonsense and default knowledge, but is ill-equipped to plan under probabilistic uncertainty. On the other hand, Partially Observable Markov Decision Processes (POMDPs) are strong at planning under uncertainty toward maximizing long-term rewards, but are not designed to incorporate commonsense knowledge and inference. This paper introduces the CORPP algorithm which combines P-log, a probabilistic extension of ASP, with POMDPs to integrate commonsense reasoning with planning under uncertainty. Our approach is fully implemented and tested on a shopping request identification problem both in simulation and on a real robot. Compared with existing approaches using P-log or POMDPs individually, we observe significant improvements in both efficiency and accuracy.
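The division of labor can be sketched in a few lines: a logical commonsense layer first rules out impossible states, and probabilistic reasoning then operates only over what remains. Everything below (the states, the single default rule, the likelihoods) is invented for illustration and stands in for P-log and the POMDP respectively:

```python
# Toy sketch of the CORPP idea: prune the state space with commonsense
# rules, then reason under uncertainty over the surviving states only.
states = [("alice", "coffee"), ("alice", "sandwich"),
          ("bob", "coffee"), ("bob", "sandwich")]

def commonsense_possible(state, context):
    person, item = state
    # Invented default rule: no coffee order once the kitchen has closed.
    if item == "coffee" and context["kitchen_closed"]:
        return False
    return True

context = {"kitchen_closed": True}
feasible = [s for s in states if commonsense_possible(s, context)]

# A likelihood from a noisy observation stands in for the POMDP's belief
# update; combined with a uniform prior over feasible states it gives an
# unnormalized posterior score.
likelihood = {("alice", "sandwich"): 0.7, ("bob", "sandwich"): 0.3}
score = {s: likelihood.get(s, 0.0) / len(feasible) for s in feasible}
best = max(score, key=score.get)
print(best)  # ('alice', 'sandwich')
```

The payoff mirrors the paper's result: because the coffee states are eliminated logically, the probabilistic layer never has to represent or update beliefs over them.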

57 citations


Posted Content
TL;DR: Amazon Mechanical Turk-based evaluations on Flickr8k, Flickr30k and MS-COCO datasets show that in most cases, sentences auto-constructed from SDGs obtained by the method give a more relevant and thorough description of an image than a recent state-of-the-art image caption based approach.
Abstract: In this paper we propose the construction of linguistic descriptions of images. This is achieved through the extraction of scene description graphs (SDGs) from visual scenes using an automatically constructed knowledge base. SDGs are constructed using both vision and reasoning. Specifically, commonsense reasoning is applied on (a) detections obtained from existing perception methods on given images, (b) a "commonsense" knowledge base constructed using natural language processing of image annotations and (c) lexical ontological knowledge from resources such as WordNet. Amazon Mechanical Turk (AMT)-based evaluations on Flickr8k, Flickr30k and MS-COCO datasets show that in most cases, sentences auto-constructed from SDGs obtained by our method give a more relevant and thorough description of an image than a recent state-of-the-art image caption based approach. Our Image-Sentence Alignment Evaluation results are also comparable to those of recent state-of-the-art approaches.

51 citations


Proceedings Article
25 Jul 2015
TL;DR: An “air traffic control”-like dashboard is proposed, which alerts moderators to large-scale outbreaks that appear to be escalating or spreading and helps them prioritize the current deluge of user complaints.
Abstract: We present an approach for cyberbullying detection based on state-of-the-art text classification and a common sense knowledge base, which permits recognition over a broad spectrum of topics in everyday life. We analyze a more narrow range of particular subject matter associated with bullying and construct BullySpace, a common sense knowledge base that encodes particular knowledge about bullying situations. We then perform joint reasoning with common sense knowledge about a wide range of everyday life topics. We analyze messages using our novel AnalogySpace common sense reasoning technique. We also take into account social network analysis and other factors. We evaluate the model on real-world instances that have been reported by users on Formspring, a social networking website that is popular with teenagers. On the intervention side, we explore a set of reflective user-interaction paradigms with the goal of promoting empathy among social network participants. We propose an air traffic control-like dashboard, which alerts moderators to large-scale outbreaks that appear to be escalating or spreading and helps them prioritize the current deluge of user complaints. For potential victims, we provide educational material that informs them about how to cope with the situation, and connects them with emotional support from others. A user evaluation shows that in-context, targeted, and dynamic help during cyberbullying situations fosters end-user reflection that promotes better coping strategies.

39 citations


Proceedings Article
01 Jan 2015
TL;DR: This work presents a new set of challenge problems for the logical formalization of commonsense knowledge, called TriangleCOPA, which is specifically designed to support the development of logic-based commonsense theories, via two means.
Abstract: We present a new set of challenge problems for the logical formalization of commonsense knowledge, called TriangleCOPA. This set of one hundred problems is smaller than other recent commonsense reasoning question sets, but is unique in that it is specifically designed to support the development of logic-based commonsense theories, via two means. First, questions and potential answers are encoded in logical form using a fixed vocabulary of predicates, eliminating the need for sophisticated natural language processing pipelines. Second, the domain of the questions is tightly constrained so as to focus formalization efforts on one area of inference, namely the commonsense reasoning that people do about human psychology. We describe the authoring methodology used to create this problem set, and our analysis of the scope of requisite commonsense knowledge. We then show an example of how problems can be solved using an implementation of weighted abduction.

29 citations


Proceedings Article
25 Jan 2015
TL;DR: The Winograd Schema Challenge is described, which has been suggested as an alternative to the Turing Test and as a means of measuring progress in commonsense reasoning and is of special interest to the AI applications community.
Abstract: This paper describes the Winograd Schema Challenge (WSC), which has been suggested as an alternative to the Turing Test and as a means of measuring progress in commonsense reasoning. A competition based on the WSC has been organized and announced to the AI research community. The WSC is of special interest to the AI applications community and we encourage its members to participate.

Proceedings Article
12 Mar 2015
TL;DR: This paper combines visual processing with techniques from natural language understanding, common-sense reasoning and knowledge representation and reasoning to improve visual perception to reason about finer aspects of activities.
Abstract: In this paper we explore the use of visual commonsense knowledge and other kinds of knowledge (such as domain knowledge, background knowledge, linguistic knowledge) for scene understanding. In particular, we combine visual processing with techniques from natural language understanding (especially semantic parsing), common-sense reasoning and knowledge representation and reasoning to improve visual perception to reason about finer aspects of activities.

Proceedings Article
07 Apr 2015
TL;DR: GECKA merges, as never before, the potential of serious games and games with a purpose, for the acquisition of re-usable and multi-purpose knowledge and enables the development of games that can, apart from providing entertainment value, also teach gamers something meaningful about the world they live in.
Abstract: Commonsense knowledge representation and reasoning is key for tasks such as natural language understanding. Since common-sense consists of information that humans take for granted, however, gathering it is an extremely difficult task. The game engine for common-sense knowledge acquisition (GECKA) aims to collect common-sense from game designers through the development of serious games. GECKA merges, as never before, the potential of serious games and games with a purpose. This not only provides a platform for the acquisition of re-usable and multi-purpose knowledge, but also enables the development of games that can, apart from providing entertainment value, also teach gamers something meaningful about the world they live in.

Journal ArticleDOI
TL;DR: This article discusses reasoning approaches for the medium-term control of autonomous agents in dynamic spatial systems, which requires a sufficiently detailed description of the agent’s behavior and environment but may still be conducted in a qualitative manner.
Abstract: Autonomous agents that operate as components of dynamic spatial systems are becoming increasingly popular and mainstream. Applications can be found in consumer robotics, in road, rail, and air transportation, manufacturing, and military operations. Unfortunately, the approaches to modeling and analyzing the behavior of dynamic spatial systems are just as diverse as these application domains. In this article, we discuss reasoning approaches for the medium-term control of autonomous agents in dynamic spatial systems, which requires a sufficiently detailed description of the agent’s behavior and environment but may still be conducted in a qualitative manner. We survey logic-based qualitative and hybrid modeling and commonsense reasoning approaches with respect to their features for describing and analyzing dynamic spatial systems in general, and the actions of autonomous agents operating therein in particular. We introduce a conceptual reference model, which summarizes the current understanding of the characteristics of dynamic spatial systems based on a catalog of evaluation criteria derived from the model. We assess the modeling features provided by logic-based qualitative commonsense and hybrid approaches for projection, planning, simulation, and verification of dynamic spatial systems. We provide a comparative summary of the modeling features, discuss lessons learned, and introduce a research roadmap for integrating different approaches of dynamic spatial system analysis to achieve coverage of all required features.

Book ChapterDOI
30 Nov 2015
TL;DR: GluNet is presented, a flexible, open-source, and generic knowledge-base that seamlessly integrates a variety of lexical databases and facilitates commonsense reasoning and interoperability of narrative generation systems and sharing corpus data between fields of computational narrative.
Abstract: In mixed-initiative computational storytelling, stories are authored using a given vocabulary that must be understood by both author and computer. In practice, this vocabulary is manually authored ad-hoc, and prone to errors and consistency problems. What is missing is a generic, rich semantic vocabulary that is reusable in different applications and effectively supportive of advanced narrative reasoning and generation. We propose the integration of lexical semantics and commonsense knowledge and we present GluNet, a flexible, open-source, and generic knowledge-base that seamlessly integrates a variety of lexical databases and facilitates commonsense reasoning. Advantages of this approach are illustrated by means of two prototype applications, which make extensive use of the GluNet vocabulary to reason about and manipulate a coauthored narrative. GluNet aims to promote interoperability of narrative generation systems and sharing corpus data between fields of computational narrative.

Journal ArticleDOI
TL;DR: The author provides first steps toward building a software agent/robot with compassionate intelligence by re-purposing code and architectural ideas from collaborative multi-agent systems and affective common sense reasoning with new concepts and philosophies from the human arts and sciences relating to compassion.
Abstract: The author provides first steps toward building a software agent/robot with compassionate intelligence. She approaches this goal with an example software agent, EM-2. She also gives a generalized software requirements guide for anyone wishing to pursue other means of building compassionate intelligence into an AI system. The purpose of EM-2 is not to build an agent with a state of mind that mimics empathy or consciousness, but rather to create practical applications of AI systems with knowledge and reasoning methods that positively take into account the feelings and state of self and others during decision making, action, or problem solving. To program EM-2 the author re-purposes code and architectural ideas from collaborative multi-agent systems and affective common sense reasoning with new concepts and philosophies from the human arts and sciences relating to compassion. EM-2 has predicates and an agent architecture based on a meta-cognition mental process that was used on India's worst prisoners to cultivate compassion for others, Vipassana or mindfulness. She describes and presents code snippets for common sense based affective inference and the I-TMS, an Irrational Truth Maintenance System, that maintains consistency in agent memory as feelings change over time, and provides a machine theoretic description of the consistency issues of combining affect and logic. The author summarizes the growing body of new biological, cognitive and immune discoveries about compassion and the consequences of these discoveries for programmers working with human-level AI and hybrid human-robot systems.

Journal ArticleDOI
TL;DR: This paper presents a new way to look at Commonsense Reasoning, a manifestation of the natural phenomenon 'thinking' and, perhaps, the only skill human beings truly share for their survival.

Proceedings ArticleDOI
12 Jul 2015
TL;DR: The evaluation phase shows that Sim-Predictor compares positively with ELM and SVM, when addressing the problem of polarity detection in the sentic computing framework, a novel approach to big social data analysis based on the interpretation of the cognitive and affective information associated with natural language (affective common-sense reasoning).
Abstract: This paper explores the theory of learning with similarity functions in the context of common-sense reasoning and natural language processing. Based on this theory, the proposed approach (called Sim-Predictor) is characterized by the process of remapping the input space into a new space which is able to convey the similarity between the input pattern and a number of landmarks, i.e., a subset of patterns randomly extracted from the training set. The new learning scheme exhibits the interesting property of relating the dimensionality of the remapped space to the learning abilities of the eventual predictor in a formal fashion. The evaluation phase shows that Sim-Predictor compares positively with ELM and SVM when addressing the problem of polarity detection in the sentic computing framework, a novel approach to big social data analysis based on the interpretation of the cognitive and affective information associated with natural language (affective common-sense reasoning).

Journal ArticleDOI
TL;DR: The general concepts and ideas underlying MaxEnt and leading to it are addressed, the use of MaxEnt is illustrated by reporting on an example application from the medical domain, and a brief survey on recent approaches to extending the MaxEnt principle to first-order logic is given.
Abstract: Combining logic with probability theory provides a solid ground for the representation of and the reasoning with uncertain knowledge. Given a set of probabilistic conditionals like “If A then B with probability x”, a crucial question is how to extend this explicit knowledge, thereby avoiding any unnecessary bias. The connection between such probabilistic reasoning and commonsense reasoning has been elaborated especially by Jeff Paris, advocating the principle of Maximum Entropy (MaxEnt). In this paper, we address the general concepts and ideas underlying MaxEnt and leading to it, illustrate the use of MaxEnt by reporting on an example application from the medical domain, and give a brief survey on recent approaches to extending the MaxEnt principle to first-order logic.
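For a single marginal constraint, the MaxEnt distribution can be computed by one proportional-fitting step: start from the uniform distribution and rescale so the constraint holds, which spreads the remaining probability mass as evenly as possible. A sketch over the four truth assignments to two propositions, with the invented constraint P(A) = 0.7:

```python
from itertools import product

# MaxEnt sketch over the four worlds (A, B): begin uniform, then rescale
# so that P(A) = 0.7. With one marginal constraint this single
# proportional-fitting step yields the maximum-entropy distribution:
# mass is uniform within A-worlds and within not-A-worlds.
worlds = list(product([True, False], repeat=2))  # truth values of (A, B)
p = {w: 0.25 for w in worlds}

target = 0.7
mass_A = sum(pr for (a, _), pr in p.items() if a)  # currently 0.5
for (a, b), pr in list(p.items()):
    scale = target / mass_A if a else (1 - target) / (1 - mass_A)
    p[(a, b)] = pr * scale

print(round(p[(True, True)], 2))   # 0.35
print(round(p[(False, True)], 2))  # 0.15
```

Conditionals like "if A then B with probability x", as in the paper, constrain a ratio of world masses rather than a marginal, so they generally require iterating such fitting steps to convergence; the one-step case above shows the principle.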

Journal ArticleDOI
TL;DR: This work suggests that philosophical arguments should be based on formal-computational models to reduce the ambiguities and uncertainties that come with intuitive arguments and reasoning, and capture the dynamic nature of many philosophical concepts.
Abstract: The notions of knowledge and belief play an important role in philosophy. Unfortunately, the literature is not very consistent about defining these notions. Is belief more fundamental than knowledge or is it the other way around? Many accounts rely on the widely accepted strategy of appealing to the intuition of the reader. Such an argumentative methodology is fundamentally flawed because it lets the problems of common sense reasoning in through the front door. Instead, I suggest that philosophical arguments should be based on formal-computational models to (a) reduce the ambiguities and uncertainties that come with intuitive arguments and reasoning, and (b) capture the dynamic nature of many philosophical concepts. I present a model of knowledge and belief that lends itself to being implemented on computers. Its purpose is to resolve terminological confusion in favor of a more transparent account. The position I defend is an antirealist, naturalized one: knowledge is best conceived as arising from experience, and is fundamental to belief.

Dissertation
01 Jan 2015
TL;DR: This thesis develops formal computational models of intuitive theories, in particular intuitive physics and intuitive psychology, which form the basis of commonsense reasoning, and argues that an algorithmic approach based on stochastic search can address several puzzles of learning such theories.
Abstract: This thesis develops formal computational models of intuitive theories, in particular intuitive physics and intuitive psychology, which form the basis of commonsense reasoning. The overarching formal framework is that of hierarchical Bayesian models, which see the mind as having domain-specific hypotheses about how the world works. The work first extends models of intuitive psychology to include higher-level social utilities, arguing against a pure 'classifier' view. Second, the work extends models of intuitive physics by introducing an ontological hierarchy of physics concepts, and examining how well people can reason about novel dynamic displays. I then examine the question of learning intuitive theories in general, arguing that an algorithmic approach based on stochastic search can address several puzzles of learning, including the 'chicken and egg' problem of concept learning. Finally, I argue for the need for a joint theory-space for reasoning about intuitive physics and intuitive psychology, and provide such a simplified space in the form of a generative model for a novel domain called Lineland. Taken together, these results forge links between formal modeling, intuitive theories, and cognitive development.

Journal ArticleDOI
TL;DR: An approximation of the possible worlds semantics ( PWS ) of knowledge with support for postdiction - a fundamental inference pattern for diagnostic reasoning and explanation tasks in a wide range of real-world applications such as cognitive robotics, visual perception for cognitive vision, ambient intelligence and smart environments is proposed.

01 Jan 2015
TL;DR: This paper illustrates with two logical approaches— abductive logic programming and deonitc logic—how these problems can be solved and proposes an idea of how to use background knowledge to support the reasoning process.
Abstract: There is increasing interest in the field of automated commonsense reasoning to find real-world benchmarks to challenge and to further develop reasoning systems. One interesting example is the Triangle Choice of Plausible Alternatives (Triangle-COPA), which is a set of problems presented in first-order logic. The setting of these problems stems from the famous Heider-Simmel film used in early experiments in social psychology. This paper illustrates with two logical approaches, abductive logic programming and deontic logic, how these problems can be solved. Furthermore, we propose an idea of how to use background knowledge to support the reasoning process.

Proceedings Article
25 Jul 2015
TL;DR: It is shown how qualitative spatial reasoning about points with several existing calculi can be reduced to the realisability problem for EER, including LR and calculi for reasoning about betweenness, collinearity and parallelism.
Abstract: We introduce a framework for qualitative reasoning about directions in high-dimensional spaces, called EER, where our main motivation is to develop a form of commonsense reasoning about semantic spaces. The proposed framework is, however, more general; we show how qualitative spatial reasoning about points with several existing calculi can be reduced to the realisability problem for EER (or REER for short), including LR and calculi for reasoning about betweenness, collinearity and parallelism. Finally, we propose an efficient but incomplete inference method, and show its effectiveness for reasoning with EER as well as reasoning with some of the aforementioned calculi.
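One of the relations mentioned, betweenness, is easy to make concrete: a point b lies between a and c exactly when b - a is a scalar multiple t in [0, 1] of c - a, in any number of dimensions. A dimension-agnostic numeric check (the tolerance handling is an implementation choice, not part of the paper):

```python
# Check whether point b lies on the segment from a to c in n-dimensional
# space: b - a must equal t * (c - a) for some t in [0, 1]. Points are
# plain tuples of coordinates.
def between(a, b, c, eps=1e-9):
    ab = [bi - ai for ai, bi in zip(a, b)]
    ac = [ci - ai for ai, ci in zip(a, c)]
    # Determine the scale t from the first axis along which a and c differ.
    t = None
    for d_ab, d_ac in zip(ab, ac):
        if abs(d_ac) > eps:
            t = d_ab / d_ac
            break
        if abs(d_ab) > eps:  # b moves along an axis where c does not
            return False
    if t is None:            # a == c, so b must coincide with both
        return all(abs(x) < eps for x in ab)
    if not (-eps <= t <= 1 + eps):
        return False         # collinear, but outside the segment
    return all(abs(d_ab - t * d_ac) < eps for d_ab, d_ac in zip(ab, ac))

print(between((0, 0, 0), (1, 1, 1), (2, 2, 2)))  # True
print(between((0, 0, 0), (3, 3, 3), (2, 2, 2)))  # False: collinear, outside
```

A qualitative calculus reasons symbolically about such relations without computing coordinates; the numeric check above corresponds to the realisability side of the problem, deciding whether a relational description can be satisfied by actual points.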

Journal ArticleDOI
TL;DR: The implementation of HPX is used to investigate the incompleteness issue, and an empirical evaluation of the solvable fragment and its performance is presented, finding that the solvable fragment of HPX is indeed reasonable and fairly large.

Posted Content
TL;DR: It is argued that building a general data compression algorithm solving all problems up to a complexity threshold should be the main thrust of research and a measure for partial progress in AGI is suggested.
Abstract: This paper presents a tentative outline for the construction of an artificial, generally intelligent system (AGI). It is argued that building a general data compression algorithm solving all problems up to a complexity threshold should be the main thrust of research. A measure for partial progress in AGI is suggested. Although the details are far from being clear, some general properties for a general compression algorithm are fleshed out. Its inductive bias should be flexible and adapt to the input data while constantly searching for a simple, orthogonal and complete set of hypotheses explaining the data. It should recursively reduce the size of its representations, thereby compressing the data increasingly at every iteration.

Based on that fundamental ability, a grounded reasoning system is proposed. It is argued how grounding and flexible feature bases made of hypotheses allow for resourceful thinking. While the simulation of representation contents on the mental stage accounts for much of the power of propositional logic, compression leads to simple sets of hypotheses that allow the detection and verification of universally quantified statements.

Together, it is highlighted how general compression and grounded reasoning could account for the birth and growth of first concepts about the world and the commonsense reasoning about them.
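The premise that regularity discovery and compression are two views of the same ability can be illustrated with an off-the-shelf compressor: data governed by a short rule shrinks dramatically, while patternless data does not. The byte strings below are invented for the demonstration:

```python
import hashlib
import zlib

# 1024 bytes governed by a tiny rule ("repeat 'abcd'").
structured = b"abcd" * 256

# 1024 deterministic but pseudorandom bytes: SHA-256 output has no
# regularity a general-purpose compressor can exploit.
noise = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(32))

small = len(zlib.compress(structured))
large = len(zlib.compress(noise))

# The compressor "found the law" for the structured data only.
print(small < large)  # True
```

In the paper's framing, a system that could find short descriptions for *any* data up to a complexity threshold, not just the patterns zlib's LZ77 scheme happens to capture, would thereby possess the core of general intelligence.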

Proceedings Article
25 Jan 2015
TL;DR: The proposed language takes advantage of both formalisms in a single framework, allowing us to represent commonsense reasoning problems that require both logical and probabilistic reasoning in an intuitive and elaboration tolerant way.
Abstract: We present a probabilistic extension of logic programs under the stable model semantics, inspired by the concept of Markov Logic Networks. The proposed language takes advantage of both formalisms in a single framework, allowing us to represent commonsense reasoning problems that require both logical and probabilistic reasoning in an intuitive and elaboration tolerant way.
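The semantics described can be miniaturized: attach a weight to each rule and make an interpretation's probability proportional to exp of the total weight of the rules it satisfies, as in Markov Logic. The two-atom program below is invented, and for brevity it scores all interpretations rather than restricting to stable models as the actual language does:

```python
from itertools import product
from math import exp

# Toy weighted-rules model in the spirit of Markov Logic: probability of
# an interpretation is proportional to exp(sum of satisfied rule weights).
# Atoms, rules, and weights are invented for illustration.
atoms = ["bird", "flies"]
rules = [
    (2.0, lambda m: (not m["bird"]) or m["flies"]),  # soft: bird -> flies
    (1.0, lambda m: m["bird"]),                      # soft: bird holds
]

models = [dict(zip(atoms, vals)) for vals in product([True, False], repeat=2)]
weight = {i: exp(sum(w for w, rule in rules if rule(m)))
          for i, m in enumerate(models)}
z = sum(weight.values())                 # normalization constant
probs = {i: w / z for i, w in weight.items()}

best = max(probs, key=probs.get)
print(models[best])  # {'bird': True, 'flies': True}
```

The elaboration tolerance claimed in the abstract shows up even at this scale: strengthening or adding a rule only changes weights, rather than forcing the whole program to be rewritten.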

Book ChapterDOI
01 Jan 2015
TL;DR: In the very wide setting of a Basic Fuzzy Algebra, a formal algebraic model for Commonsense Reasoning is presented with fuzzy and crisp sets including, in particular, the usual case of the Standard Algebras of Fuzzy Sets.
Abstract: In the very wide setting of a Basic Fuzzy Algebra, a formal algebraic model for Commonsense Reasoning is presented with fuzzy and crisp sets including, in particular, the usual case of the Standard Algebras of Fuzzy Sets. The aim with which the model is constructed is that of, first, adding to Zadeh’s Computing with Words a wide perspective of ordinary reasoning in agreement with some basic characteristics of it, and second, presenting an operational ground on which linguistic terms can be represented, and schemes of inference posed. Additionally, the chapter also tries to express the author’s belief that reasoning deserves to be studied like an Experimental Science.

01 Jan 2015
TL;DR: A Stochastic Scene Grammar (SSG) is presented as a hierarchical compositional representation which integrates functionality, geometry and appearance in a hierarchy and purses a physically stable scene understanding, namely ``a parse tree", by inferring object stability in the physical world.
Abstract: Computer vision has made significant progress in locating and recognizing objects in recent decades. However, beyond the scope of this “what is where” challenge, it lacks the abilities to understand scenes characterizing human visual experience. Compared with human vision, what is missing in current computer vision? One answer is that human vision is not only for pattern recognition, but also supports a rich set of commonsense reasoning about object function, scene physics, social intentions, etc. I build systems for real-world applications while simultaneously pursuing a long-term goal of devising a unified framework that can make sense of images and scenes by reasoning about the functional and physical mechanisms of objects in a 3D world. By bridging advances spanning the fields of stochastic learning, computer vision, and cognitive science, my research tackles the following challenges: (i) What is the visual representation? I develop stochastic grammar models to characterize spatiotemporal structures of visual scenes and events. The analogy to human natural language lays a foundation for representing both visual structure and abstract knowledge. I pose the scene understanding problem as parsing an image into a hierarchical structure of visual entities using the Stochastic Scene Grammar (SSG). With a set of production rules, the grammar enforces both structural regularity and flexibility of visual entities. Therefore, the algorithm is able to handle an enormous number of configurations and large geometric variations for both indoor and outdoor scenes. (ii) How to reason about commonsense knowledge? I augment the grammatical representation with commonsense knowledge about functionality and physical stability. Bottom-up and top-down inference algorithms are designed for finding the most plausible interpretation of visual stimuli.
Functionality refers to the property of an object or scene, especially a man-made one, of having a practical use for which it was designed; it is deeper than geometry and appearance and thus a more invariant concept for scene understanding. We present a Stochastic Scene Grammar (SSG) as a hierarchical compositional representation which integrates functionality, geometry and appearance in a hierarchy. This represents a different philosophy that views vision tasks from the perspective of agents: agents (humans, animals and robots) should perceive objects and scenes by reasoning about their plausible functions. The physical stability assumption holds that objects in a static scene should be stable with respect to the gravity field. In other words, if any object is not stable on its own, it must be either grouped with neighbors or fixed to its supporting base. We pursue a physically stable scene understanding, namely “a parse tree”, by inferring object stability in the physical world. The assumption is applicable to general scene categories and thus poses powerful constraints for physically plausible scene interpretation and understanding. (iii) How to acquire commonsense knowledge? I performed three case studies to acquire different kinds of commonsense knowledge: teaching the computer to learn affordance from observing human actions; to learn tool use from a single one-shot demonstration; and to infer containing relations by physical simulation without an explicit training process. These provided some interesting perspectives on how to acquire and exploit commonsense knowledge. In general, the more prediction or simulation is performed, the less training data is needed. As a result, the acquired commonsense knowledge is more generalizable to new situations. Such sophisticated understanding of 3D scenes enables computer vision to reason, predict, and interact with the 3D environment, as well as hold intelligent dialogues beyond the visible spectrum.
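The stability assumption described above can be caricatured in one dimension: a resting object is stable iff its center of mass projects onto its support region, and an unstable object forces the parser to group it with a neighbor or fix it to a support. The 1-D simplification and all names below are ours; the dissertation reasons about full 3-D support relations.

```python
# 1-D caricature of the physical-stability test used to prune implausible
# scene interpretations (illustrative simplification, not the thesis model).
def stable(com_x, support_left, support_right):
    """An object is stable iff its center of mass projects onto its support."""
    return support_left <= com_x <= support_right

# A cup centered over the table is stable; one hanging past the edge is not,
# so a plausible parse must attach the latter to a neighbor or support.
cup_on_table = stable(0.5, 0.0, 1.0)
cup_off_edge = stable(1.2, 0.0, 1.0)
```

Even this crude test shows why the assumption is such a strong constraint: most random object placements fail it, so enforcing it sharply narrows the space of candidate parse trees.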

Journal ArticleDOI
TL;DR: A novel evolutionary algorithm with a semantic network-based representation that enables the open-ended generation of networks analogous to a given base network and introduces an analogical similarity-based fitness measure that is computed through structure mapping.
Abstract: We introduce a novel evolutionary algorithm (EA) with a semantic network-based representation. For enabling this, we establish new formulations of EA variation operators, crossover and mutation, that we adapt to work on semantic networks. The algorithm employs commonsense reasoning to ensure all operations preserve the meaningfulness of the networks, using ConceptNet and WordNet knowledge bases. The algorithm can be interpreted as a novel memetic algorithm (MA), given that (1) individuals represent pieces of information that undergo evolution, as in the original sense of memetics as it was introduced by Dawkins; and (2) this is different from existing MA, where the word “memetic” has been used as a synonym for local refinement after global optimization. For evaluating the approach, we introduce an analogical similarity-based fitness measure that is computed through structure mapping. This setup enables the open-ended generation of networks analogous to a given base network.
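A crude stand-in for the analogical-similarity fitness can be sketched by scoring how much relational structure a candidate network shares with the base network, ignoring the concrete concept names. This is our simplification for illustration only; the paper performs full structure mapping over ConceptNet/WordNet-grounded semantic networks, and all example triples are invented.

```python
from collections import Counter

# Structure-only view of a semantic network: the multiset of relation labels
# occurring in its (head, relation, tail) triples.
def relation_signature(triples):
    return Counter(rel for _, rel, _ in triples)

# Toy analogy score: fraction of the base network's relational structure
# matched by the candidate, relation-for-relation (our simplification of
# structure mapping).
def fitness(base, candidate):
    b, c = relation_signature(base), relation_signature(candidate)
    shared = sum(min(b[r], c[r]) for r in b)
    return shared / max(1, sum(b.values()))

base = [("bird", "CapableOf", "fly"), ("bird", "HasA", "wing")]
cand = [("plane", "CapableOf", "fly"), ("plane", "HasA", "wing")]
```

Here `cand` mirrors the base network's relational structure with different concepts, so it scores as a perfect analogy under this toy measure, which is the kind of structural (rather than surface) similarity the fitness function rewards.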