
Showing papers on "Knowledge base published in 2010"


Proceedings Article
11 Jul 2010
TL;DR: This work proposes an approach and a set of design principles for an intelligent computer agent that runs forever, and describes a partial implementation of such a system that has already learned to extract a knowledge base containing over 242,000 beliefs.
Abstract: We consider here the problem of building a never-ending language learner; that is, an intelligent computer agent that runs forever and that each day must (1) extract, or read, information from the web to populate a growing structured knowledge base, and (2) learn to perform this task better than on the previous day. In particular, we propose an approach and a set of design principles for such an agent, describe a partial implementation of such a system that has already learned to extract a knowledge base containing over 242,000 beliefs with an estimated precision of 74% after running for 67 days, and discuss lessons learned from this preliminary attempt to build a never-ending learning agent.
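The extract-then-learn loop described above can be sketched as a simple bootstrapping cycle: seed beliefs induce textual patterns, patterns propose new instances, and high-confidence candidates are promoted to the knowledge base. The corpus, seeds, and promotion threshold below are invented for illustration and are far simpler than the coupled learners of the actual system.

```python
# Minimal bootstrapped-extraction sketch: instances induce patterns,
# patterns extract new instances, and the KB grows each "day".

def induce_patterns(corpus, instances):
    """Collect contexts ('patterns') in which known instances occur."""
    patterns = set()
    for sentence in corpus:
        for inst in instances:
            if inst in sentence:
                patterns.add(sentence.replace(inst, "_"))
    return patterns

def extract_instances(corpus, patterns):
    """Propose new instances that fill the '_' slot of a known pattern."""
    candidates = {}
    for sentence in corpus:
        for pat in patterns:
            prefix, _, suffix = pat.partition("_")
            if sentence.startswith(prefix) and sentence.endswith(suffix):
                filler = sentence[len(prefix):len(sentence) - len(suffix)]
                if filler:
                    candidates[filler] = candidates.get(filler, 0) + 1
    return candidates

corpus = [
    "Paris is a city",
    "Berlin is a city",
    "Madrid is a city",
    "Madrid hosted the summit",
]
kb = {"Paris"}                     # seed beliefs
for _ in range(2):                 # two "days" of the never-ending loop
    patterns = induce_patterns(corpus, kb)
    for cand, count in extract_instances(corpus, patterns).items():
        if count >= 1:             # promotion threshold (illustrative)
            kb.add(cand)
```

A real never-ending learner must also estimate a precision for each promoted belief and couple many extractors so that errors in one do not compound; this sketch shows only the growth loop.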

2,010 citations


Book ChapterDOI
20 Sep 2010
TL;DR: A novel approach to distant supervision that can alleviate the problem of noisy patterns that hurt precision by using a factor graph and applying constraint-driven semi-supervision to train this model without any knowledge about which sentences express the relations in the authors' training KB.
Abstract: Several recent works on relation extraction have been applying the distant supervision paradigm: instead of relying on annotated text to learn how to predict relations, they employ existing knowledge bases (KBs) as source of supervision. Crucially, these approaches are trained based on the assumption that each sentence which mentions the two related entities is an expression of the given relation. Here we argue that this leads to noisy patterns that hurt precision, in particular if the knowledge base is not directly related to the text we are working with. We present a novel approach to distant supervision that can alleviate this problem based on the following two ideas: First, we use a factor graph to explicitly model the decision whether two entities are related, and the decision whether this relation is mentioned in a given sentence; second, we apply constraint-driven semi-supervision to train this model without any knowledge about which sentences express the relations in our training KB. We apply our approach to extract relations from the New York Times corpus and use Freebase as knowledge base. When compared to a state-of-the-art approach for relation extraction under distant supervision, we achieve 31% error reduction.
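The distant-supervision assumption the authors critique is easy to state in code: every sentence that mentions a related entity pair is treated as a positive example of the relation. A minimal sketch with an invented one-fact KB and corpus:

```python
# Sketch of the noisy distant-supervision labelling heuristic.
kb = {("Obama", "Hawaii"): "born_in"}

corpus = [
    "Obama was born in Hawaii",          # genuinely expresses born_in
    "Obama visited Hawaii last week",    # noise: mentions the pair, wrong relation
    "Hawaii is an island state",
]

def distant_label(corpus, kb):
    labelled = []
    for sent in corpus:
        for (e1, e2), rel in kb.items():
            if e1 in sent and e2 in sent:
                labelled.append((sent, rel))   # noisy positive example
    return labelled

train = distant_label(corpus, kb)
# Both pair-mentioning sentences receive the label, including the spurious
# one; this is exactly the precision problem the factor-graph model targets.
```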

1,304 citations


01 Jan 2010
TL;DR: An overview of the task definition and annotation challenges associated with KBP2010 is provided and the evaluation results and lessons that are learned are discussed based on detailed analysis.
Abstract: In this paper we give an overview of the Knowledge Base Population (KBP) track at TAC 2010. The main goal of KBP is to promote research in discovering facts about entities and expanding a structured knowledge base with this information. A large source collection of newswire and web documents is provided for systems to discover information. Attributes (a.k.a. “slots”) derived from Wikipedia infoboxes are used to create the reference knowledge base (KB). KBP2010 includes the following four tasks: (1) Regular Entity Linking, where names must be aligned to entities in the KB; (2) Optional Entity linking, without using Wikipedia texts; (3) Regular Slot Filling, which requires a system to automatically discover the attributes of specified entities from the source document collection and use them to expand the KB; (4) Surprise Slot Filling, which requires a system to return answers regarding new slot types within a short time period. KBP2010 has attracted many participants (over 45 teams registered for KBP 2010 (not including the RTEKBP Validation Pilot task), among which 23 teams submitted results). In this paper we provide an overview of the task definition and annotation challenges associated with KBP2010. Then we summarize the evaluation results and discuss the lessons that we have learned based on detailed analysis.

535 citations


Journal ArticleDOI
TL;DR: In this article, the authors argue that these gains from R&D outsourcing need to be balanced against the "pains" that stem from a dilution of firm-specific resources, the deterioration of integrative capabilities and the high demands on management attention.
Abstract: The outsourcing of research and development (R&D) activities has frequently been characterized as an important instrument to acquire external technological knowledge that is subsequently integrated into a firm's own knowledge base. However, in this paper we argue that these ‘gains’ from R&D outsourcing need to be balanced against the ‘pains’ that stem from a dilution of firm-specific resources, the deterioration of integrative capabilities and the high demands on management attention. Based on a panel dataset of innovating firms in Germany, we find evidence for an inverse U-shaped relationship between R&D outsourcing and innovation performance. This relationship is positively moderated by the extent to which firms engage in internal R&D and by the breadth of formal R&D collaborations: both serve as an instrument to increase the effectiveness of R&D outsourcing.

490 citations


Proceedings Article
23 Aug 2010
TL;DR: This work presents a state-of-the-art system for entity disambiguation that not only addresses these challenges but also scales to knowledge bases with several million entries using very few resources.
Abstract: The integration of facts derived from information extraction systems into existing knowledge bases requires a system to disambiguate entity mentions in the text. This is challenging due to issues such as non-uniform variations in entity names, mention ambiguity, and entities absent from a knowledge base. We present a state-of-the-art system for entity disambiguation that not only addresses these challenges but also scales to knowledge bases with several million entries using very few resources. Further, our approach achieves performance of up to 95% on entities mentioned in newswire and 80% on a public test set that was designed to include challenging queries.
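A hedged sketch of the disambiguation setting just described: candidates are generated by name match, scored against the mention context, and a NIL answer covers entities absent from the KB. The toy entries, the bag-of-words similarity, and the threshold are illustrative assumptions, not the paper's actual model.

```python
# Toy mention disambiguation: name-match candidates, context scoring, NIL fallback.
import math

kb = {
    "Michael Jordan (basketball)": "chicago bulls nba basketball guard",
    "Michael Jordan (scientist)": "machine learning statistics berkeley",
}

def cosine(a, b):
    """Set-based cosine similarity between two whitespace-tokenised strings."""
    va, vb = set(a.split()), set(b.split())
    if not va or not vb:
        return 0.0
    return len(va & vb) / math.sqrt(len(va) * len(vb))

def disambiguate(mention, context, threshold=0.2):
    candidates = [e for e in kb if mention.lower() in e.lower()]
    scored = [(cosine(context, kb[e]), e) for e in candidates]
    best = max(scored, default=(0.0, None))
    return best[1] if best[0] >= threshold else "NIL"   # NIL: absent from KB
```

Scaling this to millions of entries is where the real system's contribution lies; an inverted index over names and contexts replaces the linear scans shown here.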

356 citations


Journal ArticleDOI
23 Apr 2010-Science
TL;DR: The Landscape Model captures the reading process and the influences of reader characteristics and text characteristics and suggests factors that can optimize—or jeopardize—learning science from text.
Abstract: Texts form a powerful tool in teaching concepts and principles in science. How do readers extract information from a text, and what are the limitations in this process? Central to comprehension of and learning from a text is the construction of a coherent mental representation that integrates the textual information and relevant background knowledge. This representation engenders learning if it expands the reader's existing knowledge base or if it corrects misconceptions in this knowledge base. The Landscape Model captures the reading process and the influences of reader characteristics (such as working-memory capacity, reading goal, prior knowledge, and inferential skills) and text characteristics (such as content/structure of presented information, processing demands, and textual cues). The model suggests factors that can optimize, or jeopardize, learning science from text.

229 citations


Patent
09 Jun 2010
TL;DR: In this article, a system is described comprising a clusterer for clustering micro-blog messages received over a first period of time; a classifier for scoring the clustered messages; a knowledge base; a rule generator for generating classification rules from the knowledge base; and a matcher for matching the scored messages to information requests.
Abstract: Methods, systems and software are described for analyzing micro-blog messages to detect abnormal activity of interest. The system includes a clusterer for clustering micro-blog messages received over a first period of time, a classifier for scoring the clustered messages; a knowledge base, a rule generator for generating classification rules from the knowledge base; and a matcher for matching the scored messages to information requests. Methods for operating the system and its components are described.

209 citations


Book
12 Oct 2010
TL;DR: In this article, the dimensions of an Instructional Design Knowledge Base (IDKB) are discussed and a taxonomy of the ID knowledge base is presented.
Abstract: List of Tables List of Figures Preface Acknowledgements 1. The Dimensions of an Instructional Design Knowledge Base 2. General Systems Theory 3. Communication 4. Learning Theory 5. Early Instructional Theory 6. Media Theory 7. Conditions-Based Theory 8. Constructivist Design Theory 9. Performance Improvement Theory 10. A Taxonomy of the ID Knowledge Base Glossary of Terms References

191 citations


Journal ArticleDOI
TL;DR: Continuing research toward the development of more sophisticated techniques for processing NL text, for utilizing semantic knowledge, and for incorporating logic and reasoning mechanisms will lead to more useful QA systems.

180 citations


Journal ArticleDOI
TL;DR: In this paper, the authors review the research on strategic planning and management in the public sector to understand what has been learned to date and what gaps in knowledge remain, and find substantial empirical testing of the impacts of environmental and institutional/organizational determinants on strategic management, but efforts to assess linkages between strategic planning processes and organizational outcomes or performance improvements are sparse.
Abstract: Although there is considerable literature on strategic planning and management in the public sector, there has been little effort to synthesize what has been learned concerning the extent to which these tools are used in government, how they are implemented, and the results they generate. In this article, the authors review the research on strategic planning and management in the public sector to understand what has been learned to date and what gaps in knowledge remain. In examining the 34 research articles in this area published in the major public administration journals over the past 20 years, the authors find substantial empirical testing of the impacts of environmental and institutional/organizational determinants on strategic management, but efforts to assess linkages between strategic planning processes and organizational outcomes or performance improvements are sparse. Large-N quantitative analyses and comparative case studies could improve the knowledge base in this critical area.

164 citations


Proceedings ArticleDOI
06 Jun 2010
TL;DR: This tutorial discusses state-of-the-art methods, research opportunities, and open challenges along this avenue of knowledge harvesting, to automatically construct and maintain a comprehensive knowledge base of facts about named entities, their semantic classes, and their mutual relations as well as temporal contexts, with high precision and high recall.
Abstract: There are major trends to advance the functionality of search engines to a more expressive semantic level. This is enabled by the advent of knowledge-sharing communities such as Wikipedia and the progress in automatically extracting entities and relationships from semistructured as well as natural-language Web sources. Recent endeavors of this kind include DBpedia, EntityCube, KnowItAll, ReadTheWeb, and our own YAGO-NAGA project (and others). The goal is to automatically construct and maintain a comprehensive knowledge base of facts about named entities, their semantic classes, and their mutual relations as well as temporal contexts, with high precision and high recall. This tutorial discusses state-of-the-art methods, research opportunities, and open challenges along this avenue of knowledge harvesting.

Proceedings Article
11 Jul 2010
TL;DR: OntoUSP builds on the USP unsupervised semantic parser by jointly forming ISA and IS-PART hierarchies of lambda-form clusters and improves on the recall of USP by 47% and greatly outperforms previous state-of-the-art approaches.
Abstract: Extracting knowledge from unstructured text is a long-standing goal of NLP. Although learning approaches to many of its subtasks have been developed (e.g., parsing, taxonomy induction, information extraction), all end-to-end solutions to date require heavy supervision and/or manual engineering, limiting their scope and scalability. We present OntoUSP, a system that induces and populates a probabilistic ontology using only dependency-parsed text as input. OntoUSP builds on the USP unsupervised semantic parser by jointly forming ISA and IS-PART hierarchies of lambda-form clusters. The ISA hierarchy allows more general knowledge to be learned, and the use of smoothing for parameter estimation. We evaluate OntoUSP by using it to extract a knowledge base from biomedical abstracts and answer questions. OntoUSP improves on the recall of USP by 47% and greatly outperforms previous state-of-the-art approaches.

Journal ArticleDOI
TL;DR: The findings suggest that analogies are frequently applied without the aid of formal procedures, techniques, or tools and that the tendency to access knowledge from only a limited set of familiar knowledge sources may constrain the possibility for creative recombination by analogies.

Journal ArticleDOI
TL;DR: A new interactive approach to prune and filter discovered rules is presented, and the Rule Schema formalism is proposed, extending the specification language proposed by Liu et al. for user expectations in order to improve the integration of user knowledge in the postprocessing task.
Abstract: In Data Mining, the usefulness of association rules is strongly limited by the huge amount of delivered rules. To overcome this drawback, several methods were proposed in the literature such as itemset concise representations, redundancy reduction, and postprocessing. However, being generally based on statistical information, most of these methods do not guarantee that the extracted rules are interesting for the user. Thus, it is crucial to help the decision-maker with an efficient postprocessing step in order to reduce the number of rules. This paper proposes a new interactive approach to prune and filter discovered rules. First, we propose to use ontologies in order to improve the integration of user knowledge in the postprocessing task. Second, we propose the Rule Schema formalism extending the specification language proposed by Liu et al. for user expectations. Furthermore, an interactive framework is designed to assist the user throughout the analyzing task. Applying our new approach over voluminous sets of rules, we were able, by integrating domain expert knowledge in the postprocessing step, to reduce the number of rules to several dozens or less. Moreover, the quality of the filtered rules was validated by the domain expert at various points in the interactive process.
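The schema-based pruning idea can be illustrated in a few lines: a tiny ontology maps items to concepts, and a user-supplied rule schema keeps only mined rules whose antecedent and consequent generalise to the expected concepts. Ontology, schema, and rules below are invented for the demo and stand in for the paper's richer Rule Schema formalism.

```python
# Schema-based pruning of mined association rules against a small ontology.
ontology = {            # item -> parent concept
    "beer": "alcohol", "wine": "alcohol",
    "chips": "snack", "nuts": "snack",
    "milk": "dairy",
}

def concepts(items):
    """Lift a set of items to their ontology concepts (items without a parent pass through)."""
    return {ontology.get(i, i) for i in items}

def matches_schema(rule, schema):
    antecedent, consequent = rule
    want_ante, want_cons = schema
    return concepts(antecedent) <= want_ante and concepts(consequent) <= want_cons

mined_rules = [
    ({"beer"}, {"chips"}),
    ({"wine"}, {"nuts"}),
    ({"milk"}, {"chips"}),
]
schema = ({"alcohol"}, {"snack"})   # user expectation: alcohol -> snack rules

kept = [r for r in mined_rules if matches_schema(r, schema)]
# Only the two alcohol -> snack rules survive; the dairy rule is pruned.
```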

Posted Content
TL;DR: In this paper, the authors examine the organizational and geographical patterns of knowledge flows in the media industry of southern Sweden, an industry that is characterized by a strong "symbolic" knowledge base.
Abstract: This paper deals with geographical and organisational patterns of knowledge flows in the media industry of southern Sweden, an industry that is characterised by a strong ‘symbolic’ knowledge base. The aim is to address the question of the local versus the non-local as the prime arena for knowledge exchange, and to examine the organisational patterns of knowledge sourcing with specific attention paid to the nature of the knowledge sourced. Symbolic industries draw heavily on creative production and a cultural awareness that is strongly embedded in the local context; thus knowledge flows and networks are expected to be most of all locally configured, and firms to rely on informal knowledge sources rather than scientific knowledge or principles. Based on structured and semi-structured interviews with firm representatives, these assumptions are empirically assessed through social network analysis and descriptive statistics. Our findings show that firms rely above all on knowledge that is generated in project work through learning-by-doing and by interaction with other firms in localised networks. The analysis contributes to transcending the binary arguments on the role of geography for knowledge exchange which tend to dominate the innovation studies literature.

Journal ArticleDOI
TL;DR: The purpose of this article is to develop a general approach for extending a tableau-based algorithm to a pinpointing algorithm, based on a general definition of ‘tableau algorithms,’ which captures many of the known tableAU-based algorithms employed in DLs, but also other kinds of reasoning procedures.
Abstract: Axiom pinpointing has been introduced in description logics (DLs) to help the user to understand the reasons why consequences hold and to remove unwanted consequences by computing minimal (maximal) subsets of the knowledge base that have (do not have) the consequence in question. Most of the pinpointing algorithms described in the DLliterature are obtained as extensions of the standard tableau-based reasoning algorithms for computing consequences from DL knowledge bases. Although these extensions are based on similar ideas, they are all introduced for a particular tableau-based algorithm for a particular DL. The purpose of this article is to develop a general approach for extending a tableau-based algorithm to a pinpointing algorithm. This approach is based on a general definition of ‘tableau algorithms,’ which captures many of the known tableau-based algorithms employed in DLs, but also other kinds of reasoning procedures.
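The notion of a minimal axiom set (MinA) at the heart of pinpointing can be shown by brute force: enumerate subsets of the KB and keep the minimal ones that still entail the consequence. The real algorithms extend tableau procedures; this exhaustive sketch over a toy propositional Horn KB only illustrates the definition.

```python
# Brute-force axiom pinpointing: all minimal subsets of a Horn KB entailing a goal.
from itertools import combinations

# Axioms as Horn clauses: (frozenset of premises, conclusion); facts have no premises.
kb = [
    (frozenset(), "A"),
    (frozenset({"A"}), "B"),
    (frozenset({"B"}), "C"),
    (frozenset(), "C"),          # a second, independent derivation of C
]

def entails(axioms, goal):
    """Forward-chain the Horn axioms to a fixpoint and test the goal."""
    derived = set()
    changed = True
    while changed:
        changed = False
        for premises, conclusion in axioms:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return goal in derived

def minas(kb, goal):
    """All minimal subsets of kb (as index tuples) that entail the goal."""
    found = []
    for size in range(1, len(kb) + 1):
        for subset in combinations(range(len(kb)), size):
            if any(set(m) <= set(subset) for m in found):
                continue                     # contains a smaller MinA: not minimal
            if entails([kb[i] for i in subset], goal):
                found.append(subset)
    return found

result = minas(kb, "C")   # the lone fact C, and the three-axiom chain A, A->B, B->C
```

Enumerating subsets is exponential; the article's contribution is precisely to compute this pinpointing information within a (glass-box) tableau run instead.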

Proceedings Article
02 Jun 2010
TL;DR: This paper proposes a learning to rank algorithm that effectively utilizes the relationship information among the candidates when ranking and achieves 18.5% improvement in terms of accuracy over the classification models for those entities which have corresponding entries in the Knowledge Base.
Abstract: This paper addresses the problem of entity linking. Specifically, given an entity mentioned in unstructured texts, the task is to link this entity with an entry stored in the existing knowledge base. This is an important task for information extraction. It can serve as a convenient gateway to encyclopedic information, and can greatly improve the web users' experience. Previous learning-based solutions mainly focus on a classification framework. However, it is more suitable to consider it as a ranking problem. In this paper, we propose a learning to rank algorithm for entity linking. It effectively utilizes the relationship information among the candidates when ranking. The experiment results on the TAC 2009 dataset demonstrate the effectiveness of our proposed framework. The proposed method achieves 18.5% improvement in terms of accuracy over the classification models for those entities which have corresponding entries in the Knowledge Base. The overall performance of the system is also better than that of the state-of-the-art methods.
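The ranking view of entity linking can be illustrated with a pairwise perceptron over candidate feature vectors: weights are updated whenever a wrong candidate outscores the correct KB entry. The two features and the training pairs below are invented stand-ins for the paper's actual feature set.

```python
# Pairwise perceptron ranker for entity-linking candidates (toy features).
def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_ranker(data, epochs=10):
    """data: list of (correct_features, [wrong_features, ...])."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for pos, negs in data:
            for neg in negs:
                if score(w, pos) <= score(w, neg):   # ranking violation
                    w = [wi + p - n for wi, p, n in zip(w, pos, neg)]
    return w

# features: (name-similarity, context-overlap), invented values
train = [
    ((0.9, 0.8), [(0.9, 0.1), (0.2, 0.3)]),
    ((0.7, 0.9), [(0.8, 0.2)]),
]
w = train_ranker(train)

# At link time, the candidate with the highest score wins.
best = max([(0.9, 0.1), (0.9, 0.7)], key=lambda x: score(w, x))
```

A classification model would score each candidate independently; the pairwise updates here are what let the ranker exploit relationships among competing candidates.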

Journal ArticleDOI
TL;DR: A conceptual framework for perceiving and describing uncertainty in environmental decision-making is introduced, and it is argued that perceiving and describing uncertainty is an important prerequisite for deciding and acting under uncertainty.

Patent
30 Jun 2010
TL;DR: In this article, the authors present a system for determining user specific information and knowledge relevancy, relevant knowledge and information discovery, user intent and relevant interactions via intelligent messaging, collaboration, sharing and information categorisation, further delivering created knowledge accessible through a personalised user experience.
Abstract: Systems and methods for determining user specific information and knowledge relevancy, relevant knowledge and information discovery, user intent and relevant interactions via intelligent messaging, collaboration, sharing and information categorisation, further delivering created knowledge accessible through a personalised user experience.

Journal ArticleDOI
TL;DR: The main purpose of the paper is to present the integrated knowledge management model for the construction industry as well as the system architecture of the Knowledge Based Decision Support System for Construction Projects Management (KDSS-CPM) that the authors of this paper have developed.

Journal ArticleDOI
TL;DR: In this article, a conceptual flaw in the specialised literature which portrayed KIBS as a homogeneous group of activities is addressed. And the authors observe and analyse high variety across the KIB sectors' occupational structures and skill requirements.

Journal ArticleDOI
TL;DR: In this paper, the authors identify the kinds of knowledge about foreign country operations that managers deem to be important, expanding prior studies by attending to normative knowledge in addition to regulative and cultural knowledge.
Abstract: Projections for future demand in infrastructure and buildings indicate that there will be increasing opportunities for firms to engage in construction projects around the world. However, international construction projects also face numerous uncertainties. Foreign firms engaged on these projects must work in unfamiliar environments, with differing regulations, norms, and cultural beliefs. This can increase misunderstandings and risks for the entrant firm. To reduce these risks, successful international firms strategically increase their understanding of the local area by collecting knowledge that is important for a given foreign project. This study compiles and analyzes data from 15 case studies of three types of international firms (developers, contractors, and engineers) engaged in international infrastructure development to identify the types of institutional knowledge that informants indicate are important for their international projects. Using institutional theory, we categorize the kinds of knowledge about foreign country operations that managers deem to be important, expanding prior studies by attending to normative knowledge in addition to regulative and cultural knowledge. Finally, we analyze the importance of different categories of knowledge according to firm type. This analysis provides entrant firms a tool to help identify important types of institutional knowledge to collect as they undertake international projects.

Journal ArticleDOI
TL;DR: In this paper, the authors investigated how an incumbent company's internal characteristics influence its propensity to form learning alliances and found that a firm may be reluctant to enter a research alliance when it has deep knowledge in a certain technological field due to concerns about knowledge leakage and the low possibility of being able to learn much from collaboration.
Abstract: This study investigates how an incumbent company's internal characteristics influence its propensity to form learning alliances. A firm may be reluctant to enter a research alliance when it has deep knowledge in a certain technological field due to concerns about knowledge leakage and the low possibility of being able to learn much from collaboration. On the contrary, when the firm has a broad knowledge base, it may have high propensity to enter alliances due to more self-confidence in its ability to learn fast from partners. In addition, we argue that when a firm concentrates its R&D at a central location, this neutralizes the positive and negative influences of the two knowledge base features on alliance formation. We tested and found support for the hypotheses using a database of 1550 alliances undertaken by 78 large incumbent pharmaceutical, chemical, and agro-food companies active in the biotechnology sector during 1993–2002.

Journal ArticleDOI
TL;DR: In this paper, the authors present the cognitive and relational components of organizational socialization as key facilitators of knowledge transfer and demonstrate that teleworking may negatively affect these cognitive (shared mental schemes, language and narratives, and identification with goals and values) and relational (quality of relationships) components, depending on its frequency, location(s), and perception.
Abstract: Over the last decade, teleworking has gained momentum. While it has been portrayed as both employer- and employee-friendly, we question the positive normativity associated with teleworking by showing how it may endanger an organization's knowledge base and competitive advantage by threatening knowledge transfer between teleworkers and non-teleworkers. Drawing on the literature on knowledge we present the cognitive and relational components of organizational socialization as key facilitators of knowledge transfer and we demonstrate that teleworking may negatively affect these cognitive (shared mental schemes, language and narratives, and identification with goals and values) and relational (quality of relationships) components, depending on its frequency, location(s), and perception. Finally, we suggest some managerial avenues for addressing these potential negative side effects of teleworking.

Proceedings ArticleDOI
Yafang Wang1, Mingjie Zhu1, Lizhen Qu1, Marc Spaniol1, Gerhard Weikum1 
22 Mar 2010
TL;DR: This paper introduces Timely YAGO, which extends the previously built knowledge base YAGO with temporal aspects, extracts temporal facts from Wikipedia infoboxes, categories, and lists in articles, and integrates these into the Timely YAGO knowledge base.
Abstract: Recent progress in information extraction has shown how to automatically build large ontologies from high-quality sources like Wikipedia. But knowledge evolves over time; facts have associated validity intervals. Therefore, ontologies should include time as a first-class dimension. In this paper, we introduce Timely YAGO, which extends our previously built knowledge base YAGO with temporal aspects. This prototype system extracts temporal facts from Wikipedia infoboxes, categories, and lists in articles, and integrates these into the Timely YAGO knowledge base. We also support querying temporal facts, by temporal predicates in a SPARQL-style language. Visualization of query results is provided in order to better understand the dynamic nature of knowledge.
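Time as a first-class dimension means each fact carries a validity interval that queries can restrict. The facts and the tiny query helper below are illustrative; the real system expresses such queries with temporal predicates in a SPARQL-style language rather than Python filters.

```python
# Time-annotated facts and a point-in-time query, in the spirit of Timely YAGO.
facts = [
    # (subject, predicate, object, valid_from, valid_until)
    ("AngelaMerkel", "holdsOffice", "Chancellor", 2005, 2021),
    ("GerhardSchroeder", "holdsOffice", "Chancellor", 1998, 2005),
]

def query(predicate, obj, at_year):
    """Subjects for which (s, predicate, obj) holds at the given year."""
    return [s for (s, p, o, t0, t1) in facts
            if p == predicate and o == obj and t0 <= at_year < t1]
```

Without the interval columns, both triples would assert the same office simultaneously; the temporal dimension is what keeps the knowledge base consistent as facts evolve.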

Journal ArticleDOI
TL;DR: A decision support system (DSS) based on fuzzy information axiom (FIA) is developed in order to make this decision procedure easy and to help the decision makers to solve their decision problems by modifying data-base of the program.
Abstract: Information axiom, one of two axioms of axiomatic design methodology which is proposed to improve a design, is used to select the best design among proposed designs. In the literature, there are a lot of studies related to using of information axiom for the solution of decision making problems. Moreover, applications of information axiom have been increasing day by day. However, calculation procedure of information axiom is not only incommodious but also difficult for decision makers. In this paper, a decision support system (DSS) based on fuzzy information axiom (FIA) is developed in order to make this decision procedure easy. The developed system consists of a knowledge base module including facts and rules, inference engine module including FIA and aggregation method, and a user interface module including entrance windows. The main aim of this study is to present a DSS tool to help the decision makers to solve their decision problems by modifying data-base of the program. In this paper, an application procedure will be presented based on the optimal selection of location for emergency service to illustrate the implementation procedure of the proposed model.
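In its crisp form, the information axiom scores each alternative by the information content I = log2(system range / common range), and the design with the smallest total I is preferred. The paper's FIA works with fuzzy numbers; the crisp intervals below are invented stand-ins that show only the core calculation.

```python
# Crisp information-axiom scoring of design alternatives (illustrative intervals).
import math

def information_content(system_range, design_range):
    """Both ranges are (low, high) intervals; smaller I is better."""
    lo = max(system_range[0], design_range[0])
    hi = min(system_range[1], design_range[1])
    common = max(0.0, hi - lo)
    if common == 0.0:
        return math.inf                      # design cannot meet the requirement
    system = system_range[1] - system_range[0]
    return math.log2(system / common)

# Two candidate locations scored on one criterion (e.g. response time, minutes)
design_range = (0, 10)                       # what the decision maker requires
alternatives = {"siteA": (5, 15), "siteB": (8, 30)}

scores = {name: information_content(rng, design_range)
          for name, rng in alternatives.items()}
best = min(scores, key=scores.get)           # siteA overlaps the requirement more
```

The DSS described in the paper wraps this calculation in a knowledge base of facts and rules plus an inference engine; here only the axiom itself is sketched.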

Journal ArticleDOI
TL;DR: This paper presents the integration of methodologies with a model of knowledge for conceptual design in accordance with model-driven engineering, extends the FBS model, and presents its practical implementation through an ontology and a language such as SysML.

Journal ArticleDOI
TL;DR: In this article, a theoretical framework for developing expatriate managers' local competence in emerging markets from a knowledge-based perspective is proposed, which explores the processes and mechanisms through which local knowledge can be acquired and integrated into expat managers' knowledge base supporting local talent development and their effective strategic decision-making.

Journal ArticleDOI
TL;DR: The design and evaluation results are presented for a system called AURA, which enables domain experts in physics, chemistry, and biology to author a knowledge base and then allows a different set of users to ask novel questions against that knowledge base.
Abstract: In the winter, 2004 issue of AI Magazine, we reported Vulcan Inc.'s first step toward creating a question-answering system called "Digital Aristotle." The goal of that first step was to assess the state of the art in applied Knowledge Representation and Reasoning (KRR) by asking AI experts to represent 70 pages from the advanced placement (AP) chemistry syllabus and to deliver knowledge-based systems capable of answering questions from that syllabus. This paper reports the next step toward realizing a Digital Aristotle: we present the design and evaluation results for a system called AURA, which enables domain experts in physics, chemistry, and biology to author a knowledge base and that then allows a different set of users to ask novel questions against that knowledge base. These results represent a substantial advance over what we reported in 2004, both in the breadth of covered subjects and in the provision of sophisticated technologies in knowledge representation and reasoning, natural language processing, and question answering to domain experts and novice users.

Journal ArticleDOI
01 Dec 2010
TL;DR: Results of this study facilitate the tacit knowledge storage, management and sharing to provide knowledge requesters with accurate and comprehensive empirical knowledge for problem solving and decision support.
Abstract: In the knowledge economy era of the 21st century [14,17], the competitive advantage of enterprises has shifted from the visible equipment, capital and labor of the past to invisible knowledge. Knowledge can be distinguished into tacit knowledge and explicit knowledge; tacit knowledge largely encompasses empirical knowledge that is difficult to document and generally hidden inside personal mental models. The inability to transfer tacit knowledge to organizational knowledge causes it to disappear after knowledge workers leave their posts, ultimately losing important intellectual assets for enterprises. Therefore, enterprises attempting to create higher knowledge value are highly concerned with how to transfer the personal empirical knowledge inside an enterprise into organizational explicit knowledge by using a systematic method to manage and share such valuable empirical knowledge effectively. This study develops a method of ontology-based empirical knowledge representation and reasoning, which adopts OWL (Web Ontology Language) to represent empirical knowledge in a structural way in order to help knowledge requesters clearly understand empirical knowledge. An ontology reasoning method is subsequently adopted to deduce empirical knowledge in order to share and reuse relevant empirical knowledge effectively. Specifically, this study involves the following tasks: (i) analyze characteristics of empirical knowledge, (ii) design an ontology-based multi-layer empirical knowledge representation model, (iii) design an ontology-based empirical knowledge concept schema, (iv) establish an OWL-based empirical knowledge ontology, (v) design reasoning rules for ontology-based empirical knowledge, (vi) develop a reasoning algorithm for ontology-based empirical knowledge, and (vii) implement an ontology-based empirical knowledge reasoning mechanism.
Results of this study facilitate the tacit knowledge storage, management and sharing to provide knowledge requesters with accurate and comprehensive empirical knowledge for problem solving and decision support.
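The kind of ontology reasoning the study implements can be shown with a toy forward-chaining loop: rdfs-style rules derive new triples (subclass transitivity and type inheritance) until a fixpoint. The triples are invented, and a production system would use a real OWL reasoner rather than this nested loop.

```python
# Toy rdfs-style saturation: subClassOf transitivity and type inheritance.
triples = {
    ("MachineFault", "subClassOf", "EquipmentProblem"),
    ("EquipmentProblem", "subClassOf", "Problem"),
    ("case42", "type", "MachineFault"),
}

def saturate(triples):
    """Apply the two rules until no new triple can be derived."""
    triples = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (a, p1, b) in triples:
            for (c, p2, d) in triples:
                if p1 == p2 == "subClassOf" and b == c:
                    new.add((a, "subClassOf", d))       # transitivity
                if p1 == "type" and p2 == "subClassOf" and b == c:
                    new.add((a, "type", d))             # inheritance
        if not new <= triples:
            triples |= new
            changed = True
    return triples

inferred = saturate(triples)
# case42 is now also typed as EquipmentProblem and Problem, so a query for
# general "Problem" experiences retrieves the specific machine-fault case.
```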