
Showing papers on "Domain knowledge published in 2021"


Journal ArticleDOI
TL;DR: The widespread use of neural networks across different scientific domains often involves constraining them to satisfy certain symmetries, conservation laws, or other domain knowledge as discussed by the authors, and such constrai...
Abstract: The widespread use of neural networks across different scientific domains often involves constraining them to satisfy certain symmetries, conservation laws, or other domain knowledge. Such constrai...

227 citations


Journal ArticleDOI
04 Mar 2021
TL;DR: In this paper, the authors discuss the potential of applying supervised/unsupervised deep learning and deep reinforcement learning in ultrareliable and low-latency communications (URLLCs) in future 6G networks.
Abstract: As one of the key communication scenarios in the fifth-generation and also the sixth-generation (6G) mobile communication networks, ultrareliable and low-latency communications (URLLCs) will be central for the development of various emerging mission-critical applications. State-of-the-art mobile communication systems do not fulfill the end-to-end delay and overall reliability requirements of URLLCs. In particular, a holistic framework that takes into account latency, reliability, availability, scalability, and decision-making under uncertainty is lacking. Driven by recent breakthroughs in deep neural networks, deep learning algorithms have been considered as promising ways of developing enabling technologies for URLLCs in future 6G networks. This tutorial illustrates how domain knowledge (models, analytical tools, and optimization frameworks) of communications and networking can be integrated into different kinds of deep learning algorithms for URLLCs. We first provide some background of URLLCs and review promising network architectures and deep learning frameworks for 6G. To better illustrate how to improve learning algorithms with domain knowledge, we revisit model-based analytical tools and cross-layer optimization frameworks for URLLCs. Following this, we examine the potential of applying supervised/unsupervised deep learning and deep reinforcement learning in URLLCs and summarize related open problems. Finally, we provide simulation and experimental results to validate the effectiveness of different learning algorithms and discuss future directions.

203 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: Zhang et al. as mentioned in this paper proposed a source-free domain adaptation framework for semantic segmentation, in which only a well-trained source model and an unlabeled target domain dataset are available for adaptation, which not only enables recovering and preserving the source domain knowledge from the source model via knowledge transfer during model adaptation, but also distills valuable information from the target domain for self-supervised learning.
Abstract: Unsupervised Domain Adaptation (UDA) can tackle the challenge that convolutional neural network (CNN)-based approaches for semantic segmentation heavily rely on pixel-level annotated data, which is labor-intensive to obtain. However, existing UDA approaches inevitably require full access to the source datasets to reduce the gap between the source and target domains during model adaptation, which is impractical in real scenarios where the source datasets are private and thus cannot be released along with the well-trained source models. To cope with this issue, we propose a source-free domain adaptation framework for semantic segmentation, namely SFDA, in which only a well-trained source model and an unlabeled target domain dataset are available for adaptation. SFDA not only recovers and preserves the source domain knowledge from the source model via knowledge transfer during model adaptation, but also distills valuable information from the target domain for self-supervised learning. Pixel- and patch-level optimization objectives tailored for semantic segmentation are seamlessly integrated into the framework. Extensive experimental results on numerous benchmark datasets highlight the effectiveness of our framework against existing UDA approaches that rely on source data.

145 citations
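Self-supervised learning on the unlabeled target domain is commonly driven by pseudo-labels from the source model. This generic confidence-thresholding sketch illustrates the idea (it is not SFDA's exact objective; the 0.9 threshold is an assumed value):

```python
import numpy as np

def pseudo_labels(probs, threshold=0.9):
    """Keep only high-confidence predictions from the source model as
    pseudo-labels for self-supervised adaptation; low-confidence entries
    are marked -1 and ignored during training."""
    conf = probs.max(axis=-1)          # per-sample confidence
    labels = probs.argmax(axis=-1)     # predicted class
    labels[conf < threshold] = -1      # ignore uncertain predictions
    return labels

probs = np.array([[0.95, 0.03, 0.02],    # confident -> kept as class 0
                  [0.50, 0.30, 0.20]])   # uncertain -> ignored
print(pseudo_labels(probs))              # prints [ 0 -1]
```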


Journal ArticleDOI
TL;DR: This survey summarizes current progress on integrating medical domain knowledge into deep learning models for various tasks, such as disease diagnosis; lesion, organ, and abnormality detection; and lesion and organ segmentation, and systematically categorizes the kinds of medical domain knowledge that have been utilized and their corresponding integration methods.

114 citations


Journal ArticleDOI
TL;DR: This work presents an in-depth analysis of existing deep learning based methods for modelling social interactions, and proposes a simple yet powerful method for effectively capturing these social interactions.
Abstract: Over the past few decades, human trajectory forecasting has been a field of active research owing to its numerous real-world applications: evacuation situation analysis, deployment of intelligent transport systems, traffic operations, to name a few. In this work, we cast the problem of human trajectory forecasting as learning a representation of human social interactions. Early works handcrafted this representation based on domain knowledge. However, social interactions in crowded environments are not only diverse but often subtle. Recently, deep learning methods have outperformed their handcrafted counterparts, as they learn about human-human interactions in a more generic data-driven fashion. In this work, we present an in-depth analysis of existing deep learning-based methods for modelling social interactions. We propose two domain-knowledge-inspired data-driven methods to effectively capture these social interactions. To objectively compare the performance of these interaction-based forecasting models, we develop a large-scale, interaction-centric benchmark, TrajNet++, a significant yet missing component in the field of human trajectory forecasting. We propose novel performance metrics that evaluate the ability of a model to output socially acceptable trajectories. Experiments on TrajNet++ validate the need for our proposed metrics, and our method outperforms competitive baselines on both real-world and synthetic datasets.

108 citations
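A metric for "socially acceptable" trajectories can be as simple as a pairwise collision rate: the fraction of pedestrian pairs that come closer than a comfort radius at any timestep. This generic sketch is illustrative only, not TrajNet++'s actual metric definition, and the 0.2 m radius is a hypothetical value:

```python
import numpy as np

def collision_rate(trajs, radius=0.2):
    """Fraction of pedestrian pairs whose trajectories come closer than
    `radius` at any timestep. trajs has shape (n_peds, n_steps, 2)."""
    n = trajs.shape[0]
    collisions, pairs = 0, 0
    for i in range(n):
        for j in range(i + 1, n):
            pairs += 1
            dists = np.linalg.norm(trajs[i] - trajs[j], axis=-1)
            if dists.min() < radius:
                collisions += 1
    return collisions / pairs

# Two pedestrians walking in parallel, a third crossing their path.
t = np.linspace(0, 1, 5)
trajs = np.stack([
    np.stack([t, np.zeros_like(t)], axis=-1),             # along x-axis
    np.stack([t, np.ones_like(t)], axis=-1),              # parallel, 1 m away
    np.stack([0.5 * np.ones_like(t), t - 0.5], axis=-1),  # crossing
])
print(collision_rate(trajs))   # only the crossing pair collides
```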


Journal ArticleDOI
TL;DR: A principled Bayesian workflow is introduced that provides guidelines and checks for valid data analysis, avoiding overfitting complex models to noise, and capturing relevant data structure in a probabilistic model.
Abstract: Experiments in research on memory, language, and in other areas of cognitive science are increasingly being analyzed using Bayesian methods. This has been facilitated by the development of probabilistic programming languages such as Stan, and easily accessible front-end packages such as brms. The utility of Bayesian methods, however, ultimately depends on the relevance of the Bayesian model, in particular whether or not it accurately captures the structure of the data and the data analyst's domain expertise. Even with powerful software, the analyst is responsible for verifying the utility of their model. To demonstrate this point, we introduce a principled Bayesian workflow (Betancourt, 2018) to cognitive science. Using a concrete working example, we describe basic questions one should ask about the model: prior predictive checks, computational faithfulness, model sensitivity, and posterior predictive checks. The running example for demonstrating the workflow is data on reading times with a linguistic manipulation of object versus subject relative clause sentences. This principled Bayesian workflow also demonstrates how to use domain knowledge to inform prior distributions. It provides guidelines and checks for valid data analysis, avoiding overfitting complex models to noise, and capturing relevant data structure in a probabilistic model. Given the increasing use of Bayesian methods, we aim to discuss how these methods can be properly employed to obtain robust answers to scientific questions. All data and code accompanying this article are available from https://osf.io/b2vx9/. (PsycInfo Database Record (c) 2021 APA, all rights reserved).

82 citations
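A prior predictive check like the one in this workflow can be sketched without any probabilistic-programming machinery: draw parameters from the priors, simulate data, and ask whether the simulations look like plausible reading times. The lognormal likelihood and the specific prior values below are hypothetical, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(0)

def prior_predictive(n_sims=1000, n_obs=100):
    """Simulate reading-time datasets from the priors alone (no data)."""
    sims = []
    for _ in range(n_sims):
        # Hypothetical weakly informative priors on the log-ms scale.
        mu = rng.normal(6.0, 0.5)          # prior for mean log reading time
        sigma = abs(rng.normal(0.0, 0.5))  # half-normal prior for the sd
        sims.append(rng.lognormal(mu, sigma, n_obs))
    return np.array(sims)

sims = prior_predictive()
# Check: do simulated reading times fall in a plausible range (tens of
# milliseconds to a few seconds), or do the priors generate absurd data?
print(np.median(sims), np.quantile(sims, [0.01, 0.99]))
```

If the simulated data are wildly implausible, the priors (not the data) need revisiting; this is the step the workflow performs before fitting.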


Journal ArticleDOI
TL;DR: An overview of the methods that use ontologies to compute similarity and incorporate them in machine learning methods is provided, which outlines how semantic similarity measures and ontology embeddings can exploit the background knowledge in ontologies and how ontologies can provide constraints that improve machine learning models.
Abstract: Ontologies have long been employed in the life sciences to formally represent and reason over domain knowledge and they are employed in almost every major biological database. Recently, ontologies are increasingly being used to provide background knowledge in similarity-based analysis and machine learning models. The methods employed to combine ontologies and machine learning are still novel and actively being developed. We provide an overview of the methods that use ontologies to compute similarity and incorporate them in machine learning methods; in particular, we outline how semantic similarity measures and ontology embeddings can exploit the background knowledge in ontologies and how ontologies can provide constraints that improve machine learning models. The methods and experiments we describe are available as a set of executable notebooks, and we also provide a set of slides and additional resources at https://github.com/bio-ontology-research-group/machine-learning-with-ontologies.

81 citations
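A common way ontologies feed similarity-based analysis is through information-content measures such as Resnik similarity: the similarity of two terms is the information content of their most informative common ancestor. A self-contained toy sketch (the ontology and annotation frequencies are invented for illustration):

```python
import math

# Toy ontology as child -> parents edges (hypothetical GO-like terms).
parents = {
    "binding": ["molecular_function"],
    "protein_binding": ["binding"],
    "dna_binding": ["binding"],
    "catalysis": ["molecular_function"],
    "molecular_function": [],
}

# Hypothetical annotation counts used to estimate information content.
freq = {"molecular_function": 100, "binding": 60,
        "protein_binding": 25, "dna_binding": 20, "catalysis": 40}
total = freq["molecular_function"]

def ancestors(term):
    """All ancestors of a term, including the term itself."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents[t])
    return seen

def ic(term):
    # Information content: -log of annotation probability.
    return -math.log(freq[term] / total)

def resnik(a, b):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(a) & ancestors(b)
    return max(ic(t) for t in common)

print(resnik("protein_binding", "dna_binding"))  # IC of "binding"
```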


Journal ArticleDOI
TL;DR: A novel multimodal emotion recognition model for conversational videos based on reinforcement learning and domain knowledge (ERLDK) is proposed in this paper and achieves state-of-the-art results on weighted average and on most specific emotion categories.
Abstract: Multimodal emotion recognition in conversational videos (ERC) has developed rapidly in recent years. To fully extract the relative context from video clips, most studies build their models on entire dialogues, which deprives them of real-time ERC capability. Unlike related research, a novel multimodal emotion recognition model for conversational videos based on reinforcement learning and domain knowledge (ERLDK) is proposed in this paper. In ERLDK, a reinforcement learning algorithm is introduced to conduct real-time ERC as conversations occur. The collection of history utterances is composed into an emotion-pair that represents the multimodal context of the next utterance to be recognized. A dueling deep-Q-network (DDQN) based on gated recurrent unit (GRU) layers is designed to learn the correct action from the alternative emotion categories. Domain knowledge is extracted from a public dataset based on the prior information of emotion-pairs. The extracted domain knowledge is used to revise the results from the RL module and is transferred to another dataset to examine its rationality. Experimental results show that ERLDK achieves state-of-the-art results on weighted average and on most specific emotion categories.

71 citations
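The DDQN recognizer described above uses the standard dueling aggregation of a value stream and an advantage stream. A minimal numeric sketch (the feature dimension and the six-category action space are hypothetical, and linear streams stand in for the GRU-based network):

```python
import numpy as np

def dueling_q(features, w_v, w_a):
    """Combine a state-value stream V and an advantage stream A into
    Q-values using the standard dueling aggregation:
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    v = features @ w_v   # scalar state value, shape (1,)
    a = features @ w_a   # one advantage per action
    return v + a - a.mean()

rng = np.random.default_rng(1)
features = rng.normal(size=4)      # e.g. a GRU hidden state
w_v = rng.normal(size=(4, 1))
w_a = rng.normal(size=(4, 6))      # hypothetical 6 emotion categories
q = dueling_q(features, w_v, w_a)
print(q.shape, int(q.argmax()))    # Q-values and the chosen emotion
```

Subtracting the mean advantage makes the V/A decomposition identifiable, which is the point of the dueling architecture.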


Proceedings ArticleDOI
01 Jun 2021
TL;DR: By applying a novel knowledge augmentation strategy, UmlsBERT can encode clinical domain knowledge into word embeddings and outperform existing domain-specific models on common named-entity recognition (NER) and clinical natural language inference tasks.
Abstract: Contextual word embedding models, such as BioBERT and Bio_ClinicalBERT, have achieved state-of-the-art results in biomedical natural language processing tasks by focusing their pre-training process on domain-specific corpora. However, such models do not take into consideration structured expert domain knowledge from a knowledge base. We introduce UmlsBERT, a contextual embedding model that integrates domain knowledge during the pre-training process via a novel knowledge augmentation strategy. More specifically, the augmentation on UmlsBERT with the Unified Medical Language System (UMLS) Metathesaurus is performed in two ways: i) connecting words that have the same underlying ‘concept’ in UMLS and ii) leveraging semantic type knowledge in UMLS to create clinically meaningful input embeddings. By applying these two strategies, UmlsBERT can encode clinical domain knowledge into word embeddings and outperform existing domain-specific models on common named-entity recognition (NER) and clinical natural language inference tasks.

70 citations
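The first UmlsBERT augmentation, connecting words that share an underlying UMLS concept, can be sketched as a toy lookup that expands a term's training target to all surface forms with the same concept id (the CUIs and terms below are made up, not real UMLS content):

```python
# Toy sketch: link surface forms that share a UMLS-style concept id.
concept_of = {
    "heart attack": "C0001",
    "myocardial infarction": "C0001",
    "hypertension": "C0002",
    "high blood pressure": "C0002",
}

def expand_targets(term):
    """Return the term plus every synonym mapping to the same concept,
    so the model is trained to treat them alike."""
    cui = concept_of[term]
    return sorted(t for t, c in concept_of.items() if c == cui)

print(expand_targets("heart attack"))
```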


Journal ArticleDOI
Anping Zhao, Yu Yu
TL;DR: This work proposes a knowledge-enabled language representation model BERT that leverages the additional information from a sentiment knowledge graph by injecting sentiment domain knowledge into the language representation models, which obtains the embedding vectors of entities in the sentiment knowledge graphs and words in the text in a consistent vector space.
Abstract: To provide explainable and accurate aspect terms and the corresponding aspect–sentiment detection, it is often useful to take external domain-specific knowledge into consideration. In this work, we propose a knowledge-enabled language representation model BERT for aspect-based sentiment analysis. Specifically, our proposal leverages the additional information from a sentiment knowledge graph by injecting sentiment domain knowledge into the language representation model, which obtains the embedding vectors of entities in the sentiment knowledge graph and words in the text in a consistent vector space. In addition, the model is capable of achieving better performance with a small amount of training data by incorporating external domain knowledge into the language representation model to compensate for the limited training data. As a result, our model is able to provide explainable and detailed results for aspect-based sentiment analysis. Experimental results demonstrate the effectiveness of the proposed method, showing that the knowledge-enabled BERT is an excellent choice for solving aspect-based sentiment analysis problems.

67 citations


Journal ArticleDOI
TL;DR: It is shown that by using ontologies the authors can improve the human understandability of global post-hoc explanations, presented in the form of decision trees, with little compromise on the accuracy with which the surrogate decision trees replicate the behaviour of the original neural network models.

Proceedings ArticleDOI
Yang Shu, Zhangjie Cao, Chenyu Wang, Jianmin Wang, Mingsheng Long
20 Jun 2021
TL;DR: In this paper, a Domain-Augmented Meta-Learning (DAML) framework is proposed to learn open-domain generalizable representations by augmenting domains on both feature-level and label-level by distilled soft-labeling.
Abstract: Leveraging available datasets to learn a model with high generalization ability to unseen domains is important for computer vision, especially when the unseen domain's annotated data are unavailable. We study a novel and practical problem of Open Domain Generalization (OpenDG), which learns from different source domains to achieve high performance on an unknown target domain, where the distributions and label sets of each individual source domain and the target domain can differ. The setting covers diverse source domains and is widely applicable to real-world applications. We propose a Domain-Augmented Meta-Learning framework to learn open-domain generalizable representations. We augment domains at both the feature level, via a new Dirichlet mixup, and the label level, via distilled soft-labeling, which complements each domain with missing classes and knowledge from the other domains. We conduct meta-learning over domains by designing new meta-learning tasks and losses to preserve domain-unique knowledge and generalize knowledge across domains simultaneously. Experimental results on various multi-domain datasets demonstrate that the proposed Domain-Augmented Meta-Learning (DAML) outperforms prior methods on unseen domain recognition.
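The feature-level Dirichlet mixup can be sketched as drawing convex-combination weights over the source domains from a Dirichlet distribution (the alpha value and dimensions below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def dirichlet_mixup(domain_feats, alpha=1.0):
    """Mix one feature vector from each source domain with weights drawn
    from a Dirichlet distribution, producing a sample from a synthetic
    'augmented' domain (a sketch of feature-level domain augmentation)."""
    feats = np.stack(domain_feats)                    # (n_domains, dim)
    lam = rng.dirichlet(alpha * np.ones(len(feats)))  # weights sum to 1
    return lam @ feats, lam                           # convex combination

d1, d2, d3 = rng.normal(size=(3, 8))   # features from 3 source domains
mixed, lam = dirichlet_mixup([d1, d2, d3])
print(mixed.shape, lam.sum())
```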

Journal ArticleDOI
TL;DR: This paper addresses the design of RL to create an adaptive production control system, using the real-world example of order dispatching in a complex job shop, and examines how the design of the RL state, action, and reward functions affects performance.
Abstract: Modern production systems face enormous challenges due to rising customer requirements resulting in complex production systems. The operational efficiency in the competitive industry is ensured by an adequate production control system that manages all operations in order to optimize key performance indicators. Currently, control systems are mostly based on static and model-based heuristics, requiring significant human domain knowledge, and hence do not match the dynamic environment of manufacturing companies. Data-driven reinforcement learning (RL) has shown compelling results in applications such as board and computer games as well as first production applications. This paper addresses the design of RL to create an adaptive production control system, using the real-world example of order dispatching in a complex job shop. As RL algorithms are "black box" approaches, they inherently prohibit a comprehensive understanding. Furthermore, experience with advanced RL algorithms is still limited to individual successful applications, which limits the transferability of results. In this paper, we examine how the design of the RL state, action, and reward functions affects performance. When analyzing the results, we identify robust RL designs. This makes RL an advantageous control system for highly dynamic and complex production systems, particularly when domain knowledge is limited.

Journal ArticleDOI
TL;DR: The CBRPMO contributes to the industry by extending the application of ontologies in the bridge sector to cover the rehabilitation stage, enhancing the functions of conventional ontologies, and reducing information search time compared to manual searching, thereby improving constraint management approaches by automating the information search step.

Journal ArticleDOI
TL;DR: Gryffin, introduced in this article, augments Bayesian optimization based on kernel density estimation with smooth approximations to categorical distributions, which can significantly accelerate the search for promising molecules and materials.
Abstract: Designing functional molecules and advanced materials requires complex design choices: tuning continuous process parameters such as temperatures or flow rates, while simultaneously selecting catalysts or solvents. To date, the development of data-driven experiment planning strategies for autonomous experimentation has largely focused on continuous process parameters, despite the urge to devise efficient strategies for the selection of categorical variables. Here, we introduce Gryffin, a general-purpose optimization framework for the autonomous selection of categorical variables driven by expert knowledge. Gryffin augments Bayesian optimization based on kernel density estimation with smooth approximations to categorical distributions. Leveraging domain knowledge in the form of physicochemical descriptors, Gryffin can significantly accelerate the search for promising molecules and materials. Gryffin can further highlight relevant correlations between the provided descriptors to inspire physical insights and foster scientific intuition. In addition to comprehensive benchmarks, we demonstrate the capabilities and performance of Gryffin on three examples in materials science and chemistry: (i) the discovery of non-fullerene acceptors for organic solar cells, (ii) the design of hybrid organic–inorganic perovskites for light-harvesting, and (iii) the identification of ligands and process parameters for Suzuki–Miyaura reactions. Our results suggest that Gryffin, in its simplest form, is competitive with state-of-the-art categorical optimization algorithms. However, when leveraging domain knowledge provided via descriptors, Gryffin outperforms other approaches while simultaneously refining this domain knowledge to promote scientific understanding.
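One way to picture Gryffin's core idea, smoothing a categorical distribution with kernel density estimation, is to place a Gaussian kernel on each observed one-hot choice and read off per-category densities. A toy sketch only; the bandwidth and the exact functional form are assumptions, not Gryffin's implementation:

```python
import numpy as np

def smooth_categorical(one_hot_obs, precision=2.0):
    """Smooth observed categorical choices into a continuous distribution:
    place a Gaussian kernel on each observed one-hot point and evaluate
    the density at each category corner of the simplex."""
    obs = np.asarray(one_hot_obs, dtype=float)   # (n_obs, n_cats)
    corners = np.eye(obs.shape[1])               # one corner per category
    # Squared distance from every category corner to every observation.
    d2 = ((corners[:, None, :] - obs[None, :, :]) ** 2).sum(-1)
    dens = np.exp(-precision * d2).sum(axis=1)
    return dens / dens.sum()                     # normalized probabilities

# Three observations of category 0, one of category 2 (out of 3).
obs = [[1, 0, 0], [1, 0, 0], [1, 0, 0], [0, 0, 1]]
p = smooth_categorical(obs)
print(p)   # category 0 gets the most mass; unseen category 1 gets a little
```

The smoothing gives unobserved categories nonzero probability, which is what lets a kernel-based optimizer explore categorical choices gracefully.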

Journal ArticleDOI
TL;DR: The goal of this paper is to provide a review of the recent advances in BLM from both communities and inspire new research directions.
Abstract: Recent years have witnessed a rapid increase in the number of scientific articles in the biomedical domain. This literature is mostly available and readily accessible in electronic format. The domain knowledge hidden in it is critical for biomedical research and applications, which makes biomedical literature mining (BLM) techniques highly in demand. Numerous efforts have been made on this topic by both the biomedical informatics (BMI) and computer science (CS) communities. The BMI community focuses more on concrete application problems and thus prefers more interpretable and descriptive methods, while the CS community pursues superior performance and generalization ability, and thus develops more sophisticated and universal models. The goal of this paper is to provide a review of the recent advances in BLM from both communities and inspire new research directions.

Journal ArticleDOI
TL;DR: A novel domain-knowledge-based deep-broad learning framework (DK-DBLF) that significantly reduces the usage of labeled samples in the learning process and allows easier architecture adjustment than traditional deep methods.
Abstract: Intelligent fault diagnosis plays a vital role in smart manufacturing, and deep-learning-based fault diagnosis has become a hot topic due to its strong feature extraction ability. However, traditional deep-learning-based methods show two limitations. One is that a large number of labeled samples are required to construct effective diagnosis models. Another is that these methods lack flexibility, especially for homologous multitasking problems. In this article, a novel domain-knowledge-based deep-broad learning framework (DK-DBLF) is proposed to overcome the aforementioned limitations. A DK-DBLF consists of two parts: a task-specific feature extractor and a flexible fault recognizer. The first part is constructed from several convolutional neural networks to obtain abstract features automatically, and the second part employs a broad learning system to improve the flexibility of the proposed framework. To combine these two parts more effectively, a bridge-label-based strategy is designed, which is the key connection that integrates domain knowledge into the learning process. The performance of DK-DBLF is tested on motor-bearing and pipeline defect datasets, which pose health condition classification and homologous multitask estimation problems, respectively. The results demonstrate that our framework can significantly reduce the usage of labeled samples in the learning process, and that architecture adjustment can be performed more easily than with traditional deep methods.

Journal ArticleDOI
TL;DR: This article demonstrates how knowledge, provided as a knowledge graph, is incorporated into DL using K-iL, and discusses the utility of K-iL towards interpretability and explainability.
Abstract: The recent series of innovations in deep learning (DL) have shown enormous potential to impact individuals and society, both positively and negatively. DL models utilizing massive computing power and enormous datasets have significantly outperformed prior historical benchmarks on increasingly difficult, well-defined research tasks across technology domains such as computer vision, natural language processing, and human-computer interactions. However, DL's black-box nature and over-reliance on massive amounts of data condensed into labels and dense representations pose challenges for interpretability and explainability. Furthermore, DLs have not proven their ability to effectively utilize relevant domain knowledge critical to human understanding. This aspect was missing in early data-focused approaches and necessitated knowledge-infused learning (K-iL) to incorporate computational knowledge. This article demonstrates how knowledge, provided as a knowledge graph, is incorporated into DL using K-iL. Through examples from natural language processing applications in healthcare and education, we discuss the utility of K-iL towards interpretability and explainability.

Proceedings ArticleDOI
02 Nov 2021
TL;DR: In this paper, a fixed ratio-based mixup is introduced to augment multiple intermediate domains between the source and target domains to learn domain invariant representations for unsupervised domain adaptation.
Abstract: Unsupervised domain adaptation (UDA) methods for learning domain-invariant representations have achieved remarkable progress. However, most studies are based on direct adaptation from the source domain to the target domain and suffer from large domain discrepancies. In this paper, we propose a UDA method that effectively handles such large domain discrepancies. We introduce a fixed ratio-based mixup to augment multiple intermediate domains between the source and target domains. From the augmented domains, we train a source-dominant model and a target-dominant model that have complementary characteristics. Using our confidence-based learning methodologies, e.g., bidirectional matching with high-confidence predictions and self-penalization using low-confidence predictions, the models can learn from each other or from their own results. Through our proposed methods, the models gradually transfer domain knowledge from the source to the target domain. Extensive experiments demonstrate the superiority of our proposed method on three public benchmarks: Office-31, Office-Home, and VisDA-2017.
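The fixed ratio-based mixup above can be sketched in a few lines; the 0.7/0.3 ratios are hypothetical placeholders standing in for the paper's source-dominant and target-dominant settings:

```python
import numpy as np

def fixed_ratio_mixup(x_src, x_tgt, ratios=(0.7, 0.3)):
    """Build intermediate domains between source and target by mixing
    image batches at fixed ratios: a source-dominant mix (lam = 0.7)
    and a target-dominant mix (lam = 0.3)."""
    return [lam * x_src + (1 - lam) * x_tgt for lam in ratios]

rng = np.random.default_rng(0)
x_src = rng.random((2, 3, 32, 32))   # a small batch of source images
x_tgt = rng.random((2, 3, 32, 32))   # unlabeled target images
src_dom, tgt_dom = fixed_ratio_mixup(x_src, x_tgt)
print(src_dom.shape)                 # same shape as the input batch
```

Each mixed batch then trains the corresponding model, so the two models see complementary views of the source-target gap.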

Proceedings ArticleDOI
19 Apr 2021
TL;DR: In this article, the authors propose a novel framework, called AutoSTG, for automated spatio-temporal graph prediction, in which spatial graph convolution and temporal convolution operations are adopted in the search space to capture complex spatio-temporal correlations.
Abstract: Spatio-temporal graphs are important structures to describe urban sensory data, e.g., traffic speed and air quality. Predicting over spatio-temporal graphs enables many essential applications in intelligent cities, such as traffic management and environment analysis. Recently, many deep learning models have been proposed for spatio-temporal graph prediction and achieved significant results. However, designing neural networks requires rich domain knowledge and expert effort. To this end, we study automated neural architecture search for spatio-temporal graphs with application to urban traffic prediction, which meets two challenges: 1) how to define a search space for capturing complex spatio-temporal correlations; and 2) how to learn network weight parameters related to the corresponding attributed graph of a spatio-temporal graph. To tackle these challenges, we propose a novel framework, entitled AutoSTG, for automated spatio-temporal graph prediction. In AutoSTG, spatial graph convolution and temporal convolution operations are adopted in the search space to capture complex spatio-temporal correlations. In addition, we employ the meta-learning technique to learn the adjacency matrices of spatial graph convolution layers and the kernels of temporal convolution layers from the meta knowledge of the attributed graph. Specifically, such meta knowledge is learned by a graph meta knowledge learner that iteratively aggregates knowledge on the attributed graph. Finally, extensive experiments were conducted on two real-world benchmark datasets to demonstrate that AutoSTG can find effective network architectures and achieve state-of-the-art results. To the best of our knowledge, we are the first to study neural architecture search for spatio-temporal graphs.

Journal ArticleDOI
TL;DR: The research introduces the related concepts of knowledge representation and analyzes the knowledge representation of knowledge graphs by category, covering some classical general knowledge graphs and several typical domain knowledge graphs.
Abstract: The domain knowledge graph has become a research topic in the era of artificial intelligence. Knowledge representation is the key step in constructing a domain knowledge graph. There are quite a few well-established general knowledge graphs; however, gaps remain in domain knowledge graph construction. This research introduces the related concepts of knowledge representation and analyzes the knowledge representation of knowledge graphs by category, covering some classical general knowledge graphs and several typical domain knowledge graphs. The paper also discusses the development of knowledge representation according to differences in entities, relationships, and properties, and presents the unsolved problems and future research trends in the knowledge representation of domain knowledge graph studies.

Journal ArticleDOI
TL;DR: A novel ZSRSSC method based on locality-preservation deep cross-modal embedding networks (LPDCMENs) which can fully assimilate the pairwise intramodal and intermodal supervision in an end-to-end manner and aim to alleviate the problem of class structure inconsistency between two hybrid spaces.
Abstract: Due to its wide applications, remote sensing (RS) image scene classification has attracted increasing research interest. When each category has a sufficient number of labeled samples, RS image scene classification can be well addressed by deep learning. However, in the RS big data era, it is extremely difficult or even impossible to annotate RS scene samples for all categories at once, as RS scene classification often needs to be extended along with the emergence of new applications that inevitably involve new classes of RS images. Hence, the RS big data era requires a zero-shot RS scene classification (ZSRSSC) paradigm in which the classification model learned from training RS scene categories possesses the inference ability to recognize RS image scenes from unseen categories, mirroring humans' evolutionary perception ability. Unfortunately, zero-shot classification is largely unexploited in the RS field. This article proposes a novel ZSRSSC method based on locality-preservation deep cross-modal embedding networks (LPDCMENs). The proposed LPDCMENs, which can fully assimilate the pairwise intramodal and intermodal supervision in an end-to-end manner, aim to alleviate the problem of class structure inconsistency between two hybrid spaces (i.e., the visual image space and the semantic space). To pursue the stability and generalization ability that are highly desired for ZSRSSC, a set of explainable constraints is specially designed to optimize LPDCMENs. To fully verify the effectiveness of the proposed LPDCMENs, we collect a new large-scale RS scene dataset, including instance-level visual images and class-level semantic representations (RSSDIVCS), where general and domain knowledge is exploited to construct the class-level semantic representations. Extensive experiments show that the proposed ZSRSSC method based on LPDCMENs clearly outperforms the state-of-the-art methods, and the domain knowledge further improves the performance of ZSRSSC compared with the general knowledge. The collected RSSDIVCS will be made publicly available along with this article.

Proceedings ArticleDOI
01 Jan 2021
TL;DR: DeepInversion for Object Detection (DIODE) as discussed by the authors uses an extensive set of differentiable augmentations to improve image fidelity and distillation effectiveness without any prior domain knowledge, generator network, or pre-computed activations.
Abstract: We present DeepInversion for Object Detection (DIODE) to enable data-free knowledge distillation for neural networks trained on the object detection task. From a data-free perspective, DIODE synthesizes images given only an off-the-shelf pre-trained detection network and without any prior domain knowledge, generator network, or pre-computed activations. DIODE relies on two key components: first, an extensive set of differentiable augmentations to improve image fidelity and distillation effectiveness; second, a novel automated bounding box and category sampling scheme for image synthesis that enables generating a large number of images with a diverse set of spatial and category objects. The resulting images enable data-free knowledge distillation from a teacher to a student detector, initialized from scratch. In an extensive set of experiments, we demonstrate that DIODE’s ability to match the original training distribution consistently enables more effective knowledge distillation than out-of-distribution proxy datasets, which unavoidably occur in a data-free setup given the absence of the original domain knowledge.

Proceedings ArticleDOI
19 Apr 2021
TL;DR: In this paper, the authors propose a Mixed-Curvature Multi-Relational Graph Neural Network (M2GNN) to embed multi-relational KGs in a mixed-curvature space for knowledge graph completion.
Abstract: Knowledge graphs (KGs) have gradually become valuable assets for many AI applications. In a KG, a node denotes an entity, and an edge (or link) denotes a relationship between the entities represented by the nodes. Knowledge graph completion infers and predicts missing edges in a KG automatically. Knowledge graph embeddings have shed light on addressing this task. Recent research embeds KGs in hyperbolic (negatively curved) space instead of conventional Euclidean (zero-curvature) space and is effective in capturing hierarchical structures. However, as multi-relational graphs, KGs are not structured uniformly and display intrinsic heterogeneous structures. They usually contain rich types of structures, such as hierarchical and cyclic structures. Embedding KGs in a single-curvature space, such as Euclidean or hyperbolic space, overlooks the intrinsic heterogeneous structures of KGs and therefore cannot capture their structures accurately. To address this issue, we propose the Mixed-Curvature Multi-Relational Graph Neural Network (M2GNN), a generic approach that embeds multi-relational KGs in a mixed-curvature space for knowledge graph completion. Specifically, we define and construct a mixed-curvature space through a product manifold combining multiple single-curvature spaces (e.g., spherical, hyperbolic, or Euclidean) to model a variety of structures. However, constructing a mixed-curvature space typically requires manually defining fixed curvatures, which demands domain knowledge and additional data analysis, and an improperly defined curvature space cannot capture the structures of KGs accurately. To address this problem, we make the curvatures trainable parameters to better capture the underlying structures of the KGs. Furthermore, we propose a Graph Neural Updater that leverages the heterogeneous relational context in mixed-curvature space to improve the quality of the embedding.
Experiments on three KG datasets demonstrate that the proposed M2GNN outperforms its single-geometry counterparts as well as state-of-the-art embedding methods on the KG completion task.

Journal ArticleDOI
TL;DR: In this paper, the authors argue that the use of Cronbach's alpha to construct and evaluate an empirical measure implies a reflective model (the construct is reflected in manifest behaviors), and illustrate how this mismatch between theoretical conception and empirical operationalization can have substantial implications for the assessment and modeling of domain knowledge.

Journal ArticleDOI
01 May 2021
TL;DR: This work presents a novel multi-view deep learning Android malware detector with no specialist malware domain insight used to select, rank or hand-craft input features, encapsulating knowledge inside a deep learning neural net with no prior understanding of malicious characteristics.
Abstract: Zero-day malware samples pose a considerable danger to users, as there are implicitly no documented defences for previously unseen, newly encountered behaviour. Malware detection therefore relies on past knowledge to attempt to deal with zero-days. Often such insight is provided by a human expert hand-crafting and pre-categorising certain features as malicious. However, tightly coupled feature engineering based on previous domain knowledge risks being ineffective when faced with a new threat. In this work we decouple this human expertise, instead encapsulating knowledge inside a deep learning neural net with no prior understanding of malicious characteristics. Raw input features consist of low-level opcodes, app permissions, and proprietary Android API package usage. Our method makes three main contributions. Firstly, a novel multi-view deep learning Android malware detector with no specialist malware domain insight used to select, rank, or hand-craft input features. Secondly, a comprehensive zero-day scenario evaluation using the Drebin and AMD benchmarks, with our model achieving weighted average detection rates of 91% and 81%, respectively, an improvement of up to 57% over the state of the art. Thirdly, a 77% reduction in false positives on average compared with the state of the art, with excellent F1 scores of 0.9928 and 0.9963 for the general detection task, again on the Drebin and AMD benchmark datasets, respectively.
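A minimal sketch of the multi-view idea: one encoder per raw view (opcodes, permissions, API usage), fused into a single detector head with no hand-crafted malware features. All dimensions and the one-layer encoders are hypothetical; the paper's architecture is a deeper network:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy view dimensions (illustrative, not from the paper).
N_OPCODES, N_PERMS, N_APIS, HIDDEN = 32, 16, 24, 8

def relu(z):
    return np.maximum(z, 0.0)

# One linear+ReLU encoder per view; weights would be learned in practice.
W_op = rng.standard_normal((N_OPCODES, HIDDEN)) * 0.1
W_pm = rng.standard_normal((N_PERMS, HIDDEN)) * 0.1
W_api = rng.standard_normal((N_APIS, HIDDEN)) * 0.1
w_out = rng.standard_normal(3 * HIDDEN) * 0.1

def score(opcodes, perms, apis):
    """Encode each view separately, concatenate, and emit a
    malware probability via a sigmoid head."""
    fused = np.concatenate([relu(opcodes @ W_op),
                            relu(perms @ W_pm),
                            relu(apis @ W_api)])
    return 1.0 / (1.0 + np.exp(-fused @ w_out))
```

Keeping the views separate until fusion lets each encoder learn view-specific patterns from raw inputs, which is the point of dropping expert feature engineering.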

Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors proposed a principled clustering method named scDCC, which integrates domain knowledge into the clustering step to facilitate the biological interpretability of clusters, and subsequent cell type identification.
Abstract: Clustering is a critical step in single-cell-based studies. Most existing methods support unsupervised clustering without a priori exploitation of any domain knowledge. When confronted by the high dimensionality and pervasive dropout events of scRNA-Seq data, purely unsupervised clustering methods may not produce biologically interpretable clusters, which complicates cell type assignment. In such cases, the only recourse is for the user to manually and repeatedly tweak clustering parameters until acceptable clusters are found. Consequently, the path to obtaining biologically meaningful clusters can be ad hoc and laborious. Here we report a principled clustering method named scDCC that integrates domain knowledge into the clustering step. Experiments on various scRNA-Seq datasets ranging from thousands to tens of thousands of cells show that scDCC can significantly improve clustering performance, facilitating the interpretability of clusters and downstream analyses such as cell type assignment. Clustering cells based on similarities in gene expression is the first step towards identifying cell types in scRNA-Seq data. Here the authors incorporate biological knowledge into the clustering step to facilitate the biological interpretability of clusters and subsequent cell type identification.
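One common way to integrate such domain knowledge is through pairwise must-link/cannot-link constraints on soft cluster assignments; the sketch below shows a constraint loss in that spirit (a generic formulation, which may differ from the exact scDCC loss):

```python
import numpy as np

def pairwise_constraint_loss(q, must_link, cannot_link):
    """Constraint loss over soft assignments.

    q[i]        -- cluster-probability vector of cell i
    must_link   -- pairs (i, j) known to be the same cell type
    cannot_link -- pairs (i, j) known to be different cell types
    """
    loss = 0.0
    for i, j in must_link:
        # q[i] . q[j] is the probability the pair lands in the same cluster.
        loss -= np.log(np.dot(q[i], q[j]) + 1e-12)
    for i, j in cannot_link:
        loss -= np.log(1.0 - np.dot(q[i], q[j]) + 1e-12)
    return loss
```

Added to the clustering objective, this term penalizes assignments that contradict known marker-gene or cell-type annotations while leaving unconstrained cells free.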

Journal ArticleDOI
TL;DR: A novel knowledge coverage-based trust propagation operator is proposed to estimate the trust relationship between pairs of unknown experts, and a trust-order-induced recommendation mechanism, combining subjective and objective weights, is proposed to allow inconsistent experts to accept the advice they trust.

Journal ArticleDOI
TL;DR: This article proposes a new GP-based approach to automatically learning informative features for different image classification tasks and shows that the new approach achieves better classification performance than most benchmark methods.
Abstract: Feature extraction is essential for solving image classification by transforming low-level pixel values into high-level features. However, extracting effective features from images is challenging due to high variations across images in scale, rotation, illumination, and background. Existing methods often have a fixed model complexity and require domain expertise. Genetic programming (GP), with its flexible representation, can find good solutions without the use of domain knowledge. This article proposes a new GP-based approach to automatically learning informative features for different image classification tasks. In the new approach, a number of image-related operators, including filters, pooling operators, and feature extraction methods, are employed as functions. A flexible program structure is developed to integrate different functions and terminals into a single tree/solution. The new approach can evolve solutions of variable depths to extract various numbers and types of features from the images. The new approach is examined on 12 image classification tasks of varying difficulty and compared with a large number of effective algorithms. The results show that the new approach achieves better classification performance than most benchmark methods. The analysis of the evolved programs/solutions and the visualization of the learned features provide deep insights into the proposed approach.
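A GP program of this kind is a tree of image operators applied to the raw input; the toy interpreter below evaluates such a tree (the function set and the nested-tuple encoding are illustrative, not the paper's operator set):

```python
import numpy as np

# A tiny function set mimicking GP image operators (illustrative).
FUNCS = {
    # 2x2 average pooling on an image with even height and width.
    "mean_pool": lambda img: img.reshape(img.shape[0] // 2, 2,
                                         img.shape[1] // 2, 2).mean(axis=(1, 3)),
    # Crude horizontal-gradient filter.
    "grad_x": lambda img: np.abs(np.diff(img, axis=1)),
    # Terminal feature extractor: collapse an image to one number.
    "global_mean": lambda img: np.array([img.mean()]),
}

def evaluate(tree, img):
    """Evaluate a GP program encoded as nested tuples, e.g.
    ("global_mean", ("mean_pool", "input"))."""
    if tree == "input":
        return img
    op, child = tree
    return FUNCS[op](evaluate(child, img))
```

Evolution then searches over such trees of varying depth, so the number and kind of extracted features is decided by the search rather than by an expert.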

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a theory-guided hard constraint projection (HCP) model, which converts physical constraints such as governing equations into a form that is easy to handle through discretization, and then implements hard constraint optimization through projection in a patch.
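For linear(ized) constraints A z = b obtained by discretizing a governing equation, the hard-constraint projection step has a closed form; the sketch below shows this Euclidean projection as a simplified stand-in for the paper's patch-based HCP:

```python
import numpy as np

def project_onto_constraints(x, A, b):
    """Euclidean projection of a model prediction x onto the affine set
    {z : A z = b}, i.e. the discretized physical constraints.

    x_proj = x - A^T (A A^T)^{-1} (A x - b)
    """
    residual = A @ x - b
    correction = A.T @ np.linalg.solve(A @ A.T, residual)
    return x - correction
```

After this step the output satisfies the discretized constraints exactly (up to numerical precision), rather than only approximately as with a soft penalty term.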