Showing papers on "Representation (systemics)" published in 2020


Journal ArticleDOI
TL;DR: A comprehensive survey of the recent achievements in generic object detection brought about by deep learning techniques, covering detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics.
Abstract: Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in the field of generic object detection. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics. We finish the survey by identifying promising directions for future research.

1,897 citations


Book ChapterDOI
23 Aug 2020
TL;DR: This paper addresses the semantic segmentation problem with a focus on the context aggregation strategy, and presents a simple yet effective approach, object-contextual representations, characterizing a pixel by exploiting the representation of the corresponding object class.
Abstract: In this paper, we study the context aggregation problem in semantic segmentation. Motivated by the fact that the label of a pixel is the category of the object that the pixel belongs to, we present a simple yet effective approach, object-contextual representations, characterizing a pixel by exploiting the representation of the corresponding object class. First, we learn object regions under the supervision of the ground-truth segmentation. Second, we compute the object region representation by aggregating the representations of the pixels lying in the object region. Last, we compute the relation between each pixel and each object region, and augment the representation of each pixel with the object-contextual representation, which is a weighted aggregation of all the object region representations. We empirically demonstrate that our method achieves competitive performance on various benchmarks: Cityscapes, ADE20K, LIP, PASCAL-Context and COCO-Stuff. Our submission “HRNet + OCR + SegFix” achieves 1st place on the Cityscapes leaderboard by the ECCV 2020 submission deadline. Code is available at: https://git.io/openseg and https://git.io/HRNet.OCR.
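As a rough illustration of the three aggregation steps above, the following PyTorch sketch augments pixel features with object-contextual representations. Tensor names, shapes, and the similarity scaling are illustrative assumptions, not the authors' released code (see https://git.io/openseg for that).

import torch
import torch.nn.functional as F

def ocr_augment(pixel_feats, region_logits):
    # pixel_feats:   (B, C, H, W) per-pixel features from a backbone
    # region_logits: (B, K, H, W) coarse soft object-region scores (K classes/regions)
    B, C, H, W = pixel_feats.shape
    feats = pixel_feats.flatten(2)                                 # (B, C, HW)
    masks = F.softmax(region_logits.flatten(2), dim=-1)            # (B, K, HW) soft regions
    # 1) Object-region representation: pool pixel features inside each soft region.
    region_feats = torch.einsum('bkn,bcn->bkc', masks, feats)      # (B, K, C)
    # 2) Pixel-region relation: similarity between every pixel and every region.
    relation = torch.einsum('bcn,bkc->bnk', feats, region_feats)   # (B, HW, K)
    relation = F.softmax(relation / C ** 0.5, dim=-1)
    # 3) Object-contextual representation: weighted sum of region representations.
    context = torch.einsum('bnk,bkc->bcn', relation, region_feats).reshape(B, C, H, W)
    return torch.cat([pixel_feats, context], dim=1)                # (B, 2C, H, W) augmented pixels

out = ocr_augment(torch.randn(2, 64, 32, 32), torch.randn(2, 19, 32, 32))
print(out.shape)  # torch.Size([2, 128, 32, 32])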

952 citations


Proceedings ArticleDOI
14 Jun 2020
TL;DR: This work proposes a novel framework, called InterFaceGAN, for semantic face editing by interpreting the latent semantics learned by GANs, and finds that the latent code of well-trained generative models actually learns a disentangled representation after linear transformations.
Abstract: Despite the recent advance of Generative Adversarial Networks (GANs) in high-fidelity image synthesis, there lacks enough understanding of how GANs are able to map a latent code sampled from a random distribution to a photo-realistic image. Previous work assumes the latent space learned by GANs follows a distributed representation but observes the vector arithmetic phenomenon. In this work, we propose a novel framework, called InterFaceGAN, for semantic face editing by interpreting the latent semantics learned by GANs. In this framework, we conduct a detailed study on how different semantics are encoded in the latent space of GANs for face synthesis. We find that the latent code of well-trained generative models actually learns a disentangled representation after linear transformations. We explore the disentanglement between various semantics and manage to decouple some entangled semantics with subspace projection, leading to more precise control of facial attributes. Besides manipulating gender, age, expression, and the presence of eyeglasses, we can even vary the face pose as well as fix the artifacts accidentally generated by GAN models. The proposed method is further applied to achieve real image manipulation when combined with GAN inversion methods or some encoder-involved models. Extensive results suggest that learning to synthesize faces spontaneously brings a disentangled and controllable facial attribute representation.
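The editing operation the abstract describes reduces to moving a latent code along the normal of a hyperplane that separates an attribute in latent space. A minimal sketch, assuming a pretrained generator G and latent codes with binary attribute labels; the data below is a synthetic stand-in, not the authors' pipeline.

import numpy as np
from sklearn.svm import LinearSVC

def attribute_direction(latents, labels):
    # Fit a linear boundary in latent space; its unit normal is the semantic edit direction.
    svm = LinearSVC(C=1.0, max_iter=10000).fit(latents, labels)
    normal = svm.coef_.reshape(-1)
    return normal / np.linalg.norm(normal)

def edit(z, direction, alpha):
    # Move the latent code along the attribute direction; alpha sets strength and sign.
    return z + alpha * direction

latents = np.random.randn(1000, 512)                         # codes sampled from the GAN prior
labels = (latents @ np.random.randn(512) > 0).astype(int)    # stand-in for attribute labels
direction = attribute_direction(latents, labels)
z_edited = edit(np.random.randn(512), direction, alpha=3.0)  # feed z_edited to the generator G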

560 citations


Posted Content
TL;DR: This paper proposes the Local Implicit Image Function (LIIF), which takes an image coordinate and the 2D deep features around that coordinate as inputs and predicts the RGB value at the coordinate as output, building a bridge between discrete and continuous image representation in 2D.
Abstract: How to represent an image? While the visual world is presented in a continuous manner, machines store and see the images in a discrete way with 2D arrays of pixels. In this paper, we seek to learn a continuous representation for images. Inspired by the recent progress in 3D reconstruction with implicit neural representation, we propose Local Implicit Image Function (LIIF), which takes an image coordinate and the 2D deep features around the coordinate as inputs, predicts the RGB value at a given coordinate as an output. Since the coordinates are continuous, LIIF can be presented in arbitrary resolution. To generate the continuous representation for images, we train an encoder with LIIF representation via a self-supervised task with super-resolution. The learned continuous representation can be presented in arbitrary resolution even extrapolate to x30 higher resolution, where the training tasks are not provided. We further show that LIIF representation builds a bridge between discrete and continuous representation in 2D, it naturally supports the learning tasks with size-varied image ground-truths and significantly outperforms the method with resizing the ground-truths.
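A minimal sketch of the decoding step described above: an MLP conditioned on a local latent code and a continuous query coordinate returns an RGB value, so the image can be queried at any resolution. Shapes, the nearest-neighbour feature lookup, and layer sizes are assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LIIFDecoder(nn.Module):
    def __init__(self, feat_dim=64, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, feat_map, coords):
        # feat_map: (B, C, H, W) encoder features; coords: (B, N, 2) query points in [-1, 1].
        grid = coords.unsqueeze(1)                                       # (B, 1, N, 2)
        z = F.grid_sample(feat_map, grid, mode='nearest', align_corners=False)
        z = z.squeeze(2).permute(0, 2, 1)                                # (B, N, C) local latent codes
        return self.mlp(torch.cat([z, coords], dim=-1))                  # (B, N, 3) RGB values

decoder = LIIFDecoder()
rgb = decoder(torch.randn(1, 64, 48, 48), torch.rand(1, 1024, 2) * 2 - 1)
print(rgb.shape)  # torch.Size([1, 1024, 3])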

198 citations


Posted Content
TL;DR: This work combines a scene representation network with a low-dimensional morphable model which provides explicit control over pose and expressions and shows that this learned volumetric representation allows for photorealistic image generation that surpasses the quality of state-of-the-art video-based reenactment methods.
Abstract: We present dynamic neural radiance fields for modeling the appearance and dynamics of a human face. Digitally modeling and reconstructing a talking human is a key building-block for a variety of applications. Especially, for telepresence applications in AR or VR, a faithful reproduction of the appearance including novel viewpoints or head-poses is required. In contrast to state-of-the-art approaches that model the geometry and material properties explicitly, or are purely image-based, we introduce an implicit representation of the head based on scene representation networks. To handle the dynamics of the face, we combine our scene representation network with a low-dimensional morphable model which provides explicit control over pose and expressions. We use volumetric rendering to generate images from this hybrid representation and demonstrate that such a dynamic neural scene representation can be learned from monocular input data only, without the need of a specialized capture setup. In our experiments, we show that this learned volumetric representation allows for photo-realistic image generation that surpasses the quality of state-of-the-art video-based reenactment methods.

185 citations


Proceedings ArticleDOI
Yuan Yao, Chang Liu, Dezhao Luo, Yu Zhou, Qixiang Ye
14 Jun 2020
TL;DR: A novel self-supervised method, referred to as video Playback Rate Perception (PRP), learns spatio-temporal representation in a simple yet effective way and outperforms state-of-the-art self-supervised models by significant margins.
Abstract: In self-supervised spatio-temporal representation learning, the temporal resolution and long-short term characteristics are not yet fully explored, which limits representation capabilities of learned models. In this paper, we propose a novel self-supervised method, referred to as video Playback Rate Perception (PRP), to learn spatio-temporal representation in a simple-yet-effective way. PRP roots in a dilated sampling strategy, which produces self-supervision signals about video playback rates for representation model learning. PRP is implemented with a feature encoder, a classification module, and a reconstructing decoder, to achieve spatio-temporal semantic retention in a collaborative discrimination-generation manner. The discriminative perception model follows a feature encoder to prefer perceiving low temporal resolution and long-term representation by classifying fast-forward rates. The generative perception model acts as a feature decoder to focus on comprehending high temporal resolution and short-term representation by introducing a motion-attention mechanism. PRP is applied on typical video target tasks including action recognition and video retrieval. Experiments show that PRP outperforms state-of-the-art self-supervised models with significant margins. Code is available at github.com/yuanyao366/PRP.
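The self-supervision signal comes from how densely a clip is sampled in time. Below is a minimal sketch of the dilated sampling; clip length, rates, and data layout are illustrative assumptions rather than the released pipeline at github.com/yuanyao366/PRP.

import numpy as np

def playback_rate_sample(video, clip_len=16, rates=(1, 2, 4, 8)):
    # video: (T, H, W, 3) array of frames. Returns a dilated clip and its rate label.
    rate_label = np.random.randint(len(rates))
    stride = rates[rate_label]
    max_start = max(video.shape[0] - clip_len * stride, 1)
    start = np.random.randint(max_start)
    clip = video[start:start + clip_len * stride:stride]    # dilated (fast-forward) sampling
    return clip, rate_label                                 # label requires no human annotation

video = np.zeros((300, 112, 112, 3), dtype=np.uint8)        # toy stand-in for a decoded video
clip, label = playback_rate_sample(video)
print(clip.shape, label)                                    # (16, 112, 112, 3) and the rate class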

165 citations


Proceedings ArticleDOI
14 Jun 2020
TL;DR: A deep network based on PointNet++ is developed that predicts ANCSH from a single depth point cloud, including part segmentation, normalized coordinates, and joint parameters in the canonical object space, and the benefits of leveraging the canonicalized joints are demonstrated.
Abstract: This paper addresses the task of category-level pose estimation for articulated objects from a single depth image. We present a novel category-level approach that correctly accommodates object instances previously unseen during training. We introduce Articulation-aware Normalized Coordinate Space Hierarchy (ANCSH) – a canonical representation for different articulated objects in a given category. As the key to achieve intra-category generalization, the representation constructs a canonical object space as well as a set of canonical part spaces. The canonical object space normalizes the object orientation, scales and articulations (e.g. joint parameters and states) while each canonical part space further normalizes its part pose and scale. We develop a deep network based on PointNet++ that predicts ANCSH from a single depth point cloud, including part segmentation, normalized coordinates, and joint parameters in the canonical object space. By leveraging the canonicalized joints, we demonstrate: 1) improved performance in part pose and scale estimations using the induced kinematic constraints from joints; 2) high accuracy for joint parameter estimation in camera space.

142 citations


ReportDOI
30 Sep 2020
TL;DR: The Concise Binary Object Representation is a data format whose design goals include the possibility of extremely small code size, fairly small message size, and extensibility without the need for version negotiation.
Abstract: The Concise Binary Object Representation (CBOR) is a data format whose design goals include the possibility of extremely small code size, fairly small message size, and extensibility without the need for version negotiation. These design goals make it different from earlier binary serializations such as ASN.1 and MessagePack.
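A quick illustration of the format using the third-party cbor2 package (one of several CBOR implementations, installable with pip install cbor2); the record itself is arbitrary example data.

import json
import cbor2

record = {"id": 42, "name": "sensor-7", "ok": True, "readings": [1.5, 2.25, 3.0]}

cbor_bytes = cbor2.dumps(record)                  # compact, schema-free binary encoding
json_bytes = json.dumps(record).encode("utf-8")

print(len(cbor_bytes), "bytes as CBOR vs", len(json_bytes), "bytes as JSON")
assert cbor2.loads(cbor_bytes) == record          # round-trips without version negotiation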

138 citations


Journal ArticleDOI
TL;DR: It is demonstrated that changes in the geometry of the associated object manifolds underlie this improved capacity, and shed light on the functional roles different levels in the hierarchy play to achieve it, through orchestrated reduction of manifolds’ radius, dimensionality and inter-manifold correlations.
Abstract: Stimuli are represented in the brain by the collective population responses of sensory neurons, and an object presented under varying conditions gives rise to a collection of neural population responses called an 'object manifold'. Changes in the object representation along a hierarchical sensory system are associated with changes in the geometry of those manifolds, and recent theoretical progress connects this geometry with 'classification capacity', a quantitative measure of the ability to support object classification. Deep neural networks trained on object classification tasks are a natural testbed for the applicability of this relation. We show how classification capacity improves along the hierarchies of deep neural networks with different architectures. We demonstrate that changes in the geometry of the associated object manifolds underlie this improved capacity, and shed light on the functional roles different levels in the hierarchy play to achieve it, through orchestrated reduction of manifolds' radius, dimensionality and inter-manifold correlations.
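As a rough, simplified proxy for two of the quantities discussed above, the snippet below estimates each object manifold's radius and participation-ratio dimensionality from class-conditioned features; the paper's mean-field capacity measures are more involved.

import numpy as np

def manifold_geometry(features, labels):
    # features: (N, D) layer activations; labels: (N,) object classes.
    stats = {}
    for c in np.unique(labels):
        X = features[labels == c]
        X = X - X.mean(axis=0)                                   # center the object manifold
        eigs = np.clip(np.linalg.eigvalsh(np.cov(X, rowvar=False)), 0, None)
        radius = np.sqrt(eigs.sum())                             # overall spread of the manifold
        dim = eigs.sum() ** 2 / (eigs ** 2).sum()                # participation-ratio dimensionality
        stats[int(c)] = (radius, dim)
    return stats

feats = np.random.randn(500, 128)                                # toy stand-in for layer activations
labels = np.random.randint(0, 10, size=500)
print(manifold_geometry(feats, labels)[0])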

120 citations


Proceedings Article
30 Apr 2020
TL;DR: This work shows that state-of-the-art word representation learning methods maximize an objective function that is a lower bound on the mutual information between different parts of a word sequence (i.e., a sentence).
Abstract: We show state-of-the-art word representation learning methods maximize an objective function that is a lower bound on the mutual information between different parts of a word sequence (i.e., a sentence). Our formulation provides an alternative perspective that unifies classical word embedding models (e.g., Skip-gram) and modern contextual embeddings (e.g., BERT, XLNet). In addition to enhancing our theoretical understanding of these methods, our derivation leads to a principled framework that can be used to construct new self-supervised tasks. We provide an example by drawing inspirations from related methods based on mutual information maximization that have been successful in computer vision, and introduce a simple self-supervised objective that maximizes the mutual information between a global sentence representation and n-grams in the sentence. Our analysis offers a holistic view of representation learning methods to transfer knowledge and translate progress across multiple domains (e.g., natural language processing, computer vision, audio processing).
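A minimal sketch of the kind of objective the abstract proposes: an InfoNCE-style lower bound on the mutual information between a global sentence representation and an n-gram representation from the same sentence, with other sentences in the batch acting as negatives. The encoders producing the two representations are omitted and assumed.

import torch
import torch.nn.functional as F

def sentence_ngram_info_nce(sentence_repr, ngram_repr, temperature=0.1):
    # sentence_repr: (B, D) global sentence vectors; ngram_repr: (B, D) one positive n-gram each.
    s = F.normalize(sentence_repr, dim=-1)
    g = F.normalize(ngram_repr, dim=-1)
    logits = s @ g.t() / temperature                     # (B, B) similarities; diagonal = positives
    targets = torch.arange(s.size(0), device=s.device)
    return F.cross_entropy(logits, targets)              # minimizing this maximizes an MI lower bound

loss = sentence_ngram_info_nce(torch.randn(32, 256), torch.randn(32, 256))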

119 citations


Journal ArticleDOI
TL;DR: This novel 3D orientation estimation is based on a variant of the Denoising Autoencoder that is trained on simulated views of a 3D model using Domain Randomization and achieves state-of-the-art performance on the T-LESS dataset both in the RGB and RGB-D domain.
Abstract: We propose a real-time RGB-based pipeline for object detection and 6D pose estimation. Our novel 3D orientation estimation is based on a variant of the Denoising Autoencoder that is trained on simulated views of a 3D model using Domain Randomization. This so-called Augmented Autoencoder has several advantages over existing methods: It does not require real, pose-annotated training data, generalizes to various test sensors and inherently handles object and view symmetries. Instead of learning an explicit mapping from input images to object poses, it provides an implicit representation of object orientations defined by samples in a latent space. Our pipeline achieves state-of-the-art performance on the T-LESS dataset both in the RGB and RGB-D domain. We also evaluate on the LineMOD dataset where we can compete with other synthetically trained approaches. We further increase performance by correcting 3D orientation estimates to account for perspective errors when the object deviates from the image center and show extended results. Our code is available here https://github.com/DLR-RM/AugmentedAutoencoder.
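A minimal sketch of the implicit orientation representation: latent codes of many rendered views form a codebook, and the test-time orientation is read off the most similar code. The tiny encoder and identity rotations below are placeholders for the trained Augmented Autoencoder and its sampled poses.

import torch
import torch.nn.functional as F

@torch.no_grad()
def build_codebook(encoder, rendered_views, rotations):
    # rendered_views: (N, 3, H, W) synthetic crops; rotations: list of N rotation matrices.
    codes = F.normalize(encoder(rendered_views), dim=-1)
    return codes, rotations

@torch.no_grad()
def estimate_orientation(encoder, crop, codes, rotations):
    z = F.normalize(encoder(crop.unsqueeze(0)), dim=-1)
    sims = (z @ codes.t()).squeeze(0)                     # cosine similarity to every codebook entry
    return rotations[int(sims.argmax())]                  # pose of the most similar rendered view

encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 128))  # placeholder
codes, rots = build_codebook(encoder, torch.randn(100, 3, 64, 64), [torch.eye(3)] * 100)
R = estimate_orientation(encoder, torch.randn(3, 64, 64), codes, rots)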

Journal ArticleDOI
TL;DR: This paper presents a feature representation and fusion model to combine the feature representation of the object in RGB and infrared modalities for object tracking and demonstrates the effectiveness of the proposed method.

Proceedings ArticleDOI
01 Jul 2020
TL;DR: An extraction system that uses knowledge of the types of the target fields to generate extraction candidates and a neural network architecture that learns a dense representation of each candidate based on neighboring words in the document is proposed.
Abstract: We propose a novel approach using representation learning for tackling the problem of extracting structured information from form-like document images. We propose an extraction system that uses knowledge of the types of the target fields to generate extraction candidates and a neural network architecture that learns a dense representation of each candidate based on neighboring words in the document. These learned representations are not only useful in solving the extraction task for unseen document templates from two different domains but are also interpretable, as we show using loss cases.

Posted Content
TL;DR: This paper demonstrates that self-supervised learned representations can extract task-relevant information and discard task-irrelevant information, and proposes a composite objective that bridges the gap between prior contrastive and predictive learning objectives, introducing an additional objective term to discard task-irrelevant information.
Abstract: As a subset of unsupervised representation learning, self-supervised representation learning adopts self-defined signals as supervision and uses the learned representation for downstream tasks, such as object detection and image captioning. Many proposed approaches for self-supervised learning follow naturally a multi-view perspective, where the input (e.g., original images) and the self-supervised signals (e.g., augmented images) can be seen as two redundant views of the data. Building from this multi-view perspective, this paper provides an information-theoretical framework to better understand the properties that encourage successful self-supervised learning. Specifically, we demonstrate that self-supervised learned representations can extract task-relevant information and discard task-irrelevant information. Our theoretical framework paves the way to a larger space of self-supervised learning objective design. In particular, we propose a composite objective that bridges the gap between prior contrastive and predictive learning objectives, and introduce an additional objective term to discard task-irrelevant information. To verify our analysis, we conduct controlled experiments to evaluate the impact of the composite objectives. We also explore our framework's empirical generalization beyond the multi-view perspective, where the cross-view redundancy may not be clearly observed.
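A minimal sketch of a composite objective in the spirit of the abstract: a contrastive term between two views, a predictive (reconstruction) term, and a crude penalty standing in for the term that discards task-irrelevant information. The weights and estimators are assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def composite_loss(z1, z2, x2_pred, x2, lambda_pred=1.0, lambda_discard=0.1, tau=0.1):
    # z1, z2: (B, D) codes of two views; x2_pred, x2: prediction and target for the other view.
    logits = F.normalize(z1, dim=-1) @ F.normalize(z2, dim=-1).t() / tau
    targets = torch.arange(z1.size(0), device=z1.device)
    contrastive = F.cross_entropy(logits, targets)        # keep information shared across views
    predictive = F.mse_loss(x2_pred, x2)                  # predict the other (self-supervised) view
    discard = z1.pow(2).mean()                            # stand-in capacity penalty on the code
    return contrastive + lambda_pred * predictive + lambda_discard * discard

loss = composite_loss(torch.randn(16, 128), torch.randn(16, 128),
                      torch.randn(16, 784), torch.randn(16, 784))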

Journal ArticleDOI
03 Apr 2020
TL;DR: This paper proposes a model augmented with hierarchical contextualized representation, which takes the different contributions of words in a single sentence into consideration to enhance the sentence representation learned from an independent BiLSTM via a label embedding attention mechanism.
Abstract: Named entity recognition (NER) models are typically based on the architecture of Bi-directional LSTM (BiLSTM). The constraints of sequential nature and the modeling of single input prevent the full utilization of global information from larger scope, not only in the entire sentence, but also in the entire document (dataset). In this paper, we address these two deficiencies and propose a model augmented with hierarchical contextualized representation: sentence-level representation and document-level representation. In sentence-level, we take different contributions of words in a single sentence into consideration to enhance the sentence representation learned from an independent BiLSTM via label embedding attention mechanism. In document-level, the key-value memory network is adopted to record the document-aware information for each unique word which is sensitive to similarity of context information. Our two-level hierarchical contextualized representations are fused with each input token embedding and corresponding hidden state of BiLSTM, respectively. The experimental results on three benchmark NER datasets (CoNLL-2003 and Ontonotes 5.0 English datasets, CoNLL-2002 Spanish dataset) show that we establish new state-of-the-art results.
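A minimal sketch of the sentence-level piece described above: BiLSTM word states attend to learned label embeddings, and the resulting word weights pool the states into a sentence representation. The dimensions and pooling rule are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelAttentionSentenceRepr(nn.Module):
    def __init__(self, hidden_dim=256, num_labels=9):
        super().__init__()
        self.label_emb = nn.Embedding(num_labels, hidden_dim)    # one embedding per NER label

    def forward(self, word_states):
        # word_states: (B, T, H) outputs of an independent BiLSTM over the sentence.
        scores = word_states @ self.label_emb.weight.t()         # (B, T, L) word-label affinities
        word_weight = F.softmax(scores.max(dim=-1).values, dim=-1)       # (B, T) word contributions
        return (word_weight.unsqueeze(-1) * word_states).sum(dim=1)      # (B, H) sentence vector

sent_repr = LabelAttentionSentenceRepr()(torch.randn(4, 20, 256))
print(sent_repr.shape)  # torch.Size([4, 256]); fused with token embeddings downstream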

Posted Content
TL;DR: The generative model is able to synthesize high-quality human grasps given only a 3D object point cloud, and achieves performance comparable to state-of-the-art methods for 3D hand reconstruction.
Abstract: Robotic grasping of household objects has made remarkable progress in recent years. Yet, human grasps are still difficult to synthesize realistically. There are several key reasons: (1) the human hand has many degrees of freedom (more than robotic manipulators); (2) the synthesized hand should conform to the surface of the object; and (3) it should interact with the object in a semantically and physically plausible manner. To make progress in this direction, we draw inspiration from the recent progress on learning-based implicit representations for 3D object reconstruction. Specifically, we propose an expressive representation for human grasp modelling that is efficient and easy to integrate with deep neural networks. Our insight is that every point in a three-dimensional space can be characterized by the signed distances to the surface of the hand and the object, respectively. Consequently, the hand, the object, and the contact area can be represented by implicit surfaces in a common space, in which the proximity between the hand and the object can be modelled explicitly. We name this 3D to 2D mapping the Grasping Field, parameterize it with a deep neural network, and learn it from data. We demonstrate that the proposed grasping field is an effective and expressive representation for human grasp generation. Specifically, our generative model is able to synthesize high-quality human grasps, given only a 3D object point cloud. The extensive experiments demonstrate that our generative model compares favorably with a strong baseline and approaches the level of natural human grasps. Our method improves the physical plausibility of the hand-object contact reconstruction and achieves comparable performance for 3D hand reconstruction compared to state-of-the-art methods.
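A minimal sketch of the core mapping: a network conditioned on an object encoding takes a 3D query point and returns two signed distances, one to the hand surface and one to the object surface. The architecture and conditioning are illustrative assumptions, not the authors' model.

import torch
import torch.nn as nn

class GraspingFieldMLP(nn.Module):
    def __init__(self, object_code_dim=256, hidden=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + object_code_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),                        # (signed distance to hand, to object)
        )

    def forward(self, points, object_code):
        # points: (B, N, 3) query points; object_code: (B, D) from a point-cloud encoder.
        code = object_code.unsqueeze(1).expand(-1, points.size(1), -1)
        return self.mlp(torch.cat([points, code], dim=-1))       # (B, N, 2) signed distances

field = GraspingFieldMLP()
sdf = field(torch.randn(2, 1024, 3), torch.randn(2, 256))
print(sdf.shape)  # torch.Size([2, 1024, 2])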

Journal ArticleDOI
TL;DR: A deep neural network tensor factorization method, Avocado, is proposed that compresses epigenomic data into a dense, information-rich representation, and it is shown that machine learning models that exploit this representation outperform those trained directly on epigenomic data on a variety of genomics tasks.
Abstract: The human epigenome has been experimentally characterized by thousands of measurements for every basepair in the human genome. We propose a deep neural network tensor factorization method, Avocado, that compresses this epigenomic data into a dense, information-rich representation. We use this learned representation to impute epigenomic data more accurately than previous methods, and we show that machine learning models that exploit this representation outperform those trained directly on epigenomic data on a variety of genomics tasks. These tasks include predicting gene expression, promoter-enhancer interactions, replication timing, and an element of 3D chromatin architecture.
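A minimal sketch of the factorization idea: the signal for a (cell type, assay, genomic position) triple is predicted from learned embeddings of each axis, and those embeddings form the dense representation used downstream. Embedding sizes and the network are illustrative; Avocado's released model differs in detail.

import torch
import torch.nn as nn

class EpigenomeFactorization(nn.Module):
    def __init__(self, n_celltypes, n_assays, n_positions, dim=32):
        super().__init__()
        self.cell = nn.Embedding(n_celltypes, dim)
        self.assay = nn.Embedding(n_assays, dim)
        self.pos = nn.Embedding(n_positions, dim)
        self.head = nn.Sequential(nn.Linear(3 * dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, cell_idx, assay_idx, pos_idx):
        h = torch.cat([self.cell(cell_idx), self.assay(assay_idx), self.pos(pos_idx)], dim=-1)
        return self.head(h).squeeze(-1)                  # predicted epigenomic signal value

model = EpigenomeFactorization(n_celltypes=400, n_assays=84, n_positions=10000)
signal = model(torch.tensor([0, 1]), torch.tensor([3, 5]), torch.tensor([100, 200]))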


Journal ArticleDOI
TL;DR: As this paper shows, the right combination of quantum and classical computations allows for accurate quantum chemistry simulations using surprisingly few qubits.
Abstract: The right combination of quantum and classical computations allows for accurate quantum chemistry simulations using surprisingly few qubits.

Journal ArticleDOI
TL;DR: This article considers parameter estimation for a special bilinear system with colored noise and proposes a multi-innovation generalized extended stochastic gradient (MI-GESG) algorithm that can generate more accurate parameter estimates.

Posted Content
TL;DR: A detailed 2D-3D joint representation learning method for Human-Object Interaction (HOI) detection is proposed, along with a new benchmark named Ambiguous-HOI consisting of hard ambiguous images to better evaluate models' capacity to handle 2D ambiguity.
Abstract: Human-Object Interaction (HOI) detection lies at the core of action understanding. Besides 2D information such as human/object appearance and locations, 3D pose is also usually utilized in HOI learning since its view-independence. However, rough 3D body joints just carry sparse body information and are not sufficient to understand complex interactions. Thus, we need detailed 3D body shape to go further. Meanwhile, the interacted object in 3D is also not fully studied in HOI learning. In light of these, we propose a detailed 2D-3D joint representation learning method. First, we utilize the single-view human body capture method to obtain detailed 3D body, face and hand shapes. Next, we estimate the 3D object location and size with reference to the 2D human-object spatial configuration and object category priors. Finally, a joint learning framework and cross-modal consistency tasks are proposed to learn the joint HOI representation. To better evaluate the 2D ambiguity processing capacity of models, we propose a new benchmark named Ambiguous-HOI consisting of hard ambiguous images. Extensive experiments in large-scale HOI benchmark and Ambiguous-HOI show impressive effectiveness of our method. Code and data are available at this https URL.

Posted Content
TL;DR: This work is the first to introduce a technique for measuring the contribution of each error type in a way that isolates its effect on overall performance, and shows that such a representation is critical for drawing accurate, comprehensive conclusions through in-depth analysis across 4 datasets and 7 recognition models.
Abstract: We introduce TIDE, a framework and associated toolbox for analyzing the sources of error in object detection and instance segmentation algorithms. Importantly, our framework is applicable across datasets and can be applied directly to output prediction files without required knowledge of the underlying prediction system. Thus, our framework can be used as a drop-in replacement for the standard mAP computation while providing a comprehensive analysis of each model's strengths and weaknesses. We segment errors into six types and, crucially, are the first to introduce a technique for measuring the contribution of each error in a way that isolates its effect on overall performance. We show that such a representation is critical for drawing accurate, comprehensive conclusions through in-depth analysis across 4 datasets and 7 recognition models. Available at this https URL

Journal ArticleDOI
03 Apr 2020
TL;DR: A novel method of language attention-based VQA that learns decomposed linguistic representations of questions and utilizes the representations to infer answers for overcoming language priors is presented.
Abstract: Most existing Visual Question Answering (VQA) models overly rely on language priors between questions and answers. In this paper, we present a novel method of language attention-based VQA that learns decomposed linguistic representations of questions and utilizes the representations to infer answers for overcoming language priors. We introduce a modular language attention mechanism to parse a question into three phrase representations: type representation, object representation, and concept representation. We use the type representation to identify the question type and the possible answer set (yes/no or specific concepts such as colors or numbers), and the object representation to focus on the relevant region of an image. The concept representation is verified with the attended region to infer the final answer. The proposed method decouples the language-based concept discovery and vision-based concept verification in the process of answer inference to prevent language priors from dominating the answering process. Experiments on the VQA-CP dataset demonstrate the effectiveness of our method.

Journal ArticleDOI
TL;DR: The results warrant further investigation into the effects of preconception cannabis use in males and the potential effects on subsequent generations.
Abstract: Parental cannabis use has been associated with adverse neurodevelopmental outcomes in offspring, but how such phenotypes are transmitted is largely unknown. Using reduced representation bis...

Journal ArticleDOI
Min Oh1, Liqing Zhang1
TL;DR: This work proposes DeepMicro, a deep representation learning framework allowing for an effective representation of microbiome profiles that outperforms the current best approaches based on the strain-level marker profile in five different datasets in disease prediction.
Abstract: Human microbiota plays a key role in human health and growing evidence supports the potential use of microbiome as a predictor of various diseases. However, the high-dimensionality of microbiome data, often in the order of hundreds of thousands, yet low sample sizes, poses great challenge for machine learning-based prediction algorithms. This imbalance induces the data to be highly sparse, preventing from learning a better prediction model. Also, there has been little work on deep learning applications to microbiome data with a rigorous evaluation scheme. To address these challenges, we propose DeepMicro, a deep representation learning framework allowing for an effective representation of microbiome profiles. DeepMicro successfully transforms high-dimensional microbiome data into a robust low-dimensional representation using various autoencoders and applies machine learning classification algorithms on the learned representation. In disease prediction, DeepMicro outperforms the current best approaches based on the strain-level marker profile in five different datasets. In addition, by significantly reducing the dimensionality of the marker profile, DeepMicro accelerates the model training and hyperparameter optimization procedure with 8X-30X speedup over the basic approach. DeepMicro is freely available at https://github.com/minoh0201/DeepMicro.
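A minimal sketch of the recipe described above: compress high-dimensional microbiome profiles with an autoencoder, then train an ordinary classifier on the low-dimensional codes. The layer sizes, toy data, and downstream classifier are illustrative choices, not DeepMicro's exact configuration.

import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

X = torch.rand(200, 5000)                         # toy stand-in: samples x marker features
y = np.random.randint(0, 2, size=200)             # toy disease labels

encoder = nn.Sequential(nn.Linear(5000, 512), nn.ReLU(), nn.Linear(512, 64))
decoder = nn.Sequential(nn.Linear(64, 512), nn.ReLU(), nn.Linear(512, 5000))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

for _ in range(50):                                # train the autoencoder to reconstruct profiles
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(X)), X)
    loss.backward()
    opt.step()

codes = encoder(X).detach().numpy()                # learned low-dimensional representation
clf = LogisticRegression(max_iter=1000).fit(codes, y)    # disease prediction on the codes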

Book ChapterDOI
23 Aug 2020
TL;DR: TIDE is a framework and associated toolbox (https://dbolya.github.io/tide/) for analyzing the sources of error in object detection and instance segmentation algorithms.
Abstract: We introduce TIDE, a framework and associated toolbox (https://dbolya.github.io/tide/) for analyzing the sources of error in object detection and instance segmentation algorithms. Importantly, our framework is applicable across datasets and can be applied directly to output prediction files without required knowledge of the underlying prediction system. Thus, our framework can be used as a drop-in replacement for the standard mAP computation while providing a comprehensive analysis of each model’s strengths and weaknesses. We segment errors into six types and, crucially, are the first to introduce a technique for measuring the contribution of each error in a way that isolates its effect on overall performance. We show that such a representation is critical for drawing accurate, comprehensive conclusions through in-depth analysis across 4 datasets and 7 recognition models.
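A hypothetical usage sketch of the released toolbox (pip install tidecv); the call names follow the project's README at the time of writing and the results filename is a placeholder, so check the repository for the current API.

from tidecv import TIDE, datasets

tide = TIDE()
tide.evaluate(datasets.COCO(),                               # ground-truth annotations
              datasets.COCOResult('my_bbox_results.json'),   # your prediction file (placeholder name)
              mode=TIDE.BOX)                                 # use TIDE.MASK for instance segmentation
tide.summarize()                                             # per-error-type contribution to mAP
tide.plot()                                                  # summary figure of the six error types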

Journal ArticleDOI
TL;DR: PERL is a representation learning model that extends contextualized word embedding models such as BERT with pivot-based fine-tuning; it outperforms strong baselines across 22 sentiment classification domain adaptation setups, improves in-domain model performance, yields effective reduced-size models, and increases model stability.
Abstract: Pivot-based neural representation models have led to significant progress in domain adaptation for NLP. However, previous research following this approach utilizes only labeled data from the source ...

Journal ArticleDOI
TL;DR: This paper examines the key features of CPSs and their relation with other system types, defines the dependencies between levels of automation and human roles in CPSs from a systems engineering perspective, and applies systems thinking to describe a multi-layered diagrammatic representation of CPSs for combined safety and security risk analysis.

Posted Content
TL;DR: This survey describes the variety of text representation methods and model designs that have blossomed in the context of NLP, including SOTA LMs, which can transform large volumes of text into effective vector representations capturing the same semantic information.
Abstract: Word representation has always been an important research area in the history of natural language processing (NLP). Understanding such complex text data is imperative, given that it is rich in information and can be used widely across various applications. In this survey, we explore different word representation models and their power of expression, from the classical to modern-day state-of-the-art word representation language models (LMs). We describe the variety of text representation methods and model designs that have blossomed in the context of NLP, including SOTA LMs. These models can transform large volumes of text into effective vector representations capturing the same semantic information. Further, such representations can be utilized by various machine learning (ML) algorithms for a variety of NLP-related tasks. In the end, this survey briefly discusses the commonly used ML- and DL-based classifiers, evaluation metrics and the applications of these word embeddings in different NLP tasks.

Journal ArticleDOI
TL;DR: This work evaluates a feed-forward neural network (FNN) model's prediction performance over five feature selection methods and nine ground-state properties from a public data set composed of ∼130k organic molecules and investigates the impact of choosing coordinate-free descriptors based on the Simplified Molecular Input Line Entry System (SMILES) representation.
Abstract: Machine learning (ML) models can potentially accelerate the discovery of tailored materials by learning a function that maps chemical compounds into their respective target properties. In this real...