Showing papers in "Journal of Chemical Information and Computer Sciences" in 1998

Journal Article•DOI•

Peter Willett¹, John M. Barnard², Geoffrey M. Downs²•Institutions (2)

University of Sheffield¹, Barnard College²

21 Jul 1998-Journal of Chemical Information and Computer Sciences

TL;DR: The concept of similarity searching is introduced, differentiating it from the more common substructure searching, and the current generation of fragment-based measures that are used for searching chemical structure databases are discussed.

Abstract: This paper reviews the use of similarity searching in chemical databases. It begins by introducing the concept of similarity searching, differentiating it from the more common substructure searching, and then discusses the current generation of fragment-based measures that are used for searching chemical structure databases. The next sections focus upon two of the principal characteristics of a similarity measure: the coefficient that is used to quantify the degree of structural resemblance between pairs of molecules and the structural representations that are used to characterize molecules that are being compared in a similarity calculation. New types of similarity measure are then compared with current approaches, and examples are given of several applications that are related to similarity searching.
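
The abstract above describes a similarity measure as a coefficient applied to a fragment-based representation. A minimal sketch of that idea follows, assuming the Tanimoto coefficient (not named in the abstract) and a fingerprint stored as the set of "on" bit positions. The function names and the toy database are hypothetical and purely illustrative.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two sets of 'on' bit positions."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def rank_by_similarity(query, database):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy fingerprints (hypothetical data): each integer is a fragment bit.
query = {1, 4, 7, 9}
database = {
    "mol_a": {1, 4, 7, 9},
    "mol_b": {1, 4, 8},
    "mol_c": {2, 3},
}
ranking = rank_by_similarity(query, database)
```

Unlike a substructure search, which returns a yes/no match, this kind of search returns a ranked list, so the choice of coefficient and representation determines the ordering.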

1,662 citations

Journal Article•DOI•

RECAP-Retrosynthetic Combinatorial Analysis Procedure: A Powerful New Technique for Identifying Privileged Molecular Fragments with Useful Applications in Combinatorial Chemistry

Xiao Qing Lewell¹, Duncan Bruce Judd¹, Stephen P. Watson¹, Michael M. Hann¹•Institutions (1)

University of Hertfordshire¹

11 Apr 1998-Journal of Chemical Information and Computer Sciences

TL;DR: "RECAP" (Retrosynthetic Combinatorial Analysis Procedure), a new computational technique designed to address the design and availability of high quality building blocks which are likely to afford hits from the libraries that they generate is described.

Abstract: The use of combinatorial chemistry for the generation of new lead molecules is now a well established strategy in the drug discovery process. Central to the use of combinatorial chemistry is the design and availability of high quality building blocks which are likely to afford hits from the libraries that they generate. Herein we describe “RECAP” (Retrosynthetic Combinatorial Analysis Procedure), a new computational technique designed to address this building block issue. RECAP electronically fragments molecules based on chemical knowledge. When applied to databases of biologically active molecules this allows the identification of building block fragments rich in biologically recognized elements and privileged motifs and structures. This allows the design of building blocks and the synthesis of libraries rich in biological motifs. Application of RECAP to the Derwent World Drug Index (WDI) and the molecular fragments/building blocks that this generates are discussed. We also describe a WDI fragment knowle...

568 citations

Journal Article•DOI•

On the properties of bit string-based measures of chemical similarity

Darren R. Flower¹•Institutions (1)

Loughborough University¹

04 Apr 1998-Journal of Chemical Information and Computer Sciences

TL;DR: Empirical results suggest that bit strings provide a nonintuitive encoding of molecular size, shape, and global similarity and suggest that there are instances when they may not be the most appropriate tool for searching or segregating chemical structures.

Abstract: With the growth of interest in database searching and compound selection, the quantification of chemical similarity has become an area of intense practical and theoretical interest. One of the most widely used methods of measuring chemical similarity is based on mapping fragments within a molecule as bits within a binary string. We present empirical results which suggest that bit strings provide a nonintuitive encoding of molecular size, shape, and global similarity. Other results, this time statistical in nature, suggest that the observed behavior of bit string-based searches has a large nonspecific component. On this basis, we question whether bit string-based similarity methods possess all the features desirable in a quantitative chemical distance measure or metric and suggest that there are instances when they may not be the most appropriate tool for searching or segregating chemical structures.

331 citations

Journal Article•DOI•

Prediction of human intestinal absorption of drug compounds from molecular structure.

Matthew D. Wessel¹, Peter C. Jurs¹, John W. Tolan, Steven M. Muskal•Institutions (1)

Pennsylvania State University¹

19 Jun 1998-Journal of Chemical Information and Computer Sciences

TL;DR: A nonlinear computational neural network model developed by using the genetic algorithm with a neural network fitness evaluator to estimate percent human intestinal absorption (%HIA) is an attractive alternative to experimental measurements.

Abstract: Prediction of human intestinal absorption (HIA) is a major goal in the development of oral drugs. The application of combinatorial chemistry methods to drug discovery has dramatically increased the demand for rapid and efficient models for estimating HIA and other biopharmaceutical properties. While experimental methods for measurement of intestinal absorption have been developed and are used widely, computational approaches provide an attractive alternative.

317 citations

Journal Article•DOI•

Chemometrics: A Practical Guide By Kenneth R. Beebe, Randy J. Pell, and Mary Beth Seasholtz. Wiley-Interscience Series on Laboratory Automation. John Wiley & Sons: New York, 1998. xi + 348 pp. ISBN 0-471-12451-6. $69.95.

Bruce Slutsky

24 Oct 1998-Journal of Chemical Information and Computer Sciences

202 citations

Journal Article•DOI•

Evaluation of Quantitative Structure−Activity Relationship Methods for Large-Scale Prediction of Chemicals Binding to the Estrogen Receptor†

Weida Tong¹, David R. Lowis, Roger Perkins, Yu Chen, William J. Welsh², Dean W. Goddette, Daniel M. Sheehan³•Institutions (3)

University of Missouri–St. Louis¹, University of Missouri², Food and Drug Administration³

20 May 1998-Journal of Chemical Information and Computer Sciences

TL;DR: Among the QSAR methods considered, HQSAR appears to offer many attractive features, such as speed, reproducibility and ease of use, which portend its utility for prioritizing large numbers of potential EDCs for subsequent toxicological testing and risk assessment.

Abstract: Three different QSAR methods, Comparative Molecular Field Analysis (CoMFA), classical QSAR (utilizing the CODESSA program), and Hologram QSAR (HQSAR), are compared in terms of their potential for screening large data sets of chemicals as endocrine disrupting compounds (EDCs). While CoMFA and CODESSA (Comprehensive Descriptors for Structural and Statistical Analysis) have been commercially available for some time, HQSAR is a novel QSAR technique. HQSAR attempts to correlate molecular structure with biological activity for a series of compounds using molecular holograms constructed from counts of sub-structural molecular fragments. In addition to using r2 and q2 (cross-validated r2) in assessing the statistical quality of QSAR models, another statistical parameter was defined to be the ratio of the standard error to the activity range. The statistical quality of the QSAR models constructed using CoMFA and HQSAR techniques was comparable and was generally better than that of the models produced with CODESSA. It is nota...

196 citations

Journal Article•DOI•

The Vertex-Connectivity Index Revisited

Dragan Amić, Drago Bešlo, Bono Lučić, Sonja Nikolić, Nenad Trinajstić

01 Aug 1998-Journal of Chemical Information and Computer Sciences

TL;DR: A search for optimum molecular descriptors based on the connectivity index found that in most cases the optimum value of the exponent is indeed different from -0.5, and suggests that a modified version of the (valence) vertex-connectivity index should be routinely employed in the structure-property modeling instead of the standard versions of the index.

Abstract: We report a search for optimum molecular descriptors based on the connectivity index. A suggestion made by several authors that the exponent -0.5 used in the standard formula for computing the connectivity index may not be the optimum for modeling some molecular properties was reexamined. We considered several molecular properties and found that in most cases the optimum value of the exponent is indeed different from -0.5. We suggest that a modified version of the (valence) vertex-connectivity index should be routinely employed in the structure-property modeling instead of the standard version of the index.
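
To make the role of the exponent concrete, here is a minimal sketch of a connectivity index with an adjustable exponent. It assumes the simple (non-valence) form, a sum over bonds of the product of the two vertex degrees raised to the exponent, computed on a hydrogen-suppressed graph. The function name and the edge-list input are illustrative choices, not the authors' code, and the valence variant mentioned in the abstract is not implemented.

```python
def connectivity_index(edges, exponent=-0.5):
    """Sum of (deg(u) * deg(v)) ** exponent over all edges (u, v).

    `edges` lists the bonds of a hydrogen-suppressed molecular graph.
    exponent=-0.5 gives the standard vertex-connectivity index.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum((degree[u] * degree[v]) ** exponent for u, v in edges)


# Carbon skeletons as edge lists (atom labels are arbitrary integers).
butane = [(1, 2), (2, 3), (3, 4)]      # C1-C2-C3-C4
isobutane = [(1, 2), (2, 3), (2, 4)]   # C2 is the branching carbon

chi_butane = connectivity_index(butane)            # standard exponent
chi_isobutane = connectivity_index(isobutane)
chi_butane_alt = connectivity_index(butane, -1.0)  # a modified exponent
```

Treating the exponent as a parameter is what allows it to be tuned per property, which is the kind of search the abstract reports.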

193 citations

Journal Article•DOI•

Identification of biological activity profiles using substructural analysis and genetic algorithms.

Valerie J. Gillet¹, Peter Willett, John Bradshaw•Institutions (1)

University of Sheffield¹

24 Feb 1998-Journal of Chemical Information and Computer Sciences

TL;DR: A substructural analysis approach is used to calculate biological activity profiles, which contain weights that describe the differential occurrences of generic features in active molecules taken from the World Drug Index and in (presumed) inactive molecules taken from the SPRESI database.

Abstract: A substructural analysis approach is used to calculate biological activity profiles, which contain weights that describe the differential occurrences of generic features (specifically, the numbers of hydrogen-bond donors and acceptors, the numbers of rotatable bonds and aromatic rings, the molecular weights, and the 2κα shape descriptors) in active molecules taken from the World Drug Index and in (presumed) inactive molecules taken from the SPRESI database. Even with such simple structural descriptors, the profiles discriminate effectively between active and inactive compounds. The effectiveness of the approach is further increased by using a genetic algorithm for the calculation of the weights comprising a profile. The methods have been successfully applied to a number of different data sets.

162 citations

Journal Article•DOI•

Correlation and Prediction of the Refractive Indices of Polymers by QSPR

Alan R. Katritzky¹, Sulev Sild¹, Mati Karelson¹•Institutions (1)

University of Florida¹

22 Oct 1998-Journal of Chemical Information and Computer Sciences

TL;DR: A general QSPR model was developed for the prediction of the refractive index for a diverse set of amorphous homopolymers with the CODESSA program and the average prediction error by this model is 0.9%.

Abstract: A general QSPR model (R2 = 0.940, s = 0.018) was developed for the prediction of the refractive index for a diverse set of amorphous homopolymers with the CODESSA program. The five descriptors, involved in the model, are calculated from the structure of the repeating unit of the polymer. The average prediction error by this model is 0.9%.

156 citations

Journal Article•DOI•

QSPR Studies on Vapor Pressure, Aqueous Solubility, and the Prediction of Water−Air Partition Coefficients

Alan R. Katritzky¹, Yilin Wang¹, Sulev Sild¹, Tarmo Tamm¹, Mati Karelson²•Institutions (2)

University of Florida¹, University of Tartu²

30 Jun 1998-Journal of Chemical Information and Computer Sciences

TL;DR: The vapor pressures and the aqueous solubilities of 411 compounds with a large structural diversity were investigated using a quantitative structure−property relationship (QSPR) approach to allow the reliable prediction of water−air partition coefficients.

Abstract: The vapor pressures and the aqueous solubilities of 411 compounds with a large structural diversity were investigated using a quantitative structure−property relationship (QSPR) approach. A five-descriptor equation with the squared correlation coefficient (R2) of 0.949 for vapor pressure and a six-descriptor equation with R2 of 0.879 for aqueous solubility were obtained. All descriptors were derived solely from the chemical structure of the compounds. The QSPR correlation equations for vapor pressure and aqueous solubility allow the reliable prediction of water−air partition coefficients.

152 citations

Journal Article•DOI•

Aqueous solubility prediction of drugs based on molecular topology and neural network modeling

Jarmo Huuskonen¹, Marja Salo¹, Jyrki Taskinen¹•Institutions (1)

University of Helsinki¹

24 Feb 1998-Journal of Chemical Information and Computer Sciences

TL;DR: A method for predicting the aqueous solubility of drug compounds was developed based on topological indices and artificial neural network (ANN) modeling, which yielded positive results for acidic, neutral, and basic drugs of different structural classes.

Abstract: A method for predicting the aqueous solubility of drug compounds was developed based on topological indices and artificial neural network (ANN) modeling. The aqueous solubility values for 211 drugs and related compounds representing acidic, neutral, and basic drugs of different structural classes were collected from the literature. The data set was divided into a training set (n = 160) and a randomly chosen test set (n = 51). Structural parameters used as inputs in a 23-5-1 artificial neural network included 14 atom-type electrotopological indices and nine other topological indices. For the test set, a predictive r2 = 0.86 and s = 0.53 (log units) were achieved.

Journal Article•DOI•

Prediction of Aqueous Solubility of Organic Compounds from Molecular Structure

Brooke E. Mitchell¹, Peter C. Jurs¹•Institutions (1)

Pennsylvania State University¹

03 Apr 1998-Journal of Chemical Information and Computer Sciences

TL;DR: Genetic algorithm and simulated annealing routines, in conjunction with MLR and CNN, are used to select subsets of descriptors that accurately relate to aqueous solubility.

Abstract: Multiple linear regression (MLR) and computational neural networks (CNN) are utilized to develop mathematical models to relate the structures of a diverse set of 332 organic compounds to their aqueous solubilities. Topological, geometric, and electronic descriptors are used to numerically represent structural features of the data set compounds. Genetic algorithm and simulated annealing routines, in conjunction with MLR and CNN, are used to select subsets of descriptors that accurately relate to aqueous solubility. Nonlinear models with nine calculated structural descriptors are developed that have a training set root-mean-square error of 0.394 log units for compounds which span a −log(molarity) range from −2 to +12 log units.

Journal Article•DOI•

Quantitative Structure−Property Relationship (QSPR) Correlation of Glass Transition Temperatures of High Molecular Weight Polymers

Alan R. Katritzky¹, Sulev Sild¹, Victor S. Lobanov¹, Mati Karelson¹•Institutions (1)

University of Florida¹

24 Feb 1998-Journal of Chemical Information and Computer Sciences

TL;DR: A new quantitative structure−property relationship (QSPR) five-parameter correlation of molar glass transition temperatures (Tg/M) for a diverse set of 88 polymers is developed with the Comprehensive Descriptors for Structural and Statistical Analysis (CODESSA) program.

Abstract: A new quantitative structure−property relationship (QSPR) five-parameter correlation (R2 = 0.946) of molar glass transition temperatures (Tg/M) for a diverse set of 88 polymers is developed with the Comprehensive Descriptors for Structural and Statistical Analysis (CODESSA) program. The descriptors are all calculated directly from the molecular structure, and the approach given is applicable, in principle, to all linear polymers of regular structure.

Journal Article•DOI•

Mining the NCI anticancer drug discovery databases: genetic function approximation for the QSAR study of anticancer ellipticine analogues.

Leming M. Shi¹, Yi Fan², Timothy G. Myers², Patrick M. O'Connor², Kenneth D. Paull², Stephen H. Friend¹, John N. Weinstein•Institutions (2)

Fred Hutchinson Cancer Research Center¹, National Institutes of Health²

30 Jan 1998-Journal of Chemical Information and Computer Sciences

TL;DR: This study analyzes the antitumor activity patterns of 112 ellipticine analogues and investigates the quantitative structure-activity relationships (QSAR) of these compounds, in particular with respect to the influence of p53-status and the CNS cell selectivity of the activity patterns.

Abstract: The U.S. National Cancer Institute (NCI) conducts a drug discovery program in which ∼10 000 compounds are screened every year in vitro against a panel of 60 human cancer cell lines from different organs of origin. Since 1990, ∼63 000 compounds have been tested, and their patterns of activity profiled. Recently, we analyzed the antitumor activity patterns of 112 ellipticine analogues using a hierarchical clustering algorithm. Dramatic coherence between molecular structures and activity patterns was observed qualitatively from the cluster tree. In the present study, we further investigate the quantitative structure−activity relationships (QSAR) of these compounds, in particular with respect to the influence of p53-status and the CNS cell selectivity of the activity patterns. Independent variables (i.e., chemical structural descriptors of the ellipticine analogues) were calculated from the Cerius2 molecular modeling package. Important structural descriptors, including partial atomic charges on the ellipticin...

Journal Article•DOI•

Approach to Estimation and Prediction for Normal Boiling Point (NBP) of Alkanes Based on a Novel Molecular Distance-Edge (MDE) Vector, λ

Shushen Liu¹, Chenzhong Cao¹, Zhi-Liang Li¹•Institutions (1)

University of Science and Technology of China¹

07 Mar 1998-Journal of Chemical Information and Computer Sciences

TL;DR: A predictive model has been developed by using 125 isomers in alkanes as the training set, and its performance was certified by employing 25 alkanes chosen randomly as the test set from a total of 150 alkane compounds; excellent predicted results were obtained.

Abstract: Models that estimate and predict the normal boiling point (NBP) of alkanes based on a molecular distance-edge (MDE) vector, λ, have been developed by using multiple linear regression (MLR) methods. The structures of the examined compounds are selectively described by an MDE vector structure descriptor, a novel molecular distance-edge vector recently developed in our laboratory. MLR was used to develop a linear model containing ten variables with a high precision root mean squares error (RMS = 4.985K) and a good correlation with the correlation coefficient (R = 0.9948). In addition, a predictive model has been developed by using 125 isomers in alkanes as the training set, and its performance was certified by employing 25 alkanes chosen randomly as the test set from a total of 150 alkane compounds; excellent predicted results were obtained with the RMS and R values found between the calculated value and observed NBP being RMS = 4.486K and R = 0.9945.

Journal Article•DOI•

Spectral moments of the edge adjacency matrix in molecular graphs. 3. molecules containing cycles

Ernesto Estrada¹•Institutions (1)

University of Valencia¹

19 Jan 1998-Journal of Chemical Information and Computer Sciences

TL;DR: A substructural approach to quantitative structure−property relationships based on the spectral moments of the edge adjacency matrix is extended to molecules containing cycles to describe the boiling points of a series of 80 cycloalkanes.

Abstract: A substructural approach to quantitative structure−property relationships based on the spectral moments of the edge adjacency matrix is extended to molecules containing cycles. Spectral moments are expressed as linear combinations of structural fragments of any kind of nonweighted graphs. The boiling points of a series of 80 cycloalkanes were well described by the present approach. The predictive power of the model was proved by using a test set of another 26 compounds. An equation that expresses the contribution of the different fragments of the molecules to the boiling point was obtained.
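
As a rough illustration of the descriptor named in this abstract, the sketch below builds an edge adjacency matrix (two bonds are adjacent when they share an atom) and computes spectral moments as traces of its powers, using the standard definition of the k-th moment as tr(E^k). The helper names are hypothetical, and the regression step that relates moments to boiling points is not shown.

```python
def edge_adjacency(edges):
    """Matrix E with E[i][j] = 1 when distinct bonds i and j share an atom."""
    n = len(edges)
    return [
        [1 if i != j and set(edges[i]) & set(edges[j]) else 0 for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain-Python product of two square matrices of equal size."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def spectral_moments(edges, max_order):
    """Return [tr(E^0), tr(E^1), ..., tr(E^max_order)]."""
    e = edge_adjacency(edges)
    n = len(e)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # E^0
    moments = []
    for _ in range(max_order + 1):
        moments.append(sum(power[i][i] for i in range(n)))
        power = matmul(power, e)
    return moments


# Ring skeletons as bond lists (hydrogen-suppressed graphs).
cyclopropane = [(1, 2), (2, 3), (1, 3)]
cyclobutane = [(1, 2), (2, 3), (3, 4), (4, 1)]

moments_c3 = spectral_moments(cyclopropane, 3)
moments_c4 = spectral_moments(cyclobutane, 3)
```

In this toy case the third moment is nonzero for the three-membered ring and zero for the four-membered one, which hints at how ring fragments can enter the moments.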

Journal Article•DOI•

Rational combinatorial library design. 1. Focus-2D: A new approach to the design of targeted combinatorial chemical libraries

Weifan Zheng¹, Sung Jin Cho¹, Alexander Tropsha¹•Institutions (1)

University of North Carolina at Chapel Hill¹

04 Mar 1998-Journal of Chemical Information and Computer Sciences

TL;DR: Frequency analysis of building block composition of selected virtual compounds identifies building blocks that can be used in combinatorial synthesis of chemical libraries with high similarity to the lead molecules.

Abstract: We describe a new computational approach, called Focus-2D, to the rational design of targeted combinatorial chemical libraries. This approach is based on the hypothesis that structurally similar compounds display similar biological activity profiles. Building blocks that are used in a combinatorial chemical synthesis are randomly assembled to produce virtual library compounds. Individual library compounds are represented by Kier−Hall topological descriptors. Molecular similarities between compounds are evaluated quantitatively by modified pairwise Euclidean distances in multidimensional descriptor space. Simulated annealing is used to search the potentially large structural space of virtual chemical libraries to identify compounds similar to lead molecules. Frequency analysis of building block composition of selected virtual compounds identifies building blocks that can be used in combinatorial synthesis of chemical libraries with high similarity to the lead molecules. We show that this method correctly i...

Journal Article•DOI•

Virtual compound libraries : a new approach to decision making in molecular discovery research

Richard D. Cramer, David E. Patterson, Robert D. Clark, Farhad Soltanshahi, Michael S. Lawless

17 Jul 1998-Journal of Chemical Information and Computer Sciences

TL;DR: Issues to be considered include fundamental data structures, neighborhood searching principles, useful searching approaches and techniques, library definition and construction, algorithmic details of library comparison, and user interfaces.

Abstract: Virtual compound libraries, descriptions of all of the structures that might be produced by specified transformations involving specified reagents, are especially useful in molecular discovery when suitably fast and relevant searching techniques are available. Issues to be considered include fundamental data structures, neighborhood searching principles, useful searching approaches and techniques, library definition and construction, algorithmic details of library comparison, and user interfaces.

Journal Article•DOI•

Superposition of Three-Dimensional Chemical Structures Allowing for Conformational Flexibility by a Hybrid Method

Sandra Handschuh¹, Markus Wagener¹, Johann Gasteiger¹•Institutions (1)

University of Erlangen-Nuremberg¹

28 Feb 1998-Journal of Chemical Information and Computer Sciences

TL;DR: The superposition method described here combines a genetic algorithm with a numerical optimization method to adequately address the conformational flexibility of ligand molecules.

Abstract: The superposition of three-dimensional structures is the first task in the evaluation of the largest common three-dimensional substructure of a set of molecules. This is an important step in the identification of a pharmacophoric pattern for molecules that bind to the same receptor. The superposition method described here combines a genetic algorithm with a numerical optimization method. A major goal is to adequately address the conformational flexibility of ligand molecules. The genetic algorithm optimizes in a nondeterministic process the size and the geometric fit of the substructures. The geometric fit is further improved by changing torsional angles combining the genetic algorithm and the directed tweak method. This directed tweak method is based on a numerical quasi-Newton optimization method. Only one starting conformation per molecule is necessary. Molecules having several rotatable bonds and quite different initial conformations are modified to find large structural similarities. A set of angiote...

Journal Article•DOI•

Rational Combinatorial Library Design. 2. Rational Design of Targeted Combinatorial Peptide Libraries Using Chemical Similarity Probe and the Inverse QSAR Approaches

Sung Jin Cho¹, Weifan Zheng¹, Alexander Tropsha¹•Institutions (1)

University of North Carolina at Chapel Hill¹

04 Mar 1998-Journal of Chemical Information and Computer Sciences

TL;DR: A novel strategy for rational design of targeted peptide libraries to select a subset of natural amino acids that are most likely to be present in active peptides for the synthesis of library is developed.

Abstract: We have developed a novel strategy for rational design of targeted peptide libraries. The goal of this method is to select a subset of natural amino acids that are most likely to be present in active peptides for the synthesis of library. Two different protocols are employed where chemical structures of peptides are described either by topological indices or by a combination of physicochemical descriptors for individual amino acids. The selection of a peptide as a candidate for the targeted library is based either on its chemical similarity to a biologically active probe or on its biological activity predicted from a preconstructed quantitative structure−activity (QSAR) equation. The optimization of the library is achieved by means of genetic algorithms (GA). This method was tested by rational design of the library with bradykinin-potentiating activity. Twenty-eight bradykinin-potentiating pentapeptides were used as a training set for the development of a QSAR equation, and, alternatively, two active pent...

Journal Article•DOI•

Recursive Partitioning Analysis of a Large Structure−Activity Data Set Using Three-Dimensional Descriptors

Xin Chen¹, Andrew Rusinko III¹, S. Stanley Young¹•Institutions (1)

University of North Carolina at Chapel Hill¹

24 Oct 1998-Journal of Chemical Information and Computer Sciences

TL;DR: The idea is to encode the three-dimensional features of chemical compounds into bit strings and use RP to determine the important features that statistically correlate to the biological activities of these compounds.

Abstract: Large chemical data sets are becoming available from high throughput screening of corporate collections and chemical libraries. There is a growing need to develop three-dimensional pharmacophores from these large data sets to guide database screening, chemical library design, and lead optimization. Recursive partitioning (RP) is a statistical method that can be used to analyze very large data sets; data sets of over 100 000 observations and over 2 000 000 descriptors pose no computational problems. Our idea is to encode the three-dimensional features of chemical compounds into bit strings and use RP to determine the important features that statistically correlate to the biological activities of these compounds. This kind of structure−activity relationship analysis (SAR) can be considered as the first step to the goal of pharmacophore identification for large chemical data sets. We report here our RP work that for the first time successfully retrieved 3D SARs from a large, heterogeneous data set of 1650 mo...

Journal Article•DOI•

Normal Boiling Points for Organic Compounds: Correlation and Prediction by a Quantitative Structure−Property Relationship

Alan R. Katritzky¹, Victor S. Lobanov¹, Mati Karelson¹•Institutions (1)

University of Florida¹

19 Jan 1998-Journal of Chemical Information and Computer Sciences

TL;DR: The applicability of these two descriptors for the prediction of boiling points for various other classes of organic compounds was investigated by employing a diverse data set of 612 organic compounds containing C, H, N, O, S, F, Cl, Br, and I.

Abstract: We recently reported a successful correlation of the normal boiling points of 298 organic compounds containing O, N, Cl, and Br with two molecular descriptors.1 In the present study the applicabili...

Journal Article•DOI•

QSPR Prediction of Vapor Pressure from Solely Theoretically-Derived Descriptors

Cikui Liang¹, David A. Gallagher•Institutions (1)

Oregon Health & Science University¹

19 Feb 1998-Journal of Chemical Information and Computer Sciences

TL;DR: A model to predict vapor pressure from only computationally derived molecular descriptors, allowing study of hypothetical structures, is described here and proves to be more accurate and works over a wider range of compound classes than most previously reported models.

Abstract: To date, most reported quantitative structure−property relationship (QSPR) methods to predict vapor pressure rely on, at least, some empirical data, such as boiling points, critical pressures, and critical temperatures. This limits their usefulness to available chemicals and incurs the time and expense of experimentation. A model to predict vapor pressure from only computationally derived molecular descriptors, allowing study of hypothetical structures, is described here. Several multilinear regressions and artificial neural network analyses were tested with a range of descriptors (e.g., topological and quantum mechanical) derived solely from computations on molecular structure data. From a set of 479 compounds, a linear regression with an r2 of 0.960 was achieved using polarizability and polar functional group counts as descriptors. This new computationally based model also proves to be more accurate and works over a wider range of compound classes than most previously reported models.

Journal Article•DOI•

Different Discrete Wavelet Transforms Applied to Denoising Analytical Data

Chunsheng Cai¹, Peter de B. Harrington¹•Institutions (1)

Ohio University¹

25 Sep 1998-Journal of Chemical Information and Computer Sciences

TL;DR: Although there exists an infinite variety of wavelet transformations, 22 orthonormal wavelet transforms that are typically used, which include Haar, 9 daublets, 5 coiflets, and 7 symmlets, were evaluated and four threshold selection methods have been studied.

Abstract: Discrete wavelet transform (DWT) denoising contains three steps: forward transformation of the signal to the wavelet domain, reduction of the wavelet coefficients, and inverse transformation to the native domain. Three aspects that should be considered for DWT denoising include selecting the wavelet type, selecting the threshold, and applying the threshold to the wavelet coefficients. Although there exists an infinite variety of wavelet transformations, 22 orthonormal wavelet transforms that are typically used, which include Haar, 9 daublets, 5 coiflets, and 7 symmlets, were evaluated. Four threshold selection methods have been studied: universal, minimax, Stein's unbiased estimate of risk (SURE), and minimum description length (MDL) criteria. The application of the threshold to the wavelet coefficients includes global (hard, soft, garrote, and firm), level-dependent, data-dependent, translation invariant (TI), and wavelet package transform (WPT) thresholding methods. The different DWT-based denoising m...
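
The three denoising steps listed in this abstract (forward transform, coefficient reduction, inverse transform) can be sketched as follows. This is a minimal illustration that assumes a single-level orthonormal Haar transform, soft thresholding of the detail coefficients only, and the universal threshold σ·sqrt(2 ln n) with a noise level `sigma` supplied by the caller. All function names are hypothetical, the input length must be even, and none of the other wavelets or threshold rules from the study are implemented.

```python
import math


def haar_forward(signal):
    """Single-level orthonormal Haar transform of an even-length signal."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, returning the signal in its native domain."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; zero those below t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def universal_threshold(sigma, n):
    """Universal threshold sigma * sqrt(2 ln n) for n samples."""
    return sigma * math.sqrt(2.0 * math.log(n))


def denoise(signal, sigma):
    """Forward transform, threshold the details, inverse transform."""
    approx, detail = haar_forward(signal)
    t = universal_threshold(sigma, len(signal))
    return haar_inverse(approx, soft_threshold(detail, t))


noisy = [1.0, 1.2, 1.0, 0.8]        # toy data, not from the paper
smoothed = denoise(noisy, sigma=0.2)
```

Swapping `soft_threshold` for a hard rule, or `universal_threshold` for another selection method, changes only one step of the pipeline, which is why the abstract can treat these choices as separate aspects.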

Journal Article•DOI•

Correlation of the Aqueous Solubility of Hydrocarbons and Halogenated Hydrocarbons with Molecular Structure

Paul D. T. Huibers¹, Alan R. Katritzky¹•Institutions (1)

Massachusetts Institute of Technology¹

23 Mar 1998-Journal of Chemical Information and Computer Sciences

TL;DR: The use of descriptors calculated only from molecular structure eliminates the need for experimental determination of properties for use in the correlation and allows for the estimation of aqueous solubility for molecules not yet synthesized or isolated.

Abstract: The aqueous solubilities of a set of 109 hydrocarbons and 132 halogenated hydrocarbons (total 241) are correlated by a three term equation using descriptors calculated solely from molecular structure, with a correlation coefficient (R) of 0.979 and a standard error (s) of 0.386 log units. This equation allows the estimation of aqueous solubilities of hydrocarbons and halogenated hydrocarbons (including polychlorinated biphenyls). The key descriptor is the molecular volume, modified by topological and electrostatic terms. The use of descriptors calculated only from molecular structure eliminates the need for experimental determination of properties for use in the correlation and allows for the estimation of aqueous solubility for molecules not yet synthesized or isolated.

Journal Article•DOI•

Isomorphism, Automorphism Partitioning, and Canonical Labeling Can Be Solved in Polynomial-Time for Molecular Graphs

Jean-Loup Faulon

28 Mar 1998-Journal of Chemical Information and Computer Sciences

TL;DR: This paper presents the theoretical results that for all molecules, the problems of isomorphism, automorphism partitioning, and canonical labeling are polynomial-time problems.

Abstract: The graph isomorphism problem belongs to the class of NP problems, and has been conjectured intractable, although probably not NP-complete. However, in the context of chemistry, because molecules are a restricted class of graphs, the problem of graph isomorphism can be solved efficiently (i.e., in polynomial-time). This paper presents the theoretical results that for all molecules, the problems of isomorphism, automorphism partitioning, and canonical labeling are polynomial-time problems. Simple polynomial-time algorithms are also given for planar molecular graphs and used for automorphism partitioning of paraffins, polycyclic aromatic hydrocarbons (PAHs), fullerenes, and nanotubes.
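
To make the idea of automorphism partitioning concrete, the sketch below groups atoms into candidate symmetry classes by iterative neighbourhood refinement (in the spirit of Morgan-style labelling). This is a swapped-in textbook technique, not the algorithm of the paper: refinement gives classes that are necessary but not always sufficient for true automorphism orbits (highly regular graphs can defeat it). The function name and input layout are assumptions.

```python
def refine_classes(adjacency, labels):
    """Partition atoms into candidate symmetry classes.

    adjacency: list of neighbour-index lists, one per atom (hypothetical layout).
    labels: initial atom labels, e.g. element symbols.
    Returns one integer class id per atom.
    """
    colors = list(labels)
    while True:
        # An atom's signature is its own class plus the sorted classes of its neighbours.
        signatures = [
            (colors[v], tuple(sorted(colors[u] for u in adjacency[v])))
            for v in range(len(adjacency))
        ]
        ranking = {sig: i for i, sig in enumerate(sorted(set(signatures)))}
        new_colors = [ranking[sig] for sig in signatures]
        # Stop once a refinement round no longer splits any class.
        if len(set(new_colors)) == len(set(colors)):
            return new_colors
        colors = new_colors
```

For the carbon skeleton of n-butane, `refine_classes([[1], [0, 2], [1, 3], [2]], ["C"] * 4)` places the two terminal carbons in one class and the two inner carbons in another.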

Journal Article•DOI•

Bioactive Diversity and Screening Library Selection via Affinity Fingerprinting

Steven L. Dixon¹, Hugo O. Villar•Institutions (1)

Telik, Inc.¹

25 Sep 1998-Journal of Chemical Information and Computer Sciences

TL;DR: It is demonstrated how affinity fingerprints may be used in conjunction with simple algorithms to select active-enriched diverse training sets and to efficiently extract the most active compounds from a large library.

Abstract: The Similarity Principle provides the conceptual framework behind most modern approaches to library sampling and design. However, it is often the case that compounds which appear to be very similar structurally may in fact exhibit quite different activities toward a given target. Conversely, some targets recognize a wide variety of molecules and thus bind compounds that have markedly different structures. Affinity fingerprints largely overcome the difficulties associated with selecting compounds on the basis of structure alone. By describing each compound in terms of its binding affinity to a set of functionally dissimilar proteins, fundamental factors relevant to binding and biological activity are automatically encoded. We demonstrate how affinity fingerprints may be used in conjunction with simple algorithms to select active-enriched diverse training sets and to efficiently extract the most active compounds from a large library.
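
As an illustration of selecting a diverse set from affinity fingerprints, the sketch below treats each compound as a vector of binding affinities to a reference protein panel and applies MaxMin selection, a common diversity heuristic. The abstract does not say which "simple algorithms" the authors used, so the choice of MaxMin, the Euclidean distance, and the function names are all assumptions.

```python
import math


def euclidean(a, b):
    """Distance between two affinity fingerprints of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def maxmin_select(fingerprints, k, seed=0):
    """Pick up to k compound indices, starting from `seed`, each time adding
    the compound whose nearest already-selected neighbour is farthest away."""
    selected = [seed]
    while len(selected) < min(k, len(fingerprints)):
        best, best_dist = None, -1.0
        for i, fp in enumerate(fingerprints):
            if i in selected:
                continue
            nearest = min(euclidean(fp, fingerprints[j]) for j in selected)
            if nearest > best_dist:
                best, best_dist = i, nearest
        selected.append(best)
    return selected
```

With made-up two-protein fingerprints such as `[[0, 0], [0.1, 0], [5, 5], [5, 5.1], [10, 0]]`, the near-duplicate at index 1 is skipped in favour of compounds from the other clusters.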

Journal Article•DOI•

ACD Labs/LogP dB 3.5 and ChemSketch 3.5

Gary O. Spessard

28 Oct 1998-Journal of Chemical Information and Computer Sciences

Journal Article•DOI•

Design of Topological Indices. Part 10. Parameters Based on Electronegativity and Covalent Radius for the Computation of Molecular Graph Descriptors for Heteroatom-Containing Molecules

Ovidiu Ivanciuc¹, Teodora Ivanciuc¹, Alexandru T. Balaban¹•Institutions (1)

Politehnica University of Bucharest¹

28 Feb 1998-Journal of Chemical Information and Computer Sciences

TL;DR: A quantitative structure−property relationship study is reported for boiling points of 185 acyclic compounds with one or two oxygen or sulfur atoms (devoid of hydrogen bonding), in terms of four or five molecular descriptors.

Abstract: Two new approaches are presented for the calculation of atom and bond parameters for heteroatom-containing molecules used in computing graph theoretic invariants. In the first approach, the atom and bond weights are computed on the basis of relative atomic electronegativity, using carbon as standard. In the second system, the relative covalent radii are used to compute atom and bond weights, again with the carbon atom as standard. The new definition of the atom and bond parameters leads to a periodic variation versus the atomic number Z, with a more natural variation when compared with the parameters defined only by Z. The two approaches are used to define and compute topological indices based on graph distance. A quantitative structure−property relationship study is reported for boiling points of 185 acyclic compounds with one or two oxygen or sulfur atoms (devoid of hydrogen bonding), in terms of four or five molecular descriptors.

Journal Article•DOI•

Quantitative Structure−Activity Relationships for the Aquatic Toxicity of Polar and Nonpolar Narcotic Pollutants

Eñaut Urrestarazu Ramos¹, Wouter H. J. Vaes¹, Henk J. M. Verhaar¹, Joop L. M. Hermens¹•Institutions (1)

Utrecht University¹

25 Jul 1998-Journal of Chemical Information and Computer Sciences

TL;DR: QSARs were developed for the acute toxicity of narcotic pollutants to the water flea, the guppy, and the pond snail using hydrophobicity (log KOW) and hydrogen bonding capacity descriptors (Q-, Q+, eHOMO, eLUMO).

Abstract: QSARs were developed for the acute toxicity of narcotic pollutants (nonpolar and polar) to the water flea (Daphnia magna), the guppy (Poecilia reticulata), and the pond snail (Lymnaea stagnalis) using hydrophobicity (log KOW) and hydrogen bonding capacity descriptors (Q-, Q+, eHOMO, eLUMO). Toxicity increases with increasing hydrophobicity and to a minor extent with decreasing LUMO energies and increasing absolute charges in the molecule. The models are rationalized by taking into account the composition of biomembranes, into which chemicals must partition for displaying narcosis. The similarity of these results with models for the membrane/water partition coefficients supports the hypothesis that the toxicity of narcotics is directly related to the accumulation in biological membranes. The results indicate that baseline toxicity based on log KOW should be redefined for chemicals for which log KOW is not a good surrogate for partitioning into biological membranes.
