
Showing papers on "Applications of artificial intelligence" published in 1969



21 May 1969
TL;DR: The use of the computer program DENDRAL in constructing the total number of possible acyclic structures of C, H, N, and O is described; the program forms the basis for the computer-aided interpretation of mass spectra to be reported in subsequent articles from the authors' laboratories.
Abstract: The use of the computer program DENDRAL in constructing the total number of possible acyclic structures of C, H, N, and O is described. Structures containing either chemical absurdities or undesired functional groups are not constructed if these substructures are explicitly listed. Conversely, the output can be restricted to structures containing any desired functional group(s). Examples of the linear notation used are given. Semilog plots of total numbers of isomers vs. carbon content for selected compositions summarize the results. Some broader implications of the program are discussed; the program forms the basis for the computer-aided interpretation of mass spectra to be reported² in subsequent articles from our laboratories.

Chemists have sensed ever since the theory of structural isomerism was conceived that the number of organic compounds possible was astronomical. In retrospect, therefore, it is surprising that there have been so few attempts to find mathematical procedures for evaluating the number of isomers of a given molecular formula. Such enumerations would be of universal interest in defining the boundaries, scope, and limits of the subject. One specific use of lists of possible isomers is in the computerized inference of chemical structures from mass spectra.

Formal attempts to devise an algorithm yielding the number of acyclic alkanes for a given carbon content began in 1875 with Cayley,³ but it was not until 1931 that Henze and Blair solved this problem.⁴ They found it necessary to derive first the number of isomeric alkyl groups.⁵ The few other references on the application of topology and combinatorial analysis to chemical problems were listed recently by Balaban.⁶

The first general procedure for enumerating the isomers of any given elemental composition was recently devised by Lederberg.⁷ The key to the solution, seen from the viewpoint of topological graph theory (the atoms and bonds of a chemical structure forming the nodes and edges, respectively, of the graph), had been foreshadowed for monofunctional acyclic structures by Henze and Blair.⁴,⁵ It is that any chemical structure, considered as a tree-graph, has a unique centroid. This centroid is either a bond that evenly divides the tree into two parts with equal numbers of atoms (neglecting hydrogen), or a single atom from which each branch carries less than half the atoms. The unique centroid is then the starting point for a canonical mapping of the tree, following rules that arrange the constituent radicals in systematic sequence.

These canons of precedence establish priorities between radicals in terms of, say, the relative number of atoms in each (disregarding hydrogen), heteroatom content, unsaturation present, etc., along lines similar to, but more compactly axiomatized than, the Cahn-Ingold-Prelog⁸ absolute configuration conventions. In this way, the atomic connectivity can be conveyed in a linear notational form (i.e., written on one line), and the format itself contains the information needed to rank the linear formulas for a set of isomers in a canonical dictionary order. Examples of this linear notation are given in Table I, which shows ten topologically possible linear isomers of C4H9NO3 (threonine).

Table I. Ten Topologically Possible Linear Isomers of Threonine Generated by DENDRAL (a)
[Table entries not recovered. Surviving entry numbers: 3050, 3075, 3100, 3125, 3150, 3175, 3200, 3225.]
(a) The conventions used by DENDRAL are as follows: period denotes a single bond, *COOH and *CONH2 are obvious abbreviations, = denotes a double bond, hydrogens and spaces are included for readability. The total output amounted to 3294 topologically possible structures, of which only ten are reproduced. Each molecule shown is represented as three or four radicals joined to a central atom. Entry 3125 corresponds to 2,3-dihydroxy-2-methylpropionamide.

The rules or canons of precedence for writing the linear notation, and for assigning to each isomer of a given composition its unique position in the dictionary list, can then be used to generate such an exhaustive, nonredundant list. It is easy to write down the

(1) This research was financially supported by the Advanced Research Projects Agency of the Office of the Secretary of Defense (Grant SD-183), the National Aeronautics and Space Administration (Grant NGR-05-020-004), and the National Institutes of Health (Grants CM-11309 and AM-04257). The award of a Fulbright Travel Grant (to A. V. R.) from the Australian-American Educational Foundation is gratefully acknowledged.
(2) A. M. Duffield, A. V. Robertson, C. Djerassi, B. G. Buchanan, G. L. Sutherland, E. A. Feigenbaum, and J. Lederberg, J. Am. Chem. Soc., 91, 2977 (1969).
(3) A. Cayley, 1056 (1875).
(4) H. R. Henze and C. M. Blair, J. Am. Chem. Soc., 53, 3077 (1931).
(5) H. R. Henze and C. M. Blair, 3042 (1931).
(6) A. T. Balaban, Rec Chim., Acad. Rep. Populaire 12, 875 (1967).
(7) J. Lederberg, "Topology of Molecules," in "The Mathematical Sciences," The MIT Press, Cambridge, Mass., 1969, p 37.
(8) R. S. Cahn, C. K. Ingold, and V. Prelog, Experientia, 12, 81 (1956); Angew. Chem. Intern. Ed. Engl., 5, 385 (1966).

Djerassi, et al. / Artificial Intelligence for Chemical Inference
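
The centroid rule stated in the text above (either a bond splitting the tree into two equal halves, or a single atom from which every branch carries less than half the atoms, hydrogens neglected) can be illustrated with a minimal Python sketch. The function name find_centroid, the atom labels, and the bond-list input format are assumptions made for this illustration; this is not the DENDRAL program's own code, and multiple bonds are simply treated as single edges of the tree.

```python
from collections import defaultdict


def find_centroid(bonds):
    """Locate the centroid of an acyclic skeleton of non-hydrogen atoms.

    `bonds` is a list of (atom, atom) label pairs (hypothetical input format).
    Returns ("bond", (a, b)) if a bond splits the tree into two equal halves,
    otherwise ("atom", a) for the atom whose branches each hold < n/2 atoms.
    Requires at least one bond.
    """
    adj = defaultdict(list)
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    atoms = sorted(adj)
    n = len(atoms)
    if n != len(bonds) + 1:
        raise ValueError("not a tree: expected one more atom than bonds")

    # Iterative depth-first traversal from an arbitrary root.
    root = atoms[0]
    parent = {root: None}
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        for nb in adj[node]:
            if nb not in parent:
                parent[nb] = node
                stack.append(nb)
    if len(order) != n:
        raise ValueError("not a tree: structure is disconnected or cyclic")

    # Subtree sizes, accumulated from the leaves back toward the root.
    size = {a: 1 for a in atoms}
    for node in reversed(order):
        if parent[node] is not None:
            size[parent[node]] += size[node]

    # Case 1: a bond with exactly half of the atoms on each side.
    for node in atoms:
        if parent[node] is not None and 2 * size[node] == n:
            return ("bond", tuple(sorted((parent[node], node))))

    # Case 2: an atom from which every branch carries fewer than n/2 atoms.
    for node in atoms:
        branches = [
            size[nb] if parent[nb] == node else n - size[node]
            for nb in adj[node]
        ]
        if all(2 * b < n for b in branches):
            return ("atom", node)
    raise ValueError("no centroid found")


# n-Butane skeleton C1-C2-C3-C4: the middle bond divides it 2 + 2.
print(find_centroid([("C1", "C2"), ("C2", "C3"), ("C3", "C4")]))
# -> ('bond', ('C2', 'C3'))

# Threonine skeleton (8 non-hydrogen atoms); labels are illustrative.
threonine = [("C1", "O1"), ("C1", "O2"), ("C1", "C2"), ("C2", "N"),
             ("C2", "C3"), ("C3", "O3"), ("C3", "C4")]
print(find_centroid(threonine))
# -> ('atom', 'C2')
```

For the threonine skeleton the centroid is the alpha carbon, whose three branches hold 3, 1, and 3 atoms, which agrees with the table note that each molecule shown is written as three or four radicals joined to a central atom.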

31 citations