Author

Ruisheng Zhang

Bio: Ruisheng Zhang is an academic researcher from Lanzhou University. The author has contributed to research in topics: Support vector machine & Quantitative structure–activity relationship. The author has an h-index of 26 and has co-authored 134 publications receiving 2,019 citations.


Papers
Journal Article
Huanxiang Liu, Xiaojun Yao, Ruisheng Zhang, Mancang Liu, Zhide Hu, Botao Fan
TL;DR: A new and effective method for predicting the solubility of C60 from its structures is provided and some insight is given into the structural features related to the solubility of C60 in different solvents.
Abstract: A least-squares support vector machine (LSSVM) was used for the first time as a novel machine-learning technique for the prediction of the solubility of C60 in a large number of diverse solvents using calculated molecular descriptors from the molecular structure alone and on the basis of the software CODESSA as inputs. The heuristic method of CODESSA was used to select the correlated descriptors and build the linear model. Both the linear and the nonlinear models can give very satisfactory prediction results: the square of the correlation coefficient R² was 0.892 and 0.903, and the root-mean-square error was 0.126 and 0.116, respectively, for the whole data set. The prediction result of the LSSVM model is better than that obtained by the heuristic method and the reference, which proved LSSVM was a useful tool in the prediction of the solubility of C60. In addition, this paper provided a new and effective method for predicting the solubility of C60 from its structures and gave some insight into the struct...

125 citations

Journal Article
Huanxiang Liu, Ruisheng Zhang, Xiaojun Yao, Mancang Liu, Zhide Hu, Bo Tao Fan
TL;DR: The support vector machine (SVM), a novel type of learning machine, was used for the first time to develop a QSPR model that relates the structures of 35 amino acids to their isoelectric points; the results indicate that the GA-PLS approach is a very effective method for variable selection and that the support vector machine is a very promising tool for nonlinear approximation.
Abstract: The support vector machine (SVM), as a novel type of a learning machine, for the first time, was used to develop a QSPR model that relates the structures of 35 amino acids to their isoelectric point. Molecular descriptors calculated from the structure alone were used to represent molecular structures. The seven descriptors selected using GA-PLS, which is a sophisticated hybrid approach that combines GA as a powerful optimization method with PLS as a robust statistical method for variable selection, were used as inputs of RBFNNs and SVM to predict the isoelectric point of an amino acid. The optimal QSPR model developed was based on support vector machines, which showed the following results: the root-mean-square error of 0.2383 and the prediction correlation coefficient R = 0.9702 were obtained for the whole data set. Satisfactory results indicated that the GA-PLS approach is a very effective method for variable selection, and the support vector machine is a very promising tool for the nonlinear approxima...

111 citations

Journal Article
TL;DR: In this paper, a QSPR study was performed to develop models that relate the structures of 856 organic compounds to their critical temperatures using molecular descriptors derived solely from structure.

96 citations

Journal Article
TL;DR: A novel mutation-enhanced BPSO-SVM algorithm for feature selection is presented; it adjusts the memory of the local and global optimum (LGO) and increases the particles' mutation probability to overcome the premature convergence problem and obtain high-quality features.

74 citations

Journal Article
Huanxiang Liu, Ruisheng Zhang, Xiaojun Yao, Mancang Liu, Zhide Hu, Bo Tao Fan
TL;DR: The support vector machine, as a novel type of learning machine, was used to develop a QSAR model of 57 analogues of ethyl 2-[(3-methyl-2,5-dioxo(3-pyrrolinyl))amino]-4-(trifluoromethyl)pyrimidine-5-carboxylate (EPC), an inhibitor of AP-1 and NF-kappa B mediated gene expression, based on calculated quantum chemical parameters.
Abstract: The support vector machine, as a novel type of learning machine, was used for the first time to develop a QSAR model of 57 analogues of ethyl 2-[(3-methyl-2,5-dioxo(3-pyrrolinyl))amino]-4-(trifluoromethyl)pyrimidine-5-carboxylate (EPC), an inhibitor of AP-1 and NF-kappa B mediated gene expression, based on calculated quantum chemical parameters. The quantum chemical parameters involved in the model are Kier and Hall index (order 3) (KHI3), Information content (order 0) (IC0), YZ Shadow (YZS), Max partial charge for an N atom (MaxPCN), and Min partial charge for an N atom (MinPCN). The mean relative errors of the training set, the validation set, and the testing set are 1.35%, 1.52%, and 2.23%, respectively, and the maximum relative error is less than 5.00%.

73 citations


Cited by
Journal Article
TL;DR: A review of available methods for variable selection within one of the many modeling approaches for high-throughput data, Partial Least Squares Regression, intended to give an understanding of the characteristics of the methods and a basis for selecting an appropriate method for one's own use.

1,180 citations

Journal Article
Jens Kattge, Gerhard Bönisch, Sandra Díaz, Sandra Lavorel, and 751 more authors (314 institutions)
TL;DR: The extent of the trait data compiled in TRY is evaluated and emerging patterns of data coverage and representativeness are analyzed to conclude that reducing data gaps and biases in the TRY database remains a key challenge and requires a coordinated approach to data mobilization and trait measurements.
Abstract: Plant traits (the morphological, anatomical, physiological, biochemical and phenological characteristics of plants) determine how plants respond to environmental factors, affect other trophic levels, and influence ecosystem properties and their benefits and detriments to people. Plant trait data thus represent the basis for a vast area of research spanning from evolutionary biology, community and functional ecology, to biodiversity conservation, ecosystem and landscape management, restoration, biogeography and earth system modelling. Since its foundation in 2007, the TRY database of plant traits has grown continuously. It now provides unprecedented data coverage under an open access data policy and is the main plant trait database used by the research community worldwide. Increasingly, the TRY database also supports new frontiers of trait-based plant research, including the identification of data gaps and the subsequent mobilization or measurement of new data. To support this development, in this article we evaluate the extent of the trait data compiled in TRY and analyse emerging patterns of data coverage and representativeness. Best species coverage is achieved for categorical traits, with almost complete coverage for 'plant growth form'. However, most traits relevant for ecology and vegetation modelling are characterized by continuous intraspecific variation and trait-environmental relationships. These traits have to be measured on individual plants in their respective environment. Despite unprecedented data coverage, we observe a humbling lack of completeness and representativeness of these continuous traits in many aspects. We, therefore, conclude that reducing data gaps and biases in the TRY database remains a key challenge and requires a coordinated approach to data mobilization and trait measurements. This can only be achieved in collaboration with other initiatives.

882 citations

Journal Article
TL;DR: In this Perspective, the current status of NN potentials is reviewed, and their advantages and limitations are discussed.
Abstract: The accuracy of the results obtained in molecular dynamics or Monte Carlo simulations crucially depends on a reliable description of the atomic interactions. A large variety of efficient potentials has been proposed in the literature, but often the optimum functional form is difficult to find and strongly depends on the particular system. In recent years, artificial neural networks (NN) have become a promising new method to construct potentials for a wide range of systems. They offer a number of advantages: they are very general and applicable to systems as different as small molecules, semiconductors and metals; they are numerically very accurate and fast to evaluate; and they can be constructed using any electronic structure method. Significant progress has been made in recent years and a number of successful applications demonstrate the capabilities of neural network potentials. In this Perspective, the current status of NN potentials is reviewed, and their advantages and limitations are discussed.

618 citations

Journal Article
TL;DR: Basic principles and recent case studies are presented to demonstrate the utility of machine learning techniques in chemoinformatics analyses; and limitations and future directions are discussed to guide further development in this evolving field.

593 citations