Author

Zi-Jie Sun

University of Electronic Science and Technology of China

Bio: Zi-Jie Sun is an academic researcher from the University of Electronic Science and Technology of China. The author has contributed to research on the topics Composite number and Deep learning. The author has an h-index of 2 and has co-authored 3 publications receiving 8 citations.

Papers

Journal Article

Identification of cyclin protein using gradient boost decision tree algorithm.

Hasan Zulfiqar¹, Shi-Shi Yuan¹, Qin-Lai Huang¹, Zi-Jie Sun¹, Fu-Ying Dao¹, Xiao-Long Yu², Hao Lin¹

University of Electronic Science and Technology of China¹, Hainan University²

19 Jul 2021 - Computational and Structural Biotechnology Journal

TL;DR: In this paper, a gradient boost decision tree (GBDT) classifier was trained on the optimal features to identify cyclins with an accuracy of 93.06% and AUC value of 0.971.

Abstract: Cyclin proteins regulate the cell cycle by forming a complex with cyclin-dependent kinases. Correct recognition of cyclin proteins could provide key clues for studying their functions. However, their sequences share low similarity, which results in poor prediction by sequence similarity-based methods. Thus, a machine learning model to identify cyclin proteins is urgently needed. This study aimed to develop a computational model to discriminate cyclin proteins from non-cyclin proteins. In our model, protein sequences were encoded by seven kinds of features: amino acid composition, composition of k-spaced amino acid pairs, tripeptide composition, pseudo amino acid composition, Geary correlation, normalized Moreau-Broto autocorrelation, and composition/transition/distribution. These features were then optimized using analysis of variance (ANOVA) and minimum redundancy maximum relevance (mRMR) with the incremental feature selection (IFS) technique. A gradient boost decision tree (GBDT) classifier was trained on the optimal features. Five-fold cross-validation showed that our model identifies cyclins with an accuracy of 93.06% and an AUC value of 0.971, both higher than those of two recent studies on the same data.

31 citations
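
Two of the sequence encodings named in the abstract above, amino acid composition and the composition of k-spaced amino acid pairs, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, only the 20 standard residues are counted, and the paper's exact normalization and choice of k are not given in this listing.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumed alphabet)


def amino_acid_composition(seq):
    """Return 20 values: the fraction of each standard residue in seq."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def k_spaced_pair_composition(seq, k):
    """Return 400 values: the fraction of residue pairs separated by k positions.

    A pair is (seq[i], seq[i + k + 1]); k = 0 gives adjacent dipeptides.
    Pairs containing a non-standard residue are skipped (an assumption).
    """
    seq = seq.upper()
    pairs = Counter()
    n_pairs = 0
    for i in range(len(seq) - k - 1):
        a, b = seq[i], seq[i + k + 1]
        if a in AMINO_ACIDS and b in AMINO_ACIDS:
            pairs[a + b] += 1
            n_pairs += 1
    keys = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]
    if n_pairs == 0:
        return [0.0] * len(keys)
    return [pairs[key] / n_pairs for key in keys]
```

Concatenating such vectors gives a fixed-length feature vector per protein, which is the kind of input a feature-selection step and a tree-based classifier can consume.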

Journal Article

iDHS-Deep: an integrated tool for predicting DNase I hypersensitive sites by deep neural network

Fu-Ying Dao¹, Hao Lv¹, Wei Su¹, Zi-Jie Sun¹, Qin-Lai Huang¹, Hao Lin¹

University of Electronic Science and Technology of China¹

02 Sep 2021 - Briefings in Bioinformatics

TL;DR: This paper develops a deep learning-based algorithm to identify whether an unknown sequence region is a potential DNase I hypersensitive site (DHS). The method showed high prediction performance on both training and independent datasets across different cell types and developmental stages, demonstrating its strength in identifying DHSs.

Abstract: A DNase I hypersensitive site (DHS) is a region of chromatin that is hypersensitive to the DNase I enzyme. DHSs are an important part of the noncoding region and contain a variety of regulatory elements, such as promoters, enhancers, and transcription factor-binding sites. Moreover, disease-related (or trait-related) loci are usually enriched in DHS regions. Therefore, the detection of DHS regions is of great significance. In this study, we develop a deep learning-based algorithm to identify whether an unknown sequence region is a potential DHS. The proposed method showed high prediction performance on both training datasets and independent datasets in different cell types and developmental stages, demonstrating its strength in the identification of DHSs. Furthermore, for the convenience of wet-lab researchers, the user-friendly web server iDHS-Deep was established at http://lin-group.cn/server/iDHS-Deep/, by which users can easily distinguish DHSs from non-DHSs and obtain the corresponding developmental stage of a DHS.

22 citations

Journal Article

Deep-4mCW2V: A sequence-based predictor to identify N4-methylcytosine sites in Escherichia coli.

Lourdes Peña-Castillo¹, Hasan Zulfiqar¹, Zi-Jie Sun¹, Qin-Lai Huang¹, Shi-Shi Yuan¹, Hao Lv¹, Fu-Ying Dao¹, Hao Lin¹, Yan-Wen Li²

University of Electronic Science and Technology of China¹, Northeast Normal University²

02 Aug 2021 - Methods

TL;DR: This paper develops a deep learning-based model to predict 4mC sites in the Escherichia coli genome, with DNA sequences encoded by the word embedding technique 'word2vec'.

17 citations

Journal Article

Investigation on axial resistance of steel reinforced concrete cross-shaped columns exposed to high temperature

Yuzhuo Wang, Zi-Jie Sun, Ying Gao, Xiao Tian Zhang, Junlin Gong, Bing-gang Zhang

01 Dec 2022 - Case Studies in Construction Materials

TL;DR: In this article, the fire resistance of steel reinforced concrete (SRC) columns was analyzed under different axial compression ratios (0.4, 0.6) and eccentricities (0, 40, 60, 80 mm).

Cited by

Journal Article

NeuroPred-FRL: an interpretable prediction model for identifying neuropeptide using feature representation learning.

Mehedi Hasan¹,², Ashad Alam³, Watshara Shoombuatong⁴, Hong-Wen Deng³, Balachandran Manavalan⁵, Hiroyuki Kurata¹

Kyushu Institute of Technology¹, Japan Society for the Promotion of Science², Tulane University³, Mahidol University⁴, Ajou University⁵

05 Nov 2021 - Briefings in Bioinformatics

TL;DR: In this paper, a machine learning-based meta-predictor called NeuroPred-FRL was developed by employing the feature representation learning approach, where the predicted probability scores of NPs based on the 66 baseline models were combined as the input feature vector.

Abstract: Neuropeptides (NPs) are the most versatile neurotransmitters in the immune systems that regulate various central anxious hormones. An efficient and effective bioinformatics tool for rapid and accurate large-scale identification of NPs is critical in immunoinformatics, which is indispensable for basic research and drug development. Although a few NP prediction tools have been developed, their prediction performance needs to be improved. In this study, we have developed a machine learning-based meta-predictor called NeuroPred-FRL by employing the feature representation learning approach. First, we generated 66 optimal baseline models by employing 11 different encodings, six different classifiers and a two-step feature selection approach. The predicted probability scores of NPs from the 66 baseline models were combined and used as the input feature vector. Second, in order to enhance the feature representation ability, we applied the two-step feature selection approach to optimize the 66-D probability feature vector and then fed the optimal one into a random forest classifier to construct the final meta-model (NeuroPred-FRL). Benchmarking experiments based on both cross-validation and independent tests indicate that NeuroPred-FRL achieves superior NP prediction performance compared with other state-of-the-art predictors. We believe that the proposed NeuroPred-FRL can serve as a powerful tool for large-scale identification of NPs, facilitating the characterization of their functional mechanisms and expediting their applications in clinical therapy. Moreover, we interpreted some model mechanisms of NeuroPred-FRL by leveraging the robust SHapley Additive exPlanation algorithm.

48 citations

Journal Article

Wearable Flexible Electronics Based Cardiac Electrode for Researcher Mental Stress Detection System Using Machine Learning Models on Single Lead Electrocardiogram Signal

Belal Bin Heyat, Faijan Akhtar, Syed Jafar Abbas, Mohammed Al-Sarem, Abdulrahman Alqarafi, Antony Stalin, Rashid Abbasi, A. Y. Muaad, Dakun Lai, Kaishun Wu

01 Jun 2022 - Biosensors

TL;DR: The findings suggest that the wearable smart T-shirt based on the DT classifier may be used in big data applications and health monitoring, and may help assess cardiovascular and related risk factors in the initial stage based on machine learning techniques.

27 citations

Journal Article

Integrative machine learning framework for the identification of cell-specific enhancers from the human genome.

Shaherin Basith¹, Md. Mehedi Hasan²,³, Gwang Lee¹, Leyi Wei⁴,⁵, Balachandran Manavalan¹

Ajou University¹, Tulane University², Kyushu Institute of Technology³, Shandong University⁴, Xiamen University⁵

05 Nov 2021 - Briefings in Bioinformatics

TL;DR: In this paper, an integrative machine learning (ML)-based framework called Enhancer-IF was proposed for identifying cell-specific enhancers, which comprehensively explores a wide range of heterogeneous features with five commonly used ML methods (random forest, extremely randomized tree, multilayer perceptron, support vector machine and extreme gradient boosting).

Abstract: Enhancers are deoxyribonucleic acid (DNA) fragments which, when bound by transcription factors, enhance the transcription of related genes. Due to their sporadic distribution and similar fractions, identification of enhancers from the human genome seems a daunting task. Compared to the traditional experimental approaches, computational methods with easy-to-use platforms could be efficiently applied to annotate enhancers' functions and physiological roles. To this end, several bioinformatics tools have been developed to identify enhancers. Despite their strong performance, existing methods have certain drawbacks and limitations, including the use of fixed-length sequences for model development and the neglect of cell specificity. A novel predictor that addresses these issues would be beneficial for genome-wide enhancer prediction. In this study, we constructed new datasets for eight different cell types. Utilizing these data, we proposed an integrative machine learning (ML)-based framework called Enhancer-IF for identifying cell-specific enhancers. Enhancer-IF comprehensively explores a wide range of heterogeneous features with five commonly used ML methods (random forest, extremely randomized tree, multilayer perceptron, support vector machine and extreme gradient boosting). Specifically, these five classifiers were trained with seven encodings, yielding 35 baseline models. The outputs of these baseline models were integrated and fed again into the five classifiers to construct five meta-models. Finally, the integration of the five meta-models through ensemble learning improved the model robustness. Our proposed approach showed excellent prediction performance compared to the baseline models on both training and independent datasets in different cell types, thus highlighting the superiority of our approach in the identification of enhancers. We assume that Enhancer-IF will be a valuable tool for screening and identifying potential enhancers from human DNA sequences.

27 citations

Journal Article

Deepm5C: A deep learning-based hybrid framework for identifying human RNA N5-methylcytosine sites using a stacking strategy.

Md. Mehedi Hasan, Sho Tsukiyama, Jae Youl Cho, Hiroyuki Kurata, Md. Ashad Alam, Xiaowen Liu, Balachandran Manavalan, Hong-Wen Deng

01 May 2022 - Molecular Therapy

TL;DR: Deepm5C is a bioinformatics method for identifying RNA m5C sites throughout the human genome; it uses a mixture of three conventional feature-encoding algorithms and a feature derived from word embedding approaches.

25 citations

Journal Article

Phage_UniR_LGBM: Phage Virion Proteins Classification with UniRep Features and LightGBM Model

Wenzheng Bao, Qingyu Cui, Baitong Chen, Bin Yang

15 Apr 2022 - Computational and Mathematical Methods in Medicine

TL;DR: This work proposed the Phage_UniR_LGBM, a model that utilizes the UniRep as the feature and the LightGBM algorithm as the classification model to classify the virion proteins.

Abstract: Phage, the most prevalent creature on the planet, serves a variety of critical roles. Phage's primary role is to facilitate gene-to-gene communication. Phage proteins can be divided into virion proteins and nonvirion proteins. At present, experimental identification is a difficult process that requires a significant amount of laboratory time and expense. Given this situation, it is critical to design practical computational techniques and develop high-performing tools. In this work, Phage_UniR_LGBM is proposed to classify virion proteins. In detail, the model uses UniRep as the feature and the LightGBM algorithm as the classification model. The model is then trained on the training data and evaluated on the testing data with cross-validation. Phage_UniR_LGBM was compared with several state-of-the-art features and classification algorithms. Phage_UniR_LGBM achieves 88.51% in Sp, 89.89% in Sn, 89.18% in Acc, 0.7873 in MCC, and 0.8925 in F1 score.

21 citations