Author

Min Zhu

Bio: Min Zhu is an academic researcher from Zhejiang University. The author has contributed to research in topics: Random forest & Sample size determination. The author has an h-index of 3 and has co-authored 5 publications receiving 93 citations.

Papers

Journal Article•DOI•

Class Weights Random Forest Algorithm for Processing Class Imbalanced Medical Data

Min Zhu¹, Jing Xia¹, Xiaoqing Jin, Molei Yan, Guolong Cai, Jing Yan, Gangmin Ning¹

Zhejiang University¹

04 Jan 2018-IEEE Access

TL;DR: The validation test on UCI data sets demonstrates that for imbalanced medical data, the proposed method enhanced the overall performance of the classifier while producing high accuracy in identifying both majority and minority class.

Abstract: Classification of class-imbalanced data has drawn significant interest in medical applications. Most existing methods tend to assign samples to the majority class, resulting in bias and, in particular, insufficient identification of the minority class. A novel approach, class weights random forest, is introduced to address the problem by assigning an individual weight to each class instead of a single weight. The validation test on UCI data sets demonstrates that, for imbalanced medical data, the proposed method enhanced the overall performance of the classifier while producing high accuracy in identifying both the majority and minority classes.
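
As a rough illustration of what weighting each class individually can look like, the sketch below derives one weight per class from inverse class frequency and applies those weights when combining tree votes. The abstract does not state how the paper computes its weights, so the inverse-frequency rule, the function names, and the sample data here are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class: n_samples / (n_classes * class_count).

    Assumed weighting rule for illustration; rarer classes get larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_vote(tree_predictions, class_weights):
    """Combine per-tree predictions, scaling each vote by its class weight."""
    scores = Counter()
    for predicted in tree_predictions:
        scores[predicted] += class_weights[predicted]
    return max(scores, key=scores.get)


# Hypothetical imbalanced training labels: 90 majority, 10 minority.
labels = ["healthy"] * 90 + ["sick"] * 10
weights = inverse_frequency_weights(labels)

# Hypothetical votes from 10 trees for a single patient.
votes = ["healthy"] * 6 + ["sick"] * 4
decision = weighted_vote(votes, weights)
```

With a plain majority vote the six "healthy" votes would win; with the assumed weights the four minority-class votes carry more total weight, which is the kind of shift toward the minority class that the abstract describes.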

128 citations

Journal Article•DOI•

Dimensionality Reduction in Complex Medical Data: Improved Self-Adaptive Niche Genetic Algorithm.

Min Zhu¹, Jing Xia¹, Molei Yan, Guolong Cai, Jing Yan, Gangmin Ning¹

Zhejiang University¹

16 Nov 2015-Computational and Mathematical Methods in Medicine

TL;DR: The results show that, by applying INGA, the feature dimensionality of datasets was reduced from 77 to 10 and that the model achieved an accuracy of 92% in predicting 28-day death in sepsis patients, which is significantly higher than other methods.

Abstract: With the development of medical technology, more and more parameters are produced to describe the human physiological condition, forming high-dimensional clinical datasets. In clinical analysis, data are commonly utilized to establish mathematical models and carry out classification. High-dimensional clinical data will increase the complexity of classification, which is often utilized in the models, and thus reduce efficiency. The Niche Genetic Algorithm (NGA) is an excellent algorithm for dimensionality reduction. However, in the conventional NGA, the niche distance parameter is set in advance, which prevents it from adjusting to the environment. In this paper, an Improved Niche Genetic Algorithm (INGA) is introduced. It employs a self-adaptive niche-culling operation in the construction of the niche environment to improve the population diversity and prevent local optimal solutions. The INGA was verified in a stratification model for sepsis patients. The results show that, by applying INGA, the feature dimensionality of datasets was reduced from 77 to 10 and that the model achieved an accuracy of 92% in predicting 28-day death in sepsis patients, which is significantly higher than other methods.

20 citations

Journal Article•DOI•

A Long Short-Term Memory Ensemble Approach for Improving the Outcome Prediction in Intensive Care Unit

Jing Xia¹, Su Pan¹, Min Zhu¹, Guolong Cai, Molei Yan, Qun Su¹, Jing Yan, Gangmin Ning¹

Zhejiang University¹

03 Nov 2019-Computational and Mathematical Methods in Medicine

TL;DR: The results demonstrate that the eLSTM is capable of dynamically predicting the mortality of patients in complex clinical situations.

Abstract: In the intensive care unit (ICU), it is essential to predict the mortality of patients, and mathematical models aid in improving prognostic accuracy. Recently, recurrent neural networks (RNNs), especially long short-term memory (LSTM) networks, have shown advantages in sequential modeling and are promising for clinical prediction. However, ICU data are highly complex due to the diverse patterns of diseases; therefore, instead of a single LSTM model, an ensemble algorithm of LSTM (eLSTM) is proposed, utilizing the superiority of the ensemble framework to handle the diversity of clinical data. The eLSTM algorithm was evaluated on the acknowledged database of ICU admissions, Medical Information Mart for Intensive Care III (MIMIC-III). The investigation of a total of 18415 cases shows that, compared with the clinical scoring systems SAPS II, SOFA, and APACHE II, the random forests classification algorithm, and the single LSTM classifier, the eLSTM model achieved superior performance, with the largest area under the receiver operating characteristic curve (AUROC) of 0.8451 and the largest area under the precision-recall curve (AUPRC) of 0.4862. Furthermore, it offered an early prognosis of ICU patients. The results demonstrate that the eLSTM is capable of dynamically predicting the mortality of patients in complex clinical situations.

18 citations

Journal Article•DOI•

Feature Selection and Optimization of Random Forest Modeling

Min Zhu¹, Jing Xia¹, Mo Lei Yan, Sheng Yu Zhang¹, Guo Long Cai, Jing Yan, Gang Min Ning¹

Zhejiang University¹

01 Nov 2014

TL;DR: For sepsis case data with a small sample size, this paper adopts interval division for the parameters used in random forest modeling: it divides the feature interval into a high-correlation interval and an uncertain-correlation interval and selects data from the two intervals separately for modeling, to reduce the model's generalization error and improve prediction accuracy.

Abstract: The traditional random forest algorithm struggles to classify small-sample data sets well. Because few samples are available during repeated random selection, the resulting trees differ very little from one another, which drowns out correct decisions, increases the generalization error of the model, and lowers the prediction rate. Given the sample size of the sepsis case data, this paper adopts interval division for the parameters used in random forest modeling: the feature interval is divided into a high-correlation interval and an uncertain-correlation interval, and data are selected from the two intervals separately for modeling. This reduces the model's generalization error and improves prediction accuracy.

2 citations

Book Chapter•DOI•

A Quantitative Model for Sepsis Stratification

Jing Xia¹, Min Zhu¹, Shengyu Zhang¹, Molei Yan, Guolong Cai, Jing Yan, Gangmin Ning¹

Zhejiang University¹

01 Jan 2015

TL;DR: Preliminary results show that the established model has the potential to help improve patient management by quickly stratifying sepsis severity, and that it is superior to the conventional APACHE scoring method.

Abstract: Sepsis is a systemic inflammatory response syndrome caused by infection, and it seriously endangers the lives of patients due to its rapid progression and high mortality rate. In clinical practice, there is a strong demand to quantitatively stratify the severity of sepsis for individual management. This work aimed to build a quantitative model for sepsis patients that can stratify disease severity into three levels. For this purpose, clinical data were collected and preprocessed, i.e. screening, normalization and data replenishing. Afterwards, sepsis-sensitive parameters were tested and selected for use as the input of the stratification model. The model applied the Support Vector Machine algorithm. Finally, the model was tested on a total of 522 clinical cases and achieved a stratification accuracy of 67.5%. The performance of the established model is superior to the conventional APACHE scoring method. Preliminary results show that the established model has the potential to help improve patient management by quickly stratifying sepsis severity.

Cited by

Journal Article•DOI•

A Deep CNN-LSTM Model for Particulate Matter (PM2.5) Forecasting in Smart Cities.

Chiou-Jye Huang¹, Ping-Huan Kuo²

Jiangxi University of Science and Technology¹, National Pingtung University²

10 Jul 2018-Sensors

TL;DR: A deep neural network model that integrates the CNN and LSTM architectures is developed, and through historical data such as cumulated hours of rain, cumulated wind speed and PM2.5 concentration, the forecasting accuracy of the proposed CNN-LSTM model (APNet) is verified to be the highest in this paper.

Abstract: In modern society, air pollution is an important topic as this pollution exerts a critically bad influence on human health and the environment. Among air pollutants, Particulate Matter (PM2.5) consists of suspended particles with a diameter equal to or less than 2.5 μm. Sources of PM2.5 can be coal-fired power generation, smoke, or dusts. These suspended particles in the air can damage the respiratory and cardiovascular systems of the human body, which may further lead to other diseases such as asthma, lung cancer, or cardiovascular diseases. To monitor and estimate the PM2.5 concentration, Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) are combined and applied to the PM2.5 forecasting system. To compare the overall performance of each algorithm, four measurement indexes, Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Pearson correlation coefficient and Index of Agreement (IA), are applied to the experiments in this paper. Compared with other machine learning methods, the experimental results showed that the forecasting accuracy of the proposed CNN-LSTM model (APNet) is the highest. The feasibility and practicability of the CNN-LSTM model for forecasting the PM2.5 concentration are also verified in this paper. The main contribution of this paper is to develop a deep neural network model that integrates the CNN and LSTM architectures and uses historical data such as cumulated hours of rain, cumulated wind speed and PM2.5 concentration. In the future, this study can also be applied to the prevention and control of PM2.5.

426 citations

Journal Article•

Epidemiology of osteoporotic fractures

Hiroshi Hagino¹

Tottori University¹

01 Aug 2003-Clinical calcium

TL;DR: Patients with vertebral, hip, distal radius, and proximal humerus fractures are most common among the osteoporosis-related fractures.

Abstract: Patients with vertebral, hip, distal radius, and proximal humerus fractures are most common among the osteoporosis-related fractures. The incidences of these fractures increase with age; however, the patterns of increase differ between the fracture sites. The prevalence of vertebral fracture for Japanese is similar or slightly higher and the incidences of osteoporosis-related limb fractures are lower than those for Caucasians. A decrease in prevalence of vertebral fractures and an increase in the incidence of limb fractures are the secular trend in Japan. Previous fractures are a significant risk factor for both vertebral and hip fractures. Greater physical activity increases the risk of distal radius fractures, and decreases the risk of proximal humerus fractures.

364 citations

Journal Article•DOI•

An Electricity Price Forecasting Model by Hybrid Structured Deep Neural Networks

Ping-Huan Kuo, Chiou-Jye Huang

21 Apr 2018-Sustainability

TL;DR: Experimental results show that compared with other traditional machine learning methods, the prediction performance of the estimating model proposed in this paper is proven to be the best and the feasibility and practicality of electricity price prediction is confirmed.

Abstract: Electricity price is a key influencer in the electricity market. Electricity market trades by each participant are based on electricity price. The electricity price, adjusted with the change in the supply and demand relationship, can reflect the real value of electricity in the transaction process. However, for the power generating party, bidding strategy determines the level of profit, and the accurate prediction of electricity price could make it possible to determine a more accurate bidding price. This can not only reduce transaction risk, but also seize opportunities in the electricity market. In order to effectively estimate electricity price, this paper proposes an electricity price forecasting system based on the combination of two deep neural networks, the Convolutional Neural Network (CNN) and the Long Short Term Memory (LSTM). In order to compare the overall performance of each algorithm, the Mean Absolute Error (MAE) and Root-Mean-Square error (RMSE) evaluating measures were applied in the experiments of this paper. Experiment results show that, compared with other traditional machine learning methods, the prediction performance of the estimating model proposed in this paper is the best. By combining the CNN and LSTM models, the feasibility and practicality of electricity price prediction are also confirmed in this paper.
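
The two evaluation measures named in the abstract have standard definitions, MAE = (1/n) Σ|yᵢ − ŷᵢ| and RMSE = √((1/n) Σ(yᵢ − ŷᵢ)²). A minimal sketch of both follows; the function names and the price values are hypothetical and are not taken from the paper.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the average squared difference; penalizes large errors more."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical hourly electricity prices and forecasts (illustrative values only).
actual_prices = [30.0, 32.0, 35.0, 31.0]
forecast_prices = [29.0, 34.0, 35.0, 28.0]

mae = mean_absolute_error(actual_prices, forecast_prices)
rmse = root_mean_square_error(actual_prices, forecast_prices)
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data, and the gap widens when a few forecasts miss by a large margin.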

131 citations

Journal Article•DOI•

Reviewing ensemble classification methods in breast cancer.

Mohamed Hosni, Ibtissam Abnane, Ali Idri, Juan Manuel Carrillo de Gea¹, José Luis Fernández Alemán¹

University of Murcia¹

01 Aug 2019-Computer Methods and Programs in Biomedicine

TL;DR: This study found that, of the six medical tasks that exist, diagnosis was the most frequently researched, and that the experiment-based empirical type and the evaluation-based research type were the most dominant approaches adopted in the selected studies.

128 citations
