Author

Il-Gyo Chong

Bio: Il-Gyo Chong is an academic researcher from Samsung. The author has contributed to research in topics: Partial least squares regression & Feature selection. The author has an h-index of 4 and has co-authored 8 publications receiving 1427 citations. Previous affiliations of Il-Gyo Chong include Pohang University of Science and Technology.

Papers
Journal ArticleDOI
TL;DR: The nature of the VIP method is explored and compared with other methods through computer simulation experiments considering four factors: the proportion of relevant predictors, the magnitude of correlations between predictors, the structure of regression coefficients, and the magnitude of signal to noise.

1,595 citations

Journal ArticleDOI
TL;DR: The proposed method finds an optimal setting from historical data without constructing an explicit quality function by sequentially partitioning the reduced process variable space using a rule induction method.
Abstract: In process optimization, the setting of the process variables is usually determined by estimating a function that relates the quality to the process variables and then optimizing this estimated function. However, it is difficult to build an accurate function from process data in industrial settings because the process variables are correlated, outliers are included in the data, and the form of the functional relation between the quality and process variables may be unknown. A solution derived from an inaccurate function is normally far from being optimal. To overcome this problem, we use a data mining approach. First, a partial least squares model is used to reduce the dimensionality of the process and quality variables. Then the process settings that yield the best output are identified by sequentially partitioning the reduced process variable space using a rule induction method. The proposed method finds an optimal setting from historical data without constructing an explicit quality function. The propo...

34 citations

Journal ArticleDOI
TL;DR: This paper proposes a new PRIM-like method designed specifically to deal with ordinal discrete variables, and the performance of the proposed method is compared with the original PRIM through an extensive simulation using artificial data sets.
Abstract: This paper deals with process optimization, which establishes the optimal settings of process variables to achieve better quality. To this end, the patient rule induction method (PRIM), widely used in various application areas, could be adopted. However, the PRIM may fail to provide successful solutions when some process variables are discrete. Thus, we propose a new PRIM-like method designed specifically to deal with ordinal discrete variables. For illustration, the proposed method is applied to a real steel-making process. The performance of the proposed method is also compared with the original PRIM through an extensive simulation using artificial data sets.

25 citations

Journal ArticleDOI
TL;DR: This work proposes a variable selection procedure to find the main equipment factors that affect in-process wafer quality in consideration of the following issues: 1) imputation for missing values; 2) semi-supervised regression for unlabeled data; and 3) redundancy among variables.
Abstract: Manufacturing semiconductor wafers involves many sequential processes, and each process has various equipment-related variables or factors, which results in high-dimensional data. However, measuring the quality of all wafers is time and cost intensive, and only a small proportion of the wafers is labeled. Further, equipment factors are not always measured by sensors due to the complicated process. Variable selection, which is performed to reduce the dimensionality of the input variable space while improving or preserving regression performance by selecting important input factors, plays an important role in regression problems. We propose a variable selection procedure to find the main equipment factors that affect in-process wafer quality in consideration of the following issues: 1) imputation for missing values; 2) semi-supervised regression for unlabeled data; and 3) redundancy among variables. In the proposed procedure, partial least squares and least absolute shrinkage and selection operator regression are utilized as prediction models. Experiments using two semiconductor equipment datasets were conducted to evaluate the performance of the proposed procedure.

9 citations

Proceedings ArticleDOI
01 Oct 2017
TL;DR: An Optimal Computing Budget Allocation-based Stochastic Greedy Search algorithm is proposed to allocate machines dynamically: at each step, only one machine is allocated to the best-performing station, until all the machines have been allocated to the Semiconductor Wafer Fabrication System.
Abstract: A general queueing modeling method, Stochastic Timed Automation, is employed to model the Semiconductor Wafer Fabrication System because of its flexibility and fidelity. The machine allocation is formulated as a stochastic integer programming problem with the objective of optimizing the throughput of the Semiconductor Wafer Fabrication System. We propose an Optimal Computing Budget Allocation-based Stochastic Greedy Search algorithm to allocate machines dynamically: at each step, we allocate only one machine to the best-performing station, until all the machines have been allocated to the Semiconductor Wafer Fabrication System. Numerical examples suggest that our method finds a good allocation solution efficiently.

1 citation
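
The one-machine-at-a-time allocation described in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: the paper evaluates candidate allocations by stochastic simulation with Optimal Computing Budget Allocation, whereas the sketch below swaps in a deterministic, made-up throughput function. The names `greedy_allocate` and `toy_throughput`, and the station rates, are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence


def greedy_allocate(
    num_machines: int,
    num_stations: int,
    estimate_throughput: Callable[[Sequence[int]], float],
) -> List[int]:
    """Assign machines one at a time to the station that helps most.

    `estimate_throughput` is an assumed stand-in for the simulation-based
    evaluation used in the paper; here it is just a plain function.
    """
    allocation = [0] * num_stations
    for _ in range(num_machines):
        best_station, best_value = 0, float("-inf")
        for station in range(num_stations):
            # Tentatively add one machine to this station and score it.
            allocation[station] += 1
            value = estimate_throughput(allocation)
            allocation[station] -= 1
            if value > best_value:
                best_station, best_value = station, value
        # Commit the single best move of this round.
        allocation[best_station] += 1
    return allocation


def toy_throughput(
    allocation: Sequence[int],
    rates: Sequence[float] = (1.0, 2.0, 4.0),
) -> float:
    """Hypothetical score with diminishing returns per added machine."""
    return sum(r * (1.0 - 0.5 ** n) for n, r in zip(allocation, rates))


if __name__ == "__main__":
    print(greedy_allocate(7, 3, toy_throughput))
```

With a noisy simulator in place of `toy_throughput`, each candidate would need repeated evaluation, which is the cost the budget-allocation step in the paper is meant to control.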


Cited by
Journal ArticleDOI
TL;DR: Results suggest that some changes to the pilot plant configuration are necessary to reduce power consumption while maximizing biodigester performance; a modification of the typical continuous stirred tank reactor is a promising process because it is relatively stable and can manage considerable amounts of residuals at low operational cost.
Abstract: Intensive poultry production generates over 100,000 t of litter annually in West Virginia and 9×10⁶ t nationwide. Currently available technological alternatives based on thermophilic anaerobic digestion for residuals treatment are diverse. A modification of the typical continuous stirred tank reactor is a promising process because it is relatively stable and can manage considerable amounts of residuals at low operational cost. A 40-m³ pilot plant digester was used for performance evaluation considering energy input and methane production. Results suggest that some changes to the pilot plant configuration are necessary to reduce power consumption while maximizing biodigester performance.

1,287 citations

Journal ArticleDOI
TL;DR: A review of available methods for variable selection within one of the many modeling approaches for high-throughput data, Partial Least Squares Regression, intended to give an understanding of the characteristics of the methods and a basis for selecting an appropriate method for one's own use.

1,180 citations

Journal ArticleDOI
15 Aug 2010-Geoderma
TL;DR: In this article, the root mean square error (RMSE) and the Akaike Information Criterion (AIC) were used to compare different data mining algorithms for modelling soil visible-near infrared (vis-NIR) diffuse reflectance spectra and to assess the interpretability of the results.

928 citations

Journal ArticleDOI
TL;DR: The emphasis in this paper is on how to use variable selection in practice and avoid the most common pitfalls.
Abstract: This paper provides a practical guide to variable selection in chemometrics with a focus on regression-based calibration models. Several approaches, such as genetic algorithms (GAs), jack-knifing, forward selection, etc., are explained; it is also explained how to choose between different kinds of variable selection methods. The emphasis in this paper is on how to use variable selection in practice and avoid the most common pitfalls. Copyright © 2010 John Wiley & Sons, Ltd.

580 citations

Journal ArticleDOI
28 Apr 2016-Nature
TL;DR: It is shown that specific plankton communities, from the surface and deep chlorophyll maximum, correlate with carbon export at 150 m and that the relative abundance of a few bacterial and viral genes can predict a significant fraction of the variability in carbon export in these regions.
Abstract: The biological carbon pump is the process by which CO2 is transformed to organic carbon via photosynthesis, exported through sinking particles, and finally sequestered in the deep ocean. While the intensity of the pump correlates with plankton community composition, the underlying ecosystem structure driving the process remains largely uncharacterized. Here we use environmental and metagenomic data gathered during the Tara Oceans expedition to improve our understanding of carbon export in the oligotrophic ocean. We show that specific plankton communities, from the surface and deep chlorophyll maximum, correlate with carbon export at 150 m and highlight unexpected taxa such as Radiolaria and alveolate parasites, as well as Synechococcus and their phages, as lineages most strongly associated with carbon export in the subtropical, nutrient-depleted, oligotrophic ocean. Additionally, we show that the relative abundance of a few bacterial and viral genes can predict a significant fraction of the variability in carbon export in these regions.

556 citations