
Showing papers presented at "Bioinformatics and Bioengineering in 2009"


Proceedings Article•DOI•
22 Jun 2009
TL;DR: A segmentation scheme (EMaGACOR) is proposed that integrates Expectation Maximization (EM) based segmentation with a geodesic active contour (GAC), together with a novel heuristic edge-path algorithm that exploits the size of lymphocytes to split contours enclosing overlapping objects.
Abstract: The presence of lymphocytic infiltration (LI) has been correlated with nodal metastasis and tumor recurrence in HER2+ breast cancer (BC), making it important to study LI. The ability to detect and quantify the extent of LI could serve as an image-based prognostic tool for HER2+ BC patients. Lymphocyte segmentation in H & E-stained BC histopathology images is, however, complicated by the similarity in appearance between lymphocyte nuclei and cancer nuclei. Additional challenges include biological variability, histological artifacts, and a high prevalence of overlapping objects. Although active contours are widely employed in segmentation, they are limited in their ability to segment overlapping objects. In this paper, we propose a segmentation scheme (EMaGACOR) that integrates Expectation Maximization (EM) based segmentation with a geodesic active contour (GAC). Additionally, a novel heuristic edge-path algorithm exploits the size of lymphocytes to split contours that enclose overlapping objects. For a total of 62 HER2+ breast biopsy images, EMaGACOR was found to have a detection sensitivity of over 90% and a positive predictive value of over 78%. By comparison, the EMaGAC (model without overlap resolution) and GAC (randomly initialized geodesic active contour) models yielded sensitivities of 57.4% and 26.7%, respectively. Furthermore, EMaGACOR was able to resolve over 92% of overlaps. Our scheme was found to be robust, reproducible, and accurate, and could potentially be applied to other biomedical image segmentation applications.
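The EM half of such a pipeline can be illustrated with a minimal sketch. The snippet below fits a two-component 1D Gaussian mixture to scalar intensities with plain EM; it is only a toy stand-in (EMaGACOR works on colour histopathology images and feeds the EM result into a geodesic active contour, which is not shown), and the function name `em_two_gaussians` is hypothetical.

```python
import math

def em_two_gaussians(xs, iters=50):
    """Fit a 2-component 1D Gaussian mixture to intensities via EM.

    Returns (means, stds, weights). Illustrative sketch only; not the
    paper's implementation.
    """
    # Crude initialisation from the data range.
    mu = [min(xs), max(xs)]
    sd = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            p = [w[k] / (sd[k] * math.sqrt(2 * math.pi))
                 * math.exp(-0.5 * ((x - mu[k]) / sd[k]) ** 2) for k in range(2)]
            s = p[0] + p[1]
            resp.append([p[0] / s, p[1] / s])
        # M-step: re-estimate parameters from the soft assignments.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
            w[k] = nk / len(xs)
    return mu, sd, w
```

On bimodal intensity data the two recovered means land near the two modes, giving a per-pixel soft labeling that a contour model could then refine.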

68 citations


Proceedings Article•DOI•
22 Jun 2009
TL;DR: The system concept and design principles of Wedjat are introduced, a smart phone application designed to help patients avoiding medication errors and its medication scheduling algorithms and the modular implementation of mobile computing application are introduced.
Abstract: Out-patient medication administration has been identified as the most error-prone procedure in modern health¬care. Under or over doses due to erratic in-takes, drug-drug or drug-food interactions caused by un-reconciled prescriptions and the absence of in-take enforcement and monitoring mechanisms have caused medication errors to become the common cases of all medical errors. Most medication administration errors were made when patients bought different prescribed and over-the-counter medicines from several drug stores and use them at home without little or no guidance. Elderly or chronically ill patients are particularly susceptible to these mistakes. In this paper, we introduce Wedjat, a smart phone application designed to help patients avoiding these mistakes. Wedjat can remind its users to take the correct medicines on time and record the in-take schedules for later review by healthcare professionals. Wedjat has two distinguished features: (1) it can alert the patients about potential drug-drug/drug-food interactions and plan a proper in-take schedule to avoid these interactions; (2) it can revise the in-take schedule automatically when a dose was missed. In both cases, the software always tries to produce the simplest schedule with least number of in-takes. Wedjat is equipped with user friendly interfaces to help its users to recognize the proper medicines and obtain the correct instructions of taking these drugs. It can maintain the medicine in-take records on board, synchronize them with a data¬base on a host machine or upload them onto a Personal Heath Record (PHR) system. A proof-of-concept prototype of Wedjat has been implemented on Window Mobile platform and will be migrated onto Android for Google Phones. This paper introduces the system concept and design principles of Wedjat with emphasis on its medication scheduling algorithms and the modular implementation of mobile computing application.

59 citations


Proceedings Article•DOI•
22 Jun 2009
TL;DR: A novel method is proposed to estimate the number of clusters in a microarray data set based on the consensus clustering approach, using a Consensus Index (CI) built upon a suitable clustering similarity measure such as the well-known Adjusted Rand Index (ARI) or the authors' recently developed information-theoretic index, the Adjusted Mutual Information (AMI).
Abstract: Estimating the true number of clusters in a data set is one of the major challenges in cluster analysis. Yet in certain domains, knowing the true number of clusters is of high importance. For example, in medical research, detecting the true number of groups and sub-groups of cancer would be of utmost importance for their effective treatment. In this paper we propose a novel method to estimate the number of clusters in a microarray data set based on the consensus clustering approach. Although the main objective of consensus clustering is to discover a robust and high-quality cluster structure in a data set, closer inspection of the set of clusterings obtained can often give valuable information about the appropriate number of clusters present. More specifically, the set of clusterings obtained when the specified number of clusters coincides with the true number of clusters tends to be less diverse. To quantify this diversity we develop a novel index, namely the Consensus Index (CI), which is built upon a suitable clustering similarity measure such as the well-known Adjusted Rand Index (ARI) or our recently developed, information-theoretic index, namely the Adjusted Mutual Information (AMI). Our experiments on both synthetic and real microarray data sets indicate that the CI is a useful indicator for determining the appropriate number of clusters.
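The idea can be made concrete with a small sketch: a pure-Python Adjusted Rand Index, plus a Consensus Index taken here as the mean pairwise ARI over the set of clusterings, so that a less diverse set scores higher. The abstract does not state the exact aggregation the CI uses, so averaging pairwise similarities is an assumption, and `consensus_index` is a hypothetical name.

```python
from math import comb
from itertools import combinations
from collections import Counter

def adjusted_rand_index(u, v):
    """ARI between two labelings u, v of the same items (1 = identical)."""
    n = len(u)
    nij = Counter(zip(u, v))          # contingency-table cells
    a, b = Counter(u), Counter(v)     # row and column marginals
    sum_ij = sum(comb(c, 2) for c in nij.values())
    sum_a = sum(comb(c, 2) for c in a.values())
    sum_b = sum(comb(c, 2) for c in b.values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

def consensus_index(clusterings):
    """Mean pairwise ARI over a set of clusterings: higher = less diverse."""
    pairs = list(combinations(clusterings, 2))
    return sum(adjusted_rand_index(u, v) for u, v in pairs) / len(pairs)
```

Evaluating this index for each candidate number of clusters and picking the maximum mirrors the paper's use of clustering diversity as a model-selection signal.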

46 citations


Proceedings Article•DOI•
22 Jun 2009
TL;DR: This paper segments MR brain images using the ant colony optimization algorithm, a relatively new meta-heuristic and a successful paradigm among algorithms that take advantage of insect behavior.
Abstract: In this paper, we describe a segmentation method for brain MR images using an ant colony optimization (ACO) algorithm. ACO is a relatively new meta-heuristic and a successful paradigm among algorithms that take advantage of insect behavior. It has been applied to many optimization problems, benefiting from its discrete nature, parallelism, robustness, and positive-feedback mechanism. As an advanced optimization algorithm, ACO has only recently been applied by researchers to image processing tasks. Hence, we segment MR brain images using the ACO algorithm. Compared to traditional meta-heuristic segmentation methods, the proposed method has the advantage that it can effectively segment fine details.

32 citations


Proceedings Article•DOI•
22 Jun 2009
TL;DR: A new algorithm is presented for reconstructing a 3D virtual colon segment from an individual image captured during colonoscopy, potentially useful for estimating the percentage of colon mucosa inspected by the endoscopist.
Abstract: A new algorithm for reconstruction of a 3D virtual colon segment from an individual image captured during colonoscopy is presented. Colonoscopy is currently the gold-standard method for detection and prevention of colorectal cancer. However, the protective effect depends on the amount of colon mucosa that is actually seen by the endoscopist. The proposed algorithm takes contours of colon folds in the image as input and calculates the depth and the slant angle of each fold. Finally, the colon mucosa is created using cubic Bezier curve interpolation between the folds. The proposed algorithm is an important step toward 3D reconstruction of the virtual colon from video of an entire colonoscopy procedure. The reconstruction is potentially useful for estimating the percentage of colon mucosa inspected by the endoscopist.
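The interpolation step rests on standard cubic Bezier evaluation. The sketch below evaluates a cubic Bezier curve at parameter t from four control points; how the paper derives the control points between fold contours is not given in the abstract, so this only shows the curve primitive itself.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at t in [0, 1].

    Points are coordinate tuples, e.g. (x, y, z); B(0) = p0 and B(1) = p3,
    with p1 and p2 shaping the curve in between.
    """
    u = 1.0 - t
    return tuple(u**3 * a + 3 * u**2 * t * b + 3 * u * t**2 * c + t**3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))
```

Sampling t over [0, 1] between two fold contours yields the smooth mucosal surface strip the reconstruction needs.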

22 citations


Proceedings Article•DOI•
22 Jun 2009
TL;DR: A Fuzzy Influence Graph is developed to model the MAPK pathway, and it is shown that despite individual variations, the average behavior of the MAPK pathway in a group of cells is close to results obtained with ordinary differential equations.
Abstract: To simulate biological processes, we use a multi-agent system. However, modelling cell behavior in systems biology is complex and may be based on intracellular biochemical pathways. In this study we have therefore developed a Fuzzy Influence Graph, also called a Fuzzy Cognitive Map, to model the MAPK pathway. This model can be integrated into agents representing cells. Results indicate that despite individual variations, the average behavior of the MAPK pathway in a group of cells is close to results obtained with ordinary differential equations. We have also modelled multiple myeloma cell signalling using this approach.

19 citations


Proceedings Article•DOI•
22 Jun 2009
TL;DR: Combinatorial Fusion Analysis and Association Rule Mining are applied to autism prevalence, mercury, and lead data to generate hypotheses and explore possible associations, focusing on the effect of exposure to neurotoxins during critical stages in a child's early development.
Abstract: The increase in autism prevalence has motivated much research, which has produced various theories of its causation. Genetic and environmental factors have been investigated. One area of focus is the effect of exposure to neurotoxins, such as mercury and lead, during critical stages in a child's early development. In this study we apply Combinatorial Fusion Analysis (CFA) and Association Rule Mining (ARM) to autism prevalence, mercury, and lead data to generate hypotheses and explore possible associations.

18 citations


Proceedings Article•DOI•
22 Jun 2009
TL;DR: This paper presents an active example selection method with a naive Bayes classifier (AESNB) as a solution for the imbalanced data problem and examines the performance of the AESNB algorithm on five imbalanced biomedical datasets.
Abstract: Various real-world biomedical classification tasks suffer from the imbalanced data problem, which tends to make the prediction performance of some classes decrease significantly. In this paper, we present an active example selection method with a naive Bayes classifier (AESNB) as a solution for the imbalanced data problem. The proposed method starts with a small balanced subset of training examples. A naive Bayes classifier is trained incrementally by actively selecting and adding informative examples regardless of the original class distribution. Informative examples are defined as examples that receive high error scores from the current classifier. We examined the performance of the AESNB algorithm using five imbalanced biomedical datasets. Our experimental results show that the naive Bayes classifier with our active example selection method achieves a classification performance competitive with that of classifiers using sampling or cost-sensitive methods.
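The selection loop can be sketched as follows: an incrementally updated Bernoulli naive Bayes model scores pool examples by how poorly it predicts their true class, and the hardest examples are added next, ignoring class balance. The paper's exact error score and feature model are not given in the abstract, so both the posterior-based score and the class names here (`IncrementalNB`, `active_select`) are illustrative assumptions.

```python
import math
from collections import defaultdict

class IncrementalNB:
    """Bernoulli naive Bayes over binary features, updated one example
    at a time, with Laplace smoothing."""
    def __init__(self, n_features):
        self.n = n_features
        self.class_count = defaultdict(int)
        self.feat_count = defaultdict(lambda: [0] * n_features)

    def update(self, x, y):
        self.class_count[y] += 1
        for j, v in enumerate(x):
            self.feat_count[y][j] += v

    def posterior(self, x):
        """Normalised class posteriors P(c | x)."""
        scores = {}
        total = sum(self.class_count.values())
        for c, nc in self.class_count.items():
            lp = math.log(nc / total)
            for j, v in enumerate(x):
                p1 = (self.feat_count[c][j] + 1) / (nc + 2)
                lp += math.log(p1 if v else 1 - p1)
            scores[c] = lp
        m = max(scores.values())
        z = sum(math.exp(s - m) for s in scores.values())
        return {c: math.exp(s - m) / z for c, s in scores.items()}

def active_select(model, pool, k):
    """Pick the k pool examples the current model finds hardest
    (lowest posterior for their true class), regardless of class."""
    scored = sorted(pool, key=lambda xy: model.posterior(xy[0])[xy[1]])
    return scored[:k]
```

Repeating update/select rounds grows the training set with informative examples, which is the mechanism AESNB uses to counter class imbalance.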

18 citations


Proceedings Article•DOI•
22 Jun 2009
TL;DR: An automated sleep staging system using only a single EEG channel is proposed to achieve on-line detection of the REM state during sleep; the proposed system will be applied in clinical trials of depression therapy.
Abstract: In the medical literature, it has been reported that increased REM (rapid eye movement) density is one of the characteristics of depressed sleep. Experiments have been conducted to confirm that REM sleep deprivation (REM-SD) for a period of time is therapeutic for endogenously depressed patients. However, because of its high complexity and intensive labor requirement, this therapy has not yet been validated on a sufficient number of depressed patients. Therefore, we propose to develop an automated sleep staging system using only a single EEG channel to achieve on-line detection of the REM state during sleep. For the classifier design, we use a dataset of 25 subjects, and the staging accuracy reaches 80%. Once the REM state is detected, the system alerts the subject in order to deprive REM sleep. The effect of REM sleep deprivation can be examined by hypnogram, and the proposed system will be applied in clinical trials of depression therapy.

15 citations


Proceedings Article•DOI•
22 Jun 2009
TL;DR: This study summarizes deformation simulation methods and their pros and cons for surgical simulations, noting that MSM is widely accepted by the surgical simulation community because of its real-time, physics-based behavior.
Abstract: One of the essential components of a virtual reality surgical simulation is deformation. Deformations in computer graphics and surgical simulations are commonly modeled with three different approaches: geometry-based methods, the Finite Element Method (FEM), and the Mass-Spring Method (MSM). The geometry-based methods are quite fast and visually appealing. The latter two methods take the physics of deformation into consideration. Even though the FEM results in more physically realistic deformations, significant drawbacks of this method are its expensive computational cost and its vulnerability to surgical procedures such as incision. MSM, however, is relatively computationally inexpensive. Because of its real-time, physics-based behavior, MSM is widely accepted by the surgical simulation community. This study summarizes deformation simulation methods and their pros and cons for surgical simulations. Moreover, because of the wide usage of MSM, optimized data structures for MSM are provided and analyzed under different deformation settings.

12 citations


Proceedings Article•DOI•
22 Jun 2009
TL;DR: The purpose of the EDB, a possible use of the information stored in it, query possibilities, and plans for future development are described.
Abstract: The analysis of structural and energy features of proteins can be a key to understanding how proteins work and interact with each other in cellular reactions. Potential energy is a function of atomic positions in a protein structure. The distributions of energy over each atom in protein structures can strongly support studies of the complex processes in which proteins are involved. Energy profiles contain distributions of different potential energies in protein molecular structures; therefore, they constitute a full descriptor of the energy properties of protein structures. The Energy Distribution Data Bank (EDB, http://edb.aei.polsl.pl) stores energy profiles for protein molecular structures retrieved from the well-known Protein Data Bank. In the paper, we describe the purpose of the EDB, possible uses of the information stored in it, query possibilities, and plans for future development.

Proceedings Article•DOI•
22 Jun 2009
TL;DR: A method to automatically extract the vessel segments and construct the vascular tree with anatomical realism from a color retinal image to assist in clinical studies of diagnosis of cardio-vascular diseases, such as hypertension.
Abstract: In this paper, we present a method to automatically extract vessel segments and construct the vascular tree with anatomical realism from a color retinal image. The significance of the work is to assist clinical studies in the diagnosis of cardiovascular diseases, such as hypertension, which manifest abnormalities in the venous and/or arterial vascular systems. To maximize the completeness of vessel extraction, we introduce a vessel connectedness measure to improve on an existing algorithm that applies multiscale matched filtering and a vessel likelihood measure. Vessel segments are grouped using an extended Kalman filter to take into consideration continuities in curvature, width, and color changes at bifurcation or crossover points. The algorithm is tested on five images from the DRIVE database, a mixture of normal and pathological images, and the results are compared with ground truth images provided by a physician. The preliminary results show that our method reaches an average success rate of 92.1%.

Proceedings Article•DOI•
22 Jun 2009
TL;DR: The aim of this study was to investigate differences in EEG dynamics related to navigation performance; no performance-related brain activities were found for allocentric subjects, which may be due to the functional dissociation between the use of allo- and egocentric reference frames.
Abstract: The aim of this study is to investigate differences in EEG dynamics related to navigation performance. A tunnel task was designed to classify subjects as users of allocentric or egocentric spatial representations. Despite the differences in mental spatial representation, behavioral performance was in general comparable between subjects using the two strategies in the tunnel task. Task-related EEG dynamics in power changes were analyzed using independent component analysis (ICA), time-frequency analysis and non-parametric statistical tests. ERSP image results revealed navigation performance-predictive EEG activity that is expressed in the parietal component by source reconstruction. For egocentric subjects, compared to trials with well-estimated homing angles, the power attenuation at frequencies from 8 to 30 Hz (around the alpha and beta bands) was stronger when subjects overestimated homing directions, but the attenuated power decreased when subjects underestimated the homing angles. However, we did not find performance-related brain activities for allocentric subjects, which may be due to the functional dissociation between the use of allo- and egocentric reference frames.

Proceedings Article•DOI•
22 Jun 2009
TL;DR: This novel PCR-CTPP primer design method provides computed primer information for cost- and time-effective, enzyme-free SNP genotyping.
Abstract: Many single nucleotide polymorphism (SNP) genotyping techniques have been developed, but most of them are expensive. Polymerase chain reaction with confronting two-pair primers (PCR-CTPP) is a restriction-enzyme-free and economical genotyping method, but its primer design is still computationally challenging. Here, we introduce a genetic algorithm (GA)-based PCR-CTPP primer design method. Thirty SNPs of the Janus kinase 2 gene, each with a 500-bp SNP flanking sequence, were tested. The GA-designed CTPP primers were characterized by closely matched melting temperature (Tm) values and high specificity, and their corresponding PCR products had optimal lengths. In conclusion, this novel PCR-CTPP primer design method provides computed primer information for cost- and time-effective, enzyme-free SNP genotyping.
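A GA fitness term for "closely matched Tm values" needs a Tm estimate per primer. The abstract does not say which Tm formula the authors use, so the sketch below assumes the simple Wallace rule (2 °C per A/T base, 4 °C per G/C base, valid only for short oligos) and scores a primer set by its Tm spread, which a GA would minimise; both function names are hypothetical.

```python
def wallace_tm(primer):
    """Rough melting temperature (deg C) by the Wallace rule:
    2 deg C per A/T plus 4 deg C per G/C; short primers only."""
    p = primer.upper()
    at = p.count('A') + p.count('T')
    gc = p.count('G') + p.count('C')
    return 2 * at + 4 * gc

def tm_mismatch(primers):
    """Fitness-style penalty: spread between highest and lowest primer Tm,
    so a set of CTPP primers that melt together scores near zero."""
    tms = [wallace_tm(p) for p in primers]
    return max(tms) - min(tms)
```

A GA individual encoding four candidate CTPP primers could combine this penalty with specificity and product-length terms into one fitness function.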

Proceedings Article•DOI•
22 Jun 2009
TL;DR: The optimal combination obtained from this study could serve as a reference for production-line Monascus fermentation processing.
Abstract: The species of the fungus (Monascus purpureus, NPUST, Taiwan) used to produce the high pigment yield was isolated from rice waste. Its genomic DNA was confirmed against the NCBI database to be M. purpureus strain ATCC 36114 (identity 100%, gap 0%). In this study, the Taguchi method was applied to find the optimal medium composition to enhance pigment synthesis and to reduce the yield of citrinin, a metabolite of the M. purpureus fermentation process. For the production of the yellow pigment (OD400) and the red pigment (OD500), the results showed that the optimal conditions were A2B3C3D4E4 (1% Japonica-type rice, 1% peptone, 0.01% glycerol, 0.01% potassium phosphate and pH 9). For the production of the orange pigment (OD460), the optimal conditions were A2B3C3D1E4 (1% Japonica-type rice, 1% peptone, 0.01% glycerol, 0.01% magnesium sulfate and pH 9). Under the optimal conditions for the best pigment yield, the yellow pigment reached 4.132 ppm, the red pigment 8.480 ppm and the orange pigment 4.573 ppm. For the reduction of citrinin yield, A4B1C1D3E1 (1% whole wheat flour, 1% gelatin, 0.01% olive oil, 0.01% sodium chloride and pH 3) were the optimal conditions to inhibit citrinin metabolism; citrinin was reduced to 0.055 ppb. The optimal combination obtained from this study could serve as a reference for production-line Monascus fermentation processing.

Proceedings Article•DOI•
22 Jun 2009
TL;DR: The goal of this study, to provide medical decision support tools and speedily integrate heterogeneous medical decision support systems, can be effectively attained.
Abstract: This study employs the framework of web services in conjunction with Bayes' theorem and decision trees to construct a web-services-based decision support system for medical diagnosis and treatment. The purpose is to help users (physicians) with issues pertinent to medical diagnosis and treatment decisions. Users key available prior probabilities into the system and, through computation based on Bayes' theorem, obtain the diagnosis. The process helps users enhance the quality and efficiency of medical decisions, and the diagnosis can be transmitted to a decision-tree-based treatment decision support service component via XML to generate recommendations and analysis for treatment decisions. In addition, the features of web services enable this medical decision support system to offer more service platforms than conventional ones. Users will have access to the system whether they use Windows, Macintosh, Linux or any other platform that connects to the Internet via HTTP. More importantly, after the system is completed, all Internet service providers will be able to access the system as a software unit freely and quickly. In this way, the goal of this study, to provide medical decision support tools and speedily integrate heterogeneous medical decision support systems, can be effectively attained.
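The Bayesian core of such a diagnosis service reduces to one posterior update. The sketch below computes P(disease | positive test) from a user-supplied prior and the test's sensitivity and specificity; the paper does not specify its exact parameterisation, so this two-hypothesis form is an assumption.

```python
def posterior_positive(prior, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem.

    prior:       P(disease) before testing
    sensitivity: P(positive | disease)
    specificity: P(negative | no disease)
    """
    true_pos = sensitivity * prior
    false_pos = (1 - specificity) * (1 - prior)
    return true_pos / (true_pos + false_pos)
```

For a rare condition (prior 1%) even a good test (99% sensitive, 95% specific) yields a posterior of only about 17%, which is exactly the kind of counter-intuitive result a decision support component makes visible to the physician.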

Proceedings Article•DOI•
22 Jun 2009
TL;DR: An image-analysis-based quantitative model was established to evaluate endodontic treatments (40 effective and 43 non-effective cases), achieving an accuracy of 80.7%.
Abstract: Intraoral radiographs are taken to diagnose periapical lesions. Subsequent endodontic treatment needs to be evaluated quantitatively, which is often difficult due to various imaging factors as well as subjective visual interpretation. Therefore, we sought to establish an image-analysis-based quantitative model to evaluate endodontic treatments (40 effective and 43 non-effective cases). To normalize an image, the dentin area and the background were used as references. In each pair of images representing before and after treatment, the lesion area was manually selected by experts and segmented by a top-hat operation. Numerous features representing effective bone healing were calculated. Using relative differences of selected features, an evaluation model was derived by logistic regression analysis. Gray-level intensity and textural differences obtained from lesions increased significantly in the effectively treated cases. The model achieved an accuracy of 80.7%. Our quantitative model may be helpful for evaluating endodontic treatment in clinical settings and in animal studies.

Proceedings Article•DOI•
22 Jun 2009
TL;DR: A predictive model which takes into account a patient’s physiology and the results of the stages of an IVF cycle, to assist obstetricians and gynecologists in increasing success rate of IVF is built.
Abstract: In vitro fertilization (IVF) is a medically assisted reproduction technique (ART) for treating infertility. During IVF procedures, a female patient requires hormone treatment to control ovulation, oocytes are taken from the patient and fertilized in vitro, and after fertilization, one or usually more resulting embryos are transferred into the uterus. Although IVF is considered a method of last resort for infertile couples, the success rate is still low, at best about 40% for women under the age of 30. In this study, we build a predictive model which takes into account a patient's physiology and the results of the stages of an IVF cycle, to assist obstetricians and gynecologists in increasing the success rate of IVF. The predictive model is based on a knowledge discovery technique incorporating particle swarm optimization (PSO), a competitive heuristic technique for solving optimization tasks. This study uses a database of IVF cycles developed by a women and infants clinic in Taiwan as its foundation. A repertory grid is developed to help select attributes for the data mining technique. The results show that the proposed technique can extract rules judged by the obstetrician/gynecologist and the assistant to be both comprehensible and justifiable.

Proceedings Article•DOI•
22 Jun 2009
TL;DR: It is proved that the BLM problem on a rectangular grid is NP-hard, and the first integer linear programming (ILP) formulation to solve the BLM problem optimally is given.
Abstract: DNA microarray technology has proven to be an invaluable tool for molecular biologists. Microarrays are used extensively in SNP detection, genomic hybridization, alternative splicing and gene expression profiling. However, microarray manufacturers often face the problem of minimizing the effects of unwanted illumination, known as border length minimization (BLM), which is a hard combinatorial problem. In this paper we prove that the BLM problem on a rectangular grid is NP-hard. We also give the first integer linear programming (ILP) formulation to solve the BLM problem optimally. Experimental results indicate that our ILP method produces superior results (both in runtime and cost) compared to the current state-of-the-art algorithms that solve the BLM problem optimally.
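The objective being minimised can be sketched independently of the ILP: a common simplification scores a placement by summing, over every pair of 4-neighbour grid cells, the number of synthesis positions where the two probe sequences differ (each mismatch contributes a unit of mask border). The paper's precise mask-based definition may differ, so treat this as an illustrative cost function only.

```python
def border_length(grid):
    """Border-length cost of a probe placement on a rectangular grid.

    grid is a list of rows; each cell is a probe string of equal length.
    For every pair of horizontally or vertically adjacent cells, add the
    Hamming distance between their probes.
    """
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    rows, cols = len(grid), len(grid[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:                      # right neighbour
                total += hamming(grid[r][c], grid[r][c + 1])
            if r + 1 < rows:                      # bottom neighbour
                total += hamming(grid[r][c], grid[r + 1][c])
    return total
```

An ILP formulation would introduce assignment variables for probe-to-cell placement and minimise exactly this kind of neighbour-mismatch sum subject to each probe occupying one cell.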

Proceedings Article•DOI•
22 Jun 2009
TL;DR: This paper leverages previous work on the automatic prediction of Gene Ontology annotations based on the singular value decomposition (SVD) of the gene-to-term annotation matrix, and proposes a novel post-processing method that uses a Bayesian network to eliminate predictions of anomalous annotations.
Abstract: Gene and protein structural and functional annotations expressed through controlled terminologies and ontologies are paramount, especially for the aim of inferring new biomedical knowledge through computational analyses. However, the available annotations are incomplete, in particular for recently studied genomes, and only a few of them are highly reliable, human-curated information. To support and speed up the time-consuming curation process, prioritized lists of computationally predicted annotations are hence extremely useful. In this paper we leverage previous work on the automatic prediction of Gene Ontology annotations based on the singular value decomposition (SVD) of the gene-to-term annotation matrix, and we propose a novel post-processing method that uses a Bayesian network to eliminate predictions of anomalous annotations. In fact, we observed that the predicted annotation profiles might suggest that a gene should be annotated to a term, but not to one of its ancestors, thus violating the constraint imposed by the Gene Ontology. To this end, the proposed algorithm processes the annotation profiles predicted by an SVD-based method, and produces a ranked list of computationally discovered candidate annotations that is consistent with the Gene Ontology.

Proceedings Article•DOI•
22 Jun 2009
TL;DR: Combinatorial Fusion Analysis can robustly identify significant genes from multiple microarray data sets so that experimental biology researchers can efficiently perform the next phase of analysis on a smaller subset of genes.
Abstract: Microarray technology is a popular and informative technique widely used in experimental molecular biology, which can produce quantitative expression measurements for thousands of genes in a single cellular mRNA sample. Analysis methods for determining significant genes are essential to extracting information from the multitude of data generated from a single microarray experiment. Raw gene expression measurements alone, most often, do not indicate significant genes for the given condition. While analysis methods are abundant, there is a need for enhanced performance when attempting to identify significant genes from such experiments. Additionally, the ability to more accurately predict informative genes from cross-laboratory and/or cross-experiment data can certainly aid in disease detection. We propose the application of Combinatorial Fusion Analysis (CFA) in order to enhance and expedite the identification of significant genes in a cross-experiment analysis. Previous methods to identify significant genes applied SAM to analyze the data sets and then took the intersection of top ranked genes. In this paper, we used CFA to combine the scoring functions of two data sets produced by SAM. Moreover, both score and rank combinations are used. Both combinations can achieve better results than the previous approach of taking the intersection. In addition, by using the Rank-Score Characteristic function as a diversity measure, we are able to show that rank combination performed better than score combination. CFA can robustly identify significant genes from multiple microarray data sets so that experimental biology researchers can efficiently perform the next phase of analysis on a smaller subset of genes.
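The score and rank combinations described above can be sketched directly. Given two scoring functions over the same genes (e.g. SAM scores from two data sets), rank combination averages each gene's ranks while score combination averages its raw scores; the helper names below are illustrative, and score combination implicitly assumes the two score scales are comparable (in practice they would be normalised first).

```python
def ranks(scores):
    """Map each gene to its rank (1 = best, i.e. highest score)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {g: i + 1 for i, g in enumerate(ordered)}

def rank_combination(a, b):
    """Order genes by average rank across the two systems (lower = better)."""
    ra, rb = ranks(a), ranks(b)
    return sorted(a, key=lambda g: (ra[g] + rb[g]) / 2)

def score_combination(a, b):
    """Order genes by average score (higher = better)."""
    return sorted(a, key=lambda g: (a[g] + b[g]) / 2, reverse=True)
```

Comparing the two resulting orderings, and measuring the diversity of the inputs (as the paper does with the Rank-Score Characteristic function), is what lets CFA decide which combination to trust.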

Proceedings Article•DOI•
22 Jun 2009
TL;DR: Experimental results show that the proposed method effectively simplifies feature selection by reducing the total number of features needed, compared to other classification methods from the literature.
Abstract: In this paper, correlation-based feature selection (CFS) and the Taguchi-genetic algorithm (TGA) were combined in a hybrid method, and the K-nearest neighbor (KNN) method with leave-one-out cross-validation (LOOCV) served as a classifier for eleven classification profiles, from which classification accuracy was calculated. Experimental results show that this method effectively simplifies feature selection by reducing the total number of features needed. The proposed method obtained the highest classification accuracy in five out of the six gene expression data set test problems when compared to other classification methods from the literature.
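The KNN-with-LOOCV evaluation step is simple to sketch: each sample is classified by its nearest neighbour among all the other samples, and the fraction classified correctly is the LOOCV accuracy. The paper does not state its value of K, so K = 1 and Euclidean distance are assumptions here.

```python
def loocv_accuracy_1nn(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier.

    X: list of equal-length feature vectors; y: list of labels.
    Each sample is predicted from all remaining samples.
    """
    def dist2(a, b):
        # Squared Euclidean distance (monotone in distance, so fine for NN).
        return sum((p - q) ** 2 for p, q in zip(a, b))

    correct = 0
    for i in range(len(X)):
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: dist2(X[i], X[j]))
        correct += (y[nearest] == y[i])
    return correct / len(X)
```

In the hybrid method this accuracy would serve as the fitness signal the TGA uses to score candidate feature subsets selected after CFS filtering.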

Proceedings Article•DOI•
22 Jun 2009
TL;DR: The purpose of this study is to automatically and exactly identify the relevant EntrezIDs mentioned in the literature, employing a similarity-based inference network to calculate similarity scores for entities and thereby address the term variation problem.
Abstract: To construct an intelligent biomedical knowledge management system, researchers have proposed many relation extraction methods. Before applying these methods, the system has to recognize the named entities in the literature and map them to the relevant EntrezIDs. The purpose of this study is to automatically and exactly identify the relevant EntrezIDs mentioned in the literature. We employ a similarity-based inference network to calculate similarity scores for the entities, which addresses the term variation problem. The proposed disambiguation strategy increases the confidence of EntrezID assignments in the literature and provides researchers a good utilization of information for mapping entities to EntrezIDs. As a result, the precision of the system reaches about 75.1%, making the identified entities even more meaningful. The system using the proposed strategies outperforms previous methods in biomedical entity normalization.

Proceedings Article•DOI•
22 Jun 2009
TL;DR: An automated gating approach is proposed to analyze and cluster flow cytometry data and provide researchers and physicians a 3D way of viewing flow data, giving a user-friendly representation and averting misclassifications that would occur in lower dimensions.
Abstract: Flow cytometry has been widely used for the diagnosis of various diseases. We propose an automated gating approach to analyze and cluster flow cytometry data and to provide researchers and physicians a 3D way of viewing the data. This automated approach was designed to give reproducible results, avoiding subjective and time-consuming manual manipulation. Our tool eliminates the doublets problem, and its results were compared with manual gating. In addition, 3D visualization not only gives a user-friendly representation but also averts misclassification that can occur in lower dimensions. We demonstrate how the 3D approach can be used to augment the K-means method in classifying the data. To validate the feasibility of the proposed automatic approach, our results were compared with results obtained at the Methodist Hospital in Houston.
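The K-means step the approach builds on can be sketched as below — a plain, self-contained K-means over 3D event triples. This is a generic textbook version, not the authors' tool; the channel interpretation in the comment is an assumption.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means on 3D event triples (e.g. forward scatter, side
    scatter, fluorescence): assign each event to its nearest centroid,
    then move each centroid to its cluster's mean."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((p[d] - centroids[c][d]) ** 2
                                      for d in range(3)))
            clusters[j].append(p)
        # Empty clusters keep their previous centroid.
        centroids = [
            tuple(sum(p[d] for p in cl) / len(cl) for d in range(3))
            if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
    return centroids, clusters
```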

Proceedings Article•DOI•
22 Jun 2009
TL;DR: This work used data mining techniques to study the gene expression values of breast cancer patients with known clinical outcomes in order to create a classification model to be used in clinical practice to support therapy prescription.
Abstract: Providing clinical predictions for cancer patients by analyzing their genetic make-up is a difficult and very important problem. With the goal of identifying genes more strongly correlated with the prognosis of breast cancer, we used data mining techniques to study the gene expression values of breast cancer patients with known clinical outcomes. The focus of our work was the creation of a classification model to be used in clinical practice to support therapy prescription. We randomly subdivided a gene expression dataset of 311 samples into a training set to learn the model and a test set to validate the model and assess its performance. We evaluated several learning algorithms in both their unweighted and weighted forms, the latter defined to take into account the different clinical importance of false positive and false negative classifications. Based on our results, the weighted algorithms, especially when used in combined form, appear to provide better results.
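The weighted evaluation idea — penalizing false negatives and false positives differently — can be illustrated as follows. The 5:1 cost ratio and the function name are illustrative assumptions, not the paper's actual weights.

```python
def weighted_error(y_true, y_pred, fn_cost=5.0, fp_cost=1.0):
    """Average misclassification cost in which a false negative (a
    poor-prognosis patient predicted as good, label 1 predicted as 0)
    is penalized more heavily than a false positive."""
    cost = 0.0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 0:      # missed poor-prognosis case
            cost += fn_cost
        elif t == 0 and p == 1:    # unnecessary alarm
            cost += fp_cost
    return cost / len(y_true)
```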

Proceedings Article•DOI•
22 Jun 2009
TL;DR: This work proposes a novel approach to detect CNVs by aligning the short reads obtained by high-throughput sequencer to the previously assembled human genome sequence, and analyzing the distribution of the aligned reads.
Abstract: Copy-number variations (CNVs) can be defined as gains or losses of genomic DNA greater than 1 kb among phenotypically normal individuals. CNVs detected by microarray-based approaches are limited to medium or large variants because of the arrays' low resolution. Here we propose a novel approach to detect CNVs by aligning the short reads obtained from a high-throughput sequencer to the previously assembled human genome sequence and analyzing the distribution of the aligned reads. Application of our algorithm demonstrates the feasibility of detecting CNVs of arbitrary length, including short ones that microarray-based algorithms cannot detect. Moreover, the false positive and false negative rates of our results were relatively low compared with those of microarray-based algorithms.
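A minimal sketch of the read-distribution idea, assuming a simple fixed-bin read-depth analysis (the paper's actual statistical model is not given in the abstract): bin the aligned read start positions and flag bins whose depth deviates strongly from the genome-wide mean. Bin size, z-score threshold, and names are assumptions.

```python
from statistics import mean, pstdev

def depth_cnv_calls(read_starts, genome_len, bin_size=100, z_thresh=2.5):
    """Bin aligned read start positions, then flag bins whose read depth
    deviates from the genome-wide mean by more than z_thresh standard
    deviations: unusually deep bins suggest gains, shallow bins losses."""
    nbins = genome_len // bin_size
    counts = [0] * nbins
    for s in read_starts:
        counts[min(s // bin_size, nbins - 1)] += 1
    mu, sd = mean(counts), pstdev(counts) or 1.0
    calls = []
    for i, c in enumerate(counts):
        z = (c - mu) / sd
        if z > z_thresh:
            calls.append((i * bin_size, (i + 1) * bin_size, "gain"))
        elif z < -z_thresh:
            calls.append((i * bin_size, (i + 1) * bin_size, "loss"))
    return calls
```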

Proceedings Article•DOI•
22 Jun 2009
TL;DR: The aim of this study is to evaluate biochemical factors in preterm neonates, including growth hormone (GH), insulin-like growth factors (IGFs), IGF-binding proteins (IGFBPs), and leptin, proteins known to be involved in the regulation of growth.
Abstract: The aim of this study is to use the GM(0,N) model to evaluate biochemical factors in preterm neonates, including growth hormone (GH), insulin-like growth factors (IGFs), IGF-binding proteins (IGFBPs), and leptin, proteins known to be involved in the regulation of growth, and to determine the influence of each factor on body mass index (BMI) and postnatal growth (PI). First, we used as our original data the measurements from blood samples collected from 55 preterm neonates over four consecutive weeks. Second, we applied the GM(0,N) model from grey system theory as our mathematical model to rank the influence of the factors in our original data, and we created a Matlab toolbox to aid the calculation. We found that the results closely match those of the traditional method. Hence, the grey system method not only yields a ranking of the factors' influence in preterm neonates but also provides a new direction for this kind of research.
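For reference, the standard textbook form of the GM(0,N) model used here is the following (a sketch under common grey-system notation, not necessarily the paper's exact formulation). The raw series are first smoothed by the accumulated generating operation (AGO), and a static linear model is fitted by least squares; the magnitudes of the fitted coefficients then rank each factor's influence on the dependent series (here, BMI or PI).

```latex
% AGO of each raw series x_i^{(0)}:
x_i^{(1)}(k) = \sum_{j=1}^{k} x_i^{(0)}(j), \qquad k = 1,\dots,n.

% GM(0,N): static linear relation between the dependent series (i = 1)
% and the N-1 factor series, with no derivative term:
x_1^{(1)}(k) = a + \sum_{i=2}^{N} b_i \, x_i^{(1)}(k).

% Coefficients estimated by ordinary least squares:
\hat{\boldsymbol{\beta}} = (\mathbf{B}^{\top}\mathbf{B})^{-1}\mathbf{B}^{\top}\mathbf{Y},
\qquad \text{factors ranked by } |b_i|.
```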

Proceedings Article•DOI•
22 Jun 2009
TL;DR: Based on a daily list of new orders for antibiotics, devices, urine routine examinations, urine cultures, and blood cultures, a capture system lets a hospital's infection control department choose high-predictive-value criteria to generate a checklist from the medical information department, enabling infection control staff to perform active patient examination while decreasing the number of direct patient examinations and chart reviews yet keeping high capture sensitivity.
Abstract: Background: At present, a passive alarm system based on culture reports and announcements of group outbreak events delays case investigation. Is there any predictive factor that can be used to support active, high-capture-sensitivity surveillance and early outbreak alarms? Objectives: To determine whether any predictive factor can support active, high-capture-sensitivity surveillance of hospital-acquired infections (HAI) in acute hospitals, provide early outbreak alarms, and decrease the number of direct patient examinations or chart reviews. Methods: We performed a three-month retrospective study to identify predictors (urine routine examination, devices such as catheters or cystoscopes, antibiotics, cultures, etc.) of a major HAI (urinary tract infection, UTI), as defined by the Centers for Disease Control and Prevention (CDC) criteria, in a medical center (733 beds). We compared the lists of predictor-positive patients collected from electronic medical records by medical information department staff with the list of confirmed nosocomial UTI cases provided by the infection control department. Results: 5533 admitted patients were screened. The overall prevalence of HAI was 2.5% (141/5533); 1.4% (77/5533) of patients had nosocomial UTI. The presence of urine routine examination and devices guarantees 100% capture sensitivity in detecting nosocomial UTI but requires assessment of 2763 patients (49.9% of the population). The presence of antibiotics and devices guarantees 98.7% capture sensitivity and requires assessment of 1921 patients (34.7%), whereas the presence of antibiotics and urine routine examination has 98.7% capture sensitivity but requires assessment of 3019 patients (54.7%).
Conclusion: With a capture system based on daily lists of new orders for antibiotics, devices, urine routine examinations, urine cultures, and blood cultures, a hospital's infection control department can choose which high-predictive-value criteria the medical information department should use to generate checklists, enabling infection control staff to perform active patient examination while decreasing the number of direct patient examinations and chart reviews yet keeping high capture sensitivity.
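The two figures of merit in the Results — capture sensitivity and review workload — are simple ratios, sketched below. The input 76/77 for the 98.7% row is our back-calculation from the reported percentages (98.7% of 77 UTI cases ≈ 76 captured) and is an assumption, as is the function name.

```python
def screening_stats(captured_cases, total_cases, flagged_patients, total_patients):
    """Capture sensitivity = share of confirmed HAI cases whose records
    were flagged by the predictor combination; workload = share of all
    admissions that infection-control staff must still review.
    Returns both as percentages rounded to one decimal."""
    sensitivity = captured_cases / total_cases * 100
    workload = flagged_patients / total_patients * 100
    return round(sensitivity, 1), round(workload, 1)
```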

Proceedings Article•DOI•
22 Jun 2009
TL;DR: A beamformer-based approach is proposed which exploits a maximum correlation criterion to maximize the significance level of correlation between brain activities and leads to a closed-form solution for the dipole orientation.
Abstract: Past findings have suggested that temporal correlation may reflect communication between distributed brain areas. Several magnetoencephalography and electroencephalography studies have analyzed functional connectivity between cortical areas using the oscillatory features of neuronal activity. However, it is also important to observe functional connectivity through the temporal correlation between cortical areas. We propose a beamformer-based approach that exploits a maximum correlation criterion to maximize the significance level of the correlation between brain activities. This criterion leads to a closed-form solution for the dipole orientation. Experiments with simulated data clearly demonstrate the effectiveness, necessity, and accuracy of the proposed method.
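One common way such a closed form arises (a sketch under assumed notation; the abstract does not give the paper's derivation): for two dipoles with orientations $\mathbf{u}_1, \mathbf{u}_2$ and $3\times 3$ (cross-)covariance matrices $\mathbf{C}_{ij}$ of the beamformer-reconstructed source moments, the correlation to maximize is

```latex
\rho(\mathbf{u}_1,\mathbf{u}_2)
  = \frac{\mathbf{u}_1^{\top}\mathbf{C}_{12}\,\mathbf{u}_2}
         {\sqrt{(\mathbf{u}_1^{\top}\mathbf{C}_{11}\mathbf{u}_1)\,
                (\mathbf{u}_2^{\top}\mathbf{C}_{22}\mathbf{u}_2)}} .

% For a fixed u_2, this is a Rayleigh-quotient-type problem in u_1
% whose maximizer has the closed form
\mathbf{u}_1 \propto \mathbf{C}_{11}^{-1}\,\mathbf{C}_{12}\,\mathbf{u}_2,
% and symmetrically for u_2, giving a closed-form update per orientation.
```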

Proceedings Article•DOI•
22 Jun 2009
TL;DR: It is proposed that the differential features or methylations vary between the different regions because the features common to each DNA region made up only 50% of the top 70 features.
Abstract: During gene expression, transcription factors are unable to bind to a transcription factor binding site (TFBS) involved in regulation if DNA methylation has occurred at the TFBS. Methyl-CpG-binding proteins may also occupy the TFBS and prevent the transcription factor from functioning. Thus, the methylation status of CpG sites is an important issue when trying to understand gene regulation and shows strong correlation with the TFBSs involved. In addition, CpG islands appear to undergo cell-specific and tissue-specific methylation. Such differential methylation is present at numerous genetic loci that are essential for development. Current DNA methylation site prediction tools need to be improved so that they include TFBS features and achieve greater accuracy with respect to the DNA region involved in methylation. We developed models that compare the differences across these regions and tissues. TFBSs, DNA properties, and DNA distribution were used as features for this classification. From the results, we found some TFBSs that were able to discriminate whether a sequence was methylated or not. The sensitivity, specificity, and accuracy estimated using 10-fold cross-validation were 90.8%, 80.54%, and 86.07%, respectively. For the four regions and twelve tissues, the performance levels (ACC) were all greater than 80%. We propose that the discriminative features for methylation vary between the different regions because the features common to each DNA region made up only 50% of the top 70 features. An online predictor based on EpiMeP is available at http://140.115.51.41/EpiMeP/. A supplementary file is available at http://140.115.51.41/EpiMeP/supplementary.doc.
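The three reported performance measures are standard confusion-matrix ratios, sketched below for reference; the function name and the example counts are illustrative, not the paper's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy (as percentages) from
    confusion-matrix counts: tp/fn are methylated sequences predicted
    right/wrong, tn/fp are unmethylated sequences predicted right/wrong."""
    sens = tp / (tp + fn) * 100          # true positive rate
    spec = tn / (tn + fp) * 100          # true negative rate
    acc = (tp + tn) / (tp + fn + tn + fp) * 100
    return round(sens, 2), round(spec, 2), round(acc, 2)
```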