Showing papers presented at "International Conference on Bioinformatics in 2010"

PDF

Open Access

Proceedings Article•DOI•

Markov clustering of protein interaction networks with improved balance and scalability

[...]

Venu Satuluri¹, Srinivasan Parthasarathy¹, Duygu Ucar²•Institutions (2)

Ohio State University¹, University of Iowa²

02 Aug 2010

TL;DR: This paper designs an algorithm on top of Regularized MCL (R-MCL), a previously proposed modification of MCL that computes a new regularization matrix at each iteration that penalizes big cluster sizes, with the size of the penalty being tunable using a balance parameter.

...read moreread less

Abstract: Markov Clustering (MCL) is a popular algorithm for clustering networks in bioinformatics such as protein-protein interaction networks and protein similarity networks. An important requirement when clustering protein networks is minimizing the number of big clusters, since it is generally understood that protein complexes tend not to have more than 15--30 nodes. Similarly, it is important to not output too many singleton clusters, since they do not provide much useful information. In this paper, we show how MCL may be modified so as to better respect these two requirements, while also taking the link structure in the graph into account. We design our algorithm on top of Regularized MCL (R-MCL) [16], a previously proposed modification of MCL. Our proposed variation computes a new regularization matrix at each iteration that penalizes big cluster sizes, with the size of the penalty being tunable using a balance parameter. This algorithm also naturally fits in a Multi level framework that allows great improvements in speed. We perform experiments on three real protein interaction networks and show significant improvements over MCL in quality, balance and execution speed.

...read moreread less

78 citations

Proceedings Article•DOI•

ReFHap: a reliable and fast algorithm for single individual haplotyping

[...]

Jorge Duitama¹, Thomas Huebsch², Gayle K. McEwen², Eun-Kyung Suk², Margret R. Hoehe² - Show less +1 more•Institutions (2)

University of Connecticut¹, Max Planck Society²

02 Aug 2010

TL;DR: A novel problem formulation for single individual haplotyping that initially finds the best cut based on a heuristic algorithm for max-cut and then builds haplotypes consistent with that cut and is found that ReFHap performs significantly faster than previous methods without loss of accuracy.

...read moreread less

Abstract: Full human genomic sequences have been published in the latest two years for a growing number of individuals. Most of them are a mixed consensus of the two real haplotypes because it is still very expensive to separate information coming from the two copies of a chromosome. However, latest improvements and new experimental approaches promise to solve these issues and provide enough information to reconstruct the sequences for the two copies of each chromosome through bioinformatics methods such as single individual haplotyping. Full haploid sequences provide a complete understanding of the structure of the human genome, allowing accurate predictions of translation in protein coding regions and increasing power of association studies.In this paper we present a novel problem formulation for single individual haplotyping. We start by assigning a score to each pair of fragments based on their common allele calls and then we use these score to formulate the problem as the cut of fragments that maximize an objective function, similar to the well known max-cut problem. Our algorithm initially finds the best cut based on a heuristic algorithm for max-cut and then builds haplotypes consistent with that cut. We have compared both accuracy and running time of ReFHap with other heuristic methods on both simulated and real data and found that ReFHap performs significantly faster than previous methods without loss of accuracy.

...read moreread less

67 citations

Journal Article•DOI•

Leadership Style And Organizational Culture In Learning Organization: A Comparative Study

[...]

Sapna Rijal¹•Institutions (1)

Veer Bahadur Singh Purvanchal University¹

02 Nov 2010

TL;DR: In this paper, the impact of transformational leadership and organizational culture on the development of learning organizations was examined in the pharmaceutical sector and a comparison between India and Nepal was drawn between the two countries.

...read moreread less

Abstract: Scholars and practitioners have identified transformational leadership and organizational culture as important factors that influence the development of learning organization. Yet, few studies have empirically examined the impact of transformational leadership and organizational culture on learning organization. This study proposes hypotheses to understand the impact of transformational leadership and organizational culture on the development of learning organization. Data was collected from the pharmaceutical sector and a comparison was drawn between India and Nepal. Results indicate transformational leadership and organizational culture have a positive influence in the development of learning organization. The implication of the findings and possible directions for future research are discussed.

...read moreread less

66 citations

Proceedings Article•

Information-Theoretic Inference of Gene Networks Using Backward Elimination

[...]

Patrick E. Meyer¹, Daniel Marbach², Sushmita Roy³, Manolis Kellis¹•Institutions (3)

Massachusetts Institute of Technology¹, École Polytechnique Fédérale de Lausanne², University of New Mexico³

01 Jan 2010

TL;DR: MRNETB is introduced, an improved version of the previous information-theoretic algorithm, MRNET, which has competitive performance with state-of-the-art algorithms and performs comparably to CLR and significantly better than ARACNE indicating that the new variable selection strategy can successfully infer high-quality networks.

...read moreread less

Abstract: Unraveling transcriptional regulatory networks is essential for understanding and predicting cellular responses in different developmental and environmental contexts. Information-theoretic methods of network inference have been shown to produce high-quality reconstructions because of their ability to infer both linear and non-linear dependencies between regulators and targets. In this paper, we introduce MRNETB an improved version of the previous information-theoretic algorithm, MRNET, which has competitive performance with state-of-the-art algorithms. MRNET infers a network by using a forward selection strategy to identify a maximally-independent set of neighbors for every variable. However, a known limitation of algorithms based on forward selection is that the quality of the selected subset strongly depends on the first variable selected. In this paper, we present MRNETB, an improved version of MRNET that overcomes this limitation by using a backward selection strategy followed by a sequential replacement. Our new variable selection procedure can be implemented with the same computational cost as the forward selection strategy. MRNETB was benchmarked against MRNET and two other information-theoretic algorithms, CLR and ARACNE. Our benchmark comprised 15 datasets generated from two regulatory network simulators, 10 of which are from the DREAM4 challenge, which was recently used to compare over 30 network inference methods. To assess stability of our results, each method was implemented with two estimators of mutual information. Our results show that MRNETB has significantly better performance than MRNET, irrespective of the mutual information estimation method. MRNETB also performs comparably to CLR and significantly better than ARACNE indicating that our new variable selection strategy can successfully infer high-quality networks.

...read moreread less

63 citations

Proceedings Article•DOI•

Comparative analysis of biclustering algorithms

[...]

Doruk Bozdağ¹, Ashwin S. Kumar¹, Ümit V. Çatalyürek¹•Institutions (1)

Ohio State University¹

02 Aug 2010

TL;DR: This paper systematically formulate the requirements for well-known patterns and show the constraints imposed by biclustering algorithms that determine their capacity to identify such patterns and reports the biological relevance of clusters identified by each algorithm.

...read moreread less

Abstract: Biclustering is a very popular method to identify hidden co-regulation patterns among genes. There are numerous biclustering algorithms designed to undertake this challenging task, however, a thorough comparison between these algorithms is even harder to accomplish due to lack of a ground truth and large variety in the search strategies and objectives of the algorithms. In this paper, we address this less studied, yet important problem and formally analyze several biclustering algorithms in terms of the bicluster patterns they attempt to discover. We systematically formulate the requirements for well-known patterns and show the constraints imposed by biclustering algorithms that determine their capacity to identify such patterns. We also give experimental results from a carefully designed testbed to evaluate the power of the employed search strategies. Furthermore, on a set of real datasets, we report the biological relevance of clusters identified by each algorithm.

...read moreread less

56 citations

Proceedings Article•DOI•

Continuous cuffless blood pressure monitoring based on PTT

[...]

Revati Shriram¹, Asmita Wakankar¹, Nivedita Daimiwal¹, Dipali Ramdasi¹•Institutions (1)

MKSSS's Cummins College of Engineering for Women¹

16 Apr 2010

TL;DR: The developed pulse transit time based method can be used as a noninvasive and cuffless alternative to the conventional occluding-cuff approaches for long-term and continuous monitoring of blood pressure.

...read moreread less

Abstract: Blood pressure (BP) is one of the important vital signs that need to be monitored for personal healthcare. This paper describes the method developed by the authors to measure systolic blood pressure from pulse transit time (PTT) Pulse transit time is the time taken for the arterial pulse pressure wave to travel from the aortic valve to a peripheral site. It is usually measured from the R wave on the electrocardiogram to a photoplethysmography signal. PTT is inversely proportional to blood pressure. This method does not require an air cuff and only a minimal inconvenience of attaching electrodes and LED/photo detector sensors on a subject. Twenty three healthy subjects (age 18–60 yrs) were studied. Blood pressure measurement is carried out using pulse transit time and is compared with sphygmomanometry, the reference standard and the oscillometric based automatic BP measuring machine. The results show that the standard deviation of their differences was around 3 mmHg. The developed pulse transit time based method can be used as a noninvasive and cuffless alternative to the conventional occluding-cuff approaches for long-term and continuous monitoring of blood pressure.

...read moreread less

55 citations

Proceedings Article•DOI•

MetaCluster: unsupervised binning of environmental genomic fragments and taxonomic annotation

[...]

Bin Yang¹, Yu Peng¹, Henry C. M. Leung¹, Siu-Ming Yiu¹, Junjie Qin, Ruiqiang Li, Francis Y. L. Chin¹ - Show less +3 more•Institutions (1)

University of Hong Kong¹

02 Aug 2010

TL;DR: MetaCluster 2.0 is presented, an unsupervised binning method which could bin metagenomic sequencing datasets with high accuracy, and also identify unknown genomes and annotate them with proper taxonomic labels.

...read moreread less

Abstract: Limited by the laboratory technique, traditional microorganism research usually focuses on one single individual species. This significantly limits the deep analysis of intricate biological processes among complex microorganism communities. With the rapid development of genome sequencing techniques, the traditional research methods of microorganisms based on the isolation and cultivation are gradually replaced by metagenomics, also known as environmental genomics. The first step, which is also the major bottleneck of metagenomic data analysis, is the identification and taxonomic characterization of the DNA fragments (reads) resulting from sequencing a sample of mixed species. This step is usually referred as "binning".Existing binning methods based on sequence similarity and sequence composition markers rely heavily on the reference genomes of known microorganisms and phylogenetic markers. Due to the limited availability of reference genomes and the bias and unstableness of markers, these methods may not be applicable in all cases. Not much unsupervised binning methods are reported, but the unsupervised nature of these methods makes them extremely difficult to annotate the clusters with taxonomic labels. In this paper, we present MetaCluster 2.0, an unsupervised binning method which could bin metagenomic sequencing datasets with high accuracy, and also identify unknown genomes and annotate them with proper taxonomic labels. The running time of MetaCluster 2.0 is at least 30 times faster than existing binning algorithms.MetaCluster 2.0, and all the test datasets mentioned in this paper are available at http://i.cs.hku.hk/~alse/MetaCluster/.

...read moreread less

40 citations

Proceedings Article•DOI•

Feature extraction values for breast cancer mammography images

[...]

Hala M. Alshamlan¹, Ali El-Zaart¹•Institutions (1)

King Saud University¹

16 Apr 2010

TL;DR: The aims of this study are to determine the features extraction ranges values for Breast Cancer mammography image and analyze Breast cancer mammography images using these significant features.

...read moreread less

Abstract: Breast cancer is one of the most common cancers among woman of the developing countries in the world, and it has also become a major cause of death. Treatment of breast cancer is effective only if it is detected at an early stage. X-ray mammography is the most effective method for early detection but the mammography images are complex. Thus nowadays, image processing and image analysis techniques are use to assist radiologist for detecting tumors in mammography images. In this paper we specify and determined the important and significant Breast Cancer Feature Extraction. After that, we analyze Breast cancer mammography images using these significant features. However, the aim of this study is to determine the features extraction ranges values for Breast Cancer mammography image.

...read moreread less

38 citations

Proceedings Article•DOI•

Implementation of a wearerable real-time system for physical activity recognition based on Naive Bayes classifier

[...]

Xiuxin Yang¹, Anh Dinh¹, Li Chen¹•Institutions (1)

University of Saskatchewan¹

16 Apr 2010

TL;DR: A wearable real-time system on the Sun SPOT wireless sensors with Naive Bayes algorithm to recognize physical activity to work better than other algorithms both in accuracy performance and computational time in this particular application.

...read moreread less

Abstract: In this paper, we implement a wearable real-time system on the Sun SPOT wireless sensors with Naive Bayes algorithm to recognize physical activity. Naive Bayes algorithm is demonstrated to work better than other algorithms both in accuracy performance and computational time in this particular application. 20Hz is selected as the sampling rate. In terms of sensor location, one sensor attached to the thigh with 87.55% overall accuracy provides the most useful information than the shank or the chest. If two sensors are available, the combination of attaching them to the left thigh and the right thigh respectively is demonstrated to be optimal solution for recognizing physical activity, with 90.52% overall accuracy.

...read moreread less

34 citations

Proceedings Article•DOI•

REAL: an efficient REad ALigner for next generation sequencing reads

[...]

Kimon Frousios¹, Costas S. Iliopoulos¹, Laurent Mouchard², Solon P. Pissis¹, German Tischler¹ - Show less +1 more•Institutions (2)

King's College London¹, University of Rouen²

02 Aug 2010

TL;DR: REad ALigner is presented, an efficient, accurate and consistent tool for aligning short reads obtained from next generation sequencing, based on a new, simple, yet efficient mapping algorithm that can match and outperform current BWT-based software.

...read moreread less

Abstract: Motivation: The constant advances in sequencing technology are turning whole-genome sequencing into a routine procedure, resulting in massive amounts of data that need to be processed. Tens of gigabytes of data in the form of short reads need to be mapped back to reference sequences, a few gigabases long. A first generation of short read alignment software successfully employed hash tables, and the current second generation uses Burrows-Wheeler Transform, further improving mapping speed. However, there is still demand for faster and more accurate mapping.Results: In this paper, we present REad ALigner, an efficient, accurate and consistent tool for aligning short reads obtained from next generation sequencing. It is based on a new, simple, yet efficient mapping algorithm that can match and outperform current BWT-based software.

...read moreread less

33 citations

Proceedings Article•DOI•

An improved method for digital time series signal generation from scanned ECG records

[...]

Prashanth Swamy¹, Srinivasan Jayaraman¹, M. Girish Chandra¹•Institutions (1)

Tata Consultancy Services¹

16 Apr 2010

TL;DR: An improved algorithm for the existing paper ECG trace to digital time series with adaptive and iterative image processing techniques is proposed and shows an accuracy of 95%, demonstrating the usefulness of the approach.

...read moreread less

Abstract: Archiving the paper Electrocardiogram (ECG) trace as an image in hospitals and clinics is a regular practice to maintain the patients' history. However, it requires immense storage space and manpower for storage and retrieval of the patient records. In this paper we have proposed an improved algorithm for the existing paper ECG trace to digital time series with adaptive and iterative image processing techniques. We have tested our algorithm with a number of ECG sheets printed from 12-lead ECG equipments. Further, the proposed technique is enhanced to calculate the heart rate from the obtained time series to facilitate the evaluation of the methodology. Compared with the manually obtained heart rate, the methodology shows an accuracy of 95%, demonstrating the usefulness of the approach. Also, elaborate experimentation with the algorithm has brought out the robustness of the methodology in handling ECG traces from different sources.

...read moreread less

Proceedings Article•DOI•

Driver fatigue detection through pupil detection and yawing analysis

[...]

Weiwei Liu¹, Haixin Sun¹, Weijie Shen¹•Institutions (1)

Xiamen University¹

16 Apr 2010

TL;DR: This paper presents a fatigue detection system based on pupil detection and yawning analysis that works robustly at night time because of the IR illuminator.

...read moreread less

Abstract: Detecting driver fatigue is an important method to improving transportation safety. This paper presents a fatigue detection system based on pupil detection and yawning analysis. The parameters used for detecting fatigue are: eye closure duration measured through eye state information and yawning analyzed through mouth state information. Because of the IR illuminator, the system works robustly at night time.

...read moreread less

Proceedings Article•DOI•

Exact ILP solutions for phylogenetic minimum flip problems

[...]

Markus Chimani¹, Sven Rahmann², Sebastian Böcker¹•Institutions (2)

University of Jena¹, Technical University of Dortmund²

02 Aug 2010

TL;DR: This work considers the Minimum Flip Consensus Tree and Minimum Flip Supertree problem, where input trees are transformed into a 0/1-matrix, such that each row represents a taxon, and each column represents a subtree membership.

...read moreread less

Abstract: In computational phylogenetics, the problem of constructing a consensus tree or supertree of a given set of rooted input trees can be formalized in different ways. We consider the Minimum Flip Consensus Tree and Minimum Flip Supertree problem, where input trees are transformed into a 0/1/?-matrix, such that each row represents a taxon, and each column represents a subtree membership. For the consensus tree problem, all input trees contain the same set of taxa, and no ?-entries occur. For the supertree problem, the input trees may contain different subsets of the taxa, and unrepresented taxa are coded with ?-entries. In both cases, the goal is to find a perfect phylogeny for the input matrix requiring a minimum number of 0/1-flips, i.e., matrix entry corrections. Both optimization problems are NP-hard.We present the first efficient Integer Linear Programming (ILP) formulations for both problems, using three distinct characterizations of a perfect phylogeny. Although these three formulations seem to differ considerably at first glance, we show that they are in fact polytope-wise equivalent. Introducing a novel column generation scheme, it turns out that the simplest, purely combinatorial formulation is the most efficient one in practice. Using our framework, it is possible to find exact solutions for instances with ~100 taxa.

...read moreread less

Proceedings Article•DOI•

Accelerating HMMER on GPUs by implementing hybrid data and task parallelism

[...]

Narayan Ganesan¹, Roger D. Chamberlain², Jeremy Buhler², Michela Taufer¹•Institutions (2)

University of Delaware¹, Washington University in St. Louis²

02 Aug 2010

TL;DR: This paper re-designs its computational algorithm to extract data parallelism for a more efficient execution on emerging platforms, despite the fact that hmmersearch has data dependencies and outperforms other existing methods when searching a very large database of unsorted sequences on GPUs.

...read moreread less

Abstract: Many biologically motivated problems are expressed as dynamic programming recurrences and are difficult to parallelize due to the intrinsic data dependencies in their algorithms. Therefore their solutions have been sped up using task level parallelism only. Emerging platforms such as GPUs are appealing parallel architectures for high-performance; at the same time they are a motivation to rethink the algorithms associated with these problems, to extract finer-grained parallelism such as data parallelism.In this paper, we consider the hmmersearch program as a representative of these problems and we re-design its computational algorithm to extract data parallelism for a more efficient execution on emerging platforms, despite the fact that hmmersearch has data dependencies. Our approach outperforms other existing methods when searching a very large database of unsorted sequences on GPUs.

...read moreread less

Journal Article•DOI•

Radio Frequency Identification (RFID) Technology: Gaining A Competitive Value Through Cloud Computing

[...]

Daniel Owunwanne¹•Institutions (1)

Howard University¹

02 Nov 2010

TL;DR: This work proposes implementing RFID technology using cloud computing framework to alleviate or reduce the implementation cost which is the most prevalent barrier.

...read moreread less

Abstract: Radio Frequency Identification (RFID) uses radio waves to track the movement of goods through the Supply Chain system. The identity of an object is captured with a unique serial number that is transmitted wirelessly to a computer system. Small businesses are facing RFID implementation barriers. The barriers range from the perspective of the consumer-goods manufacturers and retail organizations. We propose implementing RFID technology using cloud computing framework to alleviate or reduce the implementation cost which is the most prevalent barrier.

...read moreread less

Journal Article•DOI•

Simulations in the Smart Grid Field Study MeRegio Simulationen im MeRegio Smart Grid Feldtest

[...]

Christian Hirsch¹, Lutz Hillemacher, Carsten Block, Alexander Schuller, Dominik Möst - Show less +1 more•Institutions (1)

Karlsruhe Institute of Technology¹

01 Mar 2010

TL;DR: The aim of the research project MeRegio is to meet the claim for more efficient decentralized energy systems by integrating advanced information and communication technologies into all stages of the energy supply chain through a powerful and lawful ICT infrastructure.

...read moreread less

Abstract: Abstract The aim of the research project MeRegio is to meet the claim for more efficient decentralized energy systems by integrating advanced information and communication technologies (ICT) into all stages of the energy supply chain. Several marketplaces — in particular for power and for ancillary services — which are coupled to the technical energy infrastructure through a powerful and lawful ICT infrastructure should serve as a basis for an efficient and transparent coordination of energy supply, energy demand, and services. The developed concepts will be both validated by simulations and tested within a model region.

...read moreread less

Proceedings Article•DOI•

Automatic selection of near-native protein-ligand conformations using a hierarchical clustering and volunteer computing

[...]

Trlce Estrada¹, Roger S. Armen², Michela Taufer¹•Institutions (2)

University of Delaware¹, University of Michigan²

02 Aug 2010

TL;DR: This paper addresses the problem of extensively searching large spaces of protein-ligand docking conformations, supported by the volunteer computing project Docking@Home (D@H), by using a probabilistic hierarchical clustering based on ligand geometry.

...read moreread less

Abstract: Docking simulations are commonly used to understand drug binding and require the search of a large space of proteinligand conformations. Cloud and volunteer computing enable computationally expensive docking simulations at a rate never seen before but at the same time require scientists to deal with larger datasets. When analysing these datasets, a common practice is to reduce the resulting number of candidates up to 10 to 100 conformations based on energy values and then leave the scientists with the tedious task of subjectively selecting a possible near-native ligand. Scientists normally perform this task manually by using visual tools. Not only the manual process still depends on inaccurate energy scoring but also can be highly error-prone.The contributions of this paper are twofold: First, we address the problem of extensively searching large spaces of protein-ligand docking conformations, supported by the volunteer computing project Docking@Home (D@H). Second, we address the problem of accurately, and automatically, selecting near-native ligand conformations from the large number of D@H results by using a probabilistic hierarchical clustering based on ligand geometry. Our method holds up even when we test for a search that is not biased by starting from near-native ligand conformations and clearly outperforms energy-based scoring methods.

...read moreread less

Journal Article•DOI•

Proposal And Development Of The Direct Mail Method PMCI-DM For Effectively Attracting Customers

[...]

Taku Kojima¹, Toshiyuki Kimura¹, Manabu Yamaji¹, Kakuro Amasaka¹•Institutions (1)

Aoyama Gakuin University¹

01 Nov 2010

TL;DR: In this article, the authors conducted a demonstrative study that focused on creating effective direct mail and promotional advertisements for attracting customers to dealers, and they considered these strategies to be elemental technologies vital to "forming ties with the customer".

...read moreread less

Abstract: Observations of recent changes in the marketing field reveal that a marketing and sales system that places further emphasis on interaction with the customer must be established. The authors conducted a demonstrative study that focused on creating effective direct mail and promotional advertisements for attracting customers to dealers. We consider these strategies to be elemental technologies vital to “forming ties with the customer.” Finally, the authors established “PMCI-DM” (Practical use Model of Customer Information for Direct Mail) and applied it at Company A, a foreign-funded automobile sales company. We were thus able to demonstrate this strategy’s effectiveness.

...read moreread less

Proceedings Article•DOI•

A new method for automatic detection and diagnosis of retinopathy diseases in colour fundus images based on Morphology

[...]

Mahsa Naser Langroudi¹, Hamed Sadjedi•Institutions (1)

Islamic Azad University¹

16 Apr 2010

TL;DR: The algorithm effectively diagnoses Glaucoma and other diseases which cause changes to the optic disc and determines the type of each lesion based on Morphology.

...read moreread less

Abstract: Automatic detection of lesions in retinal images can assist in early diagnosis and screening of retinopathy diseases. In this paper, the detection of five types of lesions and optic disc has been studied. These lesions include: Hard exudates, Soft exudates, Drusen, Microaneurysm and Hemorrhage, each of which is a sign of one or more types of disease. Our algorithm also effectively diagnoses Glaucoma and other diseases which cause changes to the optic disc. In our method, first the retina images are pre-processed. Then, our algorithm detects OD, fovea and lesions in the image and determines the type of each lesion based on Morphology. Later, the system finds the Characteristics of the Optic Disc for diagnosis of Glaucoma. It is shown that the performance of the proposed method is high. We have achieved a sensitivity of 92.5% and a specificity of 81.4%.

...read moreread less

Proceedings Article•DOI•

Genome-wide compatible SNP intervals and their properties

[...]

Jeremy Wang¹, Kyle J. Moore¹, Qi Zhang¹, Fernando Pardo-Manual de Villena¹, Wei Wang¹, Leonard McMillan¹ - Show less +2 more•Institutions (1)

University of North Carolina at Chapel Hill¹

02 Aug 2010

TL;DR: This work presents methods for partitioning a genome into blocks for which there are no apparent recombinations, thus providing parsimonious sets of compatible genome intervals based on the four-gamete test and defines the notion of an interval set that achieves the interval lower-bound, yet maximizes interval overlap.

...read moreread less

Abstract: Intraspecific genomes can be subdivided into blocks with limited diversity. Understanding the distribution and structure of these blocks will help to unravel many biological problems including the identification of genes associated with complex diseases, finding the ancestral origins of a given population, and localizing regions of historical recombination, gene conversion, and homoplasy.We present methods for partitioning a genome into blocks for which there are no apparent recombinations, thus providing parsimonious sets of compatible genome intervals based on the four-gamete test. Our contribution is a thorough analysis of the problem of dividing a genome into compatible intervals, in terms of its computational complexity, and by providing an achievable lower-bound on the minimal number of intervals required to cover an entire data set. In general, such minimal interval partitions are not unique. However, we identify properties that are common to every possible solution. We also define the notion of an interval set that achieves the interval lower-bound, yet maximizes interval overlap. We demonstrate algorithms for partitioning both haplotype data from inbred mice as well as outbred heterozygous genotype data using extensions of the standard four-gamete test. These methods allow our algorithms to be applied to a wide range of genomic data sets.

...read moreread less

Proceedings Article•DOI•

Protein structure alignment using elastic shape analysis

[...]

Wei Liu¹, Anuj Srivastava¹, Jinfeng Zhang¹•Institutions (1)

Florida State University¹

02 Aug 2010

TL;DR: A method for flexible protein structure alignment based on elastic shape analysis of backbones, in a manner that can include the backbone geometry, the secondary structures, and the amino-acid sequences in the matching process, is presented.

...read moreread less

Abstract: In this paper we present a method for flexible protein structure alignment based on elastic shape analysis of backbones, in a manner that can incorporate different characteristics of the backbones. In particular, it can include the backbone geometry, the secondary structures, and the amino-acid sequences in the matching process. As a result, a formal distance can be calculated and geodesic paths, showing optimal deformations between conformations/structures, can be computed for any two backbone structures. It can also be used to average shapes of conformations associated with similar proteins. Using proteins from SCOP and PDB databases we demonstrate the matching and clustering of proteins using the backbone geometries, the secondary labels and the primary sequences. We demonstrate almost 92% success rate in automatic clustering of 100 proteins from SCOP database.

...read moreread less

Proceedings Article•DOI•

Predicting breast cancer recurrence using data mining techniques

[...]

Qi Fan, Chang-jie Zhu, Liu Yin

16 Apr 2010

TL;DR: A new data pre-classification method is presented and a possible solution to discover the information of breast cancer recurrence of SEER data is found and c5 algorithm has the best performance of accuracy.

...read moreread less

Abstract: In this study, we firstly take good advantage of SEER Public-Use Data to predict breast cancer recurrence using data mining techniques. The SEER Public-Use Data 2005 is used in this research. We presented a new data pre-classification method and firstly find a possible solution to discover the information of breast cancer recurrence of SEER data. After the preprocessing of the dataset, we investigate several algorithms. As a result, we found c5 algorithm has the best performance of accuracy.

...read moreread less

Proceedings Article•DOI•

Bandage-size non-ECG heart rate monitor using ZigBee wireless link

[...]

Anh Dinh¹, Tao Wang¹•Institutions (1)

University of Saskatchewan¹

16 Apr 2010

TL;DR: This design uses a simple technique to pick up the sound of the heart beat and send the beat signal wirelessly to a computer using ZigBee protocol, which is a low-cost with a very small size, light weight and easy to use by the patient.

...read moreread less

Abstract: Heart rate is an indication for the health of a human being. Traditionally, ECG signal is used to measure and monitor heart rate. This design uses a simple technique to pick up the sound of the heart beat and send the beat signal wirelessly to a computer. The system is a low-cost with a very small size, light weight and easy to use by the patient. A microphone is used to pick up the sound of the heart beat. The signal is processed, sampled and sent wirelessly using ZigBee protocol. Experimental results show the system functioning properly. Heart beat signals are sensed, sent, displayed, monitored, stored, reviewed, and analyzed with ease. Flexible PCB can be used to further reducing size and weight of the sensing unit.

...read moreread less

Proceedings Article•

Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology

[...]

Aidong Zhang¹, Mark Borodovsky², Gultekin Ozsoyoglu³, Armin R. Mikler⁴•Institutions (4)

University at Buffalo¹, Georgia Institute of Technology², Case Western Reserve University³, University of North Texas⁴

02 Aug 2010

TL;DR: The 2010 ACM International Conference on Bioinformatics and Computational Biology is aimed to bridge interdisciplinary research areas in computer science, mathematics, statistics, biology, bioinformics, and biomedicine and to provide an interactive forum for professional researchers, practitioners, and graduate students from around the world to discuss latest advances.

...read moreread less

Abstract: The 2010 ACM International Conference on Bioinformatics and Computational Biology (ACM-BCB-2010) is the first ACM (Association for Computing Machinery) conference in the areas of bioinformatics, computational biology, and biomedical informatics. The ACM-BCB-2010 is aimed to bridge interdisciplinary research areas in computer science, mathematics, statistics, biology, bioinformatics, and biomedicine and to provide an interactive forum for professional researchers, practitioners, and graduate students from around the world to discuss latest advances in Bioinformatics and Computational Biology. Over the past ten years, we have witnessed a dramatic acceleration in biological and biomedical research due to advent of new high-throughput technologies and next generation sequencing. Novel massive data stored in fast growing databases has called for more powerful algorithms that could identify genes and proteins, infer their function, predict protein structures, and identified protein-protein interactions. Clearly, the ability to analyze all these data concurrently will lead to greater understanding of the biological mechanisms underlying heredity and disease. Bioinformatics and Computational Biology cut across a broad range of research topics from genomics, transcriptomics, proteomics and metabolomics to systems biology, tissue engineering, and medical image analysis. Development of efficient computational methods in all these areas is critically important for solving biological and biomedical problems that holds keys for improving human health. ACM-BCB annual conferences are dedicated to develop interactions between experienced and junior researchers in the fields of computer science, computer engineering, applied computing, biology, medicine, biophysical sciences and life sciences. It assembles research workshops, keynote and tutorial lectures, as well as special interest research sessions into a coordinated research meeting. To promote interdisciplinary research and training the conference agenda includes two keynote speeches, two tutorial lectures, and one panel discussion. In addition to the main conference, ACM-BCB-2010 also features four workshops: (1) Gene Networks and Pathway Analysis, (2) Graph Theoretic Approaches for Biological Network Analysis, (3) Immunoinformatics and Computational Immunology, and (4) Protein-protein Interaction Data: Management, Querying and Analysis. In preparation for ACM-BCB-2010, we received 164 paper submissions. The conference program features 37 regular papers, 27 short papers, and 30 poster papers. The conference concludes on August 4, 2010 with nominations and selections of awards for best paper, best student paper, and best poster.

...read moreread less

Journal Article•DOI•

Work-Life Balance In Hospitality: Experiences From A Geneva-Based Hotel

[...]

Robert A. Lewis

02 Nov 2010

TL;DR: In this paper, a study carried out on 30 employees in a Geneva-based hotel, found that employee work-life balance issues are affected by human resource policy and that these issues can be mitigated through organisational support and the recognition of informal feedback.

...read moreread less

Abstract: This study, carried out on 30 employees in a Geneva-based hotel, argues that employee work-life balance issues are affected by human resource policy. Questionnaires, containing attitude scales and open-ended questions, revealed that employees remained in their jobs because of work-life programmes. Variables identified in this study which positively affected employee well-being included increased schedule flexibility and mutually beneficial relationships with line managers. Negative ones included long working hours, the sacrifice of private life, invasive working hours, decreased social and family life in addition to increased fatigue and stress. Study results also revealed that work-life balance issues perceived by employees can be mitigated through organisational support and the recognition of informal feedback.

...read moreread less

Journal Article•DOI•

Network analysis of human protein location

[...]

Gaurav Kumar¹, Shoba Ranganathan¹, Shoba Ranganathan²•Institutions (2)

Macquarie University¹, National University of Singapore²

15 Oct 2010

TL;DR: The findings indicate that the metabolic network adds value to the information in the PPI network for the localisation process of proteins in human subcellular compartments, as the MLPI network has evolved to maintain high substrate specificity for proteins.

...read moreread less

Abstract: Understanding cellular systems requires the knowledge of a protein's subcellular localization (SCL). Although experimental and predicted data for protein SCL are archived in various databases, SCL prediction remains a non-trivial problem in genome annotation. Current SCL prediction tools use amino-acid sequence features and text mining approaches. A comprehensive analysis of protein SCL in human PPI and metabolic networks for various subcellular compartments is necessary for developing a robust SCL prediction methodology. Based on protein-protein interaction (PPI) and metabolite-linked protein interaction (MLPI) networks of proteins, we have compared, contrasted and analysed the statistical properties across different subcellular compartments. We integrated PPI and metabolic datasets with SCL information of human proteins from LOCATE and GOA (Gene Ontology Annotation) and estimated three statistical properties: Chi-square (χ2) test, Paired Localisation Correlation Profile (PLCP) and network topological measures. For the PPI network, Pearson's chi-square test shows that for the same SCL category, twice as many interacting protein pairs are observed than estimated when compared to non-interacting protein pairs (χ2 = 1270.19, P-value < 2.2 × 10-16), whereas for MLPI, metabolite-linked protein pairs having the same SCL are observed 20% more than expected, compared to non-metabolite linked proteins (χ2 = 110.02, P-value < 2.2 x10-16). To address the issue of proteins with multiple SCLs, we have specifically used the PLCP (Pair Localization Correlation Profile) measure. PLCP analysis revealed that protein interactions are majorly restricted to the same SCL, though significant cross-compartment interactions are seen for nuclear proteins. Metabolite-linked protein pairs are restricted to specific compartments such as the mitochondrion (P-value < 6.0e-07), the lysosome (P-value < 4.7e-05) and the Golgi apparatus (P-value < 1.0e-15). These findings indicate that the metabolic network adds value to the information in the PPI network for the localisation process of proteins in human subcellular compartments. The MLPI network differs significantly from the PPI network in its SCL distribution. The PPI network shows passive protein interaction, possibly due to its high false positive rate, across different subcellular compartments, which seem to be absent in the MLPI network, as the MLPI network has evolved to maintain high substrate specificity for proteins.

...read moreread less

Proceedings Article•DOI•

Bioinformatics: Trends in gene expression analysis

[...]

Shital A. Raut¹, S. R. Sathe¹, Adarsh Raut•Institutions (1)

Visvesvaraya National Institute of Technology¹

16 Apr 2010

TL;DR: All the major computational tools, various major methods from last many years which are used as a trends in gene expression analysis are discussed and major difficulties while applying these methods to databases for analysis purpose are discussed.

...read moreread less

Abstract: Bioinformatics is an interesting combination of biology and computational sciences, which help scientists and researchers to do more biological experiments to improve the life of living being. Gene expression is fundamental biological basics of cell biology. It is also responsible for genetic as well as physical or biochemical characteristics of an organism. The study of gene expression analysis helps to predict resultant protein product, in identifying abnormal functioning of cells which may responsible for various diseases, and in designing new drugs. For analysis purpose, DNA Microarray is an important tool as number of genes can simultaneously be observed. The output of DNA Microarray is vast databases which need to be processed by computation tools to take out biological significance. Computation tools include various algorithms of data mining, pattern recognition, support vector machines etc. Vast literature is available which demonstrate different algorithms and their result for analysis purpose. To find unique algorithms which satisfies all the requisite constraints is still the research topic. In this paper, we try to discuss all the major computational tools, various major methods from last many years which are used as a trends in gene expression analysis. We also try to discusses major difficulties while applying these methods to databases for analysis purpose.

...read moreread less

Proceedings Article•DOI•

Feature selection for semi-supervised multi-label learning with application to gene function analysis

[...]

Guo-Zheng Li¹, Mingyu You¹, Lei Ge², Jack Y. Yang³, Mary Qu Yang³ - Show less +1 more•Institutions (3)

Tongji University¹, Shanghai University², Indiana University³

02 Aug 2010

TL;DR: A novel semi-supervised multi-label learning algorithm COMN is proposed by combining Co-Training with ML-kNN to utilize the unlabeled yeast gene data to improve modeling accuracy of function annotation and an embedded feature selection algorithm PRECOMn is proposed to perform feature selection for COMN to remove the irrelevant and redundant features.

...read moreread less

Abstract: This paper investigates gene function annotation of Yeast by using semi-supervised multi-label learning. Multi-label learning has been a hot topic in the bioinformatics field, but there are many samples unlabeled. Semi-supervised learning may be employed to utilize the unlabeled data. This paper proposes a novel semi-supervised multi-label learning algorithm COMN by combining Co-Training with ML-kNN to utilize the unlabeled yeast gene data to improve modeling accuracy of function annotation. Furthermore, an embedded feature selection algorithm PRECOMN is proposed to perform feature selection for COMN to remove the irrelevant and redundant features. Experimental results on one benchmark data set of Yeast show COMN and PRECOMN perform better than the original multi-label learning algorithm ML-kNN. Furthermore PRECOMN improves generalization performance of COMN.

...read moreread less

Journal Article•DOI•

Management Principles Associated With IT Project Success

[...]

John M. Nicholas¹, Gezinus J. Hidding¹•Institutions (1)

Loyola University Chicago¹

02 Nov 2010

TL;DR: The Value-Driven Change Leadership (VDCL) paradigm as discussed by the authors is a new project management paradigm, which focuses on the project itself and on meeting time and cost targets, and it was proposed by a panel of project management experts.

...read moreread less

Abstract: Success in information technology (IT) projects remains elusive, even after decades of efforts to improve it. Most of these efforts have focused on variations of the traditional project management paradigm as promulgated by PMBOK. We suspected that a potential cause of high IT project failure is with the paradigm, which focuses on the project itself and on meeting time and cost targets. A new paradigm called Value-Driven Change Leadership (VDCL) originated from discussions of a panel of project management experts. This paper describes the principles of that paradigm. It also reports the results from a survey of four project managers on the association between project success and management principles from VDCL and PMBOK.

...read moreread less

Proceedings Article•DOI•

Differential biclustering for gene expression analysis

[...]

Omar Odibat¹, Chandan K. Reddy¹, Craig N. Giroux¹•Institutions (1)

Wayne State University¹

02 Aug 2010

TL;DR: This paper proposes DiBiCLUS, a novel Differential Biclustering algorithm, to identify differential biclusters from the gene expression data where the samples belong to one of the two classes, and introduces two criteria for any pair of genes to be considered as a differential pair across the two Classes.

...read moreread less

Abstract: Biclustering algorithms have been successfully used to find subsets of co-expressed genes under subsets of conditions. In some cases, microarray experiments are performed to compare the biological activities of the genes between two classes of cells, such as normal and cancer cells. In this paper, we propose DiBiCLUS, a novel Differential Biclustering algorithm, to identify differential biclusters from the gene expression data where the samples belong to one of the two classes. The genes in these differential biclusters can be positively or negatively co-expressed. We introduce two criteria for any pair of genes to be considered as a differential pair across the two classes. To illustrate the performance of the proposed algorithm, we present the experimental results of applying DiBiCLUS algorithm on synthetic and reallife datasets. These experiments show that the identified differential biclusters are both statistically and biologically significant.

...read moreread less

Collapse