
Showing papers on "Dendrogram" published in 2005


Patent
12 Sep 2005
TL;DR: A document correlation diagram drawing device comprises extracting means (20, 30) for extracting content data and time data on document elements (E) each composed of one or more documents, dendrogram drawing means (50) for drawing the correlation between documents according to the content data on the document elements, clustering means (70) for cutting the dendrogram according to a predetermined rule and extracting clusters, and intra-cluster arranging means (90) for determining the arrangement of the document element group belonging to the clusters in the clusters according to time data.
Abstract: A document correlation diagram drawing device comprises extracting means (20, 30) for extracting content data and time data on document elements (E) each composed of one or more documents, dendrogram drawing means (50) for drawing a dendrogram showing the correlation between documents according to the content data on the document elements, clustering means (70) for cutting the dendrogram according to a predetermined rule and extracting clusters, and intra-cluster arranging means (90) for determining the arrangement of the document element group belonging to the clusters in the clusters according to the time data on the document elements. With this, a dendrogram adequately showing the temporal development in each field is automatically drawn.

207 citations
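
The pipeline in the patent abstract above (build a dendrogram from content data, cut it by a rule, then order each cluster by time) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the keyword sets standing in for "content data", the Jaccard distance, the average-linkage rule, the fixed cut height, and the helper names `cut_average_linkage` and `arrange_by_time`. The patent itself only says the cut follows "a predetermined rule".

```python
from itertools import combinations

def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two keyword sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def cut_average_linkage(features, height):
    """Merge clusters by average linkage until the closest pair is farther
    apart than `height`. For a monotone linkage such as average linkage,
    this gives the same clusters as cutting the full dendrogram there."""
    clusters = [[i] for i in range(len(features))]

    def linkage(c1, c2):
        total = sum(jaccard_distance(features[i], features[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[a], clusters[b]) > height:
            break
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters

def arrange_by_time(docs, height):
    """Cluster on content, then sort each cluster's members by date."""
    features = [keywords for _, _, keywords in docs]
    timeline = []
    for members in cut_average_linkage(features, height):
        ordered = sorted(members, key=lambda i: docs[i][1])
        timeline.append([docs[i][0] for i in ordered])
    # Order the clusters themselves by their earliest document.
    return sorted(timeline,
                  key=lambda ids: min(d[1] for d in docs if d[0] in ids))

# Hypothetical documents: (id, ISO date, keyword set).
docs = [
    ("A", "2001-03", {"engine", "fuel", "valve"}),
    ("B", "1999-07", {"engine", "fuel", "pump"}),
    ("C", "2003-01", {"laser", "optics", "lens"}),
    ("D", "2002-05", {"laser", "optics", "mirror"}),
    ("E", "2000-11", {"engine", "valve", "pump"}),
]
timeline = arrange_by_time(docs, height=0.6)
print(timeline)
```

With these toy documents the engine-related and laser-related items fall into separate clusters, and each cluster is listed oldest first.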


Journal ArticleDOI
S. A. Ogunbayo, D. K. Ojo, R. G. Guei, O. O. Oyelakin, K. A. Sanni
TL;DR: Morphological characterization of 40 rice accessions using 14 agro-botanical traits was done in a field experiment in an augmented randomized complete block design to study variations and to select lines that can be used as potential parents in a future breeding program.
Abstract: Morphological characterization of 40 rice accessions using 14 agro-botanical traits was done in a field experiment in an augmented randomized complete block design. The aim of the work was to study variations and to select lines that can be used as potential parents in a future breeding program. The single linkage clustering, the Principal Components Axes and a morphological dendrogram were used to group the accessions. Genetic relatedness among accessions based on random amplified polymorphic DNA (RAPD) molecular marker data was also presented in form of a dendrogram using the Unweighted Pair Group Method with Arithmetic mean (UPGMA). Relative effectiveness of the RAPD markers and genetic diversity among accessions as revealed by botanical descriptors were compared. The single linkage cluster technique classified the 40 accessions into six morphological groups whereas the PCA re-ordered the accessions into four broad groups that had within cluster similarities and inter-cluster morphological variations. RAPDs were highly polymorphic, more discriminatory and informative as they were able to differentiate more pairs of accessions than the botanical descriptors. IITA rice accessions TOX 3052-46-3-3-2-1 and TOX 3027-44-1-E4-2-2 and Brazilian accessions (CL SELECCION 3B and 450) that performed better than checks could be selected for a future breeding program.

81 citations


Journal ArticleDOI
TL;DR: The variability of apricot SSRs is demonstrated for the first time in a large collection of apricot cultivars and closely related species, and the implications for the use of simple sequence repeat (SSR) markers as a tool for fingerprinting cultivars in breeders' rights protection and apricot breeding are discussed.
Abstract: A collection of 133 apricot cultivars and three related species originating from different geographical regions were studied with 10 polymorphic microsatellite markers developed in apricot. Altogether, 133 alleles were identified in the set of accessions, with an average of 13.30 alleles per locus. Out of them, 32 alleles occurred only once in the investigated samples, especially in apricots originating from different eco-geographic groups or in different species. The observed heterozygosity for individual loci ranged from 0.8636 to 0.3182, with an average of 0.6281. An unweighted pair group method with arithmetic mean dendrogram based on Nei's genetic distance grouped the accessions according to their eco-geographical origin and/or their pedigree information. Central Asian cultivars have a distinct position on the dendrogram, which supports the assumption that most cultivars have an Asian ancestor. Most East European cultivars analysed cluster together, and the data even revealed a few synonyms. Results show that American cultivars have not only European germ plasm in their pedigree, but they have also been enriched with germ plasm of Asian origin. The implications of these data for the use of simple sequence repeat (SSR) markers as a tool for fingerprinting cultivars in breeders' rights protection and apricot breeding are discussed. In this paper, we demonstrate for the first time the variability of apricot SSRs in a large collection of apricot cultivars and closely related species.

80 citations


Journal ArticleDOI
TL;DR: This work analyzed 1440 single gene knock-out mutants using the GN2-MicroPlate, which permits assay of 95 carbon-source utilizations simultaneously and genes were interrelated by the clustering method of the GeneSpring software, which would be useful for functional assignment of so called y-genes with no apparent function.

47 citations


Journal ArticleDOI
TL;DR: The sharing of functional keywords among genes is used as a basis for clustering in a new approach called BEA-PARTITION, which is simple to implement and provides a powerful approach to clustering genes or to any clustering problem where starting matrices are available from experimental observations.
Abstract: Partitioning closely related genes into clusters has become an important element of practically all statistical analyses of microarray data. A number of computer algorithms have been developed for this task. Although these algorithms have demonstrated their usefulness for gene clustering, some basic problems remain. This paper describes our work on extracting functional keywords from MEDLINE for a set of genes that are isolated for further study from microarray experiments based on their differential expression patterns. The sharing of functional keywords among genes is used as a basis for clustering in a new approach called BEA-PARTITION in this paper. Functional keywords associated with genes were extracted from MEDLINE abstracts. We modified the Bond Energy Algorithm (BEA), which is widely accepted in psychology and database design but is virtually unknown in bioinformatics, to cluster genes by functional keyword associations. The results showed that BEA-PARTITION and hierarchical clustering algorithm outperformed k-means clustering and self-organizing map by correctly assigning 25 of 26 genes in a test set of four known gene groups. To evaluate the effectiveness of BEA-PARTITION for clustering genes identified by microarray profiles, 44 yeast genes that are differentially expressed during the cell cycle and have been widely studied in the literature were used as a second test set. Using established measures of cluster quality, the results produced by BEA-PARTITION had higher purity, lower entropy, and higher mutual information than those produced by k-means and self-organizing map. Whereas BEA-PARTITION and the hierarchical clustering produced similar quality of clusters, BEA-PARTITION provides clear cluster boundaries compared to the hierarchical clustering. BEA-PARTITION is simple to implement and provides a powerful approach to clustering genes or to any clustering problem where starting matrices are available from experimental observations.

43 citations
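
The abstract above compares methods by purity and entropy without defining them. The sketch below uses the common textbook definitions (purity as the fraction of items in the majority class of their cluster, entropy as the size-weighted class entropy per cluster); whether the paper uses exactly these forms is an assumption, and the toy labels are invented for illustration.

```python
from collections import Counter
from math import log2

def purity(clusters, labels):
    """Fraction of items that belong to the majority class of their cluster."""
    total = sum(len(c) for c in clusters)
    majority = sum(max(Counter(labels[i] for i in c).values()) for c in clusters)
    return majority / total

def entropy(clusters, labels):
    """Size-weighted average of the class entropy (in bits) inside each cluster."""
    total = sum(len(c) for c in clusters)
    result = 0.0
    for c in clusters:
        counts = Counter(labels[i] for i in c)
        h = -sum((n / len(c)) * log2(n / len(c)) for n in counts.values())
        result += (len(c) / total) * h
    return result

# Hypothetical known groups for six genes, and two candidate clusterings
# given as lists of gene indices.
labels = ["a", "a", "a", "b", "b", "b"]
perfect = [[0, 1, 2], [3, 4, 5]]
mixed = [[0, 1, 2, 3], [4, 5]]

print(purity(perfect, labels), entropy(perfect, labels))
print(purity(mixed, labels), entropy(mixed, labels))
```

A clustering that matches the known groups has purity 1 and entropy 0; moving one gene into the wrong cluster lowers purity and raises entropy.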


Journal ArticleDOI
TL;DR: Two-dimensional representations of the relative positions of the accessions with regards to divergence using the first two canonical variates as co-ordinate axes revealed considerable variability among the cases in each group, and Discriminant Function Analysis (DFA) showed changes in the geographic origin of 11 accessions and species status of 20 accessions.
Abstract: Mulberry (Morus L.) is essential for the sericulture industry as the primary feed for the silkworm Bombyx mori L. India, with its long tradition of practicing sericulture, has a large number of indigenous cultivars. Since knowledge on the genetic divergence of these cultivars/varieties is essential for proper conservation and utilization, Inter Simple Sequence Repeat (ISSR) profiling was employed to assess genetic relationships among 34 mulberry accessions, collected from different regions of India. By using 12 ISSR primers, which produced 72 markers displaying a high degree of polymorphism (94.4%), genetic dissimilarity coefficients were calculated for each pair of the accessions and clustering of the accessions with Unweighted Pair Group Method using Arithmetic average (UPGMA) analysis was done to unravel the genetic diversity among the accessions. The dissimilarity coefficients varied from 0.111 to 0.692. UPGMA analysis generated a dendrogram with six groups and five isolates. Clustering of the accessions did not correspond with the information on the geographic origin of many of the accessions. Two-dimensional representations of the relative positions of the accessions with regards to divergence using the first two canonical variates as co-ordinate axes revealed considerable variability among the cases in each group. Further, Discriminant Function Analysis (DFA) showed changes in the geographic origin of 11 accessions and species status of 20 accessions.

41 citations


Journal ArticleDOI
TL;DR: Seed protein profiles of 40 cultivated and wild taxa of Chenopodium have been compared by sodium dodecyl sulfate polyacrylamide gel electrophoresis and show that these taxa are a heterogenous assemblage and their taxonomic affinities need a reassessment.
Abstract: Seed protein profiles of 40 cultivated and wild taxa of Chenopodium have been compared by sodium dodecyl sulfate polyacrylamide gel electrophoresis. The relative similarity between various taxa, estimated by Jaccard’s similarity index and clustered in UPGMA dendrogram, is generally in accordance with taxonomic position, crossability relationships and other biochemical characters. Eight accessions of C. quinoa studied are clustered together and show genetic similarity with closely related C. bushianum and C. berlandieri subsp. nuttalliae. The taxa included under C. album complex are clustered in two groups which show that these taxa are a heterogenous assemblage and their taxonomic affinities need a reassessment. Other wild species studied are placed in the dendrogram more or less according to their taxonomic position.

39 citations


Journal ArticleDOI
01 Feb 2005-Botany
TL;DR: The population genetics of two hybridizing Mexican red oaks was investigated with 54 randomly amplified polymorphic DNA (RAPD) markers scored in 415 individuals representing the distribution area of the two species and a probable secondary hybrid zone, confirming a significant association between geographic and genetic distances among populations.
Abstract: The population genetics of two hybridizing Mexican red oaks, Quercus affinis Schweid. and Quercus laurina Humb. & Bonpl., was investigated with 54 randomly amplified polymorphic DNA (RAPD) markers scored in 415 individuals from 16 populations representing the distribution area of the two species and a probable secondary hybrid zone. Genetic relationships among populations, depicted in an unweighted pair group method with arithmetic averaging (UPGMA) dendrogram, were largely incongruent with the morphological classification of populations as Q. affinis-like or Q. laurina-like that was obtained in previous studies. In contrast, the two main population clusters in the UPGMA dendrogram corresponded to the location of populations in two distinct geographical areas: southwestern and northeastern. A Mantel test confirmed a significant association between geographic and genetic distances among populations. Analyses of molecular variance (AMOVA) indicated that most genetic variation is contained within populations ...

31 citations


Journal ArticleDOI
TL;DR: A clustering algorithm is proposed that clusters data with arbitrary shapes without knowing the number of clusters in advance; dendrograms and so-called tables of relative frequency counts are used to help analysts pick trustworthy clustering results from many different clustering results.

28 citations


Journal ArticleDOI
TL;DR: The outcome of the research points to the fact that the Slovene chestnut is a rich source of genetic diversity and is very suitable for further breeding purposes.
Abstract: Phenotypic diversity of 244 chestnut trees was investigated. They originate from the Mediterranean and from two continental regions in Slovenia. In 3 years of analyses, length, diameter, thickness and weight of fruits; length and width of hilum, the pellicle intrusion; shape and colour of fruits; and the embryony were dealt with. The continental trees have smaller fruits than the Mediterranean trees; their fruits show greater variability in shape and the pellicle intrusion is stronger. They also exhibit polyembryony more frequently and they rarely have darker stripes. The sample of 46 trees from all three regions was used for the comparison between the phenotypic and genotypic diversity which had been investigated by RAPD analysis. In both cases, the UPGMA method was used for the classification of the trees into groups. Six pomological clusters were established. The clusters I, II, III and V comprise only the trees from the continental part. The cluster IV comprises nine trees of the marron type from the Mediterranean and one KOZ1 tree with equal pomological characteristics but originating from the continental part. The cluster VI is composed of eight trees from the continental part and one RAV3 tree whose fruits are very small and it originates from the Mediterranean. The Jaccard’s coefficient of similarity is used in order to evaluate genetic relations. On the RAPD dendrogram four clusters of trees are defined. The continental trees are divided into three clusters and exhibit greater genotypic diversity than the Mediterranean trees which form only one cluster. With the RAPD analyses the differentiation of the trees with regard to their pomological traits is determined in 60–90% of the cases. The outcome of the research points to the fact that the Slovene chestnut is a rich source of genetic diversity and is very suitable for further breeding purposes.

23 citations


Journal ArticleDOI
M. Kocsis, L. Járomi, Péter Putnoky, Pál Kozma, A. Borhidi
TL;DR: The dendrogram shows that the cultivars of this study can be distinguished to a relatively high degree and the RAPD technique was useful for identification and discrimination of these grape cultivars.
Abstract: Twelve cultivars (Vitis vinifera L.) were subjected to RAPD analysis in order to estimate the genetic diversity among these genotypes and to analyse their genetic relationships. The study was performed using 28 primers that generated 120 polymorphic fragments. There was genetic variation among the cultivars with values of genetic diversity ranging from 0.419 to 0.642 using the Jaccard coefficient. UPGMA analysis of distance matrix resulted in a dendrogram with three clusters. The dendrogram shows that the cultivars of our study can be distinguished to a relatively high degree. Results were compared with the taxonomic classification and with the synonyms of the cultivars. The RAPD technique was useful for identification and discrimination of these grape cultivars.
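
Several entries on this page, including the one above, score marker bands as present or absent, compute Jaccard similarities, and feed the resulting distances to UPGMA. A minimal sketch of that workflow follows. The band profiles and cultivar names are invented, `1 - similarity` is assumed as the distance, and `upgma` is a hypothetical helper written for this illustration rather than any paper's software.

```python
def jaccard_similarity(x, y):
    """Shared bands divided by bands present in at least one profile."""
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    return both / either if either else 1.0

def upgma(names, profiles):
    """Return the merge history as (subtree, height) pairs.

    Each step joins the two clusters with the smallest average pairwise
    distance, which is the UPGMA (average linkage) rule."""
    def dist(i, j):
        return 1.0 - jaccard_similarity(profiles[i], profiles[j])

    # cluster id -> (nested-tuple tree, list of member indices)
    clusters = {i: (names[i], [i]) for i in range(len(names))}
    merges = []
    next_id = len(names)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for p in range(len(ids)):
            for q in range(p + 1, len(ids)):
                ma, mb = clusters[ids[p]][1], clusters[ids[q]][1]
                avg = sum(dist(i, j) for i in ma for j in mb) / (len(ma) * len(mb))
                if best is None or avg < best[0]:
                    best = (avg, ids[p], ids[q])
        avg, a, b = best
        tree = (clusters[a][0], clusters[b][0])
        members = clusters[a][1] + clusters[b][1]
        del clusters[a], clusters[b]
        clusters[next_id] = (tree, members)
        merges.append((tree, avg))
        next_id += 1
    return merges

# Hypothetical presence/absence profiles for six bands.
names = ["cv1", "cv2", "cv3", "cv4"]
profiles = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 0],
]
merges = upgma(names, profiles)
for tree, height in merges:
    print(tree, round(height, 3))
```

The last merge in the history is the root of the dendrogram; cutting below its height separates the two groups of toy cultivars.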

Journal ArticleDOI
TL;DR: This work presents a new method for comparing and visualizing relationships between different clustering results, either flat versus flat, or flat versus hierarchical, and implemented in the online gene expression data analysis tool Expression Profiler.
Abstract: Motivation: Clustering is one of the most widely used methods in unsupervised gene expression data analysis. The use of different clustering algorithms or different parameters often produces rather different results on the same data. Biological interpretation of multiple clustering results requires understanding how different clusters relate to each other. It is particularly non-trivial to compare the results of a hierarchical and a flat, e.g. k-means, clustering. Results: We present a new method for comparing and visualizing relationships between different clustering results, either flat versus flat, or flat versus hierarchical. When comparing a flat clustering to a hierarchical clustering, the algorithm cuts different branches in the hierarchical tree at different levels to optimize the correspondence between the clusters. The optimization function is based on graph layout aesthetics or on mutual information. The clusters are displayed using a bipartite graph where the edges are weighted proportionally to the number of common elements in the respective clusters and the weighted number of crossings is minimized. The performance of the algorithm is tested using simulated and real gene expression data. The algorithm is implemented in the online gene expression data analysis tool Expression Profiler. Availability: http://www.ebi.ac.uk/expressionprofiler Contact: aurora@ebi.ac.uk Supplementary information: http://www.ebi.ac.uk/microarray/General/Publications/publications.html
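
Two ingredients named in the abstract above are the bipartite edge weights (the number of elements two clusters share) and mutual information between clusterings. The sketch below covers only the flat-versus-flat case under those textbook definitions. It does not attempt the paper's branch-cutting optimisation or crossing minimisation, and the function names and toy assignments are assumptions.

```python
from collections import Counter
from math import log2

def overlap_edges(flat_a, flat_b):
    """Edge weight for each (cluster in A, cluster in B) pair: the number of
    elements the two clusters have in common."""
    return Counter(zip(flat_a, flat_b))

def mutual_information(flat_a, flat_b):
    """Mutual information (bits) between two flat cluster assignments."""
    n = len(flat_a)
    joint = overlap_edges(flat_a, flat_b)
    size_a, size_b = Counter(flat_a), Counter(flat_b)
    return sum(
        (c / n) * log2((c / n) / ((size_a[a] / n) * (size_b[b] / n)))
        for (a, b), c in joint.items()
    )

# Hypothetical assignments of four genes by two clustering runs.
run1 = [0, 0, 1, 1]
same_grouping = ["p", "p", "q", "q"]
unrelated = ["p", "q", "p", "q"]

print(overlap_edges(run1, same_grouping))
print(mutual_information(run1, same_grouping))
print(mutual_information(run1, unrelated))
```

Two runs that induce the same grouping share one bit of information here, while a grouping that cuts across the first run shares none.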

Reference EntryDOI
15 Oct 2005
TL;DR: In this paper, various tests for the best number of clusters in a data set are reviewed, following an exhaustive validation exercise using artificial data sets with hierarchical agglomerative clustering methods.
Abstract: Various tests for the best number of clusters in a data set are reviewed, following an exhaustive validation exercise using artificial data sets with hierarchical agglomerative clustering methods. The chapter concludes with a discussion about whether it is appropriate to apply statistical tests or to evaluate a classification at several levels of usability. Keywords: agglomerative clustering; best cut; dendrogram; C-index; critical value; error ratio; F-ratio; gamma index; hierarchical classifications; moving average rule; number of clusters; partitioning methods; stopping rule; test criterion; t statistic; upper tail rule; variance ratio

Journal ArticleDOI
TL;DR: A novel algorithm in which cluster boundaries are determined by referring to functional annotations stored in genome databases is proposed, which enables the algorithm to recognize cluster boundaries characterizing fundamental biological processes such as the Early G1, Late G1, S, G2 and M phases in cell cycles.

01 Jan 2005
TL;DR: The results indicate that there is no clear distinction between Caninae groups when many intermediate forms are considered and the heterogeneity found in the dendrogram with respect to sectional status suggests the lack of clear reproductive barriers as is common with long lived woody perennial plants.
Abstract: A dendrogram was constructed based on RAPD data. The variability found in the dendrogram was discussed according to sectional status and geographic origin. Our results indicate that there is no clear distinction between Caninae groups when many intermediate forms are considered. Besides, the subgenus Hulthemia seems to merit just a sectional status, as proposed by other authors for other subgenera. The heterogeneity found in the dendrogram with respect to sectional status suggests the lack of clear reproductive barriers, as is common with long-lived woody perennial plants. Sect. Cassiorhodon may be considered as the Type of the genus since it shows the widest geographical distribution and the widest crossing ability within the genus, and it appears in most groups of the dendrogram, suggesting that it is the most representative section.

Journal ArticleDOI
TL;DR: In this article, a meta-analysis was used to compare the patterns of the tree and the herb layers in a Central-European deciduous hardwood forest and the significance of the relation between the patterns was evaluated through permutation (Mantel) tests and full randomization (Monte Carlo simulation) tests.
Abstract: Meta-analysis is used to compare the patterns of the tree and the herb layers in a Central-European deciduous hardwood forest. Vegetation patterns are represented by distance matrices and dendrograms. The significance of the relation between the patterns is evaluated through permutation (Mantel) tests and full randomization (Monte Carlo simulation) tests. The relationship between the two layers is significant but weak. When using ecological indicators as variables for characterising the herb layer, the relation is stronger. Distance matrices and dendrograms describe the vegetation pattern similarly. However, the results of pairwise tests of significance strongly depend on the “level” of comparisons, i.e., whether distance matrices or dendrograms are compared. This follows perhaps from the differences between permutation and full randomization tests.
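
The permutation (Mantel) test mentioned above compares two distance matrices over the same objects by permuting the rows and columns of one matrix together. A minimal sketch, assuming Pearson correlation of the upper triangles as the statistic and a one-sided p-value; the full randomization (Monte Carlo) variant the abstract also mentions is not shown, and the toy matrices are invented.

```python
import random
from math import sqrt

def upper_triangle(m):
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mantel(m1, m2, permutations=999, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    rng = random.Random(seed)
    n = len(m1)
    x = upper_triangle(m1)
    observed = pearson(x, upper_triangle(m2))
    hits = 0
    order = list(range(n))
    for _ in range(permutations):
        rng.shuffle(order)
        # Relabel the objects of m2: rows and columns move together.
        permuted = [[m2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical data: six plots on a line, and a second matrix that is a
# linear function of the first, so the two patterns agree perfectly.
positions = [0, 1, 3, 6, 10, 15]
m1 = [[abs(a - b) for b in positions] for a in positions]
m2 = [[0 if i == j else 2 * m1[i][j] + 1 for j in range(6)] for i in range(6)]

r, p = mantel(m1, m2)
print(r, p)
```

Because distances within a matrix are not independent of one another, the p-value comes from relabelling objects rather than from a standard correlation test.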

Journal Article
TL;DR: To further differentiate Lb.casei.Zhang and ZL12-1 isolated from home-made koumiss in Inner Mongolia, the genes of 16S rDNA from them were amplified in vitro and sequenced and a phylogenetic dendrogram was constructed by comparing these two sequences with other 16S rDNA sequences of Lactobacillus.
Abstract: To further differentiate Lb.casei.Zhang and ZL12-1 isolated from home-made koumiss in Inner Mongolia, the genes of 16S rDNA from them were amplified in vitro and sequenced. Then a phylogenetic dendrogram was constructed by comparing these two sequences with other 16S rDNA sequences of Lactobacillus. Results indicated that the homology of Lb.casei.Zhang with Lb.casei ATCC 334T was 100%, and the homology of ZL12-1 with Lb.gallinarum ATCC 33199T was 98%. Based on the phylogenetic dendrogram and the partial 16S rDNA sequences, Lb.casei.Zhang was classified as Lb.casei subsp.casei, and ZL12-1 as Lb.gallinarum. This result was the same as that of the earlier traditional classification.

19 Dec 2005
TL;DR: The dendrogram showed a good correlation between the clustering of cherry cultivars and their geographic origin, especially revealing a stronger genetic proximity between some of the most characteristic cultivars of the Jerte Valley, which supports the autochthonous origin hypothesis for these cultivars.
Abstract: Random amplified polymorphic DNA (RAPD) analysis was performed on 38 cultivars of cherry (Prunus avium L.) grown in the Jerte Valley, Caceres, Spain. Thirty five selected decamer primers produced 69 reproducible polymorphic amplification products. The degree of polymorphism detected made possible the identification of all the cultivars by combining the RAPD banding patterns of only seven primers: OPK-08, OPQ-14, OPR-09, OPS-19, OPX-02, OPX-15 and OPZ-13. Eleven unique markers allowed identification of nine cultivars while 15 cultivars were identified by unique banding patterns. A similarity matrix derived from the RAPD amplification products generated by all the primers was obtained using the index of similarity of Jaccard. The similarity coefficients among cultivars ranged from 0.27 to 0.81 with an average of 0.50. A dendrogram based on UPGMA clustering method was constructed using the similarity matrix. The dendrogram showed a good correlation between the clustering of cherry cultivars and their geographic origin, especially revealing a stronger genetic proximity between some of the most characteristic cultivars of the Jerte Valley. This result supports the autochthonous origin hypothesis for these cultivars.


Journal ArticleDOI
01 Nov 2005-Genetica
TL;DR: A new circumscription of the Notata and Linearia groups is proposed here in order to provide a more accurate delimitation of these groups and contribute to the taxonomy of Paspalum.
Abstract: A taxonomic study of Paspalum L. was carried out using a genetic diversity approach. Thirty accessions representing twenty one different species from the Notata and Linearia groups of Paspalum were studied using restriction fragment length polymorphism analysis of the amplified ITS ribosomal DNA (rDNA) and from the psbA–trnH of the chloroplast genome (cpDNA). The combined analysis of the internal transcribed spacer (ITS) and the chloroplast spacer region between the psbA and trnH genes identified genetic polymorphisms. A distance analysis of the molecular data generated a dendrogram which showed the relationships of the two informal groups of Paspalum studied here. Although the distribution of species in the dendrogram was found to be roughly in agreement with previous works based on morphological and cytological data, the results obtained reveal the current artificiality in Paspalum taxonomy. Based on molecular data, a new circumscription of the Notata and Linearia groups is proposed here in order to provide a more accurate delimitation of these groups and contribute to the taxonomy of Paspalum. This study, although preliminary, reveals the potential utility of such a molecular approach for clarifying the taxonomy of closely related taxa.

Journal Article
Liu Wen-xuan
TL;DR: The genetic dendrogram among the materials constructed by the Neighbor-Joining method based on the Jaccard coefficient showed that the 11 native cultivars of Capsicum can be separated into two groups, one with an average genetic distance 0.150 and the other 0.134.
Abstract: DNA samples of 11 native cultivars of Capsicum were analyzed with 12 screened ISSR primers which produced polymorphisms. Out of a total of 66 amplified bands, 26 bands were divergent, accounting for 39.39%, and 7 of these 26 bands were cultivar-specific, that is 10.61%. The genetic dendrogram among the materials constructed by the Neighbor-Joining method based on the Jaccard coefficient showed that the 11 cultivars can be separated into two groups, one with an average genetic distance of 0.150 and the other 0.134. The average genetic distance between the two groups was 0.194. The result of principal component analysis (PCA) based on the amplified bands revealed that two components had 68.33% cumulative contribution to the total variance with individual contributions of 57.51% and 11.37%, respectively. According to the two principal factors, the 11 cultivars were classified into two groups, each having the same cultivars as the groups in the dendrogram.

Proceedings ArticleDOI
27 Dec 2005
TL;DR: This work presents a hierarchical clustering algorithm based on a completely different principle, which is the analysis of shared farthest neighbors, and presents experimental results on different data sets.
Abstract: Clustering algorithms in biomedical disciplines are usually selected between two main families, k-means and agglomerative hierarchical clustering. These methods are well studied and well established. However, both categories have some drawbacks related to data dimensionality (for partitional algorithms) and to the bottom-up structure (for hierarchical algorithms). To overcome these limitations, we present a hierarchical clustering algorithm based on a completely different principle, which is the analysis of shared farthest neighbors. The principle of operation and the rationale are illustrated, and experimental results on different data sets are presented.

Journal ArticleDOI
TL;DR: The conclusion was that the cluster groupings were not related to exposure to zinc at the sites of origin, and that the drive to generate distinct metal-tolerant populations may not occur in this species due to the existence of a constitutive tolerance to metals.

Journal Article
TL;DR: The result indicated that the position of the cultivars in the dendrogram varied but their affinities remained unchanged, and that genetic diversity was rich in Guizhou native cold-tolerant rice varieties belonging to the Japonica group.
Abstract: Guizhou native cold-tolerant rice cultivars and 4 control rice varieties were analyzed by SSR with 16 markers. The 16 markers produced 95 bands, of which 70 were polymorphic; the diversification ranged from 4 to 12, the average number of alleles per locus was 5.9, and the percentages of effective alleles and polymorphic loci were 75.8% and 38%, respectively. The genetic distances (GDs) were then used to construct a dendrogram by the Neighbor-Joining (NJ) method. The result showed that the 15 rice varieties could be divided into an Indica group and a Japonica group, the latter being classified into three subgroups. The genetic distances within the Japonica group and the Indica group were 0.06-0.53 and 0.297-0.377, respectively. In addition, the genetic distance among Guizhou cold-tolerant rice varieties was 0.21-0.46, and the genetic diversity was rich in Guizhou native cold-tolerant rice varieties belonging to the Japonica group. When SSR data and NJ were used to construct a dendrogram of the cultivars with stronger cold tolerance during different growth periods, the position of the cultivars in the dendrogram varied, but their affinities remained unchanged.

Journal ArticleDOI
TL;DR: Thirteen accessions of pearl millet collected from different states of India and eight wild species of the genus Pennisetum across the world were analyzed for genetic diversity using AFLP markers, revealing the extent of genetic diversity among them.
Abstract: Thirteen accessions of pearl millet (Pennisetum typhoides (L) Leeke) collected from different states of India and eight wild species of the genus Pennisetum across the world were analyzed for genetic diversity using AFLP markers. A combined analysis of eight primer combinations showed 35% polymorphism among P. typhoides accessions while analysis with five primer combinations showed 99% polymorphism among the wild species. The dendrogram constructed for the P. typhoides accessions based on the UPGMA method revealed two major clusters with samples from Gujarat forming a separate cluster from the rest of the samples. Principal component analysis of the same data set revealed similar results with the first principal component accounting for 65% of the total variation. The percentage of rare and common alleles contributing to the diversity in the sample was analyzed using the Shannon Weiner diversity index. The SW index revealed that the samples from Gujarat contributed significantly to the overall diversity among the accessions. Among accessions of each geographical region, considerable variation was revealed by SW index with samples from Tamil Nadu being most polymorphic. The genetic diversity in the accessions could be utilized for future breeding work. The dendrogram constructed for the wild species revealed the extent of genetic diversity among them. Analysis with one primer combination showed P. typhoides being closer to P. mollissimum than to the other analyzed species.
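
The abstract above uses the Shannon Weiner (SW) diversity index to weigh rare and common alleles. A minimal sketch of the index itself, H = -Σ p_i ln p_i, follows. The natural logarithm and the allele observations are assumptions, and how the authors derived frequencies from AFLP bands is not stated in the abstract.

```python
from collections import Counter
from math import log

def shannon_index(observations):
    """H = -sum(p_i * ln p_i) over the relative frequency of each category."""
    counts = Counter(observations)
    n = sum(counts.values())
    return -sum((c / n) * log(c / n) for c in counts.values())

# Hypothetical allele calls at one locus in three samples of accessions.
uniform = ["a", "a", "b", "b"]       # two equally common alleles
skewed = ["a", "a", "a", "b"]        # one common, one rare allele
monomorphic = ["a", "a", "a", "a"]   # no diversity

for sample in (uniform, skewed, monomorphic):
    print(round(shannon_index(sample), 4))
```

The index is highest when alleles are equally frequent and drops to zero when only one allele is present, which is why a region with many rare alleles can dominate the overall diversity.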

Patent
12 Sep 2005
TL;DR: A document correlation diagram drawing device includes extracting means (20, 30) for extracting content data and time data of document elements (E) each including one or more documents, dendrogram drawing means (50) for drawing a correlation between documents on the basis of the content data of the document elements.
Abstract: A document correlation diagram drawing device includes extracting means (20, 30) for extracting content data and time data of document elements (E) each including one or more documents, dendrogram drawing means (50) for drawing a dendrogram showing a correlation between documents on the basis of the content data of the document elements, clustering means (70) for cutting the dendrogram in accordance with a predetermined rule and extracting clusters, and intra-cluster arranging means (90) for determining an intra-cluster arrangement of the document elements belonging to each cluster on the basis of the time data of the document elements. Accordingly, a dendrogram adequately showing the chronological development in each field can be automatically drawn.

Journal Article
Ni Wei1
TL;DR: A novel agglomerative hierarchical clustering algorithm called WRPC is proposed in this paper, which can identify clusters with complex shapes and various sizes by introducing an influence-weight-based representative point selection mechanism and a k-nearest-neighbor-based cluster nesting mechanism.
Abstract: As an agglomerative hierarchical clustering algorithm, CURE was the first to represent clusters by selecting some "representative points". Through an analysis of the features of traditional hierarchical clustering algorithms, a novel agglomerative hierarchical clustering algorithm called WRPC is proposed in this paper. WRPC can identify clusters with complex shapes and various sizes by introducing an influence-weight-based representative point selection mechanism and a k-nearest-neighbor-based cluster nesting mechanism. Experimental results show that WRPC can provide better clustering results with high execution efficiency.
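For context, the representative-point idea that the abstract attributes to CURE can be sketched as below. This shows the commonly described CURE scheme (pick well-scattered points, then shrink them toward the centroid), not WRPC's influence-weight-based selection, whose details are not given here. The function name, parameter names, and sample points are assumptions.

```python
import math

def cure_representatives(cluster, n_reps=3, shrink=0.5):
    """Pick well-scattered points of a cluster and shrink them toward its centroid."""
    centroid = tuple(sum(col) / len(cluster) for col in zip(*cluster))
    chosen = []
    for _ in range(min(n_reps, len(cluster))):
        if not chosen:
            # First representative: the point farthest from the centroid.
            best = max(cluster, key=lambda p: math.dist(p, centroid))
        else:
            # Later ones: the point farthest from all representatives so far.
            best = max(cluster, key=lambda p: min(math.dist(p, q) for q in chosen))
        chosen.append(best)
    # Move each representative a fraction `shrink` of the way to the centroid.
    return [tuple(x + shrink * (c - x) for x, c in zip(p, centroid)) for p in chosen]

# Hypothetical cluster: four corners of a square plus its centre.
points = [(0, 0), (4, 0), (0, 4), (4, 4), (2, 2)]
print(cure_representatives(points, n_reps=2))  # [(1.0, 1.0), (3.0, 3.0)]
```

Shrinking pulls the representatives away from the cluster boundary, which reduces the influence of outliers when distances between clusters are measured through their representatives.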

Journal ArticleDOI
TL;DR: The empirical results reported here indicate that the entire clustering process can be systematically pursued using seed-based clustering, and that its performance is favorable compared to current approaches.
Abstract: Clustering methods have often been used to find biologically relevant groups of genes or conditions based on their expression levels. Since many functionally related genes tend to be coexpressed, by identifying groups of genes with similar expression profiles, the functions of unknown genes can be inferred from those of known genes in the same group. In this paper we present a novel clustering approach, called seed-based clustering, where seed genes are first systematically chosen by computational analysis of their expression profiles, and then the clusters are generated by using the seed genes as initial values for k-means clustering. The seed-based clustering method has strong mathematical foundations and requires only a few matrix computations for seed extraction. As a result, it provides stability of clustering results by eliminating randomness in the selection of initial values for cluster generation. The empirical results reported here indicate that the entire clustering process can be systematically pursued using seed-based clustering, and that its performance is favorable compared to current approaches.
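The second stage described above, k-means started from fixed seeds rather than random initial values, can be sketched as follows. The paper's seed extraction step is not described in the abstract, so the seeds here are simply passed in. The function name and the toy profiles are illustrative assumptions.

```python
def kmeans_from_seeds(points, seeds, max_iter=100):
    """Run Lloyd's k-means with the given seeds as initial centroids.

    Because nothing is drawn at random, repeated runs on the same
    input return the same labels and centroids.
    """
    centroids = [list(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical two-dimensional expression profiles and two seed profiles.
profiles = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans_from_seeds(profiles, seeds=[(0, 0), (10, 10)])
print(labels)     # [0, 0, 1, 1]
print(centroids)  # [[0.0, 0.5], [10.0, 10.5]]
```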

Journal ArticleDOI
01 Jan 2005
TL;DR: Populations of wild sunflower species were crossed with cms cultivated lines because of their high variability and the mean value differences in observed traits between parents were significant.
Abstract: Populations of wild sunflower species were crossed with cms cultivated lines because of their high variability. Variability was determined by measuring inflorescence diameter, ray flower number, and leaf length and width. The data were used for hierarchical cluster analysis in the SYSTAT 10 program, and the resulting dendrogram was used to interpret the divergence of the populations used. The modes of inheritance were tested by comparing 25 hybrid populations with their parents. Cluster analysis divided the plants into three groups. The first consisted of inbred lines of cultivated sunflower, annual wild species lay in the middle of the cluster tree, and the third group consisted of perennial wild species. The differences in mean values of the observed traits between parents were significant. All modes of inheritance were present in the F1 generation. Intermediate inheritance was the most frequent, followed by equal numbers of partially dominant and dominant modes, and a negative heterotic effect was scored in two hybrid combinations.

Journal ArticleDOI
01 May 2005
TL;DR: A segregating population (F2) derived from interspecific hybridization between G. tristis and G. gracilis was made for selection of a RAPD marker linked to a floral scent trait of the wild species G. gracilis.
Abstract: We have obtained information on relationships among wild gladiolus species through RAPD analysis (Takatsu et al., 2001b). Interspecific hybridization has been carried out based on a dendrogram, and F1 seeds have been derived from seven reciprocal crosses. However, mature seed was not obtained from the crosses using Gladiolus orchidiflorus as the female parent. We have made a segregating population (F2) derived from interspecific hybridization between G. tristis and G. gracilis for selection of a RAPD marker linked to a floral scent trait of the wild species G. gracilis. The floral scent of each F2 plant was assessed by sensory evaluation, and the segregation ratio of scented to non-scented plants fit a 1:3 ratio (χ²-value = 0.09).
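The fit to a 1:3 ratio reported above is a chi-square goodness-of-fit test. The abstract gives only the resulting statistic, not the plant counts, so the sketch below uses hypothetical counts purely to show the arithmetic. The function name is also an assumption.

```python
def chi_square_gof(observed, ratio):
    """Chi-square goodness-of-fit statistic for counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    # Expected counts split the total in proportion to the ratio.
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical F2 counts (scented, non-scented); not taken from the paper.
statistic = chi_square_gof([20, 80], ratio=[1, 3])
print(statistic)

# With two classes there is one degree of freedom; the 5% critical value is 3.841.
print("consistent with 1:3" if statistic < 3.841 else "deviates from 1:3")
```

A statistic well below the critical value, such as the 0.09 reported in the abstract, means the observed counts do not deviate significantly from the expected ratio.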