Institution
Technical University of Dortmund
Education • Dortmund, Nordrhein-Westfalen, Germany
About: Technical University of Dortmund is an education organization based in Dortmund, Nordrhein-Westfalen, Germany. It is known for research contributions in the topics: Context (language use) & Large Hadron Collider. The organization has 13028 authors who have published 27666 publications receiving 615557 citations. The organization is also known as: Dortmund University & University of Dortmund.
Topics: Context (language use), Large Hadron Collider, Computer science, Neutrino, Finite element method
Papers
••
TL;DR: In this article, a new k-means clustering algorithm for data streams of points from a Euclidean space is proposed, which computes a small weighted sample of the data stream and solves the problem on the sample using the k-means++ algorithm of Arthur and Vassilvitskii (SODA '07).
Abstract: We develop a new k-means clustering algorithm for data streams of points from a Euclidean space. We call this algorithm StreamKM++. Our algorithm computes a small weighted sample of the data stream and solves the problem on the sample using the k-means++ algorithm of Arthur and Vassilvitskii (SODA '07). To compute the small sample, we propose two new techniques. First, we use an adaptive, nonuniform sampling approach similar to the k-means++ seeding procedure to obtain small coresets from the data stream. This construction is rather easy to implement and, unlike other coreset constructions, its running time has only a small dependency on the dimensionality of the data. Second, we propose a new data structure, which we call coreset tree. The use of these coreset trees significantly speeds up the time necessary for the adaptive, nonuniform sampling during our coreset construction. We compare our algorithm experimentally with two well-known streaming implementations: BIRCH [Zhang et al. 1997] and StreamLS [Guha et al. 2003]. In terms of quality (sum of squared errors), our algorithm is comparable with StreamLS and significantly better than BIRCH (up to a factor of 2). Besides, BIRCH requires significant effort to tune its parameters. In terms of running time, our algorithm is slower than BIRCH. Comparing the running time with StreamLS, it turns out that our algorithm scales much better with increasing number of centers. We conclude that, if the first priority is the quality of the clustering, then our algorithm provides a good alternative to BIRCH and StreamLS, in particular, if the number of cluster centers is large. We also give a theoretical justification of our approach by proving that our sample set is a small coreset in low-dimensional spaces.
285 citations
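The adaptive, nonuniform sampling step mentioned in the abstract above can be illustrated with a minimal sketch of k-means++-style (D²) seeding on weighted points. This is not the authors' StreamKM++ implementation: it omits the coreset tree and the streaming logic entirely, and the function names (`sq_dist`, `d2_sample`, `weighted_sample`) as well as the choice to give each sampled point the total weight of the points nearest to it are assumptions made for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def d2_sample(points, weights, m, rng):
    """Pick up to m indices by adaptive, nonuniform (D^2) sampling.

    The first index is drawn proportionally to the point weights; each
    later index is drawn proportionally to weight times squared distance
    to the nearest point chosen so far (k-means++-style seeding).
    """
    n = len(points)
    first = rng.choices(range(n), weights=weights, k=1)[0]
    chosen = [first]
    # d2[i] holds the squared distance from point i to its nearest chosen point.
    d2 = [sq_dist(p, points[first]) for p in points]
    while len(chosen) < m:
        probs = [w * d for w, d in zip(weights, d2)]
        if sum(probs) == 0:
            break  # every remaining point coincides with a chosen one
        nxt = rng.choices(range(n), weights=probs, k=1)[0]
        chosen.append(nxt)
        d2 = [min(d, sq_dist(p, points[nxt])) for d, p in zip(d2, points)]
    return chosen


def weighted_sample(points, weights, m, seed=0):
    """Return (representatives, representative_weights).

    Assumption for this sketch: each representative receives the total
    weight of the input points that are nearest to it.
    """
    rng = random.Random(seed)
    idx = d2_sample(points, weights, m, rng)
    rep_weights = [0.0] * len(idx)
    for p, w in zip(points, weights):
        nearest = min(range(len(idx)), key=lambda t: sq_dist(p, points[idx[t]]))
        rep_weights[nearest] += w
    return [points[i] for i in idx], rep_weights


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
    reps, rep_w = weighted_sample(pts, [1.0] * len(pts), m=3, seed=1)
    print(reps, rep_w)
```

The weighted representatives returned here could then be handed to any weighted k-means solver. Each sampling round in this sketch scans every point, which is the cost that, according to the abstract, the coreset tree is designed to reduce.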
••
TL;DR: Using a novel NSCLC cohort together with a meta-analysis validation approach, a set of single genes with independent prognostic impact are identified and one of these genes, CADM1, was further established as an immunohistochemical marker with a potential application in clinical diagnostics.
Abstract: Purpose: Global gene expression profiling has been widely used in lung cancer research to identify clinically relevant molecular subtypes as well as to predict prognosis and therapy response. So fa ...
285 citations
••
TL;DR: In this paper, the authors prove existence and uniqueness of (Bayesian) equilibrium for a class of generally asymmetric all-pay auctions with incomplete information, and relate their uniqueness result to the well-known multiplicity of equilibria in the war of attrition (second-price all-pay auction), which emerges as a limit point of the class of two-player auction games considered.
284 citations
••
TL;DR: This Review is aimed at reviving myrcene as a renewable compound suitable for sustainable chemistry in the area of fine chemicals, and the versatility of the unsaturated C10-hydrocarbon myrcene, leading to products with several different areas of application, is pointed out.
Abstract: Currently, a shift towards chemical products derived from renewable, biological feedstocks is observed more and more. However, substantial differences with traditional feedstocks, such as their “hyperfunctionalization,” ethical problems caused by competition with foods, and problems with a constant qualitative/quantitative availability of the natural products, occasionally complicate the large-scale market entry of renewable resources. In this context the vast family of terpenes is often not taken into consideration, although the terpenes have been known for hundreds of years as components of essential oils obtained from leaves, flowers, and fruits of many plants. The simple acyclic monoterpenes, particularly the industrially available myrcene, provide a classical chemistry similar to unsaturated hydrocarbons already known from oil and gas. Hence, this Review is aimed at reviving myrcene as a renewable compound suitable for sustainable chemistry in the area of fine chemicals. The versatility of the unsaturated C10-hydrocarbon myrcene, leading to products with several different areas of application, is pointed out.
284 citations
••
TL;DR: A comprehensive review of existing kriging-based methods for the optimization of noisy functions is provided, and the three most intuitive criteria are found as poor alternatives.
Abstract: Responses of many real-world problems can only be evaluated perturbed by noise. In order to make an efficient optimization of these problems possible, intelligent optimization strategies successfully coping with noisy evaluations are required. In this article, a comprehensive review of existing kriging-based methods for the optimization of noisy functions is provided. In summary, ten methods for choosing the sequential samples are described using a unified formalism. They are compared on analytical benchmark problems, whereby the usual assumption of homoscedastic Gaussian noise made in the underlying models is met. Different problem configurations (noise level, maximum number of observations, initial number of observations) and setups (covariance functions, budget, initial sample size) are considered. It is found that the choices of the initial sample size and the covariance function are not critical. The choice of the method, however, can result in significant differences in the performance. In particular, the three most intuitive criteria are found as poor alternatives. Although no criterion is found consistently more efficient than the others, two specialized methods appear more robust on average.
284 citations
Authors
Name | H-index | Papers | Citations |
---|---|---|---|
Hermann Kolanoski | 145 | 1279 | 96152 |
Marc Besancon | 143 | 1799 | 106869 |
Kerstin Borras | 133 | 1341 | 92173 |
Emmerich Kneringer | 129 | 1021 | 80898 |
Achim Geiser | 129 | 1331 | 84136 |
Valerio Vercesi | 129 | 937 | 79519 |
Jens Weingarten | 128 | 896 | 74667 |
Giuseppe Mornacchi | 127 | 894 | 75830 |
Kevin Kroeninger | 126 | 836 | 70010 |
Daniel Muenstermann | 126 | 885 | 70855 |
Reiner Klingenberg | 126 | 733 | 70069 |
Claus Gössling | 126 | 775 | 71975 |
Diane Cinca | 126 | 822 | 70126 |
Frank Meier | 124 | 677 | 64889 |
Daniel Dobos | 124 | 679 | 67434 |