Institution

Technical University of Dortmund

Education
Dortmund, Nordrhein-Westfalen, Germany
About: Technical University of Dortmund is an education organization based in Dortmund, Nordrhein-Westfalen, Germany. It is known for research contributions on the topics Context (language use) and Large Hadron Collider. The organization has 13028 authors who have published 27666 publications receiving 615557 citations. The organization is also known as Dortmund University and University of Dortmund.


Papers
Journal Article
TL;DR: In this article, a new k-means clustering algorithm for data streams of points from a Euclidean space is proposed, which computes a small weighted sample of the data stream and solves the problem on the sample using the k-means++ algorithm of Arthur and Vassilvitskii (SODA '07).
Abstract: We develop a new k-means clustering algorithm for data streams of points from a Euclidean space. We call this algorithm StreamKM++. Our algorithm computes a small weighted sample of the data stream and solves the problem on the sample using the k-means++ algorithm of Arthur and Vassilvitskii (SODA '07). To compute the small sample, we propose two new techniques. First, we use an adaptive, nonuniform sampling approach similar to the k-means++ seeding procedure to obtain small coresets from the data stream. This construction is rather easy to implement and, unlike other coreset constructions, its running time has only a small dependency on the dimensionality of the data. Second, we propose a new data structure, which we call coreset tree. The use of these coreset trees significantly speeds up the time necessary for the adaptive, nonuniform sampling during our coreset construction. We compare our algorithm experimentally with two well-known streaming implementations: BIRCH [Zhang et al. 1997] and StreamLS [Guha et al. 2003]. In terms of quality (sum of squared errors), our algorithm is comparable with StreamLS and significantly better than BIRCH (up to a factor of 2). Besides, BIRCH requires significant effort to tune its parameters. In terms of running time, our algorithm is slower than BIRCH. Comparing the running time with StreamLS, it turns out that our algorithm scales much better with an increasing number of centers. We conclude that, if the first priority is the quality of the clustering, then our algorithm provides a good alternative to BIRCH and StreamLS, in particular, if the number of cluster centers is large. We also give a theoretical justification of our approach by proving that our sample set is a small coreset in low-dimensional spaces.
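The adaptive, nonuniform sampling mentioned in the abstract can be illustrated with a minimal sketch of k-means++-style seeding (often called D² sampling) that returns a small weighted sample. This is not the StreamKM++ implementation: it does a plain linear scan instead of using a coreset tree, it works on an in-memory list rather than a stream, and the names `sq_dist` and `d2_sample` are hypothetical helpers chosen for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def d2_sample(points, m, rng=None):
    """Pick up to m points by D^2 sampling and return (point, weight) pairs.

    Each new point is drawn with probability proportional to its squared
    distance to the nearest point already chosen. The weight of a chosen
    point is the number of input points that are closest to it.
    """
    if not 1 <= m <= len(points):
        raise ValueError("m must be between 1 and len(points)")
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch reproducible

    chosen = [rng.randrange(len(points))]  # first point: uniform at random
    d2 = [sq_dist(p, points[chosen[0]]) for p in points]

    while len(chosen) < m:
        total = sum(d2)
        if total == 0:  # every point coincides with a chosen one; stop early
            break
        r = rng.random() * total
        acc = 0.0
        # Fallback guards against floating-point round-off in the scan below.
        idx = max(range(len(points)), key=lambda i: d2[i])
        for i, d in enumerate(d2):
            acc += d
            if d > 0 and acc >= r:
                idx = i
                break
        chosen.append(idx)
        # Update each point's squared distance to its nearest chosen point.
        d2 = [min(old, sq_dist(p, points[idx])) for old, p in zip(d2, points)]

    # Assign every input point to its nearest chosen point to get the weights.
    weights = [0] * len(chosen)
    for p in points:
        nearest = min(range(len(chosen)), key=lambda j: sq_dist(p, points[chosen[j]]))
        weights[nearest] += 1
    return [(points[i], w) for i, w in zip(chosen, weights)]
```

According to the abstract, the resulting weighted sample is then clustered with k-means++, and the coreset tree exists to speed up exactly the sampling step that this sketch performs by linear scan.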

285 citations

Journal Article
TL;DR: Using a novel NSCLC cohort together with a meta-analysis validation approach, a set of single genes with independent prognostic impact are identified and one of these genes, CADM1, was further established as an immunohistochemical marker with a potential application in clinical diagnostics.
Abstract: Purpose: Global gene expression profiling has been widely used in lung cancer research to identify clinically relevant molecular subtypes as well as to predict prognosis and therapy response. So fa ...

285 citations

Journal Article
TL;DR: In this paper, the authors prove existence and uniqueness of (Bayesian) equilibrium for a class of generally asymmetric all-pay auctions with incomplete information, and relate their uniqueness result to the well-known multiplicity of equilibria in the war of attrition (second-price all-pay auction), which emerges as a limit point of the class of two-player auction games considered.

284 citations

Journal Article
TL;DR: This Review is aimed at reviving myrcene as a renewable compound suitable for sustainable chemistry in the area of fine chemicals, and the versatility of the unsaturated C10-hydrocarbon myrcene, leading to products with several different areas of application, is pointed out.
Abstract: Currently, a shift towards chemical products derived from renewable, biological feedstocks is observed more and more. However, substantial differences with traditional feedstocks, such as their “hyperfunctionalization,” ethical problems caused by competition with foods, and problems with a constant qualitative/quantitative availability of the natural products, occasionally complicate the large-scale market entry of renewable resources. In this context the vast family of terpenes is often not taken into consideration, although the terpenes have been known for hundreds of years as components of essential oils obtained from leaves, flowers, and fruits of many plants. The simple acyclic monoterpenes, particularly the industrially available myrcene, provide a classical chemistry similar to unsaturated hydrocarbons already known from oil and gas. Hence, this Review is aimed at reviving myrcene as a renewable compound suitable for sustainable chemistry in the area of fine chemicals. The versatility of the unsaturated C10-hydrocarbon myrcene, leading to products with several different areas of application, is pointed out.

284 citations

Journal Article
TL;DR: A comprehensive review of existing kriging-based methods for the optimization of noisy functions is provided, and the three most intuitive criteria are found to be poor alternatives.
Abstract: Responses of many real-world problems can only be evaluated perturbed by noise. In order to make an efficient optimization of these problems possible, intelligent optimization strategies successfully coping with noisy evaluations are required. In this article, a comprehensive review of existing kriging-based methods for the optimization of noisy functions is provided. In summary, ten methods for choosing the sequential samples are described using a unified formalism. They are compared on analytical benchmark problems, whereby the usual assumption of homoscedastic Gaussian noise made in the underlying models is met. Different problem configurations (noise level, maximum number of observations, initial number of observations) and setups (covariance functions, budget, initial sample size) are considered. It is found that the choices of the initial sample size and the covariance function are not critical. The choice of the method, however, can result in significant differences in the performance. In particular, the three most intuitive criteria are found to be poor alternatives. Although no criterion is found consistently more efficient than the others, two specialized methods appear more robust on average.

284 citations


Authors

Showing all 13240 results

Name                  H-index  Papers  Citations
Hermann Kolanoski     145      1279    96152
Marc Besancon         143      1799    106869
Kerstin Borras        133      1341    92173
Emmerich Kneringer    129      1021    80898
Achim Geiser          129      1331    84136
Valerio Vercesi       129      937     79519
Jens Weingarten       128      896     74667
Giuseppe Mornacchi    127      894     75830
Kevin Kroeninger      126      836     70010
Daniel Muenstermann   126      885     70855
Reiner Klingenberg    126      733     70069
Claus Gössling        126      775     71975
Diane Cinca           126      822     70126
Frank Meier           124      677     64889
Daniel Dobos          124      679     67434
Network Information
Related Institutions (5)
RWTH Aachen University
96.2K papers, 2.5M citations

93% related

University of Erlangen-Nuremberg
85.6K papers, 2.6M citations

92% related

Technische Universität München
123.4K papers, 4M citations

91% related

ETH Zurich
122.4K papers, 5.1M citations

90% related

École Polytechnique Fédérale de Lausanne
98.2K papers, 4.3M citations

89% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2023  131
2022  306
2021  1,694
2020  1,773
2019  1,653
2018  1,579