Author

Richard Weber

Bio: Richard Weber is an academic researcher from the University of Chile. The author has contributed to research on the topics Support vector machine and Feature selection. The author has an h-index of 31 and has co-authored 109 publications receiving 4531 citations. Previous affiliations of Richard Weber include Akamai Technologies and RWTH Aachen University.


Papers
Proceedings Article
16 Aug 2009
TL;DR: The variation due to fluctuating electricity prices is characterized and it is argued that existing distributed systems should be able to exploit this variation for significant economic gains.
Abstract: Energy expenses are becoming an increasingly important fraction of data center operating costs. At the same time, the energy expense per unit of computation can vary significantly between two different locations. In this paper, we characterize the variation due to fluctuating electricity prices and argue that existing distributed systems should be able to exploit this variation for significant economic gains. Electricity prices exhibit both temporal and geographic variation, due to regional demand differences, transmission inefficiencies, and generation diversity. Starting with historical electricity prices, for twenty nine locations in the US, and network traffic data collected on Akamai's CDN, we use simulation to quantify the possible economic gains for a realistic workload. Our results imply that existing systems may be able to save millions of dollars a year in electricity costs, by being cognizant of locational computation cost differences.
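The idea of exploiting locational price differences can be illustrated with a toy comparison. This is a minimal sketch, not the paper's simulator: the function name `electricity_cost`, the two locations, the prices, and the load figures are all hypothetical, and the sketch ignores capacity limits, bandwidth costs, and latency constraints that a realistic study would have to model.

```python
from typing import Dict, List


def electricity_cost(
    prices: Dict[str, List[float]],  # location -> hourly price in $/kWh (hypothetical)
    load_kwh: List[float],           # total energy demand per hour (hypothetical)
    price_aware: bool,
) -> float:
    """Total cost of serving the load under one of two routing policies."""
    locations = sorted(prices)
    total = 0.0
    for hour, load in enumerate(load_kwh):
        hourly = [prices[loc][hour] for loc in locations]
        if price_aware:
            # Send all work to the cheapest location in this hour.
            total += load * min(hourly)
        else:
            # Baseline: split the work evenly across locations.
            total += load * sum(hourly) / len(hourly)
    return total


# Illustrative numbers only; they are not taken from the paper.
toy_prices = {"east": [0.10, 0.20], "west": [0.15, 0.05]}
toy_load = [100.0, 100.0]

baseline = electricity_cost(toy_prices, toy_load, price_aware=False)
aware = electricity_cost(toy_prices, toy_load, price_aware=True)
print(f"baseline: ${baseline:.2f}, price-aware: ${aware:.2f}")
```

The gap between the two totals is the kind of quantity such a simulation would estimate, here under deliberately unrealistic assumptions.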

896 citations

Journal Article
TL;DR: A novel wrapper algorithm for feature selection using support vector machines with kernel functions, based on sequential backward selection; the number of errors on a validation subset is the measure that decides which feature to remove in each iteration.
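The backward-selection loop summarized in this TL;DR can be sketched as follows. This is a hedged illustration, not the authors' implementation: `backward_selection` and `count_errors` are hypothetical names, the toy data is invented, and a 1-nearest-neighbour rule stands in for the kernel SVM so that the example stays dependency-free. In the paper's setting, `count_errors` would train an SVM on the candidate feature subset and count its errors on the validation subset.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (feature vector, class label)


def backward_selection(
    features: Sequence[int],
    count_errors: Callable[[List[int]], int],
    n_keep: int,
) -> List[int]:
    """Greedy sequential backward selection.

    Each iteration removes the feature whose removal leaves the fewest
    validation errors, until n_keep features remain.
    """
    selected = list(features)
    while len(selected) > n_keep:
        best_feature, best_errors = None, None
        for f in selected:
            remaining = [g for g in selected if g != f]
            errors = count_errors(remaining)
            if best_errors is None or errors < best_errors:
                best_feature, best_errors = f, errors
        selected.remove(best_feature)
    return selected


# Invented toy data: feature 0 separates the classes, 1 and 2 are noise.
train: List[Sample] = [
    ([0.0, 5.0, 1.0], 0), ([0.1, 1.0, 9.0], 0),
    ([1.0, 5.2, 1.1], 1), ([0.9, 0.8, 9.2], 1),
]
valid: List[Sample] = [
    ([0.05, 1.1, 1.0], 0), ([0.95, 5.1, 9.0], 1),
    ([0.2, 5.0, 8.9], 0), ([0.8, 1.0, 1.2], 1),
]


def count_errors(subset: List[int]) -> int:
    """Validation errors of a 1-NN stand-in classifier (NOT an SVM)."""
    def dist(a: List[float], b: List[float]) -> float:
        return sum((a[i] - b[i]) ** 2 for i in subset)

    errors = 0
    for x, label in valid:
        _, predicted = min(train, key=lambda s: dist(x, s[0]))
        errors += predicted != label
    return errors


kept = backward_selection([0, 1, 2], count_errors, n_keep=1)
print(kept)
```

Because the error measure is passed in as a callable, swapping the stand-in for a real classifier only requires replacing `count_errors`.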

407 citations

Journal Article
01 Jan 1997
TL;DR: This paper first provides an overview of data preprocessing, focusing on problems of real-world data, and then details data preprocessing techniques that address each of the above-mentioned objectives.
Abstract: This paper first provides an overview of data preprocessing, focusing on problems of real-world data. These are primarily problems that have to be carefully understood and solved before any data analysis process can start. The paper discusses in detail two main reasons for performing data preprocessing: (i) problems with the data and (ii) preparation for data analysis. The paper continues with details of data preprocessing techniques achieving each of the above-mentioned objectives. A total of 14 techniques are discussed. Two examples of data preprocessing applications from two of the most data-rich domains are given at the end. The applications are related to the semiconductor manufacturing and aerospace domains, where large amounts of fairly reliable data are available. Future directions and some challenges are discussed at the end.

304 citations

Journal Article
01 Jan 2007
TL;DR: A hybrid intelligent system combining Autoregressive Integrated Moving Average models and neural networks for demand forecasting is presented and a replenishment system for a Chilean supermarket is proposed, which leads simultaneously to fewer sales failures and lower inventory levels than the previous solution.
Abstract: Demand forecasts play a crucial role for supply chain management. The future demand for a certain product is the basis for the respective replenishment systems. Several forecasting techniques have been developed, each one with its particular advantages and disadvantages compared to other approaches. This motivates the development of hybrid systems combining different techniques and their respective strengths. In this paper, we present a hybrid intelligent system combining Autoregressive Integrated Moving Average (ARIMA) models and neural networks for demand forecasting. We show improvements in forecasting accuracy and propose a replenishment system for a Chilean supermarket, which leads simultaneously to fewer sales failures and lower inventory levels than the previous solution.

290 citations

Journal Article
TL;DR: An embedded method that simultaneously selects relevant features during classifier construction by penalizing each feature's use in the dual formulation of support vector machines (SVM) called kernel-penalized SVM (KP-SVM).

238 citations


Cited by
01 Jan 2002

9,314 citations

Book
01 Jan 2009

8,216 citations

01 Jan 2012

3,692 citations

Journal Article
TL;DR: An in-depth review of rare event detection from an imbalanced learning perspective and a comprehensive taxonomy of the existing application domains of imbalanced learning are provided.
Abstract: 527 articles related to imbalanced data and rare events are reviewed. Viewing reviewed papers from both technical and practical perspectives. Summarizing existing methods and corresponding statistics by a new taxonomy idea. Categorizing 162 application papers into 13 domains and giving introduction. Some open questions are discussed at the end of this manuscript. Rare events, especially those that could potentially negatively impact society, often require human decision-making responses. Detecting rare events can be viewed as a prediction task in the data mining and machine learning communities. As these events are rarely observed in daily life, the prediction task suffers from a lack of balanced data. In this paper, we provide an in-depth review of rare event detection from an imbalanced learning perspective. Five hundred and seventeen related papers that have been published in the past decade were collected for the study. The initial statistics suggested that rare event detection and imbalanced learning are of concern across a wide range of research areas, from management science to engineering. We reviewed all collected papers from both a technical and a practical point of view. Modeling methods discussed include techniques such as data preprocessing, classification algorithms and model evaluation. For applications, we first provide a comprehensive taxonomy of the existing application domains of imbalanced learning, and then we detail the applications for each category. Finally, some suggestions from the reviewed papers are incorporated with our experiences and judgments to offer further research directions for the imbalanced learning and rare event detection fields.

1,448 citations

Journal Article
TL;DR: An elegant and remarkably simple algorithm ("the threshold algorithm", or TA) is analyzed that is optimal in a much stronger sense than FA, and is essentially optimal, not just for some monotone aggregation functions, but for all of them, and not just in a high-probability worst-case sense, but over every database.
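For readers unfamiliar with the threshold algorithm, the sketch below shows its usual textbook form for top-k queries. It is an illustration under stated assumptions, not code from the paper: the function name `threshold_algorithm`, the toy grades, and the choice of `min` as the monotone aggregation function are hypothetical, and every object is assumed to have a grade in every list.

```python
import heapq
from typing import Callable, Dict, List, Tuple


def threshold_algorithm(
    grades: List[Dict[str, float]],            # one dict per attribute: object -> grade
    aggregate: Callable[[List[float]], float],  # monotone aggregation function
    k: int,
) -> List[Tuple[str, float]]:
    """Return the top-k (object, score) pairs, best first."""
    # Sorted access: each list ordered by descending grade.
    sorted_lists = [sorted(g.items(), key=lambda kv: -kv[1]) for g in grades]
    top: List[Tuple[float, str]] = []  # min-heap holding the current best k
    seen = set()
    for depth in range(len(sorted_lists[0])):
        last_grades = []
        for lst in sorted_lists:
            obj, grade = lst[depth]
            last_grades.append(grade)
            if obj not in seen:
                seen.add(obj)
                # Random access: fetch the object's grade in every list.
                score = aggregate([g[obj] for g in grades])
                heapq.heappush(top, (score, obj))
                if len(top) > k:
                    heapq.heappop(top)
        # No unseen object can score above the aggregate of the last
        # grades read under sorted access.
        threshold = aggregate(last_grades)
        if len(top) == k and top[0][0] >= threshold:
            break
    return [(obj, score) for score, obj in sorted(top, reverse=True)]


# Invented toy grades for two attributes.
toy_grades = [
    {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1},
    {"a": 0.7, "b": 0.9, "c": 0.4, "d": 0.2},
]
best = threshold_algorithm(toy_grades, min, k=1)
print(best)
```

On this toy input the loop halts after reading two of the four rows, because the best score seen so far already meets the threshold.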

1,315 citations