Author

Daniel Chaver

Bio: Daniel Chaver is an academic researcher from Complutense University of Madrid. The author has contributed to research in topics: Memory hierarchy & Cache. The author has an h-index of 12 and has co-authored 43 publications receiving 451 citations.

Papers
Proceedings ArticleDOI
25 Aug 2003
TL;DR: This work proposes a methodology to reduce the energy consumption of the branch predictor by characterizing prediction demand through profiling and dynamically adjusting predictor resources accordingly: components of the hybrid direction predictor are disabled and the branch target buffer is resized.
Abstract: High-end processors typically incorporate complex branch predictors consisting of many large structures that together consume a notable fraction of total chip power (more than 10% in some cases). Depending on the applications, some of these resources may remain underused for long periods of time. We propose a methodology to reduce the energy consumption of the branch predictor by characterizing prediction demand using profiling and dynamically adjusting predictor resources accordingly. Specifically, we disable components of the hybrid direction predictor and resize the branch target buffer. Detailed simulations show that this approach reduces the energy consumption in the branch predictor by an average of 72% and up to 89% with virtually no impact on prediction accuracy and performance.
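As a rough illustration of the general idea (a minimal C sketch, not the paper's actual mechanism; the profile structure, field names and thresholds are all assumptions), a profiled demand summary could drive which hybrid components remain enabled and how many BTB entries are provisioned:

/* Minimal sketch: choosing a branch predictor configuration from profiled
 * demand.  All structures, field names and thresholds are illustrative
 * assumptions, not the paper's actual algorithm. */
#include <stdio.h>

struct profile_stats {              /* hypothetical per-application profile */
    double local_pred_usefulness;   /* fraction of predictions provided by the local component */
    double global_pred_usefulness;  /* fraction provided by the global component */
    int    distinct_branch_targets; /* how many BTB entries the application really needs */
};

struct predictor_config {
    int enable_local;    /* power-gate the local component when 0 */
    int enable_global;   /* power-gate the global component when 0 */
    int btb_entries;     /* resized branch target buffer capacity */
};

static struct predictor_config tune_predictor(const struct profile_stats *p)
{
    struct predictor_config c;
    /* Disable a hybrid component whose contribution is marginal. */
    c.enable_local  = p->local_pred_usefulness  > 0.05;
    c.enable_global = p->global_pred_usefulness > 0.05;

    /* Resize the BTB to the smallest power of two covering the demand. */
    c.btb_entries = 256;
    while (c.btb_entries < p->distinct_branch_targets && c.btb_entries < 4096)
        c.btb_entries *= 2;
    return c;
}

int main(void)
{
    struct profile_stats p = { 0.03, 0.61, 900 };   /* example profile */
    struct predictor_config c = tune_predictor(&p);
    printf("local=%d global=%d btb=%d\n",
           c.enable_local, c.enable_global, c.btb_entries);
    return 0;
}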

47 citations

Journal Article
TL;DR: In this work, locality has been significantly improved by means of a novel approach called pipelined computation, which complements previous techniques based on loop tiling and non-linear layouts.
Abstract: This paper addresses the implementation of a 2-D Discrete Wavelet Transform on general-purpose microprocessors, focusing on both memory hierarchy and SIMD parallelization issues. The two topics are related, since SIMD extensions are only useful if the memory hierarchy is efficiently exploited. In this work, locality has been significantly improved by means of a novel approach called pipelined computation, which complements previous techniques based on loop tiling and non-linear layouts. As experimental platforms we have employed a Pentium-III (P-III) and a Pentium-4 (P-4) microprocessor. However, our SIMD-oriented tuning has been performed exclusively at the source-code level: we have reordered some loops and introduced modifications that allow automatic vectorization. Given the abstraction level at which the optimizations are carried out, the speedups obtained on the investigated platforms are quite satisfactory, although further improvement could be obtained by dropping to a lower level of abstraction (compiler intrinsics or assembly code).
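To make this kind of source-level transformation concrete, the following toy C sketch (an assumption-laden illustration, not the paper's code) writes a horizontal 5/3 lifting pass with unit-stride inner loops, the style of code that compilers can vectorize automatically:

/* Illustrative sketch only: a horizontal 5/3 lifting pass written with
 * unit-stride inner loops so the compiler can auto-vectorize it.
 * Array layout and boundary handling are simplified assumptions. */
#include <stdio.h>

#define W 16   /* row width (even); toy size for the example */

static void lifting_53_row(float *row)
{
    float even[W / 2], odd[W / 2];
    int i;

    /* Split into even/odd samples (unit-stride reads). */
    for (i = 0; i < W / 2; i++) {
        even[i] = row[2 * i];
        odd[i]  = row[2 * i + 1];
    }
    /* Predict step: high-pass coefficients. */
    for (i = 0; i < W / 2 - 1; i++)
        odd[i] -= 0.5f * (even[i] + even[i + 1]);
    odd[W / 2 - 1] -= even[W / 2 - 1];          /* symmetric border (simplified) */

    /* Update step: low-pass coefficients. */
    even[0] += 0.5f * odd[0];                    /* symmetric border */
    for (i = 1; i < W / 2; i++)
        even[i] += 0.25f * (odd[i - 1] + odd[i]);

    /* Store sub-bands contiguously: low-pass first, then high-pass. */
    for (i = 0; i < W / 2; i++) {
        row[i]         = even[i];
        row[W / 2 + i] = odd[i];
    }
}

int main(void)
{
    float row[W];
    for (int i = 0; i < W; i++) row[i] = (float)i;
    lifting_53_row(row);
    for (int i = 0; i < W; i++) printf("%.2f ", row[i]);
    printf("\n");
    return 0;
}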

36 citations

Proceedings ArticleDOI
18 Mar 2013
TL;DR: This work presents a behavior analysis of conventional cache replacement policies in terms of the number of writes to main memory, and proposes new last-level cache (LLC) replacement algorithms aimed at reducing the number of writes to PCM and hence increasing its lifetime, without significantly degrading system performance.
Abstract: Phase Change Memory (PCM) is currently postulated as the best alternative for replacing Dynamic Random Access Memory (DRAM) as the technology used for implementing main memories, thanks to significant advantages such as good scalability and low leakage. However, PCM also presents some drawbacks compared to DRAM, such as its lower endurance. This work presents a behavior analysis of conventional cache replacement policies in terms of the number of writes to main memory. In addition, new last-level cache (LLC) replacement algorithms are proposed, aimed at reducing the number of writes to PCM and hence increasing its lifetime, without significantly degrading system performance.
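The following C sketch illustrates one simple write-aware heuristic in the same spirit (not necessarily the algorithms proposed in the paper; the data structures are assumptions): among the lines of a set, prefer evicting the least-recently-used clean line, so fewer dirty evictions, and hence fewer PCM writes, occur:

/* Sketch of a write-aware victim selection heuristic.  A dirty eviction
 * forces a write-back to PCM, so clean victims are preferred and LRU is
 * only the fallback.  All data structures are illustrative assumptions. */
#include <stdio.h>

#define WAYS 8

struct cache_line {
    int valid;
    int dirty;      /* dirty lines cause a write to PCM on eviction */
    unsigned lru;   /* higher value = more recently used */
};

/* Return the way to evict from one set. */
static int choose_victim(const struct cache_line set[WAYS])
{
    int victim = -1, clean_victim = -1;
    unsigned best = ~0u, best_clean = ~0u;

    for (int w = 0; w < WAYS; w++) {
        if (!set[w].valid)
            return w;                       /* free way: no eviction needed */
        if (set[w].lru < best) { best = set[w].lru; victim = w; }
        if (!set[w].dirty && set[w].lru < best_clean) {
            best_clean = set[w].lru;
            clean_victim = w;
        }
    }
    /* Prefer a clean victim when one exists; otherwise fall back to LRU. */
    return clean_victim >= 0 ? clean_victim : victim;
}

int main(void)
{
    struct cache_line set[WAYS] = {
        {1, 1, 3}, {1, 0, 7}, {1, 1, 1}, {1, 0, 5},
        {1, 1, 6}, {1, 1, 2}, {1, 1, 4}, {1, 1, 0},
    };
    /* Prints way 3: the LRU clean line wins over the globally-LRU dirty line. */
    printf("evict way %d\n", choose_victim(set));
    return 0;
}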

33 citations

Proceedings ArticleDOI
15 Apr 2002
TL;DR: This paper discusses several issues relevant to the parallel implementation of a 2-D discrete wavelet transform (DWT) on general-purpose multiprocessors, paying particular attention to memory hierarchy exploitation.
Abstract: In this paper we discuss several issues relevant to the parallel implementation of a 2-D discrete wavelet transform (DWT) on general-purpose multiprocessors. Our interest in this transform is motivated by its use in an image fusion application that has to manage large image sizes, which makes parallel computing highly advisable. We have also paid particular attention to memory hierarchy exploitation, since it has a tremendous impact on performance due to the lack of spatial locality when the DWT processes image columns.
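As an assumption-based illustration of both points (not the paper's code), the sketch below parallelizes a vertical filtering pass over rows and keeps the innermost loop walking along a row, so the column-direction computation still enjoys unit-stride accesses:

/* Illustrative sketch: a vertical (column-direction) filtering pass
 * parallelized over rows, with the innermost loop traversing a row.
 * Consecutive accesses are then unit-stride, recovering the spatial
 * locality that a naive column-by-column traversal loses.  The 3-tap
 * filter and the image size are toy assumptions. */
#include <stdio.h>

#define ROWS 64
#define COLS 64

static float in[ROWS][COLS], out[ROWS][COLS];

static void vertical_filter(void)
{
    /* Rows can be split among processors; compile with -fopenmp to enable. */
    #pragma omp parallel for
    for (int r = 1; r < ROWS - 1; r++)
        for (int c = 0; c < COLS; c++)   /* unit stride: full cache-line reuse */
            out[r][c] = 0.25f * in[r - 1][c]
                      + 0.50f * in[r][c]
                      + 0.25f * in[r + 1][c];
}

int main(void)
{
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            in[r][c] = (float)r;
    vertical_filter();
    printf("out[10][5] = %.2f\n", out[10][5]);   /* 0.25*9 + 0.5*10 + 0.25*11 = 10 */
    return 0;
}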

30 citations

Proceedings ArticleDOI
22 Apr 2003
TL;DR: This paper addresses the vectorization of the lifting-based wavelet transform on general-purpose microprocessors in the context of JPEG2000, avoiding assembly-language programming in order to improve code portability and reduce development cost.
Abstract: This paper addresses the vectorization of the lifting-based wavelet transform on general-purpose microprocessors in the context of JPEG2000. Since SIMD exploitation strongly depends on efficient memory hierarchy usage, this research builds on previous work on cache-conscious DWT implementations. The experimental platform chosen to study the benefits of the SIMD extensions is an Intel Pentium-4 (P-4) based PC. However, unlike other authors, we have performed the vectorization while avoiding assembly-language programming, in order to improve code portability and reduce development cost.
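A hedged C sketch of the style of code this implies (not the paper's implementation; sizes and border handling are toy assumptions): a vertical 5/3 predict step whose innermost loop runs across columns with unit stride, the kind of loop an auto-vectorizer can map to SSE without intrinsics or assembly:

/* Sketch of a vertical lifting predict step applied to several columns at
 * once.  Keeping the column index in the innermost loop gives unit-stride
 * accesses that compilers can turn into SIMD code automatically, which is
 * the kind of source-level vectorization described above.  Not the
 * paper's code; sizes and borders are simplified assumptions. */
#include <stdio.h>

#define ROWS 8           /* must be even for this toy example */
#define COLS 8

/* One 5/3 predict step along columns: odd rows become high-pass values. */
static void vertical_predict(float img[ROWS][COLS])
{
    for (int r = 1; r < ROWS - 1; r += 2)
        for (int c = 0; c < COLS; c++)          /* unit stride: vectorizable */
            img[r][c] -= 0.5f * (img[r - 1][c] + img[r + 1][c]);
    /* Symmetric border for the last odd row. */
    for (int c = 0; c < COLS; c++)
        img[ROWS - 1][c] -= img[ROWS - 2][c];
}

int main(void)
{
    float img[ROWS][COLS];
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            img[r][c] = (float)r;
    vertical_predict(img);
    printf("img[3][0] = %.2f\n", img[3][0]);    /* 3 - 0.5*(2+4) = 0 */
    return 0;
}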

30 citations


Cited by
01 Jan 2016
Digital Design and Computer Architecture

246 citations

Patent
Navendu Jain, Ishai Menache
27 Jun 2011
TL;DR: A system is proposed for managing the allocation of resources based on service level agreements between application owners and cloud operators; under some agreements, the cloud operator may have responsibility for managing resource allocation to the software application and may manage it such that the application executes within an agreed performance level.
Abstract: A system for managing allocation of resources based on service level agreements between application owners and cloud operators. Under some service level agreements, the cloud operator may have responsibility for managing the allocation of resources to the software application and may manage the allocation such that the software application executes within an agreed performance level. Operating a cloud computing platform according to such a service level agreement may relieve application owners of the complexities of managing resource allocation and may give cloud operators greater flexibility in managing their cloud computing platforms.

199 citations

Journal ArticleDOI
TL;DR: The aim of this survey is to enable engineers and researchers to get insights into the techniques for improving cache power efficiency and motivate them to invent novel solutions for enabling low-power operation of caches.

125 citations

Journal ArticleDOI
TL;DR: This study indicates that FBS outperforms LS in current-generation GPUs, and design trends suggest higher gains in future-generation GPUs.
Abstract: The widespread usage of the discrete wavelet transform (DWT) has motivated the development of fast DWT algorithms and their tuning on all sorts of computer systems. Several studies have compared the performance of the most popular schemes, known as filter bank scheme (FBS) and lifting scheme (LS), and have always concluded that LS is the most efficient option. However, there is no such study on streaming processors such as modern Graphics Processing Units (GPUs). Current trends have transformed these devices into powerful stream processors with enough flexibility to perform intensive and complex floating-point calculations. The opportunities opened up by these platforms, as well as the growing popularity of the DWT within the computer graphics field, make a new performance comparison of great practical interest. Our study indicates that FBS outperforms LS in current-generation GPUs. In our experiments, the actual FBS gains range between 10 percent and 140 percent, depending on the problem size and the type and length of the wavelet filter. Moreover, design trends suggest higher gains in future-generation GPUs.
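A toy 1-D C sketch of the structural difference between the two schemes (illustrative filter values, not the paper's GPU kernels): FBS computes every output independently from a read-only input, which maps naturally onto stream processors, while LS needs fewer operations but updates the signal in place, creating dependences between steps:

/* Toy 1-D contrast between the filter bank scheme (FBS) and the lifting
 * scheme (LS).  Filter values and sizes are illustrative assumptions. */
#include <stdio.h>

#define N 16

/* Filter bank scheme: low-pass outputs by explicit convolution.
 * Every output depends only on the read-only input array. */
static void fbs_lowpass(const float *in, float *out)
{
    const float h[3] = { 0.25f, 0.5f, 0.25f };   /* toy analysis filter */
    for (int i = 2; i < N - 1; i += 2)           /* each output independent */
        out[i / 2] = h[0] * in[i - 1] + h[1] * in[i] + h[2] * in[i + 1];
}

/* Lifting scheme: predict then update, in place (sequential dependences). */
static void ls_lowpass(float *x)
{
    for (int i = 1; i < N - 1; i += 2)           /* predict (high-pass) */
        x[i] -= 0.5f * (x[i - 1] + x[i + 1]);
    for (int i = 2; i < N; i += 2)               /* update (low-pass) */
        x[i] += 0.25f * (x[i - 1] + x[i + 1]);
}

int main(void)
{
    float a[N], b[N], low[N / 2] = {0};
    for (int i = 0; i < N; i++) a[i] = b[i] = (float)i;
    fbs_lowpass(a, low);
    ls_lowpass(b);
    printf("FBS low[2]=%.2f  LS x[4]=%.2f\n", low[2], b[4]);
    return 0;
}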

116 citations

Patent
Navendu Jain
13 May 2010
TL;DR: An optimization framework for hosting sites is proposed that dynamically places application instances across multiple sites based on factors such as the energy cost and availability of energy at each site, application SLAs (service level agreements), and the cost of network bandwidth between sites.
Abstract: An optimization framework for hosting sites is described that dynamically places application instances across multiple hosting sites based on factors such as the energy cost and availability of energy at these sites, application SLAs (service level agreements), and the cost of network bandwidth between sites. The framework leverages a global network of hosting sites, possibly co-located with renewable and non-renewable energy sources, to dynamically determine the datacenter (site) best suited to place application instances to handle incoming workload at a given point in time. Application instances can be moved between datacenters subject to energy availability and dynamic power pricing, which can vary, for example, hourly in day-ahead markets and on a scale of minutes in real-time markets.
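A rough C sketch of the kind of placement decision described (the cost model, fields and numbers are purely illustrative assumptions, not the patent's method): pick the hosting site with the lowest combined energy-plus-bandwidth cost among those that can still meet the application's latency SLA:

/* Sketch of an energy- and SLA-aware site selection.  The struct fields,
 * cost model and example values are illustrative assumptions. */
#include <stdio.h>

struct site {
    const char *name;
    double energy_price;     /* price per kWh at this hour */
    double energy_available; /* kWh available (renewable + grid) */
    double bandwidth_cost;   /* cost to route the workload's traffic here */
    double latency_ms;       /* expected latency to the workload's users */
};

static int place_instance(const struct site *s, int n,
                          double energy_needed, double sla_latency_ms)
{
    int best = -1;
    double best_cost = 0.0;
    for (int i = 0; i < n; i++) {
        if (s[i].energy_available < energy_needed) continue;  /* no capacity */
        if (s[i].latency_ms > sla_latency_ms) continue;       /* violates SLA */
        double cost = s[i].energy_price * energy_needed + s[i].bandwidth_cost;
        if (best < 0 || cost < best_cost) { best = i; best_cost = cost; }
    }
    return best;   /* -1 if no site qualifies */
}

int main(void)
{
    struct site sites[] = {
        { "dc-west",  0.08, 500.0, 12.0, 40.0 },
        { "dc-east",  0.05, 300.0, 20.0, 55.0 },
        { "dc-north", 0.03,  50.0,  5.0, 70.0 },
    };
    int i = place_instance(sites, 3, 120.0, 60.0);
    printf("place on %s\n", i >= 0 ? sites[i].name : "none");
    return 0;
}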

90 citations