Per Stenström
Researcher at Chalmers University of Technology
Publications - 251
Citations - 8514
Per Stenström is an academic researcher from Chalmers University of Technology. The author has contributed to research in topics: Cache & Cache coherence. The author has an h-index of 43 and has co-authored 245 publications receiving 8193 citations. Previous affiliations of Per Stenström include Stanford University & Ericsson.
Papers
Proceedings Article
Modelling accesses to migratory and producer-consumer characterised data in a shared memory multiprocessor
Mats Brorsson,Per Stenström +1 more
TL;DR: A set of parameters is identified that characterises accesses to migratory and producer-consumer data in sufficient detail to predict the number of cache misses in directory-based, write-invalidate protocols.
Book
Transactions on High-Performance Embedded Architectures and Compilers I
TL;DR: Among the topics covered are a high-performance adaptive miss handling architecture for chip multiprocessors and the characterization of time-varying program behavior using phase complexity surfaces.
Journal Article
Performance evaluation and cost analysis of cache protocol extensions for shared-memory multiprocessors
TL;DR: It is shown that a basic write-invalidate protocol augmented by appropriate extensions can eliminate most memory access penalties without any support from the programmer or the compiler.
Proceedings Article
Using hints to reduce the read miss penalty for flat COMA protocols
TL;DR: A new hint-based protocol that sends a request simultaneously to the potential holder and to the directory is studied. It reduces the read-miss penalty for all applications, but the performance improvement does not seem to justify the added protocol complexity.
Journal Article
Using dataflow analysis techniques to reduce ownership overhead in cache coherence protocols
Jonas Skeppstedt,Per Stenström +1 more
TL;DR: Static analysis based on classical dataflow analysis techniques, applied to remove ownership overhead in write-invalidate cache coherence protocols for shared-memory multiprocessors, yields performance improvements similar to those of dynamic hardware-based approaches.