Author

Patrick Gallagher

Other affiliations: IBM
Bio: Patrick Gallagher is an academic researcher from Cadence Design Systems. The author has contributed to research in topics: Automatic test pattern generation & Cache pollution. The author has an h-index of 11 and has co-authored 24 publications receiving 419 citations. Previous affiliations of Patrick Gallagher include IBM.

Papers
Patent
20 Aug 1991
TL;DR: A multilevel cache buffer for a multiprocessor system in which each processor has a level one cache storage unit that interfaces with a level two cache unit and a main storage unit shared by all processors.
Abstract: A multilevel cache buffer for a multiprocessor system in which each processor has a level one cache storage unit which interfaces with a level two cache unit and main storage unit shared by all processors. The multiprocessors share the level two cache according to a priority algorithm. When data in the level two cache is updated, corresponding data in level one caches is invalidated until it is updated.

115 citations
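
The invalidation rule in this abstract can be illustrated with a minimal sketch. The class and method names (`L1Cache`, `SharedL2Cache`, `attach`) are hypothetical and do not come from the patent; the priority algorithm for sharing the level two cache is not modeled, and "invalidated until it is updated" is assumed here to mean that the line is refetched on the next read.

```python
class L1Cache:
    """Private level one cache of a single processor (hypothetical model)."""

    def __init__(self):
        self.lines = {}  # address -> cached value

    def read(self, addr, l2):
        # On a miss, or after an invalidation, refetch from the shared L2.
        if addr not in self.lines:
            self.lines[addr] = l2.read(addr)
        return self.lines[addr]

    def invalidate(self, addr):
        self.lines.pop(addr, None)


class SharedL2Cache:
    """Level two cache shared by all processors, backed by main storage."""

    def __init__(self, main_storage):
        self.main = main_storage  # address -> value
        self.lines = {}
        self.l1_caches = []

    def attach(self, l1):
        self.l1_caches.append(l1)

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = self.main.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        # Updating L2 invalidates the corresponding data in every L1.
        self.lines[addr] = value
        for l1 in self.l1_caches:
            l1.invalidate(addr)
```

In this sketch a stale level one copy can never be read after a level two update, because the copy is dropped at write time and only repopulated from level two.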

Patent
30 Oct 2006
TL;DR: A method of adding power control circuitry to a circuit design at both the RTL and the netlist level: demarcating multiple power domains within the circuit design, specifying multiple power modes, each corresponding to a different combination of on/off states of those domains, and defining isolation behavior relative to the respective power domains.
Abstract: A method of adding power control circuitry to a circuit design at each of an RTL and a netlist level comprising: demarcating multiple power domains within the circuit design; specifying multiple power modes each power mode corresponding to a different combination of on/off states of the multiple demarcated power domains; and defining isolation behavior relative to respective power domains.

53 citations
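
As a rough illustration of the idea that each power mode is a distinct combination of on/off states of the power domains, consider the sketch below. The domain names, mode names, and both helper functions are hypothetical and are not taken from the patent. Treating every switched-off domain as one that needs isolation is an assumption made only for this example.

```python
# Hypothetical power domains and modes, for illustration only.
DOMAINS = ("core", "mem", "io")

POWER_MODES = {
    "full_on":   {"core": True,  "mem": True,  "io": True},
    "retention": {"core": False, "mem": True,  "io": True},
    "standby":   {"core": False, "mem": False, "io": True},
}


def validate_modes(domains, modes):
    """Check that each mode covers every domain and that no two modes
    describe the same on/off combination."""
    seen = {}
    for name, states in modes.items():
        if set(states) != set(domains):
            raise ValueError(f"mode {name!r} does not cover every domain")
        key = tuple(states[d] for d in domains)
        if key in seen:
            raise ValueError(f"modes {seen[key]!r} and {name!r} are identical")
        seen[key] = name


def domains_needing_isolation(modes, mode_name):
    """Assumption: outputs of a switched-off domain need isolation."""
    return sorted(d for d, on in modes[mode_name].items() if not on)
```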

Patent
12 May 1994
TL;DR: A page mover, coupled to the higher level cache subsystem and main memory, services a CPU's request to store data into main memory without copying the previous contents of the store-to address into the higher level cache subsystem.
Abstract: A hierarchical cache system comprises a plurality of first level cache subsystems for storing data or instructions of respective CPUs, a higher level cache subsystem containing data or instructions of the plurality of cache subsystems, and a main memory coupled to the higher level cache subsystem. A page mover is coupled to the higher level cache subsystem and main memory, and responds to a request from one of the CPUs to store data into the main memory, by storing the data into the main memory without copying previous contents of a store-to address of the request to the higher level cache subsystem in response to said request. Also, the page mover invalidates the previous contents in the higher level cache subsystem if already resident there when the CPU made the request. A buffering system within the page mover comprises request buffers and data segment buffers to store a segment of predetermined size of the data. When all of the request buffers have like priority and there are fewer request buffers that contain respective, outstanding requests than the number of data segment buffers, the page mover allocates to the request buffers with outstanding requests use of the data segment buffers for which there are no outstanding requests.

35 citations

Proceedings Article
18 Dec 2009
TL;DR: By taking advantage of the existing clock gating circuitry and selectively holding the value of some scan flip-flops, switching activity during the capture cycles of a test can be reduced.
Abstract: Scan-based manufacturing test of low power designs often exceeds the very tight functional constraints on average and instantaneous logic switching. The logic activity during the shift and launch-capture of test pattern data may lead to excessive power consumption and voltage droop. This paper focuses on the management of instantaneous power during the capture phase. By taking advantage of the existing clock gating circuitry and selectively holding the value of some scan flip-flops, switching activity during the capture cycles of a test can be reduced. The effectiveness of this technique is demonstrated on several industrial designs that show up to 30% (55%) reduction in instantaneous (average) capture switching.

29 citations

Proceedings Article
08 Dec 2008
TL;DR: A novel solution to address the manufacturing test of an MSMV/PSO design is described by using power-mode specifications to map multiple power modes to their target test modes and enhancing the DFT and ATPG methodology to enable a comprehensive test methodology.
Abstract: This paper describes the challenges of testing low-power designs that use the commonly used multi-supply multi-voltage (MSMV) and power shut-off (PSO) design methodology. We describe a novel solution to address the manufacturing test of an MSMV/PSO design by using power-mode specifications to map multiple power modes to their target test modes and enhancing the DFT and ATPG methodology to enable a comprehensive test methodology. We provide experimental results and future directions for power-aware test.

26 citations


Cited by
Patent
26 Jan 1994
TL;DR: Closely related threads are collected into thread groups that are scheduled globally as a group. The thread group structure maintains collective timeslice and CPU accounting for all threads in the group, and each individual thread has a local scheduling priority for scheduling among the threads in its group.
Abstract: Closely related processing threads within a process in a multiprocessor system are collected into thread groups which are globally scheduled as a group based on the thread group structure's priority and scheduling parameters. The thread group structure maintains collective timeslice and CPU accounting for all threads in the group. Within each thread group, each individual thread has a local scheduling priority for scheduling among the threads in its group. The system utilizes a hierarchy of processing levels and run queues to facilitate affining thread groups with processors or groups of processors when possible. The system will tend to balance out the workload among system processors and will migrate thread groups up and down through processing levels to increase cache hits and overall performance. The system is periodically reset to avoid long-term unbalanced operation conditions.

289 citations

Patent
20 Jun 1996
TL;DR: An operating system for a non-uniform memory access (NUMA) multiprocessor system that utilizes a software abstraction of the NUMA system hardware representing a hierarchical tree structure to maintain the most efficient level of affinity and to maintain balanced processor and memory loads is presented in this paper.
Abstract: An operating system for a non-uniform memory access (NUMA) multiprocessor system that utilizes a software abstraction of the NUMA system hardware representing a hierarchical tree structure to maintain the most efficient level of affinity and to maintain balanced processor and memory loads. The hierarchical tree structure includes leaf nodes representing the job processors, a root node representing at least one system resource shared by all the job processors, and a plurality of intermediate level nodes representing resources shared by different combinations of the job processors. The operating system includes a medium term scheduler for monitoring the progress of active thread groups distributed throughout the system and for assisting languishing thread groups, and a plurality of dispatchers each associated with one of the job processors for monitoring the status of the associated job processor and for obtaining thread groups for the associated job processor to execute. The operating system further includes a memory manager for allocating virtual and physical memory using a plurality of memory pools and frame treasuries.

211 citations

Patent
08 Dec 2003
TL;DR: In this paper, a data-aware data flow manager is proposed to determine whether to cache data or pipe it directly through based on many factors including type of data requested, state of cache, and user or system policies.
Abstract: A method and system directed to reducing the bottleneck to storage. In one aspect of the invention, a data-aware data flow manager is inserted between storage and a process or device requesting access to the storage. The data-aware data flow manager determines which data to cache and which data to pipe directly through. Through intelligent management and caching of data flow, the data-aware data flow manager is able to avoid some of the latencies associated with caches that front storage devices. The data-aware data flow manager may determine whether to cache data or pipe it directly through based on many factors including type of data requested, state of cache, and user or system policies.

200 citations

Patent
14 Jan 2005
TL;DR: A lock data structure carrying a version number is generated for a resource object, such as a database object, to support concurrent access. The approach enables optimistic locking to be introduced to a legacy database without requiring burdensome changes to the database table schema.
Abstract: Techniques for concurrent access to a resource object, such as a database object, include generating a lock data structure for a particular resource object. The lock data structure includes data values for a resource object identification, a lock type, and a version number. The version number is related to a number of changes to the resource object since the lock data structure was generated. By carrying a lock version number in a lock data structure managed by a lock manager, improved optimistic locking is provided in a database. In particular, the approach enables introduction of optimistic locking to a legacy database without requiring burdensome changes to a database table schema.

199 citations
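
The version-number check behind optimistic locking can be sketched as follows. The names `LockRecord`, `LockManager`, `read_version`, and `try_commit` are hypothetical, and the commit protocol shown is a generic optimistic-locking pattern assumed for illustration, not the patent's claimed method. The version lives in the lock manager rather than in the table itself, which is the property the abstract highlights for legacy schemas.

```python
from dataclasses import dataclass


@dataclass
class LockRecord:
    """Lock data structure: resource identification, lock type, version."""
    resource_id: str
    lock_type: str
    version: int = 0  # number of changes since this record was generated


class LockManager:
    """Keeps lock records outside the database table (hypothetical)."""

    def __init__(self):
        self._locks = {}

    def _record(self, resource_id):
        if resource_id not in self._locks:
            self._locks[resource_id] = LockRecord(resource_id, "optimistic")
        return self._locks[resource_id]

    def read_version(self, resource_id):
        return self._record(resource_id).version

    def try_commit(self, resource_id, seen_version, apply_change):
        """Apply the change only if nobody committed since seen_version."""
        record = self._record(resource_id)
        if record.version != seen_version:
            return False  # conflict: the caller should re-read and retry
        apply_change()
        record.version += 1
        return True
```

A writer reads the version, prepares its change, and calls `try_commit`; a second writer holding the same stale version is rejected instead of silently overwriting the first.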

Patent
Cheryl Senter, Johannes Wang
25 Apr 2005
TL;DR: In this article, a load store unit is provided whose main purpose is to make load requests out of order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible.
Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load store unit is provided whose main purpose is to make load requests out of order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out of order if there are no address collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit. Thus, the three main tasks of the load store unit are: (1) handling out of order cache requests; (2) detecting address collisions; and (3) alignment of data.

196 citations
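
The condition for issuing a load out of order (no address collision and no write pending) can be expressed as a small check. This is a simplified sketch: the `Store` type and the function name are hypothetical, and addresses are compared for exact equality, whereas real hardware would also account for access size and overlapping byte ranges.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Store:
    """An older store in program order (hypothetical model).

    address is None while the store address has not yet been
    calculated, which is the "write pending" case.
    """
    address: Optional[int]


def can_issue_load_early(load_address: int, older_stores: List[Store]) -> bool:
    """Return True if the load may bypass all older stores."""
    for store in older_stores:
        if store.address is None:
            return False  # write pending: the store target is unknown
        if store.address == load_address:
            return False  # address collision with an older write
    return True
```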