
Showing papers in "IBM Journal of Research and Development" in 2003


Journal Article
R. Lougee-Heimer
TL;DR: The Common Optimization INterface for Operations Research, an initiative to promote open-source software for the operations research community, is reviewed, and the goals and status of COIN-OR are presented.
Abstract: The Common Optimization INterface for Operations Research (COIN-OR, http://www.coin-or.org/) is an initiative to promote open-source software for the operations research (OR) community. In OR practice and research, software is fundamental. The dependence of OR on software implies that the ways in which software is developed, managed, and distributed can have a significant impact on the field. Open source is a relatively new software development and distribution model which offers advantages over current practices. Its viability depends on the precise definition of open source, on the culture of a distributed developer community, and on a version-control system which makes distributed development possible. In this paper, we review open-source philosophy and culture, and present the goals and status of COIN-OR.

376 citations


Journal Article
TL;DR: This paper describes low-voltage random-access memory (RAM) cells and peripheral circuits for standalone and embedded RAMs, focusing on stable operation and reduced subthreshold current in standby and active modes.
Abstract: This paper describes low-voltage random-access memory (RAM) cells and peripheral circuits for standalone and embedded RAMs, focusing on stable operation and reduced subthreshold current in standby and active modes. First, technology trends in low-voltage dynamic RAMs (DRAMs) and static RAMs (SRAMs) are reviewed and the challenges of low-voltage RAMs in terms of cell signal charge are clarified, including the necessary threshold voltage, VT, and its variations in the MOS field-effect transistors (MOSFETs) of RAM cells and sense amplifiers, leakage currents (subthreshold current and gate-tunnel current), and speed variations resulting from design parameter variations. Second, developments in conventional RAM cells and emerging cells, such as DRAM gain cells and leakage-immune SRAM cells, are discussed from the viewpoints of cell area, operating voltage, and leakage currents of MOSFETs. Third, the concepts proposed to date to reduce subthreshold current and the advantages of RAMs with respect to reducing the subthreshold current are summarized, including their applications to RAM circuits to reduce the current in standby and active modes, exemplified by DRAMs. After this, design issues in other peripheral circuits, such as sense amplifiers and low-voltage supporting circuits, are discussed, as are power management to suppress speed variations and reduce the power of power-aware systems, and testing. Finally, future prospects based on the above discussion are examined.
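
For context, a standard first-order device model (not an equation from the paper) shows why the threshold voltage V_T dominates the standby-current problem: the subthreshold current depends exponentially on the gate overdrive,

    I_sub ≈ I_0 · 10^{(V_GS − V_T)/S},

where S is the subthreshold swing, roughly 80–100 mV/decade at room temperature. Each ~100-mV reduction in V_T therefore costs about an order of magnitude in leakage, which is the tension between low-voltage operation and low standby current that the paper addresses.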

181 citations


Journal Article
TL;DR: The PowerTimer toolset is useful in assessing the typical and worst-case power swings that occur between successive cycle windows in a given workload execution and helps pinpoint potential inductive noise problems on the voltage rail that can be addressed by designing an appropriate package or by suitably tuning the dynamic power management controls within the processor.
Abstract: The PowerTimer toolset has been developed for use in early-stage, microarchitecture-level power-performance analysis of microprocessors. The key component of the toolset is a parameterized set of energy functions that can be used in conjunction with any given cycle-accurate microarchitectural simulator. The energy functions model the power consumption of primitive and hierarchically composed building blocks which are used in microarchitecture-level performance models. Examples of structures modeled are pipeline stage latches, queues, buffers and component read/write multiplexers, local clock buffers, register files, and cache array macros. The energy functions can be derived using purely analytical equations that are driven by organizational, circuit, and technology parameters or behavioral equations that are derived from empirical, circuit-level simulation experiments. After describing the modeling methodology, we present analysis results in the context of a current-generation superscalar processor simulator to illustrate the use and effectiveness of such early-stage models. In addition to average power and performance tradeoff analysis, PowerTimer is useful in assessing the typical and worst-case power (or current) swings that occur between successive cycle windows in a given workload execution. Such a characterization of workloads at the early stage of microarchitecture definition helps pinpoint potential inductive noise problems on the voltage rail that can be addressed by designing an appropriate package or by suitably tuning the dynamic power management controls within the processor.
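
To illustrate the modeling style described above, here is a hypothetical sketch of a parameterized energy function (the names, parameters, and linear scaling rules are my own assumptions, not PowerTimer internals). A cycle-accurate simulator would call such a function every cycle with the structure's access counts:

    # Hypothetical per-structure energy function in the spirit of PowerTimer.
    # Base energies would come from circuit-level simulation of a reference
    # design; scaling linearly with width and entries is an assumption.
    def array_energy(reads, writes, entries, width_bits,
                     e_read_base=1.0, e_write_base=1.2, e_leak_per_bit=1e-4):
        """Energy (arbitrary units) consumed by an array macro in one cycle."""
        e_dynamic = (reads * e_read_base + writes * e_write_base) * width_bits
        e_leakage = entries * width_bits * e_leak_per_bit
        return e_dynamic + e_leakage

    # Driven by per-cycle counts from the performance simulator:
    cycle_energy = array_energy(reads=2, writes=1, entries=128, width_bits=64)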

123 citations


Journal Article
TL;DR: An overview of the IBM PowerNP™ NP4GS3 network processor is provided; its hardware and software design characteristics and its comprehensive base operating software make it well suited for a wide range of networking applications.
Abstract: Deep packet processing is migrating to the edges of service provider networks to simplify and speed up core functions. On the other hand, the cores of such networks are migrating to the switching of high-speed traffic aggregates. As a result, more services will have to be performed at the edges, on behalf of both the core and the end users. Associated network equipment will therefore require high flexibility to support evolving high-level services as well as extraordinary performance to deal with the high packet rates. Whereas, in the past, network equipment was based either on general-purpose processors (GPPs) or application-specific integrated circuits (ASICs), favoring flexibility over speed or vice versa, the network processor approach achieves both flexibility and performance. The key advantage of network processors is that hardware-level performance is complemented by flexible software architecture. This paper provides an overview of the IBM PowerNP™ NP4GS3 network processor and how it addresses these issues. Its hardware and software design characteristics and its comprehensive base operating software make it well suited for a wide range of networking applications.

112 citations


Journal Article
TL;DR: The paper reviews the process development and integration methodology, presents the device characteristics, and shows how the development and device selection were geared toward usage in mixed-signal IC development.
Abstract: This paper provides a detailed description of the IBM SiGe BiCMOS and rf CMOS technologies. The technologies provide high-performance SiGe heterojunction bipolar transistors (HBTs) combined with advanced CMOS technology and a variety of passive devices critical for realizing an integrated mixed-signal system-on-a-chip (SoC). The paper reviews the process development and integration methodology, presents the device characteristics, and shows how the development and device selection were geared toward usage in mixed-signal IC development.

100 citations


Journal Article
TL;DR: Using the new approach, conventional passive optical components such as arrayed waveguide gratings for wavelength-division-multiplexed transmission systems can be fabricated in a more compact way than using standard silica-on-silicon waveguide methods.
Abstract: The rapidly growing optical communication market requires photonic components with ever-increasing functionality and complexity that can be fabricated reliably at low cost. Of the various approaches used to fabricate photonic components, those based on planar waveguides have achieved high performance and represent a promising path toward compact integration of optical functions. We present an overview of an approach used to produce an optical single-mode waveguide. Through its strong mode confinement, the approach makes it possible to integrate optical filter functions with higher functionality, as required for high-data-rate communication networks. The waveguide is based on the use of a silicon oxynitride (SiON) core and silicon oxide cladding layers, and can be fabricated using conventional chip fabrication techniques. Using the new approach, conventional passive optical components such as arrayed waveguide gratings for wavelength-division-multiplexed transmission systems can be fabricated in a more compact way than using standard silica-on-silicon waveguide methods. Moreover, the realization of enhanced, adaptive optical functions such as finite-impulse-response as well as infinite-impulse-response filters is possible. Reconfiguration is achieved through the thermo-optic effect. A reconfigurable gain-flattening filter and an adaptive dispersion compensator are presented as examples.

98 citations


Journal Article
TL;DR: This work proposes a QoS model in which applications may have several versions, each with different time and energy requirements while providing different levels of accuracy (reward); three algorithms are devised that closely approximate the optimal solution while taking only a fraction of its runtime.
Abstract: Embedded devices designed for various real-time applications typically have three constraints that must be addressed: energy, deadlines, and reward. These constraints play important roles in the next generation of embedded systems, since they provide users with a variety of quality-of-service (QoS) tradeoffs. We propose a QoS model in which applications may have several versions, each with different time and energy requirements, while providing different levels of accuracy (reward). An optimal scheme would allow the device to run the most critical and valuable versions of applications without depleting the energy source, while still meeting all deadlines. A solution is presented for frame-based and periodic task sets. Three algorithms are devised that closely approximate the optimal solution while taking only a fraction of the runtime of an optimal solution.
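
To make the reward/energy tradeoff concrete, here is a minimal greedy sketch (my own illustration, not one of the paper's three algorithms). Each task lists its versions as (energy, reward) pairs in increasing energy order; every task runs at least its cheapest version, and we repeatedly upgrade whichever task offers the most extra reward per unit of extra energy within the budget. Deadline feasibility is ignored for brevity:

    # Greedy version upgrading under an energy budget (illustrative only).
    def select_versions(tasks, energy_budget):
        choice = [0] * len(tasks)        # start every task at its cheapest version
        spent = sum(versions[0][0] for versions in tasks)
        assert spent <= energy_budget, "even the minimal versions exceed the budget"
        while True:
            best, best_ratio = None, 0.0
            for i, versions in enumerate(tasks):
                j = choice[i]
                if j + 1 < len(versions):
                    d_energy = versions[j + 1][0] - versions[j][0]
                    d_reward = versions[j + 1][1] - versions[j][1]
                    if spent + d_energy <= energy_budget and d_reward / d_energy > best_ratio:
                        best, best_ratio = i, d_reward / d_energy
            if best is None:
                return choice, spent     # chosen version index per task, energy used
            j = choice[best]
            spent += tasks[best][j + 1][0] - tasks[best][j][0]
            choice[best] += 1

    tasks = [[(2, 1), (5, 4)], [(1, 1), (2, 3), (6, 5)]]
    print(select_versions(tasks, energy_budget=9))   # -> ([1, 1], 7)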

91 citations


Journal Article
TL;DR: In this article, the authors present a systematic approach for deriving a clocked storage element suitable for "time borrowing" and absorption of clock uncertainties, explain how to compare different clocked storage elements with each other, and discuss issues related to power consumption and low-power designs.
Abstract: Clocking considerations and the design of clocked storage elements are discussed in this paper. We present a systematic approach for deriving a clocked storage element suitable for "time borrowing" and absorption of clock uncertainties. We explain how to compare different clocked storage elements with each other, and discuss issues related to power consumption and low-power designs. Finally, results of comparisons among representative designs are presented.
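
As background on "time borrowing" (a standard textbook relation, not one quoted from the paper): a level-sensitive latch that remains transparent for a window T_W lets slow logic borrow time from the following cycle,

    t_borrow,max = T_W − t_setup,   so   T_logic ≤ T_cycle + t_borrow,max,

and clock skew or jitter that falls inside the transparency window is absorbed in the same way instead of being subtracted from the cycle time.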

72 citations


Journal Article
TL;DR: The algorithms are evaluated using workload data obtained from production servers from several applications, showing that energy savings of 20% or more can readily be achieved, with a small degree of unmet demand and acceptable reliability, availability, and serviceability impact.
Abstract: This paper describes and evaluates predictive power management algorithms that we have developed to minimize energy consumption and unmet demand in parallel computer systems. The algorithms are evaluated using workload data obtained from production servers from several applications, showing that energy savings of 20% or more can readily be achieved, with a small degree of unmet demand and acceptable reliability, availability, and serviceability (RAS) impact. The implementation of these algorithms in IBM system management software and the possibilities for future work are discussed.
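
As a toy illustration of the predictive idea (an assumption-laden sketch, not one of the algorithms evaluated in the paper): forecast near-term demand from a moving average of recent load, then keep just enough nodes powered to cover the forecast plus a safety margin, trading a small risk of unmet demand for energy savings:

    # Toy predictive node provisioning; window and margin are illustrative.
    import math
    from collections import deque

    def provision(load_trace, node_capacity, window=5, margin=1.25):
        history, plan = deque(maxlen=window), []
        for load in load_trace:
            history.append(load)
            predicted = sum(history) / len(history)   # moving-average forecast
            plan.append(max(1, math.ceil(predicted * margin / node_capacity)))
        return plan                                   # nodes powered per interval

    print(provision([30, 50, 80, 60, 40], node_capacity=25))   # -> [2, 2, 3, 3, 3]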

69 citations


Journal Article
TL;DR: The design, development, testing, and integration methodology for antennas integrated into laptop computers is described, and measurements indicate that the resulting design attains both performance and cost targets.
Abstract: The design, development, testing, and integration methodology for antennas integrated into laptop computers is described. Two key parameters are proposed and discussed for laptop antenna design and evaluation: standing wave ratio (SWR) and average antenna gain. A novel averaging technique was developed and applied to these to yield a measurable, repeatable, and generalized metric. A prototype antenna was built using this methodology, and measurements indicate that the resulting design attains both performance and cost targets. A PC-card-version wireless system is also discussed and compared with the integrated one. The impact of the antenna on the overall wireless system is studied through a link budget model.
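
For reference, the two proposed parameters have standard definitions (textbook antenna identities, not formulas quoted from the paper): the standing wave ratio follows from the reflection coefficient Γ at the feed, and the average gain integrates the gain pattern over solid angle,

    SWR = (1 + |Γ|) / (1 − |Γ|),   Γ = (Z_in − Z_0) / (Z_in + Z_0),
    G_avg = (1/4π) ∮ G(θ, φ) dΩ;

the paper's contribution is an averaging technique that makes such quantities measurable and repeatable on a laptop platform.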

60 citations


Journal Article
TL;DR: An overview of the Data Abstraction Research Group's major technical accomplishments is presented, which include advances in methods for feature analysis, rule-based pattern discovery, and probabilistic modeling, and novel solutions for insurance risk management, targeted marketing, and text mining.
Abstract: The Data Abstraction Research Group was formed in the early 1990s to bring focus to the work of the Mathematical Sciences Department in the emerging area of knowledge discovery and data mining (KD & DM). Most activities in this group have been performed in the technical area of predictive modeling, roughly at the intersection of machine learning, statistical modeling, and database technology. There has been a major emphasis on using business and industrial problems to motivate the research agenda. Major accomplishments include advances in methods for feature analysis, rule-based pattern discovery, and probabilistic modeling, and novel solutions for insurance risk management, targeted marketing, and text mining. This paper presents an overview of the group's major technical accomplishments.

Journal Article
Fred G. Gustavson
TL;DR: A novel way to produce dense linear algebra factorization algorithms using new data structures (NDS) along with so-called kernel routines is presented, and block hybrid formats (BHF) are described, which allow one to use no additional storage over conventional matrix storage.
Abstract: We present a novel way to produce dense linear algebra factorization algorithms. The current state-of-the-art (SOA) dense linear algebra algorithms have a performance inefficiency, and thus they give suboptimal performance for most LAPACK factorizations. We show that using standard Fortran and C two-dimensional arrays is the main source of this inefficiency. For the other standard format (packed one-dimensional arrays for symmetric and/or triangular matrices), the situation is much worse. We show how to correct these performance inefficiencies by using new data structures (NDS) along with so-called kernel routines. The NDS generalize the current storage layouts for both standard formats. We use the concept of Equivalence and Elementary Matrices along with coordinate (linear) transformations to prove that our method works for an entire class of dense linear algebra algorithms. Also, we use the Algorithms and Architecture approach to explain why our new method gives higher efficiency. The simplest forms of the new factorization algorithms are a direct generalization of the commonly used LINPACK algorithms. On IBM platforms they can be generated from simple, textbook-type codes by the XLF Fortran compiler. On the IBM POWER3 processor, our implementation of Cholesky factorization achieves 92% of peak performance, whereas conventional SOA full-format LAPACK DPOTRF achieves 77% of peak performance. All programming for our NDS can be accomplished in standard Fortran through the use of three- and four-dimensional arrays. Thus, no new compiler support is necessary. Finally, we describe block hybrid formats (BHF). BHF allow one to use no additional storage over conventional (full and packed) matrix storage. This means that new algorithms based on BHF can be used as a backward-compatible replacement for LAPACK or LINPACK algorithms.
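
A minimal sketch of the intuition behind the NDS (my own illustration, not Gustavson's code): storing the matrix as contiguous square blocks lets each kernel routine work on a cache-resident tile, whereas in standard column-major full format the columns of a submatrix are scattered through memory:

    # Repack a column-major (Fortran-style) matrix into square-block format.
    # Each NB-by-NB tile is contiguous in memory, which is what allows a
    # factorization kernel to run near peak speed out of cache.
    import numpy as np

    def to_square_blocks(a, nb):
        n = a.shape[0]
        assert n % nb == 0, "illustration only: n must be divisible by NB"
        m = n // nb
        blocks = np.empty((m, m, nb, nb))    # blocks[i, j] = tile at (i*nb, j*nb)
        for i in range(m):
            for j in range(m):
                blocks[i, j] = a[i*nb:(i+1)*nb, j*nb:(j+1)*nb]
        return blocks

    a = np.asfortranarray(np.arange(64.0).reshape(8, 8))
    tiles = to_square_blocks(a, nb=4)        # a 2x2 grid of contiguous 4x4 tiles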

Journal Article
TL;DR: In addition to the well-known gate-oxide leakage limitation for ULP technologies, three additional limits facing future scaled ULP technologies are discussed.
Abstract: An ultralow-standby-power (ULP) technology has been developed in both 0.18-µm and 0.13-µm lithography nodes for embedded and standalone SRAM applications. The ultralow-leakage six-transistor (6T) SRAM cell sizes are 4.81 µm² and 2.34 µm², corresponding respectively to the 0.18-µm and 0.13-µm design dimensions. The measured array standby leakage is equal to an average cell leakage current of less than 50 fA per cell at 1.5 V, 25°C and is less than 400 fA per cell at 1.5 V, 85°C. Dual gate oxides of 2.9 nm and 5.2 nm provide optimized cell leakage, I/O compatibility, and performance. Analyses of the critical parasitic leakage components and paths within the 6T SRAM cell are reviewed in this paper. In addition to the well-known gate-oxide leakage limitation for ULP technologies, three additional limits facing future scaled ULP technologies are discussed.
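
To put the per-cell figure in perspective (my arithmetic, not a number from the paper): at 50 fA per cell, a full megabit of this SRAM leaks only about

    2^20 cells × 50 fA/cell ≈ 52 nA

in standby at 1.5 V and 25°C.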

Journal Article
Hazim Shafi, Patrick J. Bohrer, J. Phelan, C. A. Rusu, James L. Peterson
TL;DR: The design and validation of Tempo, a performance and power simulator that is part of the Mambo simulation environment for PowerPC® systems, are described, along with examples of how well it can predict the runtime power consumption of a 405GP microprocessor during application execution.
Abstract: This paper describes the design and validation of a performance and power simulator that is part of the Mambo simulation environment for PowerPC® systems. One of the most notable features of the simulator, designated as Tempo, is the incorporation of an event-driven power model. Tempo satisfies an important need for fast and accurate performance and power simulation tools at the system level. The power and performance predictions from the simulated model of a PowerPC 405GP (or simply 405GP) were validated against a 405GP-based evaluation board instrumented for power measurements using 42 application/dataset combinations from the EEMBC benchmark suite. The average performance and energy-prediction errors were 0.6% and -4.1%, respectively. In addition to describing Tempo, we show examples of how well it can predict the runtime power consumption of a 405GP microprocessor during application execution.
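
The event-driven power model can be pictured as a table of per-event energies charged as the performance simulator raises events. The sketch below is a hypothetical illustration, with invented event names and energy values, not Tempo's internals:

    # Hypothetical event-driven energy accounting for a system simulator.
    EVENT_ENERGY_PJ = {        # invented per-event energies, in picojoules
        "icache_access": 12.0,
        "dcache_access": 15.0,
        "alu_op": 3.5,
        "bus_transfer": 40.0,
    }

    class PowerModel:
        def __init__(self):
            self.energy_pj = 0.0
        def on_event(self, name, count=1):
            self.energy_pj += EVENT_ENERGY_PJ[name] * count
        def average_power_mw(self, elapsed_ns):
            return self.energy_pj / elapsed_ns   # pJ/ns is numerically mW

    pm = PowerModel()
    pm.on_event("icache_access", 1000)
    pm.on_event("alu_op", 800)
    print(pm.average_power_mw(elapsed_ns=5000))  # -> 2.96 (mW)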

Journal Article
Roy L. Adler, Bruce Kitchens, Marco Martens, Charles Tresser, Chai Wah Wu
TL;DR: Some mathematical aspects of halftoning in digital printing, including the method of dithering, are described, with the main emphasis on error diffusion.
Abstract: This paper describes some mathematical aspects of halftoning in digital printing. Halftoning is the technique of rendering a continuous range of colors using only a few discrete ones. There are two major classes of methods: dithering and error diffusion. Some discussion is presented concerning the method of dithering, but the main emphasis is on error diffusion.
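
Since error diffusion is the main emphasis, a compact reference implementation is useful: the classic Floyd–Steinberg scheme (a standard textbook algorithm, not necessarily the exact variant analyzed in the paper) thresholds each pixel and distributes the quantization error to unprocessed neighbors:

    # Floyd-Steinberg error diffusion for a grayscale image with values in [0, 1].
    import numpy as np

    def error_diffuse(img):
        img = img.astype(float).copy()
        h, w = img.shape
        out = np.zeros_like(img)
        for y in range(h):
            for x in range(w):
                out[y, x] = 1.0 if img[y, x] >= 0.5 else 0.0
                err = img[y, x] - out[y, x]       # push error to future pixels
                if x + 1 < w:
                    img[y, x + 1] += err * 7 / 16
                if y + 1 < h:
                    if x > 0:
                        img[y + 1, x - 1] += err * 3 / 16
                    img[y + 1, x] += err * 5 / 16
                    if x + 1 < w:
                        img[y + 1, x + 1] += err * 1 / 16
        return out

    halftone = error_diffuse(np.tile(np.linspace(0.0, 1.0, 64), (16, 1)))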

Journal Article
G. A. Jaquette
TL;DR: The eclectic features designed into the LTO format are described, along with the format-enabled aspects of drive functionality that improve on previous tape drive products: backward writing, elimination of problematic failure mechanisms, dynamic rewrite of defective data, handling of servo errors without stopping tape, and robust reading.
Abstract: In the last two years, Linear Tape-Open® (LTO®) tape drives have become clear leaders in the mid-range tape marketplace. Tape drives designed to the LTO Ultrium® format were the first of the super drives to be shipped to mid-range tape customers. This paper describes some of the eclectic features designed into the LTO format. The technical emphasis is on aspects of the logical format that are new or different from preceding drives, though some aspects of the LTO roadmap and physical format are also discussed. The logical format comprises all of the data manipulations and organization involved in writing customer data to tape. This includes data compression to compact the data, appending of error-correction codes (ECCs) to protect the data, run-length-limited encoding of the ECC-encoded data, prepending headers to the encoded data to make it self-identifying on read-back, and storing of information about the data and the way it is stored in a cartridge memory module. Physical format aspects that are discussed include encoding data into the servo pattern and write shingling. Also discussed are the format-enabled aspects of drive functionality that have been improved over previous tape drive products, including enabling backward writing, elimination of problematic failure mechanisms, dynamic rewrite of defective data, handling servo errors without stopping tape, and enabling robust reading. Contrasts are made with previous products and competing products based on other format choices. Also discussed throughout is the way in which an eclectic format can be created by cooperation among three format-development companies.
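
The logical-format operations listed above form a write-path pipeline. The sketch below simply mirrors that ordering with stub stages (purely illustrative placeholders, not LTO's actual compression, ECC, or RLL algorithms):

    # Illustrative LTO-style logical write path; every stage is a stub.
    def compress(data):        return data                   # e.g. streaming compression
    def append_ecc(data):      return data + b"<ecc>"        # error-correction parity
    def rll_encode(data):      return data                   # run-length-limited code
    def add_header(data, n):   return b"hdr%d|" % n + data   # self-identifying on read-back

    def write_path(records):
        for n, record in enumerate(records):
            yield add_header(rll_encode(append_ecc(compress(record))), n)

    for block in write_path([b"customer data", b"more data"]):
        print(block)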

Journal Article
Victor Zyuban, P. N. Strenski
TL;DR: A relation is established between the architectural energy-efficiency metric and hardware intensity, and expressions are derived for evaluating the effect of modifications at the microarchitectural level on processor frequency and power, assuming optimal tuning of the pipeline.
Abstract: The evaluation of architectural tradeoffs is complicated by implications in the circuit domain which are typically not captured in the analysis but substantially affect the results. We propose a metric of hardware intensity (η), which is useful for evaluating issues that affect both circuits and architecture. Analyzing data for actual designs, we show how to measure the introduced parameters and discuss variations between observed results and common theoretical assumptions. For a power-efficient design, we derive relations for η and supply voltage V under progressively more general situations and illustrate the use of these equations in simple examples. Then we establish a relation between the architectural energy-efficiency metric and hardware intensity, and we derive expressions for evaluating the effect of modifications at the microarchitectural level on processor frequency and power, assuming the optimal tuning of the pipeline. These relations will guide the architect to achieve an energy-optimal balance between architectural complexity and hardware intensity.
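
For readers new to the metric: hardware intensity expresses the marginal cost of speed at a given circuit tuning point, commonly presented as

    η = (ΔE/E) / (−ΔD/D),

the percentage of energy that must be spent to gain each percent of delay reduction; larger η means more aggressively tuned, less energy-efficient hardware. (This is the usual reading of the metric; see the paper for the precise formulation.)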

Journal Article
TL;DR: A tenfold increase in capacity over current-generation Linear Tape-Open (LTO) systems is demonstrated, involving significant advances in nearly every aspect of the recording process: heads, media, channel electronics, and recording platform.
Abstract: For the last 50 years, tape has persisted as the medium of choice when inexpensive data storage is required and speed is not critical. The cost of tape storage normalized per unit capacity (dollars per gigabyte) decreased steadily over this time, driven primarily by advances in areal density and reduction of tape thickness. This paper reports the next advance in tape storage: a demonstration of a tenfold increase in capacity over current-generation Linear Tape-Open® (LTO®) systems. One terabyte (1 TB, or 1000 GB) of uncompressed data was written on half-inch tape using the LTO form factor. This technical breakthrough involves significant advances in nearly every aspect of the recording process: heads, media, channel electronics, and recording platform.
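
The headline figure is easy to check against the LTO roadmap (my arithmetic, assuming the 100-GB native capacity of first-generation LTO Ultrium cartridges):

    10 × 100 GB = 1000 GB = 1 TB uncompressed.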

Journal Article
TL;DR: Circuit techniques for low-power communication systems which exploit the capabilities of advanced CMOS technology are described.
Abstract: As CMOS technology scales to deep-submicron dimensions, designers face new challenges in determining the proper balance between aggressive high-performance transistors and lower-performance transistors to optimize system power and performance for a given application. Determining this balance is crucial for battery-powered handheld devices in which transistor leakage and active power limit the available system performance. This paper explores these questions and describes circuit techniques for low-power communication systems which exploit the capabilities of advanced CMOS technology.

Journal Article
TL;DR: The resulting methodology, architecture, and compiler represent an advance of the state of the art in the area of low-power, domain-specific microprocessors.
Abstract: We describe an innovative, low-power, high-performance, programmable digital signal processor (DSP) for digital communications. The architecture of this processor is characterized by its explicit design for low-power implementations, its innovative ability to jointly exploit instruction-level parallelism and data-level parallelism to achieve high performance, its suitability as a target for an optimizing high-level language compiler, and its explicit replacement of hardware resources by compile-time practices. We describe the methodology used in the development of the processor, highlighting the techniques deployed to enable application/architecture/compiler/implementation co-development, and the optimization approach and metric used for power-performance evaluation and tradeoff analysis. We summarize the salient features of the architecture, provide a brief description of the hardware organization, and discuss the compiler techniques used to exercise these features. We also summarize the simulation environment and associated software development tools. Coding examples from two representative kernels in the digital communications domain are also provided. The resulting methodology, architecture, and compiler represent an advance of the state of the art in the area of low-power, domain-specific microprocessors.

Journal Article
TL;DR: This paper is primarily an overview of data link design efforts in IBM pertinent to local area networks (LANs) using both multimode fiber (MMF) and single-mode fiber (SMF) links, with emphasis on MMF links operating at short wavelengths.
Abstract: This paper is primarily an overview of data link design efforts in IBM pertinent to local area networks (LANs) using both multimode fiber (MMF) and single-mode (SMF) links, with emphasis on MMF links operating at short wavelengths. Device models (laser and receiver) and multimode fiber models are discussed, as well as noise aspects (modal and mode partition noise). In addition, new simulation and measurement results for a 20-Gb/s 1-km-long link are presented.

Journal Article
TL;DR: It is concluded that until technology advances allow denser packaging of memory or more efficient use of memory across nodes, the best performance and energy efficiency can be obtained by heterogeneous deployment of both traditional high-end and dense servers.
Abstract: Dense servers trade performance at the node level for higher deployment density and lower power consumption as well as the possibility of reduced cost of ownership. System performance and the details of energy consumption for this class of servers, however, are not well understood. In this paper, we describe a research prototype designated as the Super Dense Server (SDS), which was optimized for high-density deployment. We describe its hardware features, show how they challenge the operating system and middleware, and describe how we have enhanced its software to handle these challenges. Our performance evaluation has shown that dense servers are a viable deployment alternative for the edge and application servers commonly found at conventional Web sites and large data centers. Using industry benchmarks, we have shown that SDS outperforms a comparable traditional server by almost a factor of 2 for CPU-bound electronic commerce workloads for the same space and roughly equivalent power budget. We have observed the same advantage in performance when SDS is compared to the alternative solution of virtualizing a high-end server to handle "scaled-down" workloads. We have also shown that SDS offers finer power management control than traditional servers, allowing higher energy efficiency per unit of computation. However, for high-intensity Web-serving workloads, SDS does not perform as well as a traditional server when many nodes must be configured into a cluster to provide a single system image. In that case, the limited memory of each SDS node reduces its performance scalability, and a traditional server is a better alternative. We have concluded that until technology advances allow denser packaging of memory or more efficient use of memory across nodes, the best performance and energy efficiency can be obtained by heterogeneous deployment of both traditional high-end and dense servers.

Journal Article
Robert G. Biskeborn, J. H. Eaton
TL;DR: The flat tape head manufacturing processes, drive implementation, performance, and outlook are described, and the flat-lapped tape heads used in IBM Linear Tape-Open (LTO) products are discussed.
Abstract: IBM thin-film tape heads have evolved from the ferrite-based heads first used in the IBM Model 3480 Tape Drive to the hard-disk-drive (HDD) technology flat-profile heads used in IBM Linear Tape-Open® (LTO®) products. This paper describes that transition and discusses the flat tape head manufacturing processes, drive implementation, performance, and outlook. Thin-film head technology for hard-disk drives was first used in tape heads in the early 1990s, when IBM built quarter-inch cartridge head images on HDD-type wafers. This was a springboard for the next step, flat-lapped tape heads, which use not only HDD wafers, but also HDD post-wafer machining technologies. With the emergence of LTO, flat heads entered mainstream tape head production in IBM. These have proven to have high performance and durability.

Journal Article
TL;DR: The ideas behind grammatical evolution are used to automatically generate and evolve Lindenmayer grammars which represent fractal curves with a fractal dimension that approximates a predefined required value.
Abstract: Lindenmayer grammars have frequently been applied to represent fractal curves. In this work, the ideas behind grammatical evolution are used to automatically generate and evolve Lindenmayer grammars which represent fractal curves with a fractal dimension that approximates a predefined required value. For many dimensions, this is a nontrivial task to be performed manually. The procedure we propose closely parallels biological evolution because it acts through three different levels: a genotype (a vector of integers), a protein-like intermediate level (the Lindenmayer grammar), and a phenotype (the fractal curve). Variation acts at the genotype level, while selection is performed at the phenotype level (by comparing the dimensions of the fractal curves to the desired value).
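
A toy version of the three-level scheme might look as follows (an illustrative sketch with an invented genotype encoding and a similarity-dimension fitness, not the authors' actual procedure): integers decode to an L-system production over a turtle alphabet, and fitness is the distance between the curve's fractal dimension and the target:

    # Toy genotype -> Lindenmayer grammar -> fractal dimension pipeline.
    # With step length 1/3, a self-avoiding generator containing N forward
    # moves has similarity dimension log(N) / log(3).
    import math, random

    SYMBOLS = "F+-F"     # invented mapping: 0 -> F, 1 -> +, 2 -> -, 3 -> F

    def decode(genotype):                      # genotype -> production for F
        return "".join(SYMBOLS[g % 4] for g in genotype)

    def dimension(production):                 # phenotype-level measurement
        return math.log(max(production.count("F"), 1)) / math.log(3)

    def fitness(genotype, target):
        return abs(dimension(decode(genotype)) - target)

    def evolve(target, pop=30, gens=100, length=8):
        population = [[random.randrange(4) for _ in range(length)] for _ in range(pop)]
        for _ in range(gens):
            population.sort(key=lambda g: fitness(g, target))      # selection
            parents = population[:pop // 2]
            children = [[(g if random.random() > 0.1 else random.randrange(4))
                         for g in random.choice(parents)] for _ in range(pop - len(parents))]
            population = parents + children                        # variation
        best = min(population, key=lambda g: fitness(g, target))
        return decode(best), dimension(decode(best))

    print(evolve(target=math.log(4) / math.log(3)))   # Koch-curve dimension, ~1.262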

Journal Article
TL;DR: This paper describes a mature project infrastructure consisting of predictive device models, complete rf characterization, statistical and scalable compact models that are hardware-verified, and a robust design automation environment, and provides an overview of associated development work.
Abstract: The rapidly expanding telecommunications market has led to a need for advanced rf integrated circuits. Complex rf- and mixed-signal system-on-chip designs require accurate prediction early in the design schedule, and time-to-market pressures dictate that design iterations be kept to a minimum. Signal integrity is seen as a key issue in typical applications, requiring very accurate interconnect transmission-line modeling and RLC extraction of parasitic effects. To enable this, IBM has in place a mature project infrastructure consisting of predictive device models, complete rf characterization, statistical and scalable compact models that are hardware-verified, and a robust design automation environment. Finally, the unit and integration testing of all of these components is performed thoroughly. This paper describes each of these aspects and provides an overview of associated development work.

Journal Article
TL;DR: The prototype design represents a first step toward a fully integrated monolithic WCDMA/UMTS receiver system-on-a-chip; a rigorous set of performance tests is used to characterize the noise and linearity performance of the packaged IC across its full frequency band of operation.
Abstract: A prototype of a 3-V SiGe direct-conversion receiver integrated circuit for use in third-generation (3G) WCDMA mobile cellular systems has been completed. The goal of its design was to minimize current draw while meeting WCDMA receiver rf specifications with margin. The design includes a bypassable low-noise amplifier, quadrature downconverter, and first-stage variable-gain baseband amplifiers integrated on chip. The design is optimized for use with a single-ended off-chip bandpass surface-acoustic-wave filter with no external matching components. The prototype design represents a first step toward a fully integrated monolithic WCDMA/UMTS receiver system-on-a-chip. A rigorous set of performance tests is used to characterize the noise and linearity performance of the packaged IC across its full frequency band of operation. A receiver test-bed system with a software baseband demodulator is used to determine the bit-error-rate performance of the receiver integrated circuit (IC) at sensitivity. Measured results are compared with estimated system performance requirements to determine compliance with key WCDMA rf specifications.

Journal Article
Brenda Dietrich, Alan J. Hoffman
TL;DR: A common generalization of some theorems of Queyranne–Spieksma–Tardella, Faigle–Kern, and Fujishige about greedy algorithms for linear programs in diverse contexts is found and a well-known theorem of Topkis about submodular functions on the product of chains is extended.
Abstract: Recent developments in the use of greedy algorithms in linear programming are reviewed and extended. We find a common generalization of some theorems of Queyranne–Spieksma–Tardella, Faigle–Kern, and Fujishige about greedy algorithms for linear programs in diverse contexts. Additionally, we extend a well-known theorem of Topkis about submodular functions on the product of chains to submodular functions on the product of lattices.
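
The prototypical result being generalized is the greedy algorithm for polymatroids: to maximize c·x over {x ≥ 0 : x(S) ≤ f(S) for all S} with f submodular, monotone, and f(∅) = 0, process coordinates in order of decreasing (nonnegative) cost and give each the largest feasible increment. A minimal sketch under those assumptions:

    # Greedy for max c.x over a polymatroid defined by a submodular f.
    def greedy_polymatroid(costs, f):
        order = sorted(range(len(costs)), key=lambda j: -costs[j])
        x, prefix, prev = [0.0] * len(costs), [], f(frozenset())   # f(empty) = 0
        for j in order:
            prefix.append(j)
            cur = f(frozenset(prefix))
            x[j] = cur - prev          # marginal value; >= 0 since f is monotone
            prev = cur
        return x

    # Example: f(S) = min(|S|, 2), the rank function of a uniform matroid.
    print(greedy_polymatroid([5, 3, 4], lambda S: min(len(S), 2)))   # -> [1, 0, 1]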

Journal Article
TL;DR: The key challenges in mixed-signal and analog integrated circuit design at ultrahigh data rates are highlighted, along with solutions that leverage high-speed and microwave design and broadband SiGe technologies.
Abstract: Considerable progress has been made in integrating multi-Gb/s functions into silicon chips for data- and telecommunication applications. This paper reviews the key requirements for implementing such functions in monolithic form and describes their implementation in the IBM SiGe BiCMOS technology. Aspects focused on are the integration of 10-13-Gb/s serializer/deserializer chips with subpicosecond jitter performance, the realization of 40-56-Gb/s multiplexer/demultiplexer functions and clock-and-data-recovery/clock-multiplier units, and, finally, the implementation of some analog front-end building blocks such as limiting amplifiers and electro-absorption modulator drivers. Highlighted in this paper are the key challenges in mixed-signal and analog integrated circuit design at such ultrahigh data rates, and the solutions which leverage high-speed and microwave design and broadband SiGe technologies.

Journal Article
TL;DR: The migration of a complete digital circuit library from bulk to SOI is discussed to prove that SOI CMOS supports ASIC-style as well as fully custom circuit design.
Abstract: Systems-on-chips (SoCs) that combine digital and high-speed communication circuits present new opportunities for power-saving designs. This results from both the large number of system specifications that can be traded off to minimize overall power and the inherent low capacitance of densely integrated devices. As shown in this paper, aggressively scaled silicon-on-insulator (SOI) CMOS is a promising technology for SoCs for several reasons: Transistor scaling leads to active power reduction in the sub-50-nm-channel-length regime, standard interconnect supports the high-quality passive devices essential to communications circuitry, and high-speed analog circuits on SOI are state of the art in terms of both performance and power dissipation. We discuss the migration of a complete digital circuit library from bulk to SOI to prove that SOI CMOS supports ASIC-style as well as fully custom circuit design.

Journal Article
TL;DR: The organization of the SoC design is described, along with the capabilities provided in the design to match performance and power consumption to the needs of the application, and measured results for the PowerPC 405LP processor are presented.
Abstract: The PowerPC® 405LP system-on-a-chip (SoC) processor, which was developed for the high-content, battery-powered application space, provides dynamic voltage-scaling and on-the-fly frequency-scaling capabilities that allow the system and applications to adapt to changes in their performance demands and power constraints during operation. The 405LP operates over a voltage supply range of 1.95 to 0.9 V with a range of power efficiencies of 1.0 to 3.9 MIPS/mW when executing the Dhrystone benchmark. Operating system and application software support allow applications to take full advantage of the energy-efficiency capabilities of the SoC. This paper describes the organization of the SoC design, details the capabilities provided in the design to match the performance and power consumption with the needs of the application, describes how these capabilities are employed, and presents measured results for the PowerPC 405LP processor.
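
The reported efficiency range squares with the first-order model of dynamic CMOS power (a standard relation, not an equation from the paper): with P_dyn = αCV²f and frequency scaled along with voltage, energy per instruction falls roughly as V², so

    MIPS/mW ∝ 1/V²,   and   (1.95 V / 0.9 V)² ≈ 4.7,

the same order as the measured 3.9× spread in MIPS/mW.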