## Analogue integrated circuit sizing with several optimization runs using heuristics for setting initial points

J. Puhan, Á. Bűrmen and T. Tuma\*

\*E-mails: {janez.puhan, arpad.buermen, tadej.tuma}@fe.uni-lj.si

Circuit sizing (e.g. determining the MOSFET channel widths and lengths that result in the most appropriate and robust circuit) is an optimization process. When it is completed, a dilemma always remains: does a better solution exist? Different starting points can lead to different local minima. A heuristic process consisting of many optimization runs started from different initial points is proposed. In every run it tries to find another local minimum of the cost function and thus reveals additional information about the circuit. The mathematical background of the algorithm is described. Finally, the heuristic algorithm is tested on some real integrated operational amplifier designs. The results show that, from the cost function's point of view, surprisingly many equivalent solutions exist.

$\textbf{Keywords:}$ computer aided design, integrated circuits, optimization algorithms, sizing

#### I. Introduction

Creating a good analogue integrated circuit design (or the analogue part of a mixed circuit) is still a hard task, which usually requires the knowledge and skills of a senior designer. There are no predefined libraries of standard cells and networks as in the digital world. Therefore the design of an analogue circuit consisting of a few transistors can be more time consuming than the design of a fairly complex digital circuit. Application specific integrated circuit (ASIC) designers also frequently reuse their previous solutions and adapt them to their current needs. A circuit simulator is indispensable in this development procedure, but computers are mainly used to analyze human designs.

Initially a suitable circuit configuration is required, one which can potentially fulfil the given requirements. This task is mostly left to the designer, although several tools partially automating topology synthesis have appeared in the past [1]–[4]. Then the circuit sizing problem has to be solved. One looks for element sizes (e.g. MOSFET channel widths and lengths, capacitances, resistances, etc.) such that the required circuit properties are met in the most robust manner. Circuit sizing is an optimization process by its nature and the literature in this area is quite extensive. Sizing of nominal circuits was considered in [5]–[6], sizing problems accounting for parameter tolerances (parameter centering) were addressed in [7]–[9], and worst-case optimization in [10]–[12].
Various optimization tools were developed, such as the equation based GPCAD [13], which applies a geometric programming formulation of the optimization problem [14] to predefined posynomial equations, AMG [15], which utilises a symbolic simulator [16] to obtain circuit equations, and the simulation based ASTRX/OBLX [17]. Recently numerous papers (e.g. [12], [18]–[23]) have addressed the sizing problem from different aspects such as process and operating tolerances, mismatch, yield and robustness.

Despite all the research efforts made, circuit sizing is still a task that is addressed manually. New sizes for the next experiment are determined by a human designer and not automatically by an optimization method. In our opinion automated optimization is rarely used for three major reasons:

- there are no general optimization tools integrated into any of the most popular circuit simulators for ASIC design (optimization tools, e.g. [13], [15], [17], are not integrated into commercial simulators and therefore offer only very limited capabilities),
- the mathematical formulation of a cost function that yields acceptable solutions is rather complicated and demands an experienced user (optimization algorithms can get trapped in senseless regions of the parameter space, resulting in degenerate solutions; searching for the minimum of the cost function can also result in circuits that are highly sensitive to manufacturing process and operating condition variations [21]; a possible solution is the use of implicit constraints [14], [20], [23]), and
- the results of an optimization run cannot be trusted unconditionally (in many cases the minimum found is not the global one, even if a global optimization method was used).

This paper focuses on the last of these three drawbacks.

\*Faculty of Electrical Engineering, University of Ljubljana, Tržaška cesta 25, 1001 Ljubljana, Slovenia
There exist many different gradient, quasi gradient, and direct search optimization algorithms. A good survey of the first family can be found in [24]. Gradient based methods are greedy by default and require the derivatives of the cost function to be calculated at each iteration. When applied to circuit sizing, the derivatives are usually calculated by a sensitivity analysis, which means that the cost function cannot be of arbitrary form. These methods have a strong local nature and are therefore usually used for fine-tuning circuits [25]. Direct search methods [26]–[28], on the other hand, do not require additional gradient computations. Convergence properties of pattern search methods have been reported in [29]. Direct search methods can be classified by their behaviour as local or global. Some global methods even guarantee to find the global minimum if certain conditions are fulfilled [30]–[31].

The performance of an optimization method on a cost function depends on many parameters, one of which is the initial point. The same method can lead to quite different results for different initial points. Local methods are more sensitive in this respect than global ones. The latter always have some randomness built into them, which at least partially neutralizes the importance of a proper selection of the algorithm's initial point.

The selection of the initial point is usually left to the user, who relies upon knowledge and intuition. Usually a point is chosen where the circuit's best performance is expected. If the choice is right, the minimum of the cost function lies nearby and the optimization task turns into fine-tuning of the circuit. On the other hand, no additional information is gained. The optimization process just confirms the expectations. A great part of the parameter space is left unexplored and the question of whether a better solution exists remains open. If we want to be assured that no better point exists, then the whole parameter space has to be explored.
One way to do this is to optimize the circuit starting from several different initial points, where each optimization run has to cover a different part of the parameter space. The optimization process becomes a group of individual optimization runs. Optimization methods have limited memory and therefore only a few points from previous iterations are used to determine the next step. Today computers easily store all the evaluated points, while the evaluation itself is still computationally expensive. Thus the initial point for the next optimization run should be determined using the information obtained from all evaluated points.

This paper proposes a heuristic method based on the probabilistic approach [32]–[33]. The method puts the new initial point in a part of the parameter space where the probability of finding a new minimum is high. It can be applied to a multidimensional parameter space and does not require significant computational effort. Several minima are obtained in such an optimization process. The designer can decide which one is most appealing and may even continue with the investigation of the unexplored parts of the parameter space.

First the mathematical background of the assumptions used later in the heuristic algorithm is outlined. Then the heuristic algorithm is described. Finally several optimization cases of CMOS integrated operational amplifiers are illustrated and the obtained results are discussed.

#### II. Mathematical Background, One Dimensional Probabilistic Approach

Let us define a continuous positive stochastic process $f(x,\omega)$. It assigns a positive function $f(x)\geq 0$ to every outcome $\omega\in\Omega$ of an experiment $\zeta$. The domain of $\omega$ is the set of all experimental outcomes $\Omega$, and the domain of $x$ is the set of real numbers $\Re$. The probability density function $g(f_0,x)$ is the derivative with respect to $f_0$ of the probability of the event $\{f(x,\omega)\leq f_0\}$. Because we defined a positive process, we know that $g(f_0,x)=0$ for $f_0<0$.
Let us further say we have $k$ pairs $(x_i, F(x_i))$, $i = 1, 2, \ldots, k$, named known points. An event $Z_k$ occurs when a realization of the stochastic process $f(x)$ goes through all known points. In other words, the event $Z_k$ is defined as $\{f(x_i, \omega) = F(x_i), i = 1, 2, \ldots, k\}$. It becomes certain if the mean value $m(x)$ is equal to the function value at all known points and if the variance $\sigma^2(x)$ at those points is zero. Therefore $m(x_i) = F(x_i)$ and $\sigma^2(x_i) \to 0$ for $i = 1, 2, \ldots, k$.

The cost function $F(\mathbf{x}), \mathbf{x} \in A \subseteq \Re^n, F: \Re^n \to \Re^+$ of an optimization problem is usually defined as a transformation from an $n$ dimensional closed and simply connected feasible region $A$ into a positive number (zero included). If we constrain ourselves to one dimension, then $F(\mathbf{x})$ becomes $F(x)$ and the feasible region becomes an interval $A = [x_{low}, x_{high}]$. After one or more optimization runs the cost function has been evaluated at several points, which were denoted as known points in the paragraph above. Therefore the cost function values $F(x_i)$ at parameter values $x_i$, $i=1,2,\ldots,k$, represent all the information we have at the time. Let $opt$ be the index of the point with the lowest cost function value among the known points ($F(x_{opt}) \leq F(x_i)$, $i=1,2,\ldots,k$). If the mean value and the variance have the properties mentioned above, then the event $Z_k$ always occurs. In this case every realization of the stochastic process can represent the unknown cost function. Recall that the probability density function of the stochastic process has not been defined yet.

The question is where to choose the new initial point for the next optimization run if the cost function is already known at $k$ points. A natural decision is to set it where the expected value $\mathrm{E}\{\min(F(x_{opt}), f(x,\omega)) \mid Z_k\}$ is minimal. To find a new starting point $x_0$, the minimization problem (1) has to be solved.
The integral definition of the expected value expresses the minimization problem in terms of the density function $g_{min}(f_0,x)$ of $\min(F(x_{opt}),f(x,\omega))$.

$$\begin{aligned} x_{0} &= \arg\min_{x \in A} \left( \mathrm{E}\{\min(F(x_{opt}), f(x, \omega)) \mid Z_{k}\} \right) \\ &= \arg\min_{x \in A} \left( \int_{-\infty}^{\infty} f_{0}\, g_{min}(f_{0}, x)\, df_{0} \right) \\ & m(x_{i}) = F(x_{i}),\ \sigma^{2}(x_{i}) \rightarrow 0,\ i = 1, 2, \dots, k \end{aligned}$$ (1)

Since $\min(F(x_{opt}), f(x, \omega)) \leq F(x_{opt})$, we know that the probability density $g_{min}(f_0, x) = 0$ for $f_0 > F(x_{opt})$. For $f_0 < F(x_{opt})$ the density $g_{min}(f_0, x)$ is equal to $g(f_0, x)$, and the whole probability of $f(x,\omega)$ exceeding $F(x_{opt})$ is concentrated at $f_0 = F(x_{opt})$. It follows that

$$g_{min}(f_0, x) = u(F(x_{opt}) - f_0)\, g(f_0, x) + \delta(F(x_{opt}) - f_0) \int_{F(x_{opt})}^{\infty} g(f, x)\, df$$ (2)

where the functions $u(F(x_{opt}) - f_0)$ and $\delta(F(x_{opt}) - f_0)$ represent a unit step function and its derivative, a unit Dirac impulse, respectively (Fig. 1). Inserting (2) into (1) yields equation (3).

$$\begin{aligned} x_{0} &= \arg\min_{x \in A} \left( \int_{-\infty}^{F(x_{opt})} f_{0}\, g(f_{0}, x)\, df_{0} + F(x_{opt}) \int_{F(x_{opt})}^{\infty} g(f, x)\, df \right) \\ & m(x_{i}) = F(x_{i}),\ \sigma^{2}(x_{i}) \to 0,\ i = 1, 2, \dots, k \end{aligned}$$ (3)

Due to the assumptions on the mean value and the variance, any realization of the stochastic process can represent the unknown cost function. The probability density $g(f_0,x)$ is still an arbitrary function. In order to solve the minimization problem (3) it has to be defined. In other words, we finally have to define the last undetermined property of the stochastic process.

Let us assume that the cost function is continuous in the neighbourhood of all known points ($x \in [x_i - \varepsilon_i, x_i + \varepsilon_i], \varepsilon_i > 0, i = 1, 2, \ldots, k$). Then inside those neighbourhoods it can be a sample path of a constrained limited random walk $w_{pos}(x,\omega) = \max(0,w(x,\omega))$, where the function $w(x,\omega)$ represents the unconstrained limited random walk, also known as the Wiener process.
The probability density function $g_{nor}(f_0,x)$ of the Wiener process is normal with a constant mean value and a linearly increasing variance. The Wiener process is a continuous function of the variable $x$. Let our stochastic process be a constrained limited random walk $f(x,\omega) = w_{pos}(x,\omega)$ inside the neighbourhoods of all known points. Then in those areas its probability density function can be expressed as a function of $g_{nor}(f_0,x)$ (Fig. 1). The mean value $m_i(x) = F(x_i)$ and the variance $\sigma_i^2(x) = \alpha |x - x_i|, \alpha > 0$, of $g_{nor}(f_0,x)$ are different in every neighbourhood (denoted by index $i$).

$$\begin{aligned} g(f_0, x) &= \delta(f_0) \int_{-\infty}^0 g_{nor_i}(f, x)\, df + u(f_0)\, g_{nor_i}(f_0, x) \\ & x \in [x_i - \varepsilon_i, x_i + \varepsilon_i],\ \varepsilon_i > 0 \\ & m_i(x) = F(x_i),\ \sigma_i^2(x) = \alpha |x - x_i|,\ \alpha > 0 \\ & i = 1, 2, \dots, k \end{aligned}$$ (4)

The assumption of continuity in the neighbourhood of known points does not place any physically unrealistic limitations on the types of cost functions which result from circuit design optimization problems. Also note that the mean and the variance have the properties which make the event $Z_k$ certain. The probability density function is now defined in the neighbourhood of known points and not for the whole feasible region $A$. In the neighbourhood of the $i^{\text{th}}$ known point the expression in (3) which needs to be minimized becomes a function of the mean value and the variance (5).

$$\begin{aligned} h(m_{i}(x), \sigma_{i}^{2}(x)) &= \int_{0}^{F(x_{opt})} f_{0}\, g_{nor_{i}}(f_{0}, x)\, df_{0} + F(x_{opt}) \int_{F(x_{opt})}^{\infty} g_{nor_{i}}(f, x)\, df \\ & x \in [x_i - \varepsilon_i, x_i + \varepsilon_i],\ \varepsilon_i > 0 \\ & m_i(x) = F(x_i),\ \sigma_i^2(x) = \alpha |x - x_i|,\ \alpha > 0 \end{aligned}$$ (5)

Since $g_{nor_i}(f_0,x)$ represents the normal distribution, (5) is a monotonically increasing function of the mean value and a monotonically decreasing function of the variance.
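The monotonic behaviour of (5) can be checked numerically. The following is a minimal sketch, not part of the original method: it evaluates (5) in closed form for a normal density, using the standard identity $\int_a^b f\, g_{nor}(f)\, df = m[\Phi(b') - \Phi(a')] - \sigma[\varphi(b') - \varphi(a')]$ with $a' = (a-m)/\sigma$ and $b' = (b-m)/\sigma$. The function name `expected_clipped_cost` and the numeric values are hypothetical, and the mean is assumed to satisfy $m \geq F(x_{opt})$, as it does for every known point.

```python
import math


def _pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_clipped_cost(mean, variance, f_opt):
    """Closed-form value of expression (5) for a normal density.

    mean     -- m_i(x) = F(x_i), cost at the i-th known point
    variance -- sigma_i^2(x) = alpha * |x - x_i|, must be > 0
    f_opt    -- F(x_opt), lowest known cost
    """
    sigma = math.sqrt(variance)
    a = (0.0 - mean) / sigma      # standardized lower limit 0
    b = (f_opt - mean) / sigma    # standardized upper limit F(x_opt)
    # First integral of (5): f0 * g_nor over [0, F(x_opt)].
    partial_mean = mean * (_cdf(b) - _cdf(a)) - sigma * (_pdf(b) - _pdf(a))
    # Second term of (5): F(x_opt) times the probability mass above F(x_opt).
    upper_tail = f_opt * (1.0 - _cdf(b))
    return partial_mean + upper_tail


if __name__ == "__main__":
    f_opt = 1.0
    for mean, variance in [(2.0, 0.5), (2.0, 1.0), (2.0, 2.0), (3.0, 1.0)]:
        h = expected_clipped_cost(mean, variance, f_opt)
        print(f"m={mean} var={variance} h={h:.4f}")
```

With these illustrative values, raising the mean raises $h$ and raising the variance lowers it, which is the behaviour the two conclusions about the new initial point rely on.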
The mean corresponds to the cost function value at the $i^{\rm th}$ known point and the variance corresponds to the distance from it. Because we are searching for the minimum in (3), the above statements lead to two conclusions:

- first, because (5) increases with $F(x_i)$, the new initial point $x_0$ lies closer to the known points with lower cost function values than to those with higher cost function values,
- second, because (5) decreases with $|x - x_i|$, it lies away from all known points, so that the distance to the nearest one is as large as possible.

Both conclusions can be intuitively generalized to an $n$ dimensional parameter space. A simple heuristic method described in the following section is based on this generalization.

#### III. A Heuristic Method for Finding New Initial Points

The second conclusion tells us that a new initial point has to be somewhere in the parameter space where the density of already evaluated points is low. If it is low, then we expect the average distance between two nearest points to be large in general. But we have to define how to measure the density of known points. Let us divide the parameter space into $2^n$ equal subspaces ($2^n$ equal boxes). Let the density be equal to the number of known points in a particular subspace, and let it be constant across the whole subspace. A new initial point will be chosen in the subspace with the lowest density.

The first conclusion, on the other hand, tells us that the contribution to the density should not be the same for all already evaluated points. Those with lower cost function values should contribute less than the ones with higher cost function values. In the previous definition all of them contributed one unit, regardless of the cost function value. Therefore the known points have to be weighted. Each point contributes its weight, which increases linearly with its cost. Let the weight $u$ of a point with cost function value $F$ be defined by equation (6).
$$u = \frac{(\beta - 1)F + F_{max} - \beta F_{min}}{F_{max} - F_{min}}$$ (6)

$F_{min}$ and $F_{max}$ represent the lowest and the highest cost function value among the already determined points, respectively. The point with the lowest cost function value always has weight one. The weight of the point with the highest cost function value is given by the coefficient $\beta$, so it contributes $\beta$ times more to the density than the lowest point.

So far, known points that violate implicit constraints are not included in our definition of density. They lack a cost function value $F$, so their weight cannot be calculated by equation (6). But those points give us some information about the cost function and therefore they have to be taken into account. We set their weight to $2\beta$.

Finally, the heuristic algorithm for determining a new initial point for the next optimization run is described by the repeat-until loop in Fig. 2. The space is repeatedly divided into $2^n$ equal subspaces until we find a subspace which contains no known points. A new initial point is selected there randomly. The algorithm is very simple, so it demands only a small amount of computational time.

#### IV. Sizing Problem Cases and Results

In this section three CMOS design cases are described to illustrate the capabilities of the proposed approach. Two simple two-stage operational amplifiers with a p-channel and an n-channel differential pair (Figs. 3 and 4) and a telescopic cascode operational amplifier (Fig. 5) were optimized. Several versions of these three sample circuits, optimized to meet different requirements, were used as parts of larger mixed signal integrated circuits. The amplifiers were designed for and produced in $0.3 \mu \rm m$ and $0.8 \mu \rm m$ technology. The optimized parameters were all transistor channel dimensions (widths and lengths), MOS multiplier factors, and also the resistances and the capacitances.
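Returning briefly to the heuristic of Section III, the weighting rule (6) and the repeated subdivision of Fig. 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `point_weight` and `next_initial_point` are hypothetical, infeasible points are marked with `cost=None`, all costs being equal is assumed to give unit weights, and an empty sub-box is located by random sampling so that the $2^n$ sub-boxes never have to be enumerated. Unlike the repeat-until loop of Fig. 2, the sketch does not subdivide at all when no points are known yet.

```python
import random


def point_weight(cost, f_min, f_max, beta):
    """Weight of a known point per equation (6); None marks an infeasible point."""
    if cost is None:
        return 2.0 * beta              # points violating implicit constraints
    if f_max == f_min:
        return 1.0                     # assumption: equal costs -> unit weight
    return ((beta - 1.0) * cost + f_max - beta * f_min) / (f_max - f_min)


def next_initial_point(points, costs, lower, upper, beta=2.0, rng=random):
    """Pick a new initial point in a sub-box that contains no known points."""
    n = len(lower)
    feasible = [c for c in costs if c is not None]
    f_min = min(feasible, default=0.0)
    f_max = max(feasible, default=0.0)
    weights = [point_weight(c, f_min, f_max, beta) for c in costs]

    lo, hi = list(lower), list(upper)
    inside = list(range(len(points)))  # known points inside the current box
    while inside:
        mid = [(a + b) / 2.0 for a, b in zip(lo, hi)]
        sums, members = {}, {}
        for i in inside:
            # A sub-box is identified by which half it occupies in each dimension.
            key = tuple(points[i][d] >= mid[d] for d in range(n))
            sums[key] = sums.get(key, 0.0) + weights[i]
            members.setdefault(key, []).append(i)
        if len(sums) < 2 ** n:
            # At least one sub-box is empty (sum of weights 0): sample one.
            key = tuple(rng.random() < 0.5 for _ in range(n))
            while key in sums:
                key = tuple(rng.random() < 0.5 for _ in range(n))
            inside = []
        else:
            # Every sub-box is occupied: descend into the lightest one.
            key = min(sums, key=sums.get)
            inside = members[key]
        for d in range(n):
            if key[d]:
                lo[d] = mid[d]
            else:
                hi[d] = mid[d]
    return [rng.uniform(lo[d], hi[d]) for d in range(n)]


if __name__ == "__main__":
    known = [[0.25, 0.25], [0.75, 0.25], [0.25, 0.75]]
    print(next_initial_point(known, [1.0, 3.0, None], [0.0, 0.0], [1.0, 1.0]))
```

In the usage example three quadrants of the unit square are occupied, so the new point falls in the remaining quadrant. The value `beta=2.0` is only a placeholder; the paper does not state which value was used.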
Variations of circuit device (transistor, capacitor, resistor, etc.) properties arising from manufacturing process variations can cause a circuit to fail to fulfil the design requirements. IC manufacturers describe process variations by means of so-called corner models. Corner models describe several extreme conditions which may occur during IC fabrication and result in extreme circuit device behaviour. For a CMOS process, for instance, worst power, worst speed, worst one, worst zero, etc. corner models of MOSFETs are provided. Usually a typical or nominal model is also supplied. Besides corner models, every operating condition (supply and reference voltages, bias currents, temperature, etc.) brings along a nominal value and at least two extreme values (minimal and maximal). A particular combination of corner model and operating condition values is called a corner point. One should keep in mind that the number of corner points is usually quite large.

The idea of robust design as sometimes practiced by IC designers relies on the assumption that the circuit characteristics reach their extreme values at extreme operating conditions and process variations. In order to establish whether the design is robust, designers examine the performance of the circuit for all corner points. If the circuit performance is satisfactory in all relevant corner points, then the design is considered robust.

The tolerances of a device's physical dimensions have a far smaller impact on its properties than the manufacturing process and operating condition variations. Therefore they are usually neglected. The only case in which dimension tolerances are important is mismatch analysis. In our cases mismatch is simulated by slight variations of the transistor model of one of the matching transistors. The circuit characteristics that participate in the cost function are listed in the upper part of Tables 1 and 2.
The cost function for a single corner point is formulated as a weighted sum which combines the results of several types of analyses. Including all circuit characteristics which are essential for the circuit design in the cost function guarantees that the optimization procedure searches for the most acceptable trade-off among them. Therefore no single circuit property is individually minimized or maximized. Due to the variety of circuit characteristics, several types of analyses have to be performed in order to calculate the cost function. The global minimum of the cost function for a single corner point represents the trade-off where the circuit has the most suitable properties for that particular corner point.

Besides searching for an optimal nominal circuit in the nominal corner point, robustness is also taken into account. The information about design robustness is included in the cost function. Since the design is considered robust if it performs well in all relevant corner points, the circuit characteristics are evaluated for those corner points as well [12] and the corresponding terms are added to the cost function. Therefore in each iteration of the optimization procedure the required circuit analyses have to be done for the nominal operating and process conditions and for the relevant corner points. The cost function summarizes all the obtained data into a single number which represents how good and how robust a particular design is.

The shape of such a complicated cost function in a multidimensional parameter space is completely unknown. Finding a global minimum (the best robust trade-off) is a difficult task for any optimization method and circuit simulator, since it requires many circuit analyses. Nevertheless we expect that somewhere in the parameter space there is a global minimum which defines the optimal solution satisfying the given requirements.
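The structure of such a cost function can be sketched as below. This is only an illustration under assumptions: the paper states that the cost is a weighted sum of deviations from target values over the nominal and worst conditions and that a fulfilled target is not improved any further, but the exact penalty form is given in [12] and is not reproduced here. The names `corner_cost`, `robust_cost`, the spec tuple layout, the property names and the stand-in `fake_measure` are all hypothetical; in practice the measurements would come from circuit simulator analyses.

```python
def corner_cost(measurements, specs):
    """Weighted sum of target violations for one corner point.

    specs maps a property name to (target, sense, weight), where sense is
    "at_least" or "at_most". A fulfilled target contributes nothing.
    """
    total = 0.0
    for name, (target, sense, weight) in specs.items():
        value = measurements[name]
        shortfall = target - value if sense == "at_least" else value - target
        total += weight * max(0.0, shortfall)
    return total


def robust_cost(measure, design, corners, specs):
    """Sum the per-corner costs over the nominal and all relevant corner points.

    measure(design, corner) is assumed to run the required analyses and
    return a dict of measured circuit properties.
    """
    return sum(corner_cost(measure(design, corner), specs) for corner in corners)


if __name__ == "__main__":
    # Hypothetical targets: gain of at least 60 dB, supply current at most 700 uA.
    specs = {
        "gain_db": (60.0, "at_least", 1.0),
        "current_ua": (700.0, "at_most", 0.01),
    }

    def fake_measure(design, corner):
        # Stand-in for simulator runs; values are made up for illustration.
        table = {
            "nominal": {"gain_db": 65.0, "current_ua": 650.0},
            "worst_speed": {"gain_db": 55.0, "current_ua": 750.0},
        }
        return table[corner]

    print(robust_cost(fake_measure, None, ["nominal", "worst_speed"], specs))
```

Because fulfilled targets add zero, the nominal corner in this example contributes nothing and the whole cost comes from the violated targets in the second corner.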
The results for the two-stage operational amplifiers are summarized in Table 1 and those for the telescopic cascode operational amplifier in Table 2. Because of the table size, only some of the optima found with the initial point set by the described heuristics are given. The upper part of both tables contains the nominal circuit performance. The lower part summarizes the parameter values in each minimum. The ratio multiplying factor \* channel width / channel length (mw/l) is given for some transistors in all three cases. If short channel effects in the submicron region are neglected, then this ratio defines a transistor. Therefore it is convenient for estimating whether two solutions are equivalent.

The optimization method used in a particular run is not essential. In fact any local method can be used, whereas global methods are less suitable because they tend towards the global minimum regardless of the chosen initial point. Direct methods are preferable since the derivatives of the cost function are not required (they are often impossible to calculate without resorting to perturbation methods, which are not accurate enough). So one can use any simplex, quasi gradient (metric matrix, trust region, etc.), heuristic, etc. based method. In our experiments a heuristic simplex based method was used. The cost function was composed as a weighted sum of deviations from the target values for nominal and worst conditions. If a particular target is fulfilled, the optimization process does not tend to improve it any further.

Approximately 500 to 1000 circuit evaluations were needed for one run to converge, and on average every third run was successful. Thus the results in Table 2 were obtained in 30000 circuit evaluations. Comparing this result to the performance of well known global optimization methods like simulated annealing or genetic algorithms is encouraging, since those need over 150000 circuit evaluations to optimize a circuit like the telescopic cascode amplifier.
From all presented cases we can see that many different solutions of the circuit sizing problem exist. An interesting parallel can be drawn with [34]–[35], where the entire circuit synthesis problem (topology and sizing) was addressed by genetic programming. Uncommon circuit topology solutions were found besides well known ones.

More or less the same circuit properties can be obtained with several different sets of circuit parameters. Two explanations are at hand:

1. the target values are too loose for the circuit configuration used and for the given technology and are easily fulfilled, or
2. the optimization run is stopped at different trade-offs among the given targets.

Because all requirements are never fulfilled, the second explanation is more probable. To confirm this, the same experiments were repeated with tighter targets. The requirements remained unfulfilled and the individual solutions did not merge. A closer look at Table 2 also confirms that the solutions represent trade-offs among the required targets. We can see for instance that the last two results have complementary properties. While the solution from column nine has low $v_{pp}$, pm and am, it has high $i_p$ and $f_{0\rm dB}$. The last circuit (column 10), on the other hand, has the opposite properties. The same observations can be made in Table 1.

#### V. Conclusion

A simple heuristic method for setting the initial points of individual optimization runs was described. The idea is based on a one dimensional probabilistic approach extended to a multidimensional parameter space. The main objective is to search the parameter space uniformly with a sequence of optimization runs. Each run contributes some new information about the shape of the cost function in the multidimensional parameter space. Different local minima are found, if they are present. Multiple solutions are obtained, providing additional insight into circuit behaviour.
The designer can decide which one is the most appropriate and continue his/her work from there with fine-tuning. Fine-tuning is usually necessary since the obtained minimum of the cost function does not necessarily satisfy the designer's expectations.

A statistical model of the cost function was presented. The construction of the cost function itself [12] is beyond the scope of this paper. The method takes into account all collected cost function data. Therefore all calculated points must be stored, and some additional MBytes of RAM are occupied for that reason. On the other hand the method requires only a small computing effort and does not take a considerable amount of time.

The optimization method used in the individual runs can be an arbitrary fast greedy (local) method. Fast convergence of such methods ensures short runtimes, whereas global methods (like simulated annealing or genetic algorithms) in general converge slowly. More information is obtained instead of a single minimum. Our method can try several different initial points in the time needed by a global method to converge.

#### References

- [1] M.G.R. Degrauwe et al., "IDAC: An interactive design tool for analog CMOS circuits," *IEEE J. Solid-State Circuits*, vol. sc-22, no. 6, Dec. 1987, pp. 1106–1116
- [2] R. Harjani, R.A. Rutenbar and L.R. Carley, "OASYS: A framework for analog circuit synthesis," *IEEE Trans. Computer-Aided Design*, vol. 8, no. 12, Dec. 1989, pp. 1247–1266
- [3] H.Y. Koh, C.H. Séquin and P.R. Gray, "OPASYN: A compiler for CMOS operational amplifiers," *IEEE Trans. Computer-Aided Design*, vol. 9, no. 2, Feb. 1990, pp. 113–125
- [4] J.P. Harvey, M.I. Elmasry and B. Leung, "STAIC: An interactive framework for synthesizing CMOS and BiCMOS analog circuits," *IEEE Trans. Computer-Aided Design*, vol. 11, no. 11, Nov. 1992, pp. 1402–1417
- [5] R.K. Brayton, G.D. Hachtel and A.L. Sangiovanni-Vincentelli, "A survey of optimization techniques for integrated-circuit design," *Proc. IEEE*, vol. 69, no. 10, Oct. 1981, pp. 1334–1364
- [6] W. Nye et al., "DELIGHT.SPICE: An optimization-based system for the design of integrated circuits," *IEEE Trans. Computer-Aided Design*, vol. 7, no. 4, Apr. 1988, pp. 501–519
- [7] S.W. Director and G.D. Hachtel, "The simplicial approximation approach to design centering," *IEEE Trans. Circuits Syst. I*, vol. cas-24, no. 7, July 1977, pp. 363–372
- [8] K.J. Antreich and R.K. Koblitz, "Design centering by yield prediction," *IEEE Trans. Circuits Syst. I*, vol. cas-29, no. 2, Feb. 1982, pp. 88–96
- [9] P. Feldmann and S.W. Director, "Integrated circuit quality optimization using surface integrals," *IEEE Trans. Computer-Aided Design*, vol. 12, no. 12, Dec. 1993, pp. 1868–1879
- [10] K.J. Antreich, H.E. Graeb and C.U. Wieser, "Circuit analysis and optimization driven by worst-case distances," *IEEE Trans. Computer-Aided Design*, vol. 13, no. 1, Jan. 1994, pp. 57–71
- [11] A. Dharchoudhury and S.M. Kang, "Worst-case analysis and optimization of VLSI circuit performances," *IEEE Trans. Computer-Aided Design*, vol. 14, no. 4, Apr. 1995, pp. 481–492
- [12] Á. Bűrmen et al., "Automated robust design and optimization of integrated circuits by means of penalty functions," *Int. J. Electron. Comm.*, vol. 57, no. 1, 2003, pp. 47–56
- [13] M. del Mar Hershenson, S.P. Boyd and T.H. Lee, "GPCAD: A tool for CMOS op-amp synthesis," *1998 IEEE/ACM Int. Conf. Comput.-Aided Design*, New York, 1998, pp. 296–303
- [14] M. del Mar Hershenson, S.P. Boyd and T.H. Lee, "Optimal design of a CMOS op-amp via geometric programming," *IEEE Trans. Computer-Aided Design*, vol. 20, no. 1, Jan. 2001, pp. 1–21
- [15] G. Gielen et al., "An analogue module generator for mixed analogue/digital ASIC design," *Int. J. Circuit Theory and App.*, vol. 23, no. 4, July–Aug. 1995, pp. 269–283
- [16] G.G.E. Gielen, H.C.C. Walscharts and W.M.C. Sansen, "Analog circuit design optimization based on symbolic simulation and simulated annealing," *IEEE J. Solid-State Circuits*, vol. 25, no. 3, June 1990, pp. 707–713
- [17] E.S. Ochotta, R.A. Rutenbar and L.R. Carley, "Synthesis of high-performance analog circuits in ASTRX/OBLX," *IEEE Trans. Computer-Aided Design*, vol. 15, no. 3, Mar. 1996, pp. 273–294
- [18] G. Debyser and G. Gielen, "Efficient analog circuit synthesis with simultaneous yield and robustness optimization," *1998 IEEE/ACM Int. Conf. Comput.-Aided Design*, New York, 1998, pp. 308–311
- [19] R. Schwencker et al., "Automating the sizing of analog CMOS circuits by consideration of structural constraints," *DATE Conf. and Exhibition, 1999*, Los Alamitos, 1999, pp. 323–327
- [20] R. Phelps et al., "Anaconda: simulation-based synthesis of analog circuits via stochastic pattern search," *IEEE Trans. Computer-Aided Design*, vol. 19, no. 6, June 2000, pp. 703–717
- [21] T. Mukherjee, L.R. Carley and R.A. Rutenbar, "Efficient handling of operating range and manufacturing line variations in analog cell synthesis," *IEEE Trans. Computer-Aided Design*, vol. 19, no. 8, Aug. 2000, pp. 825–839
- [22] F. Schenkel et al., "Mismatch analysis and direct yield optimization by spec-wise linearization and feasibility-guided search," *Proc. 38th DAC*, New York, 2001, pp. 858–863
- [23] P. Mandal and V. Visvanathan, "CMOS op-amp sizing using a geometric programming formulation," *IEEE Trans. Computer-Aided Design*, vol. 20, no. 1, Jan. 2001, pp. 22–38
- [24] J. Nocedal, "Theory of algorithms for unconstrained optimization," *Acta Numerica*, vol. 1, 1992, pp. 199–242
- [25] A.R. Conn et al., "JiffyTune: circuit optimization using time-domain sensitivities," *IEEE Trans. Computer-Aided Design*, vol. 17, no. 12, Dec. 1998, pp. 1292–1309
- [26] M.H. Wright, "Direct search methods: once scorned, now respectable," *Proc. 1995 Dundee Biennial Conf. in Numerical Analysis*, 1995, pp. 191–208
- [27] M.J.D. Powell, "Direct search algorithms for optimization calculations," *Acta Numerica*, vol. 7, 1998, pp. 287–336
- [28] R.M. Lewis, V. Torczon and M.W. Trosset, "Direct search methods: then and now," *J. Computational and Applied Math.*, vol. 124, no. 1–2, Dec. 2000, pp. 191–207
- [29] V. Torczon, "On the convergence of pattern search algorithms," *SIAM J. Optimization*, vol. 7, no. 1, Feb. 1997, pp. 1–25
- [30] F. Romeo and A. Sangiovanni-Vincentelli, "A theoretical framework for simulated annealing," *Algorithmica*, vol. 6, no. 3, 1991, pp. 302–345
- [31] A.W. Johnson and S.H. Jacobson, "On the convergence of generalized hill climbing algorithms," *Discrete Applied Math.*, vol. 119, no. 1–2, June 2002, pp. 37–57
- [32] A.G. Zilinskas, "On statistical models of complex multimodal functions and their application to the design of optimization algorithms," *Problems of Control and Inform. Theory* (English translation), vol. 10, no. 1, 1981, pp. 19–30
- [33] R. Buche and H.J. Kushner, "Rate of convergence for constrained stochastic approximation algorithms," *SIAM J. Control and Optimization*, vol. 40, no. 4, 2001, pp. 1011–1041
- [34] J.R. Koza et al., "Automated synthesis of analog electrical circuits by means of genetic programming," *IEEE Trans. Evol. Comput.*, vol. 1, no. 2, July 1997, pp. 109–128
- [35] J.R. Koza et al., "Synthesis of topology and sizing of analog electrical circuits by means of genetic programming," *Comput. Methods in Applied Mechanics and Eng.*, vol. 186, no. 2–4, June 2000, pp. 459–482

**Figure 1:** Probability density functions of the Wiener, stochastic and constrained stochastic process for $i \neq opt$, $x \neq x_i$ and $x$ in the $i^{\text{th}}$ neighbourhood. The Dirac impulses at 0 and $F(x_{opt})$ represent the definite integrals of $g_{nor_i}(f_0, x)$ from $-\infty$ to 0 and from $F(x_{opt})$ to $\infty$, respectively.
    calculate weights for all known points;
    temporary space := explicitly constrained space;
    repeat
        divide temporary space into 2^n equal subspaces;
        add up weights in particular subspaces;
        temporary space := subspace with the lowest sum of weights;
    until lowest sum ≠ 0
    randomly pick new point in temporary space;

**Figure 2:** Symbolic algorithm of heuristic initial point determination for a new optimization run.

**Figure 3:** Operational amplifier with p-channel differential pair.

**Figure 4:** Operational amplifier with n-channel differential pair.

**Figure 5:** Telescopic cascode operational amplifier.

**Table 1:** Results of some successful optimization runs for both two-stage amplifiers (0.8 $\mu$m technology)

| property | unit | target | *p*-channel diff. pair | | | | *n*-channel diff. pair | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| $A$ | $\mu\mathrm{m}^2$ | $\downarrow$ | 11619 | 12289 | 12241 | 10105 | 14151 | 13521 | 17706 | 14286 |
| $v_{pp}$ | V | $\uparrow$ | 3.7 | 3.7 | 3.8 | 3.6 | 3.8 | 3.9 | 3.8 | 3.8 |
| $v_{pp}/v_{inpp}$ | | $\uparrow$ | 2101 | 2937 | 2153 | 2159 | 4535 | 4246 | 4232 | 4741 |
| $v_{offset}$ | $\mu\mathrm{V}$ | $\downarrow$ | 87 | 60 | 96 | 49 | 32 | 81 | 49 | 11 |
| $v_{outoffset}$ | mV | $\downarrow$ | 201 | 199 | 198 | 199 | 99 | 101 | 100 | 100 |
| $i_p$ | $\mu\mathrm{A}$ | $\downarrow$ | 727 | 636 | 674 | 559 | 689 | 828 | 754 | 659 |
| $f_{0\mathrm{dB}}$ | MHz | $\uparrow$ | 20 | 20 | 20 | 14 | 16 | 20 | 14 | 13 |
| $pm$ | $^\circ$ | $\uparrow$ | 37 | 37 | 31 | 23 | 34 | 40 | 55 | 37 |
| $am$ | dB | $\downarrow$ | -39 | -37 | -24 | -22 | -40 | -32 | -38 | -40 |
| $CMRR$ | dB | $\downarrow$ | -96 | -100 | -91 | -97 | -108 | -106 | -104 | -102 |
| $PSRRp$ | dB | $\downarrow$ | -89 | -90 | -112 | -101 | -49 | -50 | -46 | -48 |
| $PSRRn$ | dB | $\downarrow$ | -62 | -62 | -60 | -58 | -50 | -51 | -51 | -52 |
| $noise_{1/f}$ | $\mathrm{nV}/\sqrt{\mathrm{Hz}}$ | $\downarrow$ | 100 | 91 | 80 | 56 | 114 | 100 | 102 | 108 |
| $noise_{term}$ | $\mathrm{nV}/\sqrt{\mathrm{Hz}}$ | $\downarrow$ | 9.0 | 9.5 | 9.0 | 10.3 | 8.6 | 9.4 | 9.0 | 8.6 |
| $t_{rise}$ | ns | $\downarrow$ | 361 | 431 | 405 | 431 | 285 | 259 | 243 | 312 |
| $t_{fall}$ | ns | $\downarrow$ | 174 | 134 | 171 | 216 | 479 | 426 | 440 | 582 |
| transistor | | | $mw/l$ ratio | | | | | | | |
| differential pair | | | 173 | 141 | 200 | 151 | 130 | 117 | 41 | 154 |
| active load | | | 12 | 9 | 12 | 4 | 18 | 21 | 41 | 14 |
| current source | | | 18 | 24 | 18 | 7 | 19 | 10 | 31 | 39 |

Notations: $A \dots$ area, $v_{pp} \dots$ peak-to-peak voltage, $v_{pp}/v_{inpp} \dots$ dc gain, $v_{offset} \dots$ offset voltage, $v_{outoffset} \dots$ symmetry, $i_p \dots$ current consumption, $f_{0\mathrm{dB}} \dots$ frequency at 0 dB gain, $pm \dots$ phase margin, $am \dots$ amplitude margin, $CMRR \dots$ common mode rejection ratio, $PSRRp \dots$ power supply rejection ratio to positive terminal, $PSRRn \dots$ power supply rejection ratio to negative terminal, $noise_{1/f} \dots$ $1/f$ noise at low frequencies (at 100 Hz), $noise_{term} \dots$ thermal noise at higher frequencies (at 100 kHz), $t_{rise} \dots$ rise time, $t_{fall} \dots$ fall time, $m \dots$ transistor multiplier, $w \dots$ channel width and $l \dots$ channel length. Symbols $\uparrow$ and $\downarrow$ indicate that the desired value is as high or as low as possible, respectively.
| property | unit | target | telescopic cascode operational amplifier | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $A$ | $\mu\mathrm{m}^2$ | $\downarrow$ | 2795 | 2605 | 2688 | 2603 | 2735 | 2706 | 3000 | 2686 | 2905 | 2479 |
| $v_{pp}$ | V | $\uparrow$ | 3.0 | 2.7 | 2.9 | 2.8 | 2.8 | 2.9 | 2.8 | 2.8 | 2.3 | 3.1 |
| $v_{pp}/v_{inpp}$ | dB | $\uparrow$ | 133 | 139 | 135 | 135 | 137 | 134 | 135 | 135 | 136 | 135 |
| $cmfb_{offset}$ | mV | $\downarrow$ | 24 | 0.4 | 34 | 1 | 38 | 5 | 21 | 0.3 | 25 | 30 |
| $i_p$ | mA | $\downarrow$ | 1.4 | 1.2 | 1.3 | 1.4 | 1.4 | 1.3 | 1.3 | 1.4 | 1.4 | 1.1 |
| $f_{0\mathrm{dB}}$ | MHz | $\uparrow$ | 242 | 260 | 269 | 263 | 250 | 261 | 268 | 273 | 305 | 171 |
| $pm$ | $^\circ$ | $\uparrow$ | 74 | 73 | 73 | 75 | 76 | 70 | 65 | 73 | 66 | 79 |
| $am$ | dB | $\downarrow$ | -25 | -25 | -26 | -25 | -28 | -25 | -20 | -25 | -24 | -28 |
| transistor | | | $mw/l$ ratio | | | | | | | | | |
| main differential pair | | | 290 | 350 | 290 | 290 | 230 | 350 | 530 | 290 | 410 | 230 |
| auxiliary $p$ differential pair | | | 28 | 22 | 22 | 16 | 16 | 16 | 28 | 28 | 22 | 28 |
| auxiliary $n$ differential pair | | | 14 | 20 | 8 | 8 | 11 | 14 | 11 | 20 | 11 | 11 |

Notations: $A \dots$ area, $v_{pp} \dots$ peak-to-peak voltage, $v_{pp}/v_{inpp} \dots$ dc gain, $cmfb_{offset} \dots$ common mode feedback offset, $i_p \dots$ current consumption, $f_{0\mathrm{dB}} \dots$ frequency at 0 dB gain, $pm \dots$ phase margin, $am \dots$ amplitude margin, $m \dots$ transistor multiplier, $w \dots$ channel width and $l \dots$ channel length. Symbols $\uparrow$ and $\downarrow$ indicate that the desired value is as high or as low as possible, respectively.
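
The subdivision procedure of Figure 2 can be illustrated with a short Python sketch. It is not the authors' implementation: the function name `pick_initial_point`, the unit default weights, the random choice among equally light subspaces, and the stopping test (the loop ends once the selected subspace contains no known point, i.e. its weight sum is zero) are all assumptions made here so that the example is complete and terminates. Weights are assumed to be positive, and since all $2^n$ subspaces are enumerated, the sketch is only practical for a small number of parameters $n$.

```python
import itertools
import random


def pick_initial_point(known_points, lower, upper, weights=None, rng=None):
    """Return a random point from a region of the box [lower, upper]
    that contains no known point (hypothetical helper, see assumptions above)."""
    rng = rng or random.Random()
    n = len(lower)
    if weights is None:
        weights = [1.0] * len(known_points)  # assumption: every known point weighs 1
    lo, hi = list(lower), list(upper)        # temporary space := constrained space
    while True:
        mid = [(a + b) / 2.0 for a, b in zip(lo, hi)]
        # One accumulator per subspace; a subspace is identified by n flags
        # (0 = lower half, 1 = upper half of the corresponding dimension).
        sums = {cell: 0.0 for cell in itertools.product((0, 1), repeat=n)}
        for point, w in zip(known_points, weights):
            inside = all(a <= x <= b for x, a, b in zip(point, lo, hi))
            if inside:
                cell = tuple(int(x >= m) for x, m in zip(point, mid))
                sums[cell] += w
        lowest = min(sums.values())
        # Assumption: ties between equally light subspaces are broken at random.
        cell = rng.choice([c for c, s in sums.items() if s == lowest])
        lo = [m if bit else a for a, m, bit in zip(lo, mid, cell)]
        hi = [b if bit else m for b, m, bit in zip(hi, mid, cell)]
        if lowest == 0.0:  # assumed stopping test: selected subspace is empty
            break
    return [rng.uniform(a, b) for a, b in zip(lo, hi)]


if __name__ == "__main__":
    known = [(0.1, 0.1), (0.9, 0.9), (0.1, 0.9)]
    print(pick_initial_point(known, [0.0, 0.0], [1.0, 1.0], rng=random.Random(0)))
```

In the two-dimensional usage example three of the four quadrants of the unit square already hold a known point, so the new initial point is drawn from the remaining quadrant. When every subspace holds known points, the sketch descends into the lightest one and subdivides it again.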