
Showing papers in "IEEE Transactions on Computers in 1991"


Journal ArticleDOI
TL;DR: A general neural-network (connectionist) model for fuzzy logic control and decision systems is proposed, in the form of a feedforward multilayer net, which avoids the rule-matching time of the inference engine in the traditional fuzzy logic system.
Abstract: A general neural-network (connectionist) model for fuzzy logic control and decision systems is proposed. This connectionist model, in the form of a feedforward multilayer net, combines the ideas of the fuzzy logic controller with the structure and learning abilities of neural networks into an integrated neural-network-based fuzzy logic control and decision system. A fuzzy logic control decision network is constructed automatically by learning from training examples. By combining both unsupervised (self-organized) and supervised learning schemes, learning converges much faster than with the original backpropagation algorithm. The connectionist structure avoids the rule-matching time of the inference engine in the traditional fuzzy logic system. Two examples are presented to illustrate the performance and applicability of the proposed model.
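
A minimal sketch of the general idea, assuming a generic neuro-fuzzy forward pass (Gaussian membership layer, product rule layer, weighted-average defuzzification); the layer design, centers, widths, and consequent values here are illustrative and not taken from the paper, which additionally learns these parameters.

```python
import numpy as np

# Hypothetical three-rule controller with two inputs: layer 1 computes
# Gaussian membership degrees, layer 2 fires rules (AND = product),
# layer 3 defuzzifies with a normalized weighted sum.
centers = np.array([[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]])  # one row per rule
widths  = np.array([[0.5,  0.5],  [0.5, 0.5],  [0.5, 0.5]])
consequents = np.array([-1.0, 0.0, 1.0])                    # rule outputs

def fuzzy_net(x):
    mu = np.exp(-((x - centers) / widths) ** 2)        # membership degrees
    firing = mu.prod(axis=1)                            # rule firing strengths
    return float(firing @ consequents / firing.sum())   # weighted-average defuzzification

print(fuzzy_net(np.array([0.3, 0.2])))
```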

1,476 citations


Journal ArticleDOI
TL;DR: It is shown that the same technique used to prove that any VLSI implementation of a single-output Boolean function has area-time complexity AT^2 = Ω(n^2) also proves that any OBDD representation of the function has Ω(c^n) vertices for some c > 1, but that the converse is not true.
Abstract: Lower-bound results on Boolean-function complexity under two different models are discussed. The first is an abstraction of tradeoffs between chip area and speed in very-large-scale-integrated (VLSI) circuits. The second is the ordered binary decision diagram (OBDD) representation used as a data structure for symbolically representing and manipulating Boolean functions. The lower bounds demonstrate the fundamental limitations of VLSI as an implementation medium and of the OBDD as a data structure. It is shown that the same technique used to prove that any VLSI implementation of a single-output Boolean function has area-time complexity AT^2 = Ω(n^2) also proves that any OBDD representation of the function has Ω(c^n) vertices for some c > 1, but that the converse is not true. An integer multiplier for word size n with outputs numbered 0 (least significant) through 2n-1 (most significant) is described. For the Boolean function representing either output i-1 or output 2n-i-1, where 1 ≤ i ≤ n, lower bounds on the OBDD size are proved.
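
To make the OBDD bound concrete, the hedged sketch below brute-forces the number of distinct subfunctions of one multiplier output bit for a fixed variable order; this count lower-bounds the OBDD size for that order (up to the two constant functions). The variable order and output-bit choice are illustrative, not the paper's construction, which covers every order.

```python
from itertools import product

def mult_bit(n, i, bits):
    # bits = (a_0..a_{n-1}, b_0..b_{n-1}), least significant bit first
    a = sum(bit << k for k, bit in enumerate(bits[:n]))
    b = sum(bit << k for k, bit in enumerate(bits[n:]))
    return (a * b >> i) & 1

def max_distinct_subfunctions(n, i):
    # Max over prefix lengths t of the number of distinct residual functions
    # after fixing the first t variables: a lower bound on OBDD width.
    m, best = 2 * n, 0
    for t in range(m + 1):
        subs = {tuple(mult_bit(n, i, prefix + rest)
                      for rest in product((0, 1), repeat=m - t))
                for prefix in product((0, 1), repeat=t)}
        best = max(best, len(subs))
    return best

for n in (2, 3, 4):
    print(n, max_distinct_subfunctions(n, n - 1))   # middle output bit
```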

566 citations


Journal ArticleDOI
TL;DR: The concept of virtual channels is extended to multiple virtual communication systems that provide adaptability and fault tolerance in addition to being deadlock-free, and virtual interconnection networks allowing adaptive, deadlock-free routing are examined.
Abstract: The concept of virtual channels is extended to multiple virtual communication systems that provide adaptability and fault tolerance in addition to being deadlock-free. A channel dependency graph is taken as the definition of what connections are possible, and any routing function must use only those connections defined by it. Virtual interconnection networks allowing adaptive, deadlock-free routing are examined for three k-ary n-cube topologies: unidirectional, torus-connected bidirectional, and mesh-connected bidirectional.

477 citations


Journal ArticleDOI
Abstract: Rate-optimal compile-time multiprocessor scheduling of iterative dataflow programs suitable for real-time signal processing applications is discussed. It is shown that recursions or loops in the programs lead to an inherent lower bound on the achievable iteration period, referred to as the iteration bound. A multiprocessor schedule is rate-optimal if the iteration period equals the iteration bound. Systematic unfolding of iterative dataflow programs is proposed, and properties of unfolded dataflow programs are studied. Unfolding increases the number of tasks in a program, unravels the hidden concurrency in iterative dataflow programs, and can reduce the iteration period. A special class of iterative dataflow programs, referred to as perfect-rate programs, is introduced. Each loop in these programs has a single register. Perfect-rate programs can always be scheduled rate-optimally (requiring no retiming or unfolding transformation). It is also shown that unfolding any program by an optimum unfolding factor transforms any arbitrary program into an equivalent perfect-rate program, which can then be scheduled rate-optimally. This optimum unfolding factor for any arbitrary program is the least common multiple of the numbers of registers (or delays) in all loops and is independent of the node execution times. An upper bound on the number of processors for rate-optimal scheduling is given.
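
The iteration bound and the optimum unfolding factor described above reduce to two one-line computations once each loop is summarized by its total node execution time and its register (delay) count; the loop data below are made up for illustration.

```python
from math import gcd
from functools import reduce

# Illustrative loops: (total computation time of the loop, registers on the loop)
loops = [(8, 2), (9, 3), (5, 1)]

iteration_bound = max(t / r for t, r in loops)            # max time/registers over loops
unfolding_factor = reduce(lambda a, b: a * b // gcd(a, b),  # LCM of register counts
                          (r for _, r in loops))

print(iteration_bound)    # 5.0 here: the (5, 1) loop is critical
print(unfolding_factor)   # lcm(2, 3, 1) = 6
```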

390 citations


Journal ArticleDOI
TL;DR: A new interconnection structure is proposed as a basis for distributed-memory parallel computer architectures that is a variation of the hypercube and preserves many of its desirable properties, including regularity and large vertex connectivity.
Abstract: A new interconnection structure is proposed as a basis for distributed-memory parallel computer architectures. The network is a variation of the hypercube and preserves many of its desirable properties, including regularity and large vertex connectivity. It has the same node and link complexity, but has a diameter only about half of the hypercube's. Some of the basic properties of this topology are discussed. Efficient routing and broadcasting algorithms are presented.

294 citations


Journal ArticleDOI
TL;DR: The authors study run-time methods to automatically parallelize and schedule iterations of a do loop in certain cases where compile-time information is inadequate and present performance results from experiments conducted on the Encore Multimax, illustrating that run- time reordering of loop indexes can have a significant impact on performance.
Abstract: The authors study run-time methods to automatically parallelize and schedule iterations of a do loop in certain cases where compile-time information is inadequate. The methods presented involve execution-time preprocessing of the loop. At compile time, these methods set up the framework for performing a loop dependence analysis. At run time, wavefronts of concurrently executable loop iterations are identified. Using this wavefront information, loop iterations are reordered for increased parallelism. The authors utilize symbolic transformation rules to produce inspector procedures, which perform execution-time preprocessing, and executors, which are transformed versions of the source code loop structures that carry out the calculations planned in the inspector procedures. The authors present performance results from experiments conducted on the Encore Multimax. These results illustrate that run-time reordering of loop indexes can have a significant impact on performance.
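
A toy version of the inspector/executor idea, assuming cross-iteration dependences flow through a single array accessed via run-time index vectors; the dependence handling below is a simplification for illustration, not the authors' implementation.

```python
from collections import defaultdict

def inspector(reads, writes):
    """Assign a wavefront number to each iteration i of a loop that reads
    A[reads[i]] and writes A[writes[i]], where the index arrays are only
    known at run time. Iterations with equal wavefront numbers are independent."""
    last_writer_wave = defaultdict(int)   # wavefront of the last write to an element
    last_touch_wave = defaultdict(int)    # wavefront of the last read or write
    wave = []
    for r, w in zip(reads, writes):
        # flow dependence on the last writer of r; anti/output dependence on w
        wf = 1 + max(last_writer_wave[r], last_touch_wave[w])
        wave.append(wf)
        last_writer_wave[w] = max(last_writer_wave[w], wf)
        last_touch_wave[w] = max(last_touch_wave[w], wf)
        last_touch_wave[r] = max(last_touch_wave[r], wf)
    return wave

def executor(A, reads, writes, wave):
    # Execute wavefronts in order; within a wavefront, iterations could run in parallel.
    for wf in range(1, max(wave) + 1):
        for i in (j for j, w in enumerate(wave) if w == wf):
            A[writes[i]] = A[reads[i]] + 1   # stand-in loop body

reads, writes = [0, 1, 0, 2], [1, 2, 3, 0]
wave = inspector(reads, writes)
A = [0, 0, 0, 0]
executor(A, reads, writes, wave)
print(wave, A)    # wavefronts [1, 2, 1, 3]
```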

256 citations


Journal ArticleDOI
Akhil Kumar
TL;DR: It is shown that, given a collection of n copies of an object, the method allows a quorum to be formed with n^0.63 copies versus (n+1)/2 copies in the case of the majority voting algorithm.
Abstract: A novel algorithm for managing replicated data is presented. The proposed method is based on organizing the copies of an object into a logical, multilevel hierarchy, and extending the quorum consensus algorithm to such an environment. Several properties of the method are derived and optimality conditions are given for minimizing the quorum size. It is shown that, given a collection of n copies of an object, the method allows a quorum to be formed with n^0.63 copies versus (n+1)/2 copies in the case of the majority voting algorithm. Tradeoffs between the proposed method and three other quorum-based methods are discussed, and the main features of each method are highlighted.
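
The n^0.63 figure corresponds to taking recursive majorities in a ternary hierarchy (0.63 ≈ log_3 2). The sketch below is a common way to realize this idea, not necessarily the paper's exact construction, and compares the resulting quorum size with simple majority voting.

```python
import math

def hierarchical_quorum(m):
    """Quorum size for n = 3**m copies arranged in a ternary tree, taking a
    majority (2 of 3) of subtrees recursively at every level."""
    return 2 ** m          # choose 2 of 3 subtrees at each of the m levels

for m in range(1, 7):
    n = 3 ** m
    print(n, hierarchical_quorum(m), (n + 1) // 2, round(n ** math.log(2, 3)))
    # columns: copies, hierarchical quorum, majority quorum, n**0.63 (matches column 2)
```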

231 citations


Journal ArticleDOI
TL;DR: In the proposed methods, one does not need to calculate the scale factor during the computation, and can make a more efficient sine and cosine generator than that based on the previous redundant CORDIC.
Abstract: Proposes two redundant CORDIC (coordinate rotation digital computer) methods with a constant scale factor for sine and cosine computation, called the double rotation method and the correcting rotation method. In both methods, the CORDIC is accelerated by the use of a redundant binary number representation, as in the previously proposed redundant CORDIC. In the proposed methods, since the number of rotation-extensions performed for each angle is a constant, the scale factor is a constant independent of the operand. Hence, one does not need to calculate the scale factor during the computation, and can make a more efficient sine and cosine generator than that based on the previous redundant CORDIC.
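
For context, a plain (non-redundant) CORDIC in rotation mode already has a constant scale factor K once the iteration count is fixed, so K can be folded into the initial vector; the sketch below illustrates only that point and does not implement the paper's redundant-arithmetic variants.

```python
import math

def cordic_sin_cos(theta, iterations=32):
    """Classic CORDIC rotation mode: because every iteration performs exactly
    one micro-rotation, the scale factor K is a constant that can be applied
    up front instead of being computed during the iteration."""
    K = 1.0
    for i in range(iterations):
        K *= 1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = K, 0.0, theta                 # pre-scale by the constant K
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        x, y, z = (x - d * y * 2.0 ** -i,
                   y + d * x * 2.0 ** -i,
                   z - d * math.atan(2.0 ** -i))
    return y, x                             # (sin(theta), cos(theta))

print(cordic_sin_cos(0.5), (math.sin(0.5), math.cos(0.5)))
```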

221 citations


Journal ArticleDOI
TL;DR: It is shown that this protocol leads to freedom from mutual deadlock and can be used by schedulability analysis to guarantee that a set of periodic transactions using this protocol can always meet its deadlines.
Abstract: The authors examine a priority-driven two-phase lock protocol called the read/write priority ceiling protocol. It is shown that this protocol leads to freedom from mutual deadlock. In addition, a high-priority transaction can be blocked by lower priority transactions for at most the duration of a single embedded transaction. These properties can be used by schedulability analysis to guarantee that a set of periodic transactions using this protocol can always meet its deadlines. Finally, the performance of this protocol is examined for randomly arriving transactions using simulation studies.
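
One way the single-blocking-term property feeds into schedulability analysis is the classical sufficient rate-monotonic test with a blocking term B_i; the transaction parameters below are illustrative, and the test shown is the generic one rather than anything specific to this paper.

```python
# Each entry: (computation time C, period T, worst-case blocking B),
# listed in rate-monotonic (shortest period first) order. B_i would be the
# duration of the longest lower-priority embedded transaction that can block i.
tasks = [
    (1.0, 10.0, 2.0),
    (2.0, 20.0, 2.0),
    (4.0, 50.0, 0.0),
]

def rm_schedulable_with_blocking(tasks):
    for i in range(len(tasks)):
        utilization = sum(C / T for C, T, _ in tasks[:i + 1])
        Bi, Ti = tasks[i][2], tasks[i][1]
        bound = (i + 1) * (2 ** (1.0 / (i + 1)) - 1)   # Liu-Layland bound
        if utilization + Bi / Ti > bound:
            return False
    return True

print(rm_schedulable_with_blocking(tasks))   # True for this example set
```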

217 citations


Journal ArticleDOI
TL;DR: Express cubes are k-ary n-cube interconnection networks augmented by express channels that provide a short path for nonlocal messages, reducing the network diameter and thus the distance component of network latency.
Abstract: The author discusses express cubes, k-ary n-cube interconnection networks augmented by express channels that provide a short path for nonlocal messages. An express cube combines the logarithmic diameter of a multistage network with the wire-efficiency and ability to exploit locality of a low-dimensional mesh network. The insertion of express channels reduces the network diameter and thus the distance component of network latency. Wire length is increased, allowing networks to operate with latencies that approach the physical speed-of-light limitation rather than being limited by node delays. Express channels increase wire bisection in a manner that allows the bisection to be controlled independently of the choice of radix, dimension, and channel width. By increasing wire bisection to saturate the available wiring media, throughput can be substantially increased. With an express cube both latency and throughput are wire-limited and within a small factor of the physical limit on performance.

213 citations


Journal ArticleDOI
TL;DR: The authors show that a stabilizing protocol is nonterminating, has an infinite number of safe states, and has timeout actions and discuss how to redesign a number of well-known protocols to make them stabilizing.
Abstract: A communication protocol is stabilizing if and only if starting from any unsafe state (i.e. one that violates the intended invariant of the protocol), the protocol is guaranteed to converge to a safe state within a finite number of state transitions. Stabilization allows the processes in a protocol to reestablish coordination between one another whenever coordination is lost due to some failure. The authors identify some important characteristics of stabilizing protocols; they show in particular that a stabilizing protocol is nonterminating, has an infinite number of safe states, and has timeout actions. They also propose a formal method for proving protocol stabilization: in order to prove that a given protocol is stabilizing, it is sufficient (and necessary) to exhibit and verify what is called a 'convergence stair' for the protocol. Finally, they discuss how to redesign a number of well-known protocols to make them stabilizing; these include the sliding-window protocol and the two-way handshake.
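
Dijkstra's K-state token ring is a standard small example of the characteristics listed above (it is nonterminating and converges from any state); it is shown here only to illustrate stabilization and is not one of the protocols redesigned in the paper.

```python
import random

# Dijkstra's K-state token ring: from ANY initial state the ring converges to
# states with exactly one enabled process ("token"), provided K >= n.
def enabled(S, i, K):
    n = len(S)
    return (S[0] == S[n - 1]) if i == 0 else (S[i] != S[i - 1])

def step(S, K):
    i = random.choice([j for j in range(len(S)) if enabled(S, j, K)])
    S[i] = (S[i] + 1) % K if i == 0 else S[i - 1]

n, K = 5, 6
S = [random.randrange(K) for _ in range(n)]   # arbitrary, possibly unsafe state
for _ in range(200):
    step(S, K)
print(sum(enabled(S, i, K) for i in range(n)))   # 1: a single token remains
```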

Journal ArticleDOI
TL;DR: These results represent the best known explicit constructions with limited numbers of stages relative to both crosspoint and control algorithm complexity and are highly useful for practical applications involving the movement of and collaboration with voice/video/text/graphics information that require broadcast capability.
Abstract: Results are presented for nonblocking multistage broadcast networks wherein a request from an idle input port to be connected to some set of idle output ports can be satisfied without any disturbance of other broadcast connections already existing in the network. Furthermore, a linear network control algorithm for realizing such a broadcast connection request is given. These results represent the best known explicit constructions with limited numbers of stages relative to both crosspoint and control algorithm complexity. Thus, these networks are highly useful for practical applications involving the movement of and collaboration with voice/video/text/graphics information that require broadcast capability. These networks are also useful for the interconnection of processor and memory units in parallel processing systems.

Journal ArticleDOI
TL;DR: The methods proposed to expand the range of convergence for the CORDIC algorithm do not necessitate any unwieldy overhead calculation, thus making this work amenable to a hardware implementation.
Abstract: The limitations on the numerical values of the functional arguments that are passed to the CORDIC computational units are discussed, with a special emphasis on the binary, fixed-point hardware implementation. Research in the area of expanding the allowed ranges of the input variables for which accurate output values can be obtained is presented. The methods proposed to expand the range of convergence for the CORDIC algorithm do not necessitate any unwieldy overhead calculation, thus making this work amenable to a hardware implementation. The number of extra iterations introduced in the modified CORDIC algorithms is significantly less than the number of extra iterations discussed elsewhere. This reduction in the number of extra iterations will lead to a faster hardware implementation. Examples demonstrate the usefulness of the methods in realistic situations.

Journal ArticleDOI
TL;DR: A hypercube with extra connections added between pairs of nodes through otherwise unused links is investigated and achieves noticeable improvement in diameter, mean internode distance, and traffic density.
Abstract: A hypercube with extra connections added between pairs of nodes through otherwise unused links is investigated. The extra connections are made in a way that maximizes the improvement of the performance measure of interest under various traffic distributions. The resulting hypercube, called the enhanced hypercube, requires a simple routing algorithm and is guaranteed not to create any traffic-congested points or links. The enhanced hypercube achieves noticeable improvement in diameter, mean internode distance, and traffic density, and it also is more cost effective than a regular hypercube. An efficient broadcast algorithm that can considerably speed up the broadcast process in enhanced hypercubes is provided.

Journal ArticleDOI
TL;DR: The design of feedback (or recurrent) neural networks to produce good solutions to complex optimization problems is discussed, and a design rule that serves as a primitive for constructing a wide class of constraints is introduced.
Abstract: The design of feedback (or recurrent) neural networks to produce good solutions to complex optimization problems is discussed. The theoretical basis for applying neural networks to optimization problems is reviewed, and a design rule that serves as a primitive for constructing a wide class of constraints is introduced. The use of the design rule is illustrated by developing a neural network for producing high-quality solutions to a probabilistic resource allocation task. The resulting neural network has been simulated on a high-performance parallel processor that has been optimized for neural network simulation.
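
One well-known primitive of this kind is the "k-out-of-n" rule (mutual inhibition of -2 between units and a bias of 2k-1), whose energy function is minimized exactly when k units are active. Whether this is the paper's design rule is not claimed here; the sketch simply illustrates building a constraint into a recurrent net.

```python
import random

def k_out_of_n(n, k, sweeps=50):
    """Hopfield-style 0/1 units with weights w_ij = -2 (i != j) and bias 2k-1;
    asynchronous updates drive the state to exactly k active units."""
    s = [random.randint(0, 1) for _ in range(n)]
    for _ in range(sweeps):
        for i in random.sample(range(n), n):            # asynchronous update order
            net = -2 * (sum(s) - s[i]) + (2 * k - 1)    # input from other units + bias
            s[i] = 1 if net > 0 else 0
    return s

s = k_out_of_n(10, 3)
print(s, sum(s))    # exactly 3 units end up active
```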

Journal ArticleDOI
TL;DR: It is shown that by exchanging any two independent edges in any shortest cycle of the n-cube, its diameter decreases by one unit, which leads to the definition of a new class of n-regular graphs, denoted TQ_n, with 2^n vertices and diameter n-1, which has the (n-1)-cube as a subgraph.
Abstract: It is shown that by exchanging any two independent edges in any shortest cycle of the n-cube (n ≥ 3), its diameter decreases by one unit. This leads to the definition of a new class of n-regular graphs, denoted TQ_n, with 2^n vertices and diameter n-1, which has the (n-1)-cube as a subgraph. Other properties of TQ_n, such as connectivity and the lengths of the disjoint paths, are also investigated. Moreover, it is shown that the complete binary tree on 2^n - 1 vertices, which is not a subgraph of the n-cube, is a subgraph of TQ_n. How these results can be used to enhance hypercube multiprocessors is discussed.
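
The edge-exchange construction is easy to check computationally: build Q_n, swap two independent edges of one 4-cycle, and compare diameters by BFS. The particular 4-cycle chosen below is an arbitrary illustrative choice.

```python
from itertools import product
from collections import deque

def diameter(adj):
    def ecc(s):
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        return max(dist.values())
    return max(ecc(v) for v in adj)

def hypercube(n):
    adj = {v: set() for v in product((0, 1), repeat=n)}
    for v in adj:
        for i in range(n):
            w = v[:i] + (1 - v[i],) + v[i + 1:]
            adj[v].add(w)
            adj[w].add(v)
    return adj

def twist(adj, n):
    # In the 4-cycle spanned by the two lowest coordinates (others fixed at 0),
    # remove the independent edges 00-01 and 10-11 and add 00-11 and 10-01.
    rest = (0,) * (n - 2)
    a, b, c, d = [(i, j) + rest for i, j in ((0, 0), (0, 1), (1, 0), (1, 1))]
    for u, v in ((a, b), (c, d)):
        adj[u].discard(v); adj[v].discard(u)
    for u, v in ((a, d), (c, b)):
        adj[u].add(v); adj[v].add(u)
    return adj

for n in (3, 4, 5):
    print(n, diameter(hypercube(n)), diameter(twist(hypercube(n), n)))  # n vs n-1
```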

Journal ArticleDOI
TL;DR: The authors address the problem of identifying optimal linear schedules for uniform dependence algorithms so that their execution time is minimized by proposing procedures based on the mathematical solution of a nonlinear optimization problem.
Abstract: The authors address the problem of identifying optimal linear schedules for uniform dependence algorithms so that their execution time is minimized. Procedures are proposed to solve this problem based on the mathematical solution of a nonlinear optimization problem. The complexity of these procedures is independent of the size of the algorithm; it is exponential in the dimension of the algorithm's index set, which for all practical purposes is very small because the index sets of algorithms of practical interest have limited dimension. A particular class of algorithms for which the proposed solution is greatly simplified is considered, and the corresponding simpler optimization procedure is provided.

Journal ArticleDOI
TL;DR: A general framework for shift register-based signature analysis is presented, and a mathematical model for this framework-based on coding theory-is developed, which allows for uniform treatment of LFSR, MISR, and multiple MISR- based signature analyzer.
Abstract: A general framework for shift-register-based signature analysis is presented, and a mathematical model for this framework, based on coding theory, is developed. The formulation has two key features: first, it allows for uniform treatment of LFSR-, MISR-, and multiple-MISR-based signature analyzers; second, it leads to a new compression scheme for multiple-output CUTs. This scheme, referred to as the multiinput LFSR, has the potential to achieve less aliasing than other schemes, such as the multiple-MISR scheme, of comparable hardware complexity. Several results on aliasing are presented, and certain known results are shown to be direct consequences of the formulation. Also developed are error models that take into account the circuit topology and the effect of faults at the outputs. Using these models, exact closed-form expressions for the aliasing probability are developed. A closed-form aliasing expression for the MISR under an independent error model is provided.
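
By linearity over GF(2), MISR aliasing occurs exactly when the error sequence alone drives the register to the zero signature, which is what the Monte Carlo sketch below estimates; the feedback taps and sequence length are illustrative choices, and the paper's exact closed forms are not reproduced here.

```python
import random

K = 8                      # signature register width
TAPS = [0, 2, 3, 4]        # illustrative feedback taps

def misr_signature(error_vectors):
    # Linear MISR over GF(2): shift with feedback, XOR in one input vector per cycle.
    state = [0] * K
    for e in error_vectors:
        fb = 0
        for t in TAPS:
            fb ^= state[t]
        state = [fb ^ e[0]] + [state[i - 1] ^ e[i] for i in range(1, K)]
    return state

trials, aliased, length = 20000, 0, 64
for _ in range(trials):
    errs = [[random.randint(0, 1) for _ in range(K)] for _ in range(length)]
    if any(any(e) for e in errs) and misr_signature(errs) == [0] * K:
        aliased += 1
print(aliased / trials, 2 ** -K)   # both close to 1/256 for random errors
```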

Journal ArticleDOI
TL;DR: It is shown that a significant reduction in the size is possible for symmetric functions and some arithmetic functions, at the expense of a small constant increase in depth, and several neural networks which have the minimum size among all the known constructions have been developed.
Abstract: The tradeoffs between the depth (i.e., the time for parallel computation) and the size (i.e., the number of threshold gates) in neural networks are studied. The authors focus the study on the neural computation of symmetric Boolean functions and some arithmetic functions. It is shown that a significant reduction in the size is possible for symmetric functions and some arithmetic functions, at the expense of a small constant increase in depth. In the process, several neural networks which have the minimum size among all known constructions have been developed. Results on implementing symmetric functions can be used to improve results about arbitrary Boolean functions. In particular, it is shown that any Boolean function can be computed in a depth-3 neural network with O(2^(n/2)) threshold gates; it is also proven that at least Ω(2^(n/3)) threshold gates are required.
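
A standard depth-size data point for symmetric functions is parity in depth 2 with n+1 threshold gates; the construction below is the textbook one, shown only to make the threshold-gate model concrete, not a result specific to this paper.

```python
from itertools import product

def threshold(weights, inputs, t):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= t else 0

def parity_depth2(x):
    """Depth-2 threshold circuit: layer 1 computes T_k = [at least k ones],
    the output gate combines T_1..T_n with alternating weights +1/-1."""
    n = len(x)
    layer1 = [threshold([1] * n, x, k) for k in range(1, n + 1)]
    weights = [(-1) ** (k + 1) for k in range(1, n + 1)]
    return threshold(weights, layer1, 1)

print(all(parity_depth2(list(x)) == sum(x) % 2
          for x in product((0, 1), repeat=6)))   # True
```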

Journal ArticleDOI
Michelle Y. Kim, Asser N. Tantawi
TL;DR: The performance implications of asynchronous disk interleaving are examined and a simple expression for the expected value of a maximum delay of an n-disk system is obtained.
Abstract: The performance implications of asynchronous disk interleaving are examined. In an asynchronous system, adjacent subblocks are placed independently of each other. Since each of the disks in such a system is treated independently while being accessed as a group, the access delay of a request for a data block in an n-disk system is the maximum of n access delays. Using approximate analysis, a simple expression for the expected value of such a maximum delay is obtained. The analysis approximation is verified by simulation using trace data; the relative error is found to be at most 6%.
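
The max-of-n effect is easy to see in a toy model where each disk's rotational latency is uniform on [0, T): the expected maximum is T·n/(n+1) instead of T/2. This ignores seek, transfer, and queueing, so it is only a sanity check of the qualitative behavior, not the paper's analysis.

```python
import random

T, trials = 1.0, 100_000
for n in (1, 2, 4, 8, 16):
    sim = sum(max(random.uniform(0, T) for _ in range(n))
              for _ in range(trials)) / trials
    print(n, round(sim, 3), round(T * n / (n + 1), 3))   # simulation vs n/(n+1)
```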

Journal ArticleDOI
TL;DR: The authors introduce and study a family of interconnection schemes, the Midimew networks, based on circulant graphs of degree 4, which are isomorphic to the optimal distance circulants previously considered and determined to be optimal with respect to two distance parameters simultaneously.
Abstract: The authors introduce and study a family of interconnection schemes, the Midimew networks, based on circulant graphs of degree 4. A family of such circulants is determined and shown to be optimal with respect to two distance parameters simultaneously, namely maximum distance and average distance, among all circulants of degree 4. These graphs are regular, point-symmetric, and maximally connected, and one such optimal graph exists for any given number of nodes. The proposed interconnection schemes consist of mesh-connected networks with wrap-around links, and are isomorphic to the optimal distance circulants previously considered. Ways to construct one such network for any number of nodes are shown, their properties as interconnection schemes for multicomputers are examined, and some interesting particular cases are discussed. The problem of routing is also addressed, and a basic algorithm is provided that implements the required routing policy, conveying messages along shortest paths between nodes.
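
A brute-force companion to the analytic result: build the degree-4 circulant C(N; 1, s) and search for the skip s minimizing maximum and average distance. The paper derives the optimal circulants directly; this sketch only verifies small cases.

```python
from collections import deque

def profile(N, s):
    """Maximum and average distance from node 0 in the circulant C(N; 1, s);
    by vertex symmetry this is the diameter and average distance of the graph."""
    dist = [-1] * N
    dist[0] = 0
    q = deque([0])
    while q:
        u = q.popleft()
        for v in ((u + 1) % N, (u - 1) % N, (u + s) % N, (u - s) % N):
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                q.append(v)
    return max(dist), sum(dist) / (N - 1)

N = 50
best = min((profile(N, s), s) for s in range(2, N // 2))
print(best)   # ((max distance, average distance), best skip s)
```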

Journal ArticleDOI
TL;DR: By extending the results obtained by D. E. Knuth (1986), a parallel unordered coding scheme with 2^r information bits is described, and balanced codes in which each codeword contains equal numbers of zeros and ones are constructed.
Abstract: By extending the results obtained by D. E. Knuth (1986), a parallel unordered coding scheme with 2^r information bits is described. Balanced codes in which each codeword contains equal numbers of zeros and ones, with r check bits and up to 2^(r+1) - (r+2) information bits, are constructed. Unordered codes with r check bits and up to 2^r + 2^(r-1) - 1 information bits are designed. Codes capable of detecting 2^(r-1) + (2^r/2) - 1 unidirectional errors using r check bits are also described. A review of previous work is presented.
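
The scheme builds on Knuth's observation that complementing some prefix of an even-length word always balances it, so only the prefix length has to be carried in a short, balanced check part. A minimal sketch of that balancing step follows (the encoding of the prefix length into check bits is omitted).

```python
def balance(bits):
    """Complement a prefix of an even-length 0/1 word until it has equal
    numbers of zeros and ones; Knuth's argument guarantees such a prefix exists."""
    k = 0
    word = list(bits)
    while sum(word) != len(word) // 2:
        word[k] ^= 1        # complement one more leading bit
        k += 1
    return k, word          # k is what the check part must encode

k, w = balance([1, 1, 1, 0, 1, 1, 0, 1])
print(k, w, sum(w))         # balanced: 4 ones out of 8
```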

Journal ArticleDOI
TL;DR: Eleven methods for the synthesis of communication protocols are described and it is noted that interactive methods allow flexibility in the design process; as a result, communication patterns are not prespecified but may be constructed interactively.
Abstract: Eleven methods for the synthesis of communication protocols are described. Based on particular features of the synthesis process, these methods are classified and compared. In particular, it is noted that interactive methods allow flexibility in the design process; as a result, communication patterns are not prespecified but may be constructed interactively. Methods that only consider the synchronous mode of behavior of communicating entities exclude a wide range of real-life protocols. Methods that make no reference to service requirements do not guarantee the semantic correctness of the synthesized protocol and therefore require the application of a semantic verification procedure. Most methods concentrate on the synthesis of the control part of the protocol entities, which mainly consists of the exchange of synchronization messages. The data part is not adequately treated by any of the synthesis methods. Other than the exchange of synchronization messages, some methods have been extended to deal with unreliable media by synthesizing error-recovery patterns. Some new research directions for enhancing the applicability of the synthesis approach to the design of real-life protocols are obtained.

Journal ArticleDOI
TL;DR: Results indicate that the degree of diagnosability of the n-dimensional hypercube (for short, n-cube), where n ≥ 4, increases from n to 2n-2 as the diagnosis strategy changes from the precise one-step strategy to the pessimistic one-step diagnosis strategy.
Abstract: The capabilities of a system-diagnosis technique based on mutual testing are discussed. The technique is applied to hypercube computer systems. A one-step diagnosis of hypercubes that involves only one testing phase, in which processors test each other, is described. Two kinds of one-step diagnosis are presented: the precise one-step diagnosis and the pessimistic one-step diagnosis. Results indicate that the degree of diagnosability of the n-dimensional hypercube (for short, n-cube), where n ≥ 4, increases from n to 2n-2 as the diagnosis strategy changes from the precise one-step strategy to the pessimistic one-step diagnosis strategy. If the fault bound, the upper bound on the possible number of faulty processors, is kept to the same number n in both cases of diagnosis, then the pessimistic strategy requires fewer testing links per processor than the precise strategy. An algorithm for selecting the bidirectional links in an n-cube for use as testing links is also presented.

Journal ArticleDOI
S.W. Ng
TL;DR: A sensitivity study using a simple analytical queueing model shows that a reduction in rotational latency and RPS miss delay has the greatest impact in reducing the disk's basic service time and in turn produces the greatest improvement in overall subsystem performance.
Abstract: It is shown that in many environments the rotational latency and the rotational position sensing (RPS) miss delay are the major contributors to a disk's basic service time. A sensitivity study using a simple analytical queueing model shows that a reduction in these two components (both of which are related to the rotation of disk drives) has the greatest impact in reducing the disk's basic service time and in turn produces the greatest improvement in overall subsystem performance. While the most straightforward way to reduce latency and RPS miss penalty would be to increase the disk's rotation speed, there are some limitations to such an approach. Several alternatives to reducing latency and RPS miss penalty are proposed and explored, and their performance is analyzed using analytical queueing models.

Journal ArticleDOI
TL;DR: A graph-based solution to the mapping problem using the simulated annealing optimization heuristic and implemented using the hypercube as a host architecture, and results for several image graphs are presented.
Abstract: A graph-based solution to the mapping problem using the simulated annealing optimization heuristic is developed. An automated two-phase mapping strategy is formulated: process annealing assigns parallel processes to processing nodes, and connection annealing schedules traffic connections on network data links so that interprocess communication conflicts are minimized. To evaluate the quality of generated mappings, cost functions suitable for simulated annealing that accurately quantify communication overhead are derived. Communication efficiency is formulated to measure the quality of assignments when the optimal mapping is unknown. The mapping scheme is implemented using the hypercube as a host architecture, and results for several image graphs are presented.
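
A compact sketch of the process-annealing phase under simple assumptions: traffic-weighted hop distance on a hypercube as the cost function and pairwise swaps as moves. The process graph, cost weights, and cooling schedule are illustrative, not those used in the paper.

```python
import random, math

P, dim = 8, 3                                           # 8 processes on a 3-cube
traffic = {(i, (i + 1) % P): 1.0 for i in range(P)}     # a ring of communicating processes

def hops(a, b):                                         # hypercube (Hamming) distance
    return bin(a ^ b).count("1")

def cost(assign):
    return sum(w * hops(assign[i], assign[j]) for (i, j), w in traffic.items())

assign = list(range(P))                                 # process i -> node assign[i]
T = 2.0
while T > 0.01:
    i, j = random.sample(range(P), 2)
    cand = assign[:]
    cand[i], cand[j] = cand[j], cand[i]                 # swap two node assignments
    d = cost(cand) - cost(assign)
    if d < 0 or random.random() < math.exp(-d / T):     # accept downhill, sometimes uphill
        assign = cand
    T *= 0.995
print(assign, cost(assign))                             # ring embedded with few extra hops
```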

Journal ArticleDOI
TL;DR: Improved algorithms for mapping pipelined or parallel computations onto linear array, shared memory, and host-satellite systems are extended and have significantly lower time and space complexities than the more general algorithms.
Abstract: Recent work on the problem of mapping pipelined or parallel computations onto linear array, shared memory, and host-satellite systems is extended. It is shown how these problems can be solved even more efficiently when computation module execution times are bounded from below, intermodule communication times are bounded from above, and the processors satisfy certain homogeneity constraints. The improved algorithms have significantly lower time and space complexities than the more general algorithms: in one case, an O(nm^3) time algorithm for mapping m modules onto n processors is replaced with an O(nm log m) time algorithm, and the space requirements are reduced from O(nm^2) to O(m). Run-time complexity is reduced further with parallel mapping algorithms based on these improvements, which run on the architectures for which they create mappings.
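
For a feel of the problem class (though not the authors' algorithm), here is a simplified contiguous-chain variant that ignores communication costs and minimizes the bottleneck load by binary search plus a greedy feasibility check; module times and processor count are made up.

```python
def min_bottleneck(times, n):
    """Assign a chain of modules contiguously to n processors, minimizing the
    maximum per-processor load (integer execution times assumed)."""
    def fits(cap):
        parts, load = 1, 0
        for t in times:
            if load + t <= cap:
                load += t
            else:
                parts, load = parts + 1, t
        return parts <= n
    lo, hi = max(times), sum(times)
    while lo < hi:
        mid = (lo + hi) // 2
        if fits(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

print(min_bottleneck([4, 7, 2, 5, 9, 3, 6], 3))   # 14, e.g. [4,7,2] | [5,9] | [3,6]
```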

Journal ArticleDOI
TL;DR: Four scheduling strategies for dataflow graphs onto parallel processors are classified: (1) fully dynamic, (2) static-assignment, (3) self-timed, and (4) fully static.
Abstract: Four scheduling strategies for dataflow graphs onto parallel processors are classified: (1) fully dynamic, (2) static-assignment, (3) self-timed, and (4) fully static. Scheduling techniques valid for strategies (2), (3), and (4) are proposed. The focus is on dataflow graphs representing data-dependent iteration. A known probability mass function for the number of cycles in the data-dependent iteration is assumed, and how a compile-time decision about assignment and/or ordering as well as timing can be made is shown. The criterion used is to minimize the expected total idle time caused by the iteration. In certain cases, this will also minimize the expected makespan of the schedule. How to determine the number of processors that should be assigned to the data-dependent iteration is shown. The method is illustrated with a practical programming example.

Journal ArticleDOI
TL;DR: The problem of determining whether a redundant random-access memory (RRAM) containing faulty memory cells can be repaired with spare rows and columns is discussed and a computationally efficient algorithm for detecting unrepairability is presented.
Abstract: The problem of determining whether a redundant random-access memory (RRAM) containing faulty memory cells can be repaired with spare rows and columns is discussed. The approach is to increase the number of working RRAMs manufactured per unit time, rather than per wafer, by presenting a computationally efficient algorithm for detecting unrepairability, a computationally efficient algorithm for optimal repair for special patterns of faulty memory cells, and online algorithms that can find an optimal repair or else detect unrepairability during memory testing, aborting unnecessary testing. Experimental validation of the approach, based on industrial device fabrication data, is given.

Journal ArticleDOI
TL;DR: It is shown that it is NP-complete to decide if a given graph has a normal maximum cut with at least a fraction (1/2 + ε) of its edges, where the positive constant ε can be taken smaller than any value chosen.
Abstract: The maximum cut problem is known to be an important NP-complete problem with many applications. The authors investigate this problem (which they call the normal maximum cut problem) and a variant of it (which is referred to as the connected maximum cut problem). They show that any n-vertex, e-edge graph admits a cut with at least the fraction 1/2 + 1/(2n) of its edges, thus improving the ratio 1/2 + 2/e known before. It is shown that it is NP-complete to decide if a given graph has a normal maximum cut with at least a fraction (1/2 + ε) of its edges, where the positive constant ε can be taken smaller than any value chosen. The authors present an approximation algorithm for the normal maximum cut problem on any graph that runs in O((e log e + n log n)/p + log p · log n) parallel time using p processors.
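
The easy half of such bounds can be seen with plain local search: at a local optimum every vertex has at least half of its incident edges cut, so the cut contains at least e/2 edges. The sketch below implements only this simple 1/2 guarantee on an illustrative graph, not the paper's sharper fraction or its parallel algorithm.

```python
def local_search_cut(n, edges):
    """Flip a vertex to the other side while that increases the cut; terminates
    because the cut grows by at least one edge per flip."""
    side = [0] * n
    def gain(v):
        g = 0
        for a, b in edges:
            if v in (a, b):
                u = b if a == v else a
                g += 1 if side[u] == side[v] else -1   # uncut minus cut incident edges
        return g
    improved = True
    while improved:
        improved = False
        for v in range(n):
            if gain(v) > 0:
                side[v] ^= 1
                improved = True
    return side, sum(1 for a, b in edges if side[a] != side[b])

edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]   # K4 plus a pendant edge
side, cut = local_search_cut(5, edges)
print(side, cut, len(edges) / 2)   # cut size is at least half the edges
```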