
Showing papers in "Complex Systems in 1988"


Journal Article
TL;DR: The relationship between 'learning' in adaptive layered networks and the fitting of data with high-dimensional surfaces is discussed, leading naturally to a picture of 'generalization' in terms of interpolation between known data points and suggesting a rational approach to the theory of such networks.
Abstract: The relationship between 'learning' in adaptive layered networks and the fitting of data with high-dimensional surfaces is discussed. This leads naturally to a picture of 'generalization' in terms of interpolation between known data points and suggests a rational approach to the theory of such networks. A class of adaptive networks is identified which makes the interpolation scheme explicit. This class has the property that learning is equivalent to the solution of a set of linear equations. These networks thus represent nonlinear relationships while having a guaranteed learning rule.

3,538 citations
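The central claim, that learning in such interpolation networks reduces to solving a set of linear equations, can be sketched in a few lines. This is our own illustrative reconstruction, not the paper's code: we assume Gaussian radial basis functions centered on the training points, so "learning" becomes a single linear solve for the output weights.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(20, 2))        # training inputs
y = np.sin(X[:, 0]) + X[:, 1] ** 2          # samples of the target surface

def design_matrix(points, centers, width=0.5):
    # Gaussian radial basis functions, one centered on each training point
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2))

Phi = design_matrix(X, X)                   # square interpolation matrix
w = np.linalg.solve(Phi, y)                 # "learning" is one linear solve

print(np.allclose(design_matrix(X, X) @ w, y))  # exact at the data points
```

Generalization then corresponds to evaluating the fitted surface between the known data points, e.g. `design_matrix(X_new, X) @ w`.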


Journal Article
TL;DR: This work investigates the merits of using this error function over the traditional quadratic function for gradient descent learning on the contiguity problem, and explains the improvement in terms of the characteristic steepness of the landscape defined by the error function in configuration space.
Abstract: Learning in layered neural networks is posed as the minimization of an error function defined over the training set. A probabilistic interpretation of the target activities suggests the use of relative entropy as an error measure. We investigate the merits of using this error function over the traditional quadratic function for gradient descent learning. Comparative numerical simulations for the contiguity problem show marked reductions in learning times. This improvement is explained in terms of the characteristic steepness of the landscape defined by the error function in configuration space.

234 citations
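The gradient argument behind the speedup can be made concrete. The sketch below is our own illustration, assuming a single sigmoid output unit: the quadratic error gradient with respect to the net input carries a factor y(1 - y) that vanishes when the unit saturates, while the relative-entropy (cross-entropy) gradient does not, which steepens the error landscape exactly where the quadratic one is flat.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# a saturated output unit: target is 1 but the net input is strongly negative
net, target = -6.0, 1.0
y = sigmoid(net)

# quadratic error E = (y - t)^2 / 2  gives  dE/dnet = (y - t) * y * (1 - y)
grad_quadratic = (y - target) * y * (1 - y)

# relative entropy E = -[t log y + (1 - t) log(1 - y)]  gives  dE/dnet = y - t
grad_entropy = y - target

print(abs(grad_entropy) / abs(grad_quadratic))
```

For this unit the relative-entropy gradient is larger by several hundred times, so gradient descent escapes the plateau far faster.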


Journal Article
TL;DR: It is shown that it is undecidable to which class a given cellular automaton belongs, even when choosing only between the two simplest classes.
Abstract: Stephen Wolfram introduced the use of cellular automata as models of complex systems and proposed a classification of these automata based on their statistically observed behavior. We investigate various properties of these classes; in particular, we ask whether certain properties are effective, and we obtain several somewhat surprising results. For example, we show that it is undecidable whether all the finite configurations of a given cellular automaton eventually become quiescent. Consequently, it is undecidable to which class a given cellular automaton belongs, even when choosing only between the two simplest classes.

152 citations



Journal Article

114 citations


Journal Article

113 citations




Journal Article

81 citations


Journal Article
TL;DR: In this article, a learning algorithm for multilayer neural networks composed of binary linear threshold elements is proposed, which treats the internal representations as the fundamental entities to be determined and finds the weights by the local and biologically plausible Perceptron Learning Rule (PLR).
Abstract: We introduce a learning algorithm for multilayer neural networks composed of binary linear threshold elements. Whereas existing algorithms reduce the learning process to minimizing a cost function over the weights, our method treats the internal representations as the fundamental entities to be determined. Once a correct set of internal representations is arrived at, the weights are found by the local and biologically plausible Perceptron Learning Rule (PLR). We tested our learning algorithm on four problems: adjacency, symmetry, parity and combined symmetry-parity.

66 citations
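The second stage of the proposed algorithm, finding the weights once target activities for each unit are fixed, is the classical Perceptron Learning Rule. A minimal sketch for one binary linear threshold unit follows; AND is used here as a stand-in linearly separable task (the paper's test problems are adjacency, symmetry, parity, and combined symmetry-parity).

```python
import numpy as np

def perceptron_learn(X, targets, lr=1.0, epochs=100):
    # Perceptron Learning Rule for one binary linear threshold unit
    X = np.hstack([X, np.ones((len(X), 1))])   # bias folded in as an extra input
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        errors = 0
        for x, t in zip(X, targets):
            y = 1 if x @ w > 0 else 0
            if y != t:                         # local update, only on mistakes
                w += lr * (t - y) * x
                errors += 1
        if errors == 0:                        # converged: all patterns correct
            return w
    return w

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 0, 0, 1])                     # AND is linearly separable
w = perceptron_learn(X, t)
print([(1 if np.r_[x, 1] @ w > 0 else 0) for x in X])  # -> [0, 0, 0, 1]
```

By the perceptron convergence theorem this rule is guaranteed to find a solution whenever the chosen internal representations make each unit's task linearly separable, which is exactly the property the paper's first stage is designed to ensure.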


Journal Article
TL;DR: The print-to-sound connectionist model provides support for a modified dual-route hypothesis involving indirectly interactive routes and verifies the hypothesis' consistency with a set of replicable psychological data.
Abstract: This paper describes a connectionist model of print-to-sound transformation ("word naming" or "reading aloud"). The associative network it uses is based on published studies of oral reading, and simulation results are compared to experimental data in the psychological literature. The results obtained are of interest for two separate reasons. First, the print-to-sound connectionist model is based on an indirectly interactive dual-route hypothesis of reading aloud. The model confirms that this hypothesis, when implemented as a detailed and sizeable computer simulation, can account qualitatively for a number of behavioral phenomena such as regularity and word frequency effects. The model thus provides support for a modified dual-route hypothesis involving indirectly interactive routes and verifies the hypothesis' consistency with a set of replicable psychological data. The second reason the print-to-sound connectionist model is of interest is that it uses a new approach to implementing competitive dynamics in connectionist models. Focused spread of network activation and avoidance of network saturation are produced by using a competitive activation mechanism rather than explicit inhibitory links between competing nodes. The print-to-sound model demonstrates for the first time that competitive activation mechanisms can function usefully in relatively large, complex situations of interest in cognitive psychology and artificial intelligence.


Journal Article
TL;DR: A simple lattice gas model for solving the linear wave equation is presented; it uses a photon representation, and energy and momentum are shown to be conserved.
Abstract: A simple lattice gas model for solving the linear wave equation is presented. In this model a photon representation is used. Energy and momentum are shown to be conserved.

Journal Article
TL;DR: In this article, the authors study the existence of large basins of attraction in both hetero-associative and auto-associative systems and study the size of these basins.
Abstract: We study the performance of a neural network of the perceptron type. We isolate two important sets of parameters which render the network fault tolerant (existence of large basins of attraction) in both hetero-associative and auto-associative systems and study the size of the basins of attraction (the maximal allowable noise level still ensuring recognition) for sets of random patterns. The relevance of our results to the perceptron's ability to generalize is pointed out, as is the role of diagonal couplings in the fully connected Hopfield model.
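The basin-size measurement described here, the maximal noise level still ensuring recognition, is easy to approximate numerically. The sketch below is our own illustration, not the paper's setup: a Hebbian auto-associative network with the diagonal couplings zeroed (the usual Hopfield choice), probed by flipping increasing numbers of spins and checking whether the stored pattern is still recovered.

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 200, 5                               # neurons, stored random patterns
patterns = rng.choice([-1, 1], size=(P, N))

J = (patterns.T @ patterns) / N             # Hebbian couplings
np.fill_diagonal(J, 0)                      # diagonal couplings set to zero

def recall(s, steps=20):
    for _ in range(steps):                  # synchronous threshold updates
        s = np.sign(J @ s)
        s[s == 0] = 1
    return s

def max_noise_recovered(p):
    # largest tested number of flipped spins from which p is still recovered
    ok = 0
    for flips in range(0, N // 2, 5):
        s = p.copy()
        idx = rng.choice(N, size=flips, replace=False)
        s[idx] *= -1
        if np.array_equal(recall(s), p):
            ok = flips
    return ok

print(max_noise_recovered(patterns[0]))
```

At this low loading (P/N = 0.025) the basins are large: recall typically survives tens of percent of flipped spins, which is the kind of fault tolerance the abstract refers to.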

Journal Article
TL;DR: The problem of deciding whether a given configuration has a predecessor is investigated and shown to be solvable in polynomial time or NP-complete, depending on the underlying monoid; a linear-time algorithm is given to decide reversibility of unicyclic graphs.
Abstract: We study cellular automata with additive rules on finite undirected graphs. The addition is carried out in some finite abelian monoid. We investigate the problem of deciding whether a given configuration has a predecessor. Depending on the underlying monoid, this problem is solvable in polynomial time or NP-complete. Furthermore, we study the global reversibility of cellular graph automata based on addition modulo two. We give a linear-time algorithm to decide reversibility of unicyclic graphs.
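For addition modulo two, the global map of such an automaton is linear over GF(2), so deciding whether a configuration has a predecessor amounts to solving a linear system. The sketch below is our own illustration (the cycle graph C_5 and the particular rule matrix are assumptions, not taken from the paper): it solves A x = y over GF(2) by Gaussian elimination, where row i of A says that cell i becomes the mod-2 sum of itself and its two neighbors.

```python
import numpy as np

def gf2_solve(A, b):
    # Gaussian elimination over GF(2); returns x with A x = b (mod 2), or None
    A = A.copy() % 2
    b = b.copy() % 2
    n, m = A.shape
    piv_cols, row = [], 0
    for col in range(m):
        rows = [r for r in range(row, n) if A[r, col]]
        if not rows:
            continue
        A[[row, rows[0]]] = A[[rows[0], row]]
        b[[row, rows[0]]] = b[[rows[0], row]]
        for r in range(n):
            if r != row and A[r, col]:
                A[r] ^= A[row]          # XOR is addition mod 2
                b[r] ^= b[row]
        piv_cols.append(col)
        row += 1
    if any(b[row:]):                    # inconsistent system: no predecessor
        return None
    x = np.zeros(m, dtype=int)
    for r, c in enumerate(piv_cols):
        x[c] = b[r]
    return x

# rule matrix on the cycle graph C_5: cell i <- (cell i-1 + cell i + cell i+1) mod 2
n = 5
A = np.eye(n, dtype=int)
for i in range(n):
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = 1

y = np.array([1, 0, 0, 0, 0])
x = gf2_solve(A, y)
print(x is not None and list((A @ x) % 2) == list(y))
```

For this particular graph the rule matrix happens to be invertible over GF(2), so every configuration has a (unique) predecessor; on other graphs the same elimination detects inconsistency, i.e. Garden-of-Eden configurations.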


Journal Article
TL;DR: Simulated annealing is applied to the problem of teaching feed-forward neural networks with discrete-valued weights; several examples, including the parity and "clump-recognition" problems, are treated, scaling with network complexity is discussed, and the viability of mean-field approximations to the annealing process is considered.
Abstract: Simulated annealing is applied to the problem of teaching feed-forward neural networks with discrete-valued weights. Network performance is optimized by repeated presentation of training data at lower and lower temperatures. Several examples, including the parity and "clump-recognition" problems, are treated, scaling with network complexity is discussed, and the viability of mean-field approximations to the annealing process is considered.
1. Introduction
Back propagation [1] and related techniques have focused attention on the prospect of effective learning by feed-forward neural networks with hidden layers. Most current teaching methods suffer from one of several problems, including the tendency to get stuck in local minima, and poor performance in large-scale examples. In addition, gradient-descent methods are applicable only when the weights can assume a continuum of values. The solutions reached by back propagation are sometimes only marginally stable against perturbations, and rounding off weights after or during the procedure can severely affect network performance. If the weights are restricted to a few discrete values, an alternative is required.
At first sight, such a restriction seems counterproductive. Why decrease the flexibility of the network any more than necessary? One answer is that truly tunable analog weights are still a bit beyond the capabilities of current VLSI technology. New techniques will no doubt be developed, but there are other more fundamental reasons to prefer discrete-weight networks. Application of back propagation often results in weights that vary greatly, must be precisely specified, and embody no particular pattern. If the network is to incorporate structured rules underlying the examples it has learned, the weights ought often to assume regular, integer values. Examples of such very structured sets of weights include the "human solution" to the "clump-recognition" problem [2] discussed in section 2.2, and the configuration presented in reference [1] that solves the parity problem.
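The annealing procedure can be sketched for the smallest parity problem (XOR) with weights restricted to discrete values. This is an illustrative reconstruction, not the authors' code: the 2-2-1 network of threshold units, the weight set {-1, 0, 1}, the cooling schedule, and the one-weight move set are all our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
T = np.array([0, 1, 1, 0])                       # XOR: 2-bit parity

def forward(w, x):
    # 2-2-1 network of binary linear threshold units, 9 parameters total
    W1, b1 = w[:4].reshape(2, 2), w[4:6]
    W2, b2 = w[6:8], w[8]
    h = (W1 @ x + b1 > 0).astype(int)            # hidden layer
    return int(W2 @ h + b2 > 0)                  # output unit

def error(w):
    return sum(forward(w, x) != t for x, t in zip(X, T))

w = rng.integers(-1, 2, size=9)                  # all weights in {-1, 0, 1}
err = error(w)
best = err
for step in range(4000):
    temp = 2.0 * 0.999 ** step                   # geometric cooling schedule
    cand = w.copy()
    cand[rng.integers(9)] = rng.integers(-1, 2)  # perturb one weight
    d = error(cand) - err
    # Metropolis rule: always accept improvements, sometimes accept uphill moves
    if d <= 0 or rng.random() < np.exp(-d / max(temp, 1e-12)):
        w, err = cand, err + d
        best = min(best, err)
print(best)
```

Because the weights are discrete, the state space is finite (3^9 configurations here), which is what makes an annealed search over weight configurations a natural alternative to gradient descent.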

Journal Article
TL;DR: The results of Fujiki and Dickinson on the Iterated Prisoners' Dilemma problem (IPD) are confirmed and possibilities for further work are outlined.
Abstract: A system is described in which a number of artificial agents (represented by simple mathematical expressions) compete for the right to "reproduce" (that is, to cause new agents with similar properties to be generated). By simulating some of the essential features of biological evolution, the system makes possible some novel insights into the behavior of communities of agents over time. The results of Fujiki and Dickinson on the Iterated Prisoners' Dilemma problem (IPD) are essentially confirmed. The typical course of evolution of a community of IPD players is described, and possibilities for further work are outlined. This study is also relevant to machine learning and adaptive systems in general.

Journal Article
TL;DR: In this article, a theory of parity filter automata is presented; period and velocity theorems for particles, existence and uniqueness theorems, conservation and monotone nonconservation laws, and phase shifts in soliton collisions are proved.
Abstract: Parity filter automata are a class of two-state cellular automata on the integer grid points of the real line in which cells are updated serially from left to right in each time period rather than synchronously in parallel. Parity filter automata support large numbers of "particles," or persistent repeating configurations, and the collision of these particles is frequently a "soliton" collision in which the particles interact, but from which both emerge with their identities preserved. This paper presents a theory of such parity filter automata. Period and velocity theorems for particles, existence and uniqueness theorems, conservation and monotone nonconservation laws, duration and phase shifts in soliton collisions, and other results are proved.

Journal Article
TL;DR: A family of networks with symmetric weight connections is constructed and shown to reach a stable configuration when the descent update rule is used, and it is shown that the configuration remains stable when the rule is changed.



Journal Article
TL;DR: Scaling laws with new anomalous exponents are found both for optimal forecasts and for forecasts which are nearly optimal; the same remarks hold not only for forecasting but also for data compression.
Abstract: We deal in this paper with the difficulty of performing optimal or nearly optimal forecasts of discrete symbol sequences generated by very simple models. These are spatial sequences generated by elementary one-dimensional cellular automata after one time step, with completely random input strings. They have positive entropy and thus cannot be entirely predicted. Making forecasts which are optimal within this limitation is proven to be surprisingly difficult. Scaling laws with new anomalous exponents are found both for optimal forecasts and for forecasts which are nearly optimal. The same remarks hold not only for forecasting but also for data compression.

Journal Article
TL;DR: It is shown that if the coupling vector is symmetric, the periods of the automaton always divide (k + 1); the case of reversible systems is also studied, and reversibility is characterized in terms of the coupling coefficients.
Abstract: We study the dynamics of an automaton with memory whose equation is the following: x_{n+1} = 1( Σ_{i=0}^{k-1} a_i x_{n-i} − θ ), where a = (a_i), i = 0, ..., k−1, denotes the coupling coefficients vector. We show that if a is symmetric, then we can introduce an energy operator; thereby we state that the periods of the automaton always divide (k + 1) and give a bound on the transient. We also study the case of reversible systems and characterize reversibility versus the coupling coefficients. Thereafter, we give some results about the pivot sums systems. Some conjectures concerning the general case are given.
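Reading the update as a threshold dynamics x_{n+1} = 1(Σ a_i x_{n-i} − θ), with 1(·) the Heaviside step, is our interpretation of the equation as printed. Under that reading, the period statement can be checked exhaustively for a small symmetric coupling vector: every orbit of the k = 3 rule below settles into a cycle whose period divides k + 1 = 4.

```python
from itertools import product

def step(state, a, theta):
    # state = (x_n, x_{n-1}, ..., x_{n-k+1}); threshold (Heaviside) update
    s = sum(ai * xi for ai, xi in zip(a, state))
    new = 1 if s - theta > 0 else 0
    return (new,) + state[:-1]

def eventual_period(state, a, theta):
    # iterate until a state repeats; the period is the cycle length
    seen, t = {}, 0
    while state not in seen:
        seen[state] = t
        state = step(state, a, theta)
        t += 1
    return t - seen[state]

k, a, theta = 3, (1, 1, 1), 1.5           # a symmetric coupling vector
periods = {eventual_period(s, a, theta) for s in product((0, 1), repeat=k)}
print(periods)                            # every period here divides k + 1 = 4
```

For this majority-like choice of a, all eight initial states in fact reach fixed points (period 1), consistent with the divisibility theorem; other symmetric coupling vectors produce genuine period-2 and period-4 cycles.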

Journal Article
TL;DR: A complete analysis of the evolution of single particles is presented; in particular, conditions are given for the existence of periodic particles, and the period is computed in terms of the initial data.
Abstract: An initial configuration of the filter cellular automaton introduced by Park, Steiglitz, and Thurston can be thought of as consisting of a number of particles. Here we present a complete analysis of the evolution of single particles. In particular, conditions are given for the existence of periodic particles, and the period is computed in terms of the initial data.

Journal Article
TL;DR: The predicted capacity of one of the schemes is compared with actual measurements of the coarse-coded working memory of DCPS, Touretzky and Hinton's distributed connectionist production system, and a simple linear relationship between the resources allocated to the system and the capacity they yield is found.
Abstract: Coarse-coded symbol memories have appeared in several neural network symbol processing models. They are static memories that use overlapping codes to store multiple items simultaneously. In order to determine how these models would scale, one must first have some understanding of the mathematics of coarse-coded representations. The general structure of coarse-coded symbol memories is defined, and their strengths and weaknesses are discussed. Memory schemes can be characterized by their memory size, symbol-set size, and capacity. We derive mathematical relationships between these parameters for various memory schemes, using both analysis and numerical methods. We find a simple linear relationship between the resources allocated to the system and the capacity they yield. The predicted capacity of one of the schemes is compared with actual measurements of the coarse-coded working memory of DCPS, Touretzky and Hinton's distributed connectionist production system. Finally, we provide a heuristic algorithm for generating receptive fields which is efficient and produces good results in practice.
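A toy coarse-coded symbol memory makes the capacity trade-off visible. The scheme below is a generic illustration, not DCPS itself; the memory size, field size, symbol names, and retrieval-by-full-field-match rule are all our assumptions.

```python
import random

random.seed(3)
MEMORY_SIZE, FIELD_SIZE = 64, 12          # hypothetical sizes
symbols = [f"sym{i}" for i in range(20)]

# a fixed random receptive field (subset of memory units) for every symbol
field = {s: frozenset(random.sample(range(MEMORY_SIZE), FIELD_SIZE))
         for s in symbols}

def store(items):
    # storing a set of symbols = turning on the union of their units
    units = set()
    for s in items:
        units |= field[s]
    return units

def retrieve(units):
    # a symbol is read out when its whole receptive field is active; as the
    # memory fills, unstored "ghost" symbols may satisfy this too, which is
    # what limits capacity
    return {s for s in symbols if field[s] <= units}

stored = ["sym0", "sym1", "sym2"]
print(sorted(retrieve(store(stored))))
```

Storing more items activates more units and raises the chance that some unstored symbol's field is accidentally covered; the paper's capacity analysis is essentially about how many items can be stored before such ghosts appear.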

Journal Article
TL;DR: A new "energy function" is introduced which yields a bound for the transient; a novel feature of this energy is that it contains not only linear and bilinear terms, as is common, but also terms involving the minimum function.
Abstract: The work of Ghiglia, Mastin, and Romero on a "phase-unwrapping" algorithm gives rise to the following operation: for any undirected graph with arbitrary integer values attached to the vertices, simultaneous updates are performed on these values, with the value of a vertex being changed by one in the direction of the average of the values of the adjacent vertices. (When the average equals the value of a vertex, the value of the vertex is incremented by one, unless all the neighbors have the same value, in which case no change is made.) Earlier work of Odlyzko and Randall showed that iterating this operation always leads to a cycle of length one or two, but did not give a bound on how many iterations might be needed to reach such a cycle. This paper introduces a new "energy function" which does yield a bound for the transient. A novel feature of this energy is that it contains not only linear and bilinear terms, as is common, but also terms involving the minimum function.
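The operation itself is easy to state in code. The sketch below implements the update rule as described (synchronous, each vertex moving by one toward the average of its neighbors, with the stated tie rule) and empirically confirms the Odlyzko-Randall conclusion that iteration ends in a cycle of length one or two; the example graph and initial values are our own choices.

```python
import random

def update(values, adj):
    # one synchronous step: each vertex moves by one toward the average of
    # its neighbors; on a tie it moves up, unless all neighbors are equal
    new = list(values)
    for v, nbrs in enumerate(adj):
        avg = sum(values[u] for u in nbrs) / len(nbrs)
        if avg > values[v]:
            new[v] = values[v] + 1
        elif avg < values[v]:
            new[v] = values[v] - 1
        elif any(values[u] != values[v] for u in nbrs):
            new[v] = values[v] + 1
    return new

def cycle_length(values, adj, max_steps=10000):
    # iterate until a configuration repeats; return the cycle length
    seen = {}
    for t in range(max_steps):
        key = tuple(values)
        if key in seen:
            return t - seen[key]
        seen[key] = t
        values = update(values, adj)
    return None

random.seed(5)
n = 8
adj = [[(i - 1) % n, (i + 1) % n] for i in range(n)]   # the cycle graph C_8
vals = [random.randint(-10, 10) for _ in range(n)]
print(cycle_length(vals, adj))
```

The paper's contribution is not this observation but a bound on how long the transient before the cycle can be, obtained from an energy function containing minimum-function terms.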


Journal Article
TL;DR: It is investigated whether quenched mixtures of two Boolean functions on a square lattice exhibit a phase transition as a function of the degree of mixing, particularly when one rule is chosen as forcing and the other as non-forcing.
Abstract: We consider cellular automata on a square lattice with nearest-neighbor inputs. Our automata are quenched mixtures of two Boolean functions. We investigate if there is a phase transition as a function of the degree of mixing, particularly if one chooses one rule as forcing and the other as non-forcing.
1. Introduction
The Kauffman model [1,2,3] is a cellular automaton [4] defined on a lattice where one associates a binary variable p_i to each site which is either zero or unity. The time evolution for each site is determined by a rule randomly picked among the complete set of Boolean functions of K inputs. In the present work, the K inputs are the nearest neighbors on a square lattice and in one dimension. This model presents under certain conditions a phase transition between a frozen phase and a chaotic one. In the frozen phase, a disturbance cannot propagate, whereas in the chaotic phase it does. Derrida and Stauffer [5] have explored the Kauffman case, where they used instead of K as varying parameter the probability p for a Boolean function to have the value 1 (most early studies were made for p = 0.5). They treat the annealed case (in each iteration, the Boolean functions are changed) and the quenched one (once the Boolean functions are chosen for t = 0, they are kept for all times). In both cases, a phase transition was found. A Boolean function is said to be forcing [13] if at least one input site, assuming a determined value, determines the output of the function; for example, the logical OR for the value 1 and AND for the value 0. We know that among the Boolean functions in the Kauffman model there are forcing functions and non-forcing ones. Neighboring forcing functions tend to correlate the system, favoring an ordered phase, whereas non-forcing structures tend to scatter 0's and 1's, disorganizing the system. So there exists competition between rules in Kauffman's model. A damage is defined [6] as the number of sites differing as a result of the time evolution of a single error introduced in the system, if initially one p_i is changed. Of course, damage (in the sense of genetic mutation) propagates easier in weakly correlated systems. It has been proposed [13,7] that competition between forcing and non-forcing functions is responsible for the phase transition in cellular automata like the Kauffman model. In this work, we want to investigate this question in more detail. We consider the quenched case and mix [7] only two rules F1 and F2, for example one forcing and another non-forcing. Specifically, we consider only symmetric rules, i.e., F(1, 2, 3, ..., k−1, k) = F(1, k, k−1, ..., 3, 2), where 1 is the central site and 2, 3, ..., k−1, k are its neighbors. We take as varying parameter the probability p to have rule F1 on a given site and 1 − p to have F2.
2. Method
A way to characterize the chaotic phase is through the time development of the normalized Hamming distance W(t) between two configurations {p_i(t)} and {q_i(t)} on which we apply simultaneously the same set of functions {F_i}. The distance W(t) is defined by W(t) = (1/N) Σ_{i=1}^{N} (p_i(t) − q_i(t))², where N is the number of sites of the lattice. Initially, p_i = q_i except for one randomly selected site. We thus start with two configurations having for t = 0 a distance 1/N between each other and calculate W∞ = lim_{t→∞} W(t). Since the frozen phase is insensitive to an initial disturbance, we have in this case W∞ = 0 for the limit W(t = 0) → 0. In the chaotic phase, the limit distance W∞ is different from zero. So one can use the distance W∞ as a disorder parameter. We calculate the distance Ψ∞ in the limit where the initial disturbance tends to zero [5]. Since one works with finite systems, one has to perform this via an extrapolation. Following the lines of Stanley et al. [6], we take three different initial configurations, namely A, B, and C, constructed as follows. Configuration A: the original "undamaged" configuration. Configuration B: differs from configuration A in one randomly chosen site. Configuration C: differs from configuration B in another randomly chosen site and from configuration A in these two sites. Then we extrapolate to zero initial damage. We calculate W∞ for various mixtures of rules, using periodic boundary conditions, by numerical simulation using Multi-Spin-Coding techniques [8], implemented on a Cray XMP and getting a speed of 7 * 10' updates of the three configurations per second. For improvements in the method see [12]. The initial configurations are constructed randomly, having a certain concentration q of 1's. In our data, we used q = 0.5 and verified in some cases that the results are unaffected for other values of q except for q = 0 and q = 1. In order to calculate W∞, one must let the system evolve to an equilibrium. We monitor W(t) as a function of time t and see it saturating towards W∞ after a characteristic time τ0 which is of order τ0 = 500–10000 for the systems that we considered. Thus, we iterated T times where T was chosen to be several times τ0. Two averages must be performed: one over different initial configurations and one over different distributions of rules. We perform both averages at once by choosing for each sample a new set of rules as well as a new initial configuration; typically, we average over M = 20–500 samples. Finally, in order to take the thermodynamic limit, we simulate systems of different linear sizes L. To optimize vectorization, our sizes must be multiples of 64 and the smallest choice is L = 192; our largest L was 40000 in one and 768 in two dimensions.
3. Results
We start by treating the one-dimensional problem. We choose as Boolean function F1 the generalized OR (which is true (1) if at least one of its K arguments is true) and as F2 the generalized XOR (which is true if an odd number of its arguments is true). We study the cases K = 3 (nearest neighbors and central site) and K = 5 (nearest and next-nearest neighbors and central site) and verify that there is no phase transition. In figure 1a, we show the size dependence of the order parameter Ψ∞ as a function of p and L. We see that for p = 0, Ψ∞ goes to zero as the size L goes to infinity. This shows that the rule XOR, although being non-forcing, is frozen; i.e., Ψ∞(p = 0) = 0 in the thermodynamic limit. Another interesting point is the fact that there exists a maximum in the curve of figure 1a. The height of this maximum goes to zero with increasing L. In summary, this one-dimensional system does not spread its damage over infinite distances and thus is not chaotic. Similarly, we studied the mixture of OR and XOR on the square lattice. Again, pure XOR is frozen, as is OR. For the mixture of OR and XOR, however, for 0 < p < 0.4 there is a chaotic phase, as seen in figure 1b. The points do not show a significant size dependence, but there are strong statistical fluctuations: for some initial configurations W∞ = 0 and for others W∞ ≠ 0. This is what one would expect: if the two initially damaged [9] sites and their neighbors happen to be not susceptible to damage.
(Simulations of Mixtures of Two Boolean CA Rules; L. R. da Silva, H. J. Herrmann, and L. S. Lucena. (C) 1988 Complex Systems Publications, Inc.)
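The damage-spreading measurement is straightforward to reproduce in miniature. The sketch below is our own simplified illustration (one dimension, K = 3, a single quenched OR/XOR mixture, no multi-spin coding, no averaging or extrapolation): it follows the normalized Hamming distance W(t) between a configuration and its copy with one damaged site.

```python
import numpy as np

rng = np.random.default_rng(7)
L, p, T = 192, 0.5, 200               # sites, OR-fraction, time steps

# quenched rule assignment: True -> generalized OR, False -> generalized XOR
is_or = rng.random(L) < p

def step(c):
    # K = 3 inputs: left neighbor, the site itself, right neighbor (periodic)
    s = np.roll(c, 1) + c + np.roll(c, -1)
    return np.where(is_or, s > 0, s % 2 == 1).astype(int)

a = rng.integers(0, 2, size=L)        # configuration A
b = a.copy()
b[rng.integers(L)] ^= 1               # configuration B: one damaged site

for _ in range(T):
    a, b = step(a), step(b)

W = np.mean(a != b)                   # normalized Hamming distance W(T)
print(W)
```

A full study would average W over many rule distributions and initial configurations and extrapolate in L, as the Method section describes; a single run like this only shows whether the initial damage has healed or persisted.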

Journal Article
TL;DR: A technique for the deformation of the error surface is introduced as a way of using the backpropagation algorithm to learn hard problem domains by gradually changing the shape of the error surface from a gentle to the final craggy form.
Abstract: The backpropagation algorithm for feed-forward layered neural networks is applied to a problem domain of variable difficulty. The algorithm in its basic form is shown to be very sensitive to step-size and momentum parameters as problem difficulty is increased. To counter this we suggest a way of changing them during learning for a faster and more stable gradient descent. A technique for the deformation of the error surface is introduced as a way of using the algorithm to learn hard problem domains by gradually changing the shape of the error surface from a gentle to the final craggy form. This deformation procedure is applied to a second problem domain and is shown to improve the learning performance by gradually increasing the difficulty of the problem domain so that the net may build upon past experience, rather than being subjected to a complicated set of associations from the start.
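The idea of changing the step size during learning can be illustrated with a common heuristic of the same flavor, the "bold driver" scheme; this is our stand-in, not necessarily the authors' exact rule. The step grows while the error falls and is shrunk, with the move retracted, when the error rises, which makes the descent far less sensitive to the initial step-size choice.

```python
import numpy as np

def adaptive_descent(grad, f, w, lr=0.1, up=1.1, down=0.5, steps=200):
    # "bold driver" step-size control: grow the step while the error falls,
    # shrink it and retract the move when the error rises
    err = f(w)
    for _ in range(steps):
        w_try = w - lr * grad(w)
        err_try = f(w_try)
        if err_try <= err:
            w, err = w_try, err_try
            lr *= up
        else:
            lr *= down          # reject the step and retry more cautiously
    return w, err

# an ill-conditioned quadratic error surface as a stand-in "hard" domain
A = np.diag([1.0, 10.0])
f = lambda w: 0.5 * w @ A @ w
grad = lambda w: A @ w

w, err = adaptive_descent(grad, f, np.array([1.0, 1.0]))
print(err)
```

By construction the error never increases, so the procedure is stable even when the fixed step size would have diverged along the stiff direction of the surface.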