# Adaptation, Performance and Vapnik-Chervonenkis Dimension of Straight Line Programs

TL;DR: An empirical comparison of model selection methods for Linear Genetic Programming, using an upper bound on the Vapnik-Chervonenkis (VC) dimension of classes of programs representing linear code defined by arithmetic computations and sign tests.

Abstract: We present an empirical comparison of model selection methods based on Linear Genetic Programming. Two statistical methods are compared: model selection based on Empirical Risk Minimization (ERM) and model selection based on Structural Risk Minimization (SRM). For this purpose we identify the main components that determine the capacity of some linear structures as classifiers, showing an upper bound for the Vapnik-Chervonenkis (VC) dimension of classes of programs representing linear code defined by arithmetic computations and sign tests. This upper bound is used to define a fitness based on VC regularization that performs significantly better than the fitness based on empirical risk.

## Summary (2 min read)

### 1 Introduction

- Throughout these pages the authors study some theoretical and empirical properties of a new structure for representing computer programs in the GP paradigm.
- Another advantage with respect to trees is that the slp structure can describe multivariate functions by selecting a number of assignments as the output set.
- The GP approach with slp’s can be seen as a particular case of LGP where the data structures representing the programs are lists of computational assignments.
- The authors study the practical performance of ad-hoc recombination operators for slp’s.
- This bound constitutes their basic tool in order to perform structural risk minimization of the slp structure.

### 2 Straight Line Programs: Basic Concepts and Properties

- Straight line programs are commonly used for solving problems of algebraic and geometric flavor.
- The formal definition of slp's provided in this section is taken from [2].
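The list-of-assignments view of an slp can be sketched as follows; the encoding (triples of operator and argument indices) and the function names are illustrative assumptions, not the paper's formalism.

```python
# Minimal sketch: a straight line program (slp) as a list of instructions.
# Each instruction u_i := f(a, b) applies an operator to earlier results
# or to inputs; selected assignments (here, the last one) form the output.

import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_slp(program, inputs):
    """Evaluate an slp given as [(op, i, j), ...], where negative indices
    refer to inputs and non-negative indices to earlier instructions."""
    values = []
    for op, i, j in program:
        a = inputs[-i - 1] if i < 0 else values[i]
        b = inputs[-j - 1] if j < 0 else values[j]
        values.append(OPS[op](a, b))
    return values[-1]

# Example: computes (x + y) * x for inputs (x, y)
prog = [("+", -1, -2), ("*", 0, -1)]
print(eval_slp(prog, (3, 4)))  # (3 + 4) * 3 = 21
```

Because instructions can reuse any earlier assignment, an slp can share common subexpressions that a tree would have to duplicate, and taking several assignments as outputs yields a multivariate function.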

### 3 Vapnik-Chervonenkis Dimension of Families of slp’s

- In recent years GP has been applied to a range of complex learning problems, including classification and symbolic regression, in fields such as quantum computing, electronic design, sorting, searching, and game playing.
- A common feature of both tasks is that they can be thought of as a supervised learning problem (see [5]) where the hypothesis class C is the search space described by the genotypes of the evolving structures.
- In the seventies, the work of Vapnik and Chervonenkis ([6], [7], [8]) provided a remarkable family of bounds relating the generalization performance of a learning machine to the capacity of its hypothesis class (see [9] for a modern presentation of the theory).
- The VCD depends on the class of classifiers.

### 3.1 Estimating the VC dimension of slp’s parameterized by real numbers

- Next, the following assumptions are made about the functions τ_i.
- The sv functions τ_i(w, α_j) map IR^k to IR.
- With the above setup, the following result is proved in [10].
- In the new class C the parameters α_ji, β_ji are allowed to take values in IR.
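Results of this kind are in the same family as the classical Karpinski–Macintyre bounds for concept classes cut out by polynomial conditions. A representative statement (the symbols and constants here are illustrative, not the exact bound proved in [10]) is:

```latex
% For a class C defined by boolean combinations of s polynomial
% conditions, each of degree at most d in k real parameters:
\mathrm{VCdim}(C) \;\le\; 2k \log_2(8eds)
```

The key point carried over to slp's is that the VC dimension grows with the number of parameters and with the degree and number of the sign conditions the program can test, not with the raw syntactic size alone.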

### 3.2 Estimating the Average Error of slp’s

- The authors show how to apply the bound in Equation 8 to estimate the average error with respect to the unknown distribution from which the examples are drawn.
- The average error of a classifier with parameters (α, β) is ε(α, β) = ∫ Q(t, α, β; y) dµ (Equation 16), where Q measures the loss between the semantic function of Γ(α,β) and the target concept, and µ is the distribution from which the examples {(t_i, y_i)}, 1 ≤ i ≤ m, are drawn for the GP machine.
- Vapnik's results state that the average error ε(α, β) can be estimated independently of the distribution µ(t, y) via the following formula.
- The constant η is the probability that the bound is violated.
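Bounds of this type follow the standard Vapnik form: with probability at least 1 − η over m examples, for a hypothesis class of VC dimension h (a representative statement with the classical constants; the paper's own equation may differ in details):

```latex
\varepsilon(\alpha,\beta) \;\le\;
\varepsilon_{\mathrm{emp}}(\alpha,\beta)
+ \sqrt{\frac{h\left(\ln\frac{2m}{h} + 1\right) - \ln\frac{\eta}{4}}{m}}
```

The second term is the capacity penalty: it shrinks as the sample size m grows and grows with the VC dimension h, which is what makes the bound usable for model selection.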

### 4 SLP-Based Genetic Programming

- The authors keep homogeneous populations of equal-length slp's.
- Next, the authors describe the recombination operator.
- Then a new random selection is made within the arguments of the function f ∈ F that constitutes the instruction ui.
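The recombination step for fixed-length instruction lists can be sketched as follows; the prefix-exchange scheme and names are an illustrative assumption, not the paper's exact operator.

```python
import random

def crossover(p1, p2, rng=None):
    """Sketch of recombination for two equal-length slp's encoded as
    instruction lists: exchange the segments after a random cut point,
    so offspring keep the fixed chromosome length. Illustrative only."""
    rng = rng or random.Random()
    assert len(p1) == len(p2)
    cut = rng.randrange(1, len(p1))  # random instruction u_i as cut point
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

# Usage: recombine two 3-instruction slp's with a seeded generator.
parent1 = [("+", -1, -2), ("*", 0, -1), ("+", 1, 1)]
parent2 = [("-", -1, -2), ("+", 0, 0), ("*", 2, 2)]
child1, child2 = crossover(parent1, parent2, random.Random(0))
```

Keeping the cut aligned between both parents is what preserves the equal-length invariant the population maintains.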

### 4.1 Fitness based on Structural Risk Minimization

- In this situation one chooses the model that minimizes the right side of Equation 17.
- For practical use of Equation 17 the authors adopt the following formula with appropriately chosen practical values of theoretical constants (see [12] for the derivation of this formula).
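One widely used practical form of such a VC penalization (the Cherkassky–Mulier variant, shown here as an illustration; the constants adopted in the paper via [12] may differ) scales the empirical risk by a factor depending on p = h/n:

```python
import math

def vc_penalty(h, n):
    """Practical VC penalization factor with p = h / n (Cherkassky-Mulier
    style); returns +inf when the bound degenerates. Illustrative constants,
    not necessarily those adopted in the paper."""
    p = h / n
    denom = 1.0 - math.sqrt(p - p * math.log(p) + math.log(n) / (2 * n))
    return math.inf if denom <= 0 else 1.0 / denom

def vc_fitness(emp_risk, h, n):
    # SRM-style fitness: empirical risk scaled by the capacity penalty,
    # so structures with larger VC dimension h pay a higher price.
    return emp_risk * vc_penalty(h, n)
```

Minimizing `vc_fitness` instead of `emp_risk` alone is what implements model selection by the right-hand side of the bound: two slp's with equal empirical risk are ranked by their capacity.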

### 4.2 Experimentation

- The authors consider instances of Symbolic Regression for their experimentation.
- The authors adopt slp’s as the structures that evolve within the process.
- In Table 2 the authors show the corresponding success rates for each crossover method and target function.
- The above experimental procedure is repeated 100 times using 100 different random realizations of n training samples (from the same statistical distribution).
- Accordingly, the values in the comparative rows that are bigger than or equal to 1 represent a better performance of VC-fitness.
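The reporting convention for the comparative rows can be sketched as a ratio of mean errors over the repeated trials; the function below is purely illustrative of that convention and uses no data from the paper.

```python
def comparative_ratio(erm_errors, srm_errors):
    """Ratio of mean test errors over repeated random realizations.
    Values >= 1 indicate that the VC-regularized (SRM) fitness
    outperformed the plain empirical-risk (ERM) fitness."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(erm_errors) / mean(srm_errors)
```

Averaging over 100 independent realizations of the training sample is what makes a ratio above 1 evidence of a consistent advantage rather than a lucky draw.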

### 5 Conclusions and Future Research

- The authors have calculated a sharp bound for the VC dimension of the GP genotype defined by computer programs using straight line code.
- The authors have used this bound to perform VC-based model selection under the GP paradigm showing that this model selection method consistently outperforms LGP algorithms based on empirical risk minimization.
- A second goal in their research on SLP-based GP is to study the experimental behavior of the straight line program computation model under Vapnik-Chervonenkis regularization but without assuming previous knowledge of the length of the structure.
- This investigation is crucial in practical applications for which the GP machine must be able to learn not only the shape but also the length of the evolved structures.
- To this end new recombination operators must be designed since the crossover procedure employed in this paper only applies to populations having fixed length chromosomes.


