EA Symbolic regression

Symbolic regression is a type of regression analysis that searches the space of mathematical expressions to find the model that best fits a given dataset, both in terms of accuracy and simplicity. No particular model is provided as a starting point to the algorithm. Instead, initial expressions are formed by randomly combining mathematical building blocks such as mathematical operators, analytic functions, constants, and state variables. (Usually, a subset of these primitives will be specified by the person operating it, but that’s not a requirement of the technique.) New equations are then formed by recombining previous equations, using genetic programming.
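
The random initialization described above can be illustrated with a small Python sketch. It is not taken from the exercise code: the primitive set (BINARY_OPS, UNARY_FUNCS), the nested-tuple tree representation, and the function names random_expression and evaluate are all assumptions made for illustration.

```python
import math
import random

# Hypothetical primitive set; in practice the user would usually choose these.
BINARY_OPS = {
    "+": lambda left, right: left + right,
    "-": lambda left, right: left - right,
    "*": lambda left, right: left * right,
}
UNARY_FUNCS = {"sin": math.sin, "cos": math.cos}


def random_expression(depth, rng):
    """Build a random expression tree as nested tuples."""
    if depth == 0 or rng.random() < 0.3:
        # Leaf node: either the state variable x or a random constant.
        if rng.random() < 0.5:
            return ("x",)
        return ("const", rng.uniform(-1.0, 1.0))
    if rng.random() < 0.3:
        # Unary analytic function applied to a random subtree.
        return (rng.choice(sorted(UNARY_FUNCS)), random_expression(depth - 1, rng))
    # Binary operator combining two random subtrees.
    return (
        rng.choice(sorted(BINARY_OPS)),
        random_expression(depth - 1, rng),
        random_expression(depth - 1, rng),
    )


def evaluate(expr, x):
    """Evaluate an expression tree at the point x."""
    head = expr[0]
    if head == "x":
        return x
    if head == "const":
        return expr[1]
    if head in UNARY_FUNCS:
        return UNARY_FUNCS[head](evaluate(expr[1], x))
    return BINARY_OPS[head](evaluate(expr[1], x), evaluate(expr[2], x))
```

Calling random_expression(3, random.Random(0)) yields one candidate expression of depth at most 3. A genetic programming run would create many such trees and recombine them.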

In this exercise we will try to evolve an entity (phenotype) made up of two genes, a and b. Together these genes represent our function, given by the linear equation y=a*x + b. To evaluate each entity against the provided dataset, we will use the mean squared error (MSE).

mse = (1/n) * sum((P_i - E_i)^2) for i = 1..n

where P_i is the y value our solution produces for the i-th point, E_i is the expected output from the dataset, and n is the number of points. We will therefore try to minimize the loss function (MSE) to find the best-fitting solution.
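
As a rough illustration, the mean squared error of one entity could be computed as follows. This is a minimal sketch, not the exercise's own code: the function names predict and mse, and the dataset format (a list of (x, expected y) pairs), are assumptions.

```python
def predict(a, b, x):
    """Output of the linear model y = a*x + b for a single input x."""
    return a * x + b


def mse(a, b, dataset):
    """Mean squared error of the entity (a, b) over (x, expected_y) pairs."""
    squared_errors = [(predict(a, b, x) - expected) ** 2 for x, expected in dataset]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical dataset sampled from y = 2*x + 1.
dataset = [(0, 1), (1, 3), (2, 5)]
perfect_fit = mse(2, 1, dataset)  # the entity that generated the data
poor_fit = mse(0, 0, dataset)     # an entity that always predicts 0
```

Here perfect_fit is 0 because the entity reproduces every point exactly, while poor_fit is (1 + 9 + 25) / 3, so a lower value means a better-fitting entity.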


  1. Initialize the parents of the first generation.
  2. Breed new children (apply crossovers and mutations).
  3. Evaluate each child (compute the MSE).
  4. Pick the best children as the parents of the next generation, then repeat from step 2.
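
The four steps above could be sketched in Python roughly as follows. This is one possible implementation under stated assumptions, not the solution from the exercise archive: uniform crossover, Gaussian mutation, the population sizes, the mutation strength sigma, and all function names are choices made for illustration.

```python
import random


def mse(a, b, dataset):
    """Mean squared error of y = a*x + b over (x, expected_y) pairs."""
    return sum((a * x + b - expected) ** 2 for x, expected in dataset) / len(dataset)


def crossover(parent1, parent2, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return tuple(rng.choice(pair) for pair in zip(parent1, parent2))


def mutate(genes, rng, sigma=0.1):
    """Gaussian mutation: add small random noise to every gene."""
    return tuple(gene + rng.gauss(0.0, sigma) for gene in genes)


def evolve(dataset, n_parents=10, n_children=50, generations=300, seed=0):
    rng = random.Random(seed)
    # 1. Initialize the first parents with random (a, b) genes.
    parents = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(n_parents)]
    for _ in range(generations):
        # 2. Breed children by crossover followed by mutation.
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(n_children)
        ]
        # 3. Evaluate each child; sorting puts the lowest MSE first.
        children.sort(key=lambda genes: mse(*genes, dataset))
        # 4. Keep only the best children as the next parents.
        parents = children[:n_parents]
    return parents[0]  # best entity of the final generation


# Hypothetical dataset sampled from y = 2*x + 1.
dataset = [(x, 2 * x + 1) for x in range(-5, 6)]
best_a, best_b = evolve(dataset)
```

In this sketch the parents are replaced entirely by their children each generation, so the best MSE can fluctuate slightly between generations. Keeping the best parent as well (elitism) is a common alternative.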


Data           Value
Source         https://en.wikipedia.org/wiki/Symbolic_regression
Code           EA_symb_regresion_code.zip
Solution code  EA_symb_regresion_solution_code.zip