Numerical precision problem in the outputted expression #15

folivetti · 2022-11-21T11:59:55Z

Running some experiments with SBP using this dataset:

And by running this script:

import re

import numpy as np
from pyGPGOMEA import GPGOMEARegressor as GPG

def standardNotation(expr):
    expr = (expr.replace("X0", "x0")
            .replace("X1", "x1")
            .replace("X2", "x2")
            .replace("_", "")
            .replace("+-", "-")
            .replace("--", "+")
            .replace("^", "**")
            )
    expr = re.sub(r"/(-\d+\.\d+)", r"/(\1)", expr)
    return re.sub(r"\*(-\d+\.\d+)", r"*(\1)", expr)

est = GPG( popsize=500, generations=200,
    linearscaling=True, functions='+_-_*_div_log_exp', erc=True,
    initmaxtreeheight=6, maxtreeheight=20, maxsize=1000,
    subcross=0.0, sbagx=False,
    sbrdo=0.75, submut=0.25,
    unifdepthvar=True,
    tournament=4,
    sblibtype='p_10_9999_l_n',
    caching=False,
    gomea=False, ims=False, silent=True, parallel=False, seed=1 )

z = np.loadtxt("Pagie.csv", delimiter=",")
x = z[:,:-1]
y = z[:,-1]
x0 = x[:,0]
x1 = x[:,1]

est.fit(x,y)
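# `model` is not defined in this snippet; it is assumed to be a helper that returns the fitted expression as a string
eq = standardNotation(model(est))
yhat = eval(eq)  # evaluates the expression using the x0 and x1 arrays defined above
yhat2 = est.predict(x)
# mean squared difference between the output of `predict` and the evaluated symbolic model
print(np.square(yhat-yhat2).mean())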

I get a mean squared error of 5624673608570.937. As discussed, this is possibly due to truncation of the coefficient values.


marcovirgolin · 2022-11-24T09:49:56Z

Thank you @folivetti. This is indeed a rounding problem: the C++ code works in double precision, but the output displays only the first (I think) 5 digits of each coefficient.
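
As a rough illustration of how a few dropped digits can produce a large error, here is a minimal sketch. The coefficients, the `exp` term, and the helper `format_like_truncated` are made up for the example and are not taken from a GP-GOMEA run; the 6-significant-digit printer is only an assumption about how the output is shortened.

```python
import math


def format_like_truncated(value, digits=6):
    """Hypothetical stand-in for a printer that keeps only `digits` significant digits."""
    return format(value, f".{digits}g")


# Made-up coefficients for an expression of the form scale * exp(rate * x).
scale = 0.000123456789
rate = 2.718281828459045
x = 12.0

# Value computed with the full double-precision coefficients.
exact = scale * math.exp(rate * x)

# Value computed after the coefficients went through the shortened text form.
scale_txt = float(format_like_truncated(scale))
rate_txt = float(format_like_truncated(rate))
approx = scale_txt * math.exp(rate_txt * x)

abs_err = abs(exact - approx)
rel_err = abs_err / abs(exact)
print(f"exact={exact!r} approx={approx!r} abs_err={abs_err:.3g} rel_err={rel_err:.3g}")
```

The relative error stays small, but because the expression itself is large, the absolute error (and hence the squared error) is not. Operators such as `exp` and `div` can amplify this further.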

One way to fix this is to restrict the evolution to work up to a certain numerical precision. Another (which you suggested, and which I note here so I remember it) is to use scientific notation for the output.
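
A minimal Python sketch of the scientific-notation idea, under stated assumptions: the coefficient is made up, and `to_text` and `wrap_negatives` are hypothetical helpers, not part of pyGPGOMEA. It relies on the fact that 17 significant digits are enough to round-trip any IEEE 754 double through text. It also shows that a post-processing regex of the form `-\d+\.\d+`, as in the script above, would need to accept an exponent part if the output switched to this notation.

```python
import re


def to_text(value, digits=17):
    """Format a float in scientific notation with `digits` significant digits."""
    return format(value, f".{digits - 1}e")


coeff = -1.2345678901234567e-07  # made-up coefficient

short = format(coeff, ".6g")  # shortened form, loses information
full = to_text(coeff)         # 17 significant digits, round-trips exactly

short_ok = float(short) == coeff
full_ok = float(full) == coeff
print(short, short_ok)
print(full, full_ok)

# Pattern for a negative literal with optional fraction and optional exponent.
NEG_NUM = r"-\d+(?:\.\d+)?(?:[eE][+-]?\d+)?"


def wrap_negatives(expr):
    """Parenthesise negative literals that directly follow '*' or '/'."""
    return re.sub(rf"([*/])({NEG_NUM})", r"\1(\2)", expr)


expr = f"x0*{full}/-2.5e+00"
wrapped = wrap_negatives(expr)
print(wrapped)
```

Whether full-precision scientific notation or a precision limit during evolution is preferable is a design choice; the first keeps the search unchanged but makes the printed expressions longer.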

Gotta find some time to do that, though. For now, I suggest using est.predict instead of re-interpreting the formula, to get the correct prediction.

marcovirgolin added the "enhancement" and "help wanted" labels · Nov 24, 2022
