Improve closure by returning more reasonable values #130
Comments
Or just add epsilon... 😄 Sometimes the easier solution doesn't immediately come to mind.
0.001 is quite a high value for protected division.
The value I used was a bit of a hand-wringer at the time. What's a small number? What's a big number? :-) In the end I chose a value that I felt wasn't going to explode the output values "too much" ... https://cswww.essex.ac.uk/staff/rpoli/gp-field-guide/B3SourceCode.html#24_3 used 0.001 for protected division, so I just adopted that. It felt reasonable at the time. I haven't been challenged on it until now, but figured that Poli, Langdon & McPhee wouldn't be spreading things that were not reasonable. I'll carry on the discussion at #131.
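For reference, a minimal sketch of the kind of protected division under discussion (the 0.001 threshold and the 1.0 fallback are the values mentioned above; the function name and signature are illustrative, not gplearn internals):

import numpy as np

def protected_division(x1, x2, eps=0.001):
    # Where the denominator is within eps of zero, return 1.0 instead of
    # dividing, so the node never emits inf or nan.
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(np.abs(x2) > eps, np.divide(x1, x2), 1.0)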
I can't help but think... why not make a closure at the fitness evaluation?
>>> 1 / np.zeros(1)
array([inf])
>>> -1 / np.zeros(1)
array([-inf])
>>> y_pred = np.array([float('-inf'), float('nan'), float('inf')])
>>> y = np.array([1, 2, 3])
>>> y_pred - y
array([-inf, nan, inf])
>>> err = (y_pred - y)
>>> np.where(np.isnan(err), np.inf, err)
array([-inf, inf, inf])
And if you want to clip:
>>> MAX_PENALTY = 1000
>>> np.clip(np.where(np.isnan(err), np.inf, err), -MAX_PENALTY, MAX_PENALTY)
array([-1000., 1000., 1000.])
Infinitely punishing NaNs is presented in "Genetic programming as a model induction engine".
But I couldn't find any scientific reference for the "MAX_PENALTY" value I came up with. Treating closure at the fitness-function level would allow much greater flexibility in the supported building-block functions and eliminate the effort of ensuring closure for each of them. In addition, it would eliminate the need for an arbitrary "safety" threshold.
@danuker Making the custom fitness function map NaN to infinity works for me:

import numpy as np
from gplearn.genetic import SymbolicRegressor
from gplearn.fitness import make_fitness

def _fitness(y, y_pred, w):
    # Example for mean squared error; replace the line below as appropriate for other error types.
    r = np.average((y_pred - y) ** 2, weights=w)
    # Any NaN produced by an unstable program becomes +inf, i.e. the worst possible error.
    return np.nan_to_num(r, nan=float("inf"))

est = SymbolicRegressor(...
    metric=make_fitness(_fitness, greater_is_better=False),
)

So far it seems to work fine. Population average fitness is always inf (any NaN-producing program drags the mean to infinity), but the evolution itself is unaffected. So IMHO it would be great if this was available out-of-the-box via some keyword argument.
Evolutionarily speaking, it would make more sense to assign a penalty to those functions. Giving them the worst fitness is akin to simply deleting them from the population.
@hwulfmeyer That is true. But removing numerically unstable individuals means the remaining ones will be stable, and it is objectively much worse for a model to be unstable. We might lose diversity, but that can be addressed in other ways. For instance, Nutonian Eureqa preserves greater population diversity by not using just one fitness metric, but by letting all Pareto-nondominated individuals survive, i.e. those for which no other individual is better in both error and complexity.
This is already requested in #33.
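As an illustration of that Pareto-nondominated selection (error and complexity as the two objectives; the function name and inputs are assumptions for the sketch, not gplearn code):

import numpy as np

def pareto_front(errors, complexities):
    # Keep every individual for which no other individual is at least as
    # good in both objectives and strictly better in at least one.
    errors = np.asarray(errors, dtype=float)
    complexities = np.asarray(complexities, dtype=float)
    survivors = []
    for i in range(len(errors)):
        dominated = np.any(
            (errors <= errors[i]) & (complexities <= complexities[i])
            & ((errors < errors[i]) | (complexities < complexities[i]))
        )
        if not dominated:
            survivors.append(i)
    return survivors

For example, pareto_front([0.1, 0.2, 0.3], [10, 5, 7]) keeps indices 0 and 1 but drops 2, because individual 1 is both more accurate and simpler than individual 2.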
If you assign a penalty to individuals which are illegal, the evolutionary process will handle the stability itself. Unstable individuals will have a lower chance to reproduce and thus to produce further unstable individuals, while there is still a small chance that they reproduce, which gives us a chance not to lose all of their genetic material. Anyway, this discussion is unnecessary I guess. I just wanted to point out that there are different approaches to handling illegal individuals besides assigning the worst fitness, as sketched below. One could also try to implement 'repairing', which is a technique known from genetic algorithms. Pareto-nondominated genetic programming is a great approach but has one weakness: complexity is not something that necessarily needs to be optimized. Anyway, there never is only one true solution to something, especially here. It will be good to have several different approaches.
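A minimal sketch of that penalty-based alternative, reusing the MAX_PENALTY idea from above (the constant and the error metric are arbitrary choices for illustration):

import numpy as np

MAX_PENALTY = 1000.0  # arbitrary finite penalty, large relative to typical errors

def _penalised_fitness(y, y_pred, w):
    # Mean squared error, but non-finite predictions get a large finite
    # penalty rather than the worst possible fitness, so unstable
    # individuals are strongly disfavoured yet can still occasionally reproduce.
    err = (y_pred - y) ** 2
    err = np.where(np.isfinite(err), err, MAX_PENALTY)
    return np.average(np.clip(err, 0.0, MAX_PENALTY), weights=w)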
Sure, it's not mandatory, but it is a good second metric to be good at, because all things being equal, you prefer the simpler explanation (Occam's razor). In addition, there are some performance benefits (evaluating a complex function is more expensive). Edit: it also acts a bit like regularization by preferring simpler models (and I suspect offers a slight reduction in overfitting).
I agree. It would be hard to find a method that is fastest at everything. Better to implement several and let people try them out. But I have used Eureqa, and it seems to work better than most GP regression implementations; that is why I insisted.
Very interesting discussion! I created a related issue (related to protected operators and closure, that is): #242.
From the introduction:
Are these return values widely accepted?
If the denominator lies between -0.001 and 0.001, why should the node return 1.0? Alternatively,
Similar approaches could be used for the logarithm.
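In the same spirit, a sketch of what a protected logarithm along those lines could look like (the eps threshold and the 0.0 fallback are assumptions mirroring the division case, not a statement of the library's actual implementation):

import numpy as np

def protected_log(x1, eps=0.001):
    # Take the logarithm of the absolute value, and fall back to 0.0 when
    # the argument is too close to zero for the log to be meaningful.
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(np.abs(x1) > eps, np.log(np.abs(x1)), 0.0)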