Metrics support for sweeping #148
I think it is a great addition.
Yeah, maybe something like this: https://ax.dev/tutorials/tune_cnn.html. But I do imagine that any sort of sweeping requires us to be able to a) select the target metric and b) compare two runs to see if the metric …
@maximsch2 what do you think about something like this:

```python
from copy import deepcopy

import torch
from torchmetrics import MeanSquaredError  # assuming the torchmetrics MSE metric


# Registry mapping a metric class to its optimization behaviour: whether it
# should be minimized, how to compare two values, the initial "worst" value,
# and (optionally) which output to compare for multi-output metrics.
_REGISTER = {}


def register(metric, minimize, index=None):
    if minimize:
        compare_fn = torch.less
        init_val = torch.tensor(float("inf"))
    else:
        compare_fn = torch.greater
        init_val = -torch.tensor(float("inf"))
    _REGISTER[metric] = (minimize, compare_fn, init_val, index)


register(MeanSquaredError, True)


class MetricCompare:
    """Wrapper that tracks whether the wrapped metric has improved."""

    def __init__(self, metric):
        self.base_metric = metric
        minimize, compare_fn, init_val, index = _REGISTER[type(metric)]
        self._minimize = minimize
        self._compare_fn = compare_fn
        self._index = index
        self._init_val = init_val
        self._new_val = deepcopy(init_val)
        self._old_val = deepcopy(init_val)

    def update(self, *args, **kwargs):
        self.base_metric.update(*args, **kwargs)

    def compute(self):
        self._old_val = self._new_val
        val = self.base_metric.compute()
        self._new_val = val.detach()
        return val

    def reset(self):
        self.base_metric.reset()
        self._new_val = deepcopy(self._init_val)
        self._old_val = deepcopy(self._init_val)

    @property
    def has_improved(self):
        if self._index is None:
            return self._compare_fn(self._new_val, self._old_val)
        return self._compare_fn(self._new_val[self._index], self._old_val[self._index])

    @property
    def minimize(self):
        return self._minimize

    @property
    def maximize(self):
        return not self.minimize


metric = MetricCompare(MeanSquaredError())
metric.update(torch.randn(100), torch.randn(100))
val = metric.compute()
print(metric.has_improved)
```

This is basically a wrapper for metrics that adds additional properties that can tell if the metric should be minimized/maximized, and after …
Usually sweeps will be run in a distributed fashion (e.g. schedule runs with different hyperparams separately, compute metric values, pick the one with the best metric), so … Thinking about it a bit more, just providing a way to convert a metric to an optimization value might be enough (with a semantics of whether we are increasing or decreasing it). Another example of a package for hyperparameter optimization that also takes an objective: http://hyperopt.github.io/hyperopt/
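A minimal sketch, assuming hyperopt, of what "converting a metric to an optimization value" could look like; the `metric_to_objective` helper and the toy objective below are illustrative, not an API proposed in this issue:

```python
import torch
from hyperopt import fmin, hp, tpe
from torchmetrics import MeanSquaredError


def metric_to_objective(value, minimize):
    # hyperopt's fmin always minimizes, so flip the sign for metrics
    # that are meant to be maximized.
    return float(value) if minimize else -float(value)


def objective(params):
    # Stand-in for a real training run: score random predictions scaled
    # by the sampled hyperparameter with a torchmetrics metric.
    metric = MeanSquaredError()
    metric.update(params["scale"] * torch.randn(100), torch.randn(100))
    return metric_to_objective(metric.compute(), minimize=True)


best = fmin(objective, space={"scale": hp.uniform("scale", 0.1, 2.0)},
            algo=tpe.suggest, max_evals=20)
print(best)
```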
I'd like to see this implemented as well. We're using PL + Optuna (+ Hydra's plugin_sweeper_optuna) and running into the same problem, esp. when a metric of a model is configurable. I think the approach with a property … While the solutions with wrappers work, I think it'd be good if PL somehow standardized this, so that other HP optimization libraries can integrate with it.
Okay, then let's settle on adding a property to each metric.
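A rough sketch of how such a per-metric property could feed an HP optimization library such as Optuna; the `higher_is_better` attribute is exactly the property being proposed here, so it is attached by hand in this snippet rather than coming from the library:

```python
import optuna
from torchmetrics import MeanSquaredError

metric = MeanSquaredError()
# Proposed per-metric property; set manually here, since standardizing it
# is what this issue asks for.
metric.higher_is_better = False

study = optuna.create_study(
    direction="maximize" if metric.higher_is_better else "minimize"
)
# study.optimize(objective, n_trials=50)  # objective would return metric.compute()
```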
add …
I.e. Optuna lets you define a tuple …
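Presumably this refers to Optuna's multi-objective studies, where a tuple of directions is declared and the objective returns a matching tuple of values; a small sketch under that assumption, with placeholder values standing in for real metric outputs:

```python
import optuna

# One direction per value returned by the objective.
study = optuna.create_study(directions=["maximize", "maximize"])


def objective(trial):
    threshold = trial.suggest_float("threshold", 0.0, 1.0)
    # Placeholders standing in for, e.g., precision and recall at the
    # suggested threshold.
    precision = 1.0 - threshold / 2
    recall = threshold
    return precision, recall


study.optimize(objective, n_trials=10)
```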
I'd say we don't care for the first iteration and just leave these as None. And we cannot decide anyway on the Pareto-optimal front. … And you probably meant a multi-dim metric's output, not multi-dim optimization, right? Can we say that for the first draft, this feature works only for metrics that …
For multi-output metrics we need the ability to extract the value that is actually being optimized over. E.g. some metrics can return a value and the corresponding threshold (e.g. recall@precision=90%), and we only want to optimize over the actual value.
@maximsch2 @breznak @SkafteNicki how is it going here? Do we have a resolution on what to do?
I think we got stuck on the more advanced cases (e.g. metrics that return more values, as above). While I see it's important to design it well so it works for all use cases in the future, I think we should find an MVP that we can easily implement, otherwise this will likely get stuck. In practice, what we're running into is that this would ideally be a coordinated "API" for …
Could you elaborate on this example, please, @maximsch2? From what I understand, the metric returns multiple values for several thresholds. But wouldn't the direction still be the same for all of them? (recall -> max?)
@breznak since …
I think what @maximsch2 is referring to is that some metrics return multiple values, e.g. `precision, recall, thresholds = pr_curve(pred, target)`, where I basically want to optimize the precision/recall but not the threshold values.
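A possible way to express "which output to optimize" for such a multi-output metric, reusing the optional `index` idea from the registry snippet above; the `optimization_value` helper and the placeholder tensors are purely illustrative:

```python
import torch


def optimization_value(metric_output, index=None, reduce=torch.mean):
    # Pick the output the sweep should optimize over and reduce it to a
    # scalar; the other outputs (e.g. the thresholds) are ignored.
    value = metric_output if index is None else metric_output[index]
    return reduce(value)


# Stand-ins for the tensors a precision-recall-curve style metric returns.
precision = torch.tensor([0.9, 0.8, 0.7])
recall = torch.tensor([0.5, 0.7, 0.9])
thresholds = torch.tensor([0.3, 0.5, 0.7])

# Optimize over recall (index 1), ignoring the thresholds.
print(optimization_value((precision, recall, thresholds), index=1))
```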
Good to know, thanks! Then it should be easier.
How about adding a "tell us what the (1) optimization criterion is for you" to the metric, then?
I'm actually thinking that maybe let's defer the multi-output metrics to later, as long as we can support those in …
I'm for starting small, but doing it rather soon.
🚀 Feature

We would like to have tighter integration of metrics and sweeping. This requires a few features:

- `higher_is_better` (e.g. are we trying to minimize or maximize the metric in a sweep)

Alternatives

An alternative implementation would be for each metric to have `is_better(left: TMetricResult, right: TMetricResult)`, where `TMetricResult` is whatever `compute` returns.

If we don't have it, people will have to write wrappers around the metrics to support this functionality in sweepers.
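A rough sketch of what the two options could look like on a hypothetical metric class; both `higher_is_better` and `is_better` are the proposals from this issue, written out by hand here rather than relying on existing library support:

```python
import torch
from torchmetrics import Metric


class MeanError(Metric):
    # Proposed attribute: sweepers read this to pick the optimization direction.
    higher_is_better = False

    def __init__(self):
        super().__init__()
        self.add_state("sum_error", default=torch.tensor(0.0), dist_reduce_fx="sum")
        self.add_state("total", default=torch.tensor(0), dist_reduce_fx="sum")

    def update(self, preds, target):
        self.sum_error += torch.abs(preds - target).sum()
        self.total += target.numel()

    def compute(self):
        return self.sum_error / self.total

    # Alternative proposal: compare two computed results directly.
    def is_better(self, left, right):
        return left > right if self.higher_is_better else left < right
```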