ENH: Add default losses to KerasClassifier and KerasRegressor #208

Open · wants to merge 52 commits into base: master

Changes from 13 commits (of 52 commits)
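In brief: this PR lets both wrappers be constructed without an explicit loss. A minimal sketch of the intended usage, pieced together from the diff and tests below (the build_model factory is illustrative, not part of the PR):

```python
import numpy as np
import tensorflow as tf

from scikeras.wrappers import KerasClassifier, KerasRegressor


def build_model(n_outputs=4):
    # Illustrative model factory; any callable returning a Keras model works.
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=(8,)))
    model.add(tf.keras.layers.Dense(n_outputs))
    return model


X = np.random.uniform(size=(100, 8)).astype("float32")
y = np.random.choice(4, size=100)

# No loss passed: KerasClassifier falls back to "categorical_crossentropy".
clf = KerasClassifier(model=build_model)
clf.fit(X, y)

# KerasRegressor defaults to loss="mse" right in __init__.
reg = KerasRegressor(model=build_model, model__n_outputs=1)
assert reg.loss == "mse"
```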
4be5d0c
Add default loss to KerasClassifier
stsievert Feb 27, 2021
dccd92b
update message/tests
stsievert Feb 27, 2021
1d80b57
black
stsievert Feb 27, 2021
1f45285
isort
stsievert Feb 27, 2021
9358d6e
better test
stsievert Feb 28, 2021
6cc112e
Add default loss for KerasRegressor
stsievert Feb 28, 2021
80618bf
black
stsievert Feb 28, 2021
60b2404
catch binary cross entropy
stsievert Mar 1, 2021
dcb0823
black
stsievert Mar 1, 2021
0faadd9
Clean type hints in __init__
stsievert Mar 1, 2021
0449481
isort
stsievert Mar 1, 2021
ed4c1f5
change KerasRegressor.__init__
stsievert Mar 1, 2021
c58ec74
tests run
stsievert Mar 1, 2021
e73710d
MAINT
stsievert Mar 2, 2021
4e7e09f
add right loss back
stsievert Mar 2, 2021
2e830ff
Try removing binary_crossentropy check
stsievert Mar 2, 2021
e1ea339
black
stsievert Mar 2, 2021
36e6499
remove annoying 'needs linting'
stsievert Mar 2, 2021
8310834
Uncomment error
stsievert Mar 2, 2021
6ee8b50
warn for user compiled models
stsievert Mar 2, 2021
b88b74e
Union[T, None] → Optional[T]
stsievert Mar 2, 2021
3a3a536
DOC: complete docstring
stsievert Mar 2, 2021
9808cf2
DOC: complete docstring
stsievert Mar 2, 2021
d0147ac
fix loss?
stsievert Mar 2, 2021
7243995
Revert "fix loss?"
stsievert Mar 2, 2021
9735974
Warn if compiled with wrong loss
stsievert Mar 2, 2021
8cc0474
draft at loss=None
stsievert Mar 2, 2021
b0229c5
v2
stsievert Mar 2, 2021
dccfc5e
black
stsievert Mar 2, 2021
d2e23cb
Tell mypy to use type hints
stsievert Mar 2, 2021
9c3af6b
loss=None to docs
stsievert Mar 2, 2021
5121131
whoops on type hints
stsievert Mar 2, 2021
0dfa526
Update tests/test_simple_usage.py
stsievert Mar 2, 2021
7b379d7
Update scikeras/wrappers.py
stsievert Mar 2, 2021
e4338fc
Update tests/test_simple_usage.py
stsievert Mar 2, 2021
ca69f2e
Add classifier default loss test
stsievert Mar 2, 2021
2ac57e0
Merge branch 'clf-default-loss' of https://github.com/stsievert/scike…
stsievert Mar 2, 2021
0de8abe
Better warning for (really rare) use case
stsievert Mar 2, 2021
0cf7610
update warning with more recommendations
stsievert Mar 2, 2021
d4c3eea
TST: all classification losses
stsievert Mar 4, 2021
7fab517
Re-initialize
stsievert Mar 4, 2021
59e7012
tmp
stsievert Mar 4, 2021
0386e4e
loss_name is None
stsievert Mar 4, 2021
3a46538
black
stsievert Mar 4, 2021
8f2b00b
Remove backticks
stsievert Mar 4, 2021
e80338b
typing for utils/*_name
stsievert Mar 4, 2021
7e23480
raise
stsievert Mar 4, 2021
94df48a
API: loss_name / metric_name return None
stsievert Mar 4, 2021
35e1a6c
try cce
stsievert Mar 4, 2021
f092b7a
catch loss is not None
stsievert Mar 4, 2021
5af8b4c
tmp
stsievert Mar 4, 2021
7b38bc8
typo
stsievert Mar 4, 2021
18 changes: 18 additions & 0 deletions scikeras/_types.py
@@ -0,0 +1,18 @@
from typing import Callable, List, Type, Union

import numpy as np
import tensorflow as tf
import tensorflow.keras as keras

from tensorflow.keras.callbacks import Callback as TF_Callback
from tensorflow.keras.losses import Loss as TF_Loss
from tensorflow.keras.metrics import Metric as TF_Metric
from tensorflow.keras.optimizers import Optimizer as TF_Optimizer


Model = Union[Callable[..., keras.Model], keras.Model]
RandomState = Union[int, np.random.RandomState]
Optimizer = Union[str, TF_Optimizer, Type[TF_Optimizer]]
Loss = Union[str, TF_Loss, Type[TF_Loss], Callable]
Metrics = List[Union[str, TF_Metric, Type[TF_Metric], Callable]]
Callbacks = List[Union[TF_Callback, Type[TF_Callback]]]
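A quick sketch (not part of the diff) of how these aliases condense annotations; compile_model here is a hypothetical helper:

```python
from typing import Optional

from scikeras import _types as T


def compile_model(
    model: T.Model,
    optimizer: T.Optimizer = "rmsprop",
    loss: Optional[T.Loss] = None,
    metrics: Optional[T.Metrics] = None,
) -> None:
    # Each alias accepts strings, instances, classes, or callables,
    # mirroring what Keras' Model.compile() accepts.
    ...
```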
153 changes: 94 additions & 59 deletions scikeras/wrappers.py
@@ -4,7 +4,7 @@
import warnings

from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List, Tuple, Type, Union
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple, Type, Union

import numpy as np
import tensorflow as tf
@@ -23,6 +23,7 @@
from tensorflow.keras.models import Model
from tensorflow.keras.utils import register_keras_serializable

from scikeras import _types as T
from scikeras._utils import (
TFRandomState,
_class_from_strings,
@@ -192,37 +193,18 @@ class BaseWrapper(BaseEstimator):

def __init__(
self,
model: Union[None, Callable[..., tf.keras.Model], tf.keras.Model] = None,
model: T.Model,
*,
build_fn: Union[
None, Callable[..., tf.keras.Model], tf.keras.Model
] = None, # for backwards compatibility
build_fn: Optional[T.Model] = None, # for backwards compatibility
warm_start: bool = False,
random_state: Union[int, np.random.RandomState, None] = None,
optimizer: Union[
str, tf.keras.optimizers.Optimizer, Type[tf.keras.optimizers.Optimizer]
] = "rmsprop",
loss: Union[
Union[str, tf.keras.losses.Loss, Type[tf.keras.losses.Loss], Callable], None
] = None,
metrics: Union[
List[
Union[
str,
tf.keras.metrics.Metric,
Type[tf.keras.metrics.Metric],
Callable,
]
],
None,
] = None,
batch_size: Union[int, None] = None,
validation_batch_size: Union[int, None] = None,
random_state: Optional[T.RandomState] = None,
optimizer: T.Optimizer = "rmsprop",
loss: Optional[T.Loss] = None,
metrics: Optional[T.Metrics] = None,
batch_size: Optional[int] = None,
validation_batch_size: Optional[int] = None,
verbose: int = 1,
callbacks: Union[
List[Union[tf.keras.callbacks.Callback, Type[tf.keras.callbacks.Callback]]],
None,
] = None,
callbacks: Optional[T.Callbacks] = None,
validation_split: float = 0.0,
shuffle: bool = True,
run_eagerly: bool = False,
@@ -1142,12 +1124,18 @@ class KerasClassifier(BaseWrapper):
an instance of tf.keras.optimizers.Optimizer
or a class inheriting from tf.keras.optimizers.Optimizer.
Only strings and classes support parameter routing.
loss : Union[Union[str, tf.keras.losses.Loss, Type[tf.keras.losses.Loss], Callable], None], default None
loss : Union[Union[str, tf.keras.losses.Loss, Type[tf.keras.losses.Loss], Callable], None], default "categorical_crossentropy"
The loss function to use for training.
This can be a string for Keras' built in losses,
an instance of tf.keras.losses.Loss
or a class inheriting from tf.keras.losses.Loss .
Only strings and classes support parameter routing.

For convenience, the loss defaults to
`"categorical_crossentropy"`. This assumes that the model has
``N`` outputs when the dataset has ``N`` classes, and that the
targets are one-hot encoded (SciKeras transforms 1D integer
targets to match).

random_state : Union[int, np.random.RandomState, None], default None
Set the Tensorflow random number generators to a
reproducible deterministic state using this seed.
@@ -1245,42 +1233,23 @@ class KerasClassifier(BaseWrapper):

def __init__(
self,
model: Union[None, Callable[..., tf.keras.Model], tf.keras.Model] = None,
model: T.Model,
*,
build_fn: Union[
None, Callable[..., tf.keras.Model], tf.keras.Model
] = None, # for backwards compatibility
build_fn: Optional[T.Model] = None, # for backwards compatibility
warm_start: bool = False,
random_state: Union[int, np.random.RandomState, None] = None,
optimizer: Union[
str, tf.keras.optimizers.Optimizer, Type[tf.keras.optimizers.Optimizer]
] = "rmsprop",
loss: Union[
Union[str, tf.keras.losses.Loss, Type[tf.keras.losses.Loss], Callable], None
] = None,
metrics: Union[
List[
Union[
str,
tf.keras.metrics.Metric,
Type[tf.keras.metrics.Metric],
Callable,
]
],
None,
] = None,
batch_size: Union[int, None] = None,
validation_batch_size: Union[int, None] = None,
random_state: Optional[T.RandomState] = None,
optimizer: T.Optimizer = "rmsprop",
loss: Optional[T.Loss] = None,
metrics: Optional[T.Metrics] = None,
batch_size: Optional[int] = None,
validation_batch_size: Optional[int] = None,
verbose: int = 1,
callbacks: Union[
List[Union[tf.keras.callbacks.Callback, Type[tf.keras.callbacks.Callback]]],
None,
] = None,
callbacks: Optional[T.Callbacks] = None,
validation_split: float = 0.0,
shuffle: bool = True,
run_eagerly: bool = False,
epochs: int = 1,
class_weight: Union[Dict[Any, float], str, None] = None,
class_weight: Optional[Union[Dict[Any, float], str]] = None,
**kwargs,
):
super().__init__(
@@ -1308,8 +1277,34 @@ def _type_of_target(self, y: np.ndarray) -> str:
if target_type == "binary" and self.classes_ is not None:
# check that this is not a multiclass problem missing categories
target_type = type_of_target(self.classes_)
if target_type == "binary" and self.loss == "categorical_crossentropy":
raise ValueError(
"A binary target with two targets is specified; "
"however loss='categorical_crossentropy' is specified. "
"Keras will not learn in this use case. "
"Any one of the following will resolve this error:\n\n"
" * Set loss='binary_crossentropy' or loss='bce'\n"
)
return target_type

def _fit_keras_model(self, *args, **kwargs):
try:
super()._fit_keras_model(*args, **kwargs)
except ValueError as e:
if (
self.loss == "categorical_crossentropy"
and hasattr(self, "model_")
and 1 in {o.shape[1] for o in getattr(self.model_, "outputs", [])}
):
raise ValueError(
"The model is configured to have one output, but the "
f"loss='{self.loss}' is expecting multiple outputs "
"(which is often used with one-hot encoded targets). "
"More detail on Keras losses: https://keras.io/api/losses/"
) from e
else:
raise
adriangb (Owner):

This seems like it should live in _check_model_compatibility, or they should be merged in some way.

stsievert (Collaborator, Author):

This error message only provides marginal utility: it protects against the case where the model has one output but there are multiple classes.

It cannot go in _check_model_compatibility; I wait for an error to be raised before issuing this warning (otherwise a model with a single output raises an error).

adriangb (Owner):

Got it. Is there a specific error message we can check for, like if "some Keras error" in str(e)?

getattr(self.model_, "outputs", [])

Is this necessary? model_ should always have an outputs attribute, except in the case described in #207, but that should be a separate check/error.

f"loss='{self.loss}' is expecting multiple outputs "

Can you clarify what you mean by a loss expecting a number of outputs? My understanding is that Keras "broadcasts" losses to outputs, so if you give it a scalar loss (i.e., loss="bce") with 2 outputs (i.e., len(model_.outputs) == 2), it will implicitly compile the model with loss=[original_loss] * len(outputs). But you can actually map losses to outputs manually by passing loss=["bce", "mse"] or loss={"out1": "bce", "out2": "mse"}. From the tests, it seems that by "loss is expecting multiple outputs" you mean that there is a single output unit but multiple classes, which could be confused with the above concept of configuring a loss for multiple outputs.

I'm also curious about the iteration through outputs (1 in {o.shape[1] for o in self.model_.outputs}). SciKeras does not support >1 output out of the box (users need to override target_encoder) so it seems a bit strange to try to account for that when using the default loss. I feel that using the default loss should only be supported for the simple single-output cases that target_encoder supports.
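For reference, a minimal runnable sketch of the per-output loss mapping described above (the out1/out2 layer names are illustrative):

```python
import tensorflow as tf

# Two named outputs: passing a dict maps one loss per output, instead of
# Keras broadcasting a single scalar loss to every output.
inp = tf.keras.layers.Input(shape=(8,))
hidden = tf.keras.layers.Dense(8, activation="relu")(inp)
out1 = tf.keras.layers.Dense(1, activation="sigmoid", name="out1")(hidden)
out2 = tf.keras.layers.Dense(1, name="out2")(hidden)
model = tf.keras.Model(inputs=inp, outputs=[out1, out2])

model.compile(
    optimizer="rmsprop",
    loss={"out1": "binary_crossentropy", "out2": "mse"},
)
```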

adriangb (Owner):

As a side note: I think giving users better errors and validating their inputs like you are doing here can be a very valuable part of SciKeras, but currently it is done in an ad-hoc manner via _check_model_compatibility, etc. I think if we add more of these types of things, it would be nice to have an organized interface for it. I opened #209 to try to brainstorm ideas for this.


@staticmethod
def scorer(y_true, y_pred, **kwargs) -> float:
"""Scoring function for KerasClassifier.
@@ -1611,6 +1606,46 @@ class KerasRegressor(BaseWrapper):
**BaseWrapper._tags,
}

def __init__(
self,
model: T.Model,
*,
build_fn: Optional[T.Model] = None, # for backwards compatibility
warm_start: bool = False,
random_state: Optional[T.RandomState] = None,
optimizer: T.Optimizer = "rmsprop",
loss: Optional[T.Loss] = "mse",
metrics: Optional[T.Metrics] = None,
batch_size: Optional[int] = None,
validation_batch_size: Optional[int] = None,
verbose: int = 1,
callbacks: Optional[T.Callbacks] = None,
validation_split: float = 0.0,
shuffle: bool = True,
run_eagerly: bool = False,
epochs: int = 1,
class_weight: Optional[Union[Dict[Any, float], str]] = None,
**kwargs,
):
super().__init__(
model=model,
build_fn=build_fn,
warm_start=warm_start,
random_state=random_state,
optimizer=optimizer,
loss=loss,
metrics=metrics,
batch_size=batch_size,
validation_batch_size=validation_batch_size,
verbose=verbose,
callbacks=callbacks,
validation_split=validation_split,
shuffle=shuffle,
run_eagerly=run_eagerly,
epochs=epochs,
**kwargs,
)

@staticmethod
def scorer(y_true, y_pred, **kwargs) -> float:
"""Scoring function for KerasRegressor.
103 changes: 103 additions & 0 deletions tests/test_simple_usage.py
@@ -0,0 +1,103 @@
import numpy as np
import pytest
import tensorflow as tf

from sklearn.datasets import make_classification
from sklearn.preprocessing import OneHotEncoder

from scikeras.wrappers import KerasClassifier, KerasRegressor


N_CLASSES = 4
FEATURES = 8
n_eg = 100
X = np.random.uniform(size=(n_eg, FEATURES)).astype("float32")


def shallow_net(single_output=False, in_dim=FEATURES):
model = tf.keras.Sequential()
model.add(tf.keras.layers.Input(shape=(in_dim,)))
model.add(tf.keras.layers.Dense(in_dim, activation="sigmoid"))

if single_output:
model.add(tf.keras.layers.Dense(1))
else:
model.add(tf.keras.layers.Dense(N_CLASSES))

return model


@pytest.mark.parametrize(
"use_case",
[
"binary_classification",
"binary_classification_w_one_class",
"classification_w_1d_targets",
"classification_w_onehot_targets",
],
)
def test_classifier_only_model_specified(use_case):
"""
Test use cases where KerasClassifier works with the default loss.
"""

model__single_output = "binary" in use_case
if use_case == "binary_classification":
y = np.random.choice(2, size=len(X)).astype(int)
elif use_case == "binary_classification_w_one_class":
y = np.zeros(len(X))
elif use_case == "classification_w_1d_targets":
y = np.random.choice(N_CLASSES, size=len(X)).astype(int)
elif use_case == "classification_w_onehot_targets":
y = np.random.choice(N_CLASSES, size=len(X)).astype(int)
y = OneHotEncoder(sparse=False).fit_transform(y.reshape(-1, 1))
else:
raise ValueError("use_case={use_case} not recognized")

est = KerasClassifier(model=shallow_net, model__single_output=model__single_output)
if "binary" in use_case:
with pytest.raises(ValueError, match="Set loss='binary_crossentropy'"):
est.partial_fit(X, y)
est.set_params(loss="binary_crossentropy")

est.partial_fit(X, y=y)
assert est.current_epoch == 1


def test_classifier_raises_for_single_output_with_multiple_classes():
"""
KerasClassifier does not work with one output and multiple classes
in the target (duh).
"""
est = KerasClassifier(model=shallow_net, model__single_output=True)
y = np.random.choice(N_CLASSES, size=len(X))
msg = (
"The model is configured to have one output, but the "
"loss='categorical_crossentropy' is expecting multiple outputs "
)
with pytest.raises(ValueError, match=msg):
est.partial_fit(X, y)
assert est.current_epoch == 0


def test_classifier_raises_loss_binary_multi_misspecified():
est = KerasClassifier(
model=shallow_net,
model__single_output=True,
model__in_dim=1,
loss="bce",
epochs=100,
random_state=42,
)
X = np.random.choice(2, size=(20000, 1))
y = X.copy()
est.partial_fit(X, y)
assert est.score(X, y) >= 0.9


def test_regressor_default_loss():
y = np.random.uniform(size=len(X))
est = KerasRegressor(model=shallow_net, model__single_output=True)
assert est.loss == "mse"
est.partial_fit(X, y)
assert est.model_.loss.__name__ == "mean_squared_error"
adriangb (Owner):

Is the assertion that the "long" name for the loss was used in the model necessary here? I don't see the same assertion for classifiers.

stsievert (Collaborator, Author):

This assert statement is present to make sure that BaseWrapper.loss is mirrored in BaseWrapper.model_.loss. I'll add a test for KerasClassifier too.

adriangb (Owner):

Maybe this is a good use case for scikeras.utils.loss_name?
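Such an assertion might look like the sketch below, assuming scikeras.utils.loss_name accepts a loss function, class, or string and returns the canonical Keras name (that behavior is taken from this PR's discussion, not verified here):

```python
from scikeras.utils import loss_name

# Compare canonical names rather than relying on __name__, so the check
# also passes for strings like "mse" and for Loss instances.
assert loss_name(est.model_.loss) == loss_name(est.loss)
```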