Setting class_weight in model.fit() with tf.data.Dataset causes error #47032

tensortorch · 2021-02-09T10:39:43Z

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04 / Windows 10
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 2.3.0 / 2.4.1
Python version: 3.6 / 3.7
CUDA/cuDNN version: 10.1 / none
GPU model and memory: RTX2080 / none

Describe the current behavior
When a tf.data.Dataset is used in model.fit(), setting class_weight causes an error.

Describe the expected behavior
No error occurs.

Standalone code to reproduce the issue

from tensorflow import keras
import tensorflow as tf
import numpy as np


def get_model():
    inputs = keras.layers.Input(shape=(10, 10, 3))
    x = keras.layers.Flatten()(inputs)
    outputs = keras.layers.Dense(5)(x)
    model = keras.Model(inputs, outputs)
    model.compile(loss='categorical_crossentropy', optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))
    return model


def map_fun(_):
    dummy_image = np.zeros((10, 10, 3))  
    dummy_label = np.array([0, 0, 1, 0, 0]) 
    return dummy_image, dummy_label


if __name__ == '__main__':
    # dummy dataset
    dataset = tf.data.Dataset.from_tensor_slices([1, 2])  # values are ignored, dummy data generated in map()
    dataset = dataset.map(map_func=lambda x: tf.py_function(map_fun, [x], [tf.uint8, tf.uint8])).batch(2)

    # dummy model
    model = get_model()

    # call fit() without class weights - ok
    model.fit(dataset, epochs=1)

    # define class weights
    class_weight = {idx: weight for (idx, weight) in enumerate([1., 1., 1., 1., 1.])}

    # transform dataset to iterator, call fit() with class weights - ok
    model.fit(dataset.as_numpy_iterator(), class_weight=class_weight, epochs=1)

    # call fit() with class weights on tf.data.Dataset - error
    model.fit(dataset, class_weight=class_weight, epochs=1)

Error message

Traceback (most recent call last):
  File "/data/sandbox/reproduce.py", line 39, in <module>
    model.fit(dataset, class_weight=class_weight, epochs=1)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 108, in _method_wrapper
    return method(self, *args, **kwargs)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1063, in fit
    steps_per_execution=self._steps_per_execution)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/data_adapter.py", line 1122, in __init__
    dataset = dataset.map(_make_class_weight_map_fn(class_weight))
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 1695, in map
    return MapDataset(self, map_func, preserve_cardinality=True)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 4045, in __init__
    use_legacy_function=use_legacy_function)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 3371, in __init__
    self._function = wrapper_fn.get_concrete_function()
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2939, in get_concrete_function
    *args, **kwargs)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2906, in _get_concrete_function_garbage_collected
    graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 3213, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 3075, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 986, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 3364, in wrapper_fn
    ret = _wrapper_helper(*args)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 3299, in _wrapper_helper
    ret = autograph.tf_convert(func, ag_ctx)(*nested_args)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 255, in wrapper
    return converted_call(f, args, kwargs, options=options)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 532, in converted_call
    return _call_unconverted(f, args, kwargs, options)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/autograph/impl/api.py", line 339, in _call_unconverted
    return f(*args, **kwargs)
  File "/data/sandbox/venv/lib/python3.6/site-packages/tensorflow/python/keras/engine/data_adapter.py", line 1314, in _class_weights_map_fn
    if y.shape.rank > 2:
TypeError: '>' not supported between instances of 'NoneType' and 'int'

Process finished with exit code 1


amahendrakar · 2021-02-09T18:10:29Z

@tensortorch,
Please take a look at this comment from a similar issue and check if it helps. Thanks!

tensortorch · 2021-02-10T09:51:51Z

@amahendrakar,
thank you for your suggestion!

It does help since I can make my dataset return a sample weight as a third value, which is what model.fit() does anyway under the hood when class_weight is provided. So one can work around this issue using sample weights instead.
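A minimal sketch of that workaround, assuming one-hot labels and a dataset whose shapes are already known. The names CLASS_WEIGHTS and add_sample_weight and the weight values are hypothetical, chosen only for illustration; this is not the code Keras runs internally.

```python
import numpy as np
import tensorflow as tf

# Hypothetical per-class weights for a 5-class problem.
CLASS_WEIGHTS = tf.constant([1.0, 2.0, 0.5, 1.0, 1.0], dtype=tf.float32)


def add_sample_weight(image, label):
    """Append a per-sample weight looked up from the one-hot label."""
    class_ids = tf.argmax(label, axis=-1)          # index of the "hot" class
    sample_weight = tf.gather(CLASS_WEIGHTS, class_ids)
    return image, label, sample_weight


# Dummy data with known shapes: 4 images, one-hot labels for classes 2, 1, 0, 2.
images = np.zeros((4, 10, 10, 3), dtype=np.float32)
labels = np.eye(5, dtype=np.float32)[[2, 1, 0, 2]]

dataset = tf.data.Dataset.from_tensor_slices((images, labels)).batch(2)
weighted = dataset.map(add_sample_weight)

# The dataset now yields (x, y, sample_weight), so fit() can be called
# without the class_weight argument:
# model.fit(weighted, epochs=1)
```

Keras treats the third element of each dataset item as the sample weight, so class_weight can be left out of the fit() call.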

However, I believe providing class_weight in model.fit() should still work. The linked comment explains that it is not expected to work for 3+ dimensional targets, but this is not the case here. In fact, the error occurs within the check for target dimensionality, and is caused by the target rank being None for some reason:

    if y.shape.rank > 2:   # <== this is where the error occurs, because y.shape.rank is None
      raise ValueError("`class_weight` not supported for "
                       "3+ dimensional targets.")

Also, it does work for the same inputs when the dataset is converted to an iterator, as my minimal example shows. The error only occurs when a tf.data.Dataset is passed as input. Here, the DataHandler attempts to add the third output to the dataset by calling map() to convert class_weight to sample_weight:

    if class_weight:
      dataset = dataset.map(_make_class_weight_map_fn(class_weight))

which fails due to the aforementioned error.
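The unknown rank can be inspected without a model. The sketch below assumes it comes from tf.py_function, whose outputs carry no static shape information; make_label is a hypothetical stand-in for the map_fun of the original report.

```python
import numpy as np
import tensorflow as tf


def make_label(_):
    # Hypothetical stand-in for the label half of map_fun.
    return np.array([0, 0, 1, 0, 0], dtype=np.uint8)


ds = tf.data.Dataset.from_tensor_slices([1, 2])
ds = ds.map(lambda x: tf.py_function(make_label, [x], tf.uint8)).batch(2)

# The static shape of the element is completely unknown, so its rank is None.
label_spec = ds.element_spec
print(label_spec.shape.rank)
```

Because the rank is None rather than an integer, a comparison such as `y.shape.rank > 2` raises the TypeError shown in the traceback.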

tensortorch · 2021-02-15T08:50:31Z

The workaround for a somewhat related problem also works in this case: an additional call to map() to manually set the tensor shape makes it work.

I previously tried manually converting the outputs of the py_function to tensors and also manually setting their shapes, but it did not work, so the key here is the second call to map() to set the shapes after batch().
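A minimal sketch of that second map() call, assuming the shapes of the example in the original report (10x10x3 images, 5 classes). The function names make_example and set_batch_shapes are hypothetical.

```python
import numpy as np
import tensorflow as tf


def make_example(_):
    # Dummy image and one-hot label, as in the original reproduction script.
    image = np.zeros((10, 10, 3), dtype=np.uint8)
    label = np.array([0, 0, 1, 0, 0], dtype=np.uint8)
    return image, label


def set_batch_shapes(image, label):
    # Restore the static shapes lost by py_function; the batch axis stays None.
    image.set_shape((None, 10, 10, 3))
    label.set_shape((None, 5))
    return image, label


dataset = tf.data.Dataset.from_tensor_slices([1, 2])
dataset = dataset.map(
    lambda x: tuple(tf.py_function(make_example, [x], [tf.uint8, tf.uint8]))
).batch(2)

# Second map(), applied after batch(), that only sets the shapes.
dataset = dataset.map(set_batch_shapes)

image_spec, label_spec = dataset.element_spec
print(image_spec.shape, label_spec.shape)
```

With the label rank known to be 2, the rank check in the class weight mapping function has an integer to compare against instead of None.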

amahendrakar · 2021-02-22T06:26:50Z

@tensortorch,
Thank you for the update. Is this still an issue?

tensortorch · 2021-02-25T11:18:45Z

@amahendrakar
I still think this is an issue. I believe it should not be necessary to call map() a second time - I would expect this to work without the extra steps. The issue might not be with the class weights functionality itself though - maybe rather with the combination of map/py_function.

amahendrakar · 2021-02-28T08:04:47Z

@jvishnuvardhan,
I was able to reproduce the issue with TF v2.3, TF v2.4 and TF-nightly. Please find the gist of it here. Thanks!

yuriy-vorontsov · 2021-02-28T08:42:06Z

Same error, checked on python 3.8 and TF: v2.2.0, 2.3.0, 2.4.0, 2.4.1
Found a quick solution with custom loss function:

def weighted_categorical_crossentropy( weights ):
    # weights = [ 0.9, 0.05, 0.04, 0.01 ]
    def wcce( y_true, y_pred ):
        tf_weights = tf.constant( weights )
        if not tf.is_tensor( y_pred ):
            y_pred = tf.constant( y_pred )

        y_true = tf.cast( y_true, y_pred.dtype )
        return tf.keras.losses.categorical_crossentropy( y_true, y_pred ) * tf.experimental.numpy.sum( y_true * tf_weights, axis = -1 )
    return wcce

...
config['loss'] = weighted_categorical_crossentropy( config['classWeight'] )
model.compile(
    loss = config['loss'],
    optimizer = config['optimizer'],
    metrics = ['accuracy'],
    run_eagerly = True
)
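A small numeric check of what such a weighted loss computes, assuming one-hot targets and predictions that are already probabilities. The weight values are hypothetical, and tf.reduce_sum is used here in place of tf.experimental.numpy.sum.

```python
import numpy as np
import tensorflow as tf

# Hypothetical class weights for a 4-class problem.
weights = tf.constant([0.9, 0.05, 0.04, 0.01], dtype=tf.float32)

y_true = tf.constant([[0.0, 0.0, 1.0, 0.0]])      # one sample, class 2
y_pred = tf.constant([[0.25, 0.25, 0.25, 0.25]])  # uniform prediction

# Plain cross-entropy per sample, here -log(0.25).
unweighted = tf.keras.losses.categorical_crossentropy(y_true, y_pred)

# For one-hot targets this picks out the weight of the true class.
per_sample_weight = tf.reduce_sum(y_true * weights, axis=-1)

weighted = unweighted * per_sample_weight
```

For a one-hot target, the loss of each sample is simply scaled by the weight of its true class, which matches what class_weight is meant to achieve.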

kretes · 2021-05-21T12:59:35Z

This still fails in TF 2.5.0. Along the way I've created another reproduction script: https://gist.github.com/kretes/ca911085b2eb0fa3985894245ce3fd0c
Setting the shape works, but it puts a burden on the user's code.

I suggest renaming the issue to 'Setting class_weight in model.fit() with tf.data.Dataset using py_function causes error', as py_function is the step that produces the unknown shape.

sumanttyagi · 2021-05-31T08:15:16Z

I tried the weighted_categorical_crossentropy custom loss posted above by yuriy-vorontsov (same code, same compile() call with run_eagerly = True), but it still shows an error. Please help:
tf.keras.losses.categorical_crossentropy( y_true, y_pred ) expects 0 arguments got 2

sachinprasadhs · 2022-06-10T23:02:08Z

When I run type(dataset) on your code, it returns tensorflow.python.data.ops.dataset_ops.BatchDataset. For a BatchDataset, you don't have to specify y separately; you simply feed the dataset, which yields tuples containing both (x, y).
Check the documentation here, which states:

Target data. Like the input data x, it could be either Numpy array(s) or TensorFlow tensor(s). It should be consistent with x (you cannot have Numpy inputs and tensor targets, or inversely). If x is a dataset, generator, or keras.utils.Sequence instance, y should not be specified (since targets will be obtained from x).

For validation data, you can pass it to fit() explicitly via validation_data, as below.
model.fit(train_dataset, validation_data=val_dataset, batch_size=32, epochs=100)

kretes · 2022-06-13T06:33:27Z

@sachinprasadhs, you are referring to specifying y separately, but I can't see that in the code in the issue. The same holds for the gist I prepared previously: https://gist.github.com/kretes/ca911085b2eb0fa3985894245ce3fd0c where the dataset yields a tuple of (x, y) and so is in line with the documentation.

Can you try running the gist, e.g. in Colab, to see if it fails on your side, and modify it accordingly so that it passes?

sachinprasadhs · 2022-06-14T22:46:46Z

@kretes , train_dataset in model.fit() is a BatchDataset, in this case y should not be specified (since targets will be obtained from train_dataset).

If you still face the issue, could you please open an issue in the keras-team/keras repo? Thanks!

kretes · 2022-06-15T08:21:21Z

Ok, I will move this to the keras repo. Just to be clear, I am not passing y separately - it is part of the BatchDataset, as you say.

google-ml-butler · 2022-06-22T08:21:31Z

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

tensortorch · 2022-06-23T11:53:04Z

@sachinprasadhs as @kretes already mentioned, the fit method in the example is following the documentation and not using a separate y argument. You can also see from the example that the fit call works when the same dataset is transformed to an iterator.

sachinprasadhs · 2022-06-23T20:52:32Z

Development of Keras has moved to a separate repository: https://github.com/keras-team/keras/issues

Please post this issue on the keras-team/keras repo.
To learn more, see:
https://discuss.tensorflow.org/t/keras-project-moved-to-new-repository-in-https-github.aaakk.us.kg-keras-team-keras/1999
Thank you!

google-ml-butler · 2022-06-30T21:48:06Z

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler · 2022-07-07T22:39:13Z

Closing as stale. Please reopen if you'd like to work on this further.


tensortorch added the type:bug Bug label Feb 9, 2021

google-ml-butler bot assigned amahendrakar Feb 9, 2021

amahendrakar added comp:keras Keras related issues stat:awaiting response Status - Awaiting response from author TF 2.4 for issues related to TF 2.4 labels Feb 9, 2021

tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Feb 12, 2021

amahendrakar added the stat:awaiting response Status - Awaiting response from author label Feb 22, 2021

tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Feb 27, 2021

amahendrakar assigned jvishnuvardhan and unassigned amahendrakar Feb 28, 2021

jvishnuvardhan added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Mar 10, 2021

sachinprasadhs assigned sachinprasadhs and unassigned jvishnuvardhan Jun 10, 2022

sachinprasadhs added stat:awaiting response Status - Awaiting response from author and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Jun 10, 2022

google-ml-butler bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Jun 22, 2022

google-ml-butler bot removed stat:awaiting response Status - Awaiting response from author stale This label marks the issue/pr stale - to be closed automatically if no activity labels Jun 23, 2022

sachinprasadhs added the stat:awaiting response Status - Awaiting response from author label Jun 23, 2022

google-ml-butler bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Jun 30, 2022

google-ml-butler bot closed this as completed Jul 7, 2022

tilakrayal mentioned this issue Jul 11, 2024

tf.data.Dataset.from_generator: TypeError: '>' not supported between instances of 'NoneType' and 'int' #71330

Closed
