Feature Engineering using Lambda Layers for an end-to-end training pipeline. #812
base: master
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). For more information, open the CLA check for this pull request.
Thanks for the PR! Feature engineering for categorical data is a great topic. However, Lambda layers are not safely serializable, so I wouldn't recommend using them in production.
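For illustration, a minimal sketch of the hazard, assuming the Keras 3 API (the safe_mode flag postdates this thread; older tf.keras versions serialize Lambda layers as marshalled Python bytecode, with the same underlying fragility):

import keras
from keras import layers

# A Lambda layer wraps an arbitrary Python closure.
model = keras.Sequential([layers.Lambda(lambda x: x * 2.0)])
model(keras.ops.ones((1, 4)))  # build the model
model.save("lambda_model.keras")

# Reloading is refused under safe deserialization, because the Lambda layer
# must be rebuilt from serialized Python bytecode; you have to opt out:
reloaded = keras.models.load_model("lambda_model.keras", safe_mode=False)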
We also already have an example on structured data feature engineering here: https://keras.io/examples/structured_data/structured_data_classification_from_scratch/
I would recommend turning your example into a tutorial that focuses on something that's absent from the example above. Perhaps we could take the approach of doing the feature engineering in a single Layer subclass that takes in a dict of data. What do you think?
@fchollet That sounds great, yes. I will have a look at changing it to subclass the Layer class.
@fchollet Hi Francois. Thanks for the suggestion. I changed the example to use a single feature layer by subclassing the Layer class, with a dict of Input objects as input. Please let me know if this is what you had in mind.
Thanks for the update. A lot of the complexity here comes from the fact that you use a separate Input layer for each feature in the data, which isn't necessary if you use a Layer subclass. In addition, we should be showcasing Keras preprocessing layers.
I recommend something like:
class FeaturePreprocessing(layers.Layer):
    def __init__(self):
        # Create preprocessing layers that will be needed for feature encoding / normalization / etc.
        ...

    def adapt(self, dataset):
        # Split the dataset into individual feature datasets and use them to adapt the previously created layers
        ...

    def call(self, data):
        # Preprocess the data dict with the previously created layers, then concatenate the features
        ...
Does that make sense? Perhaps a different dataset might be a better fit too, since we're going to want to do things like the following (sketched in code after the list):
- Indexing a set of categorical string values
- Indexing a set of categorical int values
- Normalizing numerical features
- Hashing large categorical feature spaces
- etc.
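For concreteness, here is one hedged sketch of what those cases could look like inside the suggested class, built on the standard Keras preprocessing layers (StringLookup, IntegerLookup, Normalization, Hashing, CategoryEncoding) and assuming a recent tf.keras where they live under keras.layers. The feature names and bin count are hypothetical placeholders, not taken from the PR:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical feature groups; a real example would derive these from its dataset.
STRING_CATEGORICAL = ["thal"]    # categorical string values
INT_CATEGORICAL = ["cp"]         # categorical int values
NUMERICAL = ["age", "trestbps"]  # numerical features to normalize
HASHED = ["occupation"]          # a large categorical feature space
NUM_BINS = 64

class FeaturePreprocessing(layers.Layer):
    def __init__(self):
        super().__init__()
        # One stateful preprocessing layer per feature, keyed by feature name.
        self.encoders = {}
        for name in STRING_CATEGORICAL:
            self.encoders[name] = layers.StringLookup(output_mode="one_hot")
        for name in INT_CATEGORICAL:
            self.encoders[name] = layers.IntegerLookup(output_mode="one_hot")
        for name in NUMERICAL:
            self.encoders[name] = layers.Normalization(axis=None)
        # Hashing is stateless, so it needs no adapt() call.
        self.hashers = {name: layers.Hashing(num_bins=NUM_BINS) for name in HASHED}
        self.one_hot = layers.CategoryEncoding(num_tokens=NUM_BINS, output_mode="one_hot")

    def adapt(self, dataset):
        # `dataset` yields dicts of raw features; adapt each stateful
        # layer on its own feature column.
        for name, encoder in self.encoders.items():
            encoder.adapt(dataset.map(lambda x, n=name: x[n]))

    def call(self, data):
        # `data` maps feature names to batched scalar tensors of shape (batch,).
        features = [self.encoders[name](data[name])
                    for name in STRING_CATEGORICAL + INT_CATEGORICAL]
        features += [tf.expand_dims(self.encoders[name](data[name]), -1)
                     for name in NUMERICAL]
        features += [self.one_hot(self.hashers[name](data[name])) for name in HASHED]
        return tf.concat(features, axis=-1)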
Hi @fchollet. Thanks for the suggestions. We now have one layer for feature preprocessing that uses Keras preprocessing layers. I have implemented the class you suggested; it contains multiple preprocessing layers and combinations of them. Please let me know what you think. Is the dataset used suitable for showcasing preprocessing layers?
Thanks for the update!
Hi @fernandonieuwveldt, thanks again for this PR. Are you planning to make the requested changes? Let us know if you're still working on this. Otherwise we'll close the request. Thanks!
Hi. Let me give it another go and then we can see if this will be a good addition to the website.
Hi. I made the requested changes. Hope we can still work further on this.
In this example, we look at how to create a full training and inference pipeline using only the Keras library, visualizing the network as we build up the graph. We end up with a single artifact containing the full pipeline: it can be deployed as-is, with no need to engineer features with other libraries before feeding data to the model. Feature engineering is part of the network itself.
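As a hedged sketch of that single-artifact idea, assuming the FeaturePreprocessing layer sketched above and a hypothetical unbatched train_ds that yields (feature_dict, label) pairs; the feature names and input dtypes here are illustrative:

from tensorflow import keras
from tensorflow.keras import layers

# Symbolic dict inputs mirroring the raw feature columns.
inputs = {
    "thal": keras.Input(shape=(), dtype="string", name="thal"),
    "cp": keras.Input(shape=(), dtype="int64", name="cp"),
    "age": keras.Input(shape=(), dtype="float32", name="age"),
    "trestbps": keras.Input(shape=(), dtype="float32", name="trestbps"),
    "occupation": keras.Input(shape=(), dtype="string", name="occupation"),
}

preprocessing = FeaturePreprocessing()
preprocessing.adapt(train_ds.map(lambda x, y: x))  # fit vocabularies / statistics

x = preprocessing(inputs)  # feature engineering happens inside the graph
x = layers.Dense(32, activation="relu")(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds.batch(32), epochs=5)
keras.utils.plot_model(model, show_shapes=True)  # visualize the network
model.save("end_to_end_pipeline.keras")  # one artifact: raw features in, predictions out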