[Fix] CheckpointManager couldn't work properly and KV files couldn't be restored when user using a lazy build Keras model. #374

Merged: 5 commits, Dec 18, 2023

@MoFHeka (Collaborator) commented Dec 14, 2023

Description

Fixes two problems that occur when the user builds a Keras model lazily: CheckpointManager did not work properly, and KV files could not be restored.
Also fixes some DE Keras Embedding demos that could not run. Fixes #370
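
For readers unfamiliar with the failure mode, here is a toy, framework-free sketch of why lazy building and restoring interact badly. It is not the TFRA, Keras, or CheckpointManager implementation, and it does not claim to show how this PR fixes the problem; `LazyDense`, `save`, and `restore` are hypothetical names made up for illustration. It assumes only that a lazily built layer has no weights until its first call, so a restore issued before that point has nothing to write into and must be deferred until build time.

```python
# Toy model of "lazy build": weights exist only after the first call.
# All names here are hypothetical; this is not the TFRA or Keras API.
class LazyDense:
    def __init__(self, units):
        self.units = units
        self.weights = None    # not created until build()
        self._pending = None   # values queued by a restore that came too early

    def build(self, input_dim):
        self.weights = [[0.0] * self.units for _ in range(input_dim)]
        if self._pending is not None:
            # Deferred restore: apply the saved values now that weights exist.
            self.weights = [row[:] for row in self._pending]
            self._pending = None

    def __call__(self, x):
        if self.weights is None:
            self.build(len(x))
        return [sum(xi * w[j] for xi, w in zip(x, self.weights))
                for j in range(self.units)]


def save(layer):
    # Snapshot whatever weights exist right now (None if not built yet).
    return None if layer.weights is None else [row[:] for row in layer.weights]


def restore(layer, snapshot):
    # Restore immediately if the layer is built, otherwise defer to build().
    if layer.weights is None:
        layer._pending = snapshot
    else:
        layer.weights = [row[:] for row in snapshot]


trained = LazyDense(units=2)
trained([1.0, 2.0])                          # first call triggers build()
trained.weights = [[1.0, 2.0], [3.0, 4.0]]   # stand-in for training
snapshot = save(trained)

fresh = LazyDense(units=2)
restore(fresh, snapshot)                     # not built yet: restore is deferred
restored_output = fresh([1.0, 1.0])          # build() runs, pending values apply
```

In this sketch, a `restore` that silently did nothing on an unbuilt layer would leave `fresh` with zero weights after its first call, which is the kind of symptom the description above refers to.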

Type of change

  • Bug fix
  • New Tutorial
  • Updated or additional documentation
  • Additional Testing
  • New Feature

Checklist:

  • I've properly formatted my code according to the guidelines
    • By running yapf
    • By running clang-format
  • This PR addresses an already submitted issue for TensorFlow Recommenders-Addons
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

How Has This Been Tested?

horovodrun -np 2 tensorflow_recommenders_addons/dynamic_embedding/python/kernel_tests/horovod_sync_train_test.py

Also ran the demos demo/dynamic_embedding/movielens-1m-keras and demo/dynamic_embedding/movielens-1m-keras-with-horovod.

@rhdong requested a review from Lifann on December 14, 2023 03:44
@MoFHeka force-pushed the master-dev branch 6 times, most recently from fc1e67c to 3114413, on December 15, 2023 16:44
…eate their Keras model with lazy building. Also now fully support using CheckpointManager.
@rhdong (Member) left a comment

LGTM

@rhdong merged commit 7a6bce1 into tensorflow:master on Dec 18, 2023
39 checks passed
Linked issue that merging this pull request may close:

ERROR IN EXAMPLE: movielens-1m-keras-with-horovod