[Bug]: Automatic model refresh notebook broken #28773

Closed
1 of 16 tasks
damccorm opened this issue Oct 2, 2023 · 2 comments · Fixed by #28777
Labels: bug, done & done, P2, python

Comments


damccorm commented Oct 2, 2023

What happened?

As I was working through the automatic model refresh notebook, I found the following bugs:

  1. The main session isn't saved correctly, so dependencies aren't available at runtime. When save_main_session is specified, the pipeline fails because it's unable to pickle the file correctly.
  2. The notebook relies on the read_image function from an example; the function should just be inlined.
  3. Models are saved in an unusable format and thus can't be loaded.
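
For context on item 1, here is a minimal sketch of how the flag might be passed when launching a pipeline on Dataflow. The helper name build_pipeline_args and every argument value are hypothetical and not taken from the notebook; the flags themselves (--save_main_session, --requirements_file, and so on) are standard Beam Python pipeline options.

```python
from typing import List


def build_pipeline_args(project: str, region: str, temp_location: str,
                        requirements_file: str = "requirements.txt") -> List[str]:
    """Return Beam pipeline flags for a Dataflow run (hypothetical helper)."""
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        f"--requirements_file={requirements_file}",
        # Pickle the __main__ session so that notebook-level imports and
        # definitions are available on the workers.
        "--save_main_session",
    ]


# Placeholder values, not taken from the notebook.
args = build_pipeline_args("my-project", "us-central1", "gs://my-bucket/tmp")
print(args)
```

With Beam installed, this list would typically be handed to PipelineOptions(args) from apache_beam.options.pipeline_options.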

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner

damccorm commented Oct 2, 2023

  4. It looks like the model updates aren't actually propagating through either, e.g. https://pantheon.corp.google.com/dataflow/jobs/us-central1/2023-10-02_10_40_00-5726123037803609215


damccorm commented Oct 2, 2023

Fixes I've gotten working so far:

  1. The main session isn't saved correctly, so dependencies aren't available at runtime. When save_main_session is specified, the pipeline fails because it's unable to pickle the file correctly.
     1. Add the save_main_session flag.
     2. Update requirements.txt to use tensorflow_hub instead of tensorflow-hub.
     3. Put the Colab auth and its dependency in a function, then invoke that function, so that it isn't automatically imported when the main session is loaded.
  2. The notebook relies on the read_image function from an example; the function should just be inlined.

     Inlined the function.

  3. Models are saved in an unusable format and thus can't be loaded.

     Instead of downloading the model directly, for each model type do something like:

     import tensorflow as tf
     # Instantiate the model through Keras (ImageNet weights by default), then save it.
     model = tf.keras.applications.resnet.ResNet152()
     model.save('/path/to/model/resnet152_weights_tf_dim_ordering_tf_kernels.h5')
  4. It looks like the model updates aren't actually propagating through either, e.g. https://pantheon.corp.google.com/dataflow/jobs/us-central1/2023-10-02_10_40_00-5726123037803609215

     Not sure yet.
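
The third fix for item 1 (keeping the Colab auth import out of the main session) could look roughly like the sketch below. The function name authenticate_colab_user is hypothetical; google.colab.auth.authenticate_user is the usual Colab call, and the ImportError guard is an addition here so that the sketch also runs outside Colab.

```python
def authenticate_colab_user() -> bool:
    """Authenticate inside Colab; return False when not running in Colab."""
    try:
        # Function-local import: the module is bound only inside this
        # function, so it never becomes a global of __main__ and is not
        # picked up when the main session is pickled.
        from google.colab import auth
    except ImportError:
        return False
    auth.authenticate_user()
    return True


# Invoke the function once; only the boolean result lands in globals.
ran_in_colab = authenticate_colab_user()
print(ran_in_colab)
```

The point of the design is that nothing Colab-specific is left at module level, so workers that load the saved main session don't need the google.colab package.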

@AnandInguva AnandInguva self-assigned this Oct 2, 2023
@github-actions github-actions bot added this to the 2.52.0 Release milestone Oct 4, 2023
@jrmccluskey jrmccluskey added the done & done Issue has been reviewed after it was closed for verification, followups, etc. label Oct 17, 2023
3 participants