
[Docs] Simplifying for better user understanding #5878

Merged 3 commits on Oct 23, 2024

Conversation

@10sharmashivam 10sharmashivam commented Oct 21, 2024

Tracking issue

Reference #3249

Why are the changes needed?

These changes improve the overview of caching mechanisms in the Flyte documentation. In response to user feedback, they simplify the caching documentation so that it is easier to understand.

Additional Comments

Short Video Demo: (If required) I can also create a short 1-2 minute video that provides a brief overview of caching and shows how to enable/disable it in Flyte. This would help users quickly grasp the concept.

Check all the applicable boxes

  • [x] I updated the documentation accordingly.
  • [ ] All new and existing tests passed.
  • [x] All commits are signed-off.

Docs link

Docs Link

Signed-off-by: 10sharmashivam <[email protected]>

codecov bot commented Oct 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 36.80%. Comparing base (a971ead) to head (071725f).
Report is 11 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5878      +/-   ##
==========================================
+ Coverage   36.71%   36.80%   +0.09%     
==========================================
  Files        1304     1309       +5     
  Lines      130081   130895     +814     
==========================================
+ Hits        47757    48179     +422     
- Misses      78153    78534     +381     
- Partials     4171     4182      +11     
Flag Coverage Δ
unittests-datacatalog 51.58% <ø> (ø)
unittests-flyteadmin 54.01% <ø> (-0.41%) ⬇️
unittests-flytecopilot 11.73% <ø> (ø)
unittests-flytectl 62.40% <ø> (ø)
unittests-flyteidl 6.92% <ø> (+0.03%) ⬆️
unittests-flyteplugins 53.59% <ø> (-0.04%) ⬇️
unittests-flytepropeller 43.00% <ø> (+0.17%) ⬆️
unittests-flytestdlib 55.41% <ø> (+0.64%) ⬆️

Flags with carried forward coverage won't be shown.

This allows you to explicitly indicate when a change has been made to the task that should invalidate any existing cached results.
Note that this is not the only change that will invalidate the cache (see below).
Also, note that you can manually trigger cache invalidation per execution using the [`overwrite-cache` flag](#overwrite-cache-flag).
* `cache_serialize` (`bool`): Enables or disables [cache serialization](./cache_serializing).
When enabled, Flyte ensures that a single instance of the task is run before any other instances that would otherwise run concurrently.
This allows the initial instance to cache its result and lets the later instances reuse the resulting cached outputs.
Cache serialization is disabled by default.
- * `cache_ignore_input_vars` (`Tuple[str, ...]`): Input variables that should not be included when calculating hash for cache. By default, no input variables are ignored. This parameter only applies to task serialization.
+ * `cache_ignore_input_vars` (`Tuple[str, ...]`): Input values that Flyte should ignore when deciding if a task’s result can be reused. By default, no input variables are ignored. This parameter only applies to task serialization.

Suggested change
- * `cache_ignore_input_vars` (`Tuple[str, ...]`): Input values that Flyte should ignore when deciding if a task’s result can be reused. By default, no input variables are ignored. This parameter only applies to task serialization.
+ * `cache_ignore_input_vars` (`Tuple[str, ...]`): Input variables that Flyte should ignore when deciding if a task’s result can be reused (hash calculation). By default, no input variables are ignored. This parameter only applies to task serialization.
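
To illustrate the effect of `cache_ignore_input_vars` described above, here is a minimal sketch in plain Python. It is not Flyte's actual hashing code: the helper `cache_key` and its parameter names are hypothetical, and it simply assumes the key is a hash over the cache version and every input that is not ignored.

```python
import hashlib
import json


def cache_key(cache_version, inputs, ignore_input_vars=()):
    """Hypothetical helper: hash the cache version plus all non-ignored inputs."""
    # Drop the inputs the caller asked to ignore before hashing.
    kept = {name: value for name, value in inputs.items()
            if name not in ignore_input_vars}
    # sort_keys makes the serialized form independent of argument order.
    payload = json.dumps({"version": cache_version, "inputs": kept}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# "verbose" is ignored, so changing it does not change the key.
key_a = cache_key("1.0", {"x": 1, "verbose": True}, ignore_input_vars=("verbose",))
key_b = cache_key("1.0", {"x": 1, "verbose": False}, ignore_input_vars=("verbose",))
# "x" is not ignored, so changing it produces a different key.
key_c = cache_key("1.0", {"x": 2, "verbose": True}, ignore_input_vars=("verbose",))
# Bumping the cache version also produces a different key.
key_d = cache_key("2.0", {"x": 1, "verbose": True}, ignore_input_vars=("verbose",))
```

Under these assumptions, two calls that differ only in an ignored input map to the same key and could reuse one cached result, while a change to any other input or to the cache version yields a new key.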

@@ -127,7 +135,7 @@ Task executions can be cached across different versions of the task because a ch

### How does local caching work?

- The flytekit package uses the [diskcache](https://github.com/grantjenks/python-diskcache) package, specifically [diskcache.Cache](http://www.grantjenks.com/docs/diskcache/tutorial.html#cache), to aid in the memoization of task executions. The results of local task executions are stored under `~/.flyte/local-cache/` and cache keys are composed of **Cache Version**, **Task Signature**, and **Task Input Values**.
+ Flyte uses a tool called [diskcache](https://github.com/grantjenks/python-diskcache) package, specifically [diskcache.Cache](http://www.grantjenks.com/docs/diskcache/tutorial.html#cache), to save task results locally on your computer so they don’t need to be recomputed if the same task is run again. The results of local task executions are stored under `~/.flyte/local-cache/` and cache keys are composed of **Cache Version**, **Task Signature**, and **Task Input Values**.

Suggested change
- Flyte uses a tool called [diskcache](https://github.com/grantjenks/python-diskcache) package, specifically [diskcache.Cache](http://www.grantjenks.com/docs/diskcache/tutorial.html#cache), to save task results locally on your computer so they don’t need to be recomputed if the same task is run again. The results of local task executions are stored under `~/.flyte/local-cache/` and cache keys are composed of **Cache Version**, **Task Signature**, and **Task Input Values**.
+ Flyte uses a tool called [diskcache](https://github.com/grantjenks/python-diskcache), specifically [diskcache.Cache](http://www.grantjenks.com/docs/diskcache/tutorial.html#cache), to save task results so they don’t need to be recomputed if the same task is executed again, a technique known as ``memoization``. The results of local task executions are stored under `~/.flyte/local-cache/` and cache keys are composed of **Cache Version**, **Task Signature**, and **Task Input Values**.

@davidmirror-ops davidmirror-ops left a comment

Thank you!

@davidmirror-ops davidmirror-ops merged commit 8bd573e into flyteorg:master Oct 23, 2024
50 checks passed

welcome bot commented Oct 23, 2024

Congrats on merging your first pull request! 🎉
