Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Force absolute paths in file based registries #4774

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

boliri
Copy link
Contributor

@boliri boliri commented Nov 20, 2024

What this PR does / why we need it:

While I was working on #4772, I discovered an issue related to the Feast CLI and how FileRegistryStore instances are populated when running a CLI command using the -f argument. Assume a really simple setup made with the feast init command, i.e. the Feature Store is backed by file stores both in the registry and in the online store.

When the registry is set to file in the feature_store.yaml, RegistryConfig's path attribute is filled with the same exact value provided in the YAML, and at some point, the Registry is instantiated. This is not the same for the FileRegistryStore that is instantiated later, however - in that class, paths from the RegistryConfig are processed to turn them into absolute paths in the event they are relative:

class FileRegistryStore(RegistryStore):
    def __init__(self, registry_config: RegistryConfig, repo_path: Path):
        registry_path = Path(registry_config.path)
        if registry_path.is_absolute():
            self._filepath = registry_path
        else:
            self._filepath = repo_path.joinpath(registry_path)

    [...]

The problem here is that the tool used to build the CLI injects the current working directory in its context's CHDIR key, and that CHDIR is passed down the successive calls as the repo_path until the FileRegistryStore is finally built, which results in a malformed absolute path. Consequently, things like the UI cannot load (the landing page just claims there was an issue while loading the project list due to a malformed feature_store.yaml, but that's actually misleading).

Besides these issues, there's another one related to the env var FEATURE_STORE_YAML_BASE64. When the YAML is base-64 encoded and passed to the CLI with a file-based registry configured using a relative path, the final path built in FileRegistryStore takes as prefix a subfolder of /tmp, hence assuming that the file registry is stored in that subfolder (which is impossible, as it is created on the fly just to store the decoded env var as an actual file somewhere). Once again, the FileRegistryStore ends up in a weird state.

To mitigate these issues, I propose enforcing the usage of absolute paths while configuring the feature_store.yaml, be it a file or an env var. The onus is on the user then, and I am aware this can be a breaking change for people currently using file-based registries on their Feature Store setups, but the odds of using registries like these in production environments are very low IMHO.

Aside from that constraint, I also updated the logic behind the feast init command to replace the default values of the feature_store.yaml template with absolute paths built according to the dir from which the command was issued.

Which issue(s) this PR fixes:

Misc

@boliri boliri requested a review from a team as a code owner November 20, 2024 17:51
@boliri boliri changed the title Feat/force absolute paths in file based registries feat: force absolute paths in file based registries Nov 20, 2024
@boliri boliri changed the title feat: force absolute paths in file based registries feat: Force absolute paths in file based registries Nov 20, 2024
@boliri
Copy link
Contributor Author

boliri commented Nov 20, 2024

Just noticed some unit tests broke after these changes, will take a look at them when I have some time.

@boliri boliri marked this pull request as draft November 20, 2024 18:02
@dmartinol
Copy link
Contributor

Hi @boliri, I think this change goes against a recent update to use relative file paths in the generated template #4624.
This has been designed to improve code portability, which absolute paths seem to be against.

Wrt using encoded YAML in the env var, I understand the severity of the issue that you mention and we can think of enforcing the use of absolute paths there.
In any case, we expect to have a limited number of setups where the env var is requested to the user. In fact, with the advent of the new Feast Go Operator #4561, the variable will be entirely managed by the Kubernetes controller which will create the correct absolute path for you.
Helm chart users will be instead requested to encode a YAML file with the correct absolute path(s).

@boliri
Copy link
Contributor Author

boliri commented Nov 21, 2024

Hi @dmartinol! Thanks for the quick reply.

Didn't think about portability to be honest, but it definitely makes sense.

About the env var holding encoded YAMLs, if you don't foresee a high number of setups using it then I think we can safely drop this PR.

Let me know if you want me to do that, or feel free to close it. Thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants