
Saved_dataset for spark offline store can be accessed only within the scope of the spark session, where it was created. #3644

Closed
nadejdaSuraeva opened this issue Jun 5, 2023 · 0 comments · Fixed by #3645
Labels
kind/feature New feature or request

Comments

@nadejdaSuraeva (Contributor) commented:

Is your feature request related to a problem? Please describe.
I would like to be able to use the data of a registered saved_dataset in a different Spark session. Currently, if I create a new Spark session, only the dataset's name remains in the Feast registry, without the data.

Part of the `persist` function:

```python
"""
Run the retrieval and persist the results in the same offline store used for read.
Please note the persisting is done only within the scope of the spark session.
"""
assert isinstance(storage, SavedDatasetSparkStorage)
table_name = storage.spark_options.table
if not table_name:
    raise ValueError("Cannot persist, table_name is not defined")
self.to_spark_df().createOrReplaceTempView(table_name)
```

Describe the solution you'd like
Add the possibility to save the dataset as a persistent table, for example when the Spark session config includes information about remote storage (Hive metastore, S3 path, etc.).

Describe alternatives you've considered
Add an optional parameter to SparkOptions that allows saving the dataset as a table under any Spark session configuration.

Additional context
