[FEATURE]: Migrate direct filesystem access to UC tables #2021
As I understand it, a plan for implementing this feature could look as follows:
Are we looking to migrate access for any format (CSV, Parquet, JSON, Delta, ...), or only Delta?
We have instances of `spark.read.format("delta").load("s3a://prefix/...")` in the code, though we want to migrate that into `spark.table("catalog.schema.table")` to follow UC practices. Build on top of "tables in mounts".

Do we migrate to UC Volumes?
yes
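To make the rewrite concrete, here is a minimal sketch of that migration step, assuming a hypothetical `PATH_TO_TABLE` mapping from storage prefixes to UC table names (in ucx this would presumably come from the table-mapping inventory, and the real rewrite operates on parsed code rather than regexes):

```python
import re

# Hypothetical mapping from storage prefixes to UC table names.
PATH_TO_TABLE = {
    "s3a://bucket/sales/": "main.sales.orders",
}

# Matches spark.read.format("delta").load("<path>") calls.
_LOAD_CALL = re.compile(
    r"""spark\.read\.format\("delta"\)\.load\("(?P<path>[^"]+)"\)"""
)


def rewrite_direct_access(code: str) -> str:
    """Rewrite direct Delta loads into spark.table(...) calls
    when the path is covered by the mapping."""
    def _sub(match):
        path = match.group("path")
        for prefix, table in PATH_TO_TABLE.items():
            if path.startswith(prefix):
                return f'spark.table("{table}")'
        return match.group(0)  # unmapped paths are left untouched
    return _LOAD_CALL.sub(_sub, code)
```

For example, `rewrite_direct_access('df = spark.read.format("delta").load("s3a://bucket/sales/2024")')` yields `df = spark.table("main.sales.orders")`.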
Do we resolve mounts?
yes
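Mount resolution could be sketched as follows, assuming a hypothetical `MOUNTS` snapshot of `dbutils.fs.mounts()` output (mount point → cloud storage URI); in a real workspace this snapshot would be collected once and stored in the inventory:

```python
# Hypothetical snapshot of dbutils.fs.mounts().
MOUNTS = {
    "/mnt/sales": "s3a://bucket/sales",
}


def resolve_mount(path: str) -> str:
    """Translate a dbfs:/mnt/... or /mnt/... path to its cloud storage URI."""
    normalized = path.removeprefix("dbfs:")
    # Longest mount point wins, so nested mounts resolve correctly.
    for mount_point, source in sorted(
        MOUNTS.items(), key=lambda kv: len(kv[0]), reverse=True
    ):
        if normalized == mount_point or normalized.startswith(mount_point + "/"):
            return source + normalized[len(mount_point):]
    return path  # not under a known mount
```

So `resolve_mount("dbfs:/mnt/sales/2024/01")` returns `s3a://bucket/sales/2024/01`, which can then be matched against the table mapping.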
Do we resolve `dbutils.widgets.get()`?
if possible
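Resolving `dbutils.widgets.get()` is only possible when the widget's value is known, e.g. from the job's parameters. A sketch under that assumption (`job_params` is a hypothetical name for those parameters):

```python
import re

# Matches dbutils.widgets.get("name") / dbutils.widgets.get('name') calls.
_WIDGET_GET = re.compile(r"""dbutils\.widgets\.get\(["'](?P<name>[^"']+)["']\)""")


def resolve_widgets(expression: str, job_params: dict[str, str]) -> str:
    """Replace dbutils.widgets.get("name") calls with the literal value
    from job parameters; calls with unknown values are kept as-is."""
    def _sub(match):
        name = match.group("name")
        if name in job_params:
            return f'"{job_params[name]}"'
        return match.group(0)  # value unknown: leave unresolved
    return _WIDGET_GET.sub(_sub, expression)
```

For example, with `job_params = {"input_path": "s3a://bucket/raw"}`, the expression `spark.read.load(dbutils.widgets.get("input_path"))` resolves to `spark.read.load("s3a://bucket/raw")`.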
Where do we store the mappings? Add a prefix in the table mapping?
TBD
What scans all jobs?
A (new) `migration-progress` workflow on a daily schedule.
What determines all direct filesystem accesses?
Extend `FromDbfsFolder`, `DirectFilesystemAccessMatcher`, and `FromTable` to return file access; detect `open('/dbfs/...')` literals; extend `WorkflowLinter` to persist this information in a new table.
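As an illustration of what such a matcher might flag, here is a sketch using Python's `ast` module to find string literals that look like direct filesystem paths. The actual ucx matchers are more involved (they track dataflow, not just literals), so treat the prefix list and function name as assumptions:

```python
import ast

# Prefixes that suggest direct filesystem access rather than a UC table.
_FS_PREFIXES = ("s3a://", "s3://", "abfss://", "gs://", "dbfs:/", "/dbfs/", "/mnt/")


def find_direct_fs_access(source: str) -> list[tuple[int, str]]:
    """Return (line number, literal) pairs for string constants in the
    source that look like direct filesystem paths, e.g. arguments to
    .load() or open()."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if node.value.startswith(_FS_PREFIXES):
                hits.append((node.lineno, node.value))
    return sorted(hits)
```

Running it over `df = spark.read.format("delta").load("s3a://bucket/x")` followed by `f = open("/dbfs/tmp/out.csv")` flags both path literals with their line numbers, which is the kind of record the linter would persist.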