As a Trino Iceberg user I want to define a CR that allows me to regularly run maintenance actions on my tables.
Come up with a CRD
Figure out how to authenticate against the Trino cluster, e.g. always create a k8s Secret for a service user, add it to the authentication chain using password file authentication, and mount it into the k8s CronJob
Should
Allow running against a whole schema, iterating through all of its tables
Emit Prometheus metrics so we can alert on failures and have a Dashboard
Could
Prometheus alerts
Grafana dashboard with e.g. files compacted, bytes and rows read/written
One possible solution would be to create a k8s CronJob for every maintenance CR.
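As a rough sketch of that idea, the function below builds a `batch/v1` CronJob manifest for one maintenance CR. The image name, the `/credentials` mount path, and passing the SQL statements as container arguments are all assumptions made for illustration, not decisions from this issue. It also assumes the schedule is already a cron expression.

```python
def build_cronjob(name, cron_expression, statements, secret_name,
                  image="example.org/trino-cli:latest"):  # hypothetical image
    """Return a batch/v1 CronJob manifest (as a dict) for one maintenance CR."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": f"{name}-maintenance"},
        "spec": {
            "schedule": cron_expression,
            # Do not start a new run while the previous one is still working.
            "concurrencyPolicy": "Forbid",
            "jobTemplate": {"spec": {"template": {"spec": {
                "restartPolicy": "Never",
                "containers": [{
                    "name": "maintenance",
                    "image": image,
                    # Assumption: the container runs each argument as one SQL statement.
                    "args": list(statements),
                    "volumeMounts": [{
                        "name": "credentials",
                        "mountPath": "/credentials",  # hypothetical path
                        "readOnly": True,
                    }],
                }],
                # The service user's Secret, mounted so the job can authenticate.
                "volumes": [{
                    "name": "credentials",
                    "secret": {"secretName": secret_name},
                }],
            }}}},
        },
    }
```

An `interval` such as `24h` would need to be translated to a cron expression first, or handled by a different mechanism. That part is left open here.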
The CRD could look something like this:
spec:
  target:
    catalog: lakehouse
    schema: default
    table: my_table # Optional
  schedule:
    interval: 24h # using new Duration struct
    # OR
    cronExpression: XXX
  actions:
    - name: optimize
      fileSizeThreshold: 100MB # optional, otherwise let Trino use its internal default
    - name: expire_snapshots
      retentionThreshold: 7d # optional, otherwise let Trino use its internal default
    - name: remove_orphan_files
      # Document: The value for retention_threshold must be higher than or equal to iceberg.remove_orphan_files.min-retention in the catalog, otherwise the procedure fails with a message similar to: Retention specified (1.00d) is shorter than the minimum retention configured in the system (7.00d)
      retentionThreshold: 7d # optional, otherwise let Trino use its internal default
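To illustrate what the job might run for such a spec, here is a minimal sketch that turns `target` and `actions` into Trino `ALTER TABLE ... EXECUTE` statements for the Iceberg connector. The function and mapping names are hypothetical, and the sketch assumes `table` is set. For a schema-wide run, the tables would have to be listed first (for example with `SHOW TABLES`), which is not shown.

```python
# Maps the CRD's camelCase option names (from the draft above) to the
# named parameters of the Iceberg table procedures.
ACTION_PARAMETERS = {
    "optimize": {"fileSizeThreshold": "file_size_threshold"},
    "expire_snapshots": {"retentionThreshold": "retention_threshold"},
    "remove_orphan_files": {"retentionThreshold": "retention_threshold"},
}


def quote_identifier(name):
    """Quote a SQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def maintenance_statements(target, actions):
    """Return one ALTER TABLE ... EXECUTE statement per configured action."""
    table = ".".join(
        quote_identifier(target[key]) for key in ("catalog", "schema", "table")
    )
    statements = []
    for action in actions:
        name = action["name"]
        if name not in ACTION_PARAMETERS:
            raise ValueError(f"unsupported maintenance action: {name}")
        # Only pass options that were set, so Trino falls back to its defaults.
        args = [
            f"{parameter} => {quote_literal(action[option])}"
            for option, parameter in ACTION_PARAMETERS[name].items()
            if option in action
        ]
        suffix = f"({', '.join(args)})" if args else ""
        statements.append(f"ALTER TABLE {table} EXECUTE {name}{suffix}")
    return statements


if __name__ == "__main__":
    target = {"catalog": "lakehouse", "schema": "default", "table": "my_table"}
    actions = [
        {"name": "optimize", "fileSizeThreshold": "100MB"},
        {"name": "expire_snapshots"},
    ]
    for statement in maintenance_statements(target, actions):
        print(statement)
```

Omitting an option leaves the parameter list out entirely, which matches the "let Trino use its internal default" behaviour described in the comments of the draft CRD.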
At the risk of killing this issue with scope creep: a while back we discussed having TrinoTable CRDs that the operator would read and use to actually create the tables in Trino.
If that lands, I think that object should contain the information described in this issue as well, rather than it being put into a separate CRD?