The spec is being developed alongside the implementation work, but at a high level the agreed-upon APIs look like (sketched in code after the list):
void append(Tables...) or appendDataFiles/appendTables: writes tables to data files 1:1, then commits a transaction that adds the new data files
void overwrite(Tables...): writes tables to data files 1:1, then commits a transaction that removes all existing data files and adds the new ones
List<URI> write(Tables...): writes tables to data files 1:1, does not commit anything
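To make the shape concrete, here is a minimal Java sketch of that surface. The interface name, the Table placeholder type, and the exact signatures are illustrative only; the spec will settle the real types.

```java
import java.net.URI;
import java.util.List;

// Sketch of the API surface described above; method names follow the issue text
// but nothing here is final. "Table" is a placeholder for whatever in-memory
// table representation the writer ends up accepting (for example an Arrow table).
public interface TableWriter {

  interface Table {}  // placeholder input type

  // Writes each table to exactly one data file, then commits a transaction
  // that adds the new data files to the Iceberg table.
  void append(Table... tables);

  // Writes each table to exactly one data file, then commits a transaction
  // that removes all existing data files and adds the new ones.
  void overwrite(Table... tables);

  // Writes each table to exactly one data file but does not commit anything;
  // returns the locations of the files that were written.
  List<URI> write(Table... tables);
}
```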
An important requirement is that we persist the Iceberg schema field-ids into the parquet schema's Type field_id field, so that Iceberg columns can be mapped to parquet columns.
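As a hedged illustration of that mapping (not the final implementation), a parquet schema built with parquet-java's Types builder can carry the Iceberg field-ids via .id(...); the column names and ids below are made up:

```java
import org.apache.parquet.schema.LogicalTypeAnnotation;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName;
import org.apache.parquet.schema.Types;

// Sketch: each parquet column carries the field_id of the Iceberg column it maps to.
public class FieldIdSketch {
  public static void main(String[] args) {
    MessageType parquetSchema = Types.buildMessage()
        .required(PrimitiveTypeName.INT64).id(1).named("id")   // Iceberg field-id 1
        .optional(PrimitiveTypeName.BINARY)
            .as(LogicalTypeAnnotation.stringType()).id(2).named("data")  // Iceberg field-id 2
        .named("table");

    System.out.println(parquetSchema);
  }
}
```

With the ids persisted, readers can resolve columns by field_id rather than by name, which is what keeps Iceberg column renames and drops safe.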
We should also check whether there is any specific guidance on the metadata we should be writing. When writing a pyarrow table using pyiceberg, we've noticed that the metadata key ARROW:schema contains the Arrow schema; pyspark, by contrast, wrote a metadata key iceberg.schema that contains the Iceberg schema.
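For the iceberg.schema key specifically, here is a minimal sketch (assuming the Java libraries and Iceberg's SchemaParser) of what a writer could stage as parquet footer key-value metadata; how the map actually reaches the footer depends on the parquet writer in use:

```java
import java.util.Map;

import org.apache.iceberg.Schema;
import org.apache.iceberg.SchemaParser;
import org.apache.iceberg.types.Types;

// Sketch: build the footer key-value metadata a writer could attach, mirroring
// the "iceberg.schema" key observed in files written by pyspark.
public class FooterMetadataSketch {
  public static void main(String[] args) {
    Schema icebergSchema = new Schema(
        Types.NestedField.required(1, "id", Types.LongType.get()),
        Types.NestedField.optional(2, "data", Types.StringType.get()));

    // Serialize the Iceberg schema to its JSON form and stage it as footer metadata.
    Map<String, String> footerMetadata =
        Map.of("iceberg.schema", SchemaParser.toJson(icebergSchema));

    footerMetadata.forEach((k, v) -> System.out.println(k + " = " + v));
  }
}
```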