Old table@hash syntax is no longer supported in Spark #7556

Closed
dimas-b opened this issue Sep 25, 2023 · 7 comments · Fixed by #7565

@dimas-b
Member

dimas-b commented Sep 25, 2023

Nessie Spark docs indicate that it is possible to refer to an older version of a table using the table@<hash> syntax.

However, it does not work with Iceberg 1.3.1, Spark 3.3.0 and Nessie 0.71.2:

spark-sql> select * from `t1@164ebabd990fababcf38caa3cc2af0c150ce949614c6c2358d1963b5d8902d7c`;
23/09/25 17:30:18 ERROR SparkSQLDriver: Failed in [select * from `t1@164ebabd990fababcf38caa3cc2af0c150ce949614c6c2358d1963b5d8902d7c`]
java.lang.IllegalArgumentException: Reference name must start with a letter, followed by letters, digits, one of the ./_- characters, not end with a slash or dot, not contain '..' - but was: 164ebabd990fababcf38caa3cc2af0c150ce949614c6c2358d1963b5d8902d7c

but this works:

spark-sql> select * from `t1@main#164ebabd990fababcf38caa3cc2af0c150ce949614c6c2358d1963b5d8902d7c`;
1
Time taken: 0.112 seconds, Fetched 1 row(s)
@dimas-b
Member Author

dimas-b commented Sep 25, 2023

Another odd error:

spark-sql> select * from `t1@main#2652878af071d54b1ca1230f662ddcb194e710eed2b9eb807dd1b376bfab7f36^1`;
23/09/25 17:38:35 ERROR SparkSQLDriver: Failed in [select * from `t1@main#2652878af071d54b1ca1230f662ddcb194e710eed2b9eb807dd1b376bfab7f36^1`]
java.lang.IllegalArgumentException: Invalid table name: # is only allowed for hashes (reference by timestamp is not supported)
	at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:145)
	at org.apache.iceberg.nessie.NessieCatalog.parseTableReference(NessieCatalog.java:321)

@pratyakshsharma
Contributor

pratyakshsharma commented Sep 26, 2023

@dimas-b, just curious: what were you trying to do by adding ^1 at the end of this query?

select * from `t1@main#2652878af071d54b1ca1230f662ddcb194e710eed2b9eb807dd1b376bfab7f36^1`;

Sorry if this is a naive question; I am still pretty new to Nessie.

@pratyakshsharma
Contributor

One more point to highlight: I had the setup below when I hit this error:

spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.2_2.12:1.2.1,org.projectnessie:nessie-spark-3.2-extensions:0.40.1 \
--conf spark.sql.extensions="org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,org.projectnessie.spark.extensions.NessieSpark32SessionExtensions" \
--conf spark.sql.catalog.nessie.uri="http://localhost:19120/api/v1" \
--conf spark.sql.catalog.nessie.ref=main \
--conf spark.sql.catalog.nessie.authentication.type=NONE \
--conf spark.sql.catalog.nessie.catalog-impl=org.apache.iceberg.nessie.NessieCatalog \
--conf spark.sql.catalog.nessie=org.apache.iceberg.spark.SparkCatalog \
--conf spark.sql.catalog.nessie.warehouse=<> \

It looks like the above syntax also does not work with Nessie 0.40.1, Iceberg 1.2.1 and Spark 3.2.

@dimas-b
Member Author

dimas-b commented Sep 26, 2023

hash^1, as in git, is the parent commit of hash.
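
To make the git analogy concrete, here is a minimal sketch of how a `hash^N` suffix could be resolved against a commit graph. The graph, the short commit names, and the `resolve_parent` helper are all hypothetical and only illustrate the idea; this is not how Nessie or Iceberg implement it.

```python
from typing import Dict, List

# Toy commit graph (hypothetical data): each commit maps to its parents,
# first parent first. Real Nessie hashes are long hex strings.
PARENTS: Dict[str, List[str]] = {
    "c3": ["c2"],
    "c2": ["c1"],
    "c1": [],
}


def resolve_parent(spec: str, parents: Dict[str, List[str]]) -> str:
    """Resolve 'hash^N' to the N-th parent of hash; a bare '^' means '^1'."""
    commit, sep, number = spec.partition("^")
    if not sep:
        # No relative suffix: the spec is already a plain commit name.
        return commit
    n = int(number) if number else 1
    candidates = parents[commit]
    if n < 1 or n > len(candidates):
        raise ValueError(f"{commit} has no parent number {n}")
    return candidates[n - 1]
```

With this toy graph, `resolve_parent("c3^1", PARENTS)` walks one step back from `c3` to its direct parent, which is what the `^1` in the failing query was meant to express.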

@adutra
Contributor

adutra commented Sep 28, 2023

select * from t1@164ebabd990fababcf38caa3cc2af0c150ce949614c6c2358d1963b5d8902d7c;

You should actually use:

select * from `t1#164ebabd990fababcf38caa3cc2af0c150ce949614c6c2358d1963b5d8902d7c`;

select * from t1@main#2652878af071d54b1ca1230f662ddcb194e710eed2b9eb807dd1b376bfab7f36^1;

This one is indeed a problem because we didn't add support for relative hashes here. I will provide a fix.
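
For readers following along, the behavior reported in this issue is consistent with reading the name as `table[@ref][#hash]`. The sketch below is a simplified approximation written for illustration, not the actual `NessieCatalog.parseTableReference` code. The `parse_table_reference` helper, the 64-character hash pattern, and the reference-name pattern are assumptions derived from the hashes and error messages quoted above.

```python
import re
from typing import NamedTuple, Optional

# Assumption: a full commit hash is 64 hex characters, like the ones in this issue.
HASH_RE = re.compile(r"^[0-9a-fA-F]{64}$")
# Assumption: rough approximation of the reference-name rule from the error message.
REF_RE = re.compile(r"^[A-Za-z][A-Za-z0-9./_-]*$")


class TableRef(NamedTuple):
    table: str
    ref: Optional[str]
    hash: Optional[str]


def _valid_ref(name: str) -> bool:
    """Letter first, then letters, digits or ./_-; no '..', no trailing '/' or '.'."""
    return (
        bool(REF_RE.match(name))
        and ".." not in name
        and not name.endswith(("/", "."))
    )


def parse_table_reference(name: str) -> TableRef:
    """Hypothetical helper: split 'table[@ref][#hash]' into its parts."""
    rest, hash_sep, hash_part = name.partition("#")
    table, ref_sep, ref_part = rest.partition("@")
    if ref_sep and not _valid_ref(ref_part):
        # 't1@<hash>' ends up here: the text after '@' is read as a reference name.
        raise ValueError(f"Invalid reference name: {ref_part}")
    if hash_sep and not HASH_RE.match(hash_part):
        # '<hash>^1' ends up here: this sketch, like the code before the fix,
        # accepts only plain hashes after '#'.
        raise ValueError("Invalid table name: # is only allowed for hashes")
    return TableRef(table, ref_part if ref_sep else None, hash_part if hash_sep else None)
```

Under these assumptions, `t1#<hash>` and `t1@main#<hash>` parse cleanly, while `t1@<hash>` and `t1@main#<hash>^1` raise errors similar to the ones reported above.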

@dimas-b
Member Author

dimas-b commented Sep 28, 2023

t1#164ebabd990fababcf38caa3cc2af0c150ce949614c6c2358d1963b5d8902d7c - I guess we still have to adjust Spark SQL docs, then :)

@pratyakshsharma
Contributor

Included this in my PR #7559
