Fix docs on website #7559

Merged 4 commits on Sep 29, 2023. Changes shown from 2 commits.
28 changes: 14 additions & 14 deletions site/docs/develop/java.md
@@ -25,23 +25,23 @@ tests) + Jackson's DataBinding and JAX-RS modules (any version from the last ~3+
## API

The `NessieClientBuilder` and concrete builder implementations (such as `HttpClientBuilder`) provide an easy way of configuring and building a `NessieApi`. The currently stable API that should be used
-is `NessieApiV1`, which can be instantiated as shown below:
+is `NessieApiV2`, which can be instantiated as shown below:


```java

import java.net.URI;
import java.util.List;
-import org.projectnessie.client.api.NessieApiV1;
+import org.projectnessie.client.api.NessieApiV2;
import org.projectnessie.client.NessieClientBuilder;
import org.projectnessie.model.Reference;

-NessieApiV2 api = NessieClientBuilder.builder()
-    .withUri(URI.create("http://localhost:19121/api/v2"))
+NessieApiV2 api = NessieClientBuilder.createClientBuilder(null, null)
+    .withUri(URI.create("http://localhost:19120/api/v2"))
.build(NessieApiV2.class);

-List<Reference> references = api.getAllReferences().get();
-references.stream()
+api.getAllReferences()
+    .stream()
.map(Reference::getName)
.forEach(System.out::println);
```
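
The client holds HTTP resources and should be closed when no longer needed. A minimal sketch of the same read wrapped in try-with-resources, assuming `NessieApi` extends `AutoCloseable` (as it does in current client versions):

```java
import java.net.URI;
import org.projectnessie.client.NessieClientBuilder;
import org.projectnessie.client.api.NessieApiV2;
import org.projectnessie.model.Reference;

// try-with-resources closes the client's underlying HTTP resources on exit
try (NessieApiV2 api = NessieClientBuilder.createClientBuilder(null, null)
    .withUri(URI.create("http://localhost:19120/api/v2"))
    .build(NessieApiV2.class)) {
  api.getAllReferences()
      .stream()
      .map(Reference::getName)
      .forEach(System.out::println);
}
```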
@@ -131,8 +131,8 @@ config.getVersion();
Creates a new commit by adding metadata for an `IcebergTable` under the specified `ContentKey` instance represented by `key` and deletes content represented by `key2`

```java
ContentKey key = ContentKey.of("table.name.space", "name");
ContentKey key2 = ContentKey.of("other.name.space", "name2");
ContentKey key = ContentKey.of("your-namespace", "your-table-name");
ContentKey key2 = ContentKey.of("your-namespace2", "your-table-name2");
IcebergTable icebergTable = IcebergTable.of("path1", 42L);
api.commitMultipleOperations()
.branchName(branch)
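    // the chain typically continues as follows (a sketch of the assumed remaining
    // steps; Operation and CommitMeta live in org.projectnessie.model):
    // .hash(hash)
    // .operation(Operation.Put.of(key, icebergTable))
    // .operation(Operation.Delete.of(key2))
    // .commitMeta(CommitMeta.fromMessage("commit message"))
    // .commit();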
@@ -147,17 +147,17 @@

Fetches the content for a single `ContentKey`
```java
ContentKey key = ContentKey.of("table.name.space", "name");
ContentKey key = ContentKey.of("your-namespace", "your-table-name");
Map<ContentKey, Content> map = api.getContent().key(key).refName("dev").get();
```

Fetches the content for multiple `ContentKey` instances
```java
List<ContentKey> keys =
Arrays.asList(
ContentKey.of("table.name.space", "name1"),
ContentKey.of("table.name.space", "name2"),
ContentKey.of("table.name.space", "name3"));
ContentKey.of("your-namespace1", "your-table-name1"),
ContentKey.of("your-namespace1", "your-table-name2"),
ContentKey.of("your-namespace2", "your-table-name3"));
Map<ContentKey, Content> allContent = api.getContent().keys(keys).refName("dev").get();
```
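
The values in the returned map are typed as the `Content` base class; a short sketch of narrowing an entry back to an `IcebergTable` (key names continue the example above):

```java
import org.projectnessie.model.Content;
import org.projectnessie.model.ContentKey;
import org.projectnessie.model.IcebergTable;

Content content = allContent.get(ContentKey.of("your-namespace1", "your-table-name1"));
// Content is the common supertype of tables, views, namespaces, etc.;
// narrow it before reading table-specific fields
if (content instanceof IcebergTable) {
  IcebergTable table = (IcebergTable) content;
  System.out.println(table.getMetadataLocation());
}
```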

@@ -212,7 +212,7 @@ Note that `BASIC` is not supported in production and should only be used for dev
```java
NessieApiV1 api =
HttpClientBuilder.builder()
.withUri(URI.create("http://localhost:19121/api/v1"))
.withUri(URI.create("http://localhost:19120/api/v1"))
.withAuthentication(BasicAuthenticationProvider.create("my_username", "very_secret"))
.build(NessieApiV1.class);
```
@@ -221,7 +221,7 @@ The `BearerAuthenticationProvider` allows connecting to a Nessie server that has
```java
NessieApiV1 api =
HttpClientBuilder.builder()
.withUri(URI.create("http://localhost:19121/api/v1"))
.withUri(URI.create("http://localhost:19120/api/v1"))
.withAuthentication(BearerAuthenticationProvider.create("bearerToken"))
.build(NessieApiV1.class);
```
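
Authentication composes with the `NessieApiV2` builder from the top of this page in the same way; a sketch against the v2 endpoint (the token value is illustrative):

```java
import java.net.URI;
import org.projectnessie.client.NessieClientBuilder;
import org.projectnessie.client.api.NessieApiV2;
import org.projectnessie.client.auth.BearerAuthenticationProvider;

NessieApiV2 api = NessieClientBuilder.createClientBuilder(null, null)
    .withUri(URI.create("http://localhost:19120/api/v2"))
    // any authentication provider can be plugged in here, e.g. basic or bearer
    .withAuthentication(BearerAuthenticationProvider.create("bearerToken"))
    .build(NessieApiV2.class);
```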
2 changes: 1 addition & 1 deletion site/docs/features/management.md
@@ -177,7 +177,7 @@ Cut-off policies are parsed using the following logic and precedence:

### Running the _sweep_ (or _expire_) phase: Identifying live content references

-Nessie GC's sweep phase uses the the actual table format, for example Iceberg, to map the collected
+Nessie GC's sweep phase uses the actual table format, for example Iceberg, to map the collected
live content references to live file references. The _sweep_ phase operates on each content-ID. So
it collects the live file references for each content ID. Those file references refer to Iceberg
assets:
5 changes: 3 additions & 2 deletions site/docs/tools/iceberg/flink.md
@@ -83,10 +83,11 @@ SELECT * FROM `<catalog_name>`.`<database_name>`.`<table_name>`;
```

As well, similar to [Spark](spark.md#reading), you can read tables from specific
-branches or hashes from within a `SELECT` statement. The general pattern is `<table_name>@<branch/ref>` (e.g: `salaries@main`):
+branches or hashes from within a `SELECT` statement. The general pattern is `<table_name>@<branch>` or `<table_name>#<hash>` (e.g. `salaries@main`):

```sql
-SELECT * FROM `<catalog_name>`.`<database_name>`.`<table_name>@<branch/ref>`;
+SELECT * FROM `<catalog_name>`.`<database_name>`.`<table_name>@<branch>`;
+SELECT * FROM `<catalog_name>`.`<database_name>`.`<table_name>#<hash>`;
```
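
The same branch- and hash-qualified reads work from Flink's Table API in Java; a sketch, assuming a Nessie-backed Iceberg catalog is already registered under the illustrative names below (the hash value is also illustrative):

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
// read from branch `etl`, then from a specific commit hash
tEnv.executeSql("SELECT * FROM `nessie_catalog`.`nessie_db`.`salaries@etl`").print();
tEnv.executeSql("SELECT * FROM `nessie_catalog`.`nessie_db`.`salaries#2e1cfa82`").print();
```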

## Other DDL statements
2 changes: 1 addition & 1 deletion site/docs/tools/iceberg/hive.md
@@ -63,7 +63,7 @@ Whereby the above properties are explained as below:

To read and write into tables that are managed by Iceberg and Nessie, typical Hive SQL queries can be used. Refer to this documentation [here](https://iceberg.apache.org/hive/#querying-with-sql) for more information.

-**Note**: Hive doesn't support the notation of `table@branch`, therefore everytime you want to execute against a specific branch, you will need to set this property to point to the working branch, e.g: `SET iceberg.catalog.<catalog_name>.ref=main`. E.g:
+**Note**: Hive doesn't support the `<table>@<branch>` notation, so every time you want to execute against a specific branch, you need to point the catalog's `ref` property at the working branch, e.g. `SET iceberg.catalog.<catalog_name>.ref=main`:
```
SET iceberg.catalog.<catalog_name>.ref=dev

```
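
Putting the two steps together over Hive JDBC; a sketch, with the HiveServer2 URL and all catalog, database, and table names illustrative (the Hive JDBC driver must be on the classpath):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
    Statement stmt = conn.createStatement()) {
  // point the catalog at the working branch first...
  stmt.execute("SET iceberg.catalog.nessie_catalog.ref=dev");
  // ...then query with plain Hive SQL
  try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM nessie_db.salaries")) {
    while (rs.next()) {
      System.out.println(rs.getLong(1));
    }
  }
}
```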
8 changes: 4 additions & 4 deletions site/docs/tools/iceberg/spark.md
@@ -292,7 +292,7 @@ To read a Nessie table in iceberg simply:

The examples above all use the default branch defined on initialisation. There are several ways to reference specific
branches or hashes from within a read statement. We will take a look at a few now from pyspark3, the rules are the same
-across all environments though. The general pattern is `<table>@<branch>`. Table must be present and either
+across all environments though. The general pattern is `<table>@<branch>` or `<table>#<hash>`. Table must be present and either
branch and/or hash are optional. We will throw an error if branch or hash don't exist.
Branch or hash references in the table name will override passed `option`s and the settings in the
Spark/Hadoop configs.
@@ -301,13 +301,13 @@ Spark/Hadoop configs.
# read from branch dev
spark.read().format("iceberg").load("testing.region@dev")
# read specifically from hash
-spark.read().format("iceberg").load("testing.region@<hash>")
+spark.read().format("iceberg").load("testing.region#<hash>")

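# the same reads via Spark SQL; note that only the table@branch / table#hash
# segment is backtick-escaped, never the catalog or namespace part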
spark.sql("SELECT * FROM nessie.testing.`region@dev`")
spark.sql("SELECT * FROM nessie.testing.`region@<hash>`")
spark.sql("SELECT * FROM nessie.testing.`region#<hash>`")
```

-Notice in the SQL statements the `table@branch` must be escaped separately from namespace or catalog arguments.
+Notice in the SQL statements the `<table>@<branch>` or `<table>#<hash>` part must be escaped separately from namespace or catalog arguments.

Future versions may add the ability to specify a timestamp to query the data at a specific point in time
(time-travel). In the meantime the history can be viewed on the command line or via the python client and a specific