[#5472] improvement(docs): Add example to use cloud storage fileset and polish hadoop-catalog document. #6059
Conversation
docs/hadoop-catalog-with-s3.md
Outdated
## Prerequisites

In order to create a Hadoop catalog with S3, you need to place [`gravitino-aws-bundle-${gravitino-version}.jar`](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-aws-bundle) in the Gravitino Hadoop classpath located at `${HADOOP_HOME}/share/hadoop/common/lib/`. After that, start the Gravitino server with the following command:
Regarding "in Gravitino Hadoop classpath located at `${HADOOP_HOME}/share/hadoop/common/lib/`": should this use the Hadoop catalog classpath, not the Hadoop classpath?
done
The user should place the jar in `${GRAVITINO_HOME}/catalogs/hadoop/libs`, not `${HADOOP_HOME}/share/hadoop/common/lib/`, yes?
fixed
seems not fixed
I have confirmed that this has been fixed; can you please refresh the web page and see if it has been resolved?
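To make the resolved placement concrete, here is a minimal sketch of the agreed-upon layout. It uses a scratch directory so the commands are runnable as-is; `GRAVITINO_HOME`, the version number, and the empty jar are placeholders for a real install and download:

```shell
# Sketch of the resolved jar placement, using a scratch directory as a
# stand-in for a real install; substitute your actual GRAVITINO_HOME.
GRAVITINO_HOME="$(mktemp -d)/gravitino"
GRAVITINO_VERSION="0.8.0"   # placeholder version

# The bundle belongs on the Hadoop *catalog* classpath,
# not in ${HADOOP_HOME}/share/hadoop/common/lib/:
TARGET_DIR="${GRAVITINO_HOME}/catalogs/hadoop/libs"
mkdir -p "${TARGET_DIR}"

# Stand in for the downloaded bundle jar, then copy it into place:
touch "gravitino-aws-bundle-${GRAVITINO_VERSION}.jar"
cp "gravitino-aws-bundle-${GRAVITINO_VERSION}.jar" "${TARGET_DIR}/"
ls "${TARGET_DIR}"

# On a real install you would then restart the server so the
# catalog picks the jar up:
# "${GRAVITINO_HOME}/bin/gravitino.sh" restart
```

The key point from the thread is only the target directory; the surrounding setup is illustrative.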
Instead of placing it in a separate part at the end, which seems optional and unimportant, place it in the catalog properties part.

Other points are:

If you have any further thoughts on it, please let me know.
Strongly agree that credential vending is an advanced feature. We can distinguish it from the simple examples, using different sections or even separate documents; otherwise the simple examples will no longer be simple.
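As a sketch of the split being suggested here, a basic S3 catalog configuration could keep the static-key properties up front and push credential vending into a clearly marked advanced section. The property names and values below are illustrative only and may differ from the final published docs:

```properties
# Simple example: static credentials in the catalog properties.
location=s3a://my-bucket/test
filesystem-providers=s3
s3-endpoint=https://s3.us-east-1.amazonaws.com
s3-access-key-id=YOUR_ACCESS_KEY
s3-secret-access-key=YOUR_SECRET_KEY

# Advanced (separate section or document): credential vending.
# credential-providers=s3-token
# s3-role-arn=arn:aws:iam::123456789012:role/example-role
```

Keeping the commented block in its own section keeps the basic example minimal while still pointing advanced users at the vending options.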
docs/hadoop-catalog-with-adls.md
Outdated
license: "This software is licensed under the Apache License version 2."
---

This document describes how to configure a Hadoop catalog with ADLS (Azure Blob Storage).
ADLS is based on Azure Blob Storage, but it is not Azure Blob Storage.
Here we support Azure Blob Storage, ADLS, and ADLS (v2). I used "Azure Blob Storage" for all of them, but there is no abbreviation for "Azure Blob Storage", so I used ADLS to stand for the storage services provided by Azure. Is there a good term to describe them all?
seems ADLS is enough
I used this sentence to replace it: "ADLS (aka Azure Blob Storage (ABS), or Azure Data Lake Storage (v2))".
ADLS couldn't represent ABS, IMO; the Azure Hadoop connector only supports ADLS.
LGTM except minor comments
…e and account key in python client accordingly.
@jerryshao @mchades any other comments?
I don't have further comments; @mchades can also take a review.
docs/hadoop-catalog-index.md
Outdated
- [Hadoop catalog overview and features](./hadoop-catalog.md): This chapter provides an overview of the Hadoop catalog, its features, capabilities and related configurations.
- [Manage Hadoop catalog with Gravitino API](./manage-fileset-metadata-using-gravitino.md): This chapter explains how to manage fileset metadata using Gravitino API and provides detailed examples.
- [Using Hadoop catalog with Gravitino virtual System](how-to-use-gvfs.md): This chapter explains how to use Hadoop catalog with Gravitino virtual System and provides detailed examples.
Gravitino virtual System -> Gravitino virtual file system
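Since the index entry above points at the GVFS document, it may help to recall the virtual path layout that document describes, `gvfs://fileset/{catalog}/{schema}/{fileset}/{sub_path}`. A minimal sketch with hypothetical catalog, schema, and fileset names:

```shell
# GVFS virtual paths follow gvfs://fileset/{catalog}/{schema}/{fileset}/{sub_path}.
# The names below are hypothetical, for illustration only.
CATALOG="s3_catalog"
SCHEMA="s3_schema"
FILESET="s3_fileset"
SUB_PATH="data/part-0.parquet"

GVFS_PATH="gvfs://fileset/${CATALOG}/${SCHEMA}/${FILESET}/${SUB_PATH}"
echo "${GVFS_PATH}"
```

The same virtual path resolves regardless of whether the fileset is backed by S3, GCS, OSS, or ADLS, which is the point of routing access through GVFS.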
What changes were proposed in this pull request?

1. Add full examples of how to use cloud storage filesets like S3, GCS, OSS and ADLS.
2. Polish how-to-use-gvfs.md and hadoop-catalog.md.
3. Add a document on how filesets use credentials.

Why are the changes needed?

For better user experience.

Fix: #5472

Does this PR introduce any user-facing change?

N/A.

How was this patch tested?

N/A