-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SDK and CLI for CrateDB Cloud Cluster APIs #81
base: main
Are you sure you want to change the base?
Conversation
cratedb_toolkit/cluster/croud.py
Outdated
# TODO: `--product-name=crfree` is not always the right choice. ;] | ||
# TODO: How to select CrateDB nightly, like `--version=nightly`? | ||
# TODO: Add more parameters, like `--org-id`, `--channel`, `--unit`, and more. | ||
# TODO: What about `--sudo`? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few spots to be addressed on another iteration.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #81 +/- ##
==========================================
+ Coverage 81.17% 83.10% +1.92%
==========================================
Files 73 84 +11
Lines 2826 3409 +583
==========================================
+ Hits 2294 2833 +539
- Misses 532 576 +44
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
89f8c3a
to
f794fd9
Compare
119c7ac
to
a9f4d82
Compare
# FIXME: Fix `croud clusters deploy`. | ||
# It yields *two* payloads to stdout, making it | ||
# unusable in JSON-capturing situations. | ||
# The main advantage of the `JSONDecoder` class is that it also provides | ||
# a `.raw_decode` method, which will ignore extra data after the end of the JSON. | ||
# https://stackoverflow.com/a/75168292 | ||
payload = wr.invoke() | ||
decoder = json.JSONDecoder() | ||
data = decoder.raw_decode(payload) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Observation
This works around a minor flaw of croud
, which yields a JSON output two times, rendering the output not parseable.
Suggestion
croud
should be changed to only output a single JSON payload to stdout when invoking croud clusters deploy
.
cratedb_toolkit/io/croud.py
Outdated
def fix_job_info_table_name(self): | ||
""" | ||
Adjust full-qualified table name by adding appropriate quotes. | ||
Fixes a minor flaw on the upstream API. | ||
|
||
Currently, the API returns `testdrive.pems-1`, but that can not be used at | ||
all, because it is not properly quoted. It also can not be used 1:1, because | ||
it is not properly quoted. | ||
|
||
So, converge the table name into `"testdrive"."pems-1"` manually, for a | ||
full-qualified representation. | ||
|
||
FIXME: Remove after upstream has fixed the flaw. | ||
""" | ||
job_info = self.info | ||
if "destination" in job_info and "table" in job_info["destination"]: | ||
table = job_info["destination"]["table"] | ||
if '"' not in table and "." in table: | ||
schema, table = table.split(".") | ||
table = f'"{schema}"."{table}"' | ||
job_info["destination"]["table"] = table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Observation
This works around another minor upstream flaw.
Suggestion
The API should either return schema and table names within separate attributes (preferred), but also quote the value of the existing destination.table
attribute so that it can be re-used without further ado.
Example
{
"destination": {
"table": "\"testdrive\".\"pems-1\""
}
}
def configure_croud(no_spinner: bool = None, use_spinner: bool = None): | ||
""" | ||
Turn off croud's Halo spinner when running in Jupyter Notebooks. It does not work well. | ||
|
||
- https://github.com/ManrajGrover/halo/issues/32 | ||
- https://github.com/manrajgrover/halo/issues/179 | ||
""" | ||
if no_spinner or ((CONFIG.RUNNING_ON_JUPYTER or CONFIG.RUNNING_ON_PYTEST) and not use_spinner): | ||
mod = types.ModuleType( | ||
"croud.tools.spinner", "Mocking the croud.tools.spinner module, to turn off the Halo spinner" | ||
) | ||
setattr(mod, "HALO", NoopContextManager()) # noqa: B010 | ||
sys.modules["croud.tools.spinner"] = mod | ||
|
||
|
||
class NoopContextManager: | ||
""" | ||
For making the Halo progressbar a no-op. | ||
""" | ||
|
||
def __init__(self, *args, **kwargs): | ||
pass | ||
|
||
def __enter__(self): | ||
pass | ||
|
||
def __exit__(self, exc_type, exc_value, exc_traceback): | ||
pass | ||
|
||
def stop(self): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Observation
We had to turn off the Halo spinner currently used in croud
, because it did not work well within a Jupyter Notebook environment. We are exactly observing those issues, despite the former being officially resolved. Apparently, it came back.
- Jupyter Notebooks don't support Halo manrajgrover/halo#32
- BUG: Halo.__init__.<locals>.clean_up() takes 0 positional arguments but 1 was given manrajgrover/halo#179
Suggestion
Submit a patch to croud
to only use interactivity when is_tty() is True
, or such. At least, don't start the HALO
at module-scope level, but initialize/configure it at runtime instead.
062ee9b
to
7d89748
Compare
# Log in to CrateDB Cloud. | ||
croud login --idp azuread |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Observation
There is an alternative, headless way of doing that: Using croud config show
, you can display the location of the croud configuration file in YAML format.
current-profile: cratedb.cloud
default-format: table
profiles:
cratedb.cloud:
auth-token: xxxxxxxxxx
endpoint: https://console.cratedb.cloud
key: REDACTED
organization-id: null
region: _any_
secret: xxxxxxxxxx
If you fill in the key
and secret
values, obtained by running croud api-keys create
, to create an API key 1, operations like croud clusters list
will start working without further ado, even after logging out again using croud logout
.
It works well. Thanks, @proddata.
Suggestion
Based on those insights, improve the SDK correspondingly, by also accepting environment variables CRATEDB_CLOUD_KEY
and CRATEDB_CLOUD_SECRET
.
Footnotes
-
Alternatively, you can obtain an API key on your account page, at the "API keys" section. -- https://console.cratedb.cloud/account/settings ↩
cratedb_toolkit.configure( | ||
runtime_errors="exit", | ||
settings_accept_cli=True, | ||
settings_accept_env=True, | ||
settings_errors="exit", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may eventually be refactored into a context manager.
@pytest.mark.skip( | ||
"Does not work: Apparently, the 'responses' mockery is not properly activated when evaluating the notebook" | ||
) | ||
@responses.activate | ||
def test_example_cloud_import_notebook(mocker, mock_cloud_cluster_exists): | ||
""" | ||
Verify the Jupyter Notebook example works. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the meanwhile, we discovered a solution based on testbook, which will also let you execute a notebook on behalf of a pytest test case, but also use corresponding mocking to manage the environment.
Applying that here will make it possible to actually unskip the test case.
It is just a rough sketch, which barely works.
da905dc
to
8ca895a
Compare
- Naming things: Use more appropriate names for variables. - OO: Add `EnvironmentConfiguration`, to manage information about the toolkit environment. - OO: Add `CloudJob` entity, to manage information about a cloud job, and to streamline passing of information. - OO: Refactor low-level methods to `CloudCluster`. - Capability to configure error handling based on the environment. - Capability to conveniently acquire configuration settings from the environment, for obtaining the CrateDB Cluster identifier or name. - Tests: Add `DockerSkippingContainer` and `PytestTestcontainerAdapter`. Both skip test execution when the Docker daemon is not running, or not available to the environment. - Examples: Improve UX of `cloud_*.py` example programs.
It can be used to acquire corresponding client handles (adapter, dbapi, sqlalchemy), in order to communicate with the database.
When not logged in using `croud login`, the "create project" operation needs an organization identifier.
About
Wraps a few calls of
croud
into a different kind of API, mostly about managing clusters. It is in the same spirit as, and is also building upon GH-73.Configuration
Configuration works by defining a few environment variables. For convenience, put them within a dotenv file (
.env
).Python SDK
Deploy cluster, run database workload, and stop cluster again.
See also
examples/python/cloud_cluster.py
andexamples/python/cloud_import.py
.CLI interface
Deploy cluster, import data, and query a few samples worth of data.
See also
examples/shell/cloud_cluster.sh
andexamples/shell/cloud_import.sh
.References
Backlog