add cli access #25
Open: observingClouds wants to merge 6 commits into mllam:main from observingClouds:feature/cli
Changes from 2 commits
Commits (6):
f70600d add CLI (observingClouds)
c006ebc test and document changes (observingClouds)
e0db9de rephrase introduced changes (observingClouds)
cb6f05a clarify changes (observingClouds)
7d568f3 update call of mllam_data_prep (observingClouds)
672a582 fix cli command in changelog (observingClouds)
@@ -1,80 +1,4 @@

    -import os
    -from pathlib import Path
    -
    -from loguru import logger
    -
    -from .create_dataset import create_dataset_zarr
    -
    -# Attempt to import psutil and dask.distributed modules
    -DASK_DISTRIBUTED_AVAILABLE = True
    -try:
    -    import psutil
    -    from dask.diagnostics import ProgressBar
    -    from dask.distributed import LocalCluster
    -except ImportError or ModuleNotFoundError:
    -    DASK_DISTRIBUTED_AVAILABLE = False
    +from .cli import call
     
     if __name__ == "__main__":
    -    import argparse
    -
    -    parser = argparse.ArgumentParser(
    -        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    -    )
    -    parser.add_argument("config", help="Path to the config file", type=Path)
    -    parser.add_argument(
    -        "-o", "--output", help="Path to the output zarr file", type=Path, default=None
    -    )
    -    parser.add_argument(
    -        "--show-progress", help="Show progress bar", action="store_true"
    -    )
    -    parser.add_argument(
    -        "--dask-distributed-local-core-fraction",
    -        help="Fraction of cores to use on the local machine to do multiprocessing with dask.distributed",
    -        type=float,
    -        default=0.0,
    -    )
    -    parser.add_argument(
    -        "--dask-distributed-local-memory-fraction",
    -        help="Fraction of memory to use on the local machine (when doing multiprocessing with dask.distributed)",
    -        type=float,
    -        default=0.9,
    -    )
    -    args = parser.parse_args()
    -
    -    if args.show_progress:
    -        ProgressBar().register()
    -
    -    if args.dask_distributed_local_core_fraction > 0.0:
    -        # Only run this block if dask.distributed is available
    -        if not DASK_DISTRIBUTED_AVAILABLE:
    -            raise ModuleNotFoundError(
    -                "Currently dask.distributed isn't installed and therefore can't "
    -                "be used in mllam-data-prep. Please install the optional dependency "
    -                'with `python -m pip install "mllam-data-prep[dask-distributed]"`'
    -            )
    -        # get the number of system cores
    -        n_system_cores = os.cpu_count()
    -        # compute the number of cores to use
    -        n_local_cores = int(args.dask_distributed_local_core_fraction * n_system_cores)
    -        # get the total system memory
    -        total_memory = psutil.virtual_memory().total
    -        # compute the memory per worker
    -        memory_per_worker = (
    -            total_memory / n_local_cores * args.dask_distributed_local_memory_fraction
    -        )
    -
    -        logger.info(
    -            f"Setting up dask.distributed.LocalCluster with {n_local_cores} cores and {memory_per_worker/1024/1024:0.0f} MB of memory per worker"
    -        )
    -
    -        cluster = LocalCluster(
    -            n_workers=n_local_cores,
    -            threads_per_worker=1,
    -            memory_limit=memory_per_worker,
    -        )
    -
    -        client = cluster.get_client()
    -        # print the dashboard link
    -        logger.info(f"Dashboard link: {cluster.dashboard_link}")
    -
    -    create_dataset_zarr(fp_config=args.config, fp_zarr=args.output)
    +    args = call(args=None)
@@ -0,0 +1,75 @@

    +"""
    +Command line interface for mllam_data_prep
    +"""
    +import argparse
    +import os
    +from pathlib import Path
    +
    +from loguru import logger
    +
    +from .create_dataset import create_dataset_zarr
    +
    +# Attempt to import psutil and dask.distributed modules
    +DASK_DISTRIBUTED_AVAILABLE = True
    +try:
    +    import psutil
    +    from dask.diagnostics import ProgressBar
    +    from dask.distributed import LocalCluster
    +except ImportError:  # also covers ModuleNotFoundError, a subclass of ImportError
    +    DASK_DISTRIBUTED_AVAILABLE = False
    +
    +
    +def call(args=None):
    +    parser = argparse.ArgumentParser(
    +        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    +    )
    +    parser.add_argument("config", help="Path to the config file", type=Path)
    +    parser.add_argument(
    +        "-o", "--output", help="Path to the output zarr file", type=Path, default=None
    +    )
    +    parser.add_argument(
    +        "--show-progress", help="Show progress bar", action="store_true"
    +    )
    +    parser.add_argument(
    +        "--dask-distributed-local-core-fraction",
    +        help="Fraction of cores to use on the local machine to do multiprocessing with dask.distributed",
    +        type=float,
    +        default=0.0,
    +    )
    +    parser.add_argument(
    +        "--dask-distributed-local-memory-fraction",
    +        help="Fraction of memory to use on the local machine (when doing multiprocessing with dask.distributed)",
    +        type=float,
    +        default=0.9,
    +    )
    +    args = parser.parse_args(args)
    +
    +    if args.show_progress:
    +        ProgressBar().register()
    +
    +    if args.dask_distributed_local_core_fraction > 0.0:
    +        # get the number of system cores
    +        n_system_cores = os.cpu_count()
    +        # compute the number of cores to use
    +        n_local_cores = int(args.dask_distributed_local_core_fraction * n_system_cores)
    +        # get the total system memory
    +        total_memory = psutil.virtual_memory().total
    +        # compute the memory per worker
    +        memory_per_worker = (
    +            total_memory / n_local_cores * args.dask_distributed_local_memory_fraction
    +        )
    +
    +        logger.info(
    +            f"Setting up dask.distributed.LocalCluster with {n_local_cores} cores and {memory_per_worker/1024/1024:0.0f} MB of memory per worker"
    +        )
    +
    +        cluster = LocalCluster(
    +            n_workers=n_local_cores,
    +            threads_per_worker=1,
    +            memory_limit=memory_per_worker,
    +        )
    +
    +        # print the dashboard link
    +        logger.info(f"Dashboard link: {cluster.dashboard_link}")
    +
    +    create_dataset_zarr(fp_config=args.config, fp_zarr=args.output)
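The worker sizing inside `call` can be pulled out into a small pure function, which makes one edge case easier to see: `int(fraction * n_system_cores)` truncates toward zero, so a small fraction on a machine with few cores gives zero workers and the memory division fails. The sketch below is only an illustration. The helper name `plan_local_cluster` is hypothetical and not part of the PR, and clamping to at least one worker is an assumed design choice that the PR itself does not make.

```python
import os


def plan_local_cluster(core_fraction, memory_fraction, total_memory, n_system_cores=None):
    """Return (n_workers, memory_per_worker_in_bytes) for a local cluster.

    Hypothetical helper, not part of mllam-data-prep. ``total_memory`` is
    passed in explicitly so the function does not depend on psutil.
    """
    if not 0.0 < core_fraction <= 1.0:
        raise ValueError("core_fraction must be in the interval (0, 1]")
    if n_system_cores is None:
        # os.cpu_count() can return None when the count is undetermined
        n_system_cores = os.cpu_count() or 1
    # int() truncates, so clamp to one worker to avoid a division by zero
    n_workers = max(1, int(core_fraction * n_system_cores))
    memory_per_worker = total_memory / n_workers * memory_fraction
    return n_workers, memory_per_worker


n_workers, memory_per_worker = plan_local_cluster(
    core_fraction=0.5,
    memory_fraction=0.9,
    total_memory=8 * 1024**3,
    n_system_cores=8,
)
print(f"{n_workers} workers, {memory_per_worker / 1024 / 1024:0.0f} MB per worker")
```

Keeping the arithmetic separate from `LocalCluster` also allows it to be tested without dask.distributed or psutil installed.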
@@ -0,0 +1,14 @@

    +import tempfile
    +
    +import pytest
    +import xarray as xr
    +
    +from mllam_data_prep.cli import call
    +
    +
    +@pytest.mark.parametrize("args", [["example.danra.yaml"]])
    +def test_call(args):
    +    with tempfile.TemporaryDirectory(suffix=".zarr") as tmpdir:
    +        args.extend(["--output", tmpdir])
    +        call(args)
    +        _ = xr.open_zarr(tmpdir)
this test is pure elegance, nice!
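The test above works because `call` forwards its `args` parameter to `parse_args`: `parse_args(None)` reads `sys.argv[1:]`, while an explicit list is parsed as given, so `__main__` and the tests can share one entry point. Here is a minimal sketch of that pattern, using a hypothetical `parse_cli` function with only a subset of the PR's options. It builds a new argument list rather than calling `extend`, on the assumption that mutating a list passed to `pytest.mark.parametrize` is better avoided.

```python
import argparse
from pathlib import Path


def parse_cli(args=None):
    """Parse CLI arguments; with ``args=None`` argparse reads sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )
    parser.add_argument("config", help="Path to the config file", type=Path)
    parser.add_argument(
        "-o", "--output", help="Path to the output zarr file", type=Path, default=None
    )
    parser.add_argument(
        "--show-progress", help="Show progress bar", action="store_true"
    )
    return parser.parse_args(args)


base_args = ["example.danra.yaml"]
# Build a new list instead of base_args.extend(...), so the caller's
# list is left unchanged.
parsed = parse_cli([*base_args, "--output", "out.zarr"])
print(parsed.config, parsed.output, parsed.show_progress)
```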
@@ -0,0 +1,53 @@

    +import importlib
    +import tempfile
    +
    +import pytest
    +import xarray as xr
    +
    +from mllam_data_prep.cli import call
    +
    +
    +def call_wrapper(args):
    +    with tempfile.TemporaryDirectory(suffix=".zarr") as tmpdir:
    +        args.extend(["--output", tmpdir])
    +        call(args)
    +        _ = xr.open_zarr(tmpdir)
    +
    +
    +def distributed():
    +    """Check if dask.distributed is installed"""
    +    try:
    +        importlib.import_module("dask.distributed")
    +
    +        return True
    +    except (ModuleNotFoundError, ImportError):
    +        return False
    +
    +
    +@pytest.mark.parametrize(
    +    "args",
    +    [
    +        ["example.danra.yaml", "--dask-distributed-local-core-fraction", "1.0"],
    +        ["example.danra.yaml"],
    +    ],
    +)
    +def test_run_distributed(args):
    +    if distributed():
    +        call_wrapper(args)
    +    elif not distributed() and "--dask-distributed-local-core-fraction" in args:
    +        index = args.index("--dask-distributed-local-core-fraction")
    +        core_fraction = float(args[index + 1])
    +        if core_fraction > 0:
    +            pytest.raises(
    +                NameError,
    +                call_wrapper,
    +                args=args,
    +            )
    +        else:
    +            pytest.raises(
    +                ModuleNotFoundError,
    +                call_wrapper,
    +                args=args,
    +            )
    +    else:
    +        call_wrapper(args)
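The `distributed()` helper above imports the module to find out whether it is installed. A lighter alternative, sketched here on the assumption that only availability matters, is `importlib.util.find_spec`, which locates a module without executing it (parent packages are still imported for dotted names). The helper name `module_available` is hypothetical and not part of the PR.

```python
import importlib.util


def module_available(name):
    """Return True if ``name`` can be located, without importing the module itself."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names when the parent package is missing,
        # e.g. "dask.distributed" without dask installed.
        return False


print(module_available("json"))
print(module_available("dask.distributed"))
```

A check like this could also feed `pytest.mark.skipif`, so that each branch of the test becomes its own test case instead of one test with three paths. That restructuring is a suggestion, not something the PR does.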
maybe it would be clearer to say "add support for calling cli from python function and add tests for cli with/without dask.distributed"

Yeah, I updated it. Please check.

maybe we should just write what the CLI call is now? Is it still python -m ...? Maybe the changelog should then say or something like that?