Plot parallelization off failing in GCPy 1.4.1 #285

Closed
lizziel opened this issue Jan 25, 2024 · 4 comments · Fixed by #287

@lizziel

lizziel commented Jan 25, 2024

Name and Institution (Required)

Name: Lizzie Lundgren
Institution: Harvard University

Description of your issue or question

I get the following error when running the 1-year transport tracer benchmark with GCPy 1.4.1 and plot parallelization turned off in the benchmark configuration file. I am using Python 3.9.18. The full package list is in #284.

Traceback (most recent call last):
  File "/gpfsm/dnb34/ewlundgr/python/mambaforge/envs/gcpy_v1_4_1/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/gpfsm/dnb34/ewlundgr/python/mambaforge/envs/gcpy_v1_4_1/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/ewlundgr/nb/python/gcpy/gcpy/benchmark/run_benchmark.py", line 1606, in <module>
    main(sys.argv)
  File "/home/ewlundgr/nb/python/gcpy/gcpy/benchmark/run_benchmark.py", line 1602, in main
    choose_benchmark_type(config)
  File "/home/ewlundgr/nb/python/gcpy/gcpy/benchmark/run_benchmark.py", line 100, in choose_benchmark_type
    run_1yr_tt_benchmark(
  File "/home/ewlundgr/nb/python/gcpy/gcpy/benchmark/modules/run_1yr_tt_benchmark.py", line 662, in run_benchmark
    bmk.make_benchmark_conc_plots(
  File "/home/ewlundgr/nb/python/gcpy/gcpy/benchmark_funcs.py", line 1510, in make_benchmark_conc_plots
    dict_sfc = {list(result.keys())[0]: result[list(
  File "/home/ewlundgr/nb/python/gcpy/gcpy/benchmark_funcs.py", line 1510, in <dictcomp>
    dict_sfc = {list(result.keys())[0]: result[list(
AttributeError: 'str' object has no attribute 'keys'
@lizziel lizziel added the category: Bug Something isn't working label Jan 25, 2024
@yantosca

Thanks @lizziel. I think I see what the problem is. When you turn off parallelization you might be getting a string back instead of a dict. Let me see if I can reproduce this locally.

@yantosca

Hi @lizziel! I think I've figured this out. This is happening in the various places where plots are parallelized. There are code blocks such as:

    # --------------------------------------------
    # Create the plots in parallel
    # Turn off parallelization if n_job=1
    if n_job != 1:
        results = Parallel(n_jobs=n_job)(
            delayed(createplots)(filecat)
            for _, filecat in enumerate(catdict)
        )
    else:
        for _, filecat in enumerate(catdict):
            results = createplots(filecat)
    # --------------------------------------------

in e.g. gcpy/benchmark_funcs.py.

So when parallelization is on (n_cores: -1), the results variable comes back as a list with one dictionary per category:

[{'Aerosols': {'sfc': [], '500': [], 'zm': []}},
 {'Bromine': {'sfc': [], '500': [], 'zm': []}},
 {'Chlorine': {'sfc': [], '500': [], 'zm': []}},
 {'Iodine': {'sfc': [], '500': [], 'zm': []}},
 {'Nitrogen': {'sfc': [], '500': [], 'zm': []}},
 {'Oxidants': {'sfc': [], '500': [], 'zm': []}},
 {'Primary_Organics': {'sfc': [], '500': [], 'zm': []}},
 {'ROy': {'sfc': [], '500': [], 'zm': []}},
 {'Secondary_Organic_Aerosols': {'sfc': [], '500': [], 'zm': []}},
 {'Secondary_Organics': {'sfc': [], '500': [], 'zm': []}},
 {'Sulfur': {'sfc': [], '500': [], 'zm': []}}]

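but when parallelization is off (n_cores: 1), each loop iteration overwrites the previous one, so the results variable comes back as a single dictionary holding only the last category: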

{'Sulfur': {'sfc': [], '500': [], 'zm': []}}
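
To see why this surfaces as `'str' object has no attribute 'keys'`, here is a minimal sketch. It assumes the failing dict comprehension iterates over `results` and calls `.keys()` on each element; the traceback line is truncated, so `collect_sfc` below is a hypothetical stand-in, not the actual GCPy code. Iterating over a single dictionary yields its keys, which are strings.

```python
# Shape returned when parallelization is on: a list of one-entry dicts.
parallel_results = [
    {"Aerosols": {"sfc": [], "500": [], "zm": []}},
    {"Sulfur": {"sfc": [], "500": [], "zm": []}},
]

# Shape returned when parallelization is off: only the last dict survives.
sequential_results = {"Sulfur": {"sfc": [], "500": [], "zm": []}}


def collect_sfc(results):
    """Hypothetical stand-in for the comprehension named in the traceback."""
    return {
        list(result.keys())[0]: result[list(result.keys())[0]]["sfc"]
        for result in results
    }


# Works: every element of the list is a dict.
sfc_by_category = collect_sfc(parallel_results)

# Fails: iterating over a dict yields its keys, so `result` is the
# string "Sulfur", and strings have no .keys() method.
try:
    collect_sfc(sequential_results)
except AttributeError as err:
    error_message = str(err)
    print(error_message)
```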

I think the solution is to make results a list and then append the output of the createplots function to the list when parallelization is off. I'll implement a fix.
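
A minimal sketch of that idea follows. It is not the code merged in #287: `make_plots` is a hypothetical wrapper, and `createplots` here is a stand-in that only returns the dictionary shape shown above. The joblib import sits inside the parallel branch purely so the sequential path of this sketch runs without joblib installed.

```python
def createplots(filecat):
    """Hypothetical stand-in for the real plotting routine."""
    return {filecat: {"sfc": [], "500": [], "zm": []}}


def make_plots(catdict, n_job=1):
    """Return a list of per-category dicts in both the parallel and sequential case."""
    if n_job != 1:
        # Parallel path: joblib already returns a list of results.
        from joblib import Parallel, delayed

        results = Parallel(n_jobs=n_job)(
            delayed(createplots)(filecat) for filecat in catdict
        )
    else:
        # Sequential path: start with an empty list and append each
        # result instead of overwriting the variable on every pass.
        results = []
        for filecat in catdict:
            results.append(createplots(filecat))
    return results


sequential = make_plots(["Aerosols", "Sulfur"], n_job=1)
```

With this change both branches hand back the same list-of-dictionaries shape, so code that iterates over `results` downstream does not need to know whether parallelization was on.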

yantosca added a commit that referenced this issue Jan 25, 2024
This commit fixes the issue reported by @lizziel in #285.  The "results"
variable was being overwritten instead of appended to when plots are
generated sequentially (i.e. with "n_cores: 1" in the YAML input).

gcpy/benchmark/modules/run_1yr_*_fullchem.py
gcpy/benchmark_funcs.py
gcpy/plot/compare_*.py
- For the case when parallelization is off:
    1. Declare "results" as an empty list
    2. Append the output of the routine being called into "results"
  This will prevent a dictionary key error as described in #285.

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca

Closed by #287

@yantosca yantosca self-assigned this Jan 25, 2024
@yantosca yantosca added the topic: Benchmark Plots and Tables Issues pertaining to generating plots/tables from benchmark output label Jan 25, 2024
@yantosca yantosca added this to the 1.4.2 milestone Jan 25, 2024
@yantosca

We can close this issue now because #287 has been merged. This problem is now fixed.
