Plot parallelization off failing in GCPy 1.4.1 #285

lizziel · 2024-01-25T16:02:39Z

Name and Institution (Required)

Name: Lizzie Lundgren
Institution: Harvard University

Description of your issue or question

I get the following error when running the 1-year transport tracer benchmark with GCPy 1.4.1 and plot parallelization turned off in the benchmark configuration file. I am using Python 3.9.18. The full package list is in #284.

Traceback (most recent call last):
  File "/gpfsm/dnb34/ewlundgr/python/mambaforge/envs/gcpy_v1_4_1/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/gpfsm/dnb34/ewlundgr/python/mambaforge/envs/gcpy_v1_4_1/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/ewlundgr/nb/python/gcpy/gcpy/benchmark/run_benchmark.py", line 1606, in <module>
    main(sys.argv)
  File "/home/ewlundgr/nb/python/gcpy/gcpy/benchmark/run_benchmark.py", line 1602, in main
    choose_benchmark_type(config)
  File "/home/ewlundgr/nb/python/gcpy/gcpy/benchmark/run_benchmark.py", line 100, in choose_benchmark_type
    run_1yr_tt_benchmark(
  File "/home/ewlundgr/nb/python/gcpy/gcpy/benchmark/modules/run_1yr_tt_benchmark.py", line 662, in run_benchmark
    bmk.make_benchmark_conc_plots(
  File "/home/ewlundgr/nb/python/gcpy/gcpy/benchmark_funcs.py", line 1510, in make_benchmark_conc_plots
    dict_sfc = {list(result.keys())[0]: result[list(
  File "/home/ewlundgr/nb/python/gcpy/gcpy/benchmark_funcs.py", line 1510, in <dictcomp>
    dict_sfc = {list(result.keys())[0]: result[list(
AttributeError: 'str' object has no attribute 'keys'


yantosca · 2024-01-25T19:08:12Z

Thanks @lizziel. I think I see what the problem is. When you turn off parallelization you might be getting a string back instead of a dict. Let me see if I can reproduce this locally.

yantosca · 2024-01-25T20:48:18Z

Hi @lizziel! I think I've figured this out. This is happening in the various places where plots are parallelized. There are code blocks such as:

    # --------------------------------------------
    # Create the plots in parallel
    # Turn off parallelization if n_job=1
    if n_job != 1:
        results = Parallel(n_jobs=n_job)(
            delayed(createplots)(filecat)
            for _, filecat in enumerate(catdict)
        )
    else:
        for _, filecat in enumerate(catdict):
            results = createplots(filecat)
    # --------------------------------------------

in e.g. gcpy/benchmark_funcs.py.

So when parallelization (n_cores: -1) is on, the results variable comes back as:

[{'Aerosols': {'sfc': [], '500': [], 'zm': []}}, {'Bromine': {'sfc': [], '500': [], 'zm': []}}, {'Chlorine': {'sfc': [], '500': [], 'zm': []}}, {'Iodine': {'sfc': [], '500': [], 'zm': []}}, {'Nitrogen': {'sfc': [], '500': [], 'zm': []}}, {'Oxidants': {'sfc': [], '500': [], 'zm': []}}, {'Primary_Organics': {'sfc': [], '500': [], 'zm': []}}, {'ROy': {'sfc': [], '500': [], 'zm': []}}, {'Secondary_Organic_Aerosols': {'sfc': [], '500': [], 'zm': []}}, {'Secondary_Organics': {'sfc': [], '500': [], 'zm': []}}, {'Sulfur': {'sfc': [], '500': [], 'zm': []}}]

but when parallelization is off (n_cores: 1), the results variable comes back as:

{'Sulfur': {'sfc': [], '500': [], 'zm': []}}
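To illustrate why the single-dict shape triggers the reported AttributeError, here is a minimal sketch. The comprehension in the traceback is truncated, so the `collect_sfc` helper below is a hypothetical completion of that pattern, not the actual GCPy code. Iterating over a dict yields its string keys, so `result.keys()` ends up being called on a `str`.

```python
# Shape of "results" when plots are generated sequentially (n_cores: 1)
results_sequential = {"Sulfur": {"sfc": [], "500": [], "zm": []}}

# Shape of "results" when plots are generated in parallel (n_cores: -1)
results_parallel = [
    {"Aerosols": {"sfc": [], "500": [], "zm": []}},
    {"Sulfur": {"sfc": [], "500": [], "zm": []}},
]


def collect_sfc(results):
    """Hypothetical stand-in for the dict comprehension in the traceback."""
    return {
        list(result.keys())[0]: result[list(result.keys())[0]]["sfc"]
        for result in results
    }


# A list of one-key dicts works as intended.
dict_sfc = collect_sfc(results_parallel)

# A bare dict iterates over its keys (strings), which have no .keys() method.
try:
    collect_sfc(results_sequential)
    error_message = None
except AttributeError as err:
    error_message = str(err)

print(dict_sfc)
print(error_message)
```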

I think the solution is to make results a list and then append the output of the createplots function to the list when parallelization is off. I'll implement a fix.
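The proposed change could be sketched roughly as follows. This is an illustration under assumptions rather than the code merged in #287: `run_createplots` and `fake_createplots` are hypothetical names, and the real `createplots` routines take their inputs from the enclosing benchmark function.

```python
def run_createplots(createplots, catdict, n_job=1):
    """Return one result dict per category, in parallel or sequentially."""
    if n_job != 1:
        # Imported lazily so the sequential path needs no extra dependency.
        from joblib import Parallel, delayed

        return Parallel(n_jobs=n_job)(
            delayed(createplots)(filecat) for filecat in catdict
        )

    # Sequential case: collect every result instead of overwriting
    # "results" on each pass through the loop.
    results = []
    for filecat in catdict:
        results.append(createplots(filecat))
    return results


def fake_createplots(filecat):
    """Hypothetical stand-in that mimics the return shape shown above."""
    return {filecat: {"sfc": [], "500": [], "zm": []}}


# Hypothetical category dictionary; only its keys matter here.
catdict = {"Aerosols": None, "Sulfur": None}
results = run_createplots(fake_createplots, catdict, n_job=1)
print(results)
```

With this shape, the sequential branch returns a list of one-key dicts, the same structure that the parallel branch produces, so downstream code that loops over `results` sees the same type either way.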

@lizziel

This commit fixes the issue reported by @lizziel in #285. The "results" variable was being overwritten instead of appended to when plots are generated sequentially (i.e. with "n_cores: 1" in the YAML input).

gcpy/benchmark/modules/run_1yr_*_fullchem.py
gcpy/benchmark_funcs.py
gcpy/plot/compare_*.py
- For the case when parallelization is off:
  1. Declare "results" as an empty list
  2. Append the output of the routine being called into "results"

This will prevent a dictionary key error as described in #285.

Signed-off-by: Bob Yantosca <[email protected]>

yantosca · 2024-01-25T21:21:38Z

Closed by #287

yantosca · 2024-01-26T22:39:15Z

We can close this issue now because #287 has been merged. This problem is now fixed.

lizziel added the "category: Bug" label (Something isn't working) Jan 25, 2024

yantosca mentioned this issue Jan 25, 2024

Fix dictionary key error when benchmark plots are generated sequentially #287

Merged

1 task

yantosca linked a pull request Jan 25, 2024 that will close this issue

Fix dictionary key error when benchmark plots are generated sequentially #287


yantosca self-assigned this Jan 25, 2024

yantosca added the "topic: Benchmark Plots and Tables" label (Issues pertaining to generating plots/tables from benchmark output) Jan 25, 2024

yantosca added this to the 1.4.2 milestone Jan 25, 2024

yantosca closed this as completed Jan 26, 2024
