Benchmarks #5

Status: Open (wants to merge 10 commits into base: main)

asinghvi17
Collaborator

Continuation of #4 but as a local branch.

I'm still getting that FileNotFound error from Python, specifically after I've loaded the datasets a couple of times. Curious whether you can replicate it.

@asinghvi17
Collaborator Author

Current output:

  "random access" => 2-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "Julia" => Trial(31.736 s)
          "Python" => Trial(32.366 s)
  "single chunk read" => 2-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "Julia" => Trial(41.416 ms)
          "Python" => Trial(49.045 ms)
  "hundred chunk read contiguous" => 2-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "Julia" => Trial(25.751 s)
          "Python" => Trial(26.807 s)

So things seem to be more or less in sync. I still have to run pure Python benchmarks, probably tomorrow, to compare.
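For the pure Python side, a minimal timing harness could look roughly like the sketch below. It uses only the standard library; `read_chunk` is a hypothetical stand-in for the real xarray/zarr read, and `bench` is an illustrative helper, not something from this PR.

```python
import statistics
import timeit


def read_chunk():
    # Hypothetical stand-in workload; swap in the real xarray/zarr read here.
    return sum(range(10_000))


def bench(fn, number=5, repeat=3):
    # timeit.repeat returns the TOTAL time for `number` calls per repeat,
    # so divide by `number` to get a per-call figure.
    totals = timeit.repeat(fn, number=number, repeat=repeat)
    per_call = [total / number for total in totals]
    return {"min": min(per_call), "median": statistics.median(per_call)}


result = bench(read_chunk)
print(f"min {result['min']:.6f} s, median {result['median']:.6f} s per call")
```

The minimum is usually the figure to compare against a BenchmarkTools minimum, since both are less sensitive to noise from other processes. For remote reads, caching between repeats can make later runs look much faster than the first.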

@felixcremer

I can't test it because I run into an SSL error when trying to open it from the Python side:

julia> py_array = YAXArrays.open_dataset(xr.open_zarr(data_url, decode_times=false))
ERROR: Python: RuntimeError: SSL is not supported.
Python stacktrace:
 [1] _get_ssl_context
   @ aiohttp.connector ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/aiohttp/connector.py:1015
 [2] _create_direct_connection
   @ aiohttp.connector ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/aiohttp/connector.py:1285
 [3] _create_connection
   @ aiohttp.connector ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/aiohttp/connector.py:975
 [4] connect
   @ aiohttp.connector ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/aiohttp/connector.py:564
 [5] _request
   @ aiohttp.client ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/aiohttp/client.py:657
 [6] __aenter__
   @ aiohttp.client ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/aiohttp/client.py:1353
 [7] _cat_file
   @ fsspec.implementations.http ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/fsspec/implementations/http.py:234
 [8] wait_for
   @ asyncio.tasks ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/asyncio/tasks.py:520
 [9] _run_coro
   @ fsspec.asyn ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/fsspec/asyn.py:245
 [10] _cat
   @ fsspec.asyn ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/fsspec/asyn.py:461
 [11] _runner
   @ fsspec.asyn ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/fsspec/asyn.py:56
 [12] sync
   @ fsspec.asyn ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/fsspec/asyn.py:103
 [13] wrapper
   @ fsspec.asyn ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/fsspec/asyn.py:118
 [14] __getitem__
   @ fsspec.mapping ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/fsspec/mapping.py:155
 [15] __getitem__
   @ zarr.storage ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/zarr/storage.py:1446
 [16] __init__
   @ zarr.storage ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/zarr/storage.py:3046
 [17] open_consolidated
   @ zarr.convenience ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/zarr/convenience.py:1360
 [18] _get_open_params
   @ xarray.backends.zarr ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/xarray/backends/zarr.py:1313
 [19] open_group
   @ xarray.backends.zarr ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/xarray/backends/zarr.py:483
 [20] open_dataset
   @ xarray.backends.zarr ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/xarray/backends/zarr.py:1173
 [21] open_dataset
   @ xarray.backends.api ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/xarray/backends/api.py:588
 [22] open_zarr
   @ xarray.backends.zarr ~/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env/lib/python3.12/site-packages/xarray/backends/zarr.py:1103
Stacktrace:
 [1] pythrow()
   @ PythonCall.Core ~/.julia/packages/PythonCall/Nr75f/src/Core/err.jl:92
 [2] errcheck
   @ ~/.julia/packages/PythonCall/Nr75f/src/Core/err.jl:10 [inlined]
 [3] pycallargs
   @ ~/.julia/packages/PythonCall/Nr75f/src/Core/builtins.jl:222 [inlined]
 [4] pycall(f::Py, args::String; kwargs::@Kwargs{decode_times::Bool})
   @ PythonCall.Core ~/.julia/packages/PythonCall/Nr75f/src/Core/builtins.jl:237
 [5] pycall
   @ ~/.julia/packages/PythonCall/Nr75f/src/Core/builtins.jl:233 [inlined]
 [6] #_#11
   @ ~/.julia/packages/PythonCall/Nr75f/src/Core/Py.jl:357 [inlined]
 [7] top-level scope
   @ REPL[49]:1

@asinghvi17
Collaborator Author

See the comment at the top of the file: it looks like Python's SSL must be loaded first.

Maybe there's hard (static) linking going on instead of dynamic linking during the build process for Python? That's the only thing I can think of...
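One way to narrow this down is to check whether the standard `ssl` module loads at all in the interpreter that PythonCall drives. The sketch below is a diagnostic only; `ssl_status` is a hypothetical helper name. As far as I can tell, aiohttp raises "SSL is not supported" when its own `import ssl` failed, but treat that as an assumption rather than a confirmed reading of its source.

```python
import importlib


def ssl_status():
    """Report whether the stdlib ssl module loads in this interpreter."""
    try:
        ssl = importlib.import_module("ssl")
    except ImportError as err:
        # Assumption: this is the situation in which aiohttp ends up
        # reporting "SSL is not supported."
        return {"available": False, "error": str(err)}
    return {"available": True, "openssl": ssl.OPENSSL_VERSION}


status = ssl_status()
print(status)
```

Running this in the same session, after the Julia packages have been loaded, would show whether the import itself fails and, if it succeeds, which OpenSSL build Python actually picked up.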

@felixcremer

I did load PythonCall as the first package, so I am not sure what is going on.

@asinghvi17
Collaborator Author

Huh, interesting. What system are you on?

@asinghvi17
Collaborator Author

And what's the output of:

]st -m OpenSSL OpenSSL_jll
]conda run conda list

?

@felixcremer

System:

julia> versioninfo()
Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 16 × AMD Ryzen 7 PRO 5850U with Radeon Graphics
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 10 default, 0 interactive, 5 GC (on 16 virtual cores)
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 10

Packages:

(benchmarks) pkg> st -m OpenSSL OpenSSL_jll
Status `~/.julia/dev/PyYAXArrays/benchmarks/Manifest.toml`
  [4d8831e6] OpenSSL v1.4.3
  [458c3c95] OpenSSL_jll v3.0.14+0

(benchmarks) pkg> conda run conda list
List of packages in environment: "/home/fcremer/.julia/dev/PyYAXArrays/benchmarks/.CondaPkg/env"

  Name                              Version       Build                Channel    
────────────────────────────────────────────────────────────────────────────────────
  _libgcc_mutex                     0.1           conda_forge          conda-forge
  _openmp_mutex                     4.5           2_gnu                conda-forge
  aiohappyeyeballs                  2.4.0         pyhd8ed1ab_0         conda-forge
  aiohttp                           3.10.5        py312h41a817b_0      conda-forge
  aiosignal                         1.3.1         pyhd8ed1ab_0         conda-forge
  asciitree                         0.3.3         py_2                 conda-forge
  attrs                             24.2.0        pyh71513ae_0         conda-forge
  aws-c-auth                        0.7.25        h15d0e8c_6           conda-forge
  aws-c-cal                         0.7.3         h8dac057_2           conda-forge
  aws-c-common                      0.9.27        h4bc722e_0           conda-forge
  aws-c-compression                 0.2.18        h038f3f9_10          conda-forge
  aws-c-event-stream                0.4.3         h570d160_0           conda-forge
  aws-c-http                        0.8.7         ha1f794c_4           conda-forge
  aws-c-io                          0.14.18       h0040ed1_5           conda-forge
  aws-c-mqtt                        0.10.4        hc14a930_17          conda-forge
  aws-c-s3                          0.6.4         h558cea2_8           conda-forge
  aws-c-sdkutils                    0.1.19        h038f3f9_2           conda-forge
  aws-checksums                     0.1.18        h038f3f9_10          conda-forge
  aws-crt-cpp                       0.27.5        h6e4e78f_8           conda-forge
  aws-sdk-cpp                       1.11.379      hce093eb_4           conda-forge
  azure-core-cpp                    1.13.0        h935415a_0           conda-forge
  azure-identity-cpp                1.8.0         hd126650_2           conda-forge
  azure-storage-blobs-cpp           12.12.0       hd2e3451_0           conda-forge
  azure-storage-common-cpp          12.7.0        h10ac4d7_1           conda-forge
  azure-storage-files-datalake-cpp  12.11.0       h325d260_1           conda-forge
  bokeh                             3.5.1         pyhd8ed1ab_0         conda-forge
  brotli-python                     1.1.0         py312h30efb56_1      conda-forge
  bzip2                             1.0.8         h4bc722e_7           conda-forge
  c-ares                            1.33.0        ha66036c_0           conda-forge
  ca-certificates                   2024.7.4      hbcca054_0           conda-forge
  certifi                           2024.7.4      pyhd8ed1ab_0         conda-forge
  cffi                              1.17.0        py312h1671c18_0      conda-forge
  charset-normalizer                3.3.2         pyhd8ed1ab_0         conda-forge
  click                             8.1.7         unix_pyh707e725_0    conda-forge
  cloudpickle                       3.0.0         pyhd8ed1ab_0         conda-forge
  contourpy                         1.2.1         py312h8572e83_0      conda-forge
  cytoolz                           0.12.3        py312h98912ed_0      conda-forge
  dask                              2024.8.1      pyhd8ed1ab_0         conda-forge
  dask-core                         2024.8.1      pyhd8ed1ab_0         conda-forge
  dask-expr                         1.1.11        pyhd8ed1ab_0         conda-forge
  distributed                       2024.8.1      pyhd8ed1ab_0         conda-forge
  fasteners                         0.17.3        pyhd8ed1ab_0         conda-forge
  freetype                          2.12.1        h267a509_2           conda-forge
  frozenlist                        1.4.1         py312h98912ed_0      conda-forge
  fsspec                            2024.6.1      pyhff2d567_0         conda-forge
  gflags                            2.2.2         he1b5a44_1004        conda-forge
  glog                              0.7.1         hbabe93e_0           conda-forge
  h2                                4.1.0         pyhd8ed1ab_0         conda-forge
  hpack                             4.0.0         pyh9f0ad1d_0         conda-forge
  hyperframe                        6.0.1         pyhd8ed1ab_0         conda-forge
  icu                               75.1          he02047a_0           conda-forge
  idna                              3.7           pyhd8ed1ab_0         conda-forge
  importlib-metadata                8.4.0         pyha770c72_0         conda-forge
  importlib_metadata                8.4.0         hd8ed1ab_0           conda-forge
  jinja2                            3.1.4         pyhd8ed1ab_0         conda-forge
  keyutils                          1.6.1         h166bdaf_0           conda-forge
  krb5                              1.21.3        h659f571_0           conda-forge
  lcms2                             2.16          hb7c19ff_0           conda-forge
  ld_impl_linux-64                  2.40          hf3520f5_7           conda-forge
  lerc                              4.0.0         h27087fc_0           conda-forge
  libabseil                         20240116.2    cxx17_he02047a_1     conda-forge
  libarrow                          17.0.0        h8756180_8_cpu       conda-forge
  libarrow-acero                    17.0.0        he02047a_8_cpu       conda-forge
  libarrow-dataset                  17.0.0        he02047a_8_cpu       conda-forge
  libarrow-substrait                17.0.0        hc9a23c6_8_cpu       conda-forge
  libblas                           3.9.0         23_linux64_openblas  conda-forge
  libbrotlicommon                   1.1.0         hd590300_1           conda-forge
  libbrotlidec                      1.1.0         hd590300_1           conda-forge
  libbrotlienc                      1.1.0         hd590300_1           conda-forge
  libcblas                          3.9.0         23_linux64_openblas  conda-forge
  libcrc32c                         1.1.2         h9c3ff4c_0           conda-forge
  libcurl                           8.9.1         hdb1bdb2_0           conda-forge
  libdeflate                        1.21          h4bc722e_0           conda-forge
  libedit                           3.1.20191231  he28a2e2_2           conda-forge
  libev                             4.33          hd590300_2           conda-forge
  libevent                          2.1.12        hf998b51_1           conda-forge
  libexpat                          2.6.2         h59595ed_0           conda-forge
  libffi                            3.4.2         h7f98852_5           conda-forge
  libgcc-ng                         12.4.0        h77fa898_0           conda-forge
  libgfortran-ng                    13.2.0        h69a702a_0           conda-forge
  libgfortran5                      13.2.0        ha4646dd_0           conda-forge
  libgomp                           12.4.0        h77fa898_0           conda-forge
  libgoogle-cloud                   2.28.0        h26d7fe4_0           conda-forge
  libgoogle-cloud-storage           2.28.0        ha262f82_0           conda-forge
  libgrpc                           1.62.2        h15f2491_0           conda-forge
  libiconv                          1.17          hd590300_2           conda-forge
  libjpeg-turbo                     3.0.0         hd590300_1           conda-forge
  liblapack                         3.9.0         23_linux64_openblas  conda-forge
  libnghttp2                        1.58.0        h47da74e_1           conda-forge
  libnsl                            2.0.1         hd590300_0           conda-forge
  libopenblas                       0.3.27        pthreads_hac2b453_1  conda-forge
  libparquet                        17.0.0        haa1307c_8_cpu       conda-forge
  libpng                            1.6.43        h2797004_0           conda-forge
  libprotobuf                       4.25.3        h08a7969_0           conda-forge
  libre2-11                         2023.09.01    h5a48ba9_2           conda-forge
  libsqlite                         3.46.0        hde9e2c9_0           conda-forge
  libssh2                           1.11.0        h0841786_0           conda-forge
  libstdcxx-ng                      12.4.0        hc0a3c3a_0           conda-forge
  libthrift                         0.20.0        hb90f79a_0           conda-forge
  libtiff                           4.6.0         h46a8edc_4           conda-forge
  libutf8proc                       2.8.0         h166bdaf_0           conda-forge
  libuuid                           2.38.1        h0b41bf4_0           conda-forge
  libwebp-base                      1.4.0         hd590300_0           conda-forge
  libxcb                            1.16          hd590300_0           conda-forge
  libxcrypt                         4.4.36        hd590300_1           conda-forge
  libxml2                           2.12.7        he7c6b58_4           conda-forge
  libzlib                           1.3.1         h4ab18f5_1           conda-forge
  locket                            1.0.0         pyhd8ed1ab_0         conda-forge
  lz4                               4.3.3         py312h03f37cb_0      conda-forge
  lz4-c                             1.9.4         hcb278e6_0           conda-forge
  markupsafe                        2.1.5         py312h98912ed_0      conda-forge
  msgpack-python                    1.0.8         py312h2492b07_0      conda-forge
  multidict                         6.0.5         py312h98912ed_0      conda-forge
  ncurses                           6.5           h59595ed_0           conda-forge
  numcodecs                         0.13.0        py312h1df14c2_0      conda-forge
  numpy                             2.1.0         py312h1103770_0      conda-forge
  openjpeg                          2.5.2         h488ebb8_0           conda-forge
  openssl                           3.3.1         h4bc722e_2           conda-forge
  orc                               2.0.2         h669347b_0           conda-forge
  packaging                         24.1          pyhd8ed1ab_0         conda-forge
  pandas                            2.2.2         py312h1d6d2e6_1      conda-forge
  partd                             1.4.2         pyhd8ed1ab_0         conda-forge
  pillow                            10.4.0        py312h287a98d_0      conda-forge
  pip                               24.2          pyhd8ed1ab_0         conda-forge
  psutil                            6.0.0         py312h9a8786e_0      conda-forge
  pthread-stubs                     0.4           h36c2ea0_1001        conda-forge
  pyarrow                           17.0.0        py312h9cebb41_1      conda-forge
  pyarrow-core                      17.0.0        py312h9cafe31_1_cpu  conda-forge
  pyarrow-hotfix                    0.6           pyhd8ed1ab_0         conda-forge
  pycparser                         2.22          pyhd8ed1ab_0         conda-forge
  pysocks                           1.7.1         pyha2e5f31_6         conda-forge
  python                            3.12.5        h2ad013b_0_cpython   conda-forge
  python-dateutil                   2.9.0         pyhd8ed1ab_0         conda-forge
  python-tzdata                     2024.1        pyhd8ed1ab_0         conda-forge
  python_abi                        3.12          5_cp312              conda-forge
  pytz                              2024.1        pyhd8ed1ab_0         conda-forge
  pyyaml                            6.0.2         py312h41a817b_0      conda-forge
  re2                               2023.09.01    h7f4b329_2           conda-forge
  readline                          8.2           h8228510_1           conda-forge
  requests                          2.32.3        pyhd8ed1ab_0         conda-forge
  s2n                               1.5.0         h3400bea_0           conda-forge
  setuptools                        72.2.0        pyhd8ed1ab_0         conda-forge
  six                               1.16.0        pyh6c4a22f_0         conda-forge
  snappy                            1.2.1         ha2e4443_0           conda-forge
  sortedcontainers                  2.4.0         pyhd8ed1ab_0         conda-forge
  tblib                             3.0.0         pyhd8ed1ab_0         conda-forge
  tk                                8.6.13        noxft_h4845f30_101   conda-forge
  toolz                             0.12.1        pyhd8ed1ab_0         conda-forge
  tornado                           6.4.1         py312h9a8786e_0      conda-forge
  tzdata                            2024a         h0c530f3_0           conda-forge
  urllib3                           2.2.2         pyhd8ed1ab_1         conda-forge
  wheel                             0.44.0        pyhd8ed1ab_0         conda-forge
  xarray                            2024.7.0      pyhd8ed1ab_0         conda-forge
  xorg-libxau                       1.0.11        hd590300_0           conda-forge
  xorg-libxdmcp                     1.1.3         h7f98852_0           conda-forge
  xyzservices                       2024.6.0      pyhd8ed1ab_0         conda-forge
  xz                                5.2.6         h166bdaf_0           conda-forge
  yaml                              0.2.5         h7f98852_2           conda-forge
  yarl                              1.9.4         py312h98912ed_0      conda-forge
  zarr                              2.18.2        pyhd8ed1ab_0         conda-forge
  zict                              3.0.0         pyhd8ed1ab_0         conda-forge
  zipp                              3.20.0        pyhd8ed1ab_0         conda-forge
  zstandard                         0.23.0        py312h3483029_0      conda-forge
  zstd                              1.5.6         ha6fb4c9_0           conda-forge

@asinghvi17
Collaborator Author

asinghvi17 commented Aug 27, 2024

OpenSSL is 3.3.1 in Python but 3.0.14 in Julia, so maybe that plays a role...

You can also set ssl=false in the xarray function, which should work.
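To make the mismatch explicit, here is a small sketch that compares the two version strings from the listings above. `parse_version` and `same_minor_series` are hypothetical helpers. Whether a differing minor series actually causes the SSL error is still an open question; this only flags the difference.

```python
import re


def parse_version(text):
    """Extract (major, minor, patch) from strings like '3.3.1' or '3.0.14+0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def same_minor_series(a, b):
    # Compare only major.minor, ignoring the patch level and build suffix.
    return parse_version(a)[:2] == parse_version(b)[:2]


python_side = "3.3.1"    # openssl in the conda list above
julia_side = "3.0.14+0"  # OpenSSL_jll in the Pkg status above
print(same_minor_series(python_side, julia_side))  # prints False
```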

@felixcremer

If we want to use this for more than benchmarking, we would need to find a more robust setup.

@asinghvi17
Collaborator Author

Yeah, I'm not sure what we can do beyond very strict instructions, though. This may not be a sustainable workflow, but I expect it could serve pretty well as a semi-experimental backend, or as a fallback when people need some feature that is in Xarray but not in Julia.

@asinghvi17
Collaborator Author

asinghvi17 commented Aug 27, 2024

Currently it looks like there's a lot of type instability deep in the Py call stack, as well as some runtime dispatches in readblock!. This shows up especially in a hundred-chunk spatially contiguous read. Will keep looking into it.

@asinghvi17
Collaborator Author

asinghvi17 commented Aug 28, 2024

[Image: download-1]
First benchmarking code; I will post more tomorrow. I'm not 100% sure about the Python code though. It seems suspicious, but that's what timeit says it is.

Note that this code is not runnable on your local machine since it needs a CairoMakie patch...

@asinghvi17
Collaborator Author

FSSpec is from https://github.com/asinghvi17/FSSpec.jl. If you have fsspec in your global CondaPkg env, this already works!
