Python upgrade 39 discussion #427

Merged: 28 commits, Jun 23, 2024
56e4788 change version to 3.9 (zain-sohail, Apr 11, 2024)
3d5ffb4 update lock file (zain-sohail, Apr 11, 2024)
7b6062c add dask dataframe dep (zain-sohail, Apr 11, 2024)
084a6b0 remove uncessary test file (zain-sohail, Apr 11, 2024)
b49eded get version from toml version (zain-sohail, Apr 11, 2024)
e6f3943 fix version (zain-sohail, Apr 11, 2024)
2b4c93e fix mypy error (zain-sohail, Apr 11, 2024)
e9fa5f2 fix old pre commit file (zain-sohail, Apr 11, 2024)
65e8e09 fix tests (zain-sohail, Apr 11, 2024)
04a580a fix tests (zain-sohail, Apr 11, 2024)
510db39 merge main and update lock file (zain-sohail, Jun 20, 2024)
b456cba fix some linting errors (zain-sohail, Jun 20, 2024)
9d61149 fix issues with matplotlib (rettigl, Jun 21, 2024)
7cdf158 limit dask version and update lockfile and tests (rettigl, Jun 21, 2024)
87c4bf1 Merge branch 'python-upgrade-39' of github.com:OpenCOMPES/sed into py… (rettigl, Jun 21, 2024)
502f4d7 fruther restrict dask due to never finishing tests (rettigl, Jun 21, 2024)
4065a37 revert tests (rettigl, Jun 21, 2024)
24a353f update annotations (rettigl, Jun 21, 2024)
867a711 more typing fixes (rettigl, Jun 21, 2024)
6a4c78a more type fixes (rettigl, Jun 21, 2024)
843b16b more typing fixes (rettigl, Jun 21, 2024)
1284b36 more type fixes (rettigl, Jun 21, 2024)
3e02f48 forgotten changes (rettigl, Jun 21, 2024)
0bc1335 allow testing_multiversion.yml to run on v1 branch (zain-sohail, Jun 22, 2024)
b0644e8 try limiting python 3.12 version (rettigl, Jun 22, 2024)
030395c exclude python 3.11.9 (rettigl, Jun 22, 2024)
0f28b8d update poetry version (rettigl, Jun 22, 2024)
59d76b8 Merge pull request #428 from OpenCOMPES/py39_update_annotations (rettigl, Jun 22, 2024)
4 changes: 2 additions & 2 deletions .github/workflows/benchmark.yml
@@ -28,8 +8,8 @@ jobs:
- name: "Setup Python, Poetry and Dependencies"
uses: packetcoders/action-setup-cache-python-poetry@main
with:
python-version: 3.8
poetry-version: 1.2.2
python-version: 3.9
poetry-version: 1.8.3

# Run benchmakrs
- name: Run benchmarks on python 3.8
4 changes: 2 additions & 2 deletions .github/workflows/documentation.yml
@@ -47,8 +8,8 @@ jobs:
- name: "Setup Python, Poetry and Dependencies"
uses: packetcoders/action-setup-cache-python-poetry@main
with:
python-version: 3.8
poetry-version: 1.2.2
python-version: 3.9
poetry-version: 1.8.3

- name: Install notebook dependencies
run: poetry install -E notebook --with docs
4 changes: 2 additions & 2 deletions .github/workflows/linting.yml
@@ -19,8 +8,8 @@ jobs:
- name: "Setup Python, Poetry and Dependencies"
uses: packetcoders/action-setup-cache-python-poetry@main
with:
python-version: 3.8
poetry-version: 1.2.2
python-version: 3.9
poetry-version: 1.8.3

# Linting steps, excute all linters even if one fails
- name: ruff
8 changes: 4 additions & 4 deletions .github/workflows/release.yml
@@ -39,8 +8,8 @@ jobs:
- name: "Setup Python, Poetry and Dependencies"
uses: zain-sohail/action-setup-cache-python-poetry@main
with:
python-version: 3.8
poetry-version: 1.2.2
python-version: 3.9
poetry-version: 1.8.3
working-directory: sed-processor

- name: Change to distribution name in toml file
@@ -82,8 +8,8 @@
- name: "Setup Python, Poetry and Dependencies"
uses: zain-sohail/action-setup-cache-python-poetry@main
with:
python-version: 3.8
poetry-version: 1.2.2
python-version: 3.9
poetry-version: 1.8.3
working-directory: sed-processor

- name: Change to distribution name in toml file
6 changes: 3 additions & 3 deletions .github/workflows/testing_coverage.yml
@@ -23,11 +8,11 @@ jobs:
- name: "Setup Python, Poetry and Dependencies"
uses: packetcoders/action-setup-cache-python-poetry@main
with:
python-version: 3.8
poetry-version: 1.2.2
python-version: 3.9
poetry-version: 1.8.3

# Run pytest with coverage report, saving to xml
- name: Run tests on python 3.8
- name: Run tests on python 3.9
run: |
poetry run pytest --cov --cov-report xml:cobertura.xml --full-trace --show-capture=no -sv -n auto tests/

9 changes: 5 additions & 4 deletions .github/workflows/testing_multiversion.yml
@@ -1,9 +1,10 @@
name: unit tests [Python 3.8|3.9|3.10|3.11]
# Tests for all supported versions [Python 3.9|3.10|3.11|3.12]
name: Unit Tests

on:
workflow_dispatch:
push:
branches: [ main ]
branches: [ main, v1_feature_branch ]
paths-ignore:
pyproject.toml

@@ -12,7 +13,7 @@ jobs:
# Using matrix strategy
strategy:
matrix:
python-version: ["3.8", "3.9", "3.10", "3.11"]
python-version: ["3.9", "3.10", "3.11.8", "3.12.2"]
runs-on: ubuntu-latest
steps:
# Check out repo and set up Python
@@ -25,7 +26,7 @@
uses: packetcoders/action-setup-cache-python-poetry@main
with:
python-version: ${{matrix.python-version}}
poetry-version: 1.2.2
poetry-version: 1.8.3

# Use cached python and dependencies, install poetry
- name: Run tests on python ${{matrix.python-version}}
4 changes: 2 additions & 2 deletions .github/workflows/update_dependencies.yml
@@ -28,8 +8,8 @@ jobs:
- name: "Setup Python, Poetry and Dependencies"
uses: packetcoders/action-setup-cache-python-poetry@main
with:
python-version: 3.8
poetry-version: 1.2.2
python-version: 3.9
poetry-version: 1.8.3

# update poetry lockfile
- name: "Update poetry lock file"
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -27,12 +8,12 @@ repos:
rev: v3.8.2
hooks:
- id: reorder-python-imports
args: [--application-directories, '.:src', --py36-plus]
args: [--application-directories, '.:src', --py39-plus]
- repo: https://github.com/asottile/pyupgrade
rev: v2.37.3
rev: v3.16.0
hooks:
- id: pyupgrade
args: [--py36-plus]
args: [--py39-plus]
- repo: https://github.com/asottile/add-trailing-comma
rev: v2.2.3
hooks:
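The pre-commit change above bumps pyupgrade to v3.16.0 and passes --py39-plus, which lets it rewrite legacy typing aliases into the PEP 585 builtin generics that Python 3.9 supports natively. A rough, hypothetical illustration of the kind of rewrite this enables (the function is made up, not sed code):

```python
# Hypothetical module showing a pyupgrade --py39-plus rewrite:
# typing.List / typing.Dict generics become builtin generics (PEP 585).

# Before:
#   from typing import Dict, List
#   def count_events(names: List[str]) -> Dict[str, int]: ...

# After running `pyupgrade --py39-plus`:
def count_events(names: list[str]) -> dict[str, int]:
    """Count how often each event name occurs."""
    counts: dict[str, int] = {}
    for name in names:
        counts[name] = counts.get(name, 0) + 1
    return counts


print(count_events(["ev1", "ev2", "ev1"]))  # {'ev1': 2, 'ev2': 1}
```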
1,310 changes: 649 additions & 661 deletions poetry.lock

Large diffs are not rendered by default.

7 changes: 3 additions & 4 deletions pyproject.toml
@@ -13,9 +8,9 @@ keywords = ["sed", "mpes", "flash", "arpes"]
license = "MIT"

[tool.poetry.dependencies]
python = ">=3.8, <3.11.9"
python = ">=3.9, <3.12.3, !=3.11.9"
bokeh = ">=2.4.2"
dask = ">=2021.12.0"
dask = {version = ">=2021.12.0, <2023.12.1"}
fastdtw = ">=0.3.4"
h5py = ">=3.6.0"
ipympl = ">=0.9.1"
@@ -43,7 +8,6 @@ ipykernel = {version = ">=6.9.1", optional = true}
jupyterlab = {version = "^3.4.0", optional = true}
jupyterlab-h5web = {version = "^8.0.0", extras = ["full"]}


[tool.poetry.extras]
notebook = ["jupyter", "ipykernel", "jupyterlab", "jupyterlab-h5web"]
all = ["notebook"]
@@ -59,7 +58,7 @@ types-pyyaml = ">=6.0.12.12"
types-requests = ">=2.31.0.9"
pyfakefs = ">=5.3.0"
requests-mock = "^1.11.0"

pre-commit = ">=3.0.0"

[tool.poetry.group.docs]
optional = true
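For reference, the new interpreter and dask constraints in pyproject.toml can be sanity-checked with the packaging library's SpecifierSet. This is only an illustrative sketch; packaging is not declared as a dependency in this project.

```python
# Illustrative check of the new version constraints with the `packaging`
# library (assumed to be available; not part of this project's dependencies).
from packaging.specifiers import SpecifierSet

python_spec = SpecifierSet(">=3.9, <3.12.3, !=3.11.9")
dask_spec = SpecifierSet(">=2021.12.0, <2023.12.1")

print("3.11.9" in python_spec)   # False: this patch release is excluded
print("3.12.2" in python_spec)   # True: matches the 3.12.2 pin in the test matrix
print("2023.12.1" in dask_spec)  # False: the upper bound is exclusive
```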
1 change: 0 additions & 1 deletion sed/__init__.py
@@ -3,5 +3,4 @@
"""
from .core.processor import SedProcessor

__version__ = "0.1.0"
__all__ = ["SedProcessor"]
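With the hard-coded __version__ removed (see the "get version from toml version" commit), one common pattern is to resolve the version from the installed distribution metadata instead. The sketch below is an assumption, not code shown in this diff; the distribution name "sed-processor" is taken from the release workflow above.

```python
# Sketch: derive __version__ from installed package metadata instead of a
# hard-coded string. The distribution name "sed-processor" is an assumption
# based on the release workflow; sed may wire this up differently.
from importlib.metadata import PackageNotFoundError, version

try:
    __version__ = version("sed-processor")
except PackageNotFoundError:
    # e.g. running from a source checkout that was never installed
    __version__ = "0.0.0+unknown"
```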
69 changes: 28 additions & 41 deletions sed/binning/binning.py
@@ -1,12 +1,11 @@
"""This module contains the binning functions of the sed.binning module

"""
from __future__ import annotations

import gc
from collections.abc import Sequence
from functools import reduce
from typing import cast
from typing import List
from typing import Sequence
from typing import Tuple
from typing import Union

import dask.dataframe
@@ -26,27 +25,21 @@


def bin_partition(
part: Union[dask.dataframe.DataFrame, pd.DataFrame],
bins: Union[
int,
dict,
Sequence[int],
Sequence[np.ndarray],
Sequence[tuple],
] = 100,
part: dask.dataframe.DataFrame | pd.DataFrame,
bins: int | dict | Sequence[int] | Sequence[np.ndarray] | Sequence[tuple] = 100,
axes: Sequence[str] = None,
ranges: Sequence[Tuple[float, float]] = None,
ranges: Sequence[tuple[float, float]] = None,
hist_mode: str = "numba",
jitter: Union[list, dict] = None,
jitter: list | dict = None,
return_edges: bool = False,
skip_test: bool = False,
) -> Union[np.ndarray, Tuple[np.ndarray, list]]:
) -> np.ndarray | tuple[np.ndarray, list]:
"""Compute the n-dimensional histogram of a single dataframe partition.

Args:
part (Union[dask.dataframe.DataFrame, pd.DataFrame]): dataframe on which
part (dask.dataframe.DataFrame | pd.DataFrame): dataframe on which
to perform the histogram. Usually a partition of a dask DataFrame.
bins (int, dict, Sequence[int], Sequence[np.ndarray], Sequence[tuple], optional):
bins (int | dict | Sequence[int] | Sequence[np.ndarray] | Sequence[tuple], optional):
Definition of the bins. Can be any of the following cases:

- an integer describing the number of bins for all dimensions. This
@@ -70,7 +63,7 @@ def bin_partition(
the order of the dimensions in the resulting array. Only not required if
bins are provided as dictionary containing the axis names.
Defaults to None.
ranges (Sequence[Tuple[float, float]], optional): Sequence of tuples containing
ranges (Sequence[tuple[float, float]], optional): Sequence of tuples containing
the start and end point of the binning range. Required if bins given as
int or Sequence[int]. Defaults to None.
hist_mode (str, optional): Histogram calculation method.
@@ -79,7 +72,7 @@ def bin_partition(
- "numba" use a numba powered similar method.

Defaults to "numba".
jitter (Union[list, dict], optional): a list of the axes on which to apply
jitter (list | dict, optional): a list of the axes on which to apply
jittering. To specify the jitter amplitude or method (normal or uniform
noise) a dictionary can be passed. This should look like
jitter={'axis':{'amplitude':0.5,'mode':'uniform'}}.
@@ -102,8 +95,8 @@ def bin_partition(
present in the dataframe

Returns:
Union[np.ndarray, Tuple[np.ndarray, list]]: 2-element tuple returned only when
returnEdges is True. Otherwise only hist is returned.
np.ndarray | tuple[np.ndarray: 2-element tuple returned only when
return_edges is True. Otherwise only hist is returned.

- **hist**: The result of the n-dimensional binning
- **edges**: A list of D arrays describing the bin edges for each dimension.
Expand All @@ -122,17 +115,17 @@ def bin_partition(
raise TypeError(
"axes needs to be of type 'List[str]' if tests are skipped!",
)
bins = cast(Union[List[int], List[np.ndarray]], bins)
axes = cast(List[str], axes)
ranges = cast(List[Tuple[float, float]], ranges)
bins = cast(Union[list[int], list[np.ndarray]], bins)
axes = cast(list[str], axes)
ranges = cast(list[tuple[float, float]], ranges)

# convert bin centers to bin edges:
if all(isinstance(x, np.ndarray) for x in bins):
bins = cast(List[np.ndarray], bins)
bins = cast(list[np.ndarray], bins)
for i, bin_centers in enumerate(bins):
bins[i] = bin_centers_to_bin_edges(bin_centers)
else:
bins = cast(List[int], bins)
bins = cast(list[int], bins)
# shift ranges by half a bin size to align the bin centers to the given ranges,
# as the histogram functions interprete the ranges as limits for the edges.
for i, nbins in enumerate(bins):
@@ -203,18 +196,12 @@

def bin_dataframe(
df: dask.dataframe.DataFrame,
bins: Union[
int,
dict,
Sequence[int],
Sequence[np.ndarray],
Sequence[tuple],
] = 100,
bins: int | dict | Sequence[int] | Sequence[np.ndarray] | Sequence[tuple] = 100,
axes: Sequence[str] = None,
ranges: Sequence[Tuple[float, float]] = None,
ranges: Sequence[tuple[float, float]] = None,
hist_mode: str = "numba",
mode: str = "fast",
jitter: Union[list, dict] = None,
jitter: list | dict = None,
pbar: bool = True,
n_cores: int = N_CPU - 1,
threads_per_worker: int = 4,
@@ -228,7 +215,7 @@ def bin_dataframe(
Args:
df (dask.dataframe.DataFrame): a dask.DataFrame on which to perform the
histogram.
bins (int, dict, Sequence[int], Sequence[np.ndarray], Sequence[tuple], optional):
bins (int | dict | Sequence[int] | Sequence[np.ndarray] | Sequence[tuple], optional):
Definition of the bins. Can be any of the following cases:

- an integer describing the number of bins for all dimensions. This
@@ -252,7 +239,7 @@ def bin_dataframe(
the order of the dimensions in the resulting array. Only not required if
bins are provided as dictionary containing the axis names.
Defaults to None.
ranges (Sequence[Tuple[float, float]], optional): Sequence of tuples containing
ranges (Sequence[tuple[float, float]], optional): Sequence of tuples containing
the start and end point of the binning range. Required if bins given as
int or Sequence[int]. Defaults to None.
hist_mode (str, optional): Histogram calculation method.
@@ -269,7 +256,7 @@ def bin_dataframe(
- 'legacy': Single-core recombination of partition results.

Defaults to "fast".
jitter (Union[list, dict], optional): a list of the axes on which to apply
jitter (list | dict, optional): a list of the axes on which to apply
jittering. To specify the jitter amplitude or method (normal or uniform
noise) a dictionary can be passed. This should look like
jitter={'axis':{'amplitude':0.5,'mode':'uniform'}}.
@@ -304,14 +291,14 @@ def bin_dataframe(
# create the coordinate axes for the xarray output
# if provided as array, they are interpreted as bin centers
if isinstance(bins[0], np.ndarray):
bins = cast(List[np.ndarray], bins)
bins = cast(list[np.ndarray], bins)
coords = dict(zip(axes, bins))
elif ranges is None:
raise ValueError(
"bins is not an array and range is none. this shouldn't happen.",
)
else:
bins = cast(List[int], bins)
bins = cast(list[int], bins)
coords = {
ax: np.linspace(r[0], r[1], n, endpoint=False) for ax, r, n in zip(axes, ranges, bins)
}
@@ -509,7 +496,7 @@ def normalization_histogram_from_timed_dataframe(


def apply_jitter_on_column(
df: Union[dask.dataframe.core.DataFrame, pd.DataFrame],
df: dask.dataframe.core.DataFrame | pd.DataFrame,
amp: float,
col: str,
mode: str = "uniform",
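The annotation changes throughout binning.py follow one pattern: with from __future__ import annotations, signatures can use the PEP 604 X | Y and PEP 585 list[...] / tuple[...] spellings even on Python 3.9, because annotations are no longer evaluated at runtime, and Sequence now comes from collections.abc rather than typing. Expressions that are evaluated, such as the arguments to cast(), keep typing.Union (the | operator between types needs Python 3.10) while switching to the builtin generics that 3.9 does support at runtime. A minimal, hypothetical sketch of the same pattern (not sed API):

```python
# Minimal sketch of the typing pattern adopted above, runnable on Python 3.9.
from __future__ import annotations  # PEP 563: annotations are not evaluated at runtime

from typing import Union, cast

import numpy as np


def normalize_bins(bins: int | list[np.ndarray]) -> list[np.ndarray]:
    """Return bin-edge arrays, accepting either a bin count or center arrays."""
    if isinstance(bins, int):
        # PEP 585 builtin generics such as list[np.ndarray] are valid at
        # runtime from Python 3.9 on, so they may appear inside cast().
        return cast(list[np.ndarray], [np.linspace(0.0, 1.0, bins + 1)])
    # cast() evaluates its first argument, and "list[np.ndarray] | list"
    # would need Python 3.10, so typing.Union stays here (as in the diff).
    bins = cast(Union[list[np.ndarray], list], bins)
    return [np.asarray(b, dtype=float) for b in bins]


print(normalize_bins(4)[0])  # edges of 4 equal bins on [0, 1]
```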