Spelling fixes for main #454

Merged: 3 commits merged on Jul 2, 2024
Changes from 2 commits
405 changes: 405 additions & 0 deletions .cspell/custom-dictionary.txt

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion .github/workflows/benchmark.yml
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ jobs:
- name: Install project dependencies
run: poetry install

# Run benchmakrs
# Run benchmarks
- name: Run benchmarks on python 3.8
run: |
poetry run pytest --full-trace --show-capture=no -sv benchmarks/benchmark_*.py
Expand Down
11 changes: 9 additions & 2 deletions .github/workflows/linting.yml
Original file line number Diff line number Diff line change
Expand Up @@ -22,15 +22,22 @@ jobs:
python-version: 3.8
poetry-version: 1.2.2

# Linting steps, excute all linters even if one fails
# Linting steps, execute all linters even if one fails
- name: ruff
run:
poetry run ruff sed tests
- name: ruff formating
- name: ruff formatting
if: ${{ always() }}
run:
poetry run ruff format --check sed tests
- name: mypy
if: ${{ always() }}
run:
poetry run mypy sed tests
- name: spellcheck
if: ${{ always() }}
uses: streetsidesoftware/cspell-action@v6
with:
check_dot_files: false
incremental_files_only: false
config: './cspell.json'
2 changes: 1 addition & 1 deletion .github/workflows/update_dependencies.yml
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
name: Update depencies in poetry lockfile
name: Update dependencies in poetry lockfile

on:
schedule:
Expand Down
4 changes: 4 additions & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -42,3 +42,7 @@ repos:
rev: 0.6.0
hooks:
- id: nbstripout
- repo: https://github.com/streetsidesoftware/cspell-cli
rev: v6.31.1
hooks:
- id: cspell
4 changes: 2 additions & 2 deletions benchmarks/Binning Benchmarks.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@
"source": [
"# Binning demonstration on locally generated fake data\n",
"In this example, we generate a table with random data simulating a single event dataset.\n",
"We showcase the binning method, first on a simple single table using the bin_partition method and then in the distributed mehthod bin_dataframe, using daks dataframes.\n",
"We showcase the binning method, first on a simple single table using the bin_partition method and then in the distributed method bin_dataframe, using dask dataframes.\n",
"The first method is never really called directly, as it is simply the function called by the bin_dataframe on each partition of the dask dataframe."
]
},
Expand Down Expand Up @@ -200,7 +200,7 @@
"metadata": {},
"outputs": [],
"source": [
"data_path = '../../' # Put in Path to a storage of at least 20 Gbyte free space.\n",
"data_path = '../../' # Put in Path to a storage of at least 20 GByte free space.\n",
"if not os.path.exists(data_path + \"/WSe2.zip\"):\n",
" os.system(f\"curl --output {data_path}/WSe2.zip https://zenodo.org/record/6369728/files/WSe2.zip\")\n",
"if not os.path.isdir(data_path + \"/Scan049_1\") or not os.path.isdir(data_path + \"energycal_2019_01_08/\"):\n",
Expand Down
4 changes: 2 additions & 2 deletions benchmarks/mpes_sed_benchmarks.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,7 @@
"metadata": {},
"outputs": [],
"source": [
"dataPath = '../../' # Put in Path to a storage of at least 20 Gbyte free space.\n",
"dataPath = '../../' # Put in Path to a storage of at least 20 GByte free space.\n",
"if not os.path.exists(dataPath + \"/WSe2.zip\"):\n",
" os.system(f\"curl --output {dataPath}/WSe2.zip https://zenodo.org/record/6369728/files/WSe2.zip\")\n",
"if not os.path.isdir(dataPath + \"/Scan049_1\") or not os.path.isdir(dataPath + \"energycal_2019_01_08/\"):\n",
Expand Down Expand Up @@ -106,7 +106,7 @@
"metadata": {},
"source": [
"## compute distributed binning on the partitioned dask dataframe\n",
"We generated 100 dataframe partiions from the 100 files in the dataset, which we will bin parallelly with the dataframe binning function into a 3D grid"
"We generated 100 dataframe partitions from the 100 files in the dataset, which we will bin parallelly with the dataframe binning function into a 3D grid"
]
},
{
Expand Down
21 changes: 21 additions & 0 deletions cspell.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
{
"version": "0.2",
"ignorePaths": [
"./tests/data/*",
"*.toml",
"Makefile",
"*.bat"
],
"dictionaryDefinitions": [
{
"name": "custom-dictionary",
"path": "./.cspell/custom-dictionary.txt",
"addWords": true
}
],
"dictionaries": [ "custom-dictionary"
],
"words": [],
"ignoreWords": [],
"import": []
}
2 changes: 1 addition & 1 deletion docs/misc/contributing.rst
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ Development Workflow

3. **Write Tests:** If your contribution introduces new features or fixes a bug, add tests to cover your changes.

4. **Run Tests:** To ensure no funtionality is broken, run the tests:
4. **Run Tests:** To ensure no functionality is broken, run the tests:

.. code-block:: bash

Expand Down
2 changes: 1 addition & 1 deletion docs/misc/maintain.rst
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ To create a release, follow these steps:
c. **If you don't see update on PyPI:**

- Visit the GitHub Actions page and monitor the Release workflow (https://github.com/OpenCOMPES/sed/actions/workflows/release.yml).
- Check if errors occured.
- Check if errors occurred.


**Understanding the Release Workflow**
Expand Down
6 changes: 3 additions & 3 deletions docs/sed/config.rst
Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@
Config
===================================================
The config module contains a mechanis to collect configuration parameters from various sources and configuration files, and to combine them in a hierachical manner into a single, consistent configuration dictionary.
The config module contains a mechanism to collect configuration parameters from various sources and configuration files, and to combine them in a hierarchical manner into a single, consistent configuration dictionary.
It will load an (optional) provided config file, or alternatively use a passed python dictionary as initial config dictionary, and subsequently look for the following additional config files to load:

* ``folder_config``: A config file of name :file:`sed_config.yaml` in the current working directory. This is mostly intended to pass calibration parameters of the workflow between different notebook instances.
* ``user_config``: A config file provided by the user, stored as :file:`.sed/config.yaml` in the current user's home directly. This is intended to give a user the option for individual configuration modifications of system settings.
* ``system_config``: A config file provided by the system administrator, stored as :file:`/etc/sed/config.yaml` on Linux-based systems, and :file:`%ALLUSERPROFILE%/sed/config.yaml` on Windows. This should provide all necessary default parameters for using the sed processor with a given setup. For an example for an mpes setup, see :ref:`example_config`
* ``user_config``: A config file provided by the user, stored as :file:`.config/sed/config.yaml` in the current user's home directory. This is intended to give a user the option for individual configuration modifications of system settings.
* ``system_config``: A config file provided by the system administrator, stored as :file:`/etc/sed/config.yaml` on Linux-based systems, and :file:`%ALLUSERSPROFILE%/sed/config.yaml` on Windows. This should provide all necessary default parameters for using the sed processor with a given setup. For an example for an mpes setup, see :ref:`example_config`
* ``default_config``: The default configuration shipped with the package. Typically, all parameters here should be overwritten by any of the other configuration files.

The config mechanism returns the combined dictionary, and reports the loaded configuration files. In order to disable or overwrite any of the configuration files, they can be also given as optional parameters (path to a file, or python dictionary).
Expand Down
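The hierarchical combination described in docs/sed/config.rst above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the helper names `merge_config` and `combine_configs`, the example keys, and the assumed precedence (folder over user over system over default) are all hypothetical.

```python
from copy import deepcopy
from typing import Any, Dict, Optional


def merge_config(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which values from `override` win over `base`.

    Nested dictionaries are merged recursively instead of being replaced.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def combine_configs(*layers: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Combine config layers given from lowest to highest priority.

    Empty or missing layers (None) are skipped.
    """
    combined: Dict[str, Any] = {}
    for layer in layers:
        if layer:
            combined = merge_config(combined, layer)
    return combined


# Hypothetical example layers (keys are made up for illustration).
default_config = {"binning": {"hist_mode": "numba", "threads": 4}}
system_config = {"binning": {"threads": 8}}
user_config = None  # no user config file found
folder_config = {"energy": {"offset": 1.5}}

# Assumed precedence: default < system < user < folder.
config = combine_configs(default_config, system_config, user_config, folder_config)
print(config)
```

Merging recursively keeps defaults for keys that a higher-priority file does not mention, which is what lets the default configuration act as a fallback.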
2 changes: 1 addition & 1 deletion docs/sed/dataset.rst
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ Setting the “use_existing” keyword to False allows to download the data in a
Interrupting extraction has similar behavior to download and just continues from where it stopped.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Or if user deletes the extracted documents, it reextracts from zip file
Or if user deletes the extracted documents, it re-extracts from zip file
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

.. code:: python
Expand Down
14 changes: 7 additions & 7 deletions sed/binning/binning.py
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,7 @@ def bin_partition(
- an integer describing the number of bins for all dimensions. This
requires "ranges" to be defined as well.
- A sequence containing one entry of the following types for each
dimenstion:
dimension:

- an integer describing the number of bins. This requires "ranges"
to be defined as well.
Expand Down Expand Up @@ -83,14 +83,14 @@ def bin_partition(
jittering. To specify the jitter amplitude or method (normal or uniform
noise) a dictionary can be passed. This should look like
jitter={'axis':{'amplitude':0.5,'mode':'uniform'}}.
This example also shows the default behaviour, in case None is
This example also shows the default behavior, in case None is
passed in the dictionary, or jitter is a list of strings.
Warning: this is not the most performing approach. Applying jitter
on the dataframe before calling the binning is much faster.
Defaults to None.
return_edges (bool, optional): If True, returns a list of D arrays
describing the bin edges for each dimension, similar to the
behaviour of ``np.histogramdd``. Defaults to False.
behavior of ``np.histogramdd``. Defaults to False.
skip_test (bool, optional): Turns off input check and data transformation.
Defaults to False as it is intended for internal use only.
Warning: setting this True might make error tracking difficult.
Expand Down Expand Up @@ -134,7 +134,7 @@ def bin_partition(
else:
bins = cast(List[int], bins)
# shift ranges by half a bin size to align the bin centers to the given ranges,
# as the histogram functions interprete the ranges as limits for the edges.
# as the histogram functions interpret the ranges as limits for the edges.
for i, nbins in enumerate(bins):
halfbinsize = (ranges[i][1] - ranges[i][0]) / (nbins) / 2
ranges[i] = (
Expand Down Expand Up @@ -234,7 +234,7 @@ def bin_dataframe(
- an integer describing the number of bins for all dimensions. This
requires "ranges" to be defined as well.
- A sequence containing one entry of the following types for each
dimenstion:
dimension:

- an integer describing the number of bins. This requires "ranges"
to be defined as well.
Expand Down Expand Up @@ -273,7 +273,7 @@ def bin_dataframe(
jittering. To specify the jitter amplitude or method (normal or uniform
noise) a dictionary can be passed. This should look like
jitter={'axis':{'amplitude':0.5,'mode':'uniform'}}.
This example also shows the default behaviour, in case None is
This example also shows the default behavior, in case None is
passed in the dictionary, or jitter is a list of strings.
Warning: this is not the most performing approach. applying jitter
on the dataframe before calling the binning is much faster.
Expand Down Expand Up @@ -479,7 +479,7 @@ def normalization_histogram_from_timed_dataframe(
bin_centers: np.ndarray,
time_unit: float,
) -> xr.DataArray:
"""Get a normalization histogram from a timed datafram.
"""Get a normalization histogram from a timed dataframe.

Args:
df (dask.dataframe.DataFrame): a dask.DataFrame on which to perform the
Expand Down
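The half-bin shift mentioned in the `bin_partition` comment above can be illustrated with plain NumPy. This is a sketch under assumptions: the diff truncates the actual assignment, so the shift direction used here (both limits moved down by half a bin) is a guess, and `shift_range_to_centers` is a hypothetical helper name.

```python
import numpy as np


def shift_range_to_centers(lower: float, upper: float, nbins: int):
    """Shift a (lower, upper) range down by half a bin width.

    Assumption: after the shift, the first bin is centered on `lower`
    rather than starting at it.
    """
    halfbinsize = (upper - lower) / nbins / 2
    return lower - halfbinsize, upper - halfbinsize


lower, upper, nbins = 0.0, 10.0, 5
edge_lo, edge_hi = shift_range_to_centers(lower, upper, nbins)

# Histogram functions treat the range as limits for the bin edges.
edges = np.linspace(edge_lo, edge_hi, nbins + 1)
centers = (edges[:-1] + edges[1:]) / 2

print(edges)    # bin edges after the shift
print(centers)  # bin centers now start at `lower`
```

With these numbers the bin width is 2, so the edges run from -1 to 9 and the centers fall on 0, 2, 4, 6, 8.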
8 changes: 4 additions & 4 deletions sed/binning/numba_bin.py
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ def _hist_from_bin_range(
bit integers.

Args:
sample (np.ndarray): The data to be histogrammed with shape N,D.
sample (np.ndarray): The data to be histogram'd with shape N,D.
bins (Sequence[int]): The number of bins for each dimension D.
ranges (np.ndarray): A sequence of length D, each an optional (lower,
upper) tuple giving the outer bin edges to be used if the edges are
Expand All @@ -49,7 +49,7 @@ def _hist_from_bin_range(

for i in range(ndims):
delta[i] = 1 / ((ranges[i, 1] - ranges[i, 0]) / bins[i])
strides[i] = hist.strides[i] // hist.itemsize # pylint: disable=E1136
strides[i] = hist.strides[i] // hist.itemsize

for t in range(sample.shape[0]):
is_inside = True
Expand Down Expand Up @@ -157,7 +157,7 @@ def numba_histogramdd(
bins: Union[int, Sequence[int], Sequence[np.ndarray], np.ndarray],
ranges: Sequence = None,
) -> Tuple[np.ndarray, List[np.ndarray]]:
"""Multidimensional histogramming function, powered by Numba.
"""Multidimensional histogram function, powered by Numba.

Behaves in total much like numpy.histogramdd. Returns uint32 arrays.
This was chosen because it has a significant performance improvement over
Expand All @@ -167,7 +167,7 @@ def numba_histogramdd(
sizes.

Args:
sample (np.ndarray): The data to be histogrammed with shape N,D
sample (np.ndarray): The data to be histogram'd with shape N,D
bins (Union[int, Sequence[int], Sequence[np.ndarray], np.ndarray]): The number
of bins for each dimension D, or a sequence of bin edges on which to calculate
the histogram.
Expand Down
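The `delta` and `strides` lines in the `_hist_from_bin_range` hunk above suggest a flat-index accumulation scheme. The following pure-Python sketch shows that idea without Numba; it is an illustration under assumptions, not the file's actual code, and the function name `hist_from_bin_range_sketch` is hypothetical.

```python
import numpy as np


def hist_from_bin_range_sketch(sample, bins, ranges):
    """Histogram N x D samples on a regular grid using flat indexing.

    Samples outside [lower, upper) in any dimension are ignored. Note that,
    unlike np.histogramdd, this sketch excludes values equal to the upper edge.
    """
    ndims = len(bins)
    hist = np.zeros(tuple(bins), dtype=np.uint32)
    hist_flat = hist.ravel()  # a view, because `hist` is C-contiguous

    delta = np.zeros(ndims)
    strides = np.zeros(ndims, dtype=np.int64)
    for i in range(ndims):
        # inverse bin width along dimension i
        delta[i] = 1 / ((ranges[i, 1] - ranges[i, 0]) / bins[i])
        # stride in units of array elements rather than bytes
        strides[i] = hist.strides[i] // hist.itemsize

    for t in range(sample.shape[0]):
        is_inside = True
        flatidx = 0
        for i in range(ndims):
            j = int(np.floor((sample[t, i] - ranges[i, 0]) * delta[i]))
            is_inside = is_inside and 0 <= j < bins[i]
            flatidx += j * strides[i]
        if is_inside:
            hist_flat[flatidx] += 1
    return hist


sample = np.array([[0.5, 0.5], [1.5, 0.5], [1.5, 1.5], [5.0, 0.5]])
ranges = np.array([[0.0, 2.0], [0.0, 2.0]])
hist = hist_from_bin_range_sketch(sample, [2, 2], ranges)
print(hist)
```

Writing through a flat view avoids building an index tuple for every sample, which is the kind of inner loop a JIT compiler such as Numba handles well.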
4 changes: 2 additions & 2 deletions sed/binning/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ def simplify_binning_arguments(
- an integer describing the number of bins for all dimensions. This
requires "ranges" to be defined as well.
- A sequence containing one entry of the following types for each
dimenstion:
dimension:

- an integer describing the number of bins. This requires "ranges"
to be defined as well.
Expand Down Expand Up @@ -123,7 +123,7 @@ def simplify_binning_arguments(
f"Ranges must be a sequence, not {type(ranges)}.",
)

# otherwise, all bins should by np.ndarrays here
# otherwise, all bins should be of type np.ndarray here
elif all(isinstance(x, np.ndarray) for x in bins):
bins = cast(List[np.ndarray], list(bins))
else:
Expand Down
4 changes: 2 additions & 2 deletions sed/calibrator/delay.py
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@ def append_delay_axis(

Returns:
Union[pd.DataFrame, dask.dataframe.DataFrame]: dataframe with added column
and delay calibration metdata dictionary.
and delay calibration metadata dictionary.
"""
# pylint: disable=duplicate-code
if calibration is None:
Expand Down Expand Up @@ -407,7 +407,7 @@ def mm_to_ps(
delay_mm: Union[float, np.ndarray],
time0_mm: float,
) -> Union[float, np.ndarray]:
"""Converts a delaystage position in mm into a relative delay in picoseconds
"""Converts a delay stage position in mm into a relative delay in picoseconds
(double pass).

Args:
Expand Down
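The `mm_to_ps` docstring above describes a double-pass conversion from stage position to delay. A minimal sketch of that arithmetic follows, assuming the delay is the extra optical path (twice the stage displacement) divided by the vacuum speed of light. The function name `mm_to_ps_sketch` is hypothetical, and the library may use a rounded constant instead.

```python
import numpy as np

# Vacuum speed of light expressed in mm per picosecond.
SPEED_OF_LIGHT_MM_PER_PS = 0.299792458


def mm_to_ps_sketch(delay_mm, time0_mm):
    """Convert a delay stage position in mm to a relative delay in ps.

    Double pass: moving the stage by x mm changes the optical path by 2x mm.
    Works for floats and for NumPy arrays.
    """
    return 2 * (delay_mm - time0_mm) / SPEED_OF_LIGHT_MM_PER_PS


one_mm = mm_to_ps_sketch(1.0, 0.0)
scan = mm_to_ps_sketch(np.array([10.0, 10.5, 11.0]), 10.0)
print(one_mm)  # roughly 6.67 ps per mm of stage travel
print(scan)
```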
14 changes: 7 additions & 7 deletions sed/calibrator/energy.py
Original file line number Diff line number Diff line change
Expand Up @@ -446,7 +446,7 @@ def add_ranges(
traces (np.ndarray, optional): Collection of energy dispersion curves.
Defaults to self.traces_normed.
infer_others (bool, optional): Option to infer the feature detection range
in other traces from a given one using a time warp algorthm.
in other traces from a given one using a time warp algorithm.
Defaults to True.
mode (str, optional): Specification on how to change the feature ranges
('append' or 'replace'). Defaults to "replace".
Expand Down Expand Up @@ -1157,7 +1157,7 @@ def common_apply_func(apply: bool): # noqa: ARG001
update(correction["amplitude"], x_center, y_center, diameter=correction["diameter"])
except KeyError as exc:
raise ValueError(
"Parameter 'diameter' required for correction type 'sperical', ",
"Parameter 'diameter' required for correction type 'spherical', ",
"but not present!",
) from exc

Expand Down Expand Up @@ -1339,7 +1339,7 @@ def apply_energy_correction(
Defaults to config["energy"]["correction_type"].
amplitude (float, optional): Amplitude of the time-of-flight correction
term. Defaults to config["energy"]["correction"]["correction_type"].
correction (dict, optional): Correction dictionary containing paramters
correction (dict, optional): Correction dictionary containing parameters
for the correction. Defaults to self.correction or
config["energy"]["correction"].
verbose (bool, optional): Option to print out diagnostic information.
Expand Down Expand Up @@ -1939,7 +1939,7 @@ def _datacheck_peakdetect(
x_axis: np.ndarray,
y_axis: np.ndarray,
) -> Tuple[np.ndarray, np.ndarray]:
"""Input format checking for 1D peakdtect algorithm
"""Input format checking for 1D peakdetect algorithm

Args:
x_axis (np.ndarray): x-axis array
Expand Down Expand Up @@ -2109,7 +2109,7 @@ def fit_energy_calibration(
binwidth (float): Time width of each original TOF bin in ns.
binning (int): Binning factor of the TOF values.
ref_id (int, optional): Reference dataset index. Defaults to 0.
ref_energy (float, optional): Energy value of the feature in the refence
ref_energy (float, optional): Energy value of the feature in the reference
trace (eV). required to output the calibration. Defaults to None.
t (Union[List[float], np.ndarray], optional): Array of TOF values. Required
to calculate calibration trace. Defaults to None.
Expand All @@ -2131,7 +2131,7 @@ def fit_energy_calibration(
Returns:
dict: A dictionary of fitting parameters including the following,

- "coeffs": Fitted function coefficents.
- "coeffs": Fitted function coefficients.
- "axis": Fitted energy axis.
"""
vals = np.asarray(vals)
Expand Down Expand Up @@ -2248,7 +2248,7 @@ def poly_energy_calibration(
each EDC.
order (int, optional): Polynomial order of the fitting function. Defaults to 3.
ref_id (int, optional): Reference dataset index. Defaults to 0.
ref_energy (float, optional): Energy value of the feature in the refence
ref_energy (float, optional): Energy value of the feature in the reference
trace (eV). required to output the calibration. Defaults to None.
t (Union[List[float], np.ndarray], optional): Array of TOF values. Required
to calculate calibration trace. Defaults to None.
Expand Down