REFACTOR-#2957: Restructure project files #3210

Merged on Oct 13, 2021 (23 commits)
Commits
ae07c00
REFACTOR-#2957: Restructure project files
devin-petersohn Jul 2, 2021
0450762
REFACTOR-#2957: Stabilize the structure
YarShev Sep 2, 2021
a31833e
REFACTOR-#2957: Apply the changes to the project structure
YarShev Sep 8, 2021
b72d1fa
REFACTOR-#2957: Fix flake8 job
YarShev Sep 8, 2021
46b6921
REFACTOR-#2957: Fix test_api and test_headers
YarShev Sep 8, 2021
18b0c9e
REFACTOR-#2957: Fix paths in GH actions and doc check
YarShev Sep 8, 2021
9773dd0
REFACTOR-#2957: Fix doc check
YarShev Sep 8, 2021
862e27d
REFACTOR-#2957: Fix paths
YarShev Sep 8, 2021
e4baf71
REFACTOR-#2957: Fix spelling and move spreadsheet to exp
YarShev Sep 8, 2021
0b867dd
REFACTOR-#2957: Apply additional changes
YarShev Sep 8, 2021
6a9d6d0
REFACTOR-#2957: Fix spreadsheet tests by adding exp pandas on dask
YarShev Sep 8, 2021
193e872
REFACTOR-#2957: d2p -> default2pandas
YarShev Sep 9, 2021
85d9f64
Merge branch 'master' into refactor/2957
anmyachev Sep 22, 2021
d235de5
REFACTOR-#2957: Address comments
devin-petersohn Sep 28, 2021
50a2c47
REFACTOR-#2957: Address comments
devin-petersohn Sep 30, 2021
5bb7b80
Merge branch 'master' into refactor/2957
devin-petersohn Sep 30, 2021
fe39fed
Merge branch 'master' into refactor/2957
YarShev Oct 11, 2021
7a39691
Apply comments
YarShev Oct 11, 2021
fd0a6e1
Fix circular import
YarShev Oct 11, 2021
7596c88
Fix omnisci tests
YarShev Oct 11, 2021
dfe2d8f
Fix paths
YarShev Oct 11, 2021
8ec8d3e
Merge branch 'master' into refactor/2957
YarShev Oct 12, 2021
b25bf0d
Merge remote-tracking branch 'upstream/master' into refactor/2957
YarShev Oct 12, 2021
61 changes: 31 additions & 30 deletions .github/workflows/ci.yml
@@ -55,28 +55,29 @@ jobs:
modin/pandas/series_utils.py modin/pandas/general.py \
modin/pandas/plotting.py modin/pandas/utils.py \
modin/pandas/iterator.py modin/pandas/indexing.py \
- - run: python scripts/doc_checker.py modin/engines/base/frame
- - run: python scripts/doc_checker.py modin/engines/dask
+ - run: python scripts/doc_checker.py modin/core/dataframe
+ - run: python scripts/doc_checker.py modin/core/execution/dask
- run: |
python scripts/doc_checker.py \
modin/pandas/accessor.py modin/pandas/general.py \
modin/pandas/groupby.py modin/pandas/indexing.py \
modin/pandas/iterator.py modin/pandas/plotting.py \
modin/pandas/series_utils.py modin/pandas/utils.py \
modin/pandas/base.py \
- modin/pandas/io.py modin/engines/base/io/io.py \
- modin/engines/base/frame asv_bench/benchmarks/utils \
+ modin/pandas/io.py \
+ asv_bench/benchmarks/utils \
asv_bench/benchmarks/__init__.py asv_bench/benchmarks/io/__init__.py \
asv_bench/benchmarks/scalability/__init__.py \
- modin/engines/base/io/column_stores \
- modin/engines/base/io/sql \
- modin/engines/base/io/text \
- modin/engines/base/io/__init__.py \
- modin/engines/base/io/file_dispatcher.py \
- modin/experimental/engines/pandas_on_ray \
- modin/experimental/engines/pyarrow_on_ray \
+ modin/core/io/io.py \
+ modin/core/io/column_stores \
+ modin/core/io/sql \
+ modin/core/io/text \
+ modin/core/io/__init__.py \
+ modin/core/io/file_dispatcher.py \
+ modin/experimental/core/execution/ray/implementations/pandas_on_ray \
+ modin/experimental/core/execution/ray/implementations/pyarrow_on_ray \
modin/pandas/series.py \
- modin/engines/python \
+ modin/core/execution/python \
modin/pandas/dataframe.py \
modin/config/__init__.py \
modin/config/__main__.py \
@@ -89,24 +90,23 @@ jobs:
python scripts/doc_checker.py modin/experimental/xgboost/__init__.py \
modin/experimental/xgboost/utils.py modin/experimental/xgboost/xgboost.py \
modin/experimental/xgboost/xgboost_ray.py
- - run: python scripts/doc_checker.py modin/engines/ray
+ - run: python scripts/doc_checker.py modin/core/execution/ray
- run: |
- python scripts/doc_checker.py modin/data_management/functions \
- modin/data_management/factories/factories.py \
- modin/data_management/factories/dispatcher.py \
- modin/data_management/utils.py
+ python scripts/doc_checker.py modin/core/execution/dispatching/factories/factories.py \
+ modin/core/execution/dispatching/factories/dispatcher.py \
- run: python scripts/doc_checker.py scripts/doc_checker.py
- run: |
- python scripts/doc_checker.py modin/experimental/pandas/io_exp.py \
+ python scripts/doc_checker.py modin/experimental/pandas/io.py \
modin/experimental/pandas/numpy_wrap.py modin/experimental/pandas/__init__.py
- - run: python scripts/doc_checker.py modin/backends/base
- - run: python scripts/doc_checker.py modin/backends/pyarrow
- - run: python scripts/doc_checker.py modin/backends/pandas
+ - run: python scripts/doc_checker.py modin/core/storage_formats/base
+ - run: python scripts/doc_checker.py modin/core/storage_formats/pyarrow
+ - run: python scripts/doc_checker.py modin/core/storage_formats/pandas
- run: |
python scripts/doc_checker.py \
- modin/experimental/engines/omnisci_on_native/frame \
- modin/experimental/engines/omnisci_on_native/io.py
- - run: python scripts/doc_checker.py modin/experimental/backends/omnisci
+ modin/experimental/core/execution/native/implementations/omnisci_on_native/dataframe \
+ modin/experimental/core/execution/native/implementations/omnisci_on_native/io \
+ modin/experimental/core/execution/native/implementations/omnisci_on_native/partitioning \
+ - run: python scripts/doc_checker.py modin/experimental/core/storage_formats/omnisci

lint-flake8:
name: lint (flake8)
@@ -239,11 +239,11 @@ jobs:
conda info
conda list
- name: Internals tests
- run: python -m pytest modin/data_management/factories/test/test_dispatcher.py modin/experimental/cloud/test/test_cloud.py
+ run: python -m pytest modin/core/execution/dispatching/factories/test/test_dispatcher.py modin/experimental/cloud/test/test_cloud.py
- run: python -m pytest modin/config/test
- run: python -m pytest modin/test/test_envvar_catcher.py
- - run: python -m pytest modin/test/backends/base/test_internals.py
- - run: python -m pytest modin/test/backends/pandas/test_internals.py
+ - run: python -m pytest modin/test/storage_formats/base/test_internals.py
+ - run: python -m pytest modin/test/storage_formats/pandas/test_internals.py
- run: python -m pytest modin/test/test_envvar_npartitions.py
- run: python -m pytest -n 2 modin/test/test_partition_api.py
- run: python -m pytest modin/test/test_utils.py
@@ -280,7 +280,7 @@ jobs:
- name: Install HDF5
run: sudo apt update && sudo apt install -y libhdf5-dev
- run: pytest modin/experimental/xgboost/test/test_default.py --backend=${{ matrix.backend }}
- - run: python -m pytest -n 2 modin/test/backends/base/test_internals.py --backend=${{ matrix.backend }}
+ - run: python -m pytest -n 2 modin/test/storage_formats/base/test_internals.py --backend=${{ matrix.backend }}
- run: pytest -n 2 modin/pandas/test/dataframe/test_binary.py --backend=${{ matrix.backend }}
- run: pytest -n 2 modin/pandas/test/dataframe/test_default.py --backend=${{ matrix.backend }}
- run: pytest -n 2 modin/pandas/test/dataframe/test_indexing.py --backend=${{ matrix.backend }}
@@ -337,7 +337,7 @@ jobs:
- name: Install HDF5
run: sudo apt update && sudo apt install -y libhdf5-dev
- run: MODIN_BENCHMARK_MODE=True pytest modin/pandas/test/internals/test_benchmark_mode.py
- - run: pytest modin/experimental/engines/omnisci_on_native/test/test_dataframe.py
+ - run: pytest modin/experimental/core/execution/native/implementations/omnisci_on_native/test/test_dataframe.py
- run: pytest modin/pandas/test/test_io.py::TestCsv --verbose
- run: |
curl -o codecov https://codecov.io/bash
@@ -666,6 +666,7 @@ jobs:
python-version: [ "3.7", "3.8" ]
engine: ["ray", "dask"]
env:
+ MODIN_EXPERIMENTAL: "True"
MODIN_ENGINE: ${{matrix.engine}}
name: test-spreadsheet (engine ${{matrix.engine}}, python ${{matrix.python-version}})
steps:
@@ -683,4 +684,4 @@ jobs:
run: |
conda info
conda list
- - run: python -m pytest modin/spreadsheet/test/test_general.py
+ - run: python -m pytest modin/experimental/spreadsheet/test/test_general.py
9 changes: 5 additions & 4 deletions .github/workflows/push.yml
@@ -23,10 +23,10 @@ jobs:
conda info
conda list
- name: Internals tests
- run: python -m pytest modin/data_management/factories/test/test_dispatcher.py modin/experimental/cloud/test/test_cloud.py
+ run: python -m pytest modin/core/execution/dispatching/factories/test/test_dispatcher.py modin/experimental/cloud/test/test_cloud.py
- run: python -m pytest modin/config/test
- run: python -m pytest modin/test/test_envvar_catcher.py
- - run: python -m pytest modin/test/backends/pandas/test_internals.py
+ - run: python -m pytest modin/test/storage_formats/pandas/test_internals.py
- run: python -m pytest modin/test/test_envvar_npartitions.py
- run: python -m pytest modin/test/test_partition_api.py

@@ -114,7 +114,7 @@ jobs:
conda list
- name: Install HDF5
run: sudo apt update && sudo apt install -y libhdf5-dev
- - run: pytest modin/experimental/engines/omnisci_on_native/test/test_dataframe.py
+ - run: pytest modin/experimental/core/execution/native/implementations/omnisci_on_native/test/test_dataframe.py
- run: pytest modin/pandas/test/test_io.py::TestCsv
- run: |
curl -o codecov https://codecov.io/bash
@@ -287,6 +287,7 @@ jobs:
python-version: [ "3.7", "3.8" ]
engine: ["ray", "dask"]
env:
+ MODIN_EXPERIMENTAL: "True"
MODIN_ENGINE: ${{matrix.engine}}
name: test-spreadsheet (engine ${{matrix.engine}}, python ${{matrix.python-version}})
steps:
@@ -304,4 +305,4 @@ jobs:
run: |
conda info
conda list
- - run: python -m pytest modin/spreadsheet/test/test_general.py
+ - run: python -m pytest modin/experimental/spreadsheet/test/test_general.py
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/utils/common.py
@@ -458,7 +458,7 @@ def trigger_import(*dfs):
"""
assert ASV_USE_BACKEND == "omnisci"

- from modin.experimental.engines.omnisci_on_native.frame.omnisci_worker import (
+ from modin.experimental.core.execution.native.implementations.omnisci_on_native.omnisci_worker import (
OmnisciServer,
)

2 changes: 1 addition & 1 deletion examples/docker/modin-omnisci/census-omnisci.py
@@ -14,7 +14,7 @@
import sys
import time
import modin.pandas as pd
- from modin.experimental.engines.omnisci_on_native.frame.omnisci_worker import OmnisciServer
+ from modin.experimental.core.execution.native.implementations.omnisci_on_native.omnisci_worker import OmnisciServer

from sklearn import config_context
import sklearnex
2 changes: 1 addition & 1 deletion examples/docker/modin-omnisci/nyc-taxi-omnisci.py
@@ -14,7 +14,7 @@
import sys
import time
import modin.pandas as pd
- from modin.experimental.engines.omnisci_on_native.frame.omnisci_worker import OmnisciServer
+ from modin.experimental.core.execution.native.implementations.omnisci_on_native.omnisci_worker import OmnisciServer


def read(filename):
2 changes: 1 addition & 1 deletion examples/docker/modin-omnisci/plasticc-omnisci.py
@@ -16,7 +16,7 @@
from collections import OrderedDict
from functools import partial
import modin.pandas as pd
- from modin.experimental.engines.omnisci_on_native.frame.omnisci_worker import OmnisciServer
+ from modin.experimental.core.execution.native.implementations.omnisci_on_native.omnisci_worker import OmnisciServer

import numpy as np
import xgboost as xgb
11 changes: 8 additions & 3 deletions modin/conftest.py
@@ -44,9 +44,14 @@ def _saving_make_api_url(token, _make_api_url=modin.utils._make_api_url):
import modin.config # noqa: E402
from modin.config import IsExperimental, TestRayClient # noqa: E402

- from modin.backends import PandasQueryCompiler, BaseQueryCompiler # noqa: E402
- from modin.engines.python.pandas_on_python.io import PandasOnPythonIO # noqa: E402
- from modin.data_management.factories import factories # noqa: E402
+ from modin.core.storage_formats import ( # noqa: E402
+ PandasQueryCompiler,
+ BaseQueryCompiler,
+ )
+ from modin.core.execution.python.implementations.pandas_on_python.io import ( # noqa: E402
+ PandasOnPythonIO,
+ )
+ from modin.core.execution.dispatching.factories import factories # noqa: E402
from modin.utils import get_current_backend # noqa: E402
from modin.pandas.test.utils import ( # noqa: E402
_make_csv_file,
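
For code that still imports from the old layout, the import changes in this PR amount to a set of module-prefix renames. The sketch below collects the pairs that can be read off the import and path changes shown in this diff. `RENAMED_PREFIXES` and `translate_module_path` are hypothetical names introduced here for illustration and are not part of Modin. The table is not exhaustive.

```python
# Old -> new module prefixes, read off the import/path changes in this diff.
# Illustrative only: neither name below is a Modin API, and the table is partial.
RENAMED_PREFIXES = {
    "modin.backends": "modin.core.storage_formats",
    "modin.data_management.factories": "modin.core.execution.dispatching.factories",
    "modin.data_management.functions": "modin.core.dataframe.algebra",
    "modin.engines.python.pandas_on_python": (
        "modin.core.execution.python.implementations.pandas_on_python"
    ),
    "modin.spreadsheet": "modin.experimental.spreadsheet",
}


def translate_module_path(old_path: str) -> str:
    """Return the post-refactor module path, or the input if no prefix matches."""
    # Try longer prefixes first so the most specific rename wins.
    for old_prefix in sorted(RENAMED_PREFIXES, key=len, reverse=True):
        if old_path == old_prefix or old_path.startswith(old_prefix + "."):
            return RENAMED_PREFIXES[old_prefix] + old_path[len(old_prefix):]
    return old_path


print(translate_module_path("modin.data_management.functions.function"))
```

Matching on whole dotted components (the `old_prefix + "."` check) keeps an unrelated module whose name merely begins with the same characters from being rewritten.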
2 changes: 2 additions & 0 deletions modin/engines/python/__init__.py → modin/core/__init__.py
@@ -10,3 +10,5 @@
# the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific language
# governing permissions and limitations under the License.
+
+ """Modin's core functionality."""
@@ -10,3 +10,5 @@
# the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific language
# governing permissions and limitations under the License.
+
+ """Base Modin Dataframe functionality."""
@@ -11,15 +11,18 @@
# ANY KIND, either express or implied. See the License for the specific language
# governing permissions and limitations under the License.

- """Function module provides template for a query compiler methods for a set of common operations."""
+ """Modin Dataframe algebra (core operators)."""

from .function import Function
from .mapfunction import MapFunction
from .mapreducefunction import MapReduceFunction
from .reductionfunction import ReductionFunction
from .foldfunction import FoldFunction
from .binary_function import BinaryFunction
- from .groupby_function import GroupbyReduceFunction, groupby_reduce_functions
+ from .groupby_function import (
+ GroupbyReduceFunction,
+ groupby_reduce_functions,
+ )

__all__ = [
"Function",
@@ -11,19 +11,19 @@
# ANY KIND, either express or implied. See the License for the specific language
# governing permissions and limitations under the License.

- """Function module provides templates for a query compiler default-to-pandas methods."""
+ """Module default2pandas provides templates for a query compiler default-to-pandas methods."""

- from .dataframe_default import DataFrameDefault
- from .datetime_default import DateTimeDefault
- from .series_default import SeriesDefault
- from .str_default import StrDefault
- from .binary_default import BinaryDefault
- from .any_default import AnyDefault
- from .resample_default import ResampleDefault
- from .rolling_default import RollingDefault
+ from .dataframe import DataFrameDefault
+ from .datetime import DateTimeDefault
+ from .series import SeriesDefault
+ from .str import StrDefault
+ from .binary import BinaryDefault
+ from .any import AnyDefault
+ from .resample import ResampleDefault
+ from .rolling import RollingDefault
from .default import DefaultMethod
- from .cat_default import CatDefault
- from .groupby_default import GroupByDefault
+ from .cat import CatDefault
+ from .groupby import GroupByDefault

__all__ = [
"DataFrameDefault",
@@ -13,7 +13,7 @@

"""Module houses default binary functions builder class."""

- from .any_default import AnyDefault
+ from .any import AnyDefault

import pandas
from pandas.core.dtypes.common import is_list_like
@@ -13,7 +13,7 @@

"""Module houses default applied-on-category functions builder class."""

- from .series_default import SeriesDefault
+ from .series import SeriesDefault


class CatDefault(SeriesDefault):
@@ -13,7 +13,7 @@

"""Module houses default applied-on-datetime functions builder class."""

- from .series_default import SeriesDefault
+ from .series import SeriesDefault


class DateTimeDefault(SeriesDefault):
@@ -13,7 +13,7 @@

"""Module houses default functions builder class."""

- from modin.data_management.functions.function import Function
+ from modin.core.dataframe.algebra.function import Function
from modin.utils import try_cast_to_pandas

from pandas.core.dtypes.common import is_list_like
@@ -13,7 +13,7 @@

"""Module houses default Series functions builder class."""

- from .any_default import AnyDefault
+ from .any import AnyDefault


class SeriesDefault(AnyDefault):
@@ -13,7 +13,7 @@

"""Module houses default applied-on-str functions builder class."""

- from .series_default import SeriesDefault
+ from .series import SeriesDefault


class StrDefault(SeriesDefault):
@@ -16,7 +16,7 @@
import pandas

from .mapreducefunction import MapReduceFunction
- from .default_methods.groupby_default import GroupBy
+ from .default2pandas.groupby import GroupBy
from modin.utils import try_cast_to_pandas, hashable


@@ -10,3 +10,5 @@
# the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific language
# governing permissions and limitations under the License.
+
+ """Base Modin Dataframe classes."""
@@ -10,3 +10,5 @@
# the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific language
# governing permissions and limitations under the License.
+
+ """Base Modin Dataframe class."""
14 changes: 14 additions & 0 deletions modin/core/dataframe/base/partitioning/__init__.py
@@ -0,0 +1,14 @@
+ # Licensed to Modin Development Team under one or more contributor license agreements.
+ # See the NOTICE file distributed with this work for additional information regarding
+ # copyright ownership. The Modin Development Team licenses this file to you under the
+ # Apache License, Version 2.0 (the "License"); you may not use this file except in
+ # compliance with the License. You may obtain a copy of the License at
+ #
+ # http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software distributed under
+ # the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
+ # ANY KIND, either express or implied. See the License for the specific language
+ # governing permissions and limitations under the License.
+
+ """Base Modin Dataframe classes related to its partitioning."""
@@ -2,10 +2,10 @@

### Object Hierarchy

- - `frame/partition.py` contains `PandasFramePartition` interface and its implementations.
- - `frame/partition_manager.py` contains `PandasFramePartitionManager` interface and its implementations.
+ - `partitioning/partition.py` contains `PandasFramePartition` interface and its implementations.
+ - `partitioning/partition_manager.py` contains `PandasFramePartitionManager` interface and its implementations.
- `PandasFramePartitionManager` manages 2D-array of `PandasFramePartition` object
- - `frame/axis_partition.py` contains `BaseFrameAxisPartition` and with the following hierarchy:
+ - `partitioning/axis_partition.py` contains `BaseFrameAxisPartition` and with the following hierarchy:
```
BaseFrameAxisPartition -> PandasOnRayFrameAxisPartition -> {PandasOnRayFrameColumnPartition, PandasOnRayFrameRowPartition}
```
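
The hierarchy listed in this README hunk can be sketched with plain Python stand-ins. The class names come from the README text. The empty class bodies and the `lineage` helper are assumptions added for illustration only, and the real Modin classes carry far more behavior.

```python
# Stand-in classes mirroring the names in the README; bodies are intentionally empty.
class BaseFrameAxisPartition:
    """Stand-in for the base axis-partition interface."""


class PandasOnRayFrameAxisPartition(BaseFrameAxisPartition):
    """Stand-in for the pandas-on-Ray axis partition."""


class PandasOnRayFrameColumnPartition(PandasOnRayFrameAxisPartition):
    """Stand-in for a partition spanning a whole column."""


class PandasOnRayFrameRowPartition(PandasOnRayFrameAxisPartition):
    """Stand-in for a partition spanning a whole row."""


def lineage(cls: type) -> str:
    """Format a class's ancestry from base to leaf (hypothetical helper)."""
    # __mro__ runs leaf -> ... -> object; drop `object` and reverse the rest.
    return " -> ".join(c.__name__ for c in reversed(cls.__mro__[:-1]))


print(lineage(PandasOnRayFrameColumnPartition))
print(lineage(PandasOnRayFrameRowPartition))
```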
@@ -11,4 +11,4 @@
# ANY KIND, either express or implied. See the License for the specific language
# governing permissions and limitations under the License.

- """Package holds key base classes to making efficient scale of data."""
+ """Base Modin Dataframe classes optimized for pandas storage format."""