CI Fixes #162

Merged
25 commits merged on Feb 2, 2023

Commits
6b45a6f
Set RUST_BACKTRACE 1
jdye64 Jan 31, 2023
bc7e6bf
Add protoc action to build.yml for manylinux
jdye64 Jan 31, 2023
0237c89
Add protoc action to test.yaml matrix
jdye64 Jan 31, 2023
20988e6
Bump to 17.0.0 for a test
jdye64 Jan 31, 2023
1fd7c7b
include features
jdye64 Jan 31, 2023
5440928
test re-enabling protoc action
jdye64 Feb 1, 2023
3e9e7e1
enable protoc action in test.yaml
jdye64 Feb 1, 2023
0057e02
blake formatting
jdye64 Feb 1, 2023
267813c
python linter fixes
jdye64 Feb 1, 2023
6efddb0
blake fixes
jdye64 Feb 1, 2023
7fea8e3
update run tests action
jdye64 Feb 2, 2023
2572337
remove duplicate maturin develop
jdye64 Feb 2, 2023
72aefcc
Include Pip install for datafusion dist
jdye64 Feb 2, 2023
08b33d6
Remove some pip install options that are not needed
jdye64 Feb 2, 2023
69b0e6e
Install dist/datafusion*.whl
jdye64 Feb 2, 2023
6bdb956
Use virtualenv python version for pip install
jdye64 Feb 2, 2023
b7ece5f
Use vitualenv python version for pytest run
jdye64 Feb 2, 2023
14b9a63
examine the contents of the result whl file to make sure _internal.ab…
jdye64 Feb 2, 2023
0358a01
Wrong file extension ... change to .whl from .zip
jdye64 Feb 2, 2023
529dde5
Try another flavor of pip installing
jdye64 Feb 2, 2023
93d897f
Fix pyarrow version
jdye64 Feb 2, 2023
6d3e141
update pip install process
jdye64 Feb 2, 2023
d2f733e
testing
jdye64 Feb 2, 2023
498504b
doh
jdye64 Feb 2, 2023
73083c1
Remove previous maturin build since now it happens in the test run se…
jdye64 Feb 2, 2023
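
Commit 14b9a63 above mentions examining the contents of the built wheel. A wheel is a zip archive, so a check of that kind can be sketched with Python's standard `zipfile` module. This is an illustrative sketch only, not the command used in this PR: the helper name `find_wheel_members` and its `needle` argument are hypothetical, and because the commit title is truncated on this page, the exact file being looked for is not reproduced here.

```python
import zipfile


def find_wheel_members(wheel_path, needle):
    """Return the names of archive members whose path contains `needle`.

    Hypothetical helper: a .whl file is a zip archive, so the standard
    zipfile module can list its contents without installing the wheel.
    """
    with zipfile.ZipFile(wheel_path) as wheel:
        return [name for name in wheel.namelist() if needle in name]
```

In a CI step, a helper like this could be pointed at the file under `dist/`, with the job failing when the returned list is empty.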
39 changes: 0 additions & 39 deletions .github/actions/setup-builder/action.yaml

This file was deleted.

6 changes: 6 additions & 0 deletions .github/workflows/build.yml
@@ -97,9 +97,15 @@ jobs:
           name: python-wheel-license
           path: .
       - run: cat LICENSE.txt
+      - name: Install Protoc
+        uses: arduino/setup-protoc@v1
+        with:
+          version: '3.x'
       - name: Build wheels
         uses: PyO3/maturin-action@v1
         with:
+          env:
+            RUST_BACKTRACE: 1
           rust-toolchain: nightly
           target: x86_64
           manylinux: auto
37 changes: 9 additions & 28 deletions .github/workflows/test.yaml
@@ -53,18 +53,10 @@ jobs:
           toolchain: ${{ matrix.toolchain }}
           override: true
 
-      - name: Install protobuf compiler
-        shell: bash
-        run: |
-          mkdir -p $HOME/d/protoc
-          cd $HOME/d/protoc
-          export PROTO_ZIP="protoc-21.4-linux-x86_64.zip"
-          curl -LO https://github.com/protocolbuffers/protobuf/releases/download/v21.4/$PROTO_ZIP
-          unzip $PROTO_ZIP
-          export PATH=$PATH:$HOME/d/protoc/bin
-          export PROTOC=$HOME/d/protoc/bin
-          sudo chown -R $(whoami) $HOME/d/protoc
-          protoc --version
+      - name: Install Protoc
+        uses: arduino/setup-protoc@v1
+        with:
+          version: '3.x'
 
       - name: Setup Python
         uses: actions/setup-python@v4
@@ -112,22 +104,11 @@ jobs:
           flake8 --exclude venv --ignore=E501,W503
           black --line-length 79 --diff --check .
 
-      - name: Build wheels
-        uses: PyO3/maturin-action@v1
-        with:
-          command: build
-          args: --release --out dist
-
       - name: Run tests
+        env:
+          RUST_BACKTRACE: 1
         run: |
           git submodule update --init
-          export PATH=$PATH:$HOME/d/protoc/bin
-          export PROTOC=$HOME/d/protoc/bin
-          sudo chown -R $(whoami) $HOME/d/protoc
-          ls -l $HOME/d/protoc/
-          ls -l $HOME/d/protoc/bin
-          pip install datafusion-python --no-index --find-links dist --force-reinstall
-          pip install pytest
-          cargo clean
-          maturin develop
-          RUST_BACKTRACE=1 pytest -v .
+          source venv/bin/activate
+          pip install -e . -vv
+          pytest -v .
41 changes: 25 additions & 16 deletions Cargo.lock

Some generated files are not rendered by default.

10 changes: 5 additions & 5 deletions Cargo.toml
@@ -34,11 +34,11 @@ default = ["mimalloc"]
 tokio = { version = "1.24", features = ["macros", "rt", "rt-multi-thread", "sync"] }
 rand = "0.8"
 pyo3 = { version = "~0.17.3", features = ["extension-module", "abi3", "abi3-py37"] }
-datafusion = { git = "https://github.com/apache/arrow-datafusion", rev = "5238e8c97f998b4d2cb9fab85fb182f325a1a7fb", features = ["pyarrow", "avro"] }
-datafusion-expr = { git = "https://github.com/apache/arrow-datafusion", rev = "5238e8c97f998b4d2cb9fab85fb182f325a1a7fb" }
-datafusion-optimizer = { git = "https://github.com/apache/arrow-datafusion", rev = "5238e8c97f998b4d2cb9fab85fb182f325a1a7fb" }
-datafusion-common = { git = "https://github.com/apache/arrow-datafusion", rev = "5238e8c97f998b4d2cb9fab85fb182f325a1a7fb", features = ["pyarrow"] }
-datafusion-substrait = { git = "https://github.com/apache/arrow-datafusion", rev = "5238e8c97f998b4d2cb9fab85fb182f325a1a7fb" }
+datafusion = { version = "17.0.0", features = ["pyarrow", "avro"] }
+datafusion-expr = "17.0.0"
+datafusion-optimizer = "17.0.0"
+datafusion-common = { version = "17.0.0", features = ["pyarrow"] }
+datafusion-substrait = "17.0.0"
 uuid = { version = "1.2", features = ["v4"] }
 mimalloc = { version = "*", optional = true, default-features = false }
 async-trait = "0.1"
31 changes: 18 additions & 13 deletions datafusion/tests/test_dataframe.py
@@ -52,11 +52,12 @@ def struct_df():
 
     return ctx.create_dataframe([[batch]])
 
+
 @pytest.fixture
 def aggregate_df():
     ctx = SessionContext()
-    ctx.register_csv('test', 'testing/data/csv/aggregate_test_100.csv')
-    return ctx.sql('select c1, sum(c2) from test group by c1')
+    ctx.register_csv("test", "testing/data/csv/aggregate_test_100.csv")
+    return ctx.sql("select c1, sum(c2) from test group by c1")
 
 
 def test_select(df):
@@ -271,10 +272,11 @@ def test_logical_plan(aggregate_df):
 
     assert expected == plan.display()
 
-    expected = \
-        "Projection: test.c1, SUM(test.c2)\n" \
-        " Aggregate: groupBy=[[test.c1]], aggr=[[SUM(test.c2)]]\n" \
+    expected = (
+        "Projection: test.c1, SUM(test.c2)\n"
+        " Aggregate: groupBy=[[test.c1]], aggr=[[SUM(test.c2)]]\n"
         " TableScan: test"
+    )
 
     assert expected == plan.display_indent()

@@ -286,25 +288,29 @@ def test_optimized_logical_plan(aggregate_df):
 
     assert expected == plan.display()
 
-    expected = \
-        "Projection: test.c1, SUM(test.c2)\n" \
-        " Aggregate: groupBy=[[test.c1]], aggr=[[SUM(test.c2)]]\n" \
+    expected = (
+        "Projection: test.c1, SUM(test.c2)\n"
+        " Aggregate: groupBy=[[test.c1]], aggr=[[SUM(test.c2)]]\n"
         " TableScan: test projection=[c1, c2]"
+    )
 
     assert expected == plan.display_indent()
 
 
 def test_execution_plan(aggregate_df):
     plan = aggregate_df.execution_plan()
 
-    expected = "ProjectionExec: expr=[c1@0 as c1, SUM(test.c2)@1 as SUM(test.c2)]\n"
+    expected = (
+        "ProjectionExec: expr=[c1@0 as c1, SUM(test.c2)@1 as SUM(test.c2)]\n"
+    )
 
     assert expected == plan.display()
 
-    expected = \
-        "ProjectionExec: expr=[c1@0 as c1, SUM(test.c2)@1 as SUM(test.c2)]\n" \
-        " Aggregate: groupBy=[[test.c1]], aggr=[[SUM(test.c2)]]\n" \
+    expected = (
+        "ProjectionExec: expr=[c1@0 as c1, SUM(test.c2)@1 as SUM(test.c2)]\n"
+        " Aggregate: groupBy=[[test.c1]], aggr=[[SUM(test.c2)]]\n"
         " TableScan: test projection=[c1, c2]"
+    )
 
     indent = plan.display_indent()

@@ -317,7 +323,6 @@ def test_execution_plan(aggregate_df):
     assert "CsvExec:" in indent
 
 
-
 def test_repartition(df):
     df.repartition(2)

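Several of the formatting changes in this file swap backslash line continuations for a parenthesized expression. Both forms rely on Python joining adjacent string literals into one string, so the value compared in the assertions is unchanged. A minimal sketch of that equivalence, using short placeholder strings rather than the plan text from the tests:

```python
# Old style: adjacent string literals joined across lines with
# explicit backslash continuations.
expected_backslash = \
    "Projection: a\n" \
    "  Aggregate: b\n" \
    "    TableScan: t"

# New style: the same adjacent literals wrapped in parentheses, so no
# backslashes are needed and a formatter can re-wrap the lines.
expected_parens = (
    "Projection: a\n"
    "  Aggregate: b\n"
    "    TableScan: t"
)

# Both spellings build the identical string.
assert expected_backslash == expected_parens
```
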
15 changes: 10 additions & 5 deletions datafusion/tests/test_substrait.py
@@ -16,9 +16,8 @@
 # under the License.
 
 import pyarrow as pa
-import pyarrow.dataset as ds
 
-from datafusion import column, literal, SessionContext
+from datafusion import SessionContext
 from datafusion import substrait as ss
 import pytest

@@ -39,8 +38,14 @@ def test_substrait_serialization(ctx):
     assert ctx.tables() == {"t"}
 
     # For now just make sure the method calls blow up
-    substrait_plan = ss.substrait.serde.serialize_to_plan("SELECT * FROM t", ctx)
-    substrait_bytes = ss.substrait.serde.serialize_bytes("SELECT * FROM t", ctx)
+    substrait_plan = ss.substrait.serde.serialize_to_plan(
+        "SELECT * FROM t", ctx
+    )
+    substrait_bytes = ss.substrait.serde.serialize_bytes(
+        "SELECT * FROM t", ctx
+    )
     substrait_plan = ss.substrait.serde.deserialize_bytes(substrait_bytes)
-    df_logical_plan = ss.substrait.consumer.from_substrait_plan(ctx, substrait_plan)
+    df_logical_plan = ss.substrait.consumer.from_substrait_plan(
+        ctx, substrait_plan
+    )
     substrait_plan = ss.substrait.producer.to_substrait_plan(df_logical_plan)
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -43,7 +43,7 @@ classifier = [
     "Programming Language :: Rust",
 ]
 dependencies = [
-    "pyarrow>=1",
+    "pyarrow>=6.0.1",
 ]
 
 [project.urls]