Add optional dependency GSTools-Core #215

Merged
merged 10 commits into from
Nov 11, 2021

Conversation

LSchueler
Member

I removed all Cython code and replaced it with the GSTools-Core PyPI package v0.1.2.

The GitHub Actions still need to be updated, as the distribution of GSTools has become much simpler.
@LSchueler LSchueler self-assigned this Oct 28, 2021
@LSchueler LSchueler added enhancement New feature or request Performance Performance related stuff. Refactoring Code-Refactoring needed here labels Oct 28, 2021
@adamreichold

It would be nice to create a largish regression test suite from the existing implementation and apply it to the new one to ensure equivalent results.

@LSchueler
Member Author

Nearly all of the tests run during the sdist job rely on the Rust implementations. But as you have discovered recently, we still don't cover everything. So, there is some work to do!
The other jobs are only failing because of an import linter problem and because we don't need to build wheels anymore, as GSTools is now a pure Python module.

@adamreichold

Nearly all of the tests run during the sdist job rely on the Rust implementations. But as you have discovered recently, we still don't cover everything. So, there is some work to do!

I was pondering something more automated, e.g. using the existing implementation as the model in a model-based test driven by a coverage-guided fuzzer, which would automatically create test cases, with the pass criterion basically being that the new code yields the same results as the existing implementation. I am just not sure how to integrate the existing implementation, i.e. at the source level or maybe by just tapping into a fixed binary distribution.
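
To make the idea concrete, here is a minimal sketch of such a differential test. It uses plain seeded random inputs rather than a coverage-guided fuzzer, and `reference_impl` and `new_impl` are hypothetical stand-ins for the existing and the new routine, not actual GSTools functions.

```python
import random


def reference_impl(values):
    """Hypothetical stand-in for the existing (Cython) routine."""
    return sum(v * v for v in values)


def new_impl(values):
    """Hypothetical stand-in for the new (Rust-backed) routine."""
    total = 0.0
    for v in values:
        total += v ** 2
    return total


def differential_test(n_cases=200, seed=0, rel_tol=1e-9):
    """Return the first input on which both routines disagree, else None."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(n_cases):
        size = rng.randint(0, 50)
        values = [rng.uniform(-10.0, 10.0) for _ in range(size)]
        expected = reference_impl(values)
        actual = new_impl(values)
        # compare with a tolerance, since summation order may differ
        if abs(expected - actual) > rel_tol * max(1.0, abs(expected)):
            return values
    return None
```

A fuzzer would replace the random generator with inputs chosen to maximize code coverage; the comparison against the reference stays the same.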

@MuellerSeb
Member

MuellerSeb commented Nov 2, 2021

After a nice conversation with @LSchueler we decided:

@adamreichold

use gstools_core as an optional drop-in replacement and add a switch to control the used backend

Shouldn't this be

try:
    import gstools_core
    USE_RUST = True
except ImportError:
    # a failed import never binds gstools_core, so there is nothing to delete
    USE_RUST = False

i.e. USE_RUST == True only if the import succeeds, not unconditionally, since a finally block is always executed?

@MuellerSeb
Member

I meant else ... Sorry
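
For reference, a small sketch of the difference between `finally` and `else` in this situation. The function names are hypothetical, and `importlib.import_module` is used only so that the module name can be passed as an argument.

```python
import importlib


def detect_with_else(module_name):
    """Report True only if the import actually succeeded."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    else:
        # reached only when no exception was raised
        return True


def detect_with_finally(module_name):
    """Pitfall: the finally block runs on both paths."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        available = False
    finally:
        available = True  # overwrites the False set above
    return available
```

With a module that is not installed, `detect_with_else` reports False while `detect_with_finally` still reports True.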

GSTools-Core is now used if the package can be imported, but it can also
be switched off at runtime by setting the global variable
`gstools.config.USE_RUST = False`.
This makes the Cython code compatible with GSTools-Core v0.1.2 again.
@LSchueler
Member Author

The Cython code is back in again, and GSTools-Core is used if the package is installed and gstools.config.USE_RUST is not manually set to False.
I also refactored the Cython code according to issue #216, which this PR would therefore close.
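
One way to make such a switch take effect at runtime is to read the flag when a function is called rather than once at import. The sketch below assumes this approach; `config` is a hypothetical stand-in for `gstools.config`, and both backend functions are plain-Python placeholders, so this is not necessarily how the PR wires it up.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the gstools.config module.
config = SimpleNamespace(USE_RUST=False)


def _sum_fallback(values):
    """Placeholder for the Cython routine."""
    return "cython", sum(values)


def _sum_rust(values):
    """Placeholder for the GSTools-Core routine."""
    return "rust", sum(values)


def compute(values):
    # the flag is read on every call, so flipping it later changes the backend
    backend = _sum_rust if config.USE_RUST else _sum_fallback
    return backend(values)
```

Setting `config.USE_RUST = True` between two calls to `compute` makes the second call use the other placeholder.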

docs/source/index.rst

if config.USE_RUST:
    # pylint: disable=E0401
    from gstools_core import (

Shouldn't it be possible to merge those, or is this a formatter issue?

Member Author

@LSchueler LSchueler Nov 9, 2021

Yeah... that was isort's idea. It's about as ugly as it gets, right?! :-D And black and isort can't even agree on where to put the # pragma: no cover.
rustfmt is so nice :-)

Member

What about setting # pragma: no cover behind if config.USE_RUST: once?
Also the # pylint: disable=C0412 is redundant (see line 13) and we could also just add E0401 in line 13.

Member Author

Sorry, I think I was a bit annoyed yesterday by the combination of a false positive coming from pylint and then the isort and black peskiness. I improved the situation, but it seems that isort does not like grouped renaming of imports.
I don't want to disable the pylint error E0401 globally, as import errors can be quite a severe problem.

setup.cfg
@LSchueler
Member Author

For the moment, we don't provide GSTools-Core via conda, which makes it a bit awkward to use if GSTools is installed via conda. But I think that's a problem of GSTools-Core, and as the Rust implementations are labeled as experimental anyway, I think this is ready to be merged.

@LSchueler LSchueler marked this pull request as ready for review November 9, 2021 21:24
@LSchueler LSchueler requested a review from MuellerSeb November 9, 2021 21:24
Member

@MuellerSeb MuellerSeb left a comment

This is really cool! I got some ideas for improvement, maybe you like them. And what about a routine to set the env-var for the number of cores for the rust parallel runs in gstools.config?

GSTools-Core will automatically use all your cores in parallel, without having
to use OpenMP or a local C compiler.
In case you want to restrict the number of threads used, you can set the
environment variable ``RAYON_NUM_THREADS`` to the desired amount.
Member

What about a little function in gstools.config to set this environment variable?
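
Such a helper could be as small as the following sketch. The name `set_rayon_threads` is hypothetical and not part of `gstools.config`; the sketch only writes the `RAYON_NUM_THREADS` variable mentioned in the documentation snippet above, so it affects the whole process and would need to be called before the first parallel computation.

```python
import os


def set_rayon_threads(num_threads):
    """Hypothetical helper: set RAYON_NUM_THREADS for this process."""
    if int(num_threads) < 1:
        raise ValueError("num_threads must be a positive integer")
    # environment values are always strings
    os.environ["RAYON_NUM_THREADS"] = str(int(num_threads))
    return os.environ["RAYON_NUM_THREADS"]
```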

One argument against that is that this variable will affect all Rust software using Rayon, not just GSTools-Core's backend.

If we do provide a programmatic interface, we should probably make the Core initialize a dedicated thread pool and make this function affect only this thread pool?

Member Author

@adamreichold raises a fair point there. But that env. variable would only exist during the shell session from which GSTools is run (I don't have enough experience with Windows to know how it handles env. variables).
Did I understand it correctly that the number of threads used by Rayon can only be set once at the start, and that doing it a second time results in an error? I'm not sure if this is worth the effort.
Another alternative would be to capture the env. variable at the start of GSTools and set it back to that value when exiting, but that doesn't sound very elegant.
I personally would just keep the hint in the readme for the moment. But if you have stronger opinions than I do, I'd be happy to implement your preferred solution.


@adamreichold adamreichold Nov 10, 2021

Did I understand it correctly that the number of threads used by Rayon can only be set once at the start, and that doing it a second time results in an error? I'm not sure if this is worth the effort.

Setting the environment variable will affect Rayon's so-called global thread pool, which is initialized once, either on demand when the first Rayon task is spawned or explicitly by the application. Hence, changing the environment variable after a first computation using Rayon has run has no effect.

This is why I suggested making the GSTools-Core module store an Option<ThreadPool> which would be affected (meaning re-initialized using a given number of threads) by the function from gstools.config. If the option is set, that thread pool is used via ThreadPool::install, otherwise GSTools-Core falls back to the global thread pool.

I personally would just keep the hint in the readme for the moment.

Personally, I would say that online control over parallelism is a somewhat orthogonal issue and would best be served via a follow-up PR / issue.

gstools/config.py
gstools/field/generator.py
gstools/krige/base.py

if config.USE_RUST:
    # pylint: disable=E0401
    from gstools_core import (
Member

What about setting # pragma: no cover behind if config.USE_RUST: once?
Also the # pylint: disable=C0412 is redundant (see line 13) and we could also just add E0401 in line 13.

@LSchueler
Member Author

I think an import error (E0401) should only be disabled locally.

Member

@MuellerSeb MuellerSeb left a comment

LGTM

@MuellerSeb MuellerSeb changed the title Remove Cython code & replace it with GSTools-Core Add optional dependency GSTools-Core Nov 11, 2021
@LSchueler LSchueler merged commit ce978b8 into main Nov 11, 2021
@LSchueler LSchueler deleted the rust-core branch November 11, 2021 16:23
Labels
enhancement New feature or request Performance Performance related stuff. Refactoring Code-Refactoring needed here
3 participants