Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search for GDAL and PROJ data folders during startup #52

Merged
merged 9 commits into from
Mar 10, 2022

Conversation

brendan-ward
Copy link
Member

@brendan-ward brendan-ward commented Mar 9, 2022

This follows the same approach as used by Fiona for searching for GDAL and PROJ data folders.

GDAL data folders are searched in the following precedence:

  • wheel copy of gdal_data
  • auto detection by GDAL (including GDAL_DATA env var)
  • other well-known paths under sys.prefix

PROJ data folders are searched in the following precedence:

  • wheel copy of proj
  • default detection by GDAL (including PROJ_LIB env var)
  • other well-known paths under sys.prefix

@brendan-ward brendan-ward marked this pull request as ready for review March 9, 2022 18:33
@brendan-ward
Copy link
Member Author

@jorisvandenbossche this assumes that GDAL and Proj data files have been packaged into the wheels just like in Fiona, at

  • pyogrio/gdal_data
  • pyogrio/proj_data

Once those are in place, this should fix the issues finding proj.db in #49

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for looking into this, looks great!

pyogrio/_ogr.pyx Outdated
"""

if "GDAL_DATA" in os.environ:
set_gdal_config_options({"GDAL_DATA": os.environ["GDAL_DATA"]})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that it gives that much of simplification code-wise, but I am wondering to what extent it is needed to set the option in this case, because GDAL itself already will check for the GDAL_DATA env variable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(but it also doesn't hurt, so my comment doesn't matter that much)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was from Fiona, so I assume there was a good reason for it.

My guess is that it is related to precedence: we want to check this before falling back to checking the wheel location to give greatest control over how it gets set (otherwise, if we left it to GDAL to use env var, it would be after the check for the wheel location).

That said, I'm wondering now if the wheel location should take top precedence; if we deliberately put the data files there, we know they are correct for the version of GDAL pyogrio is compiled against, so it really doesn't make sense to let the user override that. So then we could fall back to letting GDAL detect this env var.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise, if we left it to GDAL to use env var, it would be after the check for the wheel location).

Yes, we should still return early here if GDAL_DATA is properly set (so my suggestion was only about removing this single line, and not the whole block)

That said, I'm wondering now if the wheel location should take top precedence

That's a good point, I also can't think of any way that you wouldn't want to use those files if installed from a wheel.

One use case I am thinking about where this could actually matter is the (certainly not recommended! :)) situation of using a conda environment which has GDAL/PROJ installed (eg from installing fiona with conda), and then installing pyogrio from a wheel in that environment. In that case GDAL_DATA will be set and pointing to conda's GDAL, and thus it could give problems if first using the env variable (assuming it is still using the actual libraries included in the wheel).

pyogrio/_ogr.pyx Outdated Show resolved Hide resolved
pyogrio/_ogr.pyx Outdated Show resolved Hide resolved
@brendan-ward
Copy link
Member Author

Latest changes remove specific check for GDAL_DATA, 'PROJ_LIB` (just let GDAL detect these env vars instead of us); promotes wheel path to highest precedence.

pyogrio/_ogr.pyx Outdated Show resolved Hide resolved
pyogrio/_ogr.pyx Outdated Show resolved Hide resolved
pyogrio/_ogr.pyx Outdated Show resolved Hide resolved
pyogrio/_ogr.pyx Outdated Show resolved Hide resolved
@brendan-ward brendan-ward merged commit ec2f264 into main Mar 10, 2022
@brendan-ward brendan-ward deleted the find_gdal_data branch March 10, 2022 00:59
@jorisvandenbossche
Copy link
Member

Thanks Brendan! (and indeed just checking (after setting the wheel path) if GDAL already has the data is a nice simplification!)
I updated #49 to make use of this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants