Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: issues with string destination handling in {Graph,Result}.serialize #2065

Merged

Conversation

aucampia
Copy link
Member

@aucampia aucampia commented Jul 29, 2022

Summary of changes

Change {Graph,Result}.serialize to only handle string destinations as URIs if
the schema is file and to treat it as operating system paths in all
other cases.

This is for the following reasons:

  • Result.serialize was treating URI paths as OS paths which only works
    for some cases, for special charachters and percentage encoding it
    does not work.
  • Many valid Unix and Windows paths parse using urlparse and have no netloc, e.g. C:\some\path
    and some:/path, however they should be treated as paths and not as URIs.
  • Graph and Result should behave consistently.

Some caveats in this change:

  • non-file URIs will now be treated as OS paths which may result in
    slightly confusing error messages, such as
    FileNotFoundError: [Errno 2] No such file or directory: 'http://example.com/'
    if http://example.com/ is passed.
  • some valid file URIs (e.g. file:/path/to/file from https://datatracker.ietf.org/doc/html/rfc8089)
    are also valid Posix paths but will be treated as file-URIs instead of
    Posix paths. For Graph.serialize this ambiguity can be avoided by
    using pathlib.Path, but for Result.serialize there is currently no
    way to avoid it, though I will work on Make serialize() on a CONSTRUCT result act like normal g.serialize() #1834
    soon and in that provide a way to avoid the ambiguity there also.

Checklist

  • Checked that there aren't other open pull requests for
    the same change.
  • Added tests for any changes that have a runtime impact.
  • Checked that all tests and type checking passes.
  • For changes that have a potential impact on users of this project:
    • Considered updating our changelog (CHANGELOG.md).
  • Considered granting push permissions to the PR branch,
    so maintainers can fix minor issues and keep your PR up to date.

@aucampia aucampia force-pushed the iwana-20220729T2252-windows_issues branch 6 times, most recently from 5cd02f4 to 3ed412d Compare July 31, 2022 00:06
@aucampia aucampia changed the title fix: address some windows specific issues fix: issues with string destination handling in {Graph,Result}.serialize Jul 31, 2022
@aucampia aucampia force-pushed the iwana-20220729T2252-windows_issues branch from 3ed412d to b53be4b Compare July 31, 2022 00:10
@aucampia aucampia marked this pull request as ready for review July 31, 2022 00:12
@aucampia aucampia requested a review from a team July 31, 2022 00:12
@aucampia aucampia added the core Relates to core functionality of RDFLib, i.e. `rdflib.{graph,store,term}` label Jul 31, 2022
@coveralls
Copy link

coveralls commented Jul 31, 2022

Coverage Status

Coverage decreased (-0.008%) to 90.447% when pulling 4e90415 on aucampia:iwana-20220729T2252-windows_issues into 8d58fd0 on RDFLib:master.

@aucampia
Copy link
Member Author

Not sure what is up with coverage, I'm guessing it is somehow related to the latest build on master failing (https://github.com/RDFLib/rdflib/actions?query=branch%3Amaster), I will rebase once #2068 is merged.

@aucampia aucampia force-pushed the iwana-20220729T2252-windows_issues branch from b53be4b to ea6bd62 Compare July 31, 2022 00:22
@aucampia
Copy link
Member Author

Not sure what is up with coverage, I'm guessing it is somehow related to the latest build on master failing (https://github.com/RDFLib/rdflib/actions?query=branch%3Amaster), I will rebase once #2068 is merged.

Nevermind, it is because I fiddled with tox.ini when I was trying to debug this.

@aucampia aucampia force-pushed the iwana-20220729T2252-windows_issues branch 2 times, most recently from 72abe6d to 00a9167 Compare July 31, 2022 00:40
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, windows tax - paid, gj 👍

@aucampia aucampia force-pushed the iwana-20220729T2252-windows_issues branch from 00a9167 to 4e90415 Compare July 31, 2022 10:45
…lize`

Change `{Graph,Result}.serialize` to only handle string destinations as URIs if
the schema is `file` and to treat it as operating system paths in all
other cases.

This is for the following reasons:
- `Result.serialize` was treating URI paths as OS paths which only works
  for some cases, for special charachters and percentage encoding it
  does not work.
- Many valid Unix and Windows paths parse using `urlparse` and have no netloc, e.g. `C:\some\path`
  and `some:/path`, however they should be treated as paths and not as URIs.
- `Graph` and `Result` should behave consistently.

Some caveats in this change:
- non-file URIs will now be treated as OS paths which may result in
  slightly confusing error messages, such as
  `FileNotFoundError: [Errno 2] No such file or directory: 'http://example.com/'`
  if http://example.com/ is passed.
- some valid file URIs (e.g. `file:/path/to/file` from https://datatracker.ietf.org/doc/html/rfc8089)
  are also valid Posix paths but will be treated as file-URIs instead of
  Posix paths. For `Graph.serialize` this ambiguity can be avoided by
  using `pathlib.Path`, but for `Result.serialize` there is currently no
  way to avoid it, though I will work on RDFLib#1834
  soon and in that provide a way to avoid the ambiguity there also.
@aucampia aucampia merged commit 2d61846 into RDFLib:master Aug 2, 2022
@aucampia aucampia deleted the iwana-20220729T2252-windows_issues branch April 9, 2023 15:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Relates to core functionality of RDFLib, i.e. `rdflib.{graph,store,term}`
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Result.serialize() path handling is broken for windows paths and some other cases
2 participants