Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalize example.com URLs in input and output #112

Closed

Conversation

jgarber623
Copy link
Member

@jgarber623 jgarber623 commented Apr 15, 2020

There were a few cases where HTML input and JSON output mixed schemes for URLs like http://example.com and https://example.com and where the URL in the JSON output lacked a trailing slash.

This PR normalizes on http:// since it's the more widely-used protocol in the remainder of the test suite and adds a trailing slash in cases where the resulting value is http://example.com/.

@jgarber623 jgarber623 marked this pull request as draft April 15, 2020 19:11
@jgarber623 jgarber623 marked this pull request as ready for review April 15, 2020 19:13
@jgarber623 jgarber623 changed the title Normalize example.com URLs to use http:// scheme Normalize example.com URLs in input and output Apr 20, 2020
@gRegorLove
Copy link
Member

@jgarber623 Just checking if this PR is ready or still in-progress?

@jgarber623
Copy link
Member Author

@gRegorLove It should be ready! Came across a case I missed (thus the extra push today).

I'll resolve the conflicts with master and then we should be good!

@jgarber623 jgarber623 force-pushed the normalize-example-com-urls branch from 9cf8ba8 to 7f6d7c9 Compare April 22, 2020 00:21
@jgarber623
Copy link
Member Author

@gRegorLove Conflicts resolved, so this should be good to go! 👍

@jansauer
Copy link
Contributor

jansauer commented Apr 29, 2020

Have a personal wip for the same thing but opted for using https as schema for the url context. For one it promotes the use of tls and looks more modern but more importantly a lot of the test cases that have a base tag use http://example.com. With this change non of them would assert the handling of the url set by the base tag.

Furthermore we should possibly add a info in the readme about the default url context.

@jgarber623 jgarber623 force-pushed the normalize-example-com-urls branch from 7f6d7c9 to e98850d Compare April 29, 2020 18:03
@gRegorLove
Copy link
Member

I don't have a strong preference between http and https example.com in the tests. Looks like there's still a mixture of them in this PR, though.

I think we should resolve microformats/microformats2-parsing#9 before merging the trailing slashes update.

@jgarber623 jgarber623 force-pushed the normalize-example-com-urls branch from e98850d to 0bd6eae Compare April 30, 2020 17:44
@jgarber623 jgarber623 force-pushed the normalize-example-com-urls branch from 0bd6eae to 0a2aea4 Compare May 1, 2020 03:21
Copy link
Member

@tantek tantek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with example.com URL tests using http: instead of https:. Also let's get this trailing slash fix merged because this is for valid URLs, whereas microformats/microformats2-parsing#9 is primarily about invalid URLs and should not be considered a blocker for this.

@Zegnat
Copy link
Member

Zegnat commented May 2, 2020

Also let's get this trailing slash fix merged because this is for valid URLs […]

I strongly disagree. This PR introduces a test about adding slashes that currently fails in all parsers on microformats.io (tested: Go, PHP, Python, Ruby; sadly Node is offline) and the specification is not clear about whether this normalisation should be applied at all.

It just so happens that the discussion about this is taking place in microformats/microformats2-parsing#9. We can (and maybe should) split it off into a new issue, but there is no denying that we are currently only on step two of the mf2 change control: propose a resolution. (Or possibly step 4, iterating on said resolution, depending on how far you feel the conversation has gotten.)

I think it sets a very weird precedent to add something to the test suite that has no documented implementer consensus nor even a single live demonstrable implementation.

@tantek tantek self-requested a review May 21, 2020 19:09
Copy link
Member

@tantek tantek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Zegnat you have convinced me we're not ready for these changes. Thank you for your attention to detail and the process. Yes you're right these changes require further discussion in the issue you linked, and subsequent resolution before we can move forward here. So the required changes are:

  • participation and resolution of that issue (along with sufficient parser developer approval / non-objection)
  • updating the parsing spec accordingly.

If there was a way of adding those as dependencies I would do so, but not sure how to do that in a review for pull request, so marking this as "Request changes" as a proxy for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants