Resolve relative URLs in e-* HTML content #181

Closed
snarfed opened this issue Jan 30, 2023 · 6 comments · Fixed by #201

Comments

@snarfed
Member

snarfed commented Jan 30, 2023

Inspired by snarfed/bridgy-fed#390 (comment). The open parsing issue is microformats/microformats2-parsing#38, and related implementation details are in #138.

@angelogladding
Collaborator

From the HTML attributes table referenced in @Zegnat's proposal I see the following URL attributes:

  • cite, data, href, src, poster
  • ping
  • action, formaction
  • itemid, itemprop, itemtype

Do we need to support ping, forms or microdata?

Can this wait until after the transition to py3?

@snarfed
Member Author

snarfed commented Apr 17, 2023

Thanks! I expect it's fine to ship an initial implementation of this with just the first line: href, src, cite, data, poster, probably in that priority order. The rest seem very low priority and/or unwanted entirely.

(Also, mf2py migrated to Python 3 ages ago, so that's not a blocker here.)
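
To make the proposed first pass concrete, here is a minimal sketch of resolving only href, src, cite, data, and poster inside an HTML fragment. It is not mf2py's actual code: it uses only the Python standard library, and the names `URL_ATTRS`, `_Resolver`, and `resolve_urls` are hypothetical. It also assumes the base URL has already been determined (for example from the page URL or a base element).

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

# The five attributes proposed for the initial implementation.
URL_ATTRS = ("href", "src", "cite", "data", "poster")


class _Resolver(HTMLParser):
    """Re-serializes a fragment, resolving URL attributes against a base."""

    def __init__(self, base_url):
        # Keep character references as written so text passes through unchanged.
        super().__init__(convert_charrefs=False)
        self.base_url = base_url
        self.out = []

    def _tag(self, tag, attrs, close=""):
        parts = [tag]
        for name, value in attrs:
            if value is None:  # boolean attribute such as "controls"
                parts.append(name)
                continue
            if name in URL_ATTRS:
                value = urljoin(self.base_url, value.strip())
            # The parser unescapes attribute values, so escape them again.
            parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + close + ">")

    def handle_starttag(self, tag, attrs):
        self._tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._tag(tag, attrs, " /")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")


def resolve_urls(html, base_url):
    """Return the fragment with relative URL attributes made absolute."""
    parser = _Resolver(base_url)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


fragment = '<p>See <a href="/post">this</a> <img src="pic.png" alt="x"></p>'
print(resolve_urls(fragment, "https://example.com/blog/entry"))
```

Attributes outside `URL_ATTRS` are written back unchanged, so the remaining attribute groups (ping, forms, microdata) could be added later by extending the tuple.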

@angelogladding
Collaborator

Oops, I meant to say "transition away from py2" or "transition to py3 only".

With py3 wheels we can install lxml without having to compile and thus it's possible to add it as a core dependency.

If that were to be the case we could use their machinery: https://lxml.de/lxmlhtml.html#working-with-links

Here's their code: https://github.com/lxml/lxml/blob/11b33a83ad689bd16bd0a98c14cda51a90572b73/src/lxml/html/__init__.py#L435

It's a bit of a stretch but also a nod toward feature freeze in anticipation of deprecation of py2.

@angelogladding
Collaborator

lxml does not make relative srcset items absolute and I see now that other properties are already being made absolute.

I'll work on support for cite, data, href, src, poster. I imagine srcset on an img found inside the e-* should be made absolute as well.
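
For illustration, making srcset absolute could look roughly like the sketch below. The function name `resolve_srcset` is hypothetical and this is not the code that was eventually merged. It assumes candidates are separated by commas and that the URLs themselves contain no commas, which is a simplification of the HTML srcset parsing algorithm.

```python
from urllib.parse import urljoin


def resolve_srcset(srcset, base_url):
    """Resolve each URL in a comma-separated list of image candidates.

    Simplifying assumption: URLs do not contain commas.
    """
    resolved = []
    for candidate in srcset.split(","):
        parts = candidate.split()
        if not parts:  # skip empty entries, e.g. from a trailing comma
            continue
        url, descriptors = parts[0], parts[1:]
        # Keep width/density descriptors ("480w", "2x") exactly as written.
        resolved.append(" ".join([urljoin(base_url, url), *descriptors]))
    return ", ".join(resolved)


print(resolve_srcset("a.jpg 1x, /b.jpg 2x", "https://example.com/dir/page"))
```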

@JKingweb

JKingweb commented Jul 7, 2023

I went through this exercise a few weeks ago, and compiled a list of URL-holding attributes from the HTML, MathML 3, and SVG Tiny specifications:

  • action on form
  • altimg on math:math
  • cdgroup on math:math
  • cite on blockquote, del, ins, q
  • data on object
  • definitionURL on math:*
  • formaction on button, input
  • handler on svg:listener
  • href on a, area, base, link, math:*
  • itemid on html:*
  • itemprop on html:*
  • itemtype on html:*
  • ping on a, area
  • src on audio, embed, iframe, img, input, script, source, track, video, math:mglyph, math:share, math:annotation, math:annotation-xml
  • xlink:href on svg:*, math:*
  • xlink:role on svg:*, math:*
  • xlink:arcrole on svg:*, math:*
  • xml:base on svg:*, math:*

Then you also have image candidate strings (comma-separated):

  • imagesrcset on link
  • srcset on img, source

And finally, URL token-lists (whitespace-separated):

  • macros on math:math
  • requiredExtensions on svg:a, svg:animate, svg:animateColor, svg:animateMotion, svg:animateTransform, svg:animation, svg:audio, svg:circle, svg:desc, svg:discard, svg:ellipse, svg:foreignObject, svg:g, svg:image, svg:line, svg:metadata, svg:path, svg:polygon, svg:polyline, svg:rect, svg:set, svg:switch, svg:tbreak, svg:text, svg:textArea, svg:title, svg:tspan, svg:use, svg:video
  • requiredFeatures on svg:a, svg:animate, svg:animateColor, svg:animateMotion, svg:animateTransform, svg:animation, svg:audio, svg:circle, svg:desc, svg:discard, svg:ellipse, svg:foreignObject, svg:g, svg:image, svg:line, svg:metadata, svg:path, svg:polygon, svg:polyline, svg:rect, svg:set, svg:switch, svg:tbreak, svg:text, svg:textArea, svg:title, svg:tspan, svg:use, svg:video

Of all these, href on base is a special case (its own value is not considered for resolution), as is xml:base. Attributes on foreign elements should probably also perform xml:base resolution in addition to HTML base resolution.
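
The three attribute shapes listed above (single URL, comma-separated image candidates, whitespace-separated token lists) suggest a small dispatch on attribute name. The sketch below is only an illustration under several assumptions: the names `SINGLE_URL`, `CANDIDATE_LIST`, `TOKEN_LIST`, and `resolve_attribute` are hypothetical, only a subset of the single-URL attributes is included, attribute names are compared in lowercase, and xml:base resolution for foreign elements is not attempted.

```python
from urllib.parse import urljoin

# Lowercased names, since many HTML parsers normalize attribute case.
SINGLE_URL = {"cite", "data", "href", "poster", "src"}  # subset of the list
CANDIDATE_LIST = {"imagesrcset", "srcset"}  # comma-separated candidates
TOKEN_LIST = {"macros", "requiredextensions", "requiredfeatures"}  # whitespace-separated


def resolve_attribute(tag, name, value, base_url):
    """Resolve one attribute value according to the shape of its contents."""
    tag, name = tag.lower(), name.lower()
    # Special cases: these attributes define the base, so leave them alone.
    if name == "xml:base" or (tag == "base" and name == "href"):
        return value
    if name in SINGLE_URL:
        return urljoin(base_url, value.strip())
    if name in TOKEN_LIST:
        return " ".join(urljoin(base_url, token) for token in value.split())
    if name in CANDIDATE_LIST:
        # Simplification: assumes URLs themselves contain no commas.
        resolved = []
        for candidate in value.split(","):
            parts = candidate.split()
            if parts:
                resolved.append(" ".join([urljoin(base_url, parts[0]), *parts[1:]]))
        return ", ".join(resolved)
    return value  # not a URL-holding attribute in this sketch


base = "https://example.com/a/b"
print(resolve_attribute("math", "macros", "m1.xml /m2.xml", base))
print(resolve_attribute("base", "href", "/x/", base))
```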


snarfed added a commit to snarfed/bridgy-fed that referenced this issue Dec 8, 2023
snarfed added a commit to snarfed/bridgy-fed that referenced this issue Dec 12, 2023