Clarify attribute properties added to objects in rel-urls. #32
Comments
Seems sensible to me, especially with the overwriting aspect. I can't think of a case where the ability to intentionally set one of these fields to an empty string would be important.
related: #20
While #15 is still waiting on resolving, I do think the How about this?
I'm +1 on most of this, but not sure about the "and not an empty string" part. I'm not sure it would cause any problems, but it feels a bit weird to me, like the parser is trying to fix a publisher mistake. I'll defer to others on that.
It was my understanding that the entire In fact, accepting these attributes with an empty string value ( I will concede to not having encountered it in the wild, with the footnote that I have never used the That said: point two of this issue still stands. We need to resolve what “if any” means for Apart from that, do we want to strip leading/trailing whitespace from the attributes, or do we want to keep all whitespace as authored? New proposal: no skipping of empty values, but surrounding whitespace is stripped:
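A minimal Python sketch of what "keep empty values, but strip surrounding whitespace" could look like. The `add_stripped` helper and the dictionary-shaped `link` input are hypothetical and not taken from any parser; `str.strip()` is used as a stand-in for whatever whitespace definition the spec would settle on.

```python
# Hypothetical helper: copy link attributes into a rel-urls entry,
# stripping surrounding whitespace but keeping empty values.
ATTRIBUTE_KEYS = ("hreflang", "media", "title", "type")


def add_stripped(entry, link, keys=ATTRIBUTE_KEYS):
    for key in keys:
        # Presence is still the only condition; an attribute that is
        # empty (or whitespace-only) is stored as "".
        if key in link and key not in entry:
            entry[key] = link[key].strip()
    return entry


entry = add_stripped({}, {"hreflang": "  en \n", "title": "   "})
print(entry)
```

Under this variant a whitespace-only `title` still claims the key, so the overwriting concern raised in the issue would remain.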
Current spec text:
I think this should clarify 1) what to do with empty attribute values, and 2) what exactly “if any” means.
Currently parsers check the existence of an attribute, but never check its value. Thus empty attributes (e.g. `hreflang=""`) will lead to empty strings being added to the object in `rel-urls`. This means the objects in `rel-urls` will always match the authored HTML.

The drawback of this is that a link parsed later in the source document can no longer overwrite the empty value because the key is already present, even if that later link adds information. Example:
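A minimal Python sketch of this first-wins behaviour. The `links` data is hypothetical markup already reduced to attribute dictionaries in source order, and `build_rel_urls` is an illustrative helper, not code from any existing parser.

```python
# Hypothetical input: two links to the same URL, in source order.
links = [
    {"href": "#a", "rel": "me", "hreflang": ""},
    {"href": "#a", "rel": "me", "hreflang": "en"},
]

ATTRIBUTE_KEYS = ("hreflang", "media", "title", "type")


def build_rel_urls(links):
    """Existence-only check: the first link to set a key wins."""
    rel_urls = {}
    for link in links:
        entry = rel_urls.setdefault(link["href"], {"rels": []})
        for rel in link["rel"].split():
            if rel not in entry["rels"]:
                entry["rels"].append(rel)
        for key in ATTRIBUTE_KEYS:
            # Only presence is tested, so "" counts as a value and
            # blocks any later link from setting the same key.
            if key in link and key not in entry:
                entry[key] = link[key]
    return rel_urls


result = build_rel_urls(links)
print(result)
```

In this sketch the `hreflang` of `#a` stays an empty string, and the `en` from the second link is never stored.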
Will lead to (edited to only show the affected property):
Clearly the empty string adds no information about the URL `#a`, but because it came first in source order that’s the one we keep.

Currently the `text` property should only be added if there is “any” text content. But nowhere does the spec define what text content means or when we consider there to be none.

This has already led to a difference in implementations. The Python parser will add an empty `text` property, the PHP parser will not. Example:

In PHP:
In Python:
I believe that, in both cases, storing an empty string value for any property is useless and adds no data. With the overwriting logic part of the parser, we may even miss out on information. Therefore I would propose rewriting the spec as follows:
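In code terms, the idea amounts to roughly the following. This is a minimal Python sketch and not spec wording: `add_link_properties` is a hypothetical helper, and `text` is assumed to be the element's text content, extracted beforehand.

```python
ATTRIBUTE_KEYS = ("hreflang", "media", "title", "type", "text")


def add_link_properties(entry, link):
    """Add a property only if it is non-empty and not already set."""
    for key in ATTRIBUTE_KEYS:
        value = link.get(key)
        # Skipping "" (and missing keys) leaves the key free for a
        # later link to the same URL to fill in.
        if value and key not in entry:
            entry[key] = value
    return entry


entry = {}
add_link_properties(entry, {"hreflang": "", "text": ""})
add_link_properties(entry, {"hreflang": "en", "text": "Profile"})
print(entry)
```

With empty values skipped, the second link's `hreflang` and `text` are kept even though the first link came earlier in source order.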