Using generics for bytes / code units / code points #17

Closed
annevk opened this issue Nov 8, 2016 · 6 comments

Comments

@annevk
Member

annevk commented Nov 8, 2016

See #1 for some discussion on code units.

A term like "ASCII digit" and others like it are equally meaningful for all three primitives, since the primitives are defined as integers. Should we define these terms as generics so they can apply to each primitive?

Alternatively, we could change the phrasing, e.g., "An ASCII digit is a byte, code unit, or code point in the range 0x30 to 0x39, inclusive." This would also require slight tweaking of how we define "byte" and "code point".
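
To illustrate the generic idea, here is a minimal sketch in Python. It assumes each primitive is represented as a plain integer. The function name `is_ascii_digit` is hypothetical and is not taken from any spec text.

```python
def is_ascii_digit(value: int) -> bool:
    """Return True if the integer is in the range 0x30 to 0x39, inclusive.

    The check only looks at the integer value, so it applies equally to a
    byte, a code unit, or a code point (hypothetical helper, for illustration).
    """
    return 0x30 <= value <= 0x39


# The same character "7" seen as each of the three primitives.
byte = b"7"[0]                                                  # a byte
code_unit = int.from_bytes("7".encode("utf-16-le"), "little")   # a UTF-16 code unit
code_point = ord("7")                                           # a code point

# All three are the integer 0x37, so one definition covers them all.
results = [is_ascii_digit(v) for v in (byte, code_unit, code_point)]
```

Because all three values are the same integer, a single definition is enough. The open question in this issue is only how the prose should be worded.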

@domenic
Member

domenic commented Nov 8, 2016

Currently they only operate on code points, I see.

I guess I would like to see any changes here motivated by usage. I am sure such usage exists but it would help make things clearer.

@annevk
Member Author

annevk commented Mar 17, 2017

https://mimesniff.spec.whatwg.org/#parsing-a-mime-type tries to use the term ASCII whitespace on bytes. We might even have to define a MIME type parser for both byte sequences and strings, but I'm not sure about that. It would require some more investigation into what user agents actually do with all the various MIME type entry points.

@annevk
Member Author

annevk commented Mar 27, 2017

In whatwg/html#2471 I found a couple places where HTML could have used a generic "ASCII whitespace" as well. We could just define "byte whitespace" I suppose and use that.

@annevk
Member Author

annevk commented Mar 28, 2017

More arguments for generics: URL splits a byte sequence in https://url.spec.whatwg.org/#concept-urlencoded-parser.

@annevk
Member Author

annevk commented Mar 28, 2017

Perhaps though that can be refactored into doing decoding into a string first.

@annevk
Member Author

annevk commented May 18, 2020

I'm pretty happy to use isomorphic decode/encode and their underlying technique as documented by 88fa454 instead of having this. Happy to reconsider if this remains a common need somewhere though.
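
A rough sketch of that technique in Python: isomorphic decode maps each byte to the code point with the same value, so terms defined on code points (such as ASCII whitespace) can be applied to the resulting string, and the result can be mapped back to bytes afterwards. The helper names below are hypothetical, and the whitespace-stripping function is only an illustrative use, not something taken from a spec.

```python
# ASCII whitespace as code points: TAB, LF, FF, CR, SPACE.
ASCII_WHITESPACE = "\t\n\x0c\r "


def isomorphic_decode(data: bytes) -> str:
    """Return a string whose code points equal the byte values, in order."""
    return "".join(chr(b) for b in data)


def isomorphic_encode(text: str) -> bytes:
    """Inverse of isomorphic_decode; only valid for code points <= U+00FF."""
    if any(ord(c) > 0xFF for c in text):
        raise ValueError("string contains a code point greater than U+00FF")
    return bytes(ord(c) for c in text)


def strip_ascii_whitespace_bytes(data: bytes) -> bytes:
    """Strip leading and trailing ASCII whitespace from a byte sequence
    by going through a string (hypothetical helper for illustration)."""
    text = isomorphic_decode(data)
    return isomorphic_encode(text.strip(ASCII_WHITESPACE))


stripped = strip_ascii_whitespace_bytes(b" \ttext/html\r\n")
```

The round trip is lossless for every byte value, which is why a code-point-only definition can stand in for a byte-level one here.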

@annevk annevk closed this as completed May 18, 2020

2 participants