Refactor RuneReader
#441
Comments
I wonder if it's wise to work with bytes rather than runes. I think using bytes mostly depends on the assumption that any character encoding we encounter is going to be compatible with ASCII. Some character encodings, like UTF-16, are incompatible.

A few other folks have implemented buffered rune readers. Other folks seem to roll their own using [...]. go-toml used to use go-buffruneio, but it seems like they just work on bytes now. It would be interesting to see if go-toml even works on UTF-16 files. Err, no, it seems that TOML must be UTF-8, so that's why they can do that.

It might be better to just scrap my implementation and use go-buffruneio, as it seems to do everything mine does, but it's pretty old and doesn't seem to have gotten any updates. Maybe another project to contribute a bit of maintenance to.
Another idea is to refactor the [...]. I'm not sure I like the "unbounded growth unless you call [...]". I don't think I can use [...]. Or maybe expand it into a full lexing package that allows for some customization?
One person created a lexer package based on Rob Pike's GDG Sydney talk (related: #459). This makes me want to try it with generics, but it seems someone has done this already. It seems fairly reasonable. I wonder how fast it would be relative to an implementation I could write; generics and all the other code needed to generalize it might slow it down a bit.
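For reference, the state-function style from that talk combined with generics might look roughly like the sketch below. Every name here (Token, StateFn, Lexer, Run, tokKind, and the lex functions) is hypothetical and does not reflect the API of the packages mentioned above; the type parameter K is simply the caller's token kind.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// Token is a lexed item; its kind type K is chosen by the caller.
type Token[K comparable] struct {
	Kind K
	Text string
}

// StateFn is one lexer state; it returns the next state, or nil to stop.
type StateFn[K comparable] func(*Lexer[K]) StateFn[K]

// Lexer holds the input, the pending span, and the tokens emitted so far.
type Lexer[K comparable] struct {
	input      string
	start, pos int
	tokens     []Token[K]
}

const eof rune = -1

// Peek returns the next rune without consuming it, or eof at end of input.
func (l *Lexer[K]) Peek() rune {
	if l.pos >= len(l.input) {
		return eof
	}
	r, _ := utf8.DecodeRuneInString(l.input[l.pos:])
	return r
}

// Next consumes one rune; it does nothing at end of input.
func (l *Lexer[K]) Next() {
	if l.pos < len(l.input) {
		_, w := utf8.DecodeRuneInString(l.input[l.pos:])
		l.pos += w
	}
}

// Emit records the pending text as a token of kind k.
func (l *Lexer[K]) Emit(k K) {
	l.tokens = append(l.tokens, Token[K]{Kind: k, Text: l.input[l.start:l.pos]})
	l.start = l.pos
}

// Ignore discards the pending text.
func (l *Lexer[K]) Ignore() { l.start = l.pos }

// Run drives the state machine until a state returns nil.
func Run[K comparable](input string, initial StateFn[K]) []Token[K] {
	l := &Lexer[K]{input: input}
	for state := initial; state != nil; {
		state = state(l)
	}
	return l.tokens
}

// tokKind and the states below are a toy grammar: words and numbers.
type tokKind int

const (
	word tokKind = iota
	number
)

func lexStart(l *Lexer[tokKind]) StateFn[tokKind] {
	switch r := l.Peek(); {
	case r == eof:
		return nil
	case unicode.IsDigit(r):
		return lexNumber
	case unicode.IsLetter(r):
		return lexWord
	default: // skip anything else, such as whitespace
		l.Next()
		l.Ignore()
		return lexStart
	}
}

func lexNumber(l *Lexer[tokKind]) StateFn[tokKind] {
	for unicode.IsDigit(l.Peek()) {
		l.Next()
	}
	l.Emit(number)
	return lexStart
}

func lexWord(l *Lexer[tokKind]) StateFn[tokKind] {
	for unicode.IsLetter(l.Peek()) {
		l.Next()
	}
	l.Emit(word)
	return lexStart
}

func main() {
	for _, t := range Run[tokKind]("héllo 42 wörld", lexStart) {
		fmt.Printf("%d %q\n", t.Kind, t.Text)
	}
}
```

This sketch collects tokens in a slice rather than sending them over a channel as the talk does, which keeps it short. Whether the generic indirection costs anything measurable would need a benchmark against a hand-written lexer.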
Hmm, it seems that the [...]. It looks like an interesting idea, but I would have implemented it differently for sure, using just the simple [...]
RuneReader was meant to be used as an easy way to read charset-converted text from a file into Go's UTF-8 strings. However, this doesn't seem necessary, and todos should be able to deal purely in bytes (so long as language identifiers for comments or strings are not defined in non-ASCII characters). This is especially true if the output text of todos is not converted into the local terminal's character set.
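A minimal sketch of why a bytes-only scan can be enough for UTF-8 input: every byte of a multi-byte UTF-8 sequence has its high bit set, so an ASCII marker can never match inside a non-ASCII character. The function findMarker and the "//" marker are assumptions made for illustration, not how todos actually scans.

```go
package main

import (
	"bytes"
	"fmt"
)

// findMarker is a hypothetical helper that returns the byte offset of the
// first occurrence of an ASCII marker in src, or -1 if it is absent.
// It never decodes runes.
func findMarker(src, marker []byte) int {
	return bytes.Index(src, marker)
}

func main() {
	// Non-ASCII text appears both before and after the marker.
	src := []byte("name = 'héllo wörld' // TODO: 修正")

	if i := findMarker(src, []byte("//")); i >= 0 {
		// The bytes after the marker pass through untouched, so the
		// comment text keeps its original encoding on output.
		fmt.Printf("%s\n", src[i:])
	}
}
```

The scan works on raw bytes and copies the comment text through unchanged, which matches the case where output is not converted to the terminal's character set.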