TypeScript definitions #5
Comments
It was based on a PR from @dalyIsaac. But as I don't use TypeScript myself, I found that keeping it in sync with the API was a bit cumbersome. There may well be some more slight changes. I think there's no need to supply an external named-character-reference decoder now, and the built-in one handles only a very small number of them; see Limitations. Note that this was mostly a research project to get to know the lexical grammar much better (for other projects), and an exercise in minimalism. Depending on your use case, you may want to check how fast it is for large strings. It might be fine, but I have not checked how well the use of regular expressions here scales. Just a caution. |
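To make the remark about the small built-in decoder concrete, here is a minimal sketch of what a decoder covering only a handful of named character references could look like. This is not the library's code: the table contents and the name `decodeNamed` are assumptions made up for illustration, and the full HTML list has over two thousand entries.

```typescript
// Hypothetical, deliberately tiny table of named character references.
// This is an illustration only, not tiny-html-lexer's actual table.
const knownReferences = new Map<string, string>([
  ["amp", "&"],
  ["lt", "<"],
  ["gt", ">"],
  ["quot", '"'],
  ["apos", "'"],
  ["nbsp", "\u00A0"],
]);

// Decode a reference such as "&amp;". Anything not in the table is
// returned unchanged, so the caller can see that it was not handled.
function decodeNamed(reference: string): string {
  const name = reference.replace(/^&/, "").replace(/;$/, "");
  return knownReferences.get(name) ?? reference;
}
```

With this sketch, `decodeNamed("&amp;")` yields `&`, while a reference outside the table, such as `&eacute;`, comes back as-is.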
Yeah, I love minimalism, that's what drew me in here - this is good work! 🙂👍 (I had to move to |
Thank you! Can you let me know what would have helped? I probably won't have time to work on it, but I'd love to know. I always find feedback really helpful. (And in this case it can also help me with my other projects.) |
I opened issues for the things that would have helped already. 🙂 One area where |
Thanks :)
Yeah, I'll keep that in mind. Tree construction for HTML is a very complex thing (I'm working on finding a neat declarative description for it…), so even if you'd not need a full DOM, doing something like that yourself based on a library like this might have been too much. |
I don't know about that? Something like undom is very little code and works quite well. I wouldn't expect a DOM layer in a minimalist library like this one to have much more in terms of features than that. (Come to think of it, if this was something you wanted to pursue, writing an adapter from your token model to |
Yeah, implementing a minimal DOM data structure is not so much an issue. But building a DOM tree from a 'flat' stream of start-tags, end-tags and text-nodes is, because there are a lot of rules for ignoring tags and/or for adding implicit ones. For example, |
Ah, right - the whole error tolerance thing. Just for fun, I looked at the parser spec and kind of wished I hadn't. 😂 (I think I could write a program to convert all those rules into code faster than I could implement them by hand, haha) |
Yep… 😅 I also wish I didn’t, but after saying “no” decidedly a couple of times… it turned into this html-parser project. Not ready for use yet, though! The parse5 project is really good, it is definitely worth reading the source. I’m just trying another algorithm, hoping to rephrase the rules into a sort of coherent schema. It is not even just error correction, valid HTML still has a lot of such rules. We’ll see where it goes :) |
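As a rough illustration of the tree-construction point discussed above, here is a minimal sketch of building a tree from a flat token stream with just two of the many rules: void elements are never left open, and an open `p` is implicitly closed by certain start tags. The `Token` shape, the `buildTree` name and the rule tables are assumptions for this example; they are not tiny-html-lexer's token model, and the real HTML algorithm checks scopes rather than only the current node.

```typescript
// Hypothetical token shape for illustration; NOT the library's actual output.
type Token =
  | { type: "startTag"; name: string }
  | { type: "endTag"; name: string }
  | { type: "text"; data: string };

interface ElementNode {
  name: string;
  children: Array<ElementNode | string>;
}

// Two sample rule tables only; the real specification has many more cases.
const voidElements = new Set(["br", "img", "hr", "input"]);
const closesOpenP = new Set(["p", "div", "ul", "ol", "h1", "h2"]);

function buildTree(tokens: Token[]): ElementNode {
  const root: ElementNode = { name: "#root", children: [] };
  const stack: ElementNode[] = [root];
  const current = (): ElementNode => stack[stack.length - 1];

  for (const token of tokens) {
    if (token.type === "text") {
      current().children.push(token.data);
    } else if (token.type === "startTag") {
      // Implicit end tag (simplified): some start tags close an open <p>.
      if (current().name === "p" && closesOpenP.has(token.name)) stack.pop();
      const node: ElementNode = { name: token.name, children: [] };
      current().children.push(node);
      // Void elements have no content, so they are not pushed on the stack.
      if (!voidElements.has(token.name)) stack.push(node);
    } else {
      // End tag: close the nearest matching open element and everything
      // opened inside it; a stray end tag with no match is ignored.
      const index = stack.map((n) => n.name).lastIndexOf(token.name);
      if (index > 0) stack.length = index;
    }
  }
  return root;
}
```

Feeding it the tokens for `<p>one<p>two</p></div>` gives two sibling `p` elements under the root, and the stray `</div>` is dropped. Even this toy version needs special cases, which hints at why the full rule set is so large.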
I noticed this comment in the TypeScript definitions:
tiny-html-lexer/lib/index.d.ts.bak
Lines 1 to 3 in ec1be86
I think the API is very nice, but I don't know what else you might have had in mind.
You already tagged a release-candidate, so maybe you just forgot? 🙂
I can try to adjust them to match the current API and submit a PR, if you'd like?