Bidirectional iterators #18
Considering you rejected DoubleEndedIterator on Lines, are you also opposed to DoubleEndedIterator on Chars? On Chars, it would be possible to do a fairly naive implementation that simply scans for character boundaries from the back of the string. It's still O(1), since the number of bytes to scan is bounded (at most 4).
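A minimal sketch of the backward boundary scan described in this comment, written against a plain `&str` rather than Ropey's internals. The function names `prev_char_start` and `chars_from_back` are hypothetical and only illustrate the idea; this is not Ropey's implementation.

```rust
/// Returns the byte index where the character ending at `end` starts.
/// Assumes `bytes` is valid UTF-8, `end > 0`, and `end` is a char boundary.
fn prev_char_start(bytes: &[u8], end: usize) -> usize {
    let mut i = end - 1;
    // UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    // A char is at most 4 bytes long, so this loop steps back at most 3 times.
    while i > 0 && (bytes[i] & 0b1100_0000) == 0b1000_0000 {
        i -= 1;
    }
    i
}

/// Collects the chars of `s` from back to front using the scan above.
fn chars_from_back(s: &str) -> Vec<char> {
    let mut out = Vec::new();
    let mut end = s.len();
    while end > 0 {
        let start = prev_char_start(s.as_bytes(), end);
        // `start..end` spans exactly one char in valid UTF-8.
        out.push(s[start..end].chars().next().unwrap());
        end = start;
    }
    out
}
```

Each backward step inspects a bounded number of bytes regardless of the text length, which is the O(1) argument made above.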
Hi @AljoschaMeyer, thanks for the question! It's not a matter of algorithmic complexity or implementation difficulty. All of Ropey's iterators (lines included) could implement DoubleEndedIterator in a reasonable way. The issue is that it's not the right API. What you generally want for text is something along the lines of C++'s bidirectional iterator concept. It's essentially a "cursor" that you can move either forwards or backwards through the text as you please. The API surface is very simple: you just add a The DoubleEndedIterator API doesn't allow anything like that, and is IMO very limited in scope and application. Implementing it wouldn't provide what you need for text processing generally. Having said that, I now feel like I've been holding my breath for too long, waiting for an appropriate trait in Once a I hope that answers your question!
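To make the "cursor" idea concrete, here is a rough sketch over a plain `&str`. The type `CharCursor` and its `prev` method are hypothetical names chosen for illustration; the thread does not show the method name Ropey ended up with, and this is not its code.

```rust
/// Hypothetical cursor over the chars of a `&str`. It sits *between*
/// characters: `next` steps over the char ahead, `prev` over the one behind.
struct CharCursor<'a> {
    text: &'a str,
    pos: usize, // byte offset, always on a char boundary
}

impl<'a> CharCursor<'a> {
    fn new(text: &'a str, pos: usize) -> Self {
        assert!(text.is_char_boundary(pos));
        CharCursor { text, pos }
    }

    /// Moves one char backwards and returns the char that was stepped over.
    fn prev(&mut self) -> Option<char> {
        let c = self.text[..self.pos].chars().next_back()?;
        self.pos -= c.len_utf8();
        Some(c)
    }
}

impl<'a> Iterator for CharCursor<'a> {
    type Item = char;

    /// Moves one char forwards and returns the char that was stepped over.
    fn next(&mut self) -> Option<char> {
        let c = self.text[self.pos..].chars().next()?;
        self.pos += c.len_utf8();
        Some(c)
    }
}
```

The contrast with `DoubleEndedIterator` is that `next_back` consumes items from the far end of the range, so an item already yielded by `next` can never be revisited. With the cursor above, calling `next` and then `prev` returns the same char twice.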
Yes, that answers it, thanks. I personally don't need that full flexibility, only the ability to iterate backwards from the end to the beginning of the string. But your reasoning makes sense. I'm not quite sure why to wait for a std trait though. xi-rope offers a Cursor, im-rs has a Focus. Ad-hoc polymorphism would be nice, but any std trait could always be retroactively implemented on the cursor struct.
In using ropey for lsp-diff-server, text manipulation ergonomics was the biggest problem. Feature parity with A Cursor API is a good step, but I think unit (Line, Char, Byte, UTF-16 Char, ...) based indexing is a more general and ergonomic solution. I will try to make a proof of concept soon. Ideally all the iterators would be provided by a string units library.
Yeah, that's basically what I expect I'll do now, except without introducing any new types (the existing iterators should work fine, just adding a The reason I'd ideally like to implement a
I'm not sure what you mean here by "unit based indexing"? You can already index into a Ropey rope by line, char, and byte offset... so I assume that's not what you mean? But I'm not sure what else you could mean here.
But the time complexity is worse: iterating via a properly implemented focus is O(n), while random access of each element is O(n log n). Yup, thanks for hearing me out. Please don't feel pressured to implement this. I (most likely) won't make the time to contribute it, so I don't expect you to do so either. I'm not blocked, and I'm grateful for the crate :)
No worries! Thanks for bringing this up. I think I needed to be poked about it, ha ha. :-) I probably won't get to it for a bit--I'm currently pretty busy with other things. But I'll get to it when I have a reasonable chunk of time. I expect I'll start with the
I've started implementing this in the Current status:
All functionality has been implemented now, for all four types of iterators. I've also merged into master and deleted the bidir_iter branch. As always, testing is appreciated! Documentation is extremely sparse still, so I'm leaving this issue open until I properly document things.
Okay, several bug fixes and documentation passes later, and I believe everything is ready. (EDIT: the documentation for master can be found here. Probably important for getting feedback on the API! The documentation in the I do have one API question for those of you who are interested in this feature: right now the various This mirrors the individual chunk-fetching methods, and I think it's the right API because without that information you don't actually know where you ended up in the rope. But it also feels a little awkward compared to the other iterator-making methods, including vanilla What do you guys think? I'm basically just trying to think if there are significant use-cases where you would want to create a chunk iterator this way and not need that information. And I think the answer is no. But I want to give a chance for some feedback before committing to the API. If no one has any feedback in the next two weeks, I'll stick with what I've already implemented and make a new release (v1.1.0) with this new functionality.
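A rough sketch of the design question raised in this comment, under the assumption that the elided sentence is about chunk-iterator constructors that also return where the starting chunk begins. `ChunkedText` and `chunks_from_byte` are hypothetical stand-ins built on a plain `Vec`, not Ropey's types or method names.

```rust
/// Hypothetical stand-in for a rope: text stored as a flat list of chunks.
struct ChunkedText<'a> {
    chunks: Vec<&'a str>,
}

impl<'a> ChunkedText<'a> {
    /// Returns an iterator positioned at the chunk containing `byte_idx`,
    /// together with the byte offset at which that chunk starts.
    fn chunks_from_byte(&self, byte_idx: usize) -> (std::slice::Iter<'_, &'a str>, usize) {
        let mut start = 0;
        for (i, chunk) in self.chunks.iter().enumerate() {
            if byte_idx < start + chunk.len() {
                return (self.chunks[i..].iter(), start);
            }
            start += chunk.len();
        }
        // Index at or past the end: empty iterator, offset is the total length.
        (self.chunks[self.chunks.len()..].iter(), start)
    }
}
```

Because chunk boundaries rarely coincide with the requested index, the caller needs the returned offset to relate positions inside the first chunk back to positions in the whole text. Without it, the iterator alone does not say where in the text it landed.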
The API looks solid, this does exactly what I would need. I ended up implementing cursors for a 2-3 tree and a persistent array, and it will be nice not to have to reimplement ropes as well. Some minor wishes:
Tentative agree.
Oh, yeah, I like that. I think to keep with the "iterators are in-between items" concept, having two methods for forward and behind might make more sense, if only for keeping the mental model consistent. Maybe If you'd like to take a crack at that yourself, let me know. I likely won't go for it myself before the next release, but I would definitely like Ropey to have this functionality.
Iterator creation performance is already quite good IMO, so I'm not super inclined to further complicate the code to special-case-optimize start/end creation unless we actually need to. I also suspect that the performance gain would be marginal at best anyway, as I don't think the comparisons are taking much time (though admittedly I haven't measured). If start/end iterator creation in particular becomes an actual bottleneck for someone, of course, then I'll be happy to look into it. But until then I'd rather leave things as-is. However, what I would like to do performance-wise is properly optimize the I've punted on a better implementation for now because I'm pretty sure
Good point. The
I can't offer more than a half-hearted "perhaps at some point in the rather far future" unfortunately. But I did take a look at the implementation, and found it very enjoyable to read/navigate. It would be lovely if the balancing scheme was clearly stated somewhere - is it a B-Tree?
I have a design document that gives an overview of Ropey's design here: And also, yes, Ropey is implemented as a B-Tree rope. :-)
It's been two weeks, and so far the only feedback is for additional features which can be added later. I also feel confident about the APIs now after having a chance to think about them more. So I'm calling this done, and will soon make a new release with these new iterator features.
v1.1.0 is now live on crates.io!
Is there a reason this wasn't done or was it just an oversight? The .rev() method on Chars would be useful to me.
Ah, it's mostly an oversight, yeah. Sorry about that! Having said that, it turns out there are some mildly tricky things I hadn't thought about at first, due to being able to construct iterators at any point in the rope. For example, what would you expect So, another possibility I've thought about is to add our own Do you have any thoughts about any of that? I definitely want to make sure you get what you need out of this. But I think some of these issues need to be thought through. Maybe it's worth opening a new issue for that.
@tadeokondrak I created a new issue (#31) to discuss the appropriate design for reversing iterators. Feel free to leave any feedback there!
Some client code may want to iterate over e.g. chunks in both directions. For example, DFA regexes need to scan backwards after finding a match to calculate where the match begins.
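As an illustration of why backward chunk traversal matters, the following sketch walks a list of chunks in reverse from a known match end to find where the match begins. It uses a run of ASCII digits as a deliberately simple stand-in for a real reverse DFA, and the function name `digit_run_start` and the slice-of-chunks representation are assumptions made for the example.

```rust
/// Given text split into chunks and the byte offset where a match *ends*,
/// walks backwards across chunk boundaries to find where the run of ASCII
/// digits ending there begins. Assumes `match_end` is within the text.
fn digit_run_start(chunks: &[&str], match_end: usize) -> usize {
    let total: usize = chunks.iter().map(|c| c.len()).sum();
    assert!(match_end <= total);

    let mut start = match_end;
    let mut chunk_end = total; // byte offset one past the current chunk
    for chunk in chunks.iter().rev() {
        let chunk_start = chunk_end - chunk.len();
        // Skip chunks that lie entirely at or after the current position.
        if chunk_start < start {
            let upto = start - chunk_start;
            // Scan this chunk's bytes backwards from the current position.
            for &b in chunk.as_bytes()[..upto].iter().rev() {
                if !b.is_ascii_digit() {
                    return start;
                }
                start -= 1;
            }
        }
        chunk_end = chunk_start;
    }
    start
}
```

For the chunks `["id=12", "34", "5;"]` and a match ending at byte 8, the scan crosses two chunk boundaries before stopping at the `=` and reporting byte 3 as the start.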
The tricky bit isn't technical, however. It's having a standard API to interoperate with other libraries. Unfortunately, the Rust standard library doesn't provide a bidirectional iterator trait, so this probably isn't useful to implement yet: any API I come up with will be specific to Ropey, and therefore wouldn't interoperate with other libraries anyway.
I'm creating this issue as a reminder to add this functionality if/when a bidirectional iterator trait (or equivalent) is added.