add discussion of performance analysis to libcollection #16267
//! * `O(1)` - *Constant*: The performance of the operation is effectively
//! independent of context. This is usually *very* cheap.
//!
//! * `O(logn)` - *Logarithmic*: Performance scales with the logarithm of `n`.
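To make the logarithmic case in the snippet above concrete, here is a minimal sketch that counts the probes a binary search makes as `n` grows. The function `binary_search_probes` is a hypothetical helper written for this illustration, not part of the PR or of the standard library.

```rust
/// Hypothetical helper: counts how many probes a binary search over the
/// half-open range `0..n` makes before it finds `target`.
fn binary_search_probes(n: usize, target: usize) -> u32 {
    let (mut lo, mut hi) = (0usize, n);
    let mut probes = 0;
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        probes += 1;
        if mid == target {
            break;
        }
        if mid < target {
            lo = mid + 1; // target is in the upper half
        } else {
            hi = mid; // target is in the lower half
        }
    }
    probes
}

fn main() {
    // Each step multiplies n by 1000, yet the probe count grows only by a
    // roughly constant amount: that is what O(log n) means in practice.
    for &n in &[1_000usize, 1_000_000, 1_000_000_000] {
        println!("n = {:>13}: {} probes", n, binary_search_probes(n, n - 1));
    }
}
```

The probe count never exceeds `floor(log2(n)) + 1`, so a thousandfold increase in `n` adds only about ten probes.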
Maybe write log operations with spaces, like `O(log n)`? (Otherwise, it's super good to have this as part of the documentation, as I know many programmers who do not know this.)
yes, 👍 to the spaces here
Just a random idea, how about having traits like
Style issues addressed.
@mneumann That strikes me as a pretty nasty abuse of the trait system with awful combinatorial consequences. I'd much rather have some kind of `#[perf-time(O(1) amortized)]` annotation, or just a special doc comment form.
@gankro: Well, the advantage it has is that you could write your own algorithms that depend on certain performance characteristics of the underlying data structures. I'm not sure whether that is useful at all. I find the documentation aspect much more useful. I.e. clicking on the
Your annotation approach is of course more general.
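For readers following the trait idea, here is a rough sketch of what "algorithms that depend on certain performance characteristics" could look like. The trait `ConstantTimeIndex`, its methods `get_at` and `length`, and the function `find_sorted` are all hypothetical names invented for this illustration. None of them were proposed in this thread or exist in the standard library.

```rust
use std::cmp::Ordering;
use std::collections::VecDeque;

/// Hypothetical marker-style trait: implementers promise O(1) indexed reads.
/// The promise is a documentation contract; the compiler cannot check it.
trait ConstantTimeIndex<T> {
    fn get_at(&self, i: usize) -> Option<&T>;
    fn length(&self) -> usize;
}

impl<T> ConstantTimeIndex<T> for Vec<T> {
    fn get_at(&self, i: usize) -> Option<&T> { self.get(i) }
    fn length(&self) -> usize { self.len() }
}

impl<T> ConstantTimeIndex<T> for VecDeque<T> {
    fn get_at(&self, i: usize) -> Option<&T> { self.get(i) }
    fn length(&self) -> usize { self.len() }
}

/// Binary search over a sorted collection. It only accepts collections that
/// promise O(1) indexing, so its O(log n) bound holds for every caller.
fn find_sorted<T: Ord, C: ConstantTimeIndex<T>>(c: &C, target: &T) -> Option<usize> {
    let (mut lo, mut hi) = (0, c.length());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        match c.get_at(mid)?.cmp(target) {
            Ordering::Less => lo = mid + 1,
            Ordering::Greater => hi = mid,
            Ordering::Equal => return Some(mid),
        }
    }
    None
}

fn main() {
    let v: Vec<i32> = vec![1, 3, 5, 7, 9];
    println!("{:?}", find_sorted(&v, &7));
    let d: VecDeque<i32> = v.iter().cloned().collect();
    println!("{:?}", find_sorted(&d, &4));
}
```

A `LinkedList` would simply not implement the trait, so passing one to `find_sorted` fails to compile. The combinatorial concern raised above also shows up here: every (operation, complexity class) pair would need its own trait.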
Are there any outstanding objections to this, or is it good to go?
This is a nice discussion, but to be honest it seems pretty out of place in a collection library doc. I'm not sure it makes sense for our API documentation to try to teach performance analysis -- instead, I'd like to see a summary of the asymptotics for the various collections that can help people choose the right collection for their problem. Maybe this could be spun off as a separate guide or blogpost instead?
(By comparison, the Qt docs have a very brief overview, which is mostly just an explanation of their notation.)
This was written before I knew we were considering dropping traits for the time being. I had envisioned this providing basic context, with the actual traits providing a summary of their implementers. I think it would probably be more suitable as a separate standalone guide that is linked to by this page (and perhaps others). I'll work on hacking this out into something more generic, and then start on something more direct for collections. I guess this should be closed? FWIW, I also wrote this to test the waters on how people overall felt about including formal and informal performance discussions in the docs (overall reaction: positive).
Trimmed down everything to the most important bits, and dramatically simplified the rest.
The trimmed-down text looks great! Is the ultimate plan to include, below this text, a chart of the cost for various operations on various data structures (as in the Qt or Scala collection docs)? I hope so -- that gives a nice entry point for figuring out the right container to use for your problem. I would be in favor of landing this text together with a chart like that. Put differently, I don't think we should merge this text until we are actually presenting the performance for the various data structures. But once we do have that chart, I'm personally fine with this text.
Okay yeah, I can include a table like that in this PR. Again, I had intended that to go with the traits, but putting it here is fine.
Which categorizations and operations do we want to represent in this table? Should all collections be represented, or only the ones worth considering for a given job?
After discussion with @aturon, we have decided that this should be postponed until the API has been normalized enough for the tables to be constructed.
This was partially inspired by the excellent documentation Qt provides for its Container library. This is effectively Part 0 of my plan to provide performance details for all of the collection classes. It is intended to provide a simple background on theoretical performance analysis, while noting that theory and practice may diverge. I'd like to eventually provide high-level discussions of when to use (or not use) each collection, as well as document the asymptotic performance of each collection's operations.
I probably go too in-depth on Big-Oh notation, but I do try to be informal about it (since I consider the formal definition largely immaterial to practical usage).
cc @steveklabnik