Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add discussion of performance analysis to libcollection #16267

Closed
wants to merge 4 commits into from

Conversation

Gankra
Copy link
Contributor

@Gankra Gankra commented Aug 5, 2014

This was partially inspired by the excellent documentation Qt provides for its Container library. This is effectively Part-0 of my plan to provide performance details for all of the collection classes. It intends to provide a simple background on theoretical performance analysis, while noting that theory and practice may diverge. I'd like to eventually provide high-level discussions of when to use (or not use) each collection, as well as document the asymptotic performance of each collection's operations.

I probably go too in-depth on Big-Oh notation, but I do try to be informal about it (since I consider the formal definition largely immaterial to practical usage).

cc @steveklabnik

//! * `O(1)` - *Constant*: The performance of the operation is effectively
//! independent of context. This is usually *very* cheap.
//!
//! * `O(logn)` - *Logarithmic*: Performance scales with the logarithm of `n`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe write log operations with spaces like O(log n)? (Otherwise, it's super good to have this as part of the documentation as I know many programmers who do not know this.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, 👍 to the spaces here

@mneumann
Copy link
Contributor

mneumann commented Aug 6, 2014

Just a random idea, how about having traits like BigO_n, BigO_logn and implement them by container structures just for the purpose of documentation. This would make it easy to show all structures that are O(n) in the documentation just by clicking on BigO_n. Ok, it's not that easy, as we'd need traits for every operation, like Insert_BigO_n and Lookup_BigO_logn. There could also be different traits for space usage and time.

@Gankra
Copy link
Contributor Author

Gankra commented Aug 6, 2014

Style issues addressed

@Gankra
Copy link
Contributor Author

Gankra commented Aug 6, 2014

@mneumann That strikes me as a pretty nasty abuse of the trait system with awful combinatoric consequences. I'd much rather some kind of #[perf-time(O(1) amortized)] annotation, or just a special doccomment form.

@mneumann
Copy link
Contributor

mneumann commented Aug 6, 2014

@gankro: Well, the advantage it has is that you could write your own algorithms that depend on certain performance characteristics of the underlying data structures. Not sure if this is useful at all. Much more useful I find the aspect of documentation. I.e. clicking on the LogN trait shows me all algorithms and data structures that operate in LogN. I agree that it would be mostly a nasty abuse... :). The good thing you wouldn't have to change rustdoc in any way and "document" performance aspects, nor have to implement a new annotation.

impl<A> Insert_LogN for LinkedList<A> {}
impl<A> Find_LogN for LinkedList<A> {}
impl<K, V> Insert_LogN for HashMap<K, V> {}

Your annotation approach is of course more general.

@Gankra
Copy link
Contributor Author

Gankra commented Aug 6, 2014

@mneumann I've created #16301, which is a better place for this discussion

@Gankra
Copy link
Contributor Author

Gankra commented Aug 7, 2014

Are there any outstanding objections to this, or is it good to go?

@aturon
Copy link
Member

aturon commented Aug 12, 2014

This is a nice discussion, but to be honest it seems pretty out of place in a collection library doc. I'm not sure it makes sense for our API documentation to try to teach performance analysis -- instead, I'd like to see a summary of the asymptotics for the various collections that can help people choose the right collection for their problem.

Maybe this could be spun off as a separate guide or blogpost instead?

@aturon
Copy link
Member

aturon commented Aug 12, 2014

(By comparison, the Qt docs have a very brief overview, which is mostly just an explanation of their notation.)

@Gankra
Copy link
Contributor Author

Gankra commented Aug 12, 2014

This was written before I knew we were considering dropping traits for the time being. I had envisioned this providing basic context, with the actual traits providing a summary of their implementers. I think it would probably be more suitable as a separate standalone guide that is linked to by this page (and perhaps others).

I'll work on hacking this out into something more generic, and then start on something more direct for collections. I guess this should be closed?

fwiw I also wrote this to test the waters on how people overall felt about including formal and informal performance discussions in the docs (overall reaction: positive).

@Gankra
Copy link
Contributor Author

Gankra commented Aug 13, 2014

Trimmed down everything to the most important bits, and dramatically simplified the rest.

@aturon
Copy link
Member

aturon commented Aug 14, 2014

The trimmed down text looks great!

Is the ultimate plan to include, below this text, a chart of the cost for various operations on various data structures (as in the Qt or Scala collection docs?) I hope so -- that gives a nice entry point for figuring out the right container to use for your problem.

I would be in favor of landing this text together with a chart like that. Put differently, I don't think we should merge this text until we are actually presenting the performance for the various data structures. But once we do have that chart, I'm personally fine with this text.

@Gankra
Copy link
Contributor Author

Gankra commented Aug 14, 2014

Okay yeah, I can include a table like that in this PR. Again, I had intended that to go with the traits, but putting it here is fine.

@Gankra
Copy link
Contributor Author

Gankra commented Aug 14, 2014

Which categorizations and operations do we want to represent in this table? Should all collections be represented, or only the ones worth considering for a given job?

@Gankra
Copy link
Contributor Author

Gankra commented Aug 14, 2014

After discussion with @aturon, we have decided that this should be postponed until the API has been normalized enough for the tables to be constructed.

@Gankra Gankra closed this Aug 14, 2014
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants