add discussion of performance analysis to libcollection #16267

Gankra · 2014-08-05T14:13:14Z

This was partially inspired by the excellent documentation Qt provides for its Container library. This is effectively Part-0 of my plan to provide performance details for all of the collection classes. It intends to provide a simple background on theoretical performance analysis, while noting that theory and practice may diverge. I'd like to eventually provide high-level discussions of when to use (or not use) each collection, as well as document the asymptotic performance of each collection's operations.

I probably go too in-depth on Big-Oh notation, but I do try to be informal about it (since I consider the formal definition largely immaterial to practical usage).

cc @steveklabnik

killercup · 2014-08-06T12:24:09Z

src/libcollections/lib.rs

+//! * `O(1)` - *Constant*: The performance of the operation is effectively
+//! independent of context. This is usually *very* cheap.
+//!
+//! * `O(logn)` - *Logarithmic*: Performance scales with the logarithm of `n`.


Maybe write log operations with spaces like O(log n)? (Otherwise, it's super good to have this as part of the documentation as I know many programmers who do not know this.)

yes, 👍 to the spaces here

mneumann · 2014-08-06T12:41:15Z

Just a random idea, how about having traits like BigO_n, BigO_logn and implement them by container structures just for the purpose of documentation. This would make it easy to show all structures that are O(n) in the documentation just by clicking on BigO_n. Ok, it's not that easy, as we'd need traits for every operation, like Insert_BigO_n and Lookup_BigO_logn. There could also be different traits for space usage and time.

Gankra · 2014-08-06T13:44:00Z

Style issues addressed

Gankra · 2014-08-06T13:46:15Z

@mneumann That strikes me as a pretty nasty abuse of the trait system with awful combinatoric consequences. I'd much rather some kind of #[perf-time(O(1) amortized)] annotation, or just a special doccomment form.

mneumann · 2014-08-06T14:09:44Z

@gankro: Well, the advantage it has is that you could write your own algorithms that depend on certain performance characteristics of the underlying data structures. Not sure if this is useful at all. Much more useful I find the aspect of documentation. I.e. clicking on the LogN trait shows me all algorithms and data structures that operate in LogN. I agree that it would be mostly a nasty abuse... :). The good thing you wouldn't have to change rustdoc in any way and "document" performance aspects, nor have to implement a new annotation.

impl<A> Insert_LogN for LinkedList<A> {}
impl<A> Find_LogN for LinkedList<A> {}
impl<K, V> Insert_LogN for HashMap<K, V> {}

Your annotation approach is of course more general.

Gankra · 2014-08-06T14:32:04Z

@mneumann I've created #16301, which is a better place for this discussion

Gankra · 2014-08-07T18:24:09Z

Are there any outstanding objections to this, or is it good to go?

aturon · 2014-08-12T02:47:26Z

This is a nice discussion, but to be honest it seems pretty out of place in a collection library doc. I'm not sure it makes sense for our API documentation to try to teach performance analysis -- instead, I'd like to see a summary of the asymptotics for the various collections that can help people choose the right collection for their problem.

Maybe this could be spun off as a separate guide or blogpost instead?

aturon · 2014-08-12T03:06:57Z

(By comparison, the Qt docs have a very brief overview, which is mostly just an explanation of their notation.)

Gankra · 2014-08-12T03:18:55Z

This was written before I knew we were considering dropping traits for the time being. I had envisioned this providing basic context, with the actual traits providing a summary of their implementers. I think it would probably be more suitable as a separate standalone guide that is linked to by this page (and perhaps others).

I'll work on hacking this out into something more generic, and then start on something more direct for collections. I guess this should be closed?

fwiw I also wrote this to test the waters on how people overall felt about including formal and informal performance discussions in the docs (overall reaction: positive).

Gankra · 2014-08-13T12:02:29Z

Trimmed down everything to the most important bits, and dramatically simplified the rest.

aturon · 2014-08-14T15:44:11Z

The trimmed down text looks great!

Is the ultimate plan to include, below this text, a chart of the cost for various operations on various data structures (as in the Qt or Scala collection docs?) I hope so -- that gives a nice entry point for figuring out the right container to use for your problem.

I would be in favor of landing this text together with a chart like that. Put differently, I don't think we should merge this text until we are actually presenting the performance for the various data structures. But once we do have that chart, I'm personally fine with this text.

Gankra · 2014-08-14T15:55:56Z

Okay yeah, I can include a table like that in this PR. Again, I had intended that to go with the traits, but putting it here is fine.

Gankra · 2014-08-14T16:01:34Z

Which categorizations and operations do we want to represent in this table? Should all collections be represented, or only the ones worth considering for a given job?

Gankra · 2014-08-14T16:11:46Z

After discussion with @aturon, we have decided that this should be postponed until the API has been normalized enough for the tables to be constructed.

internal: Deduplicate some code

Gankra added 2 commits August 5, 2014 10:15

add discussion of performance analysis to libcollection

4a2893a

hyperlinking

9931e95

killercup reviewed Aug 6, 2014
View reviewed changes

fixups

1acb339

Gankra mentioned this pull request Aug 6, 2014

Document all the performance #16301

Closed

simplify discussion

90195de

Gankra closed this Aug 14, 2014

This was referenced Oct 22, 2022

[Snyk] Upgrade eslint from 8.19.0 to 8.24.0 devcode1981/rust#9

Open

[Snyk] Upgrade eslint from 8.19.0 to 8.24.0 MarcelRaschke/rust#9

Open

bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 8, 2024

Auto merge of rust-lang#16267 - Veykril:import-map-fix, r=Veykril

11b0126

internal: Deduplicate some code

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add discussion of performance analysis to libcollection #16267

add discussion of performance analysis to libcollection #16267

Gankra commented Aug 5, 2014

killercup Aug 6, 2014

steveklabnik Aug 6, 2014

mneumann commented Aug 6, 2014

Gankra commented Aug 6, 2014

Gankra commented Aug 6, 2014

mneumann commented Aug 6, 2014

Gankra commented Aug 6, 2014

Gankra commented Aug 7, 2014

aturon commented Aug 12, 2014

aturon commented Aug 12, 2014

Gankra commented Aug 12, 2014

Gankra commented Aug 13, 2014

aturon commented Aug 14, 2014

Gankra commented Aug 14, 2014

Gankra commented Aug 14, 2014

Gankra commented Aug 14, 2014

add discussion of performance analysis to libcollection #16267

add discussion of performance analysis to libcollection #16267

Conversation

Gankra commented Aug 5, 2014

killercup Aug 6, 2014

Choose a reason for hiding this comment

steveklabnik Aug 6, 2014

Choose a reason for hiding this comment

mneumann commented Aug 6, 2014

Gankra commented Aug 6, 2014

Gankra commented Aug 6, 2014

mneumann commented Aug 6, 2014

Gankra commented Aug 6, 2014

Gankra commented Aug 7, 2014

aturon commented Aug 12, 2014

aturon commented Aug 12, 2014

Gankra commented Aug 12, 2014

Gankra commented Aug 13, 2014

aturon commented Aug 14, 2014

Gankra commented Aug 14, 2014

Gankra commented Aug 14, 2014

Gankra commented Aug 14, 2014