html5ever in the rustc-perf repository is memory-intensive #52028

Mark-Simulacrum · 2018-07-03T19:56:37Z

I see OOMs locally with a 16 GB memory computer.

Source code: https://github.com/rust-lang-nursery/rustc-perf/tree/master/collector/benchmarks/html5ever

nnethercote · 2018-07-04T08:45:23Z

I did a run with Massif:

Yep, that's a 14 GiB peak, with 12 GiB of it coming from a single stack trace involving visit_ty/visit_with. I'm not surprised it OOM'd on a 16 GiB machine.

Massif doesn't get great stack traces due to inlining, but DHAT does a better job by using debuginfo. Here's the important info:

-------------------- 2 of 500 --------------------
max-live:    12,884,901,888 in 1 blocks
tot-alloc:   12,884,901,888 in 1 blocks (avg size 12884901888.00)
deaths:      1, at avg age 219,437,305,943 (40.34% of prog lifetime) 
acc-ratios:  0.60 rd, 0.60 wr  (7,768,302,580 b-read, 7,768,302,580 b-written)
   at 0x4C2DECF: malloc (in /usr/lib/valgrind/vgpreload_exp-dhat-amd64-linux.so)
   by 0x7A673EC: alloc (alloc.rs:62)
   by 0x7A673EC: alloc (alloc.rs:123)
   by 0x7A673EC: reserve_internal<(&rustc::ty::sty::RegionKind, rustc::mir::Location),alloc::alloc::Global> (raw_vec.rs:679)
   by 0x7A673EC: reserve<(&rustc::ty::sty::RegionKind, rustc::mir::Location),alloc::alloc::Global> (raw_vec.rs:502)
   by 0x7A673EC: reserve<(&rustc::ty::sty::RegionKind, rustc::mir::Location)> (vec.rs:464)
   by 0x7A673EC: push<(&rustc::ty::sty::RegionKind, rustc::mir::Location)> (vec.rs:1060)
   by 0x7A673EC: {{closure}}<&rustc::ty::TyS> (liveness.rs:171)
   by 0x7A673EC: visit_region<closure> (fold.rs:305)
   by 0x7A673EC: rustc::ty::structural_impls::<impl rustc::ty::fold::TypeFoldable<'tcx> for &'tcx rustc::ty::sty::RegionKind>::visit_with (structural_impls.rs:956)
   by 0x7A68568: super_visit_with<rustc::ty::fold::{{impl}}::for_each_free_region::RegionVisitor<closure>> (structural_impls.rs:888)
   by 0x7A68568: rustc::ty::fold::TypeVisitor::visit_ty (fold.rs:188)
   by 0x7A69AD0: visit_with<rustc::ty::fold::{{impl}}::for_each_free_region::RegionVisitor<closure>> (structural_impls.rs:903)
   by 0x7A69AD0: {{closure}}<rustc::ty::fold::{{impl}}::for_each_free_region::RegionVisitor<closure>> (structural_impls.rs:762)
   by 0x7A69AD0: {{closure}}<core::slice::Iter<&rustc::ty::TyS>,closure> (iterator.rs:1706)
   by 0x7A69AD0: {{closure}}<core::slice::Iter<&rustc::ty::TyS>,closure,core::iter::LoopState<(), ()>> (iterator.rs:1536)
   by 0x7A69AD0: try_fold<&rustc::ty::TyS,(),closure,core::iter::LoopState<(), ()>> (mod.rs:2431)
   by 0x7A69AD0: try_for_each<core::slice::Iter<&rustc::ty::TyS>,closure,core::iter::LoopState<(), ()>> (iterator.rs:1536)
   by 0x7A69AD0: any<core::slice::Iter<&rustc::ty::TyS>,closure> (iterator.rs:1705)
   by 0x7A69AD0: super_visit_with<rustc::ty::fold::{{impl}}::for_each_free_region::RegionVisitor<closure>> (structural_impls.rs:762)
   by 0x7A69AD0: rustc::ty::fold::TypeFoldable::visit_with (fold.rs:63)

Holy cow! That's a single 12 GiB allocation happening within push_type_live_constraint:
https://github.com/rust-lang/rust/blob/master/src/librustc_mir/borrow_check/nll/type_check/liveness.rs#L170-L172

(Once again, add_liveness_constraints and the code beneath it is causing difficulties.)

Because liveness_set is so enormous, add_region_liveness_constraints_from_type_check() is slow. From Callgrind:

            .          for (region, location) in liveness_set {
            .              debug!("generate: {:#?} is live at {:#?}", region, location);
  777,018,978              let region_vid = regioncx.to_region_vid(region);
1,942,548,420              regioncx.add_live_point(region_vid, *location);
59,657,319,186  => /home/njn/moz/rust0/src/librustc_mir/borrow_check/nll/region_infer/mod.rs:rustc_mir::borrow_check::nll::region_infer::RegionInferenceContext::add_live_point (388507821x)
            .          }

add_live_point() then does some SparseBitMatrix insertions.

nnethercote · 2018-07-04T08:53:28Z

There are also some other causes of slowness for html5ever.

First, precompute_borrows_out_of_scope is super hot. Callgrind output:

            .      mir: &'a Mir<'tcx>,
            .      regioncx: &Rc<RegionInferenceContext<'tcx>>, 
            .      borrows_out_of_scope_at_location: &mut FxHashMap<Location, Vec<BorrowIndex>>,
            .      borrow_index: BorrowIndex,
            .      borrow_region: RegionVid,
            .      location: Location,
            .  ) {     
            .      // Keep track of places we've locations to check and locations that we have checked.
       46,668      let mut stack = vec![ location ];
            .      let mut visited = FxHashSet();
       62,224      visited.insert(location);
    4,413,266  => /home/njn/moz/rust0/src/libstd/collections/hash/set.rs:<std::collections::hash::set::HashSet<T, S>>::insert (15431x)
            .                  
            .      debug!(     
            .          "borrow {:?} has region {:?} with value {:?}",
            .          borrow_index,
            .          borrow_region,
            .          regioncx.region_value_str(borrow_region),
            .      );      
            .      debug!("borrow {:?} starts at {:?}", borrow_index, location);
  971,769,410      while let Some(location) = stack.pop() {
            .          // If region does not contain a point at the location, then add to list and skip
            .          // successor locations.
  971,769,410          if !regioncx.region_contains_point(borrow_region, location) {
            .              debug!("borrow {:?} gets killed at {:?}", borrow_index, location);
       99,190              borrows_out_of_scope_at_location
    1,290,825  => /home/njn/moz/rust0/src/libstd/collections/hash/map.rs:<std::collections::hash::map::Entry<'a, K, V>>::or_insert (13827x)
    3,298,140  => /home/njn/moz/rust0/src/libstd/collections/hash/map.rs:<std::collections::hash::map::HashMap<K, V, S>>::entry (13827x)
       28,340                  .entry(location)
            .                  .or_insert(vec![])
            .                  .push(borrow_index);
            .              continue;
            .          }
            .
            .          let bb_data = &mir[location.block];
            .          // If this is the last statement in the block, then add the
            .          // terminator successors next.
1,457,611,605          if location.statement_index == bb_data.statements.len() {
            .              // Add successors to locations to visit, if not visited before.
       59,584              if let Some(ref terminator) = bb_data.terminator {
       89,376                  for block in terminator.successors() {
      799,865  => /home/njn/moz/rust0/src/librustc/mir/mod.rs:rustc::mir::Terminator::successors (29527x)
      190,405                      let loc = block.start_location();
      112,488  => /home/njn/moz/rust0/src/librustc/mir/mod.rs:rustc::mir::BasicBlock::start_location (37496x)
      228,486                      if visited.insert(loc) {
    5,583,147  => /home/njn/moz/rust0/src/libstd/collections/hash/set.rs:<std::collections::hash::set::HashSet<T, S>>::insert (37496x)
            .                          stack.push(loc);
            .                      }
            .                  }
            .              }
            .          } else {
            .              // Visit next statement in block.
2,429,203,715              let loc = location.successor_within_block();
1,943,360,660  => /home/njn/moz/rust0/src/librustc/mir/mod.rs:rustc::mir::Location::successor_within_block (485840165x)
2,915,044,458              if visited.insert(loc) {
68,678,149,511  => /home/njn/moz/rust0/src/libstd/collections/hash/set.rs:<std::collections::hash::set::HashSet<T, S>>::insert (485840165x)
            .                  stack.push(loc);
            .              }
            .          }
            .      }
            .  }

It's the bit at the end that's the hottest, executing 485 million times. Because of this function, the following things are also super hot:

region_contains_point/RegionValues::contains
HashSet::insert. (Note that the visited hash table gets big: a total of 38 GB is allocated/reallocated in 109,000 allocations with an average size of 347 KB!)
successor_within_block

Second, unroll_place is also very hot:

            .      place: &Place<'tcx>,
            .      next: Option<&PlaceComponents<'_, 'tcx>>,
            .      op: impl FnOnce(PlaceComponentsIter<'_, 'tcx>) -> R,
            .  ) -> R {
            .      match place {
8,892,046,598          Place::Projection(interior) => unroll_place(
42,799,560,024  => /home/njn/moz/rust0/src/librustc_mir/borrow_check/places_conflict.rs:rustc_mir::borrow_check::places_conflict::unroll_place'2 (485791408x)
  534,414,126              &interior.base,
1,068,828,252              Some(&PlaceComponents {
            .                  component: place,
            .                  next,
            .              }),
2,672,070,630              op,
            .          ),
            .
            .          Place::Local(_) | Place::Static(_) => {
1,943,739,512              let list = PlaceComponents {
            .                  component: place,
            .                  next,
            .              }; 
1,943,739,512              op(list.iter())
            .          }
            .      }
6,754,274,663  }

nikomatsakis · 2018-07-04T15:13:29Z

I always wondered it that super naive liveness_set would cause us problems =) seems clear the answer is yes.

nikomatsakis · 2018-07-04T15:13:43Z

Thanks @nnethercote for the data, super helpful.

nikomatsakis · 2018-07-04T15:14:09Z

I suspect we can remove the liveness_set vector entirely.

nikomatsakis · 2018-07-04T15:25:59Z

But it will require some refactoring. I think I would want to build this atop #51987. Let me summarize the setup in that PR:

Type check generates outlives constraints ('a: 'b) as well as the set of liveness facts
- liveness facts are presently stored as a set of (R, P) points
Once type check completes, we know the full set of region variables
We create the region inference context
- we compute the SCC amongst region variables
- we create a value per SCC which contains the points where any of the regions in that SCC are live
- at present, we also create a liveness set for each individual region, which is used only in error reporting

We used to have a fairly hard requirement that we know the number of region variables before we could create the region inference context. This is because we represented the values as a huge matrix of bits. But now that we use sparse bit sets that is not true; we could grow the number of region variables fairly easily.

This implies to me that -- to start -- we could refactor so that type check generates the initial liveness results (storing in a sparse bit set per variable) directly. This would effectively replace the liveness_set vector that we have today. We would then hand that off to the region inference context.

The reason we'd have to be able to grow the number of region variables is that I think that computing liveness constraints can sometimes create fresh region variables and possibly constraints between those variables. (This is due to the "dropck" computation.)

It'd be sort of nice if that were not true, though, and it may not be true -- if it weren't, we could compute the SCC ahead of time and then never generate the per-region liveness, instead focusing on the liveness per SCC. This may affect diagnostics, which we could recover by doing more refined computations lazilly perhaps (or maybe it wouldn't affect diagnostics, i'm not sure, I'd have to think about it).

But it'd probably still be a big improvement to move to the sparse bit set instead of the vector of points.

Another option that I have been wondering about is moving from "set of points" to a more compact representation -- for example, the SEME region abstraction implemented in https://github.com/rust-lang-nursery/rustc-seme-regions may be more compact. However, presently there can be multiple disjoint liveness regions, and SEME regions can't express that, so we'd have to figure out what to do there.

nnethercote · 2018-07-04T20:40:04Z

More about precompute_borrows_out_of_scope: there's something quadratic going on. I added this line to the very end of the function:

eprintln!("visited {}", visited.len());

The resulting output when compiling html5ever is dominated by this sequence of length 9858:

visited 88704
visited 88703
visited 88691
visited 88683
visited 88675
visited 88667
...
visited 9899
visited 9891
visited 9883
visited 9875
visited 9867
visited 9862
visited 9861

It definitely feels like we're traversing a long sequence from A to B, then moving forward A slightly (by 8 steps, to be precise), then traversing from A to B again, over and over.

nikomatsakis · 2018-07-04T20:51:54Z

Yeah, precompute_borrows_out_of_scope is potentially quadratic -- it itself is an optimization over an earlier approach -- but I've not thought of a better way for it to work. It basically walks the scope of each borrow -- the number of borrows is sort of "O(n)" in the size of the code, and the potential extent of each borrow is similarly "O(n)".

nikomatsakis · 2018-07-04T20:54:01Z

Hmm, as an aside, I wonder if the premise of finding the borrows-in-scope via a dataflow is a bit silly, given that precompute_borrows_out_of_scope exists. That is, it seems like we are duplicating our work here with some of the dataflow propagation code. Maybe it would be worth killing that code.

(This is another place where changing to an alternative representation of regions, one that optimizes continuous stretches of things, might be a win.)

nnethercote · 2018-07-05T09:12:19Z

I made several attempts to speed up unroll_place -- using SmallVec, using Vec, moving the borrow_components computation outside the inner loop in each_borrow_involving_path -- but they all made things slower or at best made no difference :(

nikomatsakis · 2018-07-05T18:02:48Z

@nnethercote yeah, that code was not naive -- in fact, you had previously optimized it to use SmallVec, and unroll_place was itself an optimization on that =) I have some thoughts about how to do better but it would require deeper refactoring. I will try to write some notes down at least.

nikomatsakis · 2018-07-05T18:06:28Z

JFYI, I created this Zulip thread to talk about this issue

nnethercote · 2018-07-11T13:05:06Z

@Mark-Simulacrum: #52250 reduces html5ever's max-rss from 11,380,868.00 down to 10,285,700.00. (I assume the unit is MB or MiB.) I wonder if that's enough to avoid the OOMs on machines with 16 GiB of RAM.

Mark-Simulacrum · 2018-07-11T14:08:06Z

@nnethercote I'll try and enable it -- the perf collector has, I think, 32 GB of RAM (though free reports 31G, that seems odd) so I think that it should be "good enough" in that regard. Local benchmark runners can exclude it if necessary, I think...

@nikomatsakis

WIP: html5ever in the rustc-perf repository is memory-intensive Part of #52028. Rebased atop of #51987. r? @nikomatsakis

nnethercote · 2018-07-17T06:58:08Z

I looked into unroll_place a bit more. The problem is not that unroll_place is slow, but rather that it's called so many times.

Specifically, we hit the for i in candidates loop within each_borrow_involving_path() 150,000 times when compiling html5ever:

rust/src/librustc_mir/borrow_check/path_utils.rs

Lines 61 to 74 in 31263f3

    
           for i in candidates { 
        
               let borrowed = &borrow_set[i]; 
        
               if places_conflict::places_conflict(tcx, mir, &borrowed.borrowed_place, place, access) { 
        
                   debug!( 
        
                       "each_borrow_involving_path: {:?} @ {:?} vs. {:?}/{:?}", 
        
                       i, borrowed, place, access 
        
                   ); 
        
                   let ctrl = op(s, i, borrowed); 
        
                   if ctrl == Control::Break { 
        
                       return; 
        
                   } 
        
               } 
        
           }

Here's the candidates length distribution for those 150,000 hits:

len=20: 8 times
len=21: 8 times
len=22: 8 times
...
len=9854: 8 times
len=9855: 8 times
len=9856: 9862 times

There's something quadratic going on, as shown by the equal number of occurrences for lengths 1..9855. There is also the over-representation of len=9856. I haven't yet worked out exactly where this is coming from.

@nikomatsakis

html5ever in the rustc-perf repository is memory-intensive Part of #52028. Rebased atop of #51987. r? @nikomatsakis

nikomatsakis · 2018-07-17T12:23:52Z

#52190 reduces the max-rss to 2GB-- still too big, but a lot better. To do much better than that, I think we have to start changing from a simple bitset to something that is more compressed, so we can capture patterns across variables. Something like BDDs come to mind.

nikomatsakis · 2018-07-17T12:25:36Z

The performance remains ungreat though and @nnethercote's profiles are very useful in that regard. It seems clear (also from the fact that liveness dominates) that in this case we have somewhere a large number of borrows accumulating -- each access is then checked against those large number of borrows. I suspect we could do a lot better via some kind of hashing scheme, where we hash paths to figure out if they might possible overlap. I would imagine we would walk down the place to find prefixes.

nikomatsakis · 2018-07-17T19:53:51Z

We are deferring further work that targets max-rss specifically to the release candidate, since it doesn't seem pressing for Edition Preview 2 right now. At this point, it seems like the next step is to look at ways to represent liveness regions that can share between region values -- e.g., something like a BDD (correct ordering will be key here) or perhaps SEME regions.

nnethercote · 2018-07-25T00:44:00Z

I re-profiled. The high number of calls to unroll_place is still a major problem, as is the behaviour of precompute_borrows_out_of_scope. Both appear to involve quadratic behaviour.

lqd · 2018-07-25T17:55:28Z

For posterity's sake, 95% of the time spent in html5ever is due to this big static

lqd · 2018-07-25T21:08:31Z

Just linking the couple issues Niko filed, related to @nnethercote's and others' profiles (but not strictly about this current issue of memory usage):

nikomatsakis · 2018-08-07T20:08:41Z

Update: With #53168, I think we can expect html5ever's memory usage to be reduced to 1.2GB. (At least, the previous PR — which I reverted — had that effect, and this one does the same basic thing, just more soundly.)

nnethercote · 2018-08-17T00:08:18Z

@Mark-Simulacrum : are you happy to close this now? With NLL, html5ever's memory usage and speed are now both reasonable (and #53383 will help both a bit more).

Mark-Simulacrum · 2018-08-17T00:48:44Z

html5ever still uses 6x more memory (over a gigabyte) for check builds if I'm reading the dashboard right, which I believe is unacceptably high long-term. However, I do agree that we can discontinue prioritizing this for the edition (performance regression is no longer a huge outlier for html5ever specifically).

nikomatsakis · 2018-08-27T13:14:31Z

Note that #53327 seems to drop memory use to 600MB here (although not the latest rev, which needs to be investigated).

nikomatsakis · 2018-08-27T13:18:21Z

Pushing this out to RC 2 — it's a perf issue and not a critical one (plus pending PRs ought to help)

nikomatsakis · 2018-09-11T15:04:27Z

Memory use is currently at a 2.28x ratio, weighing in at 501MB.

Still a lot, but not so much as to be an RC2 blocker.

Would be good to know what's going on though.

nnethercote · 2018-09-12T06:30:27Z

Here's a Massif graph showing a big spike near the end of a check build:

228MB of that is from SparseBitMatrix instances within ConstraintGeneration::add_regular_live_constraint. Here's a DHAT record for half of it (and there's a second record for the other half that is extremely similar):

max-live:    114,195,072 in 9,858 blocks
tot-alloc:   115,981,392 in 15,556 blocks (avg size 7455.73) 
deaths:      15,556, at avg age 186,718,791 (1.80% of prog lifetime)
acc-ratios:  1.00 rd, 0.00 wr  (116,230,288 b-read, 248,896 b-written)
   at 0x4C2FEE5: calloc (in /usr/lib/valgrind/vgpreload_exp-dhat-amd64-linux.so)
   by 0x7AE3D45: alloc_zeroed (alloc.rs:147)
   by 0x7AE3D45: alloc_zeroed (alloc.rs:174)
   by 0x7AE3D45: allocate_in<u128,alloc::alloc::Global> (raw_vec.rs:104)
   by 0x7AE3D45: with_capacity_zeroed<u128> (raw_vec.rs:156)
   by 0x7AE3D45: from_elem<u128> (vec.rs:1620)
   by 0x7AE3D45: from_elem<u128> (vec.rs:1581)
   by 0x7AE3D45: new<rustc_mir::borrow_check::nll::region_infer::values::PointIndex> (<vec macros>:2)
   by 0x7AE3D45: {{closure}}<rustc_mir::borrow_check::nll::constraints::ConstraintSccIndex,rustc_mir::borrow_check::nll::region_infer::values::PointIndex> (bitvec.rs:355)
   by 0x7AE3D45: get_or_insert_with<rustc_data_structures::bitvec::BitArray<rustc_mir::borrow_check::nll::region_infer::values::PointIndex>,closure> (option.rs:814)
   by 0x7AE3D45: <rustc_data_structures::bitvec::SparseBitMatrix<R, C>>::ensure_row (bitvec.rs:355)
   by 0x7B654AC: add<rustc::ty::sty::RegionVid,rustc_mir::borrow_check::nll::region_infer::values::PointIndex> (bitvec.rs:363)
   by 0x7B654AC: <rustc_mir::borrow_check::nll::region_infer::values::LivenessValues<N>>::add_element (values.rs:182)
   by 0x7B5690D: {{closure}}<&rustc::ty::sty::RegionKind> (constraint_generation.rs:205)
   by 0x7B5690D: {{closure}}<&rustc::ty::sty::RegionKind,closure> (fold.rs:271)
   by 0x7B5690D: visit_region<closure> (fold.rs:333)
   by 0x7B5690D: visit_with<rustc::ty::fold::{{impl}}::any_free_region_meets::RegionVisitor<closure>> (structural_impls.rs:964)
   by 0x7B5690D: any_free_region_meets<&rustc::ty::sty::RegionKind,closure> (fold.rs:291)
   by 0x7B5690D: for_each_free_region<&rustc::ty::sty::RegionKind,closure> (fold.rs:270)
   by 0x7B5690D: add_regular_live_constraint<&rustc::ty::sty::RegionKind> (constraint_generation.rs:201)
   by 0x7B5690D: visit_region (constraint_generation.rs:70)
   by 0x7B5690D: super_rvalue<rustc_mir::borrow_check::nll::constraint_generation::ConstraintGeneration> (visit.rs:540)
   by 0x7B5690D: visit_rvalue<rustc_mir::borrow_check::nll::constraint_generation::ConstraintGeneration> (visit.rs:138)
   by 0x7B5690D: super_assign<rustc_mir::borrow_check::nll::constraint_generation::ConstraintGeneration> (visit.rs:403)
   by 0x7B5690D: <rustc_mir::borrow_check::nll::constraint_generation::ConstraintGeneration<'cg, 'cx, 'gcx, 'tcx> as rustc::mir::visit::Visitor<'tcx>>::visit_assign (constraint_
generation.rs:151)

nnethercote · 2018-09-12T07:11:46Z

I added some println! statements. Looks like we have a big SparseBitMatrix that ends up with about 30,000 rows, and 92,652 columns in each row, but only a single bit is set in each row. SparseBitMatrix is moderately sparse, in that rows are instantiated fully on demand.

Perhaps a sparser representation could help. In fact, SparseBitMatrix used to be sparser, but in #52250 I made it denser because it was a significant speed win on some benchmarks. Perhaps there's a middle ground to be found.

nnethercote · 2018-09-13T21:57:34Z

I have a plan to fix this, by making SparseBitMatrix sparser in the case where many rows have only 1 column (or a small number of columns) set. It will involve using HybridIdxSet for the rows. But that will first require cleaning up and combining the various bitset structures we currently have, including BitSlice, BitArray, and IdxSet. #54177 is the first step.

(The old SparseBitMatrix used BTreeMap for the rows, which gave terrible performance and memory usage when the rows became fairly dense. HybridIdxSet should provide a good compromise: compact when sparse, but space-efficient when dense, and fast in both cases.)

nnethercote · 2018-09-17T09:34:20Z

I have a plan to fix this, by making SparseBitMatrix sparser in the case where many rows have only 1 column (or a small number of columns) set. It will involve using HybridIdxSet for the rows. But that will first require cleaning up and combining the various bitset structures we currently have, including BitSlice, BitArray, and IdxSet. #54177 is the first step.

#54286 is the next step. Once that lands, I will file a PR for using HybridBitSet within SparseBitMatrix. I have it working locally and the memory improvements are sizeable:

html5ever-check
        avg: -46.7%     min: -46.7%     max: -46.7%
ucd-check
        avg: -45.2%     min: -45.2%     max: -45.2%
clap-rs-check
        avg: -23.3%     min: -23.3%     max: -23.3%
inflate-check
        avg: -14.9%     min: -14.9%     max: -14.9%

nnethercote · 2018-09-18T02:47:39Z

@lqd just pointed me at a crate (unic-ucd-name-0.7.0) with high memory usage under NLL. My HybridBitSet change reduces its max-rss by 96%, from 29.1GB to 1.2GB! (Compared to non-NLL, that improves the increase in max-rss from 42.5x to 1.7x.)

nnethercote · 2018-09-19T08:53:13Z

The NLL dashboard has updated. The top 5 now looks like this:

Benchmark    | clean-check  | nll-check    | %
keccak       | 636,612.00   | 912,740.00   | 143.37
html5ever    | 221,892.00   | 266,248.00   | 119.99
coercions    | 181,744.00   | 210,120.00   | 115.61
tuple-stress | 395,444.00   | 442,804.00   | 111.98
script-servo | 2,463,224.00 | 2,617,760.00 | 106.27

A 20% increase in max-rss for html5ever with NLL isn't ideal, but it's a whole lot better than the 14GB+ we were getting when this PR was opened.

Mark-Simulacrum added the A-NLL Area: Non-lexical lifetimes (NLL) label Jul 3, 2018

nikomatsakis added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-compiler-nll labels Jul 3, 2018

nikomatsakis added this to the Rust 2018 Preview 2 milestone Jul 3, 2018

nnethercote mentioned this issue Jul 9, 2018

Enable NLL for html5ever. rust-lang/rustc-perf#257

Merged

davidtwco mentioned this issue Jul 9, 2018

html5ever in the rustc-perf repository is memory-intensive #52190

Merged

nnethercote mentioned this issue Jul 10, 2018

Dont run ast borrowck on mir mode #52083

Merged

bors added a commit that referenced this issue Jul 16, 2018

Auto merge of #52190 - davidtwco:issue-52028, r=<try>

c1ea2d1

WIP: html5ever in the rustc-perf repository is memory-intensive Part of #52028. Rebased atop of #51987. r? @nikomatsakis

bors added a commit that referenced this issue Jul 17, 2018

Auto merge of #52190 - davidtwco:issue-52028, r=nikomatsakis

025e04e

html5ever in the rustc-perf repository is memory-intensive Part of #52028. Rebased atop of #51987. r? @nikomatsakis

nikomatsakis modified the milestones: Rust 2018 Preview 2, Rust 2018 Release Candidate Jul 17, 2018

pnkfelix added the NLL-performant Working towards the "performance is good" goal label Jul 24, 2018

nikomatsakis added NLL-diagnostics Working towards the "diagnostic parity" goal and removed NLL-performant Working towards the "performance is good" goal labels Jul 24, 2018

nikomatsakis added NLL-performant Working towards the "performance is good" goal and removed NLL-diagnostics Working towards the "diagnostic parity" goal labels Aug 6, 2018

nikomatsakis modified the milestones: Rust 2018 RC, Edition RC 2 Aug 27, 2018

nikomatsakis removed the WG-compiler-nll label Aug 27, 2018

nikomatsakis removed this from the Edition 2018 RC 2 milestone Sep 11, 2018

nikomatsakis added the NLL-deferred label Sep 11, 2018

nnethercote mentioned this issue Sep 17, 2018

Merge bitvec.rs and indexed_set.rs #54286

Merged

bors closed this as completed in 154be2c Sep 19, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

html5ever in the rustc-perf repository is memory-intensive #52028

html5ever in the rustc-perf repository is memory-intensive #52028

Mark-Simulacrum commented Jul 3, 2018

nnethercote commented Jul 4, 2018

nnethercote commented Jul 4, 2018

nikomatsakis commented Jul 4, 2018

nikomatsakis commented Jul 4, 2018

nikomatsakis commented Jul 4, 2018

nikomatsakis commented Jul 4, 2018

nnethercote commented Jul 4, 2018

nikomatsakis commented Jul 4, 2018

nikomatsakis commented Jul 4, 2018

nnethercote commented Jul 5, 2018

nikomatsakis commented Jul 5, 2018

nikomatsakis commented Jul 5, 2018

nnethercote commented Jul 11, 2018

Mark-Simulacrum commented Jul 11, 2018

nnethercote commented Jul 17, 2018

nikomatsakis commented Jul 17, 2018

nikomatsakis commented Jul 17, 2018

nikomatsakis commented Jul 17, 2018

nnethercote commented Jul 25, 2018

lqd commented Jul 25, 2018

lqd commented Jul 25, 2018 •

edited

Loading

nikomatsakis commented Aug 7, 2018 •

edited

Loading

nnethercote commented Aug 17, 2018

Mark-Simulacrum commented Aug 17, 2018

nikomatsakis commented Aug 27, 2018

nikomatsakis commented Aug 27, 2018

nikomatsakis commented Sep 11, 2018

nnethercote commented Sep 12, 2018

nnethercote commented Sep 12, 2018

nnethercote commented Sep 13, 2018

nnethercote commented Sep 17, 2018

nnethercote commented Sep 18, 2018 •

edited

Loading

nnethercote commented Sep 19, 2018

html5ever in the rustc-perf repository is memory-intensive #52028

html5ever in the rustc-perf repository is memory-intensive #52028

Comments

Mark-Simulacrum commented Jul 3, 2018

nnethercote commented Jul 4, 2018

nnethercote commented Jul 4, 2018

nikomatsakis commented Jul 4, 2018

nikomatsakis commented Jul 4, 2018

nikomatsakis commented Jul 4, 2018

nikomatsakis commented Jul 4, 2018

nnethercote commented Jul 4, 2018

nikomatsakis commented Jul 4, 2018

nikomatsakis commented Jul 4, 2018

nnethercote commented Jul 5, 2018

nikomatsakis commented Jul 5, 2018

nikomatsakis commented Jul 5, 2018

nnethercote commented Jul 11, 2018

Mark-Simulacrum commented Jul 11, 2018

nnethercote commented Jul 17, 2018

nikomatsakis commented Jul 17, 2018

nikomatsakis commented Jul 17, 2018

nikomatsakis commented Jul 17, 2018

nnethercote commented Jul 25, 2018

lqd commented Jul 25, 2018

lqd commented Jul 25, 2018 • edited Loading

nikomatsakis commented Aug 7, 2018 • edited Loading

nnethercote commented Aug 17, 2018

Mark-Simulacrum commented Aug 17, 2018

nikomatsakis commented Aug 27, 2018

nikomatsakis commented Aug 27, 2018

nikomatsakis commented Sep 11, 2018

nnethercote commented Sep 12, 2018

nnethercote commented Sep 12, 2018

nnethercote commented Sep 13, 2018

nnethercote commented Sep 17, 2018

nnethercote commented Sep 18, 2018 • edited Loading

nnethercote commented Sep 19, 2018

lqd commented Jul 25, 2018 •

edited

Loading

nikomatsakis commented Aug 7, 2018 •

edited

Loading

nnethercote commented Sep 18, 2018 •

edited

Loading