Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align watcher::Event init/page variants #1504

Merged
merged 10 commits into from
Jun 4, 2024
Merged

Align watcher::Event init/page variants #1504

merged 10 commits into from
Jun 4, 2024

Conversation

clux
Copy link
Member

@clux clux commented May 29, 2024

Merges - Event::InitPage into Event::InitApply

Follow-up to the unreleased #1494 as suggested in #1499.

This has several consequences;

  1. buffering of individual pages done in watcher
  2. no buffering needed in any flatteners (no need for the concept of flattening)
  3. no need to eventually break the Event enum if we remove paging (when streaming lists is stabilised and everywhere)

Because of 1. this slightly changing the no-buffering PR in #1494 and as such it needs to be benchmarked, but it is a nice reduction in complexity within the runtime if we can justify it. EDIT: it performs the same.

The second point kind of hints at an even larger refactoring where we do not pass vector around internally at all and possibly killing/repurposing EventFlatten.

**Merges - `Event::InitPage` and `Event::InitApply`**

This has several consequences;

- buffering of pages done in watcher (need to benchmark this since it is likely worse memory wise)
- no buffering needed in any flatteners (further simplification can be done later)

Signed-off-by: clux <[email protected]>
@clux clux added the changelog-change changelog change category for prs label May 29, 2024
@clux clux changed the title Align watcher::Event variants for initialisation Align watcher::Event init/page variants May 29, 2024
clux added 2 commits May 29, 2024 17:15
Signed-off-by: clux <[email protected]>
Copy link

codecov bot commented May 29, 2024

Codecov Report

Attention: Patch coverage is 83.87097% with 10 lines in your changes missing coverage. Please review.

Project coverage is 75.1%. Comparing base (de3fe1e) to head (7706413).

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1504     +/-   ##
=======================================
+ Coverage   75.0%   75.1%   +0.2%     
=======================================
  Files         78      78             
  Lines       6854    6864     +10     
=======================================
+ Hits        5134    5150     +16     
+ Misses      1720    1714      -6     
Files Coverage Δ
kube-runtime/src/reflector/dispatcher.rs 96.5% <100.0%> (+0.3%) ⬆️
kube-runtime/src/reflector/mod.rs 100.0% <100.0%> (ø)
kube-runtime/src/reflector/store.rs 96.9% <ø> (+3.0%) ⬆️
kube-runtime/src/utils/event_flatten.rs 91.2% <100.0%> (+1.5%) ⬆️
kube-runtime/src/utils/event_modify.rs 95.5% <100.0%> (ø)
kube-runtime/src/utils/predicate.rs 73.4% <100.0%> (ø)
kube-runtime/src/utils/reflect.rs 100.0% <100.0%> (ø)
kube-runtime/src/utils/watch_ext.rs 22.3% <50.0%> (ø)
kube-runtime/src/watcher.rs 41.8% <67.9%> (+1.7%) ⬆️

@clux
Copy link
Member Author

clux commented May 29, 2024

Quick controller deployment of gives me the exact same memory profile as the one with zero buffering, so this appears to be a pure ergonomic improvement.

@clux clux marked this pull request as ready for review May 29, 2024 16:55
@clux clux requested review from nightkr and mateiidavid May 29, 2024 19:00
@clux clux added this to the 0.92.0 milestone May 30, 2024
@clux
Copy link
Member Author

clux commented May 30, 2024

For clarity: my benchmarked (running in a real world controller):
Screenshot 2024-05-30 at 11 04 21

last two uses new store buffering, and the last uses this PR. memory profile is always slightly growing in this controller (separate issue), but the starting baseline is exactly the same between this and the previous store buffering. All controllers doing the old internal buffering started at ~80+M (though with some variance), but the two new ones are right there at 46M which nothing has gotten close to.

Comment on lines -810 to -815
// We're filtering by object name, so getting more than one object means that either:
// 1. The apiserver is accepting multiple objects with the same name, or
// 2. The apiserver is ignoring our query
// In either case, the K8s apiserver is broken and our API will return invalid data, so
// we had better bail out ASAP.
Ok(Event::InitPage(objs)) if objs.len() > 1 => Some(Err(Error::TooManyObjects)),
Copy link
Member Author

@clux clux May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: This comment was technically wrong because it can happen if users use a Api::all scoped Api against names that exist in multiple namespaces. Now we just pick the first consistently (which is also what we did for streaming lists).

Additionally because page events do not happen anymore, the TooManyObjects error goes away (this is the only real place it happened). Tests have used this error variant everywhere as a type of convenience though, so that's why there's many strange test changes touching this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that explains the tests... is the ordering random or does the api server ever pre-filter the objects based on some other property like timestamp (i.e. creation time)?

Copy link
Member Author

@clux clux Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think on a watch against rv="0" it's actually alphabetical - which in this case doesn't help much if there's duplicate objects (in say different namespaces).

bad idea

it's possible we could change this fn to split depending on what scope the `Api` is used for, ala:

    let fields = if let Some(ns) = api.namespace {
        format!("metadata.name={name},metadata.namespace={ns}")
    } else {
        format!("metadata.name={name}")
    };

but that requires exposing namespace as pub, so will do that as a follow-up
EDIT: it also doesn't help because if the Api (being passed in` is scoped to namespace, then the watch is also scoped so this suggestion is pointless.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor docs follow-up for this: #1510

Copy link
Contributor

@mateiidavid mateiidavid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks so much for the clean-up and really straightforward comments, it made reviewing much easier.

Comment on lines -20 to -24
Self {
stream,
queue: vec![].into_iter(),
emit_deleted,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kind of cool that this got simplified as a result

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, think we can maybe deprecate / repurpose this thing entirely to some kind of filter module instead (since the concept of an "unflattened stream" kind of goes away externally (which is good, it was a big source of confusion).

kube-runtime/src/watcher.rs Outdated Show resolved Hide resolved
}
InitialListStrategy::ListWatch => (Some(Ok(Event::Init)), State::InitPage {
continue_token: None,
objects: VecDeque::default(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we know the page size in advance (since it's part of the strategy) would it make sense to allocate a ring buffer with a predetermined capacity equal to the page size? Bit of a micro optimisation I suppose 🤔

Edit: nvm, this wouldn't work since we always construct a new InitPage for each page we consume. Would overcomplicate things.

Copy link
Member Author

@clux clux Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, yeah, maybe there is something we can do here, but not sure what. it's also going to become "the old way" of doing things once this kubernetes feature becomes stabilised and rolled out everywhere, so 🤷

it doesn't seem to perform any noticeably worse than with main (with the zero buffering in watcher setup) so it's probably fine. have run this for a week now.

Comment on lines -810 to -815
// We're filtering by object name, so getting more than one object means that either:
// 1. The apiserver is accepting multiple objects with the same name, or
// 2. The apiserver is ignoring our query
// In either case, the K8s apiserver is broken and our API will return invalid data, so
// we had better bail out ASAP.
Ok(Event::InitPage(objs)) if objs.len() > 1 => Some(Err(Error::TooManyObjects)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that explains the tests... is the ordering random or does the api server ever pre-filter the objects based on some other property like timestamp (i.e. creation time)?

@clux clux merged commit 6ce3978 into main Jun 4, 2024
17 checks passed
@clux clux deleted the watcher-strategy-align branch June 4, 2024 11:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog-change changelog change category for prs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants