
page_service: rewrite batching to work without a timeout, pipeline in protocol handler instead #9851

Status: Draft. Wants to merge 54 commits into base: problame/merge-getpage-test.

Commits:
1639b26  async-timer based approach (problame, Nov 20, 2024)
f3ed569  Revert "async-timer based approach" (problame, Nov 20, 2024)
81d9970  tokio::time::Interval based approach (problame, Nov 20, 2024)
1d85bec  Revert "tokio::time::Interval based approach" (problame, Nov 20, 2024)
12124b2  tokio_timerfd::Interval (problame, Nov 20, 2024)
f9bf038  Revert "tokio_timerfd::Interval" (problame, Nov 20, 2024)
689788c  async-timer based approach (again, with data) (problame, Nov 20, 2024)
7be13bc  undo local modifications to benchmark (problame, Nov 20, 2024)
c73e9e4  try async-timer 1.0.0-beta15 (still signal-based timers) (problame, Nov 20, 2024)
68550f0  async-timer 1.0.0-beta15 with features=tokio1 (problame, Nov 20, 2024)
721643b  try interval-based impl to cross-chec (problame, Nov 20, 2024)
5f3e6f3  Revert "try interval-based impl to cross-chec" (problame, Nov 20, 2024)
cbb5817  Revert "async-timer 1.0.0-beta15 with features=tokio1" (problame, Nov 20, 2024)
21866fa  Revert "try async-timer 1.0.0-beta15 (still signal-based timers)" (problame, Nov 20, 2024)
469ce81  Revert "async-timer based approach (again, with data)" (problame, Nov 20, 2024)
fcda7a7  tokio_timerfd::Delay based impl (problame, Nov 20, 2024)
f22ad86  Revert "tokio_timerfd::Delay based impl" (problame, Nov 20, 2024)
517dda8  vanilla tokio based timer impl based on tokio::time::Sleep (problame, Nov 20, 2024)
c68661d  Revert "undo local modifications to benchmark" (problame, Nov 20, 2024)
89b6cb8  Revert "vanilla tokio based timer impl based on tokio::time::Sleep" (problame, Nov 20, 2024)
fa7ce2c  the final choice: async-timer 1.0beta15 with features=["tokio1"] (problame, Nov 21, 2024)
09e7485  Merge branch 'problame/merge-getpage-test' into problame/batching-timer (problame, Nov 21, 2024)
a1bb2e7  WIP: pipelined batching (problame, Nov 21, 2024)
aa1032a  no need for cancel & ctx in pagestream_do_batch (problame, Nov 21, 2024)
345f8b6  fix ready_for_next_batch order (problame, Nov 21, 2024)
408bc8f  cleanups (problame, Nov 21, 2024)
73046fd  span fixes (problame, Nov 21, 2024)
56de071  fruitless debugging (problame, Nov 21, 2024)
7680aa1  draft (problame, Nov 21, 2024)
240e48d  improvements (problame, Nov 21, 2024)
db9093f  revert back to 'span fixes' commit (problame, Nov 21, 2024)
88fd8ae  watch-based approach (problame, Nov 21, 2024)
89d9d16  cherry-pick from problame/batching-benchmark while it's waiting for m… (problame, Nov 22, 2024)
a3d1cf6  config changes to express pipelining config (not respected yet) (problame, Nov 22, 2024)
c1040bc  task-based mode (problame, Nov 22, 2024)
0fa8ae3  WIP refactor to allow truly serial mode (problame, Nov 22, 2024)
093674b  impmlement the serial mode (problame, Nov 22, 2024)
c1e8347  make configurable whether pipelining should use concurrent futures or… (problame, Nov 22, 2024)
39e45f9  improve tests (problame, Nov 22, 2024)
ef502f8  remove async-timer heritage (problame, Nov 22, 2024)
a28c54d  cosmetics (problame, Nov 22, 2024)
d6e5a46  eliminate the word `batch` and stale doc comments (problame, Nov 22, 2024)
11dc713  rename test file to test_page_service_batching (problame, Nov 22, 2024)
5796f3b  fix test (problame, Nov 22, 2024)
cbe1839  Benchmark results (metal node, AMD Ryzen 9 7950X3D 16-Core Processor) (problame, Nov 22, 2024)
990e44d  longer target runtime (problame, Nov 22, 2024)
bd31f42  run benchmarks (problame, Nov 22, 2024)
6ec5ac1  DO NOT MERGE: enable pipelining (32,concurrent-futures) by default so… (problame, Nov 25, 2024)
0bb0372  logging to debug test_pageserver_restarts_under_worload (problame, Nov 25, 2024)
b9477aa  fix: batcher wouldn't shut down after executor exits (problame, Nov 25, 2024)
99b664c  expand fix to tasks mode; add some comments (problame, Nov 25, 2024)
9bf2618  implement spsc_fold (problame, Nov 26, 2024)
a23abb2  adopt spsc_fold (problame, Nov 26, 2024)
41ddc67  benchmark (problame, Nov 26, 2024)
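The commit sequence shows the timer-based experiments (async-timer, tokio_timerfd, tokio::time) all being reverted in favor of a timerless design, matching the PR title. One common way to batch without a timeout, sketched here purely for illustration (blocking `mpsc` instead of the PR's async code, and all names hypothetical), is to block for the first request and then greedily fold in whatever is already queued, up to `max_batch_size`:

```rust
use std::sync::mpsc::{channel, Receiver, TryRecvError};

/// Timerless batching sketch: wait for one request, then drain
/// everything that is immediately available, capped at `max_batch_size`.
/// Returns `None` once the sender side is closed and drained.
fn next_batch(rx: &Receiver<u64>, max_batch_size: usize) -> Option<Vec<u64>> {
    let first = rx.recv().ok()?; // block until at least one request arrives
    let mut batch = vec![first];
    while batch.len() < max_batch_size {
        match rx.try_recv() {
            Ok(req) => batch.push(req), // more input is synchronously available
            Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
        }
    }
    Some(batch)
}

fn main() {
    let (tx, rx) = channel();
    for req in 0..10u64 {
        tx.send(req).unwrap();
    }
    drop(tx);
    assert_eq!(next_batch(&rx, 4), Some(vec![0, 1, 2, 3]));
    assert_eq!(next_batch(&rx, 4), Some(vec![4, 5, 6, 7]));
    assert_eq!(next_batch(&rx, 4), Some(vec![8, 9]));
    assert_eq!(next_batch(&rx, 4), None);
}
```

The key property, compared to the reverted timer approaches, is that batch size adapts to load with no added latency floor: an idle connection dispatches single requests immediately, while a busy one naturally accumulates larger batches.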
13 changes: 10 additions & 3 deletions Cargo.lock

(Generated file; diff not rendered.)

1 change: 1 addition & 0 deletions Cargo.toml
@@ -78,6 +78,7 @@
 comfy-table = "7.1"
 const_format = "0.2"
 crc32c = "0.6"
 dashmap = { version = "5.5.0", features = ["raw-api"] }
+diatomic-waker = { version = "0.2.3" }
 either = "1.8"
 enum-map = "2.4.2"
 enumset = "1.0.12"
26 changes: 20 additions & 6 deletions libs/pageserver_api/src/config.rs
@@ -109,8 +109,7 @@ pub struct ConfigToml {
     pub virtual_file_io_mode: Option<crate::models::virtual_file::IoMode>,
     #[serde(skip_serializing_if = "Option::is_none")]
     pub no_sync: Option<bool>,
-    #[serde(with = "humantime_serde")]
-    pub server_side_batch_timeout: Option<Duration>,
+    pub page_service_pipelining: Option<PageServicePipeliningConfig>,
 }
 
 #[derive(Debug, Clone, PartialEq, Eq, serde::Serialize, serde::Deserialize)]
@@ -127,6 +126,21 @@ pub struct DiskUsageEvictionTaskConfig {
     pub eviction_order: EvictionOrder,
 }
 
+#[derive(Debug, Clone, PartialEq, Eq, serde::Serialize, serde::Deserialize)]
+#[serde(deny_unknown_fields)]
+pub struct PageServicePipeliningConfig {
+    /// Causes runtime errors if larger than max get_vectored batch size.
+    pub max_batch_size: NonZeroUsize,
+    pub protocol_pipelining_mode: PageServiceProtocolPipeliningMode,
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq, serde::Serialize, serde::Deserialize)]
+#[serde(rename_all = "kebab-case")]
+pub enum PageServiceProtocolPipeliningMode {
+    ConcurrentFutures,
+    Tasks,
+}
+
 pub mod statvfs {
     pub mod mock {
         #[derive(Debug, Clone, PartialEq, Eq, serde::Serialize, serde::Deserialize)]
@@ -319,8 +333,6 @@ pub mod defaults {
     pub const DEFAULT_EPHEMERAL_BYTES_PER_MEMORY_KB: usize = 0;
 
     pub const DEFAULT_IO_BUFFER_ALIGNMENT: usize = 512;
-
-    pub const DEFAULT_SERVER_SIDE_BATCH_TIMEOUT: Option<&str> = None;
 }
 
 impl Default for ConfigToml {
@@ -401,10 +413,12 @@ impl Default for ConfigToml {
             ephemeral_bytes_per_memory_kb: (DEFAULT_EPHEMERAL_BYTES_PER_MEMORY_KB),
             l0_flush: None,
             virtual_file_io_mode: None,
-            server_side_batch_timeout: DEFAULT_SERVER_SIDE_BATCH_TIMEOUT
-                .map(|duration| humantime::parse_duration(duration).unwrap()),
             tenant_config: TenantConfigToml::default(),
             no_sync: None,
+            page_service_pipelining: Some(PageServicePipeliningConfig {
Review thread on `page_service_pipelining: Some(PageServicePipeliningConfig {`:

Contributor: Can we enable this in a few unit/python tests to get some rudimentary continuous test coverage?

Author (problame): I don't get the question. The default is Some here, so we have the entire test suite as coverage.

Contributor: I assumed you won't merge it with pipelining enabled. If that's the case, I am suggesting to parametrize some existing python tests to use pipelining (we'd also have to tweak the compute config).
+                max_batch_size: NonZeroUsize::new(32).unwrap(),
Review thread on `max_batch_size: NonZeroUsize::new(32).unwrap(),`:

Contributor: Could we use MAX_GET_VECTORED_KEYS instead?

Author (problame): Hm, initially I was thinking to revert back to None before merging; it's Some() right now so that I get CI coverage. Let's keep this conversation open and decide later.

+                protocol_pipelining_mode: PageServiceProtocolPipeliningMode::ConcurrentFutures,
+            }),
         }
     }
 }
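For reference, assuming standard serde-based TOML deserialization of the types added above (field names taken from the struct definition, enum values kebab-cased per the `#[serde(rename_all = "kebab-case")]` attribute), the new section of `pageserver.toml` would presumably be written as:

```toml
[page_service_pipelining]
# NonZeroUsize; per the doc comment, values larger than the max
# get_vectored batch size cause runtime errors.
max_batch_size = 32
# "concurrent-futures" (the Default impl's choice) or "tasks"
protocol_pipelining_mode = "concurrent-futures"
```

Omitting the whole table would leave `page_service_pipelining` as `None`, i.e. pipelining disabled, which is what the review thread above discusses as the likely state at merge time.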
1 change: 1 addition & 0 deletions libs/utils/Cargo.toml
@@ -19,6 +19,7 @@
 bincode.workspace = true
 bytes.workspace = true
 camino.workspace = true
 chrono.workspace = true
+diatomic-waker.workspace = true
 git-version.workspace = true
 hex = { workspace = true, features = ["serde"] }
 humantime.workspace = true
2 changes: 2 additions & 0 deletions libs/utils/src/sync.rs
@@ -1,3 +1,5 @@
 pub mod heavier_once_cell;
 
 pub mod gate;
+
+pub mod spsc_fold;
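The diff itself doesn't show the `spsc_fold` module body, but judging by its name, its use in the batcher-to-executor handoff (commits 9bf2618 "implement spsc_fold" and a23abb2 "adopt spsc_fold"), and the new `diatomic-waker` dependency, it appears to be a single-producer/single-consumer cell where a send folds into any value the consumer has not yet taken. A blocking sketch of that idea using only std primitives (the real module is presumably async and waker-based; every name here is hypothetical):

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Hypothetical sketch of the spsc_fold idea: a capacity-1 channel where a
/// send that finds a value still pending folds the new value into it instead
/// of blocking, so the consumer always observes the combined state.
struct SpscFold<T> {
    slot: Mutex<Option<T>>,
    cv: Condvar,
}

impl<T> SpscFold<T> {
    fn new() -> Arc<Self> {
        Arc::new(SpscFold { slot: Mutex::new(None), cv: Condvar::new() })
    }

    /// Send `value`, folding it into any not-yet-consumed pending value.
    fn send(&self, value: T, fold: impl FnOnce(T, T) -> T) {
        let mut slot = self.slot.lock().unwrap();
        *slot = Some(match slot.take() {
            Some(pending) => fold(pending, value),
            None => value,
        });
        self.cv.notify_one();
    }

    /// Block until a value is available, then take it, emptying the slot.
    fn recv(&self) -> T {
        let mut slot = self.slot.lock().unwrap();
        loop {
            if let Some(v) = slot.take() {
                return v;
            }
            slot = self.cv.wait(slot).unwrap();
        }
    }
}

fn main() {
    let ch = SpscFold::new();
    let producer = Arc::clone(&ch);
    let handle = std::thread::spawn(move || {
        // Two sends that both land before the consumer takes: they fold.
        producer.send(vec![1, 2], |mut a, b| { a.extend(b); a });
        producer.send(vec![3], |mut a, b| { a.extend(b); a });
    });
    handle.join().unwrap();
    // The consumer sees one folded batch, not two separate messages.
    assert_eq!(ch.recv(), vec![1, 2, 3]);
}
```

This shape would explain the fix in b9477aa ("batcher wouldn't shut down after executor exits"): with a one-slot handoff, the producer needs to notice consumer death, which a bare Mutex/Condvar version like this one does not model.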