-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[DONTMERGE] Add debug logs and make approval voting failable #3783
base: master
Are you sure you want to change the base?
Conversation
ac8810e
to
a58867c
Compare
node/core/approval-voting/src/lib.rs
Outdated
@@ -97,6 +99,7 @@ const APPROVAL_CHECKING_TIMEOUT: Duration = Duration::from_secs(120); | |||
const APPROVAL_CACHE_SIZE: usize = 1024; | |||
const TICK_TOO_FAR_IN_FUTURE: Tick = 20; // 10 seconds. | |||
const LOG_TARGET: &str = "parachain::approval-voting"; | |||
const MALICIOUS_FREQUENCY: usize = 5000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI 5000 is quite a lot. 1 parachain block requires 20ish approvals, so this means 1 in 200 parachain blocks. I'd recommend tweaking this to 50_000 or 100_000 so it's only once every few hours.
let _ = rx.send(recent_disputes.keys().cloned().collect()); | ||
}, | ||
DisputeCoordinatorMessage::ActiveDisputes(rx) => { | ||
// TODO (ladi): Shouldn''t this load active disputes? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's only 'recent disputes' on disk, and then this filters them to be only active
@@ -103,7 +104,7 @@ impl Error { | |||
tracing::debug!(target: LOG_TARGET, err = ?self) | |||
}, | |||
// it's worth reporting otherwise | |||
_ => tracing::warn!(target: LOG_TARGET, err = ?self), | |||
_ => tracing::debug!(target: LOG_TARGET, err = ?self), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why demote this?
@@ -119,7 +120,7 @@ where | |||
match ctx.recv().await { | |||
Err(_) => return, | |||
Ok(FromOverseer::Signal(OverseerSignal::Conclude)) => { | |||
tracing::info!(target: LOG_TARGET, "Received `Conclude` signal, exiting"); | |||
tracing::debug!(target: LOG_TARGET, "Received `Conclude` signal, exiting"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And many others in the same file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's easier (for now) to just display everything in Debug logs imo. I can change it back if you see it differently.
This PR is not intended to get merged at any point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can match logs with e.g. level =~"DEBUG|INFO|ERROR|WARN"
... just for reference.
a58867c
to
78abae4
Compare
78abae4
to
d12cde4
Compare
603d79f
to
0f4d592
Compare
0f4d592
to
65f7767
Compare
.collect(); | ||
actions.append(&mut approvals); | ||
} | ||
ApprovalOutcome::Malicious((candidate, session_index)) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a lot of extra machinery to add instead of just sending the message from within the if coin_flip == 0
block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was debating whether or not to do it simply with the coin_flip == 0
, but thought maybe this was a little cleaner.
I realize we aren't going to be re-using this code, but it wasn't a large change. Will keep in mind to keep it simple going forward.
node/core/approval-voting/src/lib.rs
Outdated
@@ -100,7 +101,8 @@ const APPROVAL_CACHE_SIZE: usize = 1024; | |||
const TICK_TOO_FAR_IN_FUTURE: Tick = 20; // 10 seconds. | |||
const LOG_TARGET: &str = "parachain::approval-voting"; | |||
const DEBUG_LOG_TARGET: &str = "parachain::ladi-debug-approval-voting"; | |||
const MALICIOUS_FREQUENCY: usize = 5_000; | |||
const MALICIOUS_MEAN: f64 = 2_500f64; | |||
const MALICIOUS_SDEV: f64 = 70f64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: .._STDDEV
is the common abbreviation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wanted MEAN and SDEV to have the same length :)
node/core/approval-voting/src/lib.rs
Outdated
@@ -346,12 +348,15 @@ where | |||
{ | |||
fn start(self, ctx: Context) -> SpawnedSubsystem { | |||
let backend = DbBackend::new(self.db.clone(), self.db_config); | |||
let normal = Normal::new(MALICIOUS_MEAN, MALICIOUS_SDEV); | |||
let offset: u64 = normal.sample(&mut rand::thread_rng()).round() as u64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend adding 3 * STDDEV
which means 99.7% are not edited, but you avoid hickups due to huge outliers.
node/core/approval-voting/src/lib.rs
Outdated
@@ -791,6 +797,7 @@ where | |||
&mut approvals_cache, | |||
&mut subsystem.mode, | |||
actions, | |||
if counter.fetch_add(1, Ordering::SeqCst) == offset { true } else { false }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the if case can be dropped :) just use the boolean expression
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added logging in these two cases, so for now I'll leave it as an if-statement. I'll keep the logging for deploying to one node but can drop the logging before deploying to all validators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (for test)
Signed-off-by: Oliver Tale-Yazdi <[email protected]>
Signed-off-by: Oliver Tale-Yazdi <[email protected]>
…adi-logging-disputes
* changed lease period to 1 day * bumped version * bumped version again * changed 356 to 365 days to mimic one year Co-authored-by: Santi Balaguer <[email protected]>
* changed lease period to 1 day * bumped version * bumped version again * changed 356 to 365 days to mimic one year Co-authored-by: Santi Balaguer <[email protected]>
8aa9955
to
e63e448
Compare
* Switch substrate branch to polkadot-v0.9.16 * Bump spec_version * Bump version to 0.9.16
* New changelog scripts (paritytech#4491) * Add templates * Add folder for local storage of the digests * Add first draft of the changelog scripts * Enable Audits in the change template * Fixes for Polkadot * Fix templating issue in case there is no high prio change * Fix Ruby setup * Remove shell * Fix chain names * Fix ENV * Fix how to get runtime * Fix runtime_dir * Fix context location * Pin changelogerator to a specific version * New weights for Westend * New weights for Kusama * New weights for Polkadot * deps update
ce4e0b0
to
cd4b791
Compare
35a6216
to
f58d17b
Compare
f58d17b
to
61de01c
Compare
Malicious node implementation for burn-in of disputes on Rococo.
This adds substantial logging and causes nodes to falsely raise disputes on good candidate at a frequency of 1/5000 approvals. The next step is to alter candidate-backing to second a deterministically bad candidate at a similar frequency.