Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: rework validator configs #1939

Merged
merged 1 commit into from
May 13, 2022
Merged

config: rework validator configs #1939

merged 1 commit into from
May 13, 2022

Conversation

bmwill
Copy link
Contributor

@bmwill bmwill commented May 13, 2022

This patch does a large rework of our validator configs. Specifically it
creates a single ValidatorConfig which represents all of the
configuration a Validator needs to start up. It also creates a
light-weight ValidatorInfo type which contains information we expect
to know about other validators (stake, public key, network address) and
will eventually be read from on-chain. This is a replacement for the
AuthorityInfo type which was a type that all validators have for all
other validators but included some odd information like the on-disk path
to a validator's db which other validators don't need to know about.

This patch also does an initial rework of creating a NetworkConfig
which is used when you want to create N validators to run locally.
There's still some additional work that will need to be done to make
this nicer for testing purposes but this is a good start.

@bmwill
Copy link
Contributor Author

bmwill commented May 13, 2022

Note I was able to convert everything in a fairly straightforward manner except some of the benchmarking binaries for remote benchmarking. I think those will require some help from @oxade to get done properly. I still plan on doing a bit more in this area immediately once this PR lands but would like to get this landed asap given the vast blast radius it already has.

@bmwill
Copy link
Contributor Author

bmwill commented May 13, 2022

One thing of note is that now instead of needing to curry around this extra AuthorityKeys type you instead just have all the information you need in a NetworkConfig, if you fully own the network, and a Validator only has the information that it needs to know about.

@bmwill bmwill added Priority: Medium A nice to have feature or annoying bug, non-blocking and no delays expected if we punt on it Status: Needs Review PR is ready for review labels May 13, 2022
@bmwill bmwill requested a review from lxfind May 13, 2022 18:09
// let load_gen_path = Path::new(&path_str);

// lg.persisted(load_gen_path).save().unwrap();
// }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commented code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, i've removed it

@mystenmark
Copy link
Contributor

Overall this seems like a good idea. Some comments:

  • It appears you've switched to using public keys rather than addresses as the identifier of a validator in the config. I think we should probably stick with addresses.
  • This is going to require significant changes for deployment to devnet. One issue is that we will need to have a way to generate a network.conf for a solo validator. Unless those configs are now created by hand (I don't know what changes were necessary for when we removed the private keys from the configs).

@bmwill
Copy link
Contributor Author

bmwill commented May 13, 2022

  • It appears you've switched to using public keys rather than addresses as the identifier of a validator in the config. I think we should probably stick with addresses.

A SuiAddress is derived from a PublicKey so this actually enforces that we don't have an address that mis-matches the public key. Validators are still identified the same.

  • This is going to require significant changes for deployment to devnet. One issue is that we will need to have a way to generate a network.conf for a solo validator. Unless those configs are now created by hand (I don't know what changes were necessary for when we removed the private keys from the configs).

Yes this will and will need to be handled in a separate change. We don't need nor want a network.conf for a solo validator. Instead the validator binary should only accept a ValidatorConfig as a network config contains information about all validators for testing purposes and individual validators shouldn't know all of the config information for other validators. So this might mean that we have some pre-step that generates the network config for all validators and then distributes the individual validator configs to the respective validators.

Copy link
Contributor

@longbowlu longbowlu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for taking time cleaning this up!
This is a relevant issue being tracked #1837

Main feedback from the 1st round of looking: how do we expect the genesis process looks like for testnet/mainnet? My gut feeling is we will need to share a file that contains all validator's information, and only public information. (some context: Initially the network.conf had all the private keys and we moved away to only contain one key pair recently)

let (network_config, _, _) = genesis(genesis_conf, Some(adddress)).await?;
network_config
let mut genesis_conf: GenesisConfig = PersistedConfig::read(&cfg.genesis_config_path)?;
genesis_conf.quorum_size = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is specifically for the case where a config isn't provided and we want to spin up a lonely validator. I'm not so sure i agree with the overall flow in this case but didn't want to change even more things in this PR since its already long

Comment on lines 64 to +66
let listen_address = cfg
.listen_address
.unwrap_or_else(|| authority.network_address.clone());

let consensus_committee = network_config.make_narwhal_committee();

let consensus_parameters = ConsensusParameters {
max_header_delay: std::time::Duration::from_millis(5_000),
max_batch_delay: std::time::Duration::from_millis(5_000),
..ConsensusParameters::default()
};
let consensus_store_path = sui_config_dir()?
.join(CONSENSUS_DB_NAME)
.join(encode_bytes_hex(&authority.public_key));
.unwrap_or_else(|| validator_config.network_address().to_owned());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't use this flag now (because we can get it from ValidatorInfo)
should be safe to delete it
cc @mystenmark

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to not adjust the cmdline flags too much so i can leave this to be cleaned up in a future pr.

Comment on lines +11 to +12
use sui_config::PersistedConfig;
use sui_config::{GenesisConfig, ValidatorConfig};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merge?

let mut genesis_conf: GenesisConfig = PersistedConfig::read(&cfg.genesis_config_path)?;
genesis_conf.quorum_size = 1;
let (network_config, _, _) = genesis(genesis_conf).await?;
network_config.into_validator_configs().remove(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same :)


#[derive(Serialize, Deserialize)]
pub struct GenesisConfig {
pub quorum_size: usize,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

committee_size is probably better - quorum_size sounds like the k in k out of n

network_config.authorities.push(authority);
}
let config_dir = sui_config_dir().unwrap();
let mut network_config = NetworkConfig::generate(&config_dir, num_to_provision);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repeating what brandon & I discussed offline:

we’d like to keep the ability to genesis with a pre-specified validator sets (aka with their public keys). This is how it will work for testnet/mainnet and also help with debugability by keeping the same set of validator keys. People will run sui genesis with a genesis config and set up all the dbs etc etc. and we don’t need to socialize two files/config genesis.conf and network.conf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the end we should only need to provide validators with their own validator config.

If we want to fix the names of the validators for debug-ability there is another generate function which takes an RNG so we can just seed it

db_path: PathBuf,
network_address: Multiaddr,
metrics_address: Multiaddr,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

//TODO make narwhal config serializable
#[serde(skip_serializing)]
#[serde(default)]
narwhal_config: DebugIgnore<ConsensusParameters>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why DebugIgnore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConsensusParameters doesn't impl Debug, I plan to fix it but need to do so in the narwhal repo

Comment on lines +152 to +153
/// This is a config that is used for testing or local use as it contains the config and keys for
/// all validators
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Today's workflow:

  1. we collect validator information in genesis.conf, socialize it with all genesis validators (there's a keypair which is specific to the concerned validator)
  2. each validator runs sui genesis and get a network.conf. The network.conf only has public keys of all validators (plus a keypair of the concerned validator)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new workflow should be more straight forward:

  1. We perform genesis and generate a NetworkConfig which includes N Validator configs
  2. We give the corresponding validator config to each validator

@longbowlu longbowlu requested a review from velvia May 13, 2022 19:57
@bmwill
Copy link
Contributor Author

bmwill commented May 13, 2022

Main feedback from the 1st round of looking: how do we expect the genesis process looks like for testnet/mainnet? My gut feeling is we will need to share a file that contains all validator's information, and only public information. (some context: Initially the network.conf had all the private keys and we moved away to only contain one key pair recently)

There's still more work to do to get it to where we'd need for testnet/mainet. But we can't have everyone running genesis themselves. At the end of the day genesis is the initial set of objects/checkpoint that will need to be signed by everyone's private keys so we'll need to perform a ceremony for that. I heavily worked on similar logic for diem so i'm working towards a similar process here. Its going to take a few more PRs to get there though ;)

Copy link
Contributor

@longbowlu longbowlu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bmwill can you provide a concise description of how you envision the genesis would happen?

approving to avoid rebase nightmare

This patch does a large rework of our validator configs. Specifically it
creates a single `ValidatorConfig` which represents all of the
configuration a Validator needs to start up. It also creates a
light-weight `ValidatorInfo` type which contains information we expect
to know about other validators (stake, public key, network address) and
will eventually be read from on-chain. This is a replacement for the
`AuthorityInfo` type which was a type that all validators have for all
other validators but included some odd information like the on-disk path
to a validator's db which other validators don't need to know about.

This patch also does an initial rework of creating a `NetworkConfig`
which is used when you want to create N validators to run locally.
There's still some additional work that will need to be done to make
this nicer for testing purposes but this is a good start.
@bmwill bmwill enabled auto-merge (squash) May 13, 2022 22:20
@bmwill bmwill merged commit 2746dd4 into MystenLabs:main May 13, 2022
@bmwill
Copy link
Contributor Author

bmwill commented May 13, 2022

@longbowlu I provided a short description in the issue you linked: #1837

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: Medium A nice to have feature or annoying bug, non-blocking and no delays expected if we punt on it Status: Needs Review PR is ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants