Developer tooling for easier chain experience #902

Closed
greg-szabo opened this issue May 5, 2021 · 2 comments · Fixed by #928
Labels
I: infrastructure Internal: related to Infrastructure (testing, deployment, etc)
Milestone
05.2021

Comments
@greg-szabo (Member)

Crate

scripts

Summary

This is an investigative issue to improve the developer experience of testing the relayer on a local machine.

Problem Definition

User stories:

  • As a developer, I would like to easily start one or two validator nodes on my local machine using local binaries. The networks are separate and have easily identifiable ports assigned on my machine, so I can easily manage and use them over those ports.
  • As a developer, I would like to easily start an additional full node for a chain. Starting the node should only involve pointing the startup script at the validator; the node then starts on an identified set of ports. Starting and stopping the node is easy to manage.

Fulfilling these user stories with an easily configurable tool (possibly a set of scripts) would help local testing.

Managing gaiad locally is not really hard, but it is cumbersome. Creating a validator requires running multiple steps, copy-pasting results, remembering keys, and so on. Creating a full node requires finding the network's genesis.json, configuring config.toml to connect to a validator, etc. These steps are done and redone in all our testing tools, yet when running something locally we still mostly run gaiad by hand and execute each step one by one.

Proposal

I propose a set of scripts, used by developers from the console, that allow "point-and-run" creation and maintenance of one or two validators and their corresponding full nodes. (gaiad manager = gm)

  • Why not a high-level language like Rust?
    Most of the work of creating and maintaining a local node consists of low-level OS operations: copying files and running gaiad processes. (The most complex part is updating TOML files.) Any high-level language adds complexity for long-term supportability, and in the long term these tools will most probably change a lot, so there's no point investing in long-term support yet. When we get to that point, we'll most probably have a much better set of requirements and can still implement them in Rust. In other words: this is a proof of concept.

  • How to keep configuration maintenance low?
    There's a lot of inherent knowledge in running nodes on a local machine: we have the genesis file at hand, we don't care about key security (the test keyring backend is fine), but we want the full node's persistent_peers automatically populated. We know we use internal IPs, so strict mode is always false in the Tendermint config.
    Some of these configuration items are fixed, and the script can maintain them automatically (strict_mode). Some should be automatically generated and kept up-to-date for developer convenience (persistent_peers). And some have to be left open for developers to change at will (timeouts).
    To share this inherent knowledge, I propose a config file (let's call it gm.toml) that documents it, so the set of scripts can use and reuse it (and the developer and the script can agree on what to do). This config file needs to be straightforward for the developer to read and edit.
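As a sketch of how the "auto-maintain" idea could look in practice (a hypothetical helper, not part of any existing tool), a single sed invocation per key goes a long way with Tendermint-style config.toml files:

```shell
#!/bin/sh
# Hypothetical sketch: overwrite a single top-level key in a
# Tendermint-style config.toml. A real gm could use something like this
# for strict mode, persistent_peers, etc. Assumes the key appears once.
set_toml_key() {
  file="$1"; key="$2"; value="$3"
  # Rewrite `key = ...` in place, keeping a .bak copy of the original.
  sed -i.bak "s|^${key} *=.*|${key} = ${value}|" "$file"
}

# Demo on a throwaway file:
cat > /tmp/gm_demo_config.toml <<'EOF'
addr_book_strict = true
persistent_peers = ""
EOF
set_toml_key /tmp/gm_demo_config.toml addr_book_strict false
grep '^addr_book_strict' /tmp/gm_demo_config.toml
```

This is deliberately dumb (no TOML parsing, no section awareness); for the keys gm cares about, that may well be enough.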

The script(s) should be able to set up a validator and a corresponding full node for the developer with one (or a few) command(s). They should report all usable information back to the developer so the nodes can also be managed separately from this tool.

Example of how this would work:

gm.toml:

[global]
gaiad_binary=$HOME/go/bin/gaiad
ports_start_at=27000
home_dir=$HOME/.gm
auto_maintain_config=yes

[validator1]

[node]
network=validator1
gaiad_binary=$HOME/git/cosmos/gaia/build/gaiad

[validator2]
home_dir=$HOME/git/informal/ibc-rs/scripts/full-mesh/network1

The above config defines two separate validator nodes (on separate networks) and a full node for validator1.
Validator1 uses the global defaults, the full node uses a custom gaiad binary, and validator2 has its own configuration folder elsewhere. For all nodes, the tool maintains some configuration keys automatically, such as persistent_peers for full nodes (so they automatically connect to validators) or, similarly, the unconditional_peers section in the validator. The tool will auto-create the configuration if there is none and optionally try to fix some issues (for example, a missing genesis.json in a full node) where it's straightforward to do so.
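For illustration, here is one way a persistent_peers value could be assembled from known node IDs and P2P ports (a hypothetical sketch; the `id@host:port` comma-separated peer format is Tendermint's):

```shell
#!/bin/sh
# Hypothetical sketch: build a Tendermint persistent_peers string from
# (node_id, host, p2p_port) triples, e.g. for a full node's config.toml.
make_persistent_peers() {
  # args: node_id host p2p_port [node_id host p2p_port ...]
  peers=""
  while [ "$#" -ge 3 ]; do
    peers="${peers:+$peers,}$1@$2:$3"   # append, comma-separated
    shift 3
  done
  printf '%s\n' "$peers"
}

# The node ID and port here match the example status table below.
make_persistent_peers abcdef1234567890 localhost 27003
```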

Commands to manage the set:

Start services:

$ gm start
Creating validator1 config...
validator1 RPC at http://localhost:27000
validator1 GRPC at http://localhost:27002

Creating node config...
node RPC at http://localhost:27004
node GRPC at http://localhost:27006

Using existing validator2 config...
validator2 RPC at http://localhost:27008
validator2 GRPC at http://localhost:27010

(use gm start validator1 to start only one service.)
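The port numbers above follow a simple scheme: each node gets a block of four consecutive ports (RPC, APP, GRPC, P2P) counting up from ports_start_at. A sketch of that allocation (helper name is illustrative):

```shell
#!/bin/sh
# Hypothetical sketch of gm's port allocation: node i gets the block
# [ports_start_at + 4i, ports_start_at + 4i + 3] for RPC/APP/GRPC/P2P.
PORTS_START_AT=27000

node_port() {
  # args: node_index (0-based), role (rpc|app|grpc|p2p)
  idx="$1"; role="$2"
  base=$((PORTS_START_AT + idx * 4))
  case "$role" in
    rpc)  echo "$base" ;;
    app)  echo $((base + 1)) ;;
    grpc) echo $((base + 2)) ;;
    p2p)  echo $((base + 3)) ;;
  esac
}

node_port 0 rpc    # validator1 RPC -> 27000
node_port 1 grpc   # node GRPC     -> 27006
```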

Query services:

$ gm status
Name         PID   RPC   APP    GRPC  P2P    NodeID           HomeDir
validator1   1234  27000 27001  27002 27003  abcdef1234567890 /Users/greg/.gm/validator1
|-node       1236  27004 27005  27006 27007  fedcba0987654321 /Users/greg/.gm/node
validator2  (1238)     -     -      -     -  defcbadeadbeef32 /Users/greg/git/informal/ibc-rs/scripts/full-mesh/network1

(You can see that something must have happened to validator2: the process died after start, so its PID is shown in parentheses.)

This gives the developer direct access to all node processes; they can start and stop them as they see fit, either with the gm tool or separately. (In the latter case the tool will try to catch up as best it can.)
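The parenthesized PID in the status output could come from a simple liveness probe; `kill -0` checks for process existence without actually sending a signal (hypothetical sketch):

```shell
#!/bin/sh
# Hypothetical sketch: how gm status could render a PID column entry.
# A recorded PID is parenthesized when the process is no longer running.
format_pid() {
  pid="$1"
  if kill -0 "$pid" 2>/dev/null; then
    printf '%s\n' "$pid"      # process exists: print as-is
  else
    printf '(%s)\n' "$pid"    # process gone: parenthesize
  fi
}

format_pid $$        # our own shell, certainly alive
format_pid 999999    # very unlikely to be a live PID
```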

Show keys:

$ gm keys validator1
/Users/greg/go/bin/gaiad keys list --keyring-backend test --keyring-dir /Users/greg/.gm/validator1
- name: validator
  type: local
  address: cosmos1npveugqwyr0kkkl39lkxw0xlcahx7pvtesne28
  pubkey: cosmospub1addwnpepqvqsvqedwgc4vsn3yt49hy7ndpwp03urs3z5e9q49xcf8uf2na9quunwpcw
  mnemonic: "fluid verb can gadget clean frozen comfort goose blast visa middle abandon ignore device witness acoustic borrow evoke aware duty stage kitten suggest inquiry"
  threshold: 0
  pubkeys: []
- name: "wallet"
  type: local
  address: cosmos1wl9c48kgszlmgx6qyagmexfyjxuvhrut3gk9ev
  pubkey: cosmospub1addwnpepqfr5j0rmns09489hyndxcvee87krn5v32u4lzm4wwn0gnh2rdmfggsavxdu
  mnemonic: "click private name imitate forward table pair shield lab error design smooth barrel industry trend hurry cinnamon attitude over menu pigeon spy grace burger"
  threshold: 0
  pubkeys: []

Show log:

$ gm log validator2 -f
tail -f /Users/greg/git/informal/ibc-rs/scripts/full-mesh/network1/log
(regular log file with consensus error not depicted)

Notes:

  • For the first draft, only single-node validators are supported, but multiple full nodes can be connected to validators directly. This can be expanded later with multi-node validators or with full nodes connecting to other full nodes.
  • The tool will not be able to manage nodes started outside of it, but nodes created with gm can happily connect to nodes outside the tool.
  • Feedback is welcome, especially on hermes-related extensions. (Do we want the tool to manage hermes relayer processes?)

Acceptance Criteria

When the above-detailed commands can be executed. (Details can be negotiated.)


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate milestone (priority) applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@greg-szabo (Member, Author)

$ gm validator1 p2p
http://localhost:27003

or maybe gm p2p validator1? Or preferably both. Just brainstorming.

@brapse (Contributor)

brapse commented May 11, 2021

This sounds absolutely amazing. Reminds me a bit of Kubernetes tooling. The scope of this could be pretty vast, so it might be good to see if the community wants to help out here.

@romac romac added the I: infrastructure Internal: related to Infrastructure (testing, deployment, etc) label May 21, 2021
@adizere adizere added this to the 05.2021 milestone May 21, 2021