Merge pull request #75 from D8-X/dev
Merge dev into main
Mantelijo authored Mar 21, 2024
2 parents 4be0b01 + db984e2 commit d1b8980
Showing 18 changed files with 662 additions and 121 deletions.
11 changes: 11 additions & 0 deletions README.md
@@ -369,6 +369,14 @@ On the broker server, inspect the services with `docker ps` and look at the log
files with `cd broker` and `docker compose logs -f`. Redeploy changed configs
via `d8x setup broker-deploy`

Sometimes the swarm cluster's ingress network can get stuck on the manager
server. This may result in HTTP 503 errors or a `Connection refused` message
when you curl swarm services on the manager server. To fix this, re-run the
`d8x setup swarm-deploy` command, which attempts to fix the "broker" ingress
network after deploying the swarm services. There is also a separate
subcommand, `d8x fix-ingress`, which performs only the ingress network fix, but
it requires you to re-run the `swarm-deploy` command afterwards.

</details>
<details>
<summary>How do I update the swarm server software images to a new version?</summary>
@@ -417,6 +425,9 @@ swarm-manager for which you can get the ip with `d8x ip manager`).
docker service update --image "ghcr.io/d8-x/d8x-trader-main:dev@sha256:aea8e56d6077c733a1d553b4291149712c022b8bd72571d2a852a5478e1ec559" stack_api
```

See the [Update Runbook](./UPDATE_RUNBOOK.md) for more guidelines and details
on how updating services works.

</details>

<details>
187 changes: 187 additions & 0 deletions UPDATE_RUNBOOK.md
@@ -0,0 +1,187 @@
# Updating your d8x services

When updating your d8x swarm and broker services, we recommend performing a
database backup before running the actual updates. If something goes wrong, the
backup can be used to restore the state of your previous deployment. Making a
backup is also useful when significant contract changes are introduced, for
example the changes in events (around the beginning of March 2024).

This runbook guides you through performing a database backup and updating your
services.

Follow this runbook from top to bottom.

## Running the database backup

Refer to [Database Backups](./README.md#database-backups) for more details on
how to perform a database backup.

First, you should run the `d8x backup-db` command:

```bash
d8x backup-db
```

This produces output similar to the following:
```bash
┌──────────────────────────┐
│ ____ ___ __ __ │
| _ \ ( _ ) \ \/ / │
| | | | / _ \ \ / │
| |_| | | (_) | / \
|____/ \___/ /_/\_\
│ │
└──────────────────────────┘
Backing up database...

Determining postgres version
Postgres server at lin-55017-35110-pgsql-primary-private.servers.linodedb.net version: 14.6
Ensuring pg_dump is installed on manager server (postgresql-client-16)
Creating database testing_db backup
Backup file size: 0.023752 MB
Database testing_db backup file was downloaded and copied to /home/mantas/work/d8x-cli/build/backup-d8x-cluster-testing-linode-2024-03-12-17-36-36.dump.sql
Removing backup file from server
```

Make sure you securely store the backup file.

## Running the update for specific or all services

Refer to the [README](./README.md) for more information about the `d8x update` command.

Run `d8x update` and select the service(s) you want to update. Depending on
which services are updated, you might be asked to enter private keys or other
information.

```bash
d8x update
```

```bash
┌──────────────────────────┐
│ ____ ___ __ __ │
| _ \ ( _ ) \ \/ / │
| | | | / _ \ \ / │
| |_| | | (_) | / \
|____/ \___/ /_/\_\
│ │
└──────────────────────────┘
Updating swarm services...

Select swarm services to update

[x] api
[x] candles-pyth-client
[x] candles-ws-server
[x] history
[x] referral
╭────────╮
│ OK │
╰────────╯
Select broker-server services to update

[x] broker
[x] executorws
╭────────╮
│ OK │
╰────────╯
Fetching image tags with sha hashes for service referral
Fetching image tags with sha hashes for service candles-pyth-client
Fetching image tags with sha hashes for service candles-ws-server
Fetching image tags with sha hashes for service api
Fetching image tags with sha hashes for service history
Image tags fetched for service ghcr.io/d8-x/d8x-candles-pyth-client
Image tags fetched for service ghcr.io/d8-x/d8x-trader-main
Image tags fetched for service ghcr.io/d8-x/d8x-trader-history
Image tags fetched for service ghcr.io/d8-x/d8x-candles-ws-server
Image tags fetched for service ghcr.io/d8-x/d8x-referral-system

Choose which image reference to update api service to

[x] ghcr.io/d8-x/d8x-trader-main:main@sha256:ac06805f6be51a83e21dfa78d9d27ec425d169623f16ffa43484792a48d8a016
[ ] ghcr.io/d8-x/d8x-trader-main:dev@sha256:2f306c1342d6f7aecc440fd8d841479cb63afa3e0e9b61dceb384a3118000928
[ ] Enter image reference manually
╭────────╮
│ OK │
╰────────╯
Service api will be updated to ghcr.io/d8-x/d8x-trader-main:main@sha256:ac06805f6be51a83e21dfa78d9d27ec425d169623f16ffa43484792a48d8a016

Choose which image reference to update candles-pyth-client service to

[x] ghcr.io/d8-x/d8x-candles-pyth-client:main
[ ] ghcr.io/d8-x/d8x-candles-pyth-client:dev@sha256:5373e33c382f72773d50e3ac7b47f739ca95b05ae6d0f11e1eac9ce800877f3a
[ ] Enter image reference manually
╭────────╮
│ OK │
╰────────╯
Service candles-pyth-client will be updated to ghcr.io/d8-x/d8x-candles-pyth-client:main

Choose which image reference to update candles-ws-server service to

[x] ghcr.io/d8-x/d8x-candles-ws-server:main
[ ] ghcr.io/d8-x/d8x-candles-ws-server:dev@sha256:081eb98ec939d0bfa7e58637fb541c985e36ab0092eca7dd5dc7396f1f5e89ef
[ ] Enter image reference manually
╭────────╮
│ OK │
╰────────╯
Service candles-ws-server will be updated to ghcr.io/d8-x/d8x-candles-ws-server:main

Choose which image reference to update history service to

[x] ghcr.io/d8-x/d8x-trader-history:main@sha256:001704f5249a88cbd93272da705cd92c933837190653b1ad02e7b63add4a24df
[ ] ghcr.io/d8-x/d8x-trader-history:dev@sha256:4e14361b0033ea1917971bd9293a017b0791edf34e7025a344cb091d223c7830
[ ] Enter image reference manually
╭────────╮
│ OK │
╰────────╯
Service history will be updated to ghcr.io/d8-x/d8x-trader-history:main@sha256:001704f5249a88cbd93272da705cd92c933837190653b1ad02e7b63add4a24df

Choose which image reference to update referral service to

[x] ghcr.io/d8-x/d8x-referral-system:main@sha256:5c38fc8938f9cc85a4168386a684930a97cbe1bcdf06e51df1cf7f34b247cfcd
[ ] ghcr.io/d8-x/d8x-referral-system:dev@sha256:cd1925abdcbb17fb063e17370bb3b376d5f85f395e9233873db7e05217098992
[ ] Enter image reference manually
╭────────╮
│ OK │
╰────────╯
Service referral will be updated to ghcr.io/d8-x/d8x-referral-system:main@sha256:5c38fc8938f9cc85a4168386a684930a97cbe1bcdf06e51df1cf7f34b247cfcd
Enter your referral payment executor private key:
> ****************************************************************

Wallet address of entered private key: 0xAc35CA4cC617CFf4143A1471151a904FE535F0c6
Is this the correct address?

╭─────────╮ ╭────────╮
│ yes │ │ no │
╰─────────╯ ╰────────╯

Pruning unused resources on worker servers...
Running docker prune on worker-1:
Deleted Containers:
aefad8dbc219ec4a17a9c5a86f3b56e00937ee2a8627db8a9908d377b5473dc4
fc1a21466bb6de3572ee6d6b8f8cde2dfb31d0d3bace9ea69d7924f0c60a0b27

Total reclaimed space: 0B
Total reclaimed space: 0B

Docker prune on worker 1 completed successfully
Running docker prune on worker-2:
Deleted Containers:
6e251b1f07c935d7b2ac61a0c6ae49fe44abf90bdcda8e3c3125bc54e0d96b9f
205a6f00d081dc3937e4cd3e5907dab50ca6b34eff49609a90e14a25f1e43e84

Total reclaimed space: 0B
Total reclaimed space: 0B

Docker prune on worker 2 completed successfully
Updating api to ghcr.io/d8-x/d8x-trader-main:main@sha256:ac06805f6be51a83e21dfa78d9d27ec425d169623f16ffa43484792a48d8a016
stack_api
overall progress: 2 out of 2 tasks
1/2: running
2/2: running
verify: Service converged
Service api updated successfully
<...>
```
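
The sample output above mixes digest-pinned references (`name:tag@sha256:...`) with tag-only ones (`name:tag`). A tag can point to different images over time, while a digest identifies one exact image. The sketch below splits a reference into its parts to make that difference explicit. `imageRef` and `parseImageRef` are hypothetical names used only for illustration; they are not part of d8x-cli, and this is a simplified split rather than a full reference parser.

```go
package main

import (
	"fmt"
	"strings"
)

// imageRef holds the parts of a reference such as
// ghcr.io/d8-x/d8x-trader-main:main@sha256:<digest>.
// Hypothetical helper type, not taken from d8x-cli.
type imageRef struct {
	Name, Tag, Digest string
}

// Pinned reports whether the reference carries a content digest.
func (r imageRef) Pinned() bool { return r.Digest != "" }

// parseImageRef splits a reference into name, tag and digest. Tag and digest
// are both optional. Simplified illustration only.
func parseImageRef(ref string) imageRef {
	var out imageRef
	// Everything after '@' is the digest (for example "sha256:...").
	rest, digest, _ := strings.Cut(ref, "@")
	out.Digest = digest
	// A tag starts at the last ':' that appears after the last '/', so a
	// registry port such as "localhost:5000/img" is not mistaken for a tag.
	if i := strings.LastIndex(rest, ":"); i > strings.LastIndex(rest, "/") {
		out.Name, out.Tag = rest[:i], rest[i+1:]
	} else {
		out.Name = rest
	}
	return out
}

func main() {
	refs := []string{
		"ghcr.io/d8-x/d8x-trader-main:main@sha256:ac06805f6be51a83e21dfa78d9d27ec425d169623f16ffa43484792a48d8a016",
		"ghcr.io/d8-x/d8x-candles-pyth-client:main",
	}
	for _, ref := range refs {
		r := parseImageRef(ref)
		fmt.Printf("%s tag=%s pinned=%t\n", r.Name, r.Tag, r.Pinned())
	}
}
```

If reproducible deployments matter, choosing a digest-pinned entry where one is offered avoids silently picking up a newer image behind the same tag.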

99 changes: 99 additions & 0 deletions internal/actions/ingress_fix.go
@@ -0,0 +1,99 @@
package actions

import (
	"fmt"
	"strconv"
	"sync"
	"time"

	"github.com/D8-X/d8x-cli/internal/conn"
	"github.com/D8-X/d8x-cli/internal/styles"
	"github.com/urfave/cli/v2"
)

// IngressFix fixes a non-working ingress network on the manager. The steps (in
// order) are: remove any existing stacks; remove ingress on manager; recreate
// ingress on manager; restart manager's docker; restart all workers' docker.
func (c *Container) IngressFix(ctx *cli.Context) error {
	pwd, err := c.GetPassword(ctx)
	if err != nil {
		return err
	}

	// Connect to the manager
	ip, err := c.HostsCfg.GetMangerPublicIp()
	if err != nil {
		return err
	}
	managerConn, err := conn.NewSSHConnection(ip, c.DefaultClusterUserName, c.SshKeyPath)
	if err != nil {
		return err
	}

	// Remove the stack and ingress network
	fmt.Println("Removing stack and ingress network")
	if _, err := managerConn.ExecCommand(
		fmt.Sprintf("docker stack rm %s && yes | docker network rm ingress -f", dockerStackName),
	); err != nil {
		return fmt.Errorf("removing stack and ingress network: %w", err)
	}
	fmt.Println(styles.SuccessText.Render("Successfully removed stack and ingress network"))

	fmt.Println("Recreating ingress network")
	time.Sleep(5 * time.Second)
	// Recreate ingress. Make sure subnet is the same as in setup playbook
	if _, err := managerConn.ExecCommand("docker network create -d overlay --subnet 172.16.1.0/24 --ingress ingress"); err != nil {
		return fmt.Errorf("recreating ingress network: %w", err)
	}
	fmt.Println(styles.SuccessText.Render("Successfully recreated ingress network"))

	fmt.Println("Restarting docker daemons")

	// Restart the manager's docker
	if _, err := managerConn.ExecCommand(
		fmt.Sprintf(`echo "%s"| sudo -S systemctl restart docker`, pwd),
	); err != nil {
		return fmt.Errorf("restarting manager's docker: %w", err)
	}
	fmt.Println(styles.SuccessText.Render("Successfully restarted docker on manager"))

	workerIps, err := c.HostsCfg.GetWorkerIps()
	if err != nil {
		return err
	}

	// Restart docker on all workers concurrently
	wg := sync.WaitGroup{}
	for n, ip := range workerIps {
		wg.Add(1)
		go func(workerNum int, ip string) {
			defer wg.Done()
			workerConn, err := conn.NewSSHConnectionWithBastion(managerConn.GetClient(), ip, c.DefaultClusterUserName, c.SshKeyPath)
			if err != nil {
				info := fmt.Sprintf("creating ssh connection to worker-%d %s: %s", workerNum, ip, err.Error())
				fmt.Println(styles.ErrorText.Render(info))
				// Without a connection there is nothing more to do for this
				// worker; continuing would dereference a nil connection.
				return
			}

			if _, err := workerConn.ExecCommand(
				fmt.Sprintf(`echo "%s"| sudo -S systemctl restart docker`, pwd),
			); err != nil {
				// Report the failure instead of returning silently
				info := fmt.Sprintf("restarting docker on worker-%d %s: %s", workerNum, ip, err.Error())
				fmt.Println(styles.ErrorText.Render(info))
				return
			}
			fmt.Println(styles.SuccessText.Render("Successfully restarted docker on worker-" + strconv.Itoa(workerNum)))
		}(n+1, ip)
	}
	wg.Wait()

	if ctx.Command.Name == "fix-ingress" {
		fmt.Println("Make sure you re-run d8x setup swarm-deploy to re-deploy the services")
	}

	return nil
}
7 changes: 7 additions & 0 deletions internal/actions/input.go
@@ -7,6 +7,7 @@ import (
"fmt"
"os"
"os/exec"
"sort"
"strconv"
"strings"

@@ -772,6 +773,12 @@ func (c *InputCollector) GetChainId(cfg *configs.D8XConfig, ctx *cli.Context) (u
chainSelection = append(chainSelection, chainName)
}

// Sort chains by name so we always have consistent order in the
// selection
sort.Slice(chainSelection, func(i, j int) bool {
return chainSelection[i] < chainSelection[j]
})

chains, err := c.TUI.NewSelection(chainSelection, components.SelectionOptAllowOnlySingleItem(), components.SelectionOptRequireSelection())
if err != nil {
return 0, err
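
The sort added in this hunk makes the selection list deterministic. The loop header is outside the hunk, so it is an assumption here that the names are collected while ranging over a map, whose iteration order Go deliberately randomizes. A small sketch of that situation follows. `sortedChainNames` and the sample chain names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedChainNames returns the map's values in ascending order. Map iteration
// order is not stable in Go, so sorting keeps the menu identical between runs.
// Hypothetical helper, not taken from d8x-cli.
func sortedChainNames(chains map[uint64]string) []string {
	names := make([]string, 0, len(chains))
	for _, name := range chains {
		names = append(names, name)
	}
	// Same ordering as sort.Slice with a plain "<" comparison on strings.
	sort.Strings(names)
	return names
}

func main() {
	// Made-up chain IDs and names, for illustration only.
	chains := map[uint64]string{1: "chain-b", 2: "chain-a", 3: "chain-c"}
	fmt.Println(sortedChainNames(chains)) // [chain-a chain-b chain-c]
}
```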
18 changes: 17 additions & 1 deletion internal/actions/setup.go
@@ -33,8 +33,24 @@ func (c *Container) Setup(ctx *cli.Context) error {
return err
}
if !keepConfig {
if err := c.ConfigRWriter.Write(&configs.D8XConfig{}); err != nil {
// Print out a warning one more time to prevent accidental deletion
// of config
fmt.Println(
styles.AlertImportant.Render("Warning! Existing configuration will be completely removed!"),
)
if yes, err := c.TUI.NewPrompt("Are you sure you want to continue?", false); err != nil {
return err
} else if yes {
// Make a backup of the existing config just in case
backup := c.ConfigRWriter.GetPath() + ".backup-" + time.Now().Format("2006-01-02_15:04:05")
if err := c.ConfigRWriter.WriteTo(backup, cfg); err != nil {
return err
}
fmt.Printf("Backup of the existing configuration was saved to %s\n\n", backup)

if err := c.ConfigRWriter.Write(&configs.D8XConfig{}); err != nil {
return err
}
}
}
}