Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: "--unsafe-overwrite-db" flag to support v2 migration with no previous v3 data #9484

Closed
wants to merge 8 commits into from
Closed
8 changes: 8 additions & 0 deletions CHANGELOG-3.2.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,14 @@ See [code changes](https://github.com/coreos/etcd/compare/v3.2.16...v3.2.17) and
- Server now returns `rpctypes.ErrLeaseTTLTooLarge` to client, when the requested `TTL` is larger than *9,000,000,000 seconds* (which is >285 years).
- Again, etcd `Lease` is meant for short-periodic keepalives or sessions, in the range of seconds or minutes. Not for hours or days!
- Enable etcd server [`raft.Config.CheckQuorum` when starting with `ForceNewCluster`](https://github.com/coreos/etcd/pull/9347).
- Add [`etcd --unsafe-overwrite-db`](https://github.com/coreos/etcd/pull/9484) flag, to support [migration from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480).
- etcd server panics if it tries to restore from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file.
- This happens when the server had migrated from v2 with no previous v3 data.
- This is to prevent accidental v3 data loss (e.g. `db` file might have been moved).
- With `--unsafe-overwrite-db` enabled, etcd allows to create a fresh `db` file on reboot, when there are existing snapshots.
- To continue to use etcd without v3 data, keep `--unsafe-overwrite-db` enabled. Or write some test v3 keys (e.g. `ETCDCTL_API=3 etcdctl put foo bar`), and then restart etcd with `--unsafe-overwrite-db` disabled.
- Use this flag only for v2 migration. Otherwise, previous v3 data could be overwritten with existing snapshots.
- v4 will deprecate this flag.

### Go

Expand Down
11 changes: 11 additions & 0 deletions CHANGELOG-3.3.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,17 @@ See [code changes](https://github.com/coreos/etcd/compare/v3.3.2...v3.3.3) and [
- For instance, when hourly writes are 100 and `--auto-compaction-mode=periodic --auto-compaction-retention=24h`, `v3.2.x`, `v3.3.0`, `v3.3.1`, and `v3.3.2` compact revision 2400, 2640, and 2880 for every 2.4-hour, while `v3.3.3` *or later* compacts revision 2400, 2500, 2600 for every 1-hour.
- Futhermore, when `--auto-compaction-mode=periodic --auto-compaction-retention=30m` and writes per minute are about 1000, `v3.3.0`, `v3.3.1`, and `v3.3.2` compact revision 30000, 33000, and 36000, for every 3-minute, while `v3.3.3` *or later* compacts revision 30000, 60000, and 90000, for every 30-minute.

### Fixed: v3

- Add [`etcd --unsafe-overwrite-db`](https://github.com/coreos/etcd/pull/9484) flag, to support [migration from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480).
- etcd server panics if it tries to restore from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file.
- This happens when the server had migrated from v2 with no previous v3 data.
- This is to prevent accidental v3 data loss (e.g. `db` file might have been moved).
- With `--unsafe-overwrite-db` enabled, etcd allows to create a fresh `db` file on reboot, when there are existing snapshots.
- To continue to use etcd without v3 data, keep `--unsafe-overwrite-db` enabled. Or write some test v3 keys (e.g. `ETCDCTL_API=3 etcdctl put foo bar`), and then restart etcd with `--unsafe-overwrite-db` disabled.
- Use this flag only for v2 migration. Otherwise, previous v3 data could be overwritten with existing snapshots.
- v4 will deprecate this flag.

### Metrics, Monitoring

- Add missing [`etcd_network_peer_sent_failures_total` count](https://github.com/coreos/etcd/pull/9437).
Expand Down
8 changes: 8 additions & 0 deletions CHANGELOG-3.4.md
Original file line number Diff line number Diff line change
Expand Up @@ -93,6 +93,14 @@ See [security doc](https://github.com/coreos/etcd/blob/master/Documentation/op-g
- For instance, a flaky(or rejoining) member may drop in and out, and start campaign. This member will end up with a higher term, and ignore all incoming messages with lower term. In this case, a new leader eventually need to get elected, thus disruptive to cluster availability. Raft implements Pre-Vote phase to prevent this kind of disruptions. If enabled, Raft runs an additional phase of election to check if pre-candidate can get enough votes to win an election.
- `--pre-vote=false` by default.
- v3.5 will enable `--pre-vote=true` by default.
- Add [`--unsafe-overwrite-db`](https://github.com/coreos/etcd/pull/9484) flag, to support [migration from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480).
- etcd server panics if it tries to restore from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file.
- This happens when the server had migrated from v2 with no previous v3 data.
- This is to prevent accidental v3 data loss (e.g. `db` file might have been moved).
- With `--unsafe-overwrite-db` enabled, etcd allows to create a fresh `db` file on reboot, when there are existing snapshots.
- To continue to use etcd without v3 data, keep `--unsafe-overwrite-db` enabled. Or write some test v3 keys (e.g. `ETCDCTL_API=3 etcdctl put foo bar`), and then restart etcd with `--unsafe-overwrite-db` disabled.
- Use this flag only for v2 migration. Otherwise, previous v3 data could be overwritten with existing snapshots.
- v4 will deprecate this flag.
- TODO: [`--initial-corrupt-check`](TODO) flag is now stable (`--experimental-initial-corrupt-check` is deprecated).
- `--initial-corrupt-check=true` by default, to check cluster database hashes before serving client/peer traffic.
- TODO: [`--corrupt-check-time`](TODO) flag is now stable (`--experimental-corrupt-check-time` is deprecated).
Expand Down
2 changes: 2 additions & 0 deletions Documentation/upgrades/upgrade_3_2.md
Original file line number Diff line number Diff line change
Expand Up @@ -212,6 +212,8 @@ See [issue #6336](https://github.com/coreos/etcd/issues/6336) for more contexts.

### Server upgrade checklists

**NOTE:** `v3.2.18` added [`etcd --unsafe-overwrite-db`](https://github.com/coreos/etcd/pull/9484) flag, to support [migration from v2 with no v3 data](https://github.com/coreos/etcd/issues/9480). etcd server panics if it tries to restore from existing snapshots but no v3 `ETCD_DATA_DIR/member/snap/db` file. This happens when the server had migrated from v2 with no previous v3 data. This is to prevent accidental v3 data loss (e.g. `db` file might have been moved). With `--unsafe-overwrite-db` enabled, etcd allows to create a fresh `db` file on reboot, when there are existing snapshots. To continue to use etcd without v3 data, keep `--unsafe-overwrite-db` enabled. Or write some test v3 keys (e.g. `ETCDCTL_API=3 etcdctl put foo bar`), and then restart etcd with `--unsafe-overwrite-db` disabled. Use this flag only for v2 migration. Otherwise, previous v3 data could be overwritten with existing snapshots. v4 will deprecate this flag.
Copy link
Contributor Author

@gyuho gyuho Mar 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jlhawn Please let me know if this makes sense. Basically, if you aren't going to write any v3 keys for awhile, keep the --unsafe-overwrite-db flag. Or start with --unsafe-overwrite-db flag once, and write some test v3 keys. And restart with --unsafe-overwrite-db flag disabled. It's complicated, but think it's safer while keeping the fix introduced in #7876 :0

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gyuho that makes sense. Thanks!


#### Upgrade requirements

To upgrade an existing etcd deployment to 3.2, the running cluster must be 3.1 or greater. If it's before 3.1, please [upgrade to 3.1](upgrade_3_1.md) before upgrading to 3.2.
Expand Down