lightning/dumpling/dm/cdc/sync-diff-inspector: Support encrypting the MySQL password #30524

Open
kennytm opened this issue Nov 4, 2020 · 4 comments
Labels
component/dumpling This is related to Dumpling of TiDB. component/lightning This issue is related to Lightning of TiDB. sig/migrate type/feature-request Categorizes issue or PR as related to a new feature.

Comments

@kennytm
Contributor

kennytm commented Nov 4, 2020

Feature Request

Is your feature request related to a problem? Please describe:

Currently the MySQL password is stored as plain text in config.toml, which some users are uncomfortable with.

Describe the feature you'd like:

Provide some way to hide the password. Example:

# the current situation.
[tidb]
password = "Passw0rd!!"

# same as above, only base-64 encoded
[tidb]
password = { base64 = "UGFzc3cwcmQhIQ==" }

# same as above (using TOML's dotted-key feature)
[tidb]
password.base64 = "UGFzc3cwcmQhIQ=="

# read from a file 
[tidb]
password.file = "/data/secret/lightning.txt"

# read from environment
[tidb]
password.env = "TIDB_PASSWORD"

# read from file, cached in memory, so the file can be deleted after Lightning starts
# (without caching Lightning will fetch the password from the file in case of reconnect)
[tidb]
password.cached.file = "/data/secret/lightning.txt"

# encrypted using AES-256-CTR, key read from a local file in raw binary.
[tidb]
password.aes-256-ctr = { data.base64 = "zdpUpVhoJoggKw==", key.file = "/data/secret/lightning-key.bin", nonce.base64 = "XN74TC92g2MBNbzEPpxZUA==" }

# (same as above)
[tidb.password.aes-256-ctr]
data.base64 = "zdpUpVhoJoggKw=="
key.file = "/data/secret/lightning-key.bin"
nonce.base64 = "XN74TC92g2MBNbzEPpxZUA=="

# (also same as above)
[tidb]
password.aes-256-ctr.data.base64 = "zdpUpVhoJoggKw=="
password.aes-256-ctr.key.file = "/data/secret/lightning-key.bin"
password.aes-256-ctr.nonce.base64 = "XN74TC92g2MBNbzEPpxZUA=="

# can also read the encrypted password from a binary file.
[tidb]
password.aes-256-ctr = { data.file = "/data/secret/lightning.enc", key.file = "/data/secret/lightning-key.bin", nonce.base64 = "XN74TC92g2MBNbzEPpxZUA==" }

The password will always be decrypted in the Lightning process no matter which algorithm is chosen, since the MySQL protocol demands the original password for authentication.

Describe alternatives you've considered:

Don't do it. Rely on Lightning-in-SQL.

Teachability, Documentation, Adoption, Optimization:

@King-Dylan
Contributor

King-Dylan commented Nov 25, 2020

I need this feature (only base-64 encoded)

@pepezzzz

Financial enterprises run Lightning from shell scripts to import data routinely without any interaction, so the user password stored in the configuration file should be encoded with base64.

@kennytm
Contributor Author

kennytm commented Nov 30, 2020

Specification

Configuration syntax (Lightning, DM, CDC, sync-diff-inspector)

In existing structured configuration formats (JSON, TOML, YAML), a "string" password field can be naturally extended to a dynamic "variable" password, which can be used to conceal the secret in the configuration.

A variable is a datatype satisfying the following JSON schema:

{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": "#variable",
    "oneOf": [
        {
            "comment": "plain text",
            "type": "string"
        },
        {
            "comment": "decode variable using base64 on decrypt",
            "type": "object",
            "properties": {
                "base64": {"$ref": "#variable"}
            },
            "required": ["base64"]
        },
        {
            "comment": "read from file using path from variable on decrypt",
            "type": "object",
            "properties": {
                "file": {"$ref": "#variable"}
            },
            "required": ["file"]
        },
        {
            "comment": "read from environment using name from variable on decrypt",
            "type": "object",
            "properties": {
                "env": {"$ref": "#variable"}
            },
            "required": ["env"]
        },
        {
            "comment": "immediate decrypt on load and cache decrypted result in memory",
            "type": "object",
            "properties": {
                "cached": {"$ref": "#variable"}
            },
            "required": ["cached"]
        },
        {
            "comment": "decrypt variable by AES-256-CTR",
            "type": "object",
            "properties": {
                "aes-256-ctr": {
                    "type": "object",
                    "properties": {
                        "data": {"$ref": "#variable"},
                        "key": {"$ref": "#variable"},
                        "nonce": {"$ref": "#variable"}
                    },
                    "required": ["data", "key", "nonce"]
                }
            },
            "required": ["aes-256-ctr"]
        }
    ]
}
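
The `oneOf` above can be mirrored by a small recursive check. This is only a sketch of the schema's logic in Python; the function name `is_variable` is hypothetical and not part of any existing tool. Like the schema, it requires exactly one branch to match and does not reject unrelated extra keys.

```python
def is_variable(value):
    """Return True if value matches exactly one branch of the "variable" schema."""
    # Branch 1: a plain-text string.
    if isinstance(value, str):
        return True
    if not isinstance(value, dict):
        return False

    matches = 0
    # Branches 2-5: a single wrapper key whose value is itself a variable.
    for key in ("base64", "file", "env", "cached"):
        if key in value and is_variable(value[key]):
            matches += 1

    # Branch 6: aes-256-ctr needs data, key and nonce, each a variable.
    aes = value.get("aes-256-ctr")
    if isinstance(aes, dict) and all(
        name in aes and is_variable(aes[name]) for name in ("data", "key", "nonce")
    ):
        matches += 1

    # oneOf: exactly one branch may match.
    return matches == 1
```

For example, `{"file": {"env": "TIDB_PASSWORD_FILE"}}` passes, while an object carrying both a valid `base64` and a valid `file` key is rejected because two branches match.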

Variables can be cascaded, allowing us to use the same vocabulary for providing the password directly or the decryption key & nonce. This also allows some useful features, such as:

# read $TIDB_PASSWORD_FILE, which holds the path of a file that contains the password itself.
# (this is similar to how GCP's credential file operates)
[tidb]
password.file.env = "TIDB_PASSWORD_FILE"

or some useless security theater like 🙃

[tidb]
password.base64.base64.base64.base64.base64.base64 = "Vm0xNFUxSXhWWGhXYTJSWFlURktWRlpyVWtKUFVUMDk="
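
A minimal sketch of how cascaded variables could be resolved, assuming each wrapper object has exactly one key. The name `compile_variable` is hypothetical. It returns a function so that `file` and `env` are re-read on every call (as on reconnect), while `cached` resolves its inner variable once at load time. `aes-256-ctr` is left unimplemented because Python's standard library has no AES, and file contents are returned verbatim (the specification does not say whether a trailing newline should be trimmed).

```python
import base64
import os


def compile_variable(var):
    """Turn a variable into a zero-argument function returning the secret as bytes."""
    if isinstance(var, str):
        return lambda: var.encode("utf-8")
    if len(var) != 1:
        raise ValueError("this sketch expects exactly one key per variable object")
    kind, inner = next(iter(var.items()))
    if kind == "aes-256-ctr":
        raise NotImplementedError("AES is outside the standard library")

    source = compile_variable(inner)  # the inner variable is resolved first

    if kind == "base64":
        return lambda: base64.b64decode(source())
    if kind == "file":
        def read_file():
            # The inner variable yields the path; the file is read on every call.
            with open(source().decode("utf-8"), "rb") as f:
                return f.read()
        return read_file
    if kind == "env":
        return lambda: os.environ[source().decode("utf-8")].encode("utf-8")
    if kind == "cached":
        value = source()  # resolved immediately, then kept in memory
        return lambda: value
    raise ValueError(f"unknown variable kind: {kind}")
```

With this shape, `compile_variable({"cached": {"file": path}})` keeps working after the file is deleted, whereas the uncached `{"file": path}` form fails on the next call.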

TOML's dotted-key feature makes it particularly easy to spell out these nested structures. Unfortunately YAML has no equivalent, so the structure has to be written out in full:

target-database:
  host: 127.0.0.1
  port: 3306
  user: root
  password:
    aes-256-ctr:
      data: {base64: zdpUpVhoJoggKw==}
      key: {file: /data/secret/lightning-key.bin}
      nonce: {base64: XN74TC92g2MBNbzEPpxZUA==}
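
To make the relationship between the dotted TOML keys and the expanded YAML form concrete, here is a rough sketch that builds the nested mapping from dotted paths. The helper names `expand_dotted` and `merge` are hypothetical, and the split on "." ignores quoted TOML keys that themselves contain dots.

```python
def expand_dotted(path, value):
    """Build the nested mapping that a dotted key such as 'a.b.c' stands for."""
    nested = value
    for part in reversed(path.split(".")):
        nested = {part: nested}
    return nested


def merge(dst, src):
    """Recursively merge src into dst so sibling dotted keys share their parents."""
    for key, val in src.items():
        if isinstance(val, dict) and isinstance(dst.get(key), dict):
            merge(dst[key], val)
        else:
            dst[key] = val
    return dst


config = {}
for dotted, text in [
    ("password.aes-256-ctr.data.base64", "zdpUpVhoJoggKw=="),
    ("password.aes-256-ctr.key.file", "/data/secret/lightning-key.bin"),
    ("password.aes-256-ctr.nonce.base64", "XN74TC92g2MBNbzEPpxZUA=="),
]:
    merge(config, expand_dotted(dotted, text))
```

After the loop, `config["password"]` has the same shape as the `password:` mapping in the YAML example.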

Command line syntax (Lightning, Dumpling)

We prefer providing dotted command line flags like

./dumpling -h 127.0.0.1 -P 3306 \
    -u root \
    --password.aes-256-ctr.data.base64 'zdpUpVhoJoggKw==' \
    --password.aes-256-ctr.key.file '/data/secret/lightning-key.bin' \
    --password.aes-256-ctr.nonce.base64 'XN74TC92g2MBNbzEPpxZUA=='

This, however, requires spf13/pflag#187, spf13/pflag#199, or spf13/pflag#285 (the number of duplicate PRs shows how well maintained the pflag library is). If none of these PRs is merged and we can't switch to one of the forks, we may need to use a more conventional and uglier API like:

./dumpling -h 127.0.0.1 -P 3306 \
    -u root \
    --encrypted-password '{"aes-256-ctr":{
        "data":{"base64":"zdpUpVhoJoggKw=="},
        "key":{"file":"/data/secret/lightning-key.bin"},
        "nonce":{"base64":"XN74TC92g2MBNbzEPpxZUA=="}
    }}'

Encryption tool

There should be a tool to generate the encrypted password (maybe through tidb-lightning-ctl / br debug / dmctl / tidb-ctl)

$ ./some-password-tool encrypt base64 -f toml -r 'tidb.password'
Enter password: ••••••••••
## Warning: base64 is not an encryption, it cannot protect your password if leaked. Use at your own risk.
[tidb]
password.base64 = "UGFzc3cwcmQhIQ=="

$ ./some-password-tool encrypt base64 --password 'Passw0rd!!' -f yaml -r 'target-database.password'
## Warning: base64 is not an encryption, it cannot protect your password if leaked. Use at your own risk.
target-database:
  password:
    base64: "UGFzc3cwcmQhIQ=="

$ ./some-password-tool encrypt base64 --password 'Passw0rd!!' -f json -r ''
// Warning: base64 is not an encryption, it cannot protect your password if leaked. Use at your own risk.
{"base64":"UGFzc3cwcmQhIQ=="}

$ ./some-password-tool encrypt aes-256-ctr --password 'Passw0rd!!' --key-file ./lightning-key.bin | tee encrypted.toml
password.aes-256-ctr.data.base64 = "zdpUpVhoJoggKw=="
password.aes-256-ctr.key.file = "/data/secret/lightning-key.bin"
password.aes-256-ctr.nonce.base64 = "XN74TC92g2MBNbzEPpxZUA=="

The possible subcommands are "base64" and "aes-256-ctr".

Argument        Meaning
-f, --format    Output format: TOML (default), YAML, JSON, CLI
-r, --root      A dot-separated key path of the root. Defaults to 'password'.
-p, --password  The password to encrypt. Read from stdin if empty.
-k, --key-file  For aes-256-ctr only. A 32-byte file containing the encryption key.
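
As a rough illustration of how `--format` and `--root` could interact for the `base64` subcommand, the sketch below renders TOML and JSON only (YAML and CLI output are left out). The function names are hypothetical, and `json.dumps` is used as a stand-in for TOML string quoting, which is adequate for base64 text but not for arbitrary strings.

```python
import base64
import json


def encode_base64_variable(password):
    """Wrap a password as a {"base64": ...} variable."""
    encoded = base64.b64encode(password.encode("utf-8")).decode("ascii")
    return {"base64": encoded}


def flatten(variable, prefix):
    """Yield (key path, string) pairs for every leaf of a variable."""
    if isinstance(variable, str):
        yield prefix, variable
    else:
        for key, inner in variable.items():
            yield from flatten(inner, prefix + (key,))


def render(variable, root="password", fmt="toml"):
    """Render a variable under the dot-separated root in the given format."""
    parts = root.split(".") if root else []
    if fmt == "json":
        doc = variable
        for part in reversed(parts):
            doc = {part: doc}
        return json.dumps(doc, separators=(",", ":"))
    if fmt == "toml":
        if not parts:
            raise ValueError("TOML output needs a non-empty root")
        *table, leaf = parts
        lines = []
        if table:
            lines.append("[" + ".".join(table) + "]")
        for keys, text in flatten(variable, (leaf,)):
            lines.append(".".join(keys) + " = " + json.dumps(text))
        return "\n".join(lines)
    raise ValueError(f"unsupported format in this sketch: {fmt}")
```

Calling `render(encode_base64_variable("Passw0rd!!"), "tidb.password", "toml")` reproduces the `[tidb]` example shown earlier, minus the warning line.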

The same tool should also be able to decrypt the password.

$ ./some-password-tool decrypt -f toml < encrypted.toml
Passw0rd!!

$ echo '{"x":{"base64":"UGFzc3cwcmQhIQ=="}}' | ./some-password-tool decrypt -f json -r x
Passw0rd!!

$ # with aes-256-ctr, losing the key file should make the decryption output garbage
$ echo -n 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' > /data/secret/lightning-key.bin
$ ./some-password-tool decrypt < encrypted.toml
u÷	dµ�+�Z"
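
CTR mode carries no integrity check, which is why a wrong key yields garbage rather than an error. The sketch below demonstrates that behaviour with a stand-in: Python's standard library has no AES, so the keystream here comes from SHA-256 over key, nonce and a counter. This is not AES-256-CTR, the function name is hypothetical, and it must not be used for real secrets.

```python
import hashlib


def keystream_xor(key, nonce, data):
    """XOR data with a keystream of SHA-256(key || nonce || counter) blocks.

    Encryption and decryption are the same operation, as in any CTR-style cipher.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


right_key = b"K" * 32
wrong_key = b"A" * 32
nonce = b"N" * 16

ciphertext = keystream_xor(right_key, nonce, b"Passw0rd!!")
recovered = keystream_xor(right_key, nonce, ciphertext)  # original password
garbage = keystream_xor(wrong_key, nonce, ciphertext)    # same length, no error
```

Because nothing signals the failure, a real tool might want an authenticated mode or a checksum if it needs to tell "wrong key" apart from "wrong password".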

This tool could also be built as a static webpage, which would be the most user-friendly and easiest option. However, users may worry that a password entered on a web interface is secretly sent to PingCAP (it won't be).

Module path

The library will be placed in pingcap/tidb-tools for now. We may later decide to move it into pingcap/tidb as a nested Go module.

@kennytm kennytm transferred this issue from pingcap/tidb-lightning Dec 8, 2021
@kennytm kennytm changed the title Support encrypting the MySQL password br/lightning/dumpling/dm/cdc: Support encrypting the MySQL password Dec 8, 2021
@kennytm kennytm changed the title br/lightning/dumpling/dm/cdc: Support encrypting the MySQL password lightning/dumpling/dm/cdc/sync-diff-inspector: Support encrypting the MySQL password Dec 8, 2021
@kennytm kennytm added component/dumpling This is related to Dumpling of TiDB. component/lightning This issue is related to Lightning of TiDB. sig/migrate type/feature-request Categorizes issue or PR as related to a new feature. labels Dec 8, 2021
@kennytm
Contributor Author

kennytm commented Dec 8, 2021

Design notes

  • in "aes-256-ctr" we may want to concatenate "nonce" and "data" so there are only 2 fields.
  • need to support DM's existing password encryption scheme (which is aes-256-cfb)
  • what's the easiest way to distribute the encryption tool?
    • dumpling simply uses pflag so we're not going to allow ./dumpling encrypt-password ...
    • user may not install the tools via tiup so we can't assume tiup ctl tidb encrypt-password ... is available
    • copying an EXE just to encrypt the password may be heavy-weight
  • may want to reuse ./tidb-lightning --tidb-password '...' for both encrypted and plain text password, by reducing the space of plain text password (breaking backward compatibility)
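
The first note above (concatenating "nonce" and "data") could look roughly like this. It is a sketch under the assumption that the nonce is always 16 bytes, as in the examples in this thread, and that the nonce comes first; `pack` and `unpack` are hypothetical names.

```python
import base64

NONCE_LEN = 16  # assumed fixed nonce length, matching the 16-byte example nonce


def pack(nonce, data):
    """Join nonce and ciphertext into a single base64 string (nonce first)."""
    if len(nonce) != NONCE_LEN:
        raise ValueError("nonce must be exactly 16 bytes in this sketch")
    return base64.b64encode(nonce + data).decode("ascii")


def unpack(blob):
    """Split a packed base64 string back into (nonce, ciphertext)."""
    raw = base64.b64decode(blob)
    if len(raw) < NONCE_LEN:
        raise ValueError("packed value is shorter than the nonce")
    return raw[:NONCE_LEN], raw[NONCE_LEN:]


nonce = base64.b64decode("XN74TC92g2MBNbzEPpxZUA==")
data = base64.b64decode("zdpUpVhoJoggKw==")
packed = pack(nonce, data)
```

The configuration would then need only the packed value and the key, at the cost of the nonce no longer being readable on its own.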

3 participants