Improve backup and restore of statistics #29628

Open
dveeden opened this issue Nov 9, 2021 · 3 comments
Labels
component/br This issue is related to BR of TiDB. type/enhancement The issue or PR belongs to an enhancement.

Comments

Contributor

dveeden commented Nov 9, 2021

Enhancement

The statistics docs have information about how to export and import table statistics.

The br docs on backup of the mysql schema say the statistics-related tables are never restored.

The FAQ entry on this says that backup of statistics is disabled in 4.0.10 and later, and that you should run ANALYZE after doing a restore.

Looking at the code shows that DumpStatsToJSON is called:

if statsHandle != nil {
	if err := schema.dumpStatsToJSON(statsHandle); err != nil {
		logger.Error("dump table stats failed", logutil.ShortError(err))
	}
}

Running the following results in statistics in a backup:

BACKUP SCHEMA test TO '/tmp/test_backup_001';
cd /tmp/test_backup_001
tiup br validate decode -s .
for line in $(jq -r ".schemas[].stats" backupmeta.json)
do
  echo "$line" | base64 -d | jq -r '"Stats report that " + .database_name + "." + .table_name + " has " + (.count|tostring) + " rows"'
done
Stats report that test.t3 has 2 rows
Stats report that test.t2 has 2 rows
Stats report that test.t4 has 500000 rows
Stats report that test.t1 has 1388544 rows
$ curl -s -o - http://127.1:10080/stats/dump/test/t1 | jq -r '"Stats report that " + .database_name + "." + .table_name + " has " + (.count|tostring) + " rows"'
Stats report that test.t1 has 1388544 rows

So in backupmeta.json, each schema entry has a stats key that holds a base64-encoded JSON document in the same format as the output of the stats dump described in the statistics docs.
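
The extraction above can also be sketched in Python instead of a shell loop. This is a minimal sketch that assumes only what the jq commands show: a top-level schemas list, a base64-encoded stats string per entry, and database_name, table_name and count fields in the decoded JSON. The names summarize_stats and sample_meta, and the sample data, are hypothetical.

```python
import base64
import json


def summarize_stats(meta):
    """Return one summary line per schema entry that carries stats.

    `meta` is the parsed content of backupmeta.json. The layout assumed here
    (schemas[].stats holding base64-encoded JSON) is inferred from the jq
    commands in this issue, not from a documented format.
    """
    lines = []
    for schema in meta.get("schemas", []):
        encoded = schema.get("stats")
        if not encoded:
            continue  # entry without statistics
        stats = json.loads(base64.b64decode(encoded))
        lines.append(
            "Stats report that {}.{} has {} rows".format(
                stats["database_name"], stats["table_name"], stats["count"]
            )
        )
    return lines


def _encode(stats):
    """Demo helper: base64-encode a stats dict the way the backup stores it."""
    return base64.b64encode(json.dumps(stats).encode("utf-8")).decode("ascii")


# Made-up sample mirroring the assumed structure of backupmeta.json.
sample_meta = {
    "schemas": [
        {"stats": _encode({"database_name": "test", "table_name": "t3", "count": 2})},
        {"stats": ""},
    ]
}

for line in summarize_stats(sample_meta):
    print(line)
```

With a real backup, the same function could be fed the result of json.load() on the backupmeta.json produced by tiup br validate decode.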

Things that should be answered/fixed:

  1. The FAQ entry looks incorrect and needs updating.
  2. Getting access to the statistics in the backup is hard: backupmeta.json is not created by default, and the statistics are stored as base64-encoded JSON inside another JSON structure.
    a. Can we make this easier?
    b. We should probably document this.
  3. The statistics docs should be updated to mention that statistics are included in backups by default.
  4. Are the statistics restored when restoring a backup? If not, can we add an option to do this?
  5. Can we make LOAD STATS accept a backupmeta.json or backupmeta file?
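
Until something like items 2 and 5 exists, a possible workaround is to split the decoded backupmeta.json into one file per table and pass those to LOAD STATS. The following is only a sketch: it assumes the schemas[].stats layout seen above and that LOAD STATS accepts the decoded JSON unchanged, which this issue suggests but does not confirm. The function name extract_stats_files, the file naming scheme and the sample data are hypothetical.

```python
import base64
import json
import os
import tempfile


def extract_stats_files(meta, out_dir):
    """Write each entry's decoded stats to <out_dir>/<db>.<table>.json.

    Returns the list of written paths. The schemas[].stats layout is an
    assumption taken from the jq commands in this issue.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for schema in meta.get("schemas", []):
        encoded = schema.get("stats")
        if not encoded:
            continue  # nothing to extract for this entry
        raw = base64.b64decode(encoded)
        stats = json.loads(raw)  # parsed only to build the file name
        name = "{}.{}.json".format(stats["database_name"], stats["table_name"])
        path = os.path.join(out_dir, name)
        with open(path, "wb") as f:
            f.write(raw)  # keep the stats JSON byte-for-byte
        written.append(path)
    return written


# Made-up sample entry; a real one would come from backupmeta.json.
sample_stats = {"database_name": "test", "table_name": "t1", "count": 1388544}
sample_meta = {
    "schemas": [
        {"stats": base64.b64encode(json.dumps(sample_stats).encode("utf-8")).decode("ascii")}
    ]
}

demo_dir = tempfile.mkdtemp()
paths = extract_stats_files(sample_meta, demo_dir)
for path in paths:
    # Statement one might then run in a SQL client (untested assumption).
    print("LOAD STATS '{}';".format(path))
```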
@dveeden dveeden added the type/enhancement The issue or PR belongs to an enhancement. label Nov 9, 2021
Contributor Author

dveeden commented Nov 9, 2021

/component br

@ti-chi-bot ti-chi-bot added the component/br This issue is related to BR of TiDB. label Nov 9, 2021
Contributor Author

dveeden commented Nov 9, 2021

Looks like tiup br backup db --db test -s /tmp/test_backup_002 creates a backup without statistics.

https://docs.pingcap.com/tidb/stable/sql-statement-backup suggests that br and the BACKUP statement should behave mostly identically.
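
One way to compare the two backups is to count how many schema entries in each decoded backupmeta.json carry statistics. This is a rough sketch, assuming that a backup without statistics shows up as a missing or empty stats field (not verified here). The function name count_stats_entries and both sample dictionaries are made up for illustration.

```python
def count_stats_entries(meta):
    """Return (entries_with_stats, entries_without_stats) for a parsed backupmeta.json.

    Assumes a top-level "schemas" list whose entries may have a "stats" field;
    a missing or empty field is counted as "without stats".
    """
    with_stats = 0
    without_stats = 0
    for schema in meta.get("schemas", []):
        if schema.get("stats"):
            with_stats += 1
        else:
            without_stats += 1
    return with_stats, without_stats


# Hypothetical stand-ins for the two backups; "e30=" is base64 for "{}".
sql_backup = {"schemas": [{"stats": "e30="}, {"stats": "e30="}]}
br_backup = {"schemas": [{}, {"stats": ""}]}

print("BACKUP statement:", count_stats_entries(sql_backup))
print("tiup br backup:  ", count_stats_entries(br_backup))
```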

Contributor

kennytm commented Nov 9, 2021

This is supposed to be supported via pingcap/br#679. The JSON stats are kept just for backward compatibility.


3 participants