Modify influx_inspect to compute Cloud 2 cardinality #23356

Closed
davidby-influx opened this issue May 18, 2022 · 1 comment
Comments

@davidby-influx
Contributor

davidby-influx commented May 18, 2022

To assist migrations from 1.X installations, it would be helpful to have a tool that calculates the Cloud 2 equivalent cardinality of a 1.X database.

Add a new command, influx_inspect report-db, that correctly sums Cloud 2 cardinality for a 1.X database. It can be run on an OSS instance, or on each data node of an Enterprise cluster.

Parameters are:

-db-path <database data directory>
-c <concurrency> - defaults to 1
-exact uses an exact count instead of a HyperLogLog estimator. Exact counts can be memory-intensive on large databases.
-detailed prints cardinality for tags and fields as well
-rollup <m|r|d|t> rolls up and prints aggregates: m aggregates to measurement (the default), r to retention policy, d to database, and t to the grand total.

Avoiding -detailed and -exact and using a rollup aggregate such as r, d, or t significantly reduces the memory requirements of influx_inspect report-db on large databases.
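
To make the memory trade-off concrete, here is a minimal Python sketch of exact counting with rollups. It is not the influx_inspect implementation. The function name `rollup_exact`, the `(db, rp, measurement, series_key)` input tuples, and the choice to treat each distinct tuple as one series are all assumptions made for illustration. The sketch keeps one set per bucket at the requested rollup level and at every coarser level, so a finer level means more buckets held in memory at once.

```python
from collections import defaultdict

# Rollup letters from the -rollup flag, mapped to how many path components
# (db, rp, measurement) identify a bucket at that level.
LEVELS = {"t": 0, "d": 1, "r": 2, "m": 3}


def rollup_exact(series, level="m"):
    """Count distinct series per bucket with exact sets (illustrative only).

    series: iterable of (db, rp, measurement, series_key) tuples.
    Returns {bucket_tuple: distinct_series_count} for the requested level
    and all coarser levels; the empty tuple () is the grand total.
    """
    max_depth = LEVELS[level]
    seen = defaultdict(set)
    for db, rp, measurement, key in series:
        path = (db, rp, measurement)
        identity = (db, rp, measurement, key)  # assumed series identity
        for depth in range(max_depth + 1):
            seen[path[:depth]].add(identity)
    return {bucket: len(keys) for bucket, keys in seen.items()}


# Hypothetical sample data; the last entry repeats the first on purpose.
SAMPLE = [
    ("telegraf", "autogen", "cpu", "cpu,host=a"),
    ("telegraf", "autogen", "cpu", "cpu,host=b"),
    ("telegraf", "autogen", "mem", "mem,host=a"),
    ("scaler", "autogen", "foo", "foo,tag1=x"),
    ("telegraf", "autogen", "cpu", "cpu,host=a"),
]

if __name__ == "__main__":
    for level in ("m", "r", "d", "t"):
        counts = rollup_exact(SAMPLE, level)
        print(level, len(counts), "buckets, total", counts[()])
```

In this sketch every bucket stores the full series identities, which is why exact counting grows with the data. An estimator such as HyperLogLog would replace each set with a fixed-size structure, so the number of buckets becomes the main memory cost.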

Example output:

> influx_inspect report-db -c 3 -db-path ~/.influxdb/data -detailed
DB           RP        measurement      series  fields  tag total tags
--           --        -----------      ------  ------  --------- ----
"telegraf"   "autogen" "cpu"            90      10      10        "cpu": 9, "host": 1
"telegraf"   "autogen" "disk"           14      7       8         "path": 2, "device": 2, "fstype": 2, "host": 1, "mode": 1
"telegraf"   "autogen" "diskio"         396     11      37        "name": 36, "host": 1
"telegraf"   "autogen" "kernel"         5       5       1         "host": 1
"telegraf"   "autogen" "mem"            34      34      1         "host": 1
"telegraf"   "autogen" "processes"      11      11      1         "host": 1
"telegraf"   "autogen" "swap"           6       6       1         "host": 1
"telegraf"   "autogen" "system"         7       7       1         "host": 1
"telegraf"   "autogen"                  563     82      53        
"telegraf"                              563     82      53        
"scaler"     "autogen" "foo"            2       1       2         "tag1": 2
"scaler"     "autogen" "baz"            2       1       2         "tag1": 2
"scaler"     "autogen"                  4       2       2         
"scaler"                                4       2       2         
"_internal"  "monitor" "tsm1_wal"       132     4       108       "walPath": 33, "database": 4, "engine": 1, "hostname": 1, "id": 33, "indexType": 1, "path": 33, "retentionPolicy": 2
"_internal"  "monitor" "httpd"          23      23      2         "bind": 1, "hostname": 1
"_internal"  "monitor" "localStore"     3       3       1         "hostname": 1
"_internal"  "monitor" "runtime"        15      15      1         "hostname": 1
"_internal"  "monitor" "tsm1_cache"     297     9       108       "database": 4, "engine": 1, "hostname": 1, "id": 33, "indexType": 1, "path": 33, "retentionPolicy": 2, "walPath": 33
"_internal"  "monitor" "tsm1_engine"    957     29      108       "indexType": 1, "path": 33, "retentionPolicy": 2, "walPath": 33, "database": 4, "engine": 1, "hostname": 1, "id": 33
"_internal"  "monitor" "tsm1_filestore" 66      2       108       "engine": 1, "hostname": 1, "id": 33, "indexType": 1, "path": 33, "retentionPolicy": 2, "walPath": 33, "database": 4
"_internal"  "monitor" "write"          8       8       1         "hostname": 1
"_internal"  "monitor" "cq"             2       2       1         "hostname": 1
"_internal"  "monitor" "database"       8       2       5         "database": 4, "hostname": 1
"_internal"  "monitor" "queryExecutor"  5       5       1         "hostname": 1
"_internal"  "monitor" "shard"          363     11      108       "database": 4, "engine": 1, "hostname": 1, "id": 33, "indexType": 1, "path": 33, "retentionPolicy": 2, "walPath": 33
"_internal"  "monitor" "subscriber"     4       4       1         "hostname": 1
"_internal"  "monitor"                  1883    108     109       
"_internal"                             1883    108     109       
"tester"     "autogen" "foo"            4       2       4         "t1": 2, "t2": 2
"tester"     "autogen"                  4       2       4         
"tester"                                4       2       4         
Total (est.)                            2454    194     168    
> influx_inspect report-db -db-path ~/.influxdb/data -exact -rollup r
DB          RP        measurement series
--          --        ----------- ------
"telegraf"  "autogen"             563
"telegraf"                        563
"scaler"    "autogen"             4
"scaler"                          4
"_internal" "monitor"             1883
"_internal"                       1883
"tester"    "autogen"             4
"tester"                          4
Total                             2454
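
As a sanity check on the rollup output above, the per-database rows should add up to the Total row. The following Python sketch parses that output and checks the sum. The function name `parse_rollup` is hypothetical, and the parser assumes the column layout shown in the -rollup r example (two header lines, quoted names, series count in the last column); it has not been validated against other flag combinations.

```python
import shlex

# Output copied from the -rollup r example in this issue.
REPORT = '''\
DB          RP        measurement series
--          --        ----------- ------
"telegraf"  "autogen"             563
"telegraf"                        563
"scaler"    "autogen"             4
"scaler"                          4
"_internal" "monitor"             1883
"_internal"                       1883
"tester"    "autogen"             4
"tester"                          4
Total                             2454
'''


def parse_rollup(text):
    """Return ({database: series}, grand_total) from report-db rollup output."""
    db_totals = {}
    grand_total = None
    for line in text.splitlines()[2:]:  # skip the two header lines
        parts = shlex.split(line)       # strips the quotes around names
        if not parts:
            continue
        if line.startswith("Total"):    # unquoted, so not a database name
            grand_total = int(parts[-1])
        elif len(parts) == 2:           # database row: name, series
            db_totals[parts[0]] = int(parts[1])
        # rows with three fields are retention-policy rows; ignored here
    return db_totals, grand_total


if __name__ == "__main__":
    totals, grand = parse_rollup(REPORT)
    print(totals, grand, sum(totals.values()) == grand)
```

For the example above, the database rows are 563, 4, 1883, and 4, which sum to the reported total of 2454.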
@davidby-influx davidby-influx self-assigned this May 18, 2022
davidby-influx added a commit that referenced this issue May 18, 2022
To ease migrations to Cloud 2 installations from
1.X databases, estimate Cloud 2 cardinality for
a datanode (or OSS system).

closes #23356
davidby-influx added a commit that referenced this issue May 23, 2022
To ease migrations to Cloud 2 installations from
1.X databases, estimate Cloud 2 cardinality for
a datanode (or OSS system).

closes #23356
davidby-influx added a commit that referenced this issue May 26, 2022
feat: estimate Cloud2 cardinality on 1.X databases

To ease migrations to Cloud 2 installations from
1.X databases, estimate Cloud 2 cardinality for
a data node (or OSS system).

closes #23356
@davidby-influx
Contributor Author

The salient difference between influx_inspect report and influx_inspect report-db is that the former sums series by TSM file, database, and grand total, and prints the time range of the TSM files. The latter rolls up series (and in detailed mode, tags and fields) per measurement, per retention policy, per database, and to a grand total.

> influx_inspect report data
DB        RP      Shard   File                    Series  New (est) Min Time                       Max Time                       Load Time
telegraf  autogen 2       000000006-000000002.tsm 541     541       2021-12-17T22:44:20Z           2021-12-19T23:59:50Z           152.174µs
telegraf  autogen 9       000000008-000000002.tsm 541     0         2021-12-20T00:00:00Z           2021-12-21T22:34:10Z           119.621µs
telegraf  autogen 13      000000001-000000001.tsm 541     0         2021-12-28T23:08:30Z           2021-12-30T01:36:30Z           58.063µs
telegraf  autogen 22      000000069-000000002.tsm 541     0         2022-01-04T20:28:40Z           2022-01-07T18:56:10Z           444.793µs
telegraf  autogen 298     000000001-000000001.tsm 541     0         2022-01-12T23:43:20Z           2022-01-14T22:06:00Z           481.944µs
scaler    autogen 302     000000001-000000001.tsm 2       2         2022-01-14T21:44:26.585665059Z 2022-01-14T21:44:46.738396551Z 64.891µs
scaler    autogen 303     000000001-000000001.tsm 2       2         1970-01-01T00:00:00Z           1970-01-01T00:00:00Z           57.216µs
telegraf  autogen 304     000000053-000000002.tsm 541     0         2022-01-19T18:57:00Z           2022-01-22T00:57:50Z           86.406µs
telegraf  autogen 311     000000070-000000002.tsm 552     11        2022-01-24T23:21:50Z           2022-01-25T04:56:40Z           84.559µs
telegraf  autogen 314     000000003-000000002.tsm 552     0         2022-01-31T17:56:40Z           2022-02-01T01:42:00Z           90.872µs
telegraf  autogen 323     000000004-000000002.tsm 541     0         2022-02-10T00:14:50Z           2022-02-10T17:16:40Z           74.079µs
telegraf  autogen 325     000000001-000000001.tsm 530     0         2022-02-28T19:54:00Z           2022-03-01T16:11:00Z           59.412µs
telegraf  autogen 346     000000004-000000002.tsm 530     0         2022-03-11T21:40:00Z           2022-03-13T23:59:50Z           79.968µs
telegraf  autogen 356     000000007-000000002.tsm 541     0         2022-03-14T00:00:00Z           2022-03-16T19:49:20Z           114.431µs
telegraf  autogen 376     000000003-000000002.tsm 552     0         2022-03-28T16:49:30Z           2022-03-30T18:51:30Z           78.353µs
telegraf  autogen 451     000000001-000000001.tsm 552     0         2022-04-12T21:05:00Z           2022-04-13T23:52:10Z           58.665µs
telegraf  autogen 455     000000001-000000001.tsm 552     0         2022-04-19T18:12:10Z           2022-04-19T19:28:00Z           58.482µs
telegraf  autogen 457     000000006-000000002.tsm 563     11        2022-04-25T23:03:10Z           2022-04-30T05:04:20Z           82.571µs
telegraf  autogen 492     000000001-000000001.tsm 552     0         2022-05-04T21:27:20Z           2022-05-05T03:56:50Z           56.966µs
telegraf  autogen 495     000000001-000000001.tsm 552     0         2022-05-12T21:23:20Z           2022-05-12T22:05:00Z           54.062µs
telegraf  autogen 497     000000001-000000001.tsm 563     0         2022-05-16T23:39:50Z           2022-05-20T00:21:20Z           60.297µs
telegraf  autogen 497     000000002-000000001.tsm 563     0         2022-05-20T00:21:30Z           2022-05-20T08:25:30Z           59.124µs
telegraf  autogen 497     000000003-000000001.tsm 563     0         2022-05-20T08:25:40Z           2022-05-20T16:29:40Z           61.352µs
_internal monitor 498     000000001-000000001.tsm 1661    1660      2022-05-17T00:11:20Z           2022-05-17T02:18:20Z           716.491µs
_internal monitor 499     000000001-000000001.tsm 1386    55        2022-05-18T20:28:40Z           2022-05-18T22:19:40Z           168.768µs
_internal monitor 500     000000001-000000001.tsm 1498    113       2022-05-19T21:44:30Z           2022-05-19T23:59:50Z           176.292µs
tester    autogen 501     000000001-000000001.tsm 4       4         2022-05-19T21:57:21.008251071Z 2022-05-19T21:57:59.026537392Z 41.807µs
_internal monitor 502     000000003-000000001.tsm 1498    55        2022-05-20T05:58:50Z           2022-05-20T08:58:20Z           161.253µs
_internal monitor 502     000000002-000000001.tsm 1498    0         2022-05-20T02:59:10Z           2022-05-20T05:58:40Z           167.334µs
_internal monitor 502     000000001-000000001.tsm 1553    0         2022-05-20T00:00:00Z           2022-05-20T02:59:00Z           183.851µs
_internal monitor 502     000000006-000000001.tsm 1498    0         2022-05-20T14:57:50Z           2022-05-20T17:57:20Z           211.129µs
_internal monitor 502     000000005-000000001.tsm 1498    0         2022-05-20T11:58:10Z           2022-05-20T14:57:40Z           193.907µs
_internal monitor 502     000000004-000000001.tsm 1498    0         2022-05-20T08:58:30Z           2022-05-20T11:58:00Z           202.14µs

Summary:
  Files: 33
  Time Range: 1970-01-01T00:00:00Z - 2022-05-20T17:57:20Z
  Duration: 459185h57m20s 

Statistics
  Series:
     - telegraf (est): 563 (22%)
     - scaler (est): 4 (0%)
     - _internal (est): 1883 (76%)
     - tester (est): 4 (0%)
  Total (est): 2454
Completed in 44.573537ms

davidby-influx added a commit that referenced this issue Jun 8, 2022
feat: estimate Cloud2 cardinality on 1.X databases

To ease migrations to Cloud 2 installations from
1.X databases, estimate Cloud 2 cardinality for
a data node (or OSS system).

closes #23356

(cherry picked from commit ef90bc8)

closes #23416
davidby-influx added a commit that referenced this issue Jun 10, 2022
feat: estimate Cloud2 cardinality on 1.X databases

To ease migrations to Cloud 2 installations from
1.X databases, estimate Cloud 2 cardinality for
a data node (or OSS system).

closes #23356

(cherry picked from commit ef90bc8)

closes #23416