
Feature blocky benchmark #541

Merged
merged 5 commits into from
Apr 17, 2020

Conversation


@wilko77 wilko77 commented Apr 16, 2020

This extends the benchmark script so it can run experiments that use blocking.

An experiment definition for blocking looks like this:

{
    "sizes": ["100K", "100K"],
    "use_blocking": true,
    "threshold": 0.80
}

The corresponding 'clknblocks' files are uploaded to S3.
For now I haven't changed the default-experiements.json file, as the blocked experiments take a very long time and would most likely trigger a timeout.
Once we have addressed that issue in the entity service, we can replace default-experiements.json with default-experiements-wawo-blocking.json.
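For illustration, a minimal sketch of how an experiment definition with `use_blocking` could select which files to upload, following the naming convention in docs/benchmarking.rst. The helper name and the size-label mapping are my assumptions, not the actual benchmark.py:

```python
# Hypothetical sketch -- not the actual benchmark.py. Shows how "use_blocking"
# in an experiment definition could pick the files to upload per data provider.

# Assumed mapping from experiment size labels to row counts.
SIZES = {"10K": 10_000, "100K": 100_000}

def upload_filenames(experiment):
    """Return the data file to upload for each data provider (a, b, ...)."""
    blocked = experiment.get("use_blocking", False)
    names = []
    for user, size in zip("abcdefghij", experiment["sizes"]):
        rows = SIZES[size]
        if blocked:
            # CLKs bundled with their block identifiers (JSON).
            names.append(f"clknblocks_{user}_{rows}.json")
        else:
            # Plain binary encodings, which upload much faster.
            names.append(f"clk_{user}_{rows}_v2.bin")
    return names

experiment = {"sizes": ["100K", "100K"], "use_blocking": True, "threshold": 0.80}
print(upload_filenames(experiment))
# -> ['clknblocks_a_100000.json', 'clknblocks_b_100000.json']
```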

@wilko77 wilko77 requested a review from hardbyte April 16, 2020 04:02

@hardbyte hardbyte left a comment


The benchmarking code looks pretty good to me. I'd like to see some indication of the number and size of blocks used in each experiment. Ideally we could configure the experiments to run with different sizes, but I realize that would be tricky.

Please update docs/benchmarking.rst before merging, in particular it would be good to see more about how the blocks were created.

A feature request: we could show some form of progress in the benchmark container's output. If we printed the result token, the user could attach a rest_client to watch progress if they really wanted.
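As a rough sketch of what that could look like: a formatter for the run-status payloads the service already returns (seen in the logs below), plus a small polling loop. The status-endpoint path and the Authorization header here are my assumptions about the entity service REST API, not confirmed by this PR:

```python
# Sketch of a progress watcher for the benchmark output. The endpoint path and
# auth header are assumptions; the payload shape matches the run status in the
# benchmark logs.
import json
import time
import urllib.request

def format_progress(status):
    """Render a run-status payload as a one-line progress message."""
    stage = status.get("current_stage", {})
    prog = stage.get("progress", {})
    return "stage {}/{} ({}): {:.2%}".format(
        stage.get("number", "?"),
        status.get("stages", "?"),
        stage.get("description", "unknown"),
        prog.get("relative", 0.0),
    )

def watch_run(server, project_id, run_id, result_token, interval=10):
    """Poll the run status and print progress until a terminal state."""
    url = f"{server}/api/v1/projects/{project_id}/runs/{run_id}/status"
    while True:
        req = urllib.request.Request(url, headers={"Authorization": result_token})
        with urllib.request.urlopen(req) as resp:
            status = json.load(resp)
        print(format_progress(status))
        if status.get("state") in ("completed", "error"):
            return status
        time.sleep(interval)
```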

The performance it reveals is another story 🤢

Running on my desktop I see uploading 100k clknblocks takes ~45s versus ~6s for binary encodings!
During upload I see I log the size of each block (oops), and that most(?) blocks have just 1 element.
For the 100k x 100k experiment it creates 112551 chunks. Creating the chunks appears to take almost 3 minutes. I scaled up to 10 workers to give it a chance of finishing. On my machine one chunk takes as much as 50ms, although I saw some around 10ms. My CPU cores are all <20% active during this process :-/


Context from docs/benchmarking.rst:

    the 3 party linkage), and then a number a file following the format `PII_{user}_{size_data}.csv`,
    `clk_{user}_{size_data}_v2.bin`, `clk_{user}_{size_data}.json` and `clknblocks_{user}_{size_data}.json` where $user
    is a letter starting from `a` indexing the data owner, and `size_data` is a integer representing the number of data
    rows in the dataset (e.g. 10000). Note that the csv usually has a header.

Suggested change
`clk_{user}_{size_data}_v2.bin`, `clk_{user}_{size_data}.json` and `clknblocks_{user}_{size_data}.json` where $user
`clk_{user}_{size_data}_v2.bin`, `clk_{user}_{size_data}.json` and `clknblocks_{user}_{size_data}.json` where `user`

Context from docs/benchmarking.rst:

    the 3 party linkage), and then a number a file following the format `PII_{user}_{size_data}.csv`,
    `clk_{user}_{size_data}_v2.bin`, `clk_{user}_{size_data}.json` and `clknblocks_{user}_{size_data}.json` where $user
    is a letter starting from `a` indexing the data owner, and `size_data` is a integer representing the number of data
    rows in the dataset (e.g. 10000). Note that the csv usually has a header.

Suggested change
is a letter starting from `a` indexing the data owner, and `size_data` is a integer representing the number of data
is a letter starting from `a` indexing the data providers, and `size_data` is an integer representing the number of

@hardbyte

I left it running last night with a 12 hour timeout, 6 workers and worker settings:

      - CELERYD_MAX_TASKS_PER_CHILD=2048
      - CELERYD_CONCURRENCY=4
      - CELERY_DB_MIN_CONNECTIONS=1
      - CELERY_DB_MAX_CONNECTIONS=8
Benchmark Logs
2020/04/16 05:56:53 Waiting for: tcp://db:5432
2020/04/16 05:56:53 Waiting for: tcp://nginx:8851/api/v1/status
2020/04/16 05:56:53 Connected to tcp://db:5432
2020/04/16 05:56:53 Connected to tcp://nginx:8851
INFO:loaded experiments: [{'sizes': ['100K', '100K'], 'threshold': 0.95, 'repetition': 1}, {'sizes': ['100K', '100K'], 'threshold': 0.8, 'repetition': 1}, {'sizes': ['10K', '10K'], 'threshold': 0.95, 'repetition': 1}, {'sizes': ['100K', '100K'], 'use_blocking': True, 'threshold': 0.95, 'repetition': 1}, {'sizes': ['100K', '100K'], 'use_blocking': True, 'threshold': 0.8, 'repetition': 1}, {'sizes': ['10K', '10K'], 'use_blocking': True, 'threshold': 0.95, 'repetition': 1}]
INFO:{'project_count': 1, 'rate': 1, 'status': 'ok'}
INFO:Downloading synthetic datasets from S3
INFO:Downloads complete
INFO:running experiment: {'sizes': ['100K', '100K'], 'threshold': 0.95, 'repetition': 1, 'rep': 1}
INFO:Starting time: Thu Apr 16 05:56:54 2020
INFO:Upload status: 201
INFO:uploading clks for a took 5.747
INFO:Upload status: 201
INFO:uploading clks for b took 5.727
INFO:waiting for run bf9bd1ef3e3b006d5808e8b5c0ad8f418c2b2aeb36598ff2 from the project 02be6854a0cb0731e06653fba86a063b348551a9ee6c97ad to finish
INFO:experiment successful. Evaluating results now...
INFO:cleaning up...
INFO:Ending time: Thu Apr 16 05:58:12 2020
INFO:running experiment: {'sizes': ['100K', '100K'], 'threshold': 0.8, 'repetition': 1, 'rep': 1}
INFO:Starting time: Thu Apr 16 05:58:12 2020
INFO:Upload status: 201
INFO:uploading clks for a took 5.523
INFO:Upload status: 201
INFO:uploading clks for b took 5.516
INFO:waiting for run d1a4cbbeed696d1d44974cd334ac4c6f3c8251856c3cc8e3 from the project 27b7283a5f7a1b6da4f86d8e5d208f74142ad7ffbdee7d32 to finish
INFO:experiment successful. Evaluating results now...
INFO:cleaning up...
INFO:Ending time: Thu Apr 16 05:59:46 2020
INFO:running experiment: {'sizes': ['10K', '10K'], 'threshold': 0.95, 'repetition': 1, 'rep': 1}
INFO:Starting time: Thu Apr 16 05:59:46 2020
INFO:Upload status: 201
INFO:uploading clks for a took 0.583
INFO:Upload status: 201
INFO:uploading clks for b took 0.587
INFO:waiting for run 4ed5b71259202072a1118dafc0bd02086e39600ffeec01e7 from the project 84b73cd77e203b44ee94ba829becea58048b7d974b83b478 to finish
INFO:experiment successful. Evaluating results now...
INFO:cleaning up...
INFO:Ending time: Thu Apr 16 05:59:52 2020
INFO:running experiment: {'sizes': ['100K', '100K'], 'use_blocking': True, 'threshold': 0.95, 'repetition': 1, 'rep': 1}
INFO:Starting time: Thu Apr 16 05:59:52 2020
INFO:upload result: {'message': 'Updated', 'receipt_token': 'e92adb33e186e59e694c82a893314925e493c9578063e02e'}
INFO:uploading clknblocks for a took 40.494
INFO:upload result: {'message': 'Updated', 'receipt_token': '7b57c43d49c2c722825398473525268b4c9a33ada19ad796'}
INFO:uploading clknblocks for b took 48.012
INFO:waiting for run 851488d30746e028bdaa43aa3406e20d61d929526f3b5888 from the project 324bc1f37c008814703560af8d5f4e8075b33e1edf4cd119 to finish
WARNING:experiment '{'sizes': ['100K', '100K'], 'use_blocking': True, 'threshold': 0.95, 'repetition': 1, 'rep': 1}' failed: Traceback (most recent call last):
  File "benchmark.py", line 384, in run_single_experiment
    raise RuntimeError('run did not finish!\n{}'.format(status))
RuntimeError: run did not finish!
{'current_stage': {'description': 'compute similarity scores', 'number': 2, 'progress': {'absolute': 3384967, 'description': 'number of already computed similarity scores', 'relative': 0.0003384967}}, 'stages': 3, 'state': 'error', 'time_added': '2020-04-16T06:01:21.132298+00:00'}

INFO:cleaning up...
INFO:Ending time: Thu Apr 16 09:25:54 2020
INFO:running experiment: {'sizes': ['100K', '100K'], 'use_blocking': True, 'threshold': 0.8, 'repetition': 1, 'rep': 1}
INFO:Starting time: Thu Apr 16 09:25:54 2020
INFO:upload result: {'message': 'Updated', 'receipt_token': 'e5c8a7d2029f0bac6f503b56d208c46f1623a5fb85fba549'}
INFO:uploading clknblocks for a took 39.736
INFO:upload result: {'message': 'Updated', 'receipt_token': 'dc89c764465cceac4df6cedcf41a9633011c05e1d108b9ef'}
INFO:uploading clknblocks for b took 47.232
INFO:waiting for run 0123fccd1f574adbf9f219a641f811da701a70404ba747cd from the project 25bacd878d7a27d556a9a225002f5a789af944c3c026bd42 to finish
INFO:experiment successful. Evaluating results now...
INFO:cleaning up...
INFO:Ending time: Thu Apr 16 12:25:02 2020
INFO:running experiment: {'sizes': ['10K', '10K'], 'use_blocking': True, 'threshold': 0.95, 'repetition': 1, 'rep': 1}
INFO:Starting time: Thu Apr 16 12:25:02 2020
INFO:upload result: {'message': 'Updated', 'receipt_token': 'f649a2189896fbf8d85e995e9f7ff494c116cc3409b61e66'}
INFO:uploading clknblocks for a took 7.232
INFO:upload result: {'message': 'Updated', 'receipt_token': '918c8fcb0b408ffa1bfb830c6a8cda211aeebf10d27468e3'}
INFO:uploading clknblocks for b took 7.729
INFO:waiting for run e5f0bd97d09349d23aaa1a0fe3611cfe279f482f6e23a2f0 from the project d39e4b8fb2642e829200ab44a02010cff6b397a07ff5262e to finish
INFO:experiment successful. Evaluating results now...
INFO:cleaning up...
INFO:Ending time: Thu Apr 16 12:54:38 2020
{'experiments': [{'experiment': {'rep': 1,
                                 'repetition': 1,
                                 'sizes': ['100K', '100K'],
                                 'threshold': 0.95},
                  'groups_results': {'accuracy': 0.43175537666706715,
                                     'false_positives': 0,
                                     'negatives': 28377,
                                     'positives': 21561},
                  'project_id': '02be6854a0cb0731e06653fba86a063b348551a9ee6c97ad',
                  'run_id': 'bf9bd1ef3e3b006d5808e8b5c0ad8f418c2b2aeb36598ff2',
                  'sizes': {'size_a': 100000, 'size_b': 100000},
                  'status': 'completed',
                  'threshold': 0.95,
                  'timings': {'added:': '2020-04-16T05:57:05.559457+00:00',
                              'completed': '2020-04-16T05:58:08.652419+00:00',
                              'runtime': 63.053858,
                              'started': '2020-04-16T05:57:05.598561+00:00'}},
                 {'experiment': {'rep': 1,
                                 'repetition': 1,
                                 'sizes': ['100K', '100K'],
                                 'threshold': 0.8},
                  'groups_results': {'accuracy': 0.9962553566422364,
                                     'false_positives': 7289,
                                     'negatives': 187,
                                     'positives': 49751},
                  'project_id': '27b7283a5f7a1b6da4f86d8e5d208f74142ad7ffbdee7d32',
                  'run_id': 'd1a4cbbeed696d1d44974cd334ac4c6f3c8251856c3cc8e3',
                  'sizes': {'size_a': 100000, 'size_b': 100000},
                  'status': 'completed',
                  'threshold': 0.8,
                  'timings': {'added:': '2020-04-16T05:58:23.275891+00:00',
                              'completed': '2020-04-16T05:59:43.332925+00:00',
                              'runtime': 80.04071,
                              'started': '2020-04-16T05:58:23.292215+00:00'}},
                 {'experiment': {'rep': 1,
                                 'repetition': 1,
                                 'sizes': ['10K', '10K'],
                                 'threshold': 0.95},
                  'groups_results': {'accuracy': 0.44051638530287984,
                                     'false_positives': 0,
                                     'negatives': 2817,
                                     'positives': 2218},
                  'project_id': '84b73cd77e203b44ee94ba829becea58048b7d974b83b478',
                  'run_id': '4ed5b71259202072a1118dafc0bd02086e39600ffeec01e7',
                  'sizes': {'size_a': 10000, 'size_b': 10000},
                  'status': 'completed',
                  'threshold': 0.95,
                  'timings': {'added:': '2020-04-16T05:59:47.435166+00:00',
                              'completed': '2020-04-16T05:59:48.897097+00:00',
                              'runtime': 1.42179,
                              'started': '2020-04-16T05:59:47.475307+00:00'}},
                 {'description': 'Traceback (most recent call last):\n'
                                 '  File "benchmark.py", line 384, in '
                                 'run_single_experiment\n'
                                 "    raise RuntimeError('run did not "
                                 "finish!\\n{}'.format(status))\n"
                                 'RuntimeError: run did not finish!\n'
                                 "{'current_stage': {'description': 'compute "
                                 "similarity scores', 'number': 2, 'progress': "
                                 "{'absolute': 3384967, 'description': 'number "
                                 "of already computed similarity scores', "
                                 "'relative': 0.0003384967}}, 'stages': 3, "
                                 "'state': 'error', 'time_added': "
                                 "'2020-04-16T06:01:21.132298+00:00'}\n",
                  'name': {'rep': 1,
                           'repetition': 1,
                           'sizes': ['100K', '100K'],
                           'threshold': 0.95,
                           'use_blocking': True},
                  'project_id': '324bc1f37c008814703560af8d5f4e8075b33e1edf4cd119',
                  'run_id': '851488d30746e028bdaa43aa3406e20d61d929526f3b5888',
                  'status': 'ERROR'},
                 {'experiment': {'rep': 1,
                                 'repetition': 1,
                                 'sizes': ['100K', '100K'],
                                 'threshold': 0.8,
                                 'use_blocking': True},
                  'groups_results': {'accuracy': 0.9740878689575073,
                                     'false_positives': 2507,
                                     'negatives': 1294,
                                     'positives': 48644},
                  'project_id': '25bacd878d7a27d556a9a225002f5a789af944c3c026bd42',
                  'run_id': '0123fccd1f574adbf9f219a641f811da701a70404ba747cd',
                  'sizes': {'size_a': 100000, 'size_b': 100000},
                  'status': 'completed',
                  'threshold': 0.8,
                  'timings': {'added:': '2020-04-16T09:27:21.801546+00:00',
                              'completed': '2020-04-16T12:24:57.448014+00:00',
                              'runtime': 10632.080275,
                              'started': '2020-04-16T09:27:45.367739+00:00'}},
                 {'experiment': {'rep': 1,
                                 'repetition': 1,
                                 'sizes': ['10K', '10K'],
                                 'threshold': 0.95,
                                 'use_blocking': True},
                  'groups_results': {'accuracy': 0.44051638530287984,
                                     'false_positives': 0,
                                     'negatives': 2817,
                                     'positives': 2218},
                  'project_id': 'd39e4b8fb2642e829200ab44a02010cff6b397a07ff5262e',
                  'run_id': 'e5f0bd97d09349d23aaa1a0fe3611cfe279f482f6e23a2f0',
                  'sizes': {'size_a': 10000, 'size_b': 10000},
                  'status': 'completed',
                  'threshold': 0.95,
                  'timings': {'added:': '2020-04-16T12:25:17.611094+00:00',
                              'completed': '2020-04-16T12:54:33.483004+00:00',
                              'runtime': 1753.61083,
                              'started': '2020-04-16T12:25:19.872174+00:00'}}],
 'server': 'http://nginx:8851',
 'version': {'anonlink': '0.12.5',
             'entityservice': 'v1.13.0-beta2',
             'python': '3.8.2'}}
2020/04/16 12:54:38 Command finished successfully.

The gist: one 100k x 100k run failed with a timeout, and the other took 10632 seconds.
Note how the progress is incorrect ('absolute': 3384967, 'relative': 0.0003384967); I've opened issue #542 for that.
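A quick back-of-the-envelope check (my arithmetic, not from the PR) suggests why the relative figure looks wrong: it appears to be computed against the full naive comparison space rather than the much smaller number of comparisons the blocks actually produce:

```python
# Logged progress from the blocked 100K x 100K run above.
absolute = 3_384_967          # similarity scores computed so far
relative = 0.0003384967       # fraction reported by the service

# All pairwise comparisons WITHOUT blocking.
naive_total = 100_000 * 100_000

# The reported fraction matches absolute / naive_total, so the denominator
# seems to ignore blocking entirely.
print(absolute / naive_total)  # matches the logged 'relative' value
```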


wilko77 commented Apr 17, 2020

Re: binary encodings: we should definitely look into allowing binary CLKs and block info to be uploaded as separate files for big jobs.
Re: performance: Poo.

@wilko77 wilko77 merged commit 012d9a0 into develop Apr 17, 2020
@wilko77 wilko77 deleted the feature-blocky-benchmark branch April 17, 2020 07:31