add torchbench example: WIP #388

rakataprime · 2023-06-07T00:17:38Z

Work-in-progress PR adding a TorchBench GPU benchmarking SDL.

…nchmarks to hf_bert,hf_Bert_Large,resnet50,tacotron2

anilmurty · 2023-06-21T22:53:55Z

SDL needs to be updated to include the "vendor" key as shown here https://docs.akash.network/testnet/example-gpu-sdls/specific-gpu-vendor
add:

          attributes:
            vendor:
              nvidia:

rakataprime · 2023-06-21T23:12:39Z

> SDL needs to be updated to include the "vendor" key as shown here https://docs.akash.network/testnet/example-gpu-sdls/specific-gpu-vendor add:
>
>           attributes:
>             vendor:
>               nvidia:

I have updated the SDL with the vendor attributes. I think it would be best to prepackage a notebook for the benchmarks so that people only have to click "run all" to get the results. We could probably use a shell escape ("!") in the first cell, like !run.sh or !python /workspace/benchmark/install.py models hf_bert hf_Bert_large resnet50 tacotron2 && pytest /workspace/benchmark/test_bench.py -k "(hf_bert or hf_bert_Large or resnet50 or tacotron2)" --ignore_machine_config

That would be the most minimal approach. You could also persist the benchmark results stored as JSON and try to make some pretty plots too, but if time is of the essence I think we could just add jupyter to requirements.txt along with a minimal template notebook.
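For illustration only, the one-line shell escape proposed above could also be written as a small Python cell. This is a rough sketch, not the code that was merged: the helper names (install_command, pytest_command, run_all) are hypothetical, and the model names, paths, and the --ignore_machine_config flag are copied from the comment as written, without checking them against TorchBench.

```python
import shlex
import subprocess

# Assumption: model names and paths are taken verbatim from the comment above;
# their exact spelling in TorchBench has not been verified here.
MODELS = ["hf_bert", "hf_Bert_large", "resnet50", "tacotron2"]
BENCH_DIR = "/workspace/benchmark"


def install_command(models, bench_dir=BENCH_DIR):
    """Build the argv list that installs only the selected models."""
    return ["python", f"{bench_dir}/install.py", "models", *models]


def pytest_command(models, bench_dir=BENCH_DIR):
    """Build the argv list that runs test_bench.py restricted to the models."""
    keyword = "(" + " or ".join(models) + ")"
    return [
        "pytest",
        f"{bench_dir}/test_bench.py",
        "-k",
        keyword,
        "--ignore_machine_config",
    ]


def run_all(models=MODELS, runner=subprocess.run):
    """Run the install step, then the benchmarks, stopping at the first failure.

    This mirrors the `&&` in the shell one-liner. `runner` is injectable so the
    function can be exercised without actually launching anything.
    """
    for cmd in (install_command(models), pytest_command(models)):
        print("$", shlex.join(cmd))
        result = runner(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0
```

In a template notebook, a first cell that calls run_all() would play the same role as the !python ... && pytest ... line, with the model list kept in one place.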

anilmurty · 2023-06-21T23:19:37Z

That would be great @rakataprime - would you like to add to this PR itself?

anilmurty · 2023-06-21T23:21:16Z

By the way, if you want to test deployments you can use one of these client options: https://docs.akash.network/testnet/gpu-testnet-client-instructions - we have a few GPU providers on the testnet now: https://akash.praetorapp.com/provider-status (select "testnet" in the "Network Selection" dropdown to see them)

rakataprime · 2023-06-21T23:23:55Z

> That would be great @rakataprime - would you like to add to this PR itself?

I could do it either in this PR or in a new one. Do you have requirements for what information you want included in the notebook other than the benchmarks, e.g. GitHub, username, email, wallet address, etc.?

anilmurty · 2023-06-21T23:55:40Z

We will already be collecting those details via a Typeform (right, @brewsterdrinkwater?), but it wouldn't hurt to ask for GitHub ID, Discord handle, and wallet address, I think.

anilmurty · 2023-06-21T23:56:14Z

In fact, I think it may help correlate things for awards.

anilmurty · 2023-06-27T15:16:31Z

@rakataprime - not sure if you are waiting on a response here but we're ok either way re. collecting user info in the jupyter notebook

rakataprime · 2023-06-28T02:05:07Z

> @rakataprime - not sure if you are waiting on a response here but we're ok either way re. collecting user info in the jupyter notebook

I think if we test it and it works well enough we would be ready to merge.

anilmurty · 2023-06-30T16:43:29Z

Thanks @rakataprime - have you tried this on the testnet? There are 26 GPUs available there right now https://akash.praetorapp.com/provider-status?chainid=testnet-02

chainzero

The requested changes to the GPU profile and exposed port have been made. The SDL looks good and I have tested it successfully.

anilmurty · 2023-06-30T19:14:45Z

Thanks again @rakataprime and thanks @chainzero !

rakataprime added 5 commits June 6, 2023 18:09

add torchbench example

756b78a

update docker container to install torchbench at runtime and limit benchmarks to hf_bert,hf_Bert_Large,resnet50,tacotron2

8f02ee9

update entrypoint for sdl

b4e300e

update for sh script

d98b5ce

update container perm

e115932

anilmurty requested review from chainzero and andy108369 June 21, 2023 22:54

add vendor attributes to sdl

482ec09

add jupyter deps, scripts, and notebook template to repo

7d473ee

chainzero approved these changes Jun 30, 2023

chainzero merged commit e69560b into akash-network:master Jun 30, 2023
