Enable cpu/xpu support for the benchmarking suite #905

Merged
loadams merged 2 commits into microsoft:master on Aug 14, 2024


louie-tsai
Contributor

@louie-tsai louie-tsai commented May 22, 2024

Enable Intel CPU and Intel XPU support for the benchmark suite.
Many customers use DeepSpeed on CPU and XPU for LLMs, and this benchmark suite helps them debug communication issues in their environments.

A screenshot of a two-node run of all_reduce.py on CPU:
image

A screenshot of a two-card run of run_all.py on XPU:
image

@louie-tsai
Contributor Author

@louie-tsai please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.

@microsoft-github-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"

Contributor License Agreement

@microsoft-github-policy-service agree [company="{Intel}"]

@louie-tsai
Contributor Author

@louie-tsai the command you issued was incorrect. Please try again.

Examples are:

@microsoft-github-policy-service agree

and

@microsoft-github-policy-service agree company="your company"

@microsoft-github-policy-service agree [company="{Intel}"]

@louie-tsai
Contributor Author

@microsoft-github-policy-service agree [company="{Intel}"]

@louie-tsai
Contributor Author

@louie-tsai the command you issued was incorrect. Please try again.

Examples are:

@microsoft-github-policy-service agree

and

@microsoft-github-policy-service agree company="your company"

@microsoft-github-policy-service agree company="Intel"

@louie-tsai
Contributor Author

@microsoft-github-policy-service agree company="Intel"

@microsoft-github-policy-service agree company=Intel

@tjruwase
Contributor

@louie-tsai, thanks so much. This is an amazing PR. We will review and merge shortly.

@tjruwase
Contributor

@louie-tsai, can you confirm whether this PR is ready for review? I noticed that the output (e.g., Gbps) is incorrect or missing.
image

@louie-tsai
Contributor Author

louie-tsai commented Jul 26, 2024


The output issue is related to how the duration is calculated from XPU events.
If I use time.time to measure the duration instead of XPU events, the output looks good.
image
I will escalate the XPU event issue and ask for a fix.
In the meantime, I'm removing XPU support from the README.
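
For readers following the timing discussion, here is a minimal sketch of the wall-clock approach described above. It is not the code from this PR: the helper names `time_op_wall_clock` and `all_reduce_bandwidth_gbps` and the `sync` callback are hypothetical. It uses `time.perf_counter` in place of `time.time`, and it assumes the NCCL-tests convention for all-reduce bus bandwidth (algorithm bandwidth scaled by 2(n-1)/n), which may differ from what the benchmark suite reports.

```python
import time


def time_op_wall_clock(op, sync=lambda: None, warmups=1, trials=5):
    """Return the average wall-clock seconds per call of op().

    `sync` is a hypothetical hook: on a real accelerator it should block
    until all queued device work has finished, otherwise the host timer
    only measures how long it took to launch the work.
    """
    for _ in range(warmups):
        op()
    sync()  # make sure warmup work is done before starting the timer
    start = time.perf_counter()
    for _ in range(trials):
        op()
    sync()  # make sure timed work is done before stopping the timer
    return (time.perf_counter() - start) / trials


def all_reduce_bandwidth_gbps(size_bytes, duration_s, world_size):
    """Return (algorithm bandwidth, bus bandwidth) in Gbps.

    Assumes the NCCL-tests convention for all-reduce:
    busbw = algbw * 2 * (n - 1) / n.
    """
    if duration_s <= 0:
        # A zero or negative duration (e.g. from a faulty event timer)
        # cannot produce a meaningful bandwidth figure.
        raise ValueError("duration must be positive")
    algbw_bytes_per_s = size_bytes / duration_s
    busbw_bytes_per_s = algbw_bytes_per_s * 2 * (world_size - 1) / world_size
    # Convert bytes/s to gigabits/s.
    return algbw_bytes_per_s * 8 / 1e9, busbw_bytes_per_s * 8 / 1e9


if __name__ == "__main__":
    # Stand-in for a collective call; a real run would pass the all-reduce
    # call and a device synchronize function instead.
    duration = time_op_wall_clock(lambda: time.sleep(0.01))
    algbw, busbw = all_reduce_bandwidth_gbps(64 * 1024 * 1024, duration, 2)
    print(f"duration={duration:.4f}s algbw={algbw:.2f}Gbps busbw={busbw:.2f}Gbps")
```

The explicit check on the duration is a design choice for this sketch: if the timer returns zero or a negative value, raising an error is easier to diagnose than a missing or nonsensical Gbps column.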

@loadams loadams merged commit b04fedd into microsoft:master Aug 14, 2024
2 checks passed