Commit
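### Changes

1. Cluster information gains two new parameters, `gpuType` and `vramMb`. During training, `gpuType` must be passed back to the adapter.
2. Adds multi-node, multi-GPU support. When a user selects multiple nodes (pods) for AI training, they must specify the algorithm framework to use for distributed training. Currently supported: tensorflow, pytorch, and mindspore (Huawei-specific).

   ![image](https://github.com/PKUHPC/SCOW/assets/130351655/06e41fdb-7a78-4c36-b253-3f5728bfa1e4)

3. The AI application configuration file gains a `tag` field, and three interfaces are added: `listApp`, `listTags`, and `listClusters`, which takes an `appId` and returns the clusters on which that application can be created. These interfaces mainly prepare for the upcoming refactoring of the AI job module.
4. Jobs and applications submitted to a GPU partition must pass `gpuType` to tell the adapter which GPU type is in use.
5. For Huawei GPU cards, a framework must be specified when submitting applications and jobs, whether or not the training is distributed.

---------

Co-authored-by: OYX-1 <[email protected]>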
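The submission rules in items 2, 4, and 5 can be summarized as a small validation function. The following is a minimal sketch only, not the code from this commit: the `TrainJobRequest` shape, the `validateTrainJobRequest` and `isHuaweiGpu` names, and the prefix check used to recognize a Huawei GPU are all hypothetical, since the commit message does not say how these checks are implemented.

```typescript
// Hypothetical types for illustration; not taken from the SCOW codebase.
type Framework = "tensorflow" | "pytorch" | "mindspore";

interface TrainJobRequest {
  nodeCount: number;        // number of nodes (pods) selected by the user
  isGpuPartition: boolean;  // whether the target partition is a GPU partition
  gpuType?: string;         // GPU type, as reported in the cluster information
  framework?: Framework;    // framework used for (distributed) training
}

// Assumption: the commit does not state how a Huawei GPU is identified.
// A case-insensitive prefix check on gpuType is used here as a placeholder.
function isHuaweiGpu(gpuType: string | undefined): boolean {
  return gpuType !== undefined && gpuType.toLowerCase().startsWith("huawei");
}

// Returns a list of human-readable problems; an empty list means the request passes.
function validateTrainJobRequest(req: TrainJobRequest): string[] {
  const errors: string[] = [];

  // Item 4: GPU partitions must pass gpuType to the adapter.
  if (req.isGpuPartition && !req.gpuType) {
    errors.push("gpuType is required for GPU partitions");
  }

  // Item 2: selecting more than one node requires a framework.
  if (req.nodeCount > 1 && !req.framework) {
    errors.push("framework is required when more than one node is selected");
  }

  // Item 5: Huawei GPUs require a framework even for single-node runs.
  if (isHuaweiGpu(req.gpuType) && !req.framework) {
    errors.push("framework is required for Huawei GPUs");
  }

  return errors;
}
```

Returning a list of messages rather than throwing on the first failure lets a form report every missing field at once. A multi-node request on a Huawei GPU with no framework therefore yields two messages, one for each rule it violates.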
1 parent c214bd2, commit 753a996

Showing 13 changed files with 338 additions and 34 deletions.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -0,0 +1,7 @@ |
| | 1 | + --- |
| | 2 | + "@scow/scheduler-adapter-protos": patch |
| | 3 | + "@scow/config": patch |
| | 4 | + "@scow/ai": patch |
| | 5 | + --- |
| | 6 | + |
| | 7 | + AI: add multi-node, multi-GPU distributed training and special handling for Huawei GPUs |