fix(ai): pushing multi-platform images to a remote image registry #1254

Merged
merged 4 commits into from
May 16, 2024

@piccaSun piccaSun commented May 16, 2024

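Background

In #1181, using nerdctl -n k8s.io push --all-platforms fixed the failure to push large single-platform images to a remote registry, but it introduced a new problem.

The pull and push commands used with the containerd runtime in the current system were verified further: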

nerdctl -n k8s.io pull externalImage
nerdctl -n k8s.io tag externalImage taggedImage
nerdctl -n k8s.io push --all-platforms taggedImage

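For a multi-platform image such as jupyter/pyspark-notebook:ubuntu-22.04 (Figure 1), the pull may not fetch the image for every platform, as in Figure 2, where SIZE is 0 for the platforms that do not match the current one.
[Figure 1: image]
[Figure 2: image]
In that case, push with --all-platforms fails with an error, while push without --all-platforms pushes only the image for the current node's platform and works without problems.

  • If, as in Figure 2, sha256 layer data from other platforms has been mixed in, then even a pull with an explicit platform (pull --platform linux/amd64) will still show entries for multiple platforms in nerdctl images. In this situation push --all-platforms does not fail.

  • Single-platform images are not affected, whether push is given all platforms, a specific platform, or neither. Jobs are not affected either: even if a job is submitted with a multi-platform image, a commit inside the container produces a single-platform image for the current node's architecture.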

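Solutions
1. Add --all-platforms to pull. This guarantees that the image for every platform is fetched, so push --all-platforms succeeds for both single-platform and multi-platform images. However, it does not match the current reality, in which most deployments are single-platform, and it easily leads to redundant data.

2. Change the previously added nerdctl -n k8s.io push --all-platforms taggedImage to nerdctl -n k8s.io push taggedImage, and ensure as far as possible that nerdctl images lists only the data for the current system's single platform, for single-platform and multi-platform images alike.

Option 1 does resolve #1181. However, further checks found no data from other platforms mixed in under nerdctl images, and #1181 can no longer be reproduced. The likely cause is either that the platform data was not clean and contained layer data from several platforms, or that a mechanism such as nerdctl cache clearing removed the layer data for the image's current platform.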
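The two options above can be sketched as shell functions to make the difference concrete. This is only an illustration: the function names transfer_all_platforms and transfer_current_platform are hypothetical and do not appear in the PR, and the image references are passed in as arguments.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical, not taken from the PR.
# $1 = source image reference, $2 = target image reference.

# Option 1: fetch every platform on pull so that
# push --all-platforms has complete data for each platform.
transfer_all_platforms() {
  nerdctl -n k8s.io pull --all-platforms "$1" &&
    nerdctl -n k8s.io tag "$1" "$2" &&
    nerdctl -n k8s.io push --all-platforms "$2"
}

# Option 2: pull and push without --all-platforms, so only the
# current node's platform is transferred.
transfer_current_platform() {
  nerdctl -n k8s.io pull "$1" &&
    nerdctl -n k8s.io tag "$1" "$2" &&
    nerdctl -n k8s.io push "$2"
}
```

Option 1 trades extra download and storage for completeness. Option 2 keeps the local image store limited to the node's own platform, which is the direction this PR takes.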

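Changes

This PR implements option 2 above and removes the --all-platforms flag that was previously added to push.
In addition, when a push fails, the images created by the preceding pull and tag are deleted on a best-effort basis, and the backend logger prompts the administrator to check the image list, so that the image list stays as clean as possible.
If #1181 occurs again, additional tests will be needed to verify it further.

After the change, it was confirmed that a failed push deletes both the locally pulled image and the tagged image.
[image]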
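The cleanup behavior described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the PR's actual implementation: push_with_cleanup and the log messages are hypothetical, and only the nerdctl pull, tag, push and rmi subcommands are taken as given.

```shell
#!/bin/sh
# Sketch only: push_with_cleanup is a hypothetical name, not the PR's code.
# $1 = source image reference, $2 = target image reference.
push_with_cleanup() {
  src="$1"
  dst="$2"

  nerdctl -n k8s.io pull "$src" || return 1
  nerdctl -n k8s.io tag "$src" "$dst" || return 1

  # No --all-platforms: only the current node's platform is pushed.
  if ! nerdctl -n k8s.io push "$dst"; then
    echo "push of $dst failed, removing local images" >&2
    # Best-effort removal of the tagged and the pulled image.
    nerdctl -n k8s.io rmi "$dst" "$src" ||
      echo "cleanup failed, please check the image list on this node" >&2
    return 1
  fi
}
```

A call such as push_with_cleanup externalImage taggedImage returns non-zero when the push fails, after attempting to remove both local images.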


changeset-bot bot commented May 16, 2024

🦋 Changeset detected

Latest commit: 98741e0

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@scow/ai Patch


@piccaSun piccaSun marked this pull request as ready for review May 16, 2024 07:56
@pkuhpc-review-bot pkuhpc-review-bot bot added the Code-ReviewRequested Code Review Requested label May 16, 2024
@pkuhpc-review-bot pkuhpc-review-bot bot requested a review from ddadaal May 16, 2024 07:56
@pkuhpc-review-bot pkuhpc-review-bot bot added Code-Approved Code Review approved ReadyForMerge Ready for merge and removed Code-ReviewRequested Code Review Requested labels May 16, 2024
@ddadaal ddadaal merged commit 0957f1a into master May 16, 2024
13 checks passed
@ddadaal ddadaal deleted the fix-operation-with-multi-platform-image branch May 16, 2024 15:05