Question about training and inference memory #38

Open
tang-yu-feng opened this issue Jan 8, 2025 · 2 comments

Comments

@tang-yu-feng

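Which GPU did the author train on (or was it multi-GPU)? On my side, training on a single GPU with 44 GB of memory, I can only run the small NBC2 model with a batch size of 2, and the number of layers can be set to at most 4. At inference time it reports that 162 GB of memory is needed (possibly because my input audio is too long, 2 to 3 minutes), so I can only run inference on the CPU, which is very slow. Is this normal, or is something misconfigured?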

@quancs
Member

quancs commented Jan 9, 2025

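  1. I use multiple A100 GPUs.
  2. With a small batch size, you can use gradient accumulation to get the equivalent of a large batch size. You can also consider mixed precision and activation checkpointing; all of these help with this problem.
  3. The audio used for inference is too long. I usually run inference segment by segment and then stitch the segments together with overlap.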
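
The segment-and-stitch idea in point 3 could be sketched roughly as follows. This is only an illustration, not code from this repository: `chunked_inference`, `model_fn`, `chunk_len`, and `overlap` are hypothetical names, the linear cross-fade is one assumed way of doing the overlapped stitching, and the sketch assumes the model returns real-valued output with the same number of time samples as the chunk it receives.

```python
import numpy as np


def chunked_inference(model_fn, audio, chunk_len, overlap):
    """Run model_fn on overlapping chunks and cross-fade the results.

    audio: array with time on the last axis, e.g. [channels, time].
    model_fn: hypothetical callable mapping a chunk to an output whose
        last axis has the same length as the chunk (an assumption).
    """
    assert 0 <= overlap < chunk_len
    n = audio.shape[-1]
    hop = chunk_len - overlap

    # Weighting window: flat in the middle, linear ramps over the overlap.
    win = np.ones(chunk_len)
    if overlap > 0:
        ramp = np.arange(1, overlap + 1) / (overlap + 1)
        win[:overlap] = ramp
        win[-overlap:] = ramp[::-1]

    out = None
    weight = np.zeros(n)
    start = 0
    while True:
        end = min(start + chunk_len, n)
        est = np.asarray(model_fn(audio[..., start:end]), dtype=np.float64)
        if out is None:
            # Output may have different leading dims than the input.
            out = np.zeros(est.shape[:-1] + (n,))
        w = win[: end - start]
        out[..., start:end] += est * w
        weight[start:end] += w
        if end == n:
            break
        start += hop

    # Normalize by the summed weights so overlapped regions average out.
    return out / weight
```

Peak memory then depends on `chunk_len` rather than on the full 2 to 3 minute recording, at the cost of some extra computation in the overlapped regions.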

@tang-yu-feng
Author

Understood, thanks for the reply. I have one more question: apart from OnlineSpatialNet, which is causal, are none of the other models here causal?
