Description
We've been using the Spark operator and Volcano in our production environment for a long time. However, there are problems with how resource usage is calculated for the Volcano PodGroup when a SparkApplication is submitted.
The spark.dynamicAllocation.* and spark.kubernetes.memoryOverheadFactor parameters are not taken into account when calculating the memory in minResources for the Volcano PodGroup. As a result, the calculated minResources may be smaller than the application's real usage, and gang scheduling may fail.
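To illustrate the kind of calculation being described, here is a minimal Go sketch of how a PodGroup's minResources memory could account for both the memory overhead factor and the dynamic-allocation initial executor count. The function names, defaults, and signatures below are assumptions for illustration, not the spark-operator's actual implementation; only the overhead rule (memory + max(memory * overheadFactor, 384MiB)) follows Spark's documented behavior.

```go
package main

import "fmt"

// executorPodMemoryMiB is a hypothetical helper: per-pod memory including
// overhead, following Spark's rule memory + max(memory*overheadFactor, 384MiB).
func executorPodMemoryMiB(memoryMiB int64, overheadFactor float64) int64 {
	const minOverheadMiB = 384 // Spark's documented minimum memory overhead
	overhead := int64(float64(memoryMiB) * overheadFactor)
	if overhead < minOverheadMiB {
		overhead = minOverheadMiB
	}
	return memoryMiB + overhead
}

// minResourcesMemoryMiB is a hypothetical sketch of the gang-scheduling sum:
// driver memory plus memory for the executors that must be schedulable up
// front. With dynamic allocation enabled, that count would come from
// spark.dynamicAllocation.initialExecutors rather than a static instance count.
func minResourcesMemoryMiB(driverMemMiB, execMemMiB int64, overheadFactor float64, initialExecutors int) int64 {
	total := executorPodMemoryMiB(driverMemMiB, overheadFactor)
	for i := 0; i < initialExecutors; i++ {
		total += executorPodMemoryMiB(execMemMiB, overheadFactor)
	}
	return total
}

func main() {
	// Example: 2GiB driver, 4GiB executors, 10% overhead, 3 initial executors.
	// Driver: 2048 + 384 (floor) = 2432; each executor: 4096 + 409 = 4505.
	fmt.Println(minResourcesMemoryMiB(2048, 4096, 0.1, 3))
}
```

Ignoring either input shows the failure mode described above: dropping the overhead factor undercounts every pod by its overhead, and using a static executor count instead of the dynamic-allocation setting miscounts the pods themselves, so the PodGroup can be admitted with less memory than the job actually needs.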
✋ I have searched the open/closed issues and my issue is not listed.
Reproduction Code [Required]
Expected behavior
Actual behavior
Environment & Versions
Spark Operator App version: 2.0.1
Helm Chart Version: 2.0.1
Kubernetes Version: 1.25.7
Apache Spark version: 3.4.3
Additional context
BTW, I see that a resourceusage directory is implemented for Yunikorn. If you have no plan to support this for Volcano, I can contribute our code for Volcano, which has been verified by thousands of Spark tasks.
Hey, I wrote the resourceusage module for the Yunikorn batch scheduler. When I implemented it initially, we discussed pulling these functions out into a more generic module for use across other batch schedulers. If you have code that also calculates the resulting pod resource fields, I'd be happy to review it, improve the existing solution, and update the Volcano batch scheduler.