[CUDA] Update benchmark_mha.py to capture debug info to identify sdpa kernel #21804
Azure Pipelines / orttraining-ortmodule-distributed (DistributedInferenceTest Onnxruntime_Linux_GPU_Inference_Distributed_Test)
succeeded
Aug 21, 2024 in 1h 4m 40s