Patch release
Thanks to our 2 contributors for their prompt fixes, which mostly apply to training and FA2 (Flash Attention 2)!
- Fix Gemma2 4d attention mask (#31674) by @hiyouga
- don't zero out the attention_mask when using sliding window with flash attention (#31670) by @winglian