Commit

DS-Ulysses formatting (#4204)
* fix indentation

* fix formatting

---------

Co-authored-by: Jeff Rasley <[email protected]>
samadejacobs and jeffra authored Aug 24, 2023
1 parent 3e82cb6 commit 961827b
Showing 2 changed files with 3 additions and 4 deletions.
README.md: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
[![License Apache 2.0](https://badgen.net/badge/license/apache2.0/blue)](https://github.com/Microsoft/DeepSpeed/blob/master/LICENSE)
[![PyPI version](https://badge.fury.io/py/deepspeed.svg)](https://pypi.org/project/deepspeed/)
-[![Downloads](https://static.pepy.tech/badge/deepspeed)](https://pepy.tech/project/deepspeed)
+[![Downloads](https://static.pepy.tech/badge/deepspeed)](https://pepy.tech/project/deepspeed)
[![Build](https://badgen.net/badge/build/check-status/blue)](#build-pipeline-status)
[![Twitter](https://img.shields.io/twitter/follow/MSFTDeepSpeed)](https://twitter.com/intent/follow?screen_name=MSFTDeepSpeed)
[![Japanese Twitter](https://img.shields.io/badge/%E6%97%A5%E6%9C%AC%E8%AA%9ETwitter-%40MSFTDeepSpeedJP-blue)](https://twitter.com/MSFTDeepSpeedJP)
blogs/deepspeed-ulysses/README.md: 2 additions & 3 deletions
@@ -149,7 +149,7 @@ match this analysis.

### Additional Highlights of DeepSpeed-Ulysses

-1) An Attention Agnostic Solution
+***An Attention Agnostic Solution***

DeepSpeed implementation of distributed attention module is general
enough to support any attention: e.g., self-attention, cross-attention,
@@ -165,8 +165,7 @@ per head but just with fewer heads, thus attention computation can be
replaced with any type of attention mechanisms, e.g., dense attention
and various forms of sparse attention.

-2) Training Bigger Models with Longer Sequences through ZeRO-3
-Integration
+***Training Bigger Models with Longer Sequences through ZeRO-3 Integration***

While DeepSpeed sequence parallelism reduces the activation memory when
training with longer sequences, it does not impact the memory consumed
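The context lines in this diff describe the mechanism that makes DeepSpeed-Ulysses attention agnostic: an all-to-all gives every worker the full sequence for a subset of attention heads, any per-head attention is applied, and a second all-to-all restores the sequence sharding. Below is a minimal, single-process PyTorch sketch of that data movement; the all-to-all exchanges are simulated with reshapes over a leading "rank" dimension, and the names (`P`, `ulysses_attention`, `attn_fn`) are illustrative assumptions rather than the DeepSpeed API.

```python
# Single-process sketch of the DeepSpeed-Ulysses idea described above: each "rank"
# holds a sequence shard, a (simulated) all-to-all regroups data so every rank sees
# the full sequence for a subset of heads, any attention runs per head, and a second
# all-to-all restores the sequence sharding. Names and shapes are hypothetical.
import torch

P = 4                        # number of sequence-parallel ranks (simulated as a tensor dim)
B, S, H, D = 2, 32, 8, 16    # batch, full sequence length, heads, head dim
assert S % P == 0 and H % P == 0

def ulysses_attention(x_sharded, attn_fn):
    # x_sharded: [P, B, S/P, H, D] -- "rank" p owns sequence shard p.
    Pn, Bn, Sp, Hn, Dn = x_sharded.shape
    # First all-to-all (simulated): each rank ends up with the full sequence
    # for H/P of the heads -> [P, B, S, H/P, D].
    x = x_sharded.reshape(Pn, Bn, Sp, Pn, Hn // Pn, Dn)       # split heads into P groups
    x = x.permute(3, 1, 0, 2, 4, 5).reshape(Pn, Bn, Sp * Pn, Hn // Pn, Dn)
    # Any attention mechanism can be plugged in here, because each rank now
    # holds complete sequences for its local heads.
    out = attn_fn(x)                                          # [P, B, S, H/P, D]
    # Second all-to-all (simulated): return to sequence sharding with all heads.
    out = out.reshape(Pn, Bn, Pn, Sp, Hn // Pn, Dn)
    out = out.permute(2, 1, 3, 0, 4, 5).reshape(Pn, Bn, Sp, Hn, Dn)
    return out

def dense_softmax_attention(x):
    # Stand-in "any attention": dense self-attention where q = k = v = x.
    q = k = v = x                                             # [P, B, S, h, D]
    scores = torch.einsum("pbshd,pbthd->pbhst", q, k) / D ** 0.5
    return torch.einsum("pbhst,pbthd->pbshd", scores.softmax(-1), v)

x = torch.randn(P, B, S // P, H, D)
y = ulysses_attention(x, dense_softmax_attention)
print(y.shape)   # torch.Size([4, 2, 8, 8, 16]) -- same sharded layout as the input
```

Because `attn_fn` only ever sees complete sequences for its local heads, the dense attention here could be swapped for any sparse or cross-attention variant without changing the communication pattern.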

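The truncated paragraph above makes the complementary point: sequence parallelism reduces activation memory, while the memory consumed by model states (parameters, gradients, optimizer states) is what ZeRO-3 addresses. A hedged sketch of the ZeRO side of such a setup follows; the keys shown are standard DeepSpeed ZeRO settings, and the sequence-parallel degree itself is assumed to be configured in the training script (for example, via a Megatron-DeepSpeed launch argument), not in this config.

```python
# Illustrative DeepSpeed config pairing ZeRO stage 3 with sequence parallelism.
# Only standard ZeRO keys are shown; batch size and precision values are placeholders.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,                   # partition parameters, gradients, optimizer states
        "overlap_comm": True,         # overlap ZeRO communication with backward compute
        "contiguous_gradients": True,
    },
}

# Typical usage (model construction omitted):
# engine, optimizer, _, _ = deepspeed.initialize(
#     model=model, model_parameters=model.parameters(), config=ds_config)
```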