[Doc] Minor doc update #6952

Merged
merged 1 commit on Dec 21, 2022
2 changes: 1 addition & 1 deletion docs/lang/articles/basic/sparse.md
@@ -34,7 +34,7 @@ Sparse data structures are traditionally based on [Quadtrees](https://en.wikiped
[VDB](https://www.openvdb.org/) and [SPGrid](http://pages.cs.wisc.edu/~sifakis/papers/SPGrid.pdf) are such examples.
In Taichi, programmers can compose data structures similar to VDB and SPGrid with SNodes. The advantages of Taichi spatially sparse data structures include:

- - Array and List access time, which is the equivalent to accessing a dense data structure.
+ - Access with indices, just like accessing a dense data structure.
- Automatic parallelization when iterating.
- Automatic memory access optimization.

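For context, here is a minimal sketch of what index-based access into a spatially sparse field looks like; the `pointer`/`dense` layout and the field name `x` are illustrative assumptions, not part of this diff:

```python
import taichi as ti

ti.init(arch=ti.cpu)

x = ti.field(dtype=ti.f32)
# Two-level sparse layout: a pointer SNode over 4x4 blocks, each block
# holding a 4x4 dense grid of f32 cells (16x16 cells in total).
block = ti.root.pointer(ti.ij, (4, 4))
block.dense(ti.ij, (4, 4)).place(x)

@ti.kernel
def activate():
    # Writing through an index activates the enclosing block on demand,
    # with the same syntax as a dense field.
    x[2, 3] = 1.0

@ti.kernel
def total() -> ti.f32:
    s = 0.0
    # The struct-for visits only active cells and is parallelized automatically.
    for i, j in x:
        s += x[i, j]
    return s

activate()
print(total())  # 1.0
```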
2 changes: 1 addition & 1 deletion docs/lang/articles/performance_tuning/performance.md
@@ -134,7 +134,7 @@ hierarchy matches `ti.root.(sparse SNode)+.dense`), Taichi will assign one CUDA
thread block to each `dense` container (or `dense` block). BLS optimization works
specifically for such kinds of fields.

- BLS intends to enhance stencil computing processes by utilising CUDA shared memory. This optimization begins with users annotating the set of fields they want to cache using `ti.block_local`. At *compile time*, Taichi tries to identify the accessing range in relation to the `dense` block of these annotated fields. If Taichi is successful, it creates code that first loads all of the accessible data in range into a *block local* buffer (CUDA's shared memory), then replaces all accesses to the relevant slots into this buffer.
+ BLS intends to enhance stencil computing processes by utilizing CUDA shared memory. This optimization begins with users annotating the set of fields they want to cache using `ti.block_local`. At *compile time*, Taichi tries to identify the accessing range in relation to the `dense` block of these annotated fields. If Taichi is successful, it creates code that first loads all of the accessible data in range into a *block local* buffer (CUDA's shared memory), then replaces all accesses to the relevant slots into this buffer.

Here is an example illustrating the usage of BLS. `a` is a sparse field with a
block size of `4x4`.
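The diff view cuts off before the snippet itself; the following is a hedged sketch of BLS usage under the pattern described above (the second field `b`, the block counts, and the stencil body are assumptions for illustration, not the documentation's actual example):

```python
import taichi as ti

ti.init(arch=ti.cuda)

a = ti.field(dtype=ti.f32)
b = ti.field(dtype=ti.f32)
# The leaf layer is a 4x4 `dense` block under a sparse pointer SNode,
# matching the `ti.root.(sparse SNode)+.dense` pattern above.
block = ti.root.pointer(ti.ij, (64, 64))
block.dense(ti.ij, (4, 4)).place(a, b)

@ti.kernel
def populate():
    # Activate a few blocks so the stencil has data to read.
    for i, j in ti.ndrange(16, 16):
        a[i, j] = 1.0

@ti.kernel
def stencil():
    # Annotate `a` so Taichi caches it in a block-local buffer
    # (CUDA shared memory) for this kernel.
    ti.block_local(a)
    for i, j in a:
        # Neighboring reads of `a` are served from the cached block.
        b[i, j] = a[i - 1, j] + a[i + 1, j]

populate()
stencil()
```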