
[fusion] Fold unit dims of globals #756

Closed
MaheshRavishankar opened this issue Jun 27, 2024 · 0 comments
Labels
sdxl-int8 Issues related to SDXL quantized model support

Comments


MaheshRavishankar commented Jun 27, 2024

During compilation of the quantized SDXL model, there are artifacts of this form:

  %extracted_slice_219 = tensor.extract_slice %236[0, 0, 0] [2, 4096, 2560] [1, 1, 1] : tensor<2x4096x5120xf16> to tensor<2x4096x2560xf16>
  %extracted_slice_220 = tensor.extract_slice %236[0, 0, 2560] [2, 4096, 2560] [1, 1, 1] : tensor<2x4096x5120xf16> to tensor<2x4096x2560xf16>
  %expanded_221 = tensor.expand_shape %extracted_slice_219 [[0], [1], [2, 3, 4]] output_shape [2, 4096, 1, 1, 2560] : tensor<2x4096x2560xf16> into tensor<2x4096x1x1x2560xf16>
  %expanded_222 = tensor.expand_shape %extracted_slice_220 [[0], [1], [2, 3, 4]] output_shape [2, 4096, 1, 1, 2560] : tensor<2x4096x2560xf16> into tensor<2x4096x1x1x2560xf16>
  %237 = tensor.empty() : tensor<2x4096x1x1x2560xi8>
  %238 = flow.dispatch.region -> (tensor<2x4096x1x1x2560xi8>) {
    %5295 = linalg.generic {indexing_maps = [affine_map<(d0, d1, d2, d3, d4) -> (d0, d1, d2, d3, d4)>, affine_map<(d0, d1, d2, d3, d4) -> (d0, d1, d2, d3, d4)>, affine_map<(d0, d1, d2, d3, d4) -> (d2, d3, d4)>, affine_map<(d0, d1, d2, d3, d4) -> ()>, affine_map<(d0, d1, d2, d3, d4) -> (d0, d1, d2, d3, d4)>], iterator_types = ["parallel", "parallel", "parallel", "parallel", "parallel"]} ins(%expanded_221, %expanded_222, %__auto.down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.premul_input, %__auto.down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.q_input3Ascale : tensor<2x4096x1x1x2560xf16>, tensor<2x4096x1x1x2560xf16>, tensor<1x1x2560xf16>, tensor<f32>) outs(%237 : tensor<2x4096x1x1x2560xi8>) {

Here we would like to fuse the extract_slice with its consumer. That fusion is blocked by the tensor.expand_shape sitting between the slice and its use in the dispatch. These expand shapes exist because, after FoldUnitDims, the unit dimensions in globals like %__auto.down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.premul_input and %__auto.down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.q_input3Ascale don't get folded away. As a result the collapse_shapes do not fully fold away, and later propagation passes pick them up.
While this could be handled in the propagation passes, it is also worth simply folding away the unit dimensions in the global variables themselves, as sketched below.
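
For illustration, a minimal sketch of the intended transformation on a standalone global (the global name, initializer-free form, and consumer function here are hypothetical; the relevant IREE ops are util.global and util.global.load):

  // Before: the global keeps its unit dims, so every use needs a reshape;
  // this is what leaves the expand_shape/collapse_shape pairs behind.
  util.global private @premul_input : tensor<1x1x2560xf16>
  func.func @consumer() -> tensor<2560xf16> {
    %0 = util.global.load @premul_input : tensor<1x1x2560xf16>
    %1 = tensor.collapse_shape %0 [[0, 1, 2]] : tensor<1x1x2560xf16> into tensor<2560xf16>
    return %1 : tensor<2560xf16>
  }

  // After folding unit dims of the global: the folded shape is baked into
  // the global itself and the reshape at each use disappears.
  util.global private @premul_input : tensor<2560xf16>
  func.func @consumer() -> tensor<2560xf16> {
    %0 = util.global.load @premul_input : tensor<2560xf16>
    return %0 : tensor<2560xf16>
  }

With the globals in folded form, the matching expand_shapes around %extracted_slice_219 and %extracted_slice_220 above become foldable as well, and the extract_slice ops can fuse into the dispatch region.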

MaheshRavishankar converted this from a draft issue Jun 27, 2024
MaheshRavishankar added the sdxl-int8 label Jun 27, 2024
IanWood1 added a commit to iree-org/iree that referenced this issue Jul 3, 2024
Currently reverting 7884dc8 to test regressions (there were problems with llama). Issue here: nod-ai/SHARK-ModelDev#756

Couldn't reproduce the issue with llama yet. It might be best to land this anyway, since the unit dims should be folded in general; it just doesn't play well with this model in particular.

Signed-off-by: Ian Wood <[email protected]>
IanWood1 closed this as completed Jul 8, 2024
github-project-automation bot moved this from Todo to Done in Turbine: SDXL on CDNA Jul 8, 2024
LLITCHEV pushed a commit to LLITCHEV/iree that referenced this issue Jul 30, 2024
Currently reverting iree-org@7884dc8 to test regressions (there were problems with llama). Issue here: nod-ai/SHARK-ModelDev#756

Couldn't reproduce the issue with llama yet. It might be best to land this anyway, since the unit dims should be folded in general; it just doesn't play well with this model in particular.

Signed-off-by: Ian Wood <[email protected]>
Signed-off-by: Lubo Litchev <[email protected]>