During compilation of the quantized SDXL model there are artifacts of this form:

Here we would like to fuse the `extract_slice` with its consumer. That gets blocked by the `tensor.expand_shape` sitting between the slice and its use in the dispatch. The real reason these expand shapes exist is that after `FoldUnitDims`, the unit dimensions in globals like `%__auto.down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.premul_input` and `%__auto.down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.q_input3Ascale` don't get folded away. As a result, the `collapse_shape`s do not fully fold away, and the later propagation passes pick them up.

While this could be accounted for during the propagation passes, it is also worth simply folding the unit dimensions in the global variables away.
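For illustration only, here is a minimal sketch of the shape of IR involved, not the actual SDXL artifact: the global name is shortened to `@premul_input`, the shapes and the `@example` function are made up, and the exact `tensor.expand_shape` assembly varies across MLIR versions. The point is just the structure: a global that keeps a unit dimension, and an `expand_shape` wedged between the `extract_slice` and its consumer.

```mlir
// Illustrative only. A global that keeps a unit dimension after FoldUnitDims
// (standing in for globals such as
// %__auto.down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.premul_input).
util.global private @premul_input : tensor<1x640xf16>

func.func @example(%src: tensor<8192x640xf16>) -> tensor<1x4096x640xf16> {
  // The slice we would like to fuse with its consumer dispatch.
  %slice = tensor.extract_slice %src[0, 0] [4096, 640] [1, 1]
      : tensor<8192x640xf16> to tensor<4096x640xf16>
  // The expand_shape reintroduces the unit dimension so shapes line up with
  // the un-folded global; it sits between the slice and its use and blocks
  // the fusion.
  %expanded = tensor.expand_shape %slice [[0, 1], [2]]
      : tensor<4096x640xf16> into tensor<1x4096x640xf16>
  return %expanded : tensor<1x4096x640xf16>
}
```

If the unit dimension were folded out of the global, the `expand_shape` (and the matching `collapse_shape`s) could fold away and the slice could fuse directly into its consumer dispatch.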
Currently reverting iree-org@7884dc8 to test regressions (there were problems with llama). Issue here: nod-ai/SHARK-ModelDev#756.

Couldn't reproduce the issue with llama yet. It might be best to land this since the unit dims should be folded in general; it just doesn't play well with this model in particular.
Signed-off-by: Ian Wood <[email protected]>
Signed-off-by: Lubo Litchev <[email protected]>