Update the llvm commit to 061e0189a3dab6b1831a80d489ff1b15ad93aafb #1599
…ource NOT including the accelerators Signed-off-by: Stella Stamenova <[email protected]>
Note this will update onnx-mlir to the commit chosen here: llvm/torch-mlir#1178 |
@tungld sorry to add to your plate, you did some good work with the constants. Do you mind looking into @sstamenova's question? Tx. @sstamenova, thanks so much for lifting LLVM to a newer version; this is much appreciated.
@sstamenova sometimes the CI just times out while downloading resnet50, so you can restart the CIs to see whether the amd/ppc results are a fluke. Both are in a zone that makes downloads slower.
@AlexandreEichenberger: I'm still trying to figure out the correct way to get symbol lookup to work after the mlir change, so there is one lit test failing. I started the PR to give people a chance to review the removal of the opaque attribute while I track that last failure down.
Signed-off-by: Stella Stamenova <[email protected]>
@sstamenova DenseResourceElementsAttr seems to be the right choice to me too. We have used Given that As a side note, when using |
@tungld The first 4 bytes of the resource string are supposed to be the alignment. I tried to guess the intended alignment from the tests, but since I didn't know the source of the data, I probably didn't get them all right and would like to know what they are supposed to be.
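To illustrate the layout described in the comment above, here is a minimal sketch that splits a resource hex string into its alignment prefix and its payload. It assumes the prefix is a 32-bit little-endian unsigned integer, which is an assumption made here for illustration and not something stated in this thread. The helper names `split_resource_blob` and `make_resource_blob` are hypothetical and are not part of MLIR or onnx-mlir.

```python
import struct


def split_resource_blob(hex_string):
    """Split a resource hex string into (alignment, payload bytes).

    Assumption: the first 4 bytes hold the alignment as a little-endian
    unsigned 32-bit integer, followed by the raw data.
    """
    if not hex_string.startswith("0x"):
        raise ValueError("expected a hex string starting with '0x'")
    raw = bytes.fromhex(hex_string[2:])
    if len(raw) < 4:
        raise ValueError("hex string is too short to hold an alignment prefix")
    (alignment,) = struct.unpack("<I", raw[:4])
    return alignment, raw[4:]


def make_resource_blob(alignment, payload):
    """Build a hex string with the alignment prefix followed by the payload."""
    return "0x" + (struct.pack("<I", alignment) + payload).hex().upper()


# One 32-bit integer with value 1, stored with a 4-byte alignment prefix.
example = make_resource_blob(4, struct.pack("<i", 1))
alignment, payload = split_resource_blob(example)
```

Reading the prefix this way makes it easy to check whether the alignment written in a test file matches what the data type actually needs.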
If so, we need a way to specify alignment since NNPA needs 4K-alignment (Sometimes 8K-alignment). In the current code, all constants will be represented by KrnlGlobal with a separated alignment attribute. LLVM Global variables will be created from KrnlGlobal by setting alignment (https://github.com/onnx/onnx-mlir/blob/main/src/Conversion/KrnlToLLVM/KrnlGlobal.cpp#L75). Now if |
@tungld I can update the tests, but I would like to know if the change works with the models that you mentioned. I think the way it's currently set up, alignment can continue to be specified as it was before in addition to specifying it in the data.
Signed-off-by: Stella Stamenova <[email protected]>
@sstamenova yes, I will check this patch with some models. I just returned from a long vacation and it will take time for me to give you the result. |
Signed-off-by: Stella Stamenova <[email protected]>
@tungld I looked at it again, and I think what will happen right now is that the explicitly specified alignment attribute on the constant will continue to work as it is (and it is still specified), while the alignment in the hex string will be used for the reading/writing of the hex string itself. I noticed that I had used a different value in the tests and the code ( |
Signed-off-by: Stella Stamenova <[email protected]>
@sstamenova Yes, I agree. The alignment in the hex string is used for buffers allocated during compilation only (See https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/AsmState.h#L83), while our specified alignment attribute is used for allocating buffers during runtime (of the binary generated code). So they are quite independent. |
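As a rough sketch of what an alignment requirement such as 4K means for a runtime buffer, the helper below rounds an offset up to the next multiple of a power-of-two alignment. The function name `align_up` is hypothetical and does not come from onnx-mlir; it only illustrates the arithmetic, not how KrnlGlobal or LLVM apply the attribute.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)


# A constant placed at offset 5000 with 4K alignment starts at 8192.
runtime_offset = align_up(5000, 4096)
# The same offset with a small compile-time alignment of 4 starts at 5000.
compile_time_offset = align_up(5000, 4)
```

The two results differ because each alignment is applied to a different buffer, which matches the point above that the two settings are independent.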
@jenkins-droid test this please |
Signed-off-by: Stella Stamenova <[email protected]>
@jenkins-droid test this please |
@tungld : Were you able to run some tests and verify the accelerator change? |
@sstamenova unfortunately, I got wrong results on NNPA when using this patch. After digging into this patch a bit, I know the reason. Let me explain here. A key difference between So, the code to free the given buffer is still there in this patch, so when using I confirm that just removing |
…t the blob will now have control of its memory and will clean up appropriately. Signed-off-by: Stella Stamenova <[email protected]>
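The ownership issue discussed above can be sketched in a simplified form: once a blob owns its buffer, any remaining manual free of that buffer releases the memory a second time. The classes `TrackingAllocator` and `OwningBlob` below are hypothetical stand-ins written for this illustration; they are not the MLIR or onnx-mlir types and only model the ownership rule.

```python
class TrackingAllocator:
    """Toy allocator that records live handles and detects double frees."""

    def __init__(self):
        self.live = set()
        self._next_handle = 0

    def allocate(self, size):
        handle = self._next_handle
        self._next_handle += 1
        self.live.add(handle)
        return handle, bytearray(size)

    def free(self, handle):
        if handle not in self.live:
            raise RuntimeError("double free or invalid handle")
        self.live.remove(handle)


class OwningBlob:
    """Blob that takes ownership of a buffer and frees it exactly once."""

    def __init__(self, allocator, handle, buffer):
        self._allocator = allocator
        self._handle = handle
        self._buffer = buffer

    def release(self):
        # Safe to call repeatedly: the buffer is freed only the first time.
        if self._buffer is not None:
            self._allocator.free(self._handle)
            self._buffer = None


allocator = TrackingAllocator()
handle, buffer = allocator.allocate(8)
blob = OwningBlob(allocator, handle, buffer)
blob.release()  # The blob cleans up; no manual free is needed.
```

If the caller also calls `allocator.free(handle)` before the blob is released, the blob's own cleanup then fails, which is the kind of conflict that removing the manual free avoids.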
test/mlir/accelerators/nnpa/transform/zhigh-constant-propagation-be/constprop.mlir
Signed-off-by: Stella Stamenova <[email protected]>
@tungld: I made a couple of updates. I'd like to check this in sooner rather than later: the longer we wait, the more likely it is that conflicting changes get committed, and I want to avoid that since this is not a trivial change. Can you have a look?
Signed-off-by: Stella Stamenova <[email protected]>
Signed-off-by: Stella Stamenova <[email protected]>
@tungld @AlexandreEichenberger Since we have fairly high confidence now that this will work on the NNPA machine, I think it makes sense to commit this sooner rather than later and if there are issues specifically on the NNPA machine, then someone who has access to run the tests can quickly iterate to fix the issues. I understand that we don't want any breaks in any part of the project if it can be avoided, but gating a change to |
LGTM. Thanks for the update!
Jenkins Linux s390x Build #7026 [push] Update the llvm commit t... started at 00:36 |
Jenkins Linux amd64 Build #7011 [push] Update the llvm commit t... started at 23:36 |
Jenkins Linux ppc64le Build #6119 [push] Update the llvm commit t... started at 00:37 |
Jenkins Linux amd64 Build #7011 [push] Update the llvm commit t... passed after 1 hr 55 min |
Jenkins Linux s390x Build #7026 [push] Update the llvm commit t... passed after 1 hr 59 min |
Jenkins Linux ppc64le Build #6119 [push] Update the llvm commit t... passed after 2 hr 9 min |
There are multiple impactful changes from mlir:
Of these, the last two are the most impactful and I'm still working through figuring out the correct SymbolTableCollection usage, but I wanted to get eyes on this earlier rather than later because of the OpaqueElementsAttr change. OpaqueElementsAttr, which is used both in KrnlToLLVM and the NNPA accelerator, was removed, and a new attribute that is similar but not exactly the same was added: DenseResourceElementsAttr. torch-mlir removed their usage of OpaqueElementsAttr in favor of SparseElementsAttr, but this does not seem appropriate here (llvm/torch-mlir@bb47c16), and the new DenseResourceElementsAttr seems like the right choice. I think DenseElementsAttr could probably also work. I'd love to get some feedback from @tungld, @AlexandreEichenberger and anyone else, since it is not a simple replacement.