-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AssignBufferAddresses] Add bank-aware scheme #1159
base: main
Are you sure you want to change the base?
Conversation
// Each entry of `nextAddrInBanks` is the next address available for use | ||
// in that bank, and the index is the bank number. | ||
int stackSize = 0; | ||
std::vector<int64_t> nextAddrInBanks; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::vector<int64_t> nextAddrInBanks; | |
SmallVector<int64_t> nextAddrInBanks; |
|
||
// Each entry of `bankLimits` contains pairs of start and end addresses for | ||
// that bank. | ||
std::vector<BankLimits> bankLimits; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::vector<BankLimits> bankLimits; | |
SmallVector<BankLimits> bankLimits; |
// Leave room at the bottom of the address range for stack. | ||
if (CoreOp core = tile.getCoreOp()) { | ||
stackSize = core.getStackSize(); | ||
nextAddrInBanks[0] += stackSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to return an error if stackSize > bankSize
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the stack size is usually small compared to the bank size.
// and then deallocates all the buffers. | ||
if (!setBufferAddress(buffer, numBanks, bankIndex, nextAddrInBanks, | ||
bankLimits)) { | ||
deAllocateBuffers(allocatedBuffers); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be necessary to keep deAllocateBuffers
here? From what I see, mlir-aie
includes it primarily for this fallback mode (Xilinx/mlir-aie#1625), which we aren’t planning to support?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a fallback mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative design that might be
- Try bank aware allocation.
- If bank aware allocation failed, emit warning and use linear allocation.
Where in bank aware allocation, automatically fail if
- any buffer has a prespecified address
- any buffer has a prespecified memory bank
- any buffer is allocated during the algorithm to more than 1 bank (i.e. overflows).
My thinking is that
- if a user prespecifies the address/bank if a buffer, they probably don't need/want us to do anything more than just place the remaining buffers as simply as possible.
- the above will hopefully simplify the logic a bit
- the above will result in compilation failure less frequently
// the list of buffers without addresses, to be completed later on). | ||
FailureOr<bool> checkAndAddBufferWithAddress( | ||
BufferOp buffer, int numBanks, std::vector<int64_t> &nextAddrInBanks, | ||
std::vector<BankLimits> &bankLimits) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::vector<BankLimits> &bankLimits) { | |
const std::vector<BankLimits> &bankLimits) { |
int addr = addrAttr.getInt(); | ||
for (int i = 0; i < numBanks; i++) { | ||
// If the address is not within the bank, continue. | ||
if (addr < bankLimits[i].startAddr || addr >= bankLimits[i].endAddr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: you can get the (first) bank that the address is on in O(1) instead of O(numBanks).
// addresses, to be completed later on). | ||
FailureOr<bool> checkAndAddBufferWithMemBank( | ||
BufferOp buffer, int numBanks, std::vector<int64_t> &nextAddrInBanks, | ||
std::vector<BankLimits> &bankLimits) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::vector<BankLimits> &bankLimits) { | |
const std::vector<BankLimits> &bankLimits) { |
// from the given index to try and find a bank with enough space. If it does, | ||
// it will set the buffer's address and mem_bank attributes and update the | ||
// nextAddrInBanks vector. If it does not find one with enough space, it will | ||
// throw an error. Returns true if the buffer was successfully allocated, false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// throw an error. Returns true if the buffer was successfully allocated, false | |
// emit an error. Returns true if the buffer was successfully allocated, false |
@@ -536,6 +536,10 @@ struct AMDAIEDeviceModel { | |||
|
|||
FailureOr<std::array<uint32_t, 3>> getAIEMatmulInstructionSize( | |||
Type elTypeLhs, Type elTypeRhs, Type elTypeAcc) const; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where did you find these values?
|
||
// Each entry of `bankLimits` contains pairs of start and end addresses for | ||
// that bank. | ||
std::vector<BankLimits> bankLimits; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neater IMO would be to have a constructor for the BankLimits class,
BankLimits(int numBanks, int bankSize);
int numBanks = deviceModel.getNumBanks(tile.getCol(), tile.getRow()); | ||
int bankSize = maxDataMemorySize / numBanks; | ||
|
||
// Each entry of `nextAddrInBanks` is the next address available for use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this code duplicated with the linear algorithm?
The scheme mostly adheres to the implementation in MLIR-AIE.