Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bounds checking fails to compile in 16-bit mode (MSP430) #2520

Open
luismarques opened this issue Jan 20, 2018 · 7 comments
Open

Bounds checking fails to compile in 16-bit mode (MSP430) #2520

luismarques opened this issue Jan 20, 2018 · 7 comments

Comments

@luismarques
Copy link
Contributor

The following code, when compiled with bounds checking on, fails to compile for -mtriple=msp430 :

void test()
{
    int[1] array;
    int k = 0;
    auto value = array[k];
}
Assertion failed: (getOperand(0)->getType() == getOperand(1)->getType() && "Both operands to ICmp instruction are not of the same type!"), function AssertOK, file llvm/include/llvm/IR/Instructions.h, line 1113.
0  ldc2                     0x0000000108cc4f8c llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 60
1  ldc2                     0x0000000108cc5559 PrintStackTraceSignalHandler(void*) + 25
2  ldc2                     0x0000000108cc0ee9 llvm::sys::RunSignalHandlers() + 425
3  ldc2                     0x0000000108cc58d2 SignalHandler(int) + 354
4  libsystem_platform.dylib 0x00007fff4fffcf5a _sigtramp + 26
5  libsystem_platform.dylib 0x00007fe07ed13948 _sigtramp + 785476104
6  libsystem_c.dylib        0x00007fff4fe27312 abort + 127
7  libsystem_c.dylib        0x00007fff4fdef368 basename_r + 0
8  ldc2                     0x000000010615ca13 llvm::ICmpInst::AssertOK() + 243
9  ldc2                     0x000000010615c90e llvm::ICmpInst::ICmpInst(llvm::CmpInst::Predicate, llvm::Value*, llvm::Value*, llvm::Twine const&) + 142
10 ldc2                     0x000000010615c873 llvm::ICmpInst::ICmpInst(llvm::CmpInst::Predicate, llvm::Value*, llvm::Value*, llvm::Twine const&) + 51
11 ldc2                     0x000000010615352d llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateICmp(llvm::CmpInst::Predicate, llvm::Value*, llvm::Value*, llvm::Twine const&) + 221
12 ldc2                     0x00000001060845eb DtoIndexBoundsCheck(Loc&, DValue*, DValue*) + 347
13 ldc2                     0x00000001061f7d93 ToElemVisitor::visit(IndexExp*) + 1571
14 ldc2                     0x00000001061eda26 toElem(Expression*) + 70
15 ldc2                     0x00000001061f8d69 ToElemVisitor::visit(AssignExp*) + 1449
16 ldc2                     0x00000001061eda26 toElem(Expression*) + 70
17 ldc2                     0x000000010614f77f DtoVarDeclaration(VarDeclaration*) + 1231
18 ldc2                     0x000000010614fbaa DtoDeclarationExp(Dsymbol*) + 298
19 ldc2                     0x00000001061f36b3 ToElemVisitor::visit(DeclarationExp*) + 211
20 ldc2                     0x00000001061edd69 toElemDtor(Expression*) + 73
21 ldc2                     0x00000001061c2c6a ToIRVisitor::visit(ExpStatement*) + 282
22 ldc2                     0x00000001061c2d93 ToIRVisitor::visit(CompoundStatement*) + 227
23 ldc2                     0x00000001061c2d93 ToIRVisitor::visit(CompoundStatement*) + 227
24 ldc2                     0x00000001061c2a92 Statement_toIR(Statement*, IRState*) + 66
25 ldc2                     0x00000001061341b3 DtoDefineFunction(FuncDeclaration*, bool) + 7891
26 ldc2                     0x00000001060e57f5 CodegenVisitor::visit(FuncDeclaration*) + 53
27 ldc2                     0x00000001060e301b CodegenVisitor::visit(AttribDeclaration*) + 155
28 ldc2                     0x00000001060e2e92 Declaration_codegen(Dsymbol*, IRState*) + 66
29 ldc2                     0x00000001060e2e3f Declaration_codegen(Dsymbol*) + 31
30 ldc2                     0x0000000106168f6e codegenModule(IRState*, Module*) + 1374
31 ldc2                     0x00000001062a8b2c ldc::CodeGenerator::emit(Module*) + 300
32 ldc2                     0x00000001062fd8ce codegenModules(Array<Module*>&) + 574
33 ldc2                     0x0000000105f54553 mars_mainBody(Array<char const*>&, Array<char const*>&) + 6083

I looked around. I guess the problem will be here:

  llvm::ICmpInst::Predicate cmpop = llvm::ICmpInst::ICMP_ULT;
  llvm::Value *cond = gIR->ir->CreateICmp(cmpop, DtoRVal(index),
                                          DtoArrayLen(arr), "bounds.cmp");

DtoArrayLen eventually leads to the changes I made to size_t in LDC for the MSP430, so I'm leaning towards the issue being with index not being of the proper type, but I'm a bit lost in the weeds here, so any help would be appreciated.

@kinke
Copy link
Member

kinke commented Jan 20, 2018

DtoArrayLen eventually leads to the changes I made to size_t in LDC for the MSP430, so I'm leaning towards the issue being with index not being of the proper type

Sounds reasonable. With a debugger, try outputting index->type->toChars(), that's the D type, and index->val->dump(), that's the IR value (probably a pointer, i.e., the k lvalue). As the D variable's type is int and I guess that's still 32-bits wide, you may have to truncate it (gIR->ir->CreateTrunc()).

@dnadlinger
Copy link
Member

Just a debugging trick which might or might not be applicable here: Did you try dump()-ing the LLVM module in the debugger when the assertion is hit? You will see the already emitted code, which might be helpful in wrapping your head about what is going on.

@luismarques
Copy link
Contributor Author

@klickverbot No, I didn't try it. My knowledge of LLVM is somewhat limited ATM, so those kind of suggestions help. I'll try that and @kinke's suggestion, and get back to you as soon as I find the time.

@luismarques
Copy link
Contributor Author

One interesting thing is that the bug happens even when k is a ushort (i.e. size_t for the MSP430). Another interesting thing is that D doesn't seem to care for that and extends it to a 32-bit integer for the array indexing:

(lldb) p index->val->dump()
  %11 = zext i16 %10 to i32

I don't understand why calling toChars() in the index->type doesn't result in a string though:

(lldb) p index->type->toChars()
(const char *) $0 = 0x0000000000000001 <no value available>

@kinke
Copy link
Member

kinke commented Jan 20, 2018

I don't understand why calling toChars() in the index->type doesn't result in a string though

Hmm, that's D code, maybe give gdb a shot?

Another useful debugging tool is running ldc with the -vv switch. Lots of output, of special interest are the last lines before hitting the assertion.

@luismarques
Copy link
Contributor Author

Using CreateTrunc does indeed fix this issue. It also seems to be a no-op if the type is already correct, so I guess I don't need to do it conditionally. I think ideally the index would reach us with the correct type (without the zero extension) at the place the bounds checking is requested, as that would avoid creating busy work for the optimizer (16-bit size_t -> zext i32 -> 16-bit size_t), but I'm guessing that's related to the general desire that D has to always extend types to int, and so it would be difficult to correct, as that logic would be completely decoupled from the array indexing and bound checking.

I'll submit a pull request using CreateTrunc ok?

@dukc
Copy link
Contributor

dukc commented Jun 20, 2023

I'm currently playing with LDC on AVR. Bumped into this issue and did a bit of experimentation, gaining some insight.

Since D is designed for 32-bit architectures and up, it'd probably be painful to have size_t and ptrdiff_t be ushort and short respectively. Whether consciously or by accident, currently they are made uint/ints for achitectures with 16-bit pointers. PointerType.sizeof is still 2.

This is not by the book, since size_t and ptrdiff_t are supposed to be of same size as the pointers, but I'm not necessarily judging it. It's probably the best tradeoff since D code rarely expects to deal with array.length or ptrA - ptrB returning 16-bit integers, that then would have to be cast back at almost every corner.

So all fine so far. Where this falls flat is that the frontend and backend are in disagreement about the type of the integer. If you execute arr.length < 5, you will get a backend error since backend considers arr.length 16 bits wide, the frontend thinks it's already 32 bits and thus needs no cast to 32 bits. You can work around this by casting arr.length to ushort before doing anything else with it.

Thus the fix would be that when a pointer is used as a integer, backend code should automatically be generated to widen the integer to 32 bits at the backend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants