Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TIR] Upstream Sync2 #40

Closed
tqchen opened this issue Feb 29, 2020 · 17 comments
Closed

[TIR] Upstream Sync2 #40

tqchen opened this issue Feb 29, 2020 · 17 comments

Comments

@tqchen
Copy link
Contributor

tqchen commented Feb 29, 2020

The python side has been updated to reflect the final form of the namespace organization

Update Guide apache/tvm#4647

cc @Hzfengsy

@Hzfengsy
Copy link
Member

Hzfengsy commented Mar 1, 2020

During rebase, I find a very strange problem. This problem happens when we construct a new tir object(e.g. Block, BlockRealize) in python. In C++, the constructor will create a new object and everything is OK. In python, however, once finish the make, the object itself will be deleted. The pointer (python object) will be a wild pointer.

Another weird thing is that the problem only happens in Ubuntu 16.04. On macOS and Ubuntu 18.04, everything is OK. No matter the GCC and Python version, that issue happens every time on 16.04. And the bug has existed for a long time (from the beginning of tir). It did not be discovered only because we always use macOS or Ubuntu 18.04 to code.

Do you have any idea about this? @tqchen

@spectrometerHBH
Copy link
Collaborator

I've done several experiments on the wild pointer issue. I use the same GCC and Python version on 16.04 as what I use on 18.04. The problem still occurs.
I tried to use those DLLs generated under 16.04 on 18.04, no problem occurs. But DLLs generated under 18.04 will just not work on 16.04 since the Glibc version is older.

Specifically, see https://github.com/merrymercy/tvm-tensorir/blob/dev/src/tir/api.cc

TVM_REGISTER_GLOBAL("make.BlockRealize")
.set_body_typed<BlockRealize(Array<PrimExpr>, PrimExpr, Block)>(
    [](Array<PrimExpr> values, PrimExpr predicate, Block block) {
      if (!predicate.dtype().is_bool()) {
        // To support python ir_builder
        CHECK(is_one(predicate));
        predicate = IntImm(DataType::Bool(), 1);
      }
      return BlockRealize(std::move(values), std::move(predicate), std::move(block));
    });

After return, the newly created BlockRealize object will be deleted.
Moreover, the same problem occurs for early commits, when calling make.Loop.

I found a way to fix this, but I don't know why it works. I change make.BlockReailze as below

Stmt BlockRealizeNode::make(Array<PrimExpr> values, PrimExpr predicate, Block block) {
  CHECK_EQ(block->iter_vars.size(), values.size());
  ObjectPtr<BlockRealizeNode> node = make_object<BlockRealizeNode>();
  node->binding_values = std::move(values);
  if (!predicate.dtype().is_bool()) {
    // To support python ir_builder
    CHECK(is_one(predicate));
    predicate = IntImm(DataType::Bool(), 1);
  }
  node->predicate = std::move(predicate);
  node->block = std::move(block);
  return Stmt(node);
}

TVM_REGISTER_GLOBAL("make.BlockRealize")
.set_body_typed(BlockRealizeNode::make);

which is the same style as the original nodes on AST. And now it's all ok.
I guess it is related to the mechanism of ffi and the version of Glibc, but I'm not familiar with these.

@tqchen
Copy link
Contributor Author

tqchen commented Mar 1, 2020

It would be great if we can create a minimum repro, is it only for the new tensor ir objects, or it also happens for upstream if we use a similar style(eg pass things via constructor).

Would be interesting to dig into the issue a bit, if we do gdb or valgrind, or try to see exactly at which pt the deallocate get called.

@tqchen
Copy link
Contributor Author

tqchen commented Mar 1, 2020

If you can get a clean repro— eg create a dummy object that is similar and registration function, and call via py. Which version of gcc. I can take a look as wel

@tqchen
Copy link
Contributor Author

tqchen commented Mar 1, 2020

From what we I can see the problem might occurs in set_body_typed. But am not exactly sure where

@tqchen
Copy link
Contributor Author

tqchen commented Mar 1, 2020

OK, here is my guess, there might be something wrong with the type deduction -

Add ->Stmt hint to the function signature

This will make the function signature the same as the change @spectrometerHBH

TVM_REGISTER_GLOBAL("make.BlockRealize")
.set_body_typed<BlockRealize(Array<PrimExpr>, PrimExpr, Block)>(
    [](Array<PrimExpr> values, PrimExpr predicate, Block block) -> Stmt {
      if (!predicate.dtype().is_bool()) {
        // To support python ir_builder
        CHECK(is_one(predicate));
        predicate = IntImm(DataType::Bool(), 1);
      }
      return BlockRealize(std::move(values), std::move(predicate), std::move(block));
    });

Change ->Stmt to ->BlockRealize

@spectrometerHBH
Copy link
Collaborator

spectrometerHBH commented Mar 2, 2020

I change the signature to @tqchen

TVM_REGISTER_GLOBAL("make.BlockRealize")
.set_body_typed<Stmt(Array<PrimExpr>, PrimExpr, Block)>(
    [](Array<PrimExpr> values, PrimExpr predicate, Block block) -> Stmt {
      if (!predicate.dtype().is_bool()) {
        // To support python ir_builder
        CHECK(is_one(predicate));
        predicate = IntImm(DataType::Bool(), 1);
      }
      return BlockRealize(std::move(values), std::move(predicate), std::move(block));
    });

but it doesn't work.

@spectrometerHBH
Copy link
Collaborator

spectrometerHBH commented Mar 2, 2020

Post an experiment result here.
I modified

template<typename R, int index, typename F>
struct unpack_call_dispatcher<R, 0, index, F> {
  template<typename ...Args>
  static void run(const F& f,
                  const TVMArgs& args_pack,
                  TVMRetValue* rv,
                  Args&&... unpacked_args) {
    std::cout << "?" << std::endl; // here
    *rv = R(f(std::forward<Args>(unpacked_args)...));
    std::cout << "??" << std::endl; // and here
  }
};

and

inline void Object::IncRef() {
  std::cout << "IncRef " << GetTypeKey() << std::endl; // here
  ref_counter_.fetch_add(1, std::memory_order_relaxed);
}

inline void Object::DecRef() {
  if (ref_counter_.fetch_sub(1, std::memory_order_release) == 1) {
    std::cout << "Delete " << GetTypeKey() << std::endl; // here
    std::atomic_thread_fence(std::memory_order_acquire);
    if (this->deleter_ != nullptr) {
      (*this->deleter_)(this);
    }
  } else {
    std::cout << "DecRef " << GetTypeKey() << std::endl; // here
  }
}

and

TVM_REGISTER_GLOBAL("make.BlockRealize")
.set_body_typed<BlockRealize(Array<PrimExpr>, PrimExpr, Block)>(
    [](Array<PrimExpr> values, PrimExpr predicate, Block block) {
      if (!predicate.dtype().is_bool()) {
        // To support python ir_builder
        CHECK(is_one(predicate));
        predicate = IntImm(DataType::Bool(), 1);
      }
      return BlockRealize(std::move(values), std::move(predicate), std::move(block));
    });

BlockRealize::BlockRealize(Array<PrimExpr> values, PrimExpr predicate, Block block) {
  CHECK_EQ(block->iter_vars.size(), values.size());
  ObjectPtr<BlockRealizeNode> node = make_object<BlockRealizeNode>();
  node->binding_values = std::move(values);
  node->predicate = std::move(predicate);
  node->block = std::move(block);
  data_ = std::move(node);
  std::cout << "OK" << std::endl;
}

Here's the test.py

import tvm
from tvm import tir

target = tvm.make.Block(
    [], [], [], tvm.make.SeqStmt([]),
    [], [], 'target')

br = tvm.make.BlockRealize([], 1, target)

print(br)

The last few lines of the running result of test.py are

? // enter
IncRef Block
IncRef Block
IncRef Block
IncRef Block
DecRef Block
DecRef Block
DecRef Block
IncRef IntImm
IncRef Array
IncRef Array
IncRef Array
DecRef Array
DecRef Array
IncRef IntImm
Delete IntImm
IncRef Array
IncRef BlockRealize
Delete Array
OK // end of constructor
DecRef Array
Delete BlockRealize // delete here
DecRef Block
Delete IntImm
Delete Array
Delete Block
Delete Array
Delete Array
Delete SeqStmt
Delete Array
Delete Array
Delete Array
Delete Array
?? // exit

@spectrometerHBH
Copy link
Collaborator

spectrometerHBH commented Mar 3, 2020

I've created a repo for debugging use, and I have invited you @tqchen @Hzfengsy. https://github.com/spectrometerHBH/tvm-tir-debug

I've removed the whole tir directory under python, now tir part only contains definitions, make and functors of tir nodes.

I'll try to remove more in the following commits.

@tqchen
Copy link
Contributor Author

tqchen commented Mar 3, 2020

Good job confirming the details. There are still ways to dig deeper to see where the Delete BlockRealize happens in the line of *rv = R(f(...)))

If the error is reproducible, i think we should be able to get to the truth

@spectrometerHBH
Copy link
Collaborator

Some more experiment results:

1. BlockRealize contains an Stmt.

Correct

Stmt br_make(Stmt block) {
  ObjectPtr<BlockRealizeNode> node = make_object<BlockRealizeNode>();
  node->block = block;
  return Stmt(node);
}

TVM_REGISTER_GLOBAL("make.BlockRealize")
.set_body_typed(br_make);

Wrong

TVM_REGISTER_GLOBAL("make.BlockRealize")
.set_body_typed<Stmt(Stmt)>(
    [](Stmt block) -> Stmt {
      ObjectPtr<BlockRealizeNode> node = make_object<BlockRealizeNode>();
      node->block = std::move(block);
      return Stmt(node);
    });

2. If BlockRealize is empty then

Correct

Stmt br_make() {
  ObjectPtr<BlockRealizeNode> node = make_object<BlockRealizeNode>();
  return Stmt(node);
}

TVM_REGISTER_GLOBAL("make.BlockRealize")
.set_body_typed(br_make);

Correct

TVM_REGISTER_GLOBAL("make.BlockRealize")
.set_body_typed<Stmt()>(
    []() -> Stmt {
      ObjectPtr<BlockRealizeNode> node = make_object<BlockRealizeNode>();
      return Stmt(node);
    });

@Hzfengsy
Copy link
Member

Hzfengsy commented Mar 3, 2020

I change .set_body_typed to .set_body, everything is OK.

TVM_REGISTER_GLOBAL("tir.BlockRealize")
.set_body([](TVMArgs args, TVMRetValue* rv) {
  Array<PrimExpr> values = args[0];
  PrimExpr predicate = args[1];
  Block block = args[2];
  if (!predicate.dtype().is_bool()) {
    // To support python ir_builder
    CHECK(is_one(predicate));
    predicate = IntImm(DataType::Bool(), 1);
  }
  *rv = BlockRealize(values, predicate, block);
});

@spectrometerHBH
Copy link
Collaborator

spectrometerHBH commented Mar 3, 2020

Previously my gcc version=7.4.0, but my g++ version=5.4.0, after I change g++=7.4.0. Everything is OK.
It seems to be an internal g++ compiler bug related to lambda expressions.

Upd: siyuan tried clang=9. It also works.

@tqchen
Copy link
Contributor Author

tqchen commented Mar 3, 2020

Ok, then it could due a mismatch of compiler and libcxx

@Hzfengsy
Copy link
Member

Hzfengsy commented Mar 4, 2020

synced @spectrometerHBH @tqchen

@tqchen tqchen closed this as completed Apr 18, 2020
@tqchen tqchen reopened this May 5, 2020
@tqchen
Copy link
Contributor Author

tqchen commented May 5, 2020

FYI @spectrometerHBH @Hzfengsy I went back to look at the issue of BlockRealize and find the cause.

The main reason was due to we are passing the FLambda signature to the set body typed. In particular, if we change the following code

.set_body_typed<BlockRealize(Array<PrimExpr>, PrimExpr, Block)>(
    [](Array<PrimExpr> values, PrimExpr predicate, Block block) {
   ..
});

to the following one

.set_body_typed(
    [](Array<PrimExpr> values, PrimExpr predicate, Block block) {
   ..
});

The then segfault disappears. Here is my guess about what was happening: In the new code(without the FLambda signature), the closure is directly passed into set_body_typed and being captured. The compiler knows that we need to keep the function align.

On the version with Flambda though, the closure is first converted to type BlockRealize(Array<PrimExpr>, PrimExpr, Block), which perhaps should refers to a function pointer? And somehow compiler optimizes the original body of the lambda away(thinking that it is not needed anywhere).

@tqchen
Copy link
Contributor Author

tqchen commented May 5, 2020

The take away is that we should not pass type signatures to the argument of the set_body_typed

@tqchen tqchen closed this as completed Jun 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants