[test] Enable cfg_optimization in all tests #2106

Merged (5 commits) on Dec 21, 2020

Conversation

@xumingkuan (Contributor) commented Dec 20, 2020

Related issue = #1905

All the tests passed on my end. I'm not sure in what way the control-flow graph optimizations would be incompatible with the new type system.

BTW, in compile_to_offloads.cpp, this part may be redundant and may not perform the check it intends to:

if (config.cfg_optimization) {
  irpass::cfg_optimization(ir, false);
  print("Optimized by CFG");
  irpass::analysis::verify(ir);
}

-- this is because cfg_optimization is already called inside full_simplify without checking the compile config here:

if ((first_iteration || modified) &&
    cfg_optimization(root, after_lower_access))
  modified = true;
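
For reference, here is a minimal sketch of what the flag is supposed to control from the Python side (hypothetical usage; it assumes that CompileConfig fields such as cfg_optimization are accepted as ti.init keyword arguments and that debug=True enables in-kernel asserts). With the unguarded call inside full_simplify, the CFG pass still runs even when the flag is turned off:

import taichi as ti

# Hypothetical sketch: cfg_optimization is a CompileConfig field, assumed to be
# settable through ti.init; debug=True enables the in-kernel assert.
ti.init(arch=ti.cpu, debug=True, cfg_optimization=False)

x = ti.field(dtype=ti.i32, shape=())

@ti.kernel
def store_and_check(data: ti.i32):
    x[None] = data
    assert x[None] == data

# Even with cfg_optimization=False, the unguarded call inside full_simplify
# still runs the CFG pass on this kernel, so the flag has no real effect.
store_and_check(42)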

@xumingkuan marked this pull request as ready for review on December 20, 2020 12:30
@yuanming-hu (Member) left a comment


Interesting... Thanks for investigating this :-) Maybe a recent change somewhere in the type system fixes the CFG optimizations. Btw, could you add a check of CompileConfig::cfg_optimization in full_simplify, to fix the behavior of that boolean flag? Feel free to merge after the flag is fixed.

@Hanke98 (Contributor) left a comment


Great! It seems that skipping the cfg_optimization pass is not necessary now. I added this flag when I was not very familiar with the Taichi compilation system. At that time, I tried to set and verify data in a single kernel like:

cit = ti.type_factory_.get_custom_int_type(16, True)
x = ti.field(dtype=cit)
ti.root._bit_struct(32).place(x)

@ti.kernel
def test_custom_int_type(data: ti.i32):
    x[None] = data
    assert x[None] == data

And I found that after the cfg_optimization pass, the assignment statement would always be skipped. So, to make sure the assignment was executed, I turned off this optimization. Now almost all of our test cases do the setting and the verifying in two separate kernels, which is a better way, so it is no longer necessary to keep the cfg_optimization flag false.
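
A minimal sketch of that two-kernel pattern (hypothetical test code, reusing the custom-int-type setup above and assuming debug=True so the kernel-side assert is actually checked):

import taichi as ti

ti.init(arch=ti.cpu, debug=True)  # debug=True so the in-kernel assert is enforced

cit = ti.type_factory_.get_custom_int_type(16, True)
x = ti.field(dtype=cit)
ti.root._bit_struct(32).place(x)

@ti.kernel
def set_value(data: ti.i32):
    x[None] = data

@ti.kernel
def verify_value(data: ti.i32):
    assert x[None] == data

# The store and the check live in separate kernels, so store-to-load forwarding
# within a single kernel cannot hide a broken store.
set_value(3)
verify_value(3)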

Thank you so much for investigating this!

@xumingkuan (Contributor, Author) commented Dec 21, 2020

Great! It seems that skipping the cfg_optimization pass is not necessary now. [...] Thank you so much for investigating this!

I see! For this kernel:

@ti.kernel
def test_custom_int_type(data: ti.i32):
    x[None] = data
    assert x[None] == data

the IR is:

    <i32> $1 = arg[0]
    <*gen> $2 = get root
    <i32> $3 = const [0]
    <*gen> $4 = [S0root][root]::lookup($2, $3) activate = false
    <*bs(ci16@0)> $5 = get child [S0root->S1bit_struct<bs(ci16@0)>] $4
    <*gen> $6 = [S1bit_struct<bs(ci16@0)>][bit_struct]::lookup($5, $3) activate = false
    <^ci16> $7 = get child [S1bit_struct<bs(ci16@0)>->S2place<ci16><bit>] $6
    $8 : global store [$7 <- $1]
    <i32> $9 = const [1]
    <i32> $10 = cmp_eq $1 $1 # (we can further optimize this in alg_simp)
    <i32> $11 = bit_and $10 $9
    $12 : assert $11, "(x[None] == data)"

When cfg_optimization=False, the IR becomes:

    <i32> $1 = arg[0]
    <*gen> $2 = get root
    <i32> $3 = const [0]
    <*gen> $4 = [S0root][root]::lookup($2, $3) activate = false
    <*bs(ci16@0)> $5 = get child [S0root->S1bit_struct<bs(ci16@0)>] $4
    <*gen> $6 = [S1bit_struct<bs(ci16@0)>][bit_struct]::lookup($5, $3) activate = false
    <^ci16> $7 = get child [S1bit_struct<bs(ci16@0)>->S2place<ci16><bit>] $6
    $8 : global store [$7 <- $1]
    <i32> $9 = const [1]
    <i32> $10 = global load $7
    <i32> $11 = cmp_eq $10 $1 # <------ not optimized!
    <i32> $12 = bit_and $11 $9
    $13 : assert $12, "(x[None] == data)"

So IIUC cfg_optimization would not make anything go wrong, just make it harder to test?
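
If it helps, here is a quick sanity check of that claim (a hypothetical script; it assumes cfg_optimization can be toggled through ti.init and only verifies runtime behavior, not the generated IR):

import taichi as ti

def run_once(cfg_opt: bool):
    # Re-initialize Taichi with the flag toggled (assumed to be a ti.init kwarg).
    ti.init(arch=ti.cpu, debug=True, cfg_optimization=cfg_opt)
    x = ti.field(dtype=ti.i32, shape=())

    @ti.kernel
    def store_and_check(data: ti.i32):
        x[None] = data
        assert x[None] == data

    store_and_check(7)

# Both runs pass; only the generated IR differs (forwarded comparison vs. a real load).
run_once(True)
run_once(False)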

@Hanke98 (Contributor) commented Dec 21, 2020

So IIUC cfg_optimization would not make anything go wrong, just make it harder to test?

Yes, you are right; that is the exact reason why I turned this optimization pass off. Thanks for the very detailed IR analysis!

@k-ye k-ye mentioned this pull request Jan 5, 2021