[opengl] Randomly breaking down mpm128.py #633
Comments
Surprisingly, I can't reproduce this bug now.
Maybe you should move
I built the OpenGL backend on my end and ran into this issue on mpm99.py. I also tested mpm128 and got a similar issue. Do you have any idea what is going on? :-)
python mpm99.py
[Taichi] mode=development
[Taichi] preparing sandbox at /tmp/taichi-_zvu7wvc
[Taichi] sandbox prepared
[Taichi] version 0.5.8, cuda 10.0, commit a66eba07, python 3.6.9
[I 03/26/20 20:33:30.700] [program.cpp:materialize_layout@255] OpenGL root buffer size: 1114112 B
[W 03/26/20 20:33:30.700] [opengl_api.cpp:initialize_opengl@194] OpenGL backend currently WIP, MAY NOT WORK
[I 03/26/20 20:33:30.869] [opengl_api.cpp:initialize_opengl@223] [glsl] OpenGL 4.3.0 NVIDIA 430.26
[E 03/26/20 20:33:30.893] [opengl_api.cpp:compile@62] [glsl] error while compiling shader:
1 #version 430 core
2 precision highp float;
3 #define S25 const int // place float
4 #define S25_stride 4 // sizeof(float)
5 #define S24_ch const int
6 #define S24_get0(a_) (a_) // S25
7 #define S24_ch_stride (S25_stride)
8 #define S24 const int // dense
9 #define S24_n 16384
10 #define S24_stride (S24_ch_stride * S24_n)
11 #define S24_children(a_, i) ((a_) + S24_ch_stride * (i))
12 #define S23 const int // place float
13 #define S23_stride 4 // sizeof(float)
14 #define S22 const int // place float
15 #define S22_stride 4 // sizeof(float)
16 #define S21_ch const int
17 #define S21_get0(a_) (a_) // S22
18 #define S21_get1(a_) ((a_) + (S22_stride)) // S23
19 #define S21_ch_stride (S22_stride + S23_stride)
20 #define S21 const int // dense
21 #define S21_n 16384
22 #define S21_stride (S21_ch_stride * S21_n)
23 #define S21_children(a_, i) ((a_) + S21_ch_stride * (i))
24 #define S20 const int // place float
25 #define S20_stride 4 // sizeof(float)
26 #define S19_ch const int
27 #define S19_get0(a_) (a_) // S20
28 #define S19_ch_stride (S20_stride)
29 #define S19 const int // dense
30 #define S19_n 16384
31 #define S19_stride (S19_ch_stride * S19_n)
32 #define S19_children(a_, i) ((a_) + S19_ch_stride * (i))
33 #define S18 const int // place int
34 #define S18_stride 4 // sizeof(int)
35 #define S17_ch const int
36 #define S17_get0(a_) (a_) // S18
37 #define S17_ch_stride (S18_stride)
38 #define S17 const int // dense
39 #define S17_n 16384
40 #define S17_stride (S17_ch_stride * S17_n)
41 #define S17_children(a_, i) ((a_) + S17_ch_stride * (i))
42 #define S16 const int // place float
43 #define S16_stride 4 // sizeof(float)
44 #define S15 const int // place float
45 #define S15_stride 4 // sizeof(float)
46 #define S14 const int // place float
47 #define S14_stride 4 // sizeof(float)
48 #define S13 const int // place float
49 #define S13_stride 4 // sizeof(float)
50 #define S12_ch const int
51 #define S12_get0(a_) (a_) // S13
52 #define S12_get1(a_) ((a_) + (S13_stride)) // S14
53 #define S12_get2(a_) ((a_) + (S13_stride + S14_stride)) // S15
54 #define S12_get3(a_) ((a_) + (S13_stride + S14_stride + S15_stride)) // S16
55 #define S12_ch_stride (S13_stride + S14_stride + S15_stride + S16_stride)
56 #define S12 const int // dense
57 #define S12_n 16384
58 #define S12_stride (S12_ch_stride * S12_n)
59 #define S12_children(a_, i) ((a_) + S12_ch_stride * (i))
60 #define S11 const int // place float
61 #define S11_stride 4 // sizeof(float)
62 #define S10 const int // place float
63 #define S10_stride 4 // sizeof(float)
64 #define S9 const int // place float
65 #define S9_stride 4 // sizeof(float)
66 #define S8 const int // place float
67 #define S8_stride 4 // sizeof(float)
68 #define S7_ch const int
69 #define S7_get0(a_) (a_) // S8
70 #define S7_get1(a_) ((a_) + (S8_stride)) // S9
71 #define S7_get2(a_) ((a_) + (S8_stride + S9_stride)) // S10
72 #define S7_get3(a_) ((a_) + (S8_stride + S9_stride + S10_stride)) // S11
73 #define S7_ch_stride (S8_stride + S9_stride + S10_stride + S11_stride)
74 #define S7 const int // dense
75 #define S7_n 16384
76 #define S7_stride (S7_ch_stride * S7_n)
77 #define S7_children(a_, i) ((a_) + S7_ch_stride * (i))
78 #define S6 const int // place float
79 #define S6_stride 4 // sizeof(float)
80 #define S5 const int // place float
81 #define S5_stride 4 // sizeof(float)
82 #define S4_ch const int
83 #define S4_get0(a_) (a_) // S5
84 #define S4_get1(a_) ((a_) + (S5_stride)) // S6
85 #define S4_ch_stride (S5_stride + S6_stride)
86 #define S4 const int // dense
87 #define S4_n 16384
88 #define S4_stride (S4_ch_stride * S4_n)
89 #define S4_children(a_, i) ((a_) + S4_ch_stride * (i))
90 #define S3 const int // place float
91 #define S3_stride 4 // sizeof(float)
92 #define S2 const int // place float
93 #define S2_stride 4 // sizeof(float)
94 #define S1_ch const int
95 #define S1_get0(a_) (a_) // S2
96 #define S1_get1(a_) ((a_) + (S2_stride)) // S3
97 #define S1_ch_stride (S2_stride + S3_stride)
98 #define S1 const int // dense
99 #define S1_n 16384
100 #define S1_stride (S1_ch_stride * S1_n)
101 #define S1_children(a_, i) ((a_) + S1_ch_stride * (i))
102 #define S0_ch const int
103 #define S0_get0(a_) (a_) // S1
104 #define S0_get1(a_) ((a_) + (S1_stride)) // S4
105 #define S0_get2(a_) ((a_) + (S1_stride + S4_stride)) // S7
106 #define S0_get3(a_) ((a_) + (S1_stride + S4_stride + S7_stride)) // S12
107 #define S0_get4(a_) ((a_) + (S1_stride + S4_stride + S7_stride + S12_stride)) // S17
108 #define S0_get5(a_) ((a_) + (S1_stride + S4_stride + S7_stride + S12_stride + S17_stride)) // S19
109 #define S0_get6(a_) ((a_) + (S1_stride + S4_stride + S7_stride + S12_stride + S17_stride + S19_stride)) // S21
110 #define S0_get7(a_) ((a_) + (S1_stride + S4_stride + S7_stride + S12_stride + S17_stride + S19_stride + S21_stride)) // S24
111 #define S0_ch_stride (S1_stride + S4_stride + S7_stride + S12_stride + S17_stride + S19_stride + S21_stride + S24_stride)
112 #define S0 const int // root
113 #define S0_n 1
114 #define S0_stride (S0_ch_stride * S0_n)
115 #define S0_children(a_, i) ((a_) + S0_ch_stride * (i))
116
117 layout(std430, binding = 0) buffer data_i32 { int _data_i32_[]; };
118 layout(std430, binding = 0) buffer data_f32 { float _data_f32_[]; };
119 layout(std430, binding = 0) buffer data_f64 { double _data_f64_[]; };
120 #define _mem_i32(x) _data_i32_[(x) >> 2]
121 #define _mem_f32(x) _data_f32_[(x) >> 2]
122 #define _mem_f64(x) _data_f64_[(x) >> 3]
123 #define _Ax_(x) x
124 #define _At_(x) _Ax_(_at_##x(x))
125 uvec4 _rand_;
126
127 void _init_rand()
128 {
129 uint i = gl_GlobalInvocationID.x;
130 _rand_.x = 123456789 * i * 1000000007;
131 _rand_.y = 362436069;
132 _rand_.z = 521288629;
133 _rand_.w = 88675123;
134 }
135
136 uint _rand_u32()
137 {
138 uint t = _rand_.x ^ (_rand_.x << 11);
139 _rand_.xyz = _rand_.yzw;
140 _rand_.x = _rand_.y;
141 _rand_.y = _rand_.z;
142 _rand_.z = _rand_.w;
143 _rand_.w = (_rand_.w ^ (_rand_.w >> 19)) ^ (t ^ (t >> 8));
144 return _rand_.w * 1000000007;
145 }
146
147 float _rand_f32()
148 {
149 return float(_rand_u32()) * (1.0 / 4294967296.0);
150 }
151
152 double _rand_f64()
153 {
154 return double(_rand_f32());
155 }
156
157 int _rand_i32()
158 {
159 return int(_rand_u32());
160 }
161
162 void initialize_c6_00()
163 { // range for
164 // range known at compile time
165 const int _thread_id_ = int(gl_GlobalInvocationID.x);
166 if (_thread_id_ >= 9000) return;
167 const int _it_value_ = 0 + _thread_id_ * 1;
168 const float tmp5 = _rand_f32();
169 const float tmp6 = 0.2;
170 const float tmp7 = float(tmp5 * tmp6);
171 const float tmp8 = 0.3;
172 const float tmp9 = float(tmp7 + tmp8);
173 const int tmp10 = _it_value_;
174 const int tmp11 = 3000;
175 const int tmp12 = int(tmp10 * tmp11 >= 0 ? abs(tmp10) / abs(tmp11) : sign(tmp10) * (abs(tmp10) + abs(tmp11) - 1) / tmp11);
176 const float tmp13 = 0.1;
177 const float tmp14 = float(tmp12);
178 const float tmp15 = float(tmp14 * tmp13);
179 const float tmp16 = float(tmp9 + tmp15);
180 S0 tmp19 = 0;
181 const int tmp199 = 0;
182 S0_ch tmp21 = S0_children(tmp19, tmp199);
183 S1 tmp22 = S0_get0(tmp21);
184 const int tmp23 = (((0 + tmp10) >> 0) & ((1 << 14) - 1));
185 const int tmp201 = 1;
186 const int tmp202 = int(tmp23 * tmp201);
187 const int tmp203 = int(tmp199 + tmp202);
188 S1_ch tmp25 = S1_children(tmp22, tmp203);
189 S2 tmp26 = S1_get0(tmp25);
190 #define _at_tmp26 _mem_f32
191 _At_(tmp26) = tmp16;
192 const float tmp29 = _rand_f32();
193 const float tmp30 = float(tmp29 * tmp6);
194 const float tmp31 = 0.05;
195 const float tmp32 = float(tmp30 + tmp31);
196 const float tmp33 = 0.32;
197 const float tmp34 = float(tmp14 * tmp33);
198 const float tmp35 = float(tmp32 + tmp34);
199 S3 tmp45 = S1_get1(tmp25);
200 #define _at_tmp45 _mem_f32
201 _At_(tmp45) = tmp35;
202 S17 tmp53 = S0_get4(tmp21);
203 S17_ch tmp56 = S17_children(tmp53, tmp203);
204 S18 tmp57 = S17_get0(tmp56);
205 #define _at_tmp57 _mem_i32
206 _At_(tmp57) = tmp12;
207 const float tmp61 = 0.0;
208 S4 tmp66 = S0_get1(tmp21);
209 S4_ch tmp69 = S4_children(tmp66, tmp203);
210 S5 tmp70 = S4_get0(tmp69);
211 #define _at_tmp70 _mem_f32
212 _At_(tmp70) = tmp61;
213 S6 tmp82 = S4_get1(tmp69);
214 #define _at_tmp82 _mem_f32
215 _At_(tmp82) = tmp61;
216 const float tmp86 = 1.0;
217 S12 tmp91 = S0_get3(tmp21);
218 S12_ch tmp94 = S12_children(tmp91, tmp203);
219 S13 tmp95 = S12_get0(tmp94);
220 #define _at_tmp95 _mem_f32
221 _At_(tmp95) = tmp86;
222 S14 tmp107 = S12_get1(tmp94);
223 #define _at_tmp107 _mem_f32
224 _At_(tmp107) = tmp61;
225 S15 tmp119 = S12_get2(tmp94);
226 #define _at_tmp119 _mem_f32
227 _At_(tmp119) = tmp61;
228 S16 tmp131 = S12_get3(tmp94);
229 #define _at_tmp131 _mem_f32
230 _At_(tmp131) = tmp86;
231 S19 tmp139 = S0_get5(tmp21);
232 S19_ch tmp142 = S19_children(tmp139, tmp203);
233 S20 tmp143 = S19_get0(tmp142);
234 #define _at_tmp143 _mem_f32
235 _At_(tmp143) = tmp86;
236 }
237
238 void main()
239 {
240 _init_rand();
241 initialize_c6_00();
242 }
243 layout(local_size_x = 1792, local_size_y = 1, local_size_z = 1) in;
0(243) : error C7604: layout(layout_size_x = 1792) exceeds maximum value
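Not the same issue. This one is caused by a hardcoded magic number: the shader declares local_size_x = 1792, which exceeds the maximum workgroup size your driver allows. @archibate will find that.
Found: https://stackoverflow.com/questions/39004898/get-maximum-workgroup-size-for-compute-shaders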
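For illustration, here is a minimal sketch of how the workgroup size could be derived from the driver limit instead of the hardcoded 1792. This is not the actual Taichi code; the helper names (`choose_local_size_x`, `num_work_groups`, `layout_line`) are hypothetical. It assumes the limit has already been queried once at startup, for example with `glGetIntegerv(GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, &max_invocations)`, and is passed in as a plain integer so the snippet does not need a GL context.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper (not Taichi's actual code): pick a 1D workgroup size
// that never exceeds the driver-reported limit.
// `max_invocations` is assumed to come from a query such as
//   glGetIntegerv(GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, &max_invocations);
int choose_local_size_x(int num_threads, int max_invocations) {
  assert(num_threads > 0 && max_invocations > 0);
  return std::min(num_threads, max_invocations);
}

// Number of workgroups needed to cover `num_threads` (ceiling division).
// Surplus threads in the last group are expected to exit early, as the
// generated shader does with `if (_thread_id_ >= 9000) return;`.
int num_work_groups(int num_threads, int local_size_x) {
  assert(num_threads > 0 && local_size_x > 0);
  return (num_threads + local_size_x - 1) / local_size_x;
}

// Build the layout qualifier that replaces the hardcoded one in the shader.
std::string layout_line(int local_size_x) {
  return "layout(local_size_x = " + std::to_string(local_size_x) +
         ", local_size_y = 1, local_size_z = 1) in;";
}
```

With an assumed limit of 1024 invocations, the 9000-thread `initialize_c6_00` kernel above would get `local_size_x = 1024` and 9 workgroups. A real implementation would probably also check the per-dimension limit via `glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_SIZE, 0, &max_x)`.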
* use GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS instead of 1792 for portability
* modify mpm128.py to reproduce bug #633
* Update opengl_api.cpp
* misc
* gather #define _at_{}
* [skip ci] use ptr_signat
* no #define _At_ [skip ci] fix typo [skip ci] fix again
* attempt to fix opengl on test_loops
* [skip ci] really fix test_loops
* [skip ci] enable _GLSL_DEBUG & try improve used.atomic_float for all
* [skip ci] gtmp test
* [skip ci] fix calloc null when gtmp_size uninited
* no atan(double, double)
* [skip ci] better inform TI_ARCH
* [skip ci] Update misc/make_changelog.py
* hardcoded _GLSL_NVIDIA for built-in atomic float ops
* [skip ci] share work about stride_map_ [skip ci] really did stride_map_ test passing
* [skip ci] save my power to sleep
* [skip ci] fix const mutable by no const qua struct_compiled_
* [skip ci] also class_children_map_
* [skip ci] no use struct_compiled->source_code
* [skip ci] no macro for _earg_i32
* [skip ci] no macro like _arg_{}({})
* also make data/gtmp/extr no macroed
* use fancier short_name() to make NV GLSL compiler ridiculously faster
* no extra float(...) bracing BinaryOpStmt
* [skip ci] remove useless TI_INFO some
* auto detect GL_NV_shader_atomic_float
* [skip ci] fix typo in atomic sim
* apply reviews (thanks to @k-ye!)
* [skip ci] fix mpm88/99 bug (do we have better solution?)
* [skip ci] disable _GLSL_DEBUG
* guard short_name.cpp with TI_NAMESPACE_BEGIN/END
* [skip ci] use STR macro by k-ye for shader code
* [skip ci] enforce code format
* [skip ci] add clang-format off/on guard for STR
* [skip ci] enforce code format
* platform/opengl -> backends/opengl (like metal does)
* [skip ci] use opengl/shaders/*.glsl.h for STR(..)
* [skip ci] minor shader code adjustments

Co-authored-by: Yuanming Hu <[email protected]>
Co-authored-by: Taichi Gardener <[email protected]>