code with static arrays is very very slow to compile with the --release flag #2485
Comments
I can't reproduce this. It does take a long time to compile because of the big static array (LLVM is slow with big static arrays), but eventually it compiles and runs fine. When you say "FAIL", what do you mean? What happens? |
Could be an issue with running out of RAM or similar? Just thinking loud... |
You're right @asterite, it does work, but it's very, very slow. Is there a way to improve this behaviour? |
@alainravet Yes, use a regular array. |
@asterite I'm trying to optimize CPU intensive code (a chess engine). |
Slowness of non-static arrays (vectors) usually comes from resize operations. A significant speed-up should therefore be observable when vectors are created with a predefined size that won't change at runtime. It will not be as fast as a static array (with all optimizations applied), but it should be faster than blind usage of vectors. BTW, what does the benchmark of the current code with non-static arrays look like? Would it be possible to share the code (if it is open source), so that we can play around with it and come up with some optimizations? (I would be interested in that myself.) |
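A minimal sketch of the pre-sizing idea from the comment above, in Crystal. `Array(T).new(initial_capacity)` reserves room up front so that pushes within that capacity do not reallocate, and `Array(T).new(size, value)` creates an already-filled array of a given length. The constant `SQUARES` and the variable names are hypothetical and only serve the illustration; this is not code from the project being discussed.

```crystal
# Hypothetical size, used only for illustration.
SQUARES = 64

# Reserve capacity up front: the 64 pushes below stay within it,
# so the backing buffer does not have to grow.
moves = Array(Int32).new(SQUARES)
SQUARES.times { |i| moves << i }

# Or allocate a filled array once and reuse it afterwards.
scores = Array(Int32).new(SQUARES, 0)
scores[10] = 42

puts moves.size
puts scores[10]
```

Whether this closes the gap to a StaticArray depends on the workload, so it is worth benchmarking both variants.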
@waterlink I'm porting a C code project to Crystal and, at first, I refactored to OO as much as possible. The code is better structured and more readable and more ... , but it also creates tons of objects and is slow (relatively speaking). I am now unfactoring the expensive parts, using pools of arrays to reuse instead of creating new ones, unwrapping primitives (as we can't extend the Int* structs 😁 ), ... It looks like you can't have your cake and eat it after all.
It takes 2 minutes to explore 119_060_324 possible moves in 6 plies (Perft test). I'll share it next week, once it's cleaner/less messy. It could be useful to measure performance changes between Crystal versions. |
@alainravet I can't say how to optimize the program without profiling it, but another alternative to Array is Slice. The "problem" with Array is that if you do In any case, once you share your code I'm sure we can all optimize it together to be on the order of the C program. |
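As a rough illustration of the Slice alternative mentioned above: a `Slice` is a pointer plus a size, so it has a fixed length and bounds-checked access, and `Slice(T).new(size, value)` allocates its memory on the heap rather than embedding it in the value the way a StaticArray does. The name `table` and the size 4096 are assumptions made for this sketch.

```crystal
# Fixed-length, bounds-checked buffer allocated on the heap.
# `table` and the size 4096 are hypothetical, for illustration only.
table = Slice(Float32).new(4096, 0_f32)

table[0] = 1.5_f32
table[4095] = 2.5_f32

# Slice includes Enumerable, so the usual helpers are available.
total = table.sum

puts table.size
puts total
```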
There's nothing we can do here. LLVM is known to be slow with big static arrays. These should only be used in very few cases. Array and Slice are most appropriate in most cases. So I'm closing this. |
Edit: Much of the issue I've described below isn't necessarily related. However, this slow compilation is still 100% an issue. @asterite I just had a similar problem and spent a couple of hours trying to figure out the cause. The error messages were also unhelpful, which I'll describe below. I simply added a StaticArray(Float32, 4096) as an instance variable, causing my --release compile time to jump from ~5 seconds to ~4 minutes. I think it would be useful to add some more specifics regarding the use-cases of StaticArrays to the API docs. As for the error messages, I ctrl-c'd the
|
array = uninitialized Int32[4096]
4096.times do |i|
  array.to_unsafe[i] = 0
end
array
I have also found a work-around that compiles a lot faster:
struct StaticArray(T, N)
  def self.new_fast(& : Int32 -> T)
    array = uninitialized self
    buf = array.to_unsafe
    {% for i in 0...N %}
      buf[{{i.id}}] = yield {{i.id}}
    {% end %}
    array
  end
end |
So, we did some digging into this and apparently Crystal in In both cases, The first candidates I could find from a quick glance here seem to be crystal/src/compiler/crystal/compiler.cr Lines 572 to 573 in 68f0f38
So, if somebody can figure out what the heck is different in the optimizer configuration between If not, we should at least apply @j8r's workaround for now. |
@jhass you're just running I guess this has something to do with returning the array as value from the function. Note that this cannot be reproduced in C so probably clang never faced this issue. The workaround still returns the array as value, but for some reason that combination triggers some optimization edge case. I just tried replacing the |
Huh, that's confusing, why does |
I know, and for a second I thought we had an opportunity for a huge performance improvement. But the result is not the same. The optimizations they perform are different. I just tested with |
I see, well TIL. Oh dear LLVM, could be so easy... |
I have a very similar observation when trying to build simple test code with a relatively large array. I found that code as part of a benchmarking exercise; the test code has an array with 10,000 entries. I use Crystal 0.36.1 on Fedora. |
I'm also getting this using 1.0.0. I tried to create a StaticArray of String with this domain list (about 120,000 strings). The file looked like:
StaticArray[
  "...",
  "...",
  "and so on for another 120k rows",
]
After 23 minutes, my terminal died when my MBP hit 25GB usage on Crystal 😂
|
I, too, confirm that the issue still exists with Crystal 1.0.0. |
Maybe
|
No, because it's valid to have static arrays of ten thousand elements, for example as a buffer on the stack. The issue is initializing a large static array. But in general, I would advise simply not using StaticArray at all. Use Array. |
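The stack-buffer use case mentioned above can be sketched as follows. The buffer is declared with `uninitialized`, so no element-by-element initialization is written, and `StaticArray#to_slice` gives a bounds-checked view onto the same stack memory. The names `buffer`, `bytes` and `message` are hypothetical, and this sketch makes no claim about the resulting compile times.

```crystal
# Reserve 10_000 bytes on the stack without initializing them.
buffer = uninitialized StaticArray(UInt8, 10_000)

# View the stack memory as a Slice; no copy is made.
bytes = buffer.to_slice

# Write only the part that is actually needed.
message = "hello"
message.to_slice.copy_to(bytes)

# Read back just the bytes that were written.
used = bytes[0, message.bytesize]
puts String.new(used)
```

Because the memory starts out uninitialized, only the region that has been written (here `used`) should be read.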
Showing an error message early is certainly better than setting the compiler off to run forever. I totally agree on the general note to just use array instead. |
Yes, actually, the only reason we have |
What do you mean? I'd actually consider |
I mean, removing |
I think part of the issue is the name StaticArray doesn't automatically trigger a "this is on the stack" thought in one's mind. Name-wise it just sounds like a normal Array but defined statically with a fixed size. |
Not sure I understand the argument against StaticArray, but I switched the test code to use it and it solved the problem for me.
|
I do think StaticArray can be beneficial in cases like #11370. If I simply change to a regular Array performance drops by 33x for small strings to around 2x for long ones, which seems to indicate a significant increase in initialization times. But I do believe that until a resolution is found it merits at least a compiler warning for StaticArrays over a certain size. I got bit by this one, but luckily it happened just after a single change so I was able to quickly isolate the problem. But I can imagine situations where someone is scratching their head trying to figure out why compile speeds suddenly got out of hand. |