FlxPoint pooling harms performance? #1189
Comments
A better and more consistent way to test is to put the It's possible that |
@MSGhero I have moved the if and the declaration, and it makes no noticeable difference on any platform. I do think the implementation of pooling is potentially poor, because if there are any dynamic arrays growing and shrinking, that straight-up undoes any of the work for preventing memory reallocation. I think, until there is a demonstrable problem to fix, we should not have the pooling. |
I agree that pooling is not very efficient atm due to heavy array manipulation. We should investigate using a linked list as the base data structure of FlxPool before deciding to eliminate it. |
@MSGhero's description of pooling would certainly be a lot better than we have now. Figuring out when and how to change the pool size could be tricky, though? If we fix the pooling we should get rid of the "helper" points. Those |
    var count:Int = 0;

    public function get():T {
        if (count == 0) return Type.createInstance(_class, []);
        return _pool[--count];
    }

    public function put(obj:T):Void {
        if (obj != null && _pool.indexOf(obj) < 0) {
            obj.destroy();
            _pool[count++] = obj;
        }
    }

    public function putUnsafe(obj:T):Void {
        if (obj != null) {
            obj.destroy();
            _pool[count++] = obj;
        }
    }

    public function preAllocate(numObjects:Int):Void {
        while (numObjects-- > 0) _pool[count++] = Type.createInstance(_class, []);
    }

    public function clear():Array<T> {
        count = 0;
        var oldPool = _pool;
        _pool = [];
        return oldPool;
    }

Linked list works fine, but it's more code and some |
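One subtlety of an index-based pool like the snippet above: get() only decrements the counter, so the array still holds a stale reference to the object it just handed out, and a plain indexOf check in put() would find that reference and refuse to take the object back. The following is a minimal, self-contained sketch of one way to handle that, by treating only the slots below the counter as live. The names IDestroyable, Point and IndexPool, and the factory function passed to the constructor, are hypothetical stand-ins chosen so the example compiles without flixel; this is not the FlxPool source.

```haxe
// Hypothetical stand-in for a destroyable object, so the sketch needs no flixel.
interface IDestroyable {
    function destroy():Void;
}

// Hypothetical pooled type, loosely modelled on a 2D point.
class Point implements IDestroyable {
    public var x:Float = 0;
    public var y:Float = 0;

    public function new() {}

    public function destroy():Void {
        x = 0;
        y = 0;
    }
}

class IndexPool<T:IDestroyable> {
    var _pool:Array<T> = [];
    var _count:Int = 0; // slots [0, _count) hold objects available for reuse
    var _create:Void->T; // assumption: a factory instead of Type.createInstance

    public var length(get, never):Int;

    public function new(create:Void->T) {
        _create = create;
    }

    function get_length():Int return _count;

    public function get():T {
        if (_count == 0) return _create();
        // The reference stays in the array, but is now outside the live range.
        return _pool[--_count];
    }

    public function put(obj:T):Void {
        if (obj == null) return;
        var i = _pool.indexOf(obj);
        // Accept the object if it is absent, or only present as a stale reference.
        if (i == -1 || i >= _count) {
            obj.destroy();
            _pool[_count++] = obj;
        }
    }
}

class Main {
    static function main() {
        var pool = new IndexPool<Point>(function() return new Point());
        var p = pool.get();    // pool is empty, so this allocates
        pool.put(p);           // stored
        pool.put(p);           // ignored, p is already in the live range
        trace(pool.length);    // 1
        var q = pool.get();    // hands p back out
        trace(q == p);         // true
        pool.put(q);           // accepted despite the stale reference at index 0
        trace(pool.length);    // 1
    }
}
```

The duplicate check is still a linear scan, so this sketch addresses correctness only, not the cost of put() that the thread goes on to measure.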
@MSGhero Can you get that to make any meaningful performance difference? I can't seem to. EDIT: Making some of the functions inline helped on flash. There is a benefit after about 60k FlxPoints. C++ seems to see no benefit. EDIT 2: C++ sees a benefit when the framerate is less than about 10 (Requiring around 1.2 million FlxPoints) |
@JoeCreates, you should try using putUnsafe instead of put in your case, it might provide a meaningful perf difference. I believe we use putUnsafe internally. |
I wonder what the performance difference would be if Wow . . . Yeah. On C++ without pooling I get 30fps with 800k points. With |
Yep, |
And the pool size is actually only 5 in my test case. With bigger pools it will get worse much quicker. I think it's safe to say that regardless of the eventual solution, the safety check in |
I disagree, it's actually pretty important the same object doesn't end up in the pool twice. |
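The two positions here (the duplicate check is too slow, and the duplicate check is necessary) are not necessarily exclusive. As an illustration only, and not something proposed in this thread, the membership test can be made constant-time by letting each object remember whether it is currently pooled. The sketch below assumes a hypothetical IPoolable interface with an inPool flag and a hypothetical FlagPool class; it also assumes each object belongs to at most one pool, otherwise the flag would be ambiguous.

```haxe
// Hypothetical interface: the object itself records whether it is pooled.
interface IPoolable {
    var inPool:Bool;
    function destroy():Void;
}

// Hypothetical pooled type used for the demonstration.
class Point implements IPoolable {
    public var inPool:Bool = false;
    public var x:Float = 0;
    public var y:Float = 0;

    public function new() {}

    public function destroy():Void {
        x = 0;
        y = 0;
    }
}

class FlagPool<T:IPoolable> {
    var _pool:Array<T> = [];
    var _count:Int = 0;
    var _create:Void->T; // assumption: a factory function supplies new objects

    public var length(get, never):Int;

    public function new(create:Void->T) {
        _create = create;
    }

    function get_length():Int return _count;

    public function get():T {
        var obj:T = (_count == 0) ? _create() : _pool[--_count];
        obj.inPool = false;
        return obj;
    }

    public function put(obj:T):Void {
        // Constant-time duplicate check: no scan over the array.
        if (obj == null || obj.inPool) return;
        obj.destroy();
        obj.inPool = true;
        _pool[_count++] = obj;
    }
}

class Main {
    static function main() {
        var pool = new FlagPool<Point>(function() return new Point());
        var p = pool.get();
        pool.put(p);
        pool.put(p);          // ignored, the flag is already set
        trace(pool.length);   // 1
    }
}
```

The trade-off is one extra Bool per pooled object and the requirement that every pooled type carries the flag, in exchange for a put() whose cost does not grow with the pool size.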
At this point the only thing left to try is using a Linked List for the pooling implementation, I recommend we sit on this issue until we can try that approach (I'll look into building it, but I'll need some time, very busy at work atm). |
@gamedevsam Using a linked list will have zero effect on my test case, which is kind to the pooling because the pool size never needs to increase. |
@Gama11 What is the point in pooling with this safety check if it has way worse performance than the completely safe non-use of pooling? |
Pooling provides significant benefit on platforms with aggressive GCs (mainly HTML5). We should strive to make pooling as fast as possible, it's worth trying out the Linked List implementation. It might not have an impact on your test case, but it might, we don't know until we try. |
It's also worth noting that FlxPoints don't have a whole lot going on, 2 floats and 2 booleans. Maybe try with something more substantial, like FlxSprite. If it takes hundreds of thousands of FlxPoints to make a performance difference, we should worry about other classes' test cases. |
I mean as far as making the FlxPool class better, not referring to FlxPoint's internal pool. It might be that making a FlxPoint is faster than the function call and array access of using a pool. Linked list might be worth looking at since all you're doing is adding or removing from the head. You would either have a Node class or a |
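For reference, a linked-list pool that only ever adds to or removes from the head could look roughly like the sketch below. It takes the Node-class route mentioned above; Node, ListPool, Vec and the factory argument are hypothetical names for illustration. Emptied nodes are kept on a second list so that put() does not allocate a node each time, which is a design choice of this sketch rather than anything stated in the thread. Note that put() here performs no duplicate check, so it corresponds to putUnsafe rather than put.

```haxe
// Hypothetical node type: one link in a singly linked stack.
class Node<T> {
    public var item:Null<T>;
    public var next:Node<T>;

    public function new() {}
}

class ListPool<T> {
    var _head:Node<T> = null;  // nodes that currently hold pooled objects
    var _spare:Node<T> = null; // emptied nodes kept for reuse
    var _create:Void->T;       // assumption: a factory function supplies new objects

    public var length(default, null):Int = 0;

    public function new(create:Void->T) {
        _create = create;
    }

    public function get():T {
        if (_head == null) return _create();
        // Pop the head node and move it to the spare list.
        var node = _head;
        _head = node.next;
        var item = node.item;
        node.item = null;
        node.next = _spare;
        _spare = node;
        length--;
        return item;
    }

    // No duplicate check: comparable to putUnsafe in the array-based version.
    public function put(obj:T):Void {
        if (obj == null) return;
        var node = _spare;
        if (node == null) node = new Node<T>();
        else _spare = node.next;
        node.item = obj;
        node.next = _head;
        _head = node;
        length++;
    }
}

// Hypothetical pooled type used for the demonstration.
class Vec {
    public var x:Float = 0;
    public var y:Float = 0;

    public function new() {}
}

class Main {
    static function main() {
        var pool = new ListPool<Vec>(function() return new Vec());
        var a = pool.get();      // list is empty, so this allocates
        pool.put(a);
        trace(pool.length);      // 1
        trace(pool.get() == a);  // true
        trace(pool.length);      // 0
    }
}
```

Both get() and put() are constant-time here, but each step chases a pointer, so whether it beats array indexing on a given target is something only a benchmark can settle.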
I've fiddled with this a bit, here's how it played out (using @JoeCreates' code to benchmark it): At first I found out that inlining Then I tried a slightly different implementation of the pool that avoids calling Afterwards I noticed that maybe some more could be squeezed by removing the overhead of calling the setters in cpp worked @~60 FPS when pooling, and at half that framerate in non-pooling mode (with 500'000 points o.O). Please test it as I'm pretty sure I've introduced some bugs in there that may break usage I've not foreseen. Especially the non-duplicates-in-pool policy. In the process I found this small typo over at flixel/flixel/math/FlxPoint.hx Line 459 in 7913b78
setYCallback .
Conclusions:
These are interesting findings, we need to discuss what's the best way to get around this overhead. |
Was the linked list idea for the |
In general. |
https://github.com/MSGhero/FlxPoolTests These tests use Tested on Windows-Release, Flash-Debug (can't copy paste from release), Neko-Release "Control __" is just creating a new "Array Method __" is the current "Array Index __" is the code I posted earlier. "Top-Level List __" is using the "PutU" is Array indexing is faster on the tested platforms in every test except for "Put+Get" which uses The length of the pools is just 1 because bigger numbers timed out flash. But array indexing was faster with Notes:
EDIT: Changed it, and it's now consistently faster than the current implementation in every test. |
I thought most GCs will reuse memory when possible, so the new keyword isn't nearly so terrible as in, say, regular C++. With a little (crude) testing, it looks like the pooling of FlxPoints tends to actually harm performance, rather than to help it. The performance impact is evident on flash, but even more so on C++.
Have a play with this simple test on different platforms: http://hastebin.com/yanekehuko.scala
It is quite difficult to push it to the point where it makes much of a difference, but when it eventually does make a difference, it doesn't seem to be in favor of the pooling.
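The linked test is not reproduced here. As a rough idea of what such a comparison can look like, the sketch below times a loop that allocates a fresh object per iteration against a loop that takes one from a small array-backed pool and returns it. Everything in it (the Vec class, the pool helpers, the iteration count) is a hypothetical stand-in and not the code behind the link; it uses haxe.Timer.stamp() for timing and accumulates a sum so the loops are less likely to be optimised away.

```haxe
import haxe.Timer;

// Hypothetical stand-in for a small point-like object.
class Vec {
    public var x:Float;
    public var y:Float;

    public function new(x:Float = 0, y:Float = 0) {
        this.x = x;
        this.y = y;
    }
}

class Main {
    static var pool:Array<Vec> = [];
    static var count:Int = 0;

    // Take an object from the pool, or allocate when the pool is empty.
    static function get(x:Float, y:Float):Vec {
        if (count == 0) return new Vec(x, y);
        var v = pool[--count];
        v.x = x;
        v.y = y;
        return v;
    }

    // Return an object to the pool without any duplicate check.
    static function put(v:Vec):Void {
        pool[count++] = v;
    }

    static function main() {
        var n = 1000000; // arbitrary iteration count, adjust per target
        var sum = 0.0;

        var t0 = Timer.stamp();
        for (i in 0...n) {
            var v = new Vec(i, 1);
            sum += v.x + v.y;
        }
        var t1 = Timer.stamp();
        for (i in 0...n) {
            var v = get(i, 1);
            sum += v.x + v.y;
            put(v);
        }
        var t2 = Timer.stamp();

        trace("new:    " + (t1 - t0) + " s");
        trace("pooled: " + (t2 - t1) + " s");
        trace("checksum: " + sum);
    }
}
```

A loop like this is kind to the pool, since the pool never holds more than one object, and it says nothing about garbage collection pauses spread over many frames, so results should be read per target and with that limitation in mind.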