Rethink what to benchmark #18
Some background/insight:
Pros: …
Cons: …
The essential tossup when choosing to use … If you want to build a multiplayer game, or a game with high entity counts / iterations, I would encourage the use of … As far as what sorts of benchmarks would be valuable: …
I completely agree with this, but the task is titanic! I think someone should propose a PR targeting its implementation so we can discuss it. Then we will try to implement it for all the other libraries.
In my experience, it's hard to agree on what a "simple" game should look like. If it's too simple, we risk not testing a real-life scenario; if it's too complex, then implementing the benchmark for every library will be a lot of work and will also involve a lot of code not directly related to the benchmark (e.g. rendering and efficient collision detection for large entity counts). Games can also vary in performance depending on the user's input. Ideally, the benchmark would be input-agnostic and consist only of deterministic AI-driven logic. My first thought is to try to implement a tower-defense-like scene where both the attackers and the defenders are controlled by the AI. I'd also advocate for making it grid-based to avoid costly collision detection at larger entity counts, and for skipping rendering (but maybe emulating it to some extent, e.g. by calculating light positions). Some libraries are designed to perform well when used to create simple games (…). I can try to prepare a "spec" of the typical load of a typical scene from one of the games I created in …
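To make the idea above concrete, here is a minimal sketch of what such a deterministic, input-agnostic, grid-based scene could look like, written without any particular ECS library; every name in it (createWorld, step, the attacker/tower shapes) is illustrative rather than taken from the repository:

```js
// Hypothetical sketch: a grid-based tower-defense tick driven purely by
// deterministic, seeded AI, so the benchmark never depends on user input
// and never needs real collision detection or rendering.
const GRID_SIZE = 64;

// Small seeded PRNG (mulberry32) so every run produces the same spawns.
function mulberry32(a) {
  return function () {
    let t = (a += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function createWorld(seed) {
  return {
    tick: 0,
    rng: mulberry32(seed),
    attackers: [], // { x, y, hp }
    towers: [
      { x: 16, y: 32, range: 8, cooldown: 0 },
      { x: 48, y: 32, range: 8, cooldown: 0 },
    ],
  };
}

function step(world) {
  world.tick++;
  // Spawn one attacker per tick at a pseudo-random column on the top row.
  world.attackers.push({ x: Math.floor(world.rng() * GRID_SIZE), y: 0, hp: 10 });
  // Attackers march down the grid; everything resolves on integer cells,
  // so there is no costly collision detection.
  for (const a of world.attackers) a.y = Math.min(a.y + 1, GRID_SIZE - 1);
  // Towers damage the first attacker in range (Manhattan distance on the grid).
  for (const t of world.towers) {
    if (t.cooldown-- > 0) continue;
    const target = world.attackers.find(
      (a) => Math.abs(a.x - t.x) + Math.abs(a.y - t.y) <= t.range
    );
    if (target) { target.hp -= 5; t.cooldown = 3; }
  }
  world.attackers = world.attackers.filter((a) => a.hp > 0);
}
```

A real version would split the attacker/tower data into components and the three loops into systems; the point here is only that the whole scene can be advanced tick by tick with no input and no rendering.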
As for the benchmark, I'd also like to include at least one system which depends on the order of iteration. In my recent …
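As an illustration of what "depends on the order of iteration" could mean, here is a hedged sketch of one such system; the entity shape and the behaviour are hypothetical, not taken from any of the benchmarked libraries:

```js
// Hypothetical order-dependent system: each entity steps toward the entity
// visited just before it, so iterating the same set in a different order
// produces different positions within a single tick.
function followPreviousSystem(entities) {
  let previous = null;
  for (const e of entities) {
    if (previous !== null) {
      e.x += Math.sign(previous.x - e.x);
      e.y += Math.sign(previous.y - e.y);
    }
    previous = e;
  }
}
```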
That's the reason why a game isn't a fully viable solution; something closer to a simulation would be better. A potential issue is the number of game ticks: 3600 ticks represent a minute of playtime, and the fastest implementations would complete the simulation in about 10 ms, which would bias the results (since the last iterations would update almost nothing). Including the setup inside the benchmark would also be unfair, since many implementations are slow to initialize. An alternative would be to reset the simulation after a fixed number of ticks.
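A minimal sketch of that harness shape, assuming the reset re-populates the scene rather than rebuilding the library's world (createWorld, resetWorld, and stepWorld are placeholders each library adapter would provide):

```js
// Hypothetical harness: per-library setup stays outside the timed region,
// and the scene is re-populated every RESET_EVERY ticks so fast libraries
// do not spend most of the run updating an almost-empty world.
const TICKS = 3600;       // a minute of playtime at 60 ticks per second
const RESET_EVERY = 600;  // reset every 10 seconds of playtime

function benchmark(createWorld, resetWorld, stepWorld) {
  const world = createWorld(); // untimed: some libraries are slow to initialize
  resetWorld(world);           // spawn the initial entities
  const start = performance.now();
  for (let tick = 1; tick <= TICKS; tick++) {
    stepWorld(world, tick);
    if (tick % RESET_EVERY === 0) resetWorld(world); // keep the scene busy
  }
  return performance.now() - start;
}
```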
That would be interesting, but I'm not sure it is relevant. I don't expect any implementation to deliver entities in order. So the benchmark would be something like this: …
Only ① would be implementation-dependent.
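Assuming ① refers to the library-specific entity query and the later steps are shared code, a rough sketch might look like this (queryEntities is a placeholder for whatever each adapter exposes, not an existing API):

```js
// Assumption: only step ① differs per library; the sort and the logic are
// identical everywhere, so iteration order no longer affects the result.
function runOrderSensitive(queryEntities /* ① provided by each library */) {
  const entities = queryEntities();     // ① implementation-dependent
  entities.sort((a, b) => a.id - b.id); // ② same deterministic order for every library
  for (const e of entities) {           // ③ identical, order-sensitive logic
    e.x += 1; // placeholder work
  }
  return entities;
}
```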
TL;DR: plot graphs from simulations.

Whether it's a game, a simulation, or a benchmark, the number of entities, components, and systems will end up being the same. You also don't want to spend cycles outside of running ECS code, because then you aren't benchmarking the library's speed, and the results will be skewed by how much time is spent in the simulation. However, I think the addition of real-world tests would be great, though not as a replacement for the benchmarks.

Some possibly useful data would come from benchmarking many different combinations of counts of entities, components, and systems. You could then plot graphs with this data to show the asymptotic properties of how different operations behave across the frameworks, so people could choose the most optimal one for their project.
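A sketch of what such a sweep could look like; the adapter shape ({ name, setup, run }) is an assumption made for illustration, not an API that exists in this suite:

```js
// Hypothetical parameter sweep: time every combination of entity, component,
// and system counts, and emit CSV rows that can be plotted to compare how
// each library scales.
const ENTITY_COUNTS = [100, 1000, 10000, 100000];
const COMPONENT_COUNTS = [4, 16, 64];
const SYSTEM_COUNTS = [4, 16, 64];
const TICKS_PER_RUN = 100;

function sweep(adapter /* { name, setup(config), run(world) }: placeholder shape */) {
  const rows = [["library", "entities", "components", "systems", "ms_per_tick"]];
  for (const entities of ENTITY_COUNTS) {
    for (const components of COMPONENT_COUNTS) {
      for (const systems of SYSTEM_COUNTS) {
        const world = adapter.setup({ entities, components, systems }); // untimed
        const start = performance.now();
        for (let tick = 0; tick < TICKS_PER_RUN; tick++) adapter.run(world);
        const msPerTick = (performance.now() - start) / TICKS_PER_RUN;
        rows.push([adapter.name, entities, components, systems, msPerTick.toFixed(3)]);
      }
    }
  }
  return rows.map((row) => row.join(",")).join("\n");
}
```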
A month ago I created a toy simulation called PewPew. The code is at https://github.com/stasm/pewpew. You can see it in action at https://stasm.github.io/pewpew/. My goal was to create a scene with ca. 15 components and 15 systems, which would imitate actual game logic and which can run both in the browser and in Node.js. PewPew is based on Goodluck and clocks in at around 1,000 LOC. I'm not 100% happy with the result and I'm posting this comment to gather feedback on how to design the benchmark better.
Perhaps a better approach would be to: …
One of the challenges here is to design a benchmark with more than just a few systems (because we want to imitate a real game) while at the same time keeping it simple (because we don't want to benchmark non-ECS code). How can we reconcile these two constraints?
Currently the suite is far from perfect because it does not highlight the strengths and the weaknesses of each library. I tried to list what could affect performance:

Number of components
Some libraries use int32 bit masks, which prevents the usage of more than 32 components. While this provides huge speedups over the alternatives, it makes those libraries unusable for large games.

Number of queries

Entity distribution

In the end, what should be measured? It is not realistic to benchmark 50 components, 50 systems, 5K entities, 50% of entities mutated, and 200 creations/removals per tick. I'm not even discussing the data types (bitecs cannot store closures or external objects within components) or what's happening inside the systems. For example, perform-ecs does very few object accesses (~1 per entity) and would be considerably faster than an implementation doing 2 accesses per entity. However, what happens inside the systems could be significantly slower than the query itself.

I would like to gather feedback from various ECS authors to tailor a fair benchmark. The goal is not to compete but to help users pick the right implementation for their needs.
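One way to make those axes explicit would be a scenario descriptor that every library adapter consumes; the shape below is only a sketch assembled from the numbers above, not something that exists in the suite today:

```js
// Hypothetical scenario descriptor covering the axes discussed above. Each
// library adapter would read it and build an equivalent world, so the same
// scenario can be replayed against every implementation.
const scenario = {
  components: 50,     // stresses bit-mask-based libraries past the 32-component limit
  systems: 50,
  queries: 50,        // roughly one query per system
  entities: 5000,
  mutationRate: 0.5,  // fraction of entities whose components change each tick
  churnPerTick: 200,  // entity creations + removals per tick
  componentData: "objects", // some libraries (e.g. bitecs) cannot store closures or external objects
};
```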
cc @NateTheGreatt (bitecs) @ddmills (geotic) @stasm (goodluck) @fireveined (perform-ecs) @ayebear (picoes)