Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rust save parser #97

Merged
merged 30 commits into from
Apr 23, 2023
Merged

Rust save parser #97

merged 30 commits into from
Apr 23, 2023

Conversation

eliasdoehne
Copy link
Owner

Implements a new parser for the save files in rust. A bit inefficient since it converts to JSON before handing it back to python code, but still much faster than the existing implementation (~4 times faster in a quick test)

)
time.sleep(delay_seconds)
return gamestate
import rust_parser
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello! In poking at this I wasn't able to get it to run due the below error, though this may be my unfamiliarity with Rust talking. If I change the import statement like this, it works:

    from stellarisdashboard.parsing import rust_parser

Error:

(venv-stellaris-20221219) Chris@tarkin:~/stellaris-dashboard$ python -m stellarisdashboard.cli parse-saves

  File "/home/Chris/stellaris-dashboard/stellarisdashboard/cli.py", line 102, in parse_saves
    f_parse_saves(threads, save_path, game_name_prefix=game_name)
  File "/home/Chris/stellaris-dashboard/stellarisdashboard/cli.py", line 116, in f_parse_saves
    for (
  File "/home/Chris/stellaris-dashboard/stellarisdashboard/parsing/save_parser.py", line 241, in get_gamestates_and_check_for_new_files
    result = future.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
ModuleNotFoundError: No module named 'rust_parser'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, question - where do you plan for rust_parser.so to live? I put it at stellarisdashboard/parsing/rust_parser.so but I imagine (without testing) that the import statement would work as-is if you put it higher in the tree than that.

Copy link
Owner Author

@eliasdoehne eliasdoehne Jan 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes what I did locally was to open the rust_parser directory and run maturin develop -r which compiles the rust part with optimizations and installs the extension in the virtualenv. I haven't given it much thought yet how to integrate it in the regular build and release yet, so far I was just happy to get it working 😄

Maybe it also makes sense to release the parser itself as a rust crate and only keep the glue code in this repo 🤔

Copy link
Owner Author

@eliasdoehne eliasdoehne Jan 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another part I'm not quite happy with yet is how the data is passed back from rust to python, right now it's serialized to json first, but I believe it should be possible to directly assemble the Python dict from within rust using pyo3, I expect it would be much more efficient, I'll give that another try that when I have some time again in the next 1-2 weeks

@chennin
Copy link
Contributor

chennin commented Jan 5, 2023

Nice! I put together a save parser benchmark on a medium galaxy save from 2565 (though I am not a benchmark expert). There's quite a visible difference between Python 3.8/10/11, and between master and rust-save-parser.

Master and rust branches are as of this post.

TIME: master-debian-bullseye-py3.11: 30 seconds
TIME: master-debian-bullseye-py3.11: 29 seconds
TIME: master-debian-bullseye-py3.11: 29 seconds

TIME: master-debian-bullseye-py3.10: 35 seconds
TIME: master-debian-bullseye-py3.10: 36 seconds
TIME: master-debian-bullseye-py3.10: 35 seconds

TIME: master-debian-bullseye-py3.8: 38 seconds
TIME: master-debian-bullseye-py3.8: 39 seconds
TIME: master-debian-bullseye-py3.8: 37 seconds

TIME: rust-save-parser-debian-bullseye-py3.11: 18 seconds
TIME: rust-save-parser-debian-bullseye-py3.11: 18 seconds
TIME: rust-save-parser-debian-bullseye-py3.11: 17 seconds

TIME: rust-save-parser-debian-bullseye-py3.10: 20 seconds
TIME: rust-save-parser-debian-bullseye-py3.10: 19 seconds
TIME: rust-save-parser-debian-bullseye-py3.10: 20 seconds

TIME: rust-save-parser-debian-bullseye-py3.8: 22 seconds
TIME: rust-save-parser-debian-bullseye-py3.8: 21 seconds
TIME: rust-save-parser-debian-bullseye-py3.8: 21 seconds

@eliasdoehne
Copy link
Owner Author

eliasdoehne commented Jan 6, 2023

Nice! I put together a save parser benchmark on a medium galaxy save from 2565 (though I am not a benchmark expert). There's quite a visible difference between Python 3.8/10/11, and between master and rust-save-parser.

@chennin did you compile the rust code using the release flag? (I guess cargo build -ror similar) In my ad-hoc testing I saw the rust version performing about 3-4 times faster than the old parser. I used a much smaller save from the early game for my performance comparison, so I wonder if the difference is due to the input data

@chennin
Copy link
Contributor

chennin commented Jan 6, 2023

Nice! I put together a save parser benchmark on a medium galaxy save from 2565 (though I am not a benchmark expert). There's quite a visible difference between Python 3.8/10/11, and between master and rust-save-parser.

@chennin did you compile the rust code using the release flag? (I guess cargo build -ror similar) In my ad-hoc testing I saw the rust version performing about 3-4 times faster than the old parser. I used a much smaller save from the early game for my performance comparison, so I wonder if the difference is due to the input data

Yep - like this:

https://github.com/chennin/stellaris-dashboard-docker/blob/cb4cd3f1825c35cb1d03458e0969aad55f35601d/build-ext.sh#L17

@chennin
Copy link
Contributor

chennin commented Jan 6, 2023

For benchmarking I put together a script (runs on Linux, uses Docker): https://gist.github.com/chennin/67caf6cebbc7df35dc4ae9194e3350c9 . Now with more precision.

This is the save I'm using:
2565.06.02.sav.zip

I bet this is measuring more than you were, including docker image startup and processing gamestate.

@eliasdoehne eliasdoehne merged commit f40d791 into master Apr 23, 2023
@eliasdoehne eliasdoehne deleted the rust-save-parser branch April 23, 2023 18:25
@chennin
Copy link
Contributor

chennin commented Apr 24, 2023

Benchmark update, looks even slightly better than the first!

I think I updated the CPU after the first report, and I'm using a different save, so don't compare directly with previous. This benchmark runs python -m stellarisdashboard.cli parse-saves in docker containers.

TIME: v4.4-debian-bullseye-py3.11: 27.417 seconds
TIME: v4.4-debian-bullseye-py3.11: 27.093 seconds
TIME: v4.4-debian-bullseye-py3.11: 27.132 seconds

TIME: v5.0-debian-bullseye-py3.11: 15.617 seconds
TIME: v5.0-debian-bullseye-py3.11: 15.980 seconds
TIME: v5.0-debian-bullseye-py3.11: 15.755 seconds

TIME: v4.4-debian-bullseye-py3.10: 32.158 seconds
TIME: v4.4-debian-bullseye-py3.10: 31.949 seconds
TIME: v4.4-debian-bullseye-py3.10: 31.714 seconds

TIME: v5.0-debian-bullseye-py3.10: 17.835 seconds
TIME: v5.0-debian-bullseye-py3.10: 17.298 seconds
TIME: v5.0-debian-bullseye-py3.10: 17.586 seconds

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants