-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC, Use asyncio for external programs, just prototype for now #235
base: develop
Are you sure you want to change the base?
Conversation
…filtering on linux for now
…d the sorting step
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @alfh, thanks for this big PR! Nice reductions in runtime!
As there are a lot of changes reviewing is very hard. I mentioned some minor things inline but for the main part it would be way easier to review if only the really changed lines occur as diff or the commits are nicely cutted per function or similar.
An example: Line 138-166 in osm_maps_functions.py
. The indentation has changed what causes these "all lines changed" in my opinion. Leaving Windows as-is and the if/else for the OS there would reduce these changes in my opinion. As the Windows part is not working then is no problem for me.
I know it is only a RFC but reviewing greatly benefits from having fewer diffs and the real diffs.
Cutting the PR down to multiple smaller ones would also be a good way to go (as you already mentioned).
Concerning your questions in the PR-text:
- Once calculated,
semaphore
's should be object attributes and taken fromself.
in the methods to not have as parameter. One could think of more than one object attribute or equal. Code duplication is always good to reduce or not have at all :-) - Actually I have in 95% of the functions one function which distincts the OS's if needed. If the differences are not too big I would keep this scheme if it makes sense
- We would log all tiles anyway but with parallelization the order is not ascending. So if logs are sent in parallel I think it is not too bad if it is per "main function" what I assume.
- Putting
filtered.o5m.pbf
files on RAM or make it possible via argument would also be fitting into a separete PR. Do we benefit from this change only in parallelization or also without?
wahoomc/osm_maps_functions.py
Outdated
if save_cruiser: | ||
cmd.append('--keep') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because cmd
is a string now, it errors out in this line. Creating also cruiser files is more or less a default for me (see launch.json
) and hence noticed on the first try of your branch :-)
Unittests also fail because of this reason.
@@ -59,6 +60,41 @@ def run_subprocess_and_log_output(cmd, error_message, cwd=""): | |||
log.debug('subprocess debug output:') | |||
log.debug(process.stdout) | |||
|
|||
async def run_async_subprocess_and_log_output(semaphore, cmd, args, error_message, cwd=""): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmd seams to be a string here, before it was typed as list.
it seams you need the cmd as string for calling asyncio.create_subprocess_exec
.
two things to consider:
- Is it like that or can you call
asyncio.create_subprocess_exec
also with a list? - why don't you leave
cmd
parameter typed as list and do the list-to-string conversion in the new function?
The benefit of when doing 2. would be less changes around this file because the parameter type stays the same and it would also be my favorite.
timings_tile = Timings() | ||
|
||
out_file_map = os.path.join(USER_OUTPUT_DIR, | ||
f'{tile["x"]}', f'{tile["y"]}.map') | ||
out_file_map = os.path.join(USER_OUTPUT_DIR, f'{tile["x"]}', f'{tile["y"]}.map') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please undo this only-formatting change to reduce diffs.
If you have a auto-formatter that produced these changes you should deactivate this please.
I don't remember if I autoformatted the code and if yes with which auto-fomatter. In other projects I use black
.
If we want to use a auto-formatter, we
- should agree on a auto-formatter
- run the auto-formatter throughout the whole code and add this as separate PR
|
||
# apply tag-wahoo xml every time because the result is different per .xml file (user input) | ||
merged_file = os.path.join(USER_OUTPUT_DIR, | ||
f'{tile["x"]}', f'{tile["y"]}', 'merged.osm.pbf') | ||
merged_file = os.path.join(USER_OUTPUT_DIR, f'{tile["x"]}', f'{tile["y"]}', 'merged.osm.pbf') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comments some lines above
# - the defined TAGS_TO_KEEP_UNIVERSAL constants have changed are changed (user input or new release) | ||
if not os.path.isfile(out_file_o5m_filtered_win) or not os.path.isfile(out_file_o5m_filtered_names_win) \ | ||
or self.o_osm_data.force_processing is True or self.tags_are_identical_to_last_run(key) is False \ | ||
or self.last_changed_is_identical_to_last_run(key) is False: | ||
log.info( | ||
'+ Filtering unwanted map objects out of map of %s', key) | ||
cmd = [get_tooling_win_path('osmfilter', in_user_dir=True)] | ||
cmd.append(out_file_o5m) | ||
cmd.append( | ||
'--keep="' + translate_tags_to_keep(sys_platform=platform.system()) + '"') | ||
cmd.append('--keep-tags="all type= layer= ' + | ||
translate_tags_to_keep(sys_platform=platform.system()) + '"') | ||
cmd.append('-o=' + out_file_o5m_filtered_win) | ||
|
||
run_subprocess_and_log_output( | ||
cmd, f'! Error in OSMFilter with country: {key}') | ||
|
||
cmd = [get_tooling_win_path('osmfilter', in_user_dir=True)] | ||
cmd.append(out_file_o5m) | ||
cmd.append( | ||
'--keep="' + translate_tags_to_keep( | ||
name_tags=True, sys_platform=platform.system()) + '"') | ||
cmd.append('--keep-tags="all type= name= layer= ' + | ||
translate_tags_to_keep( | ||
name_tags=True, sys_platform=platform.system()) + '"') | ||
cmd.append('-o=' + out_file_o5m_filtered_names_win) | ||
|
||
run_subprocess_and_log_output( | ||
cmd, f'! Error in OSMFilter with country: {key}') | ||
|
||
val['filtered_file'] = out_file_o5m_filtered_win | ||
val['filtered_file_names'] = out_file_o5m_filtered_names_win | ||
|
||
# Non-Windows | ||
else: | ||
out_file_pbf_filtered_mac = f'{out_file_o5m_filtered_win}.pbf' | ||
out_file_pbf_filtered_names_mac = f'{out_file_o5m_filtered_names_win}.pbf' | ||
|
||
# filter out tags: | ||
# - if no filtered files exist | ||
# - force processing is set (this is also when new map files were dowwnloaded) | ||
# - the defined TAGS_TO_KEEP_UNIVERSAL constants have changed are changed (user input or new release) | ||
if not os.path.isfile(out_file_pbf_filtered_mac) or not os.path.isfile(out_file_pbf_filtered_names_mac) \ | ||
or self.o_osm_data.force_processing is True or self.tags_are_identical_to_last_run(key) is False \ | ||
or self.last_changed_is_identical_to_last_run(key) is False: | ||
log.info( | ||
'+ Filtering unwanted map objects out of map of %s', key) | ||
|
||
# https://docs.osmcode.org/osmium/latest/osmium-tags-filter.html | ||
cmd = ['osmium', 'tags-filter', '--remove-tags'] | ||
cmd.append(val['map_file']) | ||
cmd.extend(translate_tags_to_keep( | ||
sys_platform=platform.system())) | ||
cmd.extend(['-o', out_file_pbf_filtered_mac]) | ||
cmd.append('--overwrite') | ||
|
||
run_subprocess_and_log_output( | ||
cmd, f'! Error in Osmium with country: {key}') | ||
|
||
cmd = ['osmium', 'tags-filter', '--remove-tags'] | ||
cmd.append(val['map_file']) | ||
cmd.extend(translate_tags_to_keep( | ||
name_tags=True, sys_platform=platform.system())) | ||
cmd.extend(['-o', out_file_pbf_filtered_names_mac]) | ||
cmd.append('--overwrite') | ||
|
||
run_subprocess_and_log_output( | ||
cmd, f'! Error in Osmium with country: {key}') | ||
|
||
val['filtered_file'] = out_file_pbf_filtered_mac | ||
val['filtered_file_names'] = out_file_pbf_filtered_names_mac | ||
out_file_o5m_filtered_win = os.path.join(country_dir, 'filtered.o5m') | ||
out_file_o5m_filtered_names_win = os.path.join(country_dir, 'filtered_names.o5m') | ||
|
||
out_file_pbf_filtered_mac = f'{out_file_o5m_filtered_win}.pbf' | ||
out_file_pbf_filtered_names_mac = f'{out_file_o5m_filtered_names_win}.pbf' | ||
|
||
# filter out tags: | ||
# - if no filtered files exist | ||
# - force processing is set (this is also when new map files were dowwnloaded) | ||
# - the defined TAGS_TO_KEEP_UNIVERSAL constants have changed are changed (user input or new release) | ||
if not os.path.isfile(out_file_pbf_filtered_mac) or not os.path.isfile(out_file_pbf_filtered_names_mac) \ | ||
or self.o_osm_data.force_processing is True or self.tags_are_identical_to_last_run(key) is False \ | ||
or self.last_changed_is_identical_to_last_run(key) is False: | ||
log.info('+ Filtering unwanted map objects out of map of %s', key) | ||
|
||
tags_to_keep = translate_tags_to_keep(sys_platform=platform.system()) | ||
tags_to_keep_names = translate_tags_to_keep(name_tags=True, sys_platform=platform.system()) | ||
|
||
# async with asyncio.TaskGroup() as tg: | ||
log.debug('start run filtered') | ||
tasks.add(asyncio.create_task(self.invoke_filter_tags_osmium_linux(semaphore, key, val['map_file'], tags_to_keep, out_file_pbf_filtered_mac))) | ||
# tg.create_task(self.invoke_filter_tags_osmium_linux(key, val['map_file'], tags_to_keep, out_file_pbf_filtered_mac)) | ||
|
||
log.debug('start run filtered names') | ||
tasks.add(asyncio.create_task(self.invoke_filter_tags_osmium_linux(semaphore, key, val['map_file'], tags_to_keep_names, out_file_pbf_filtered_names_mac))) | ||
# tg.create_task(self.invoke_filter_tags_osmium_linux(key, val['map_file'], tags_to_keep_names, out_file_pbf_filtered_names_mac)) | ||
|
||
val['filtered_file'] = out_file_pbf_filtered_mac | ||
val['filtered_file_names'] = out_file_pbf_filtered_names_mac |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here are a lot of changed lines but most probably the macOS/Linux part hasn't changed that much. It would be nice to have only the real changes as diff and not all lines.
…evation generation Temporarily remove srg.. support, since it involves fetching files using login cookie and that seems to run into trouble. Got trouble even with semaphore(1), so do not think it is related to running many instances in parallel, need to check how the pyhgtmap is handling the download. The pyhgtmap should possibly be changed to rather use M2M API with token, instead of simulating logging in as a web client.
Thanks for the initial feedback, @treee111. I've now also gotten the contour/elevation generation to work with asyncio (using pyhgtmap). I've also briefly tested on Windows (not with contour generation), and only ran into trouble when running the lzma/compression steps, I think it might that compression executable that is using some temp files that collide with other invocation. I'll look more into that. So now I have a working solution, with nice performance improvements.
(No countour, ran out of disk space when testing that, need to look more into it) For France, the numbers look like this
I will now think about how many PRs I will split this into, I think some PRs with "just" code reformatting / restructuring, and then 1-2 final PRs, with introducing the asyncio. |
What’s the status of this pull request? I wonder if I could take a stab based on this work, or there’s some other activity in async (or anything related to performance) area that I’ve missed? |
I have not had time to work on this lately.
I think this is the change has a big positive impact on performance.
I have been wondering if it could have some issues when processing several countries are the same time, if the same tile in the border areas are processed simultaneously, so that case should be tested.
I wanted to work more on this, by restructuring the commits and changes a bit.
Alf
Alf
Få BlueMail for Android
19. sep. 2024, 19:05 kl. 19:05 skrev "Łukasz Jendrysik" ***@***.***>:
…What’s the status of this pull request? I wonder if I could take a stab
based on this work, or there’s some other activity in async (or
anything related to performance) area that I’ve missed?
--
Reply to this email directly or view it on GitHub:
#235 (comment)
You are receiving this because you were mentioned.
Message ID: ***@***.***>
|
This PR…
Use Python asyncio library to launch external programs in parallell where possible, to speed up the generation on multi CPU core machines.
Considerations and implementations
This is just a proof of concept as of now, with several hacks, and also not considering Windows to any extent as of now.
So mainly to get some early feedback if possible, on the general approach.
I will add a method to calculate the number of Semaphores, based on "max CPU core used by external program invocation, min memory required by each external program invocation".
Will probably also take parts of this pull request, and make a general "reduce code duplication", and have that PR be merged first.
Also unsure if I should have one method for linux and one for windows for setting up the parameters for each external program invocation case, or one method with if tests for linux and windows. Currently, I have a mix of these, and I think one place I just removed the windows part when I was hacking the very first attempts, will readd the Windows support there in this PR.
I have also commented out the logging for each tile, since unsure how much sense that makes now. Will think about it, it would be useful if something is hanging or causing error.
Also note that is really pays of to have the
_tiles/germany/ directory, which contains the two files
put on a RAM disk (dev/shm for example). So I think a PR to add some config options somehow to control where those two files are put, would be really useful. If they are on disk or SSD, you will get a lot of iowait in the filtering step, since each tile is reading those files then.
I see a 10 fold increase in performance on some steps when looking at Norway, when running on a 16 core CPU (32 threads), 64 GB RAM.
Here is the output for Germany on my machine with asyncio
Here is when running the upstream develop code, not using asyncio (just showing the main parts of the log)
So a 3x increase in performance for Germany.
{...}
How to test
Invoke as normal, but as of now manually adjust the Semphore settings, which control how many parallel external programs are running.
Pull Request Checklist