Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC, Use asyncio for external programs, just prototype for now #235

Open
wants to merge 11 commits into
base: develop
Choose a base branch
from

Conversation

alfh
Copy link
Contributor

@alfh alfh commented Dec 3, 2023

This PR…

Use Python asyncio library to launch external programs in parallell where possible, to speed up the generation on multi CPU core machines.

Considerations and implementations

This is just a proof of concept as of now, with several hacks, and also not considering Windows to any extent as of now.

So mainly to get some early feedback if possible, on the general approach.

I will add a method to calculate the number of Semaphores, based on "max CPU core used by external program invocation, min memory required by each external program invocation".

Will probably also take parts of this pull request, and make a general "reduce code duplication", and have that PR be merged first.

Also unsure if I should have one method for linux and one for windows for setting up the parameters for each external program invocation case, or one method with if tests for linux and windows. Currently, I have a mix of these, and I think one place I just removed the windows part when I was hacking the very first attempts, will readd the Windows support there in this PR.

I have also commented out the logging for each tile, since unsure how much sense that makes now. Will think about it, it would be useful if something is hanging or causing error.

Also note that is really pays of to have the
_tiles/germany/ directory, which contains the two files

filtered_names.o5m.pbf
filtered.o5m.pbf

put on a RAM disk (dev/shm for example). So I think a PR to add some config options somehow to control where those two files are put, would be really useful. If they are on disk or SSD, you will get a lot of iowait in the filtering step, since each tile is reading those files then.

I see a 10 fold increase in performance on some steps when looking at Norway, when running on a 16 core CPU (32 threads), 64 GB RAM.

Here is the output for Germany on my machine with asyncio

INFO:# Filter tags from country osm.pbf files
INFO:+ Filtering unwanted map objects out of map of germany
INFO:+ Filter tags from country osm.pbf files: OK, took 99.655 s
INFO:--------------------------------------------------------------------------------
INFO:# Generate land for each coordinate
INFO:start land sea
INFO:start land1
INFO:+ Generate land for each coordinate: OK, took 6.896 s
INFO:--------------------------------------------------------------------------------
INFO:# Generate sea for each coordinate
INFO:+ Generate sea for each coordinate: OK, took 0.004 s
INFO:--------------------------------------------------------------------------------
INFO:# Split filtered country files to tiles
INFO:+ Split filtered country files to tiles: OK, took 214.493 s
INFO:--------------------------------------------------------------------------------
INFO:# Merge splitted tiles with land, elevation, and sea
INFO:+ Sorted: OK, took 2.835 s
INFO:+ Merge splitted tiles with land, elevation, and sea: OK, took 36.320 s
INFO:--------------------------------------------------------------------------------
INFO:# Creating .map files for tiles
INFO:+ Created .map files for tiles: OK, took 107.150 s
INFO:+ Compressed .map files for tiles: OK, took 122.797 s (this is time from start, so compress took 15 s)
INFO:+ Creating .map files for tiles: OK, took 122.799 s
INFO:--------------------------------------------------------------------------------
INFO:# Create: .map.lzma files
INFO:+ Country: germany
INFO:+ Copying .map.lzma tiles to output folders
INFO:+ Create .map.lzma files: OK, took 1.189 s
INFO:--------------------------------------------------------------------------------
INFO:# Total time 481.898 s

Here is when running the upstream develop code, not using asyncio (just showing the main parts of the log)

INFO:+ Filter tags from country osm.pbf files: OK, took 121.303 s
INFO:+ Generate land for each coordinate: OK, took 19.885 s
INFO:+ Generate sea for each coordinate: OK, took 0.005 s
INFO:+ Split filtered country files to tiles: OK, took 562.958 s
INFO:+ Merge splitted tiles with land, elevation, and sea: OK, took 219.167 s
INFO:+ Creating .map files for tiles: OK, took 551.492 s
INFO:+ Create .map.lzma files: OK, took 1.094 s
INFO:# Total time 1476.415 s

So a 3x increase in performance for Germany.

{...}

How to test

Invoke as normal, but as of now manually adjust the Semphore settings, which control how many parallel external programs are running.

Pull Request Checklist

@alfh alfh mentioned this pull request Dec 3, 2023
Copy link
Owner

@treee111 treee111 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @alfh, thanks for this big PR! Nice reductions in runtime!

As there are a lot of changes reviewing is very hard. I mentioned some minor things inline but for the main part it would be way easier to review if only the really changed lines occur as diff or the commits are nicely cutted per function or similar.
An example: Line 138-166 in osm_maps_functions.py. The indentation has changed what causes these "all lines changed" in my opinion. Leaving Windows as-is and the if/else for the OS there would reduce these changes in my opinion. As the Windows part is not working then is no problem for me.

I know it is only a RFC but reviewing greatly benefits from having fewer diffs and the real diffs.
Cutting the PR down to multiple smaller ones would also be a good way to go (as you already mentioned).

Concerning your questions in the PR-text:

  1. Once calculated, semaphore's should be object attributes and taken from self. in the methods to not have as parameter. One could think of more than one object attribute or equal. Code duplication is always good to reduce or not have at all :-)
  2. Actually I have in 95% of the functions one function which distincts the OS's if needed. If the differences are not too big I would keep this scheme if it makes sense
  3. We would log all tiles anyway but with parallelization the order is not ascending. So if logs are sent in parallel I think it is not too bad if it is per "main function" what I assume.
  4. Putting filtered.o5m.pbf files on RAM or make it possible via argument would also be fitting into a separete PR. Do we benefit from this change only in parallelization or also without?

Comment on lines 670 to 671
if save_cruiser:
cmd.append('--keep')
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because cmd is a string now, it errors out in this line. Creating also cruiser files is more or less a default for me (see launch.json) and hence noticed on the first try of your branch :-)

Unittests also fail because of this reason.

@@ -59,6 +60,41 @@ def run_subprocess_and_log_output(cmd, error_message, cwd=""):
log.debug('subprocess debug output:')
log.debug(process.stdout)

async def run_async_subprocess_and_log_output(semaphore, cmd, args, error_message, cwd=""):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmd seams to be a string here, before it was typed as list.
it seams you need the cmd as string for calling asyncio.create_subprocess_exec.

two things to consider:

  1. Is it like that or can you call asyncio.create_subprocess_exec also with a list?
  2. why don't you leave cmd parameter typed as list and do the list-to-string conversion in the new function?

The benefit of when doing 2. would be less changes around this file because the parameter type stays the same and it would also be my favorite.

timings_tile = Timings()

out_file_map = os.path.join(USER_OUTPUT_DIR,
f'{tile["x"]}', f'{tile["y"]}.map')
out_file_map = os.path.join(USER_OUTPUT_DIR, f'{tile["x"]}', f'{tile["y"]}.map')
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please undo this only-formatting change to reduce diffs.

If you have a auto-formatter that produced these changes you should deactivate this please.
I don't remember if I autoformatted the code and if yes with which auto-fomatter. In other projects I use black.

If we want to use a auto-formatter, we

  1. should agree on a auto-formatter
  2. run the auto-formatter throughout the whole code and add this as separate PR


# apply tag-wahoo xml every time because the result is different per .xml file (user input)
merged_file = os.path.join(USER_OUTPUT_DIR,
f'{tile["x"]}', f'{tile["y"]}', 'merged.osm.pbf')
merged_file = os.path.join(USER_OUTPUT_DIR, f'{tile["x"]}', f'{tile["y"]}', 'merged.osm.pbf')
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comments some lines above

Comment on lines 137 to +166
# - the defined TAGS_TO_KEEP_UNIVERSAL constants have changed are changed (user input or new release)
if not os.path.isfile(out_file_o5m_filtered_win) or not os.path.isfile(out_file_o5m_filtered_names_win) \
or self.o_osm_data.force_processing is True or self.tags_are_identical_to_last_run(key) is False \
or self.last_changed_is_identical_to_last_run(key) is False:
log.info(
'+ Filtering unwanted map objects out of map of %s', key)
cmd = [get_tooling_win_path('osmfilter', in_user_dir=True)]
cmd.append(out_file_o5m)
cmd.append(
'--keep="' + translate_tags_to_keep(sys_platform=platform.system()) + '"')
cmd.append('--keep-tags="all type= layer= ' +
translate_tags_to_keep(sys_platform=platform.system()) + '"')
cmd.append('-o=' + out_file_o5m_filtered_win)

run_subprocess_and_log_output(
cmd, f'! Error in OSMFilter with country: {key}')

cmd = [get_tooling_win_path('osmfilter', in_user_dir=True)]
cmd.append(out_file_o5m)
cmd.append(
'--keep="' + translate_tags_to_keep(
name_tags=True, sys_platform=platform.system()) + '"')
cmd.append('--keep-tags="all type= name= layer= ' +
translate_tags_to_keep(
name_tags=True, sys_platform=platform.system()) + '"')
cmd.append('-o=' + out_file_o5m_filtered_names_win)

run_subprocess_and_log_output(
cmd, f'! Error in OSMFilter with country: {key}')

val['filtered_file'] = out_file_o5m_filtered_win
val['filtered_file_names'] = out_file_o5m_filtered_names_win

# Non-Windows
else:
out_file_pbf_filtered_mac = f'{out_file_o5m_filtered_win}.pbf'
out_file_pbf_filtered_names_mac = f'{out_file_o5m_filtered_names_win}.pbf'

# filter out tags:
# - if no filtered files exist
# - force processing is set (this is also when new map files were dowwnloaded)
# - the defined TAGS_TO_KEEP_UNIVERSAL constants have changed are changed (user input or new release)
if not os.path.isfile(out_file_pbf_filtered_mac) or not os.path.isfile(out_file_pbf_filtered_names_mac) \
or self.o_osm_data.force_processing is True or self.tags_are_identical_to_last_run(key) is False \
or self.last_changed_is_identical_to_last_run(key) is False:
log.info(
'+ Filtering unwanted map objects out of map of %s', key)

# https://docs.osmcode.org/osmium/latest/osmium-tags-filter.html
cmd = ['osmium', 'tags-filter', '--remove-tags']
cmd.append(val['map_file'])
cmd.extend(translate_tags_to_keep(
sys_platform=platform.system()))
cmd.extend(['-o', out_file_pbf_filtered_mac])
cmd.append('--overwrite')

run_subprocess_and_log_output(
cmd, f'! Error in Osmium with country: {key}')

cmd = ['osmium', 'tags-filter', '--remove-tags']
cmd.append(val['map_file'])
cmd.extend(translate_tags_to_keep(
name_tags=True, sys_platform=platform.system()))
cmd.extend(['-o', out_file_pbf_filtered_names_mac])
cmd.append('--overwrite')

run_subprocess_and_log_output(
cmd, f'! Error in Osmium with country: {key}')

val['filtered_file'] = out_file_pbf_filtered_mac
val['filtered_file_names'] = out_file_pbf_filtered_names_mac
out_file_o5m_filtered_win = os.path.join(country_dir, 'filtered.o5m')
out_file_o5m_filtered_names_win = os.path.join(country_dir, 'filtered_names.o5m')

out_file_pbf_filtered_mac = f'{out_file_o5m_filtered_win}.pbf'
out_file_pbf_filtered_names_mac = f'{out_file_o5m_filtered_names_win}.pbf'

# filter out tags:
# - if no filtered files exist
# - force processing is set (this is also when new map files were dowwnloaded)
# - the defined TAGS_TO_KEEP_UNIVERSAL constants have changed are changed (user input or new release)
if not os.path.isfile(out_file_pbf_filtered_mac) or not os.path.isfile(out_file_pbf_filtered_names_mac) \
or self.o_osm_data.force_processing is True or self.tags_are_identical_to_last_run(key) is False \
or self.last_changed_is_identical_to_last_run(key) is False:
log.info('+ Filtering unwanted map objects out of map of %s', key)

tags_to_keep = translate_tags_to_keep(sys_platform=platform.system())
tags_to_keep_names = translate_tags_to_keep(name_tags=True, sys_platform=platform.system())

# async with asyncio.TaskGroup() as tg:
log.debug('start run filtered')
tasks.add(asyncio.create_task(self.invoke_filter_tags_osmium_linux(semaphore, key, val['map_file'], tags_to_keep, out_file_pbf_filtered_mac)))
# tg.create_task(self.invoke_filter_tags_osmium_linux(key, val['map_file'], tags_to_keep, out_file_pbf_filtered_mac))

log.debug('start run filtered names')
tasks.add(asyncio.create_task(self.invoke_filter_tags_osmium_linux(semaphore, key, val['map_file'], tags_to_keep_names, out_file_pbf_filtered_names_mac)))
# tg.create_task(self.invoke_filter_tags_osmium_linux(key, val['map_file'], tags_to_keep_names, out_file_pbf_filtered_names_mac))

val['filtered_file'] = out_file_pbf_filtered_mac
val['filtered_file_names'] = out_file_pbf_filtered_names_mac
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here are a lot of changed lines but most probably the macOS/Linux part hasn't changed that much. It would be nice to have only the real changes as diff and not all lines.

…evation generation

Temporarily remove srg.. support, since it involves fetching files using login cookie
and that seems to run into trouble. Got trouble even with semaphore(1), so do not think
it is related to running many instances in parallel, need to check how the pyhgtmap
is handling the download.
The pyhgtmap should possibly be changed to rather use M2M API with token, instead of
simulating logging in as a web client.
@alfh
Copy link
Contributor Author

alfh commented Dec 9, 2023

Thanks for the initial feedback, @treee111.
I've read your comments, and will consider them thoroughly.

I've now also gotten the contour/elevation generation to work with asyncio (using pyhgtmap).

I've also briefly tested on Windows (not with contour generation), and only ran into trouble when running the lzma/compression steps, I think it might that compression executable that is using some temp files that collide with other invocation. I'll look more into that.

So now I have a working solution, with nice performance improvements.
For Canada, the number looks like this

INFO:--------------------------------------------------------------------------------
INFO:# Filter tags from country osm.pbf files
INFO:+ Filtering unwanted map objects out of map of canada
INFO:+ Filter tags from country osm.pbf files: OK, took 81.311 s
INFO:--------------------------------------------------------------------------------
INFO:# Generate land for each coordinate
INFO:start land sea
INFO:start land1
INFO:+ Generate land for each coordinate: OK, took 115.640 s
INFO:--------------------------------------------------------------------------------
INFO:# Generate sea for each coordinate
INFO:+ Generate sea for each coordinate: OK, took 0.173 s
INFO:--------------------------------------------------------------------------------
INFO:# Split filtered country files to tiles
INFO:+ Split filtered country files to tiles: OK, took 34182.288 s
INFO:--------------------------------------------------------------------------------
INFO:# Merge splitted tiles with land, elevation, and sea
INFO:+ Sorted: OK, took 76.932 s
INFO:+ Merge splitted tiles with land, elevation, and sea: OK, took 389.623 s
INFO:--------------------------------------------------------------------------------
INFO:# Creating .map files for tiles
INFO:+ Created .map files for tiles: OK, took 2406.952 s
INFO:+ Compressed .map files for tiles: OK, took 2423.778 s
INFO:+ Creating .map files for tiles: OK, took 2423.880 s
INFO:--------------------------------------------------------------------------------
INFO:# Create: .map.lzma files
INFO:+ Country: canada
INFO:+ Copying .map.lzma tiles to output folders
INFO:+ Create .map.lzma files: OK, took 1.255 s
INFO:--------------------------------------------------------------------------------

(No countour, ran out of disk space when testing that, need to look more into it)

For France, the numbers look like this

INFO:+ Filter tags from country osm.pbf files: OK, took 100.156 s
INFO:--------------------------------------------------------------------------------
INFO:# Generate land for each coordinate
INFO:start land sea
INFO:start land1
INFO:+ Generate land for each coordinate: OK, took 2.508 s
INFO:--------------------------------------------------------------------------------
INFO:# Generate sea for each coordinate
INFO:+ Generate sea for each coordinate: OK, took 0.005 s
INFO:--------------------------------------------------------------------------------
INFO:# Generate contour lines for each coordinate
INFO:+ Generate contour lines for each coordinate: OK, took 111.715 s
INFO:--------------------------------------------------------------------------------
INFO:# Split filtered country files to tiles
INFO:+ Split filtered country files to tiles: OK, took 283.197 s
INFO:--------------------------------------------------------------------------------
INFO:# Merge splitted tiles with land, elevation, and sea
INFO:+ Sorted: OK, took 3.409 s
INFO:+ Merge splitted tiles with land, elevation, and sea: OK, took 90.307 s
INFO:--------------------------------------------------------------------------------
INFO:# Creating .map files for tiles
INFO:+ Created .map files for tiles: OK, took 151.222 s
INFO:+ Compressed .map files for tiles: OK, took 173.164 s
INFO:+ Creating .map files for tiles: OK, took 173.167 s
INFO:--------------------------------------------------------------------------------
INFO:# Create: .map.lzma files
INFO:+ Country: france
INFO:+ Copying .map.lzma tiles to output folders
INFO:+ Create .map.lzma files: OK, took 2.018 s
INFO:--------------------------------------------------------------------------------
INFO:# Total time 769.618 s

I will now think about how many PRs I will split this into, I think some PRs with "just" code reformatting / restructuring, and then 1-2 final PRs, with introducing the asyncio.
I think I will first have all commits in one PR, but with a sequence that makes it easy to split them out into separate PRs.

@scadu
Copy link

scadu commented Sep 19, 2024

What’s the status of this pull request? I wonder if I could take a stab based on this work, or there’s some other activity in async (or anything related to performance) area that I’ve missed?

@alfh
Copy link
Contributor Author

alfh commented Sep 19, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants