Performance #514
Previously rpacket_set_next_line_id() was called at the beginning of the function and later get_next_line_id() - 1 was used to get the current id. This was changed and set_next_line_id() is now called at the right position.
The option '--gdb' will print the pid and pause before running the simulation. This allows attaching gdb to the running process to facilitate debugging.
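A minimal sketch of how such a flag could behave, assuming an argparse-based launcher like the one in this PR's diff. The helper name `pause_for_debugger` and its `prompt` parameter are hypothetical and only serve the illustration; this is not the code from the commit.

```python
import argparse
import os


def pause_for_debugger(prompt=input):
    """Print this process's pid, then block until the user confirms.

    `prompt` is a hypothetical hook (defaults to the built-in input) so the
    pause can be replaced in tests.
    """
    pid = os.getpid()
    print('pid: %d' % pid)
    # While the process is blocked here, attach from another terminal with:
    #     gdb -p <pid>
    prompt('Attach gdb, then press Enter to continue...')
    return pid


parser = argparse.ArgumentParser()
parser.add_argument('--gdb', action='store_true', help='print pid and pause')

# In the launcher one would then write something like:
#     args = parser.parse_args()
#     if args.gdb:
#         pause_for_debugger()
#     ... start the simulation ...
```

Pausing inside the already running Python process means gdb attaches after the C extension has been loaded, so breakpoints in the C code can be set before the simulation starts.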
Previously accessing two back-to-back elements of a two-dimensional array resulted in two completely different memory locations. This resulted in memory bandwidth being the bottleneck for these calculations. This change converts the arrays to a layout beneficial for our calculations before passing them to the C extension. The indexing of those arrays is changed accordingly. The result of the j_blue_estimator has to be converted back to the format Python expects. Additionally the j_blue_estimators will only be calculated if the corresponding array exists. This allows setting line_lists_j_blues to point to NULL in order to skip this part if it is not needed.
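The effect of the layout change can be illustrated with plain index arithmetic. This is a sketch under assumptions: the array shape `(n_lines, n_shells)`, the idea that the C code walks consecutive lines within one shell, and the names `index_before` and `index_after` are all illustrative and not taken from the commit.

```python
n_lines, n_shells = 4, 3

# Flat row-major buffer for an array of shape (n_lines, n_shells);
# the value 10 * line + shell makes every element easy to recognise.
tau = [10 * line + shell
       for line in range(n_lines)
       for shell in range(n_shells)]


def index_before(line, shell):
    # Original layout: neighbouring lines are n_shells elements apart.
    return line * n_shells + shell


def index_after(line, shell):
    # Converted layout: neighbouring lines of one shell are adjacent.
    return shell * n_lines + line


# Reorder the buffer once, before it would be handed to the C extension.
tau_converted = [tau[index_before(line, shell)]
                 for shell in range(n_shells)
                 for line in range(n_lines)]

stride_before = index_before(1, 0) - index_before(0, 0)
stride_after = index_after(1, 0) - index_after(0, 0)
```

With NumPy the same reordering of a C-contiguous array `a` can be obtained with `np.ascontiguousarray(a.T)`; the result then has to be transposed back before Python code uses it with the original shape.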
@wkerzendorf @unoebauer I tested this setup with The speedup I measured was ~20-25%. Can you please point me to other setups I can run to ensure I didn't implement any nasty bugs? If I'm not mistaken, all I have to do is ensure that both macroatom and downbranch return the same results, right? Other parameters have no influence on the MC simulation.
Hi @yeganer - having a quick look at the changes, it looks to me as if the sequence of RNG calls is not altered. To double check that the optimization tweaks do not interfere with the implemented physics, I'd suggest that you do a bit comparison of the spectra before and after the optimization changes for
@unoebauer - are they in the TEP?
@wkerzendorf yes (apart from tardis_example)
The calculation that needs to be performed is a simple trigonometric calculation with two possible results. For performance reasons one can introduce a preliminary check. For a detailed discussion see tardis-sn#497.

This results in the following scheme:
- For mu > 0 (outward propagation) the next boundary is the outer boundary (case 1).
- For mu < 0 we have to decide whether we hit the inner boundary or not and set the distance accordingly (cases 2 & 3).

Additionally, branch prediction optimization was performed in distance2line. The compiler always assumes if statements to be true, which means the default branch should always be the one the if statement checks for. This slightly reduces instruction misses.

Resolves: tardis-sn#497
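The three cases can be sketched in Python as follows. This is an illustration of the geometry only, not the C code from the commit: the function name `distance_to_boundary`, its argument names, and the returned shell step (+1 for the outer boundary, -1 for the inner one) are assumptions. Here `r` is the packet radius, `mu` the direction cosine, and `r_inner`, `r_outer` the radii of the current shell.

```python
import math


def distance_to_boundary(r, mu, r_inner, r_outer):
    """Return (distance, shell_step) for a packet at radius r.

    shell_step is +1 when the outer boundary is hit and -1 for the inner one.
    Assumes r_inner <= r <= r_outer and -1 <= mu <= 1.
    """

    def to_outer():
        # Positive root of |x + d * n|^2 = r_outer^2 along the ray.
        return math.sqrt(r_outer * r_outer + (mu * mu - 1.0) * r * r) - r * mu

    if mu > 0.0:
        # Case 1: moving outward, only the outer boundary can be reached.
        return to_outer(), 1

    # Moving inward (mu == 0 also lands here): the sign of the discriminant
    # decides whether the ray intersects the inner sphere at all.
    check = r_inner * r_inner + (mu * mu - 1.0) * r * r
    if check >= 0.0:
        # Case 2: the ray hits the inner boundary.
        return -r * mu - math.sqrt(check), -1
    # Case 3: the ray passes by the inner sphere and leaves through the outer one.
    return to_outer(), 1
```

Checking the sign of `mu` first avoids computing the inner-boundary discriminant for every outward-moving packet, which is the preliminary check described above.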
@unoebauer @wkerzendorf @orbitfold I updated this PR and I'm not seeing any difference between spectra.
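A bit comparison of two spectra can be done on the output files directly. The sketch below assumes each run writes its spectrum to a file; the helper names `spectra_bit_identical` and `write_spectrum`, the binary file format, and the demo values are hypothetical and only illustrate why a bitwise check is stricter than a tolerance-based one.

```python
import filecmp
import os
import struct
import tempfile


def spectra_bit_identical(path_before, path_after):
    # shallow=False forces a byte-by-byte comparison of the file contents
    # instead of trusting size and modification time.
    return filecmp.cmp(path_before, path_after, shallow=False)


def write_spectrum(path, values):
    # Hypothetical format: raw little-endian doubles, one per bin.
    with open(path, 'wb') as handle:
        handle.write(struct.pack('<%dd' % len(values), *values))


tmp_dir = tempfile.mkdtemp()
before = os.path.join(tmp_dir, 'before.bin')
after = os.path.join(tmp_dir, 'after.bin')
changed = os.path.join(tmp_dir, 'changed.bin')

write_spectrum(before, [1.0, 0.3])
write_spectrum(after, [1.0, 0.3])
# 0.1 + 0.2 differs from 0.3 in the last bit, as a reordered sum might.
write_spectrum(changed, [1.0, 0.1 + 0.2])

same = spectra_bit_identical(before, after)
differs = not spectra_bit_identical(before, changed)
```

Because the sequence of RNG calls is unchanged, a pure refactoring should reproduce the spectrum bit for bit, so even a last-bit difference like the one in `changed.bin` would point to altered arithmetic.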
@@ -36,6 +36,8 @@ parser.add_argument('--profile', action='store_true', help=
parser.add_argument('--profiler_log_file', default='profiler.log', help=
    'name of the profiler output file')

parser.add_argument('--gdb', action='store_true', help='print pid and pause')
do we still need this - or is this a debug statement from you?
This is a debug statement I added specifically to enable debugging. If we don't want that capability I can remove that commit, but I think it's a handy option. I wasn't able to start tardis with gdb to debug the C code, so I added it.
that's fine - just wanted to know.
@unoebauer @wkerzendorf @chvogl @orbitfold I reran my comparison with this commit and the previous one (with recently_crossed_boundary missing), and both gave me spectra which were bit-equal for all three configurations. With that information I think it is safe to assume that I didn't change any logic. As for the issue with packets potentially having a discriminant < 0, which would result in the packet missing the outer boundary: a quick calculation showed that
@tardis-sn/tardis-core are we happy to merge this?
@wkerzendorf I'm happy to merge. My local branch has
@yeganer I'm happy to merge this! I need one more confirmation from @tardis-sn/tardis-core
I'm fine as well
This is a small collection of performance boosts and improvement of code readability.
Included changes:
tardis_example.yml
tardis_w7.yml
~20-25%
abn_tom_test.yml
~31%