Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restructure Trackers #2719

Merged
merged 28 commits into from
Jul 29, 2024

Conversation

Sumit112192
Copy link
Contributor

📝 Description

Type: 🎢 infrastructure
This PR brings the rpacket_trackers initialization outside the montecarlo_main_loop to avoid the usage of compile time constants and reduce complexity.

Copy link

codecov bot commented Jul 22, 2024

Codecov Report

Attention: Patch coverage is 74.35897% with 10 lines in your changes missing coverage. Please review.

Project coverage is 69.30%. Comparing base (bcbe40d) to head (55b291e).
Report is 7 commits behind head on master.

Files Patch % Lines
tardis/transport/montecarlo/packet_trackers.py 43.75% 9 Missing ⚠️
...ardis/transport/montecarlo/montecarlo_main_loop.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2719      +/-   ##
==========================================
- Coverage   69.65%   69.30%   -0.36%     
==========================================
  Files         188      192       +4     
  Lines       14930    15047     +117     
==========================================
+ Hits        10400    10428      +28     
- Misses       4530     4619      +89     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tardis-bot
Copy link
Contributor

tardis-bot commented Jul 22, 2024

*beep* *bop*
Hi human,
I ran benchmarks as you asked comparing master (bcbe40d) and the latest commit (c18503a).
Here are the logs produced by ASV.
Results can also be downloaded as artifacts here.

Significantly changed benchmarks:

· Did not find results for commit c18503a794babddd9b54e1e3aaa89f733ced4e41

All benchmarks:

· Did not find results for commit c18503a794babddd9b54e1e3aaa89f733ced4e41

If you want to see the graph of the results, you can check it here

@Sumit112192
Copy link
Contributor Author

Profiling: Though the time taken fluctuates. Take +- 30s
Master Branch:
Screenshot from 2024-07-22 14-40-12

RestructureRPacketTrackers Branch:
Screenshot from 2024-07-22 14-40-02

@Sumit112192 Sumit112192 marked this pull request as ready for review July 22, 2024 09:39
@andrewfullard
Copy link
Contributor

So you are just using pure Python to initialize the lists and passing them along to the main loop?

@Sumit112192
Copy link
Contributor Author

Yes. That's the only way it could be initialized without dead branch pruning.

@Sumit112192
Copy link
Contributor Author

I could move the initialization to montecarlo_globals with numba compilation. Should I go for that?

@andrewfullard
Copy link
Contributor

Need to test the speed of the initialization

@Sumit112192
Copy link
Contributor Author

Sumit112192 commented Jul 23, 2024

Testing the speed of appending to a numba list in pure python function and function decorated with Numba.

import numba
from numba.typed import List
import timeit

from tardis.transport.montecarlo.packet_trackers import RPacketTracker, RPacketLastInteractionTracker
no_of_packets = 100000
@numba.njit
def return_rpacket_tracker():
    aList = List()
    for i in range(no_of_packets):
        aList.append(RPacketTracker(10))
    return aList

@numba.njit
def return_rpacket_last_interaction_tracker():
    aList = List()
    for i in range(no_of_packets):
        aList.append(RPacketLastInteractionTracker())
    return aList

def return_rpacket_tracker_pure():
    aList = List()
    for i in range(no_of_packets):
        aList.append(RPacketTracker(10))
    return aList

def return_rpacket_last_interaction_tracker_pure():
    aList = List()
    for i in range(no_of_packets):
        aList.append(RPacketLastInteractionTracker())
    return aList
%timeit return_rpacket_tracker()
237 ms ± 3.23 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit return_rpacket_last_interaction_tracker()
24 ms ± 753 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit return_rpacket_tracker_pure()
1.22 s ± 22.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit return_rpacket_last_interaction_tracker_pure()
794 ms ± 13.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

@Sumit112192
Copy link
Contributor Author

Testing the speed of various ways to solve the current problem.

from numba.typed import List
import numba
import timeit

from tardis.transport.montecarlo.packet_trackers import RPacketTracker, RPacketLastInteractionTracker
no_of_packets = 100000

appending in true python

def aNormalFunc(flag):
    aList = List()
    if flag:
        for i in range(no_of_packets):
            aList.append(RPacketTracker(10))
    else:
        for i in range(100000):
            aList.append(RPacketLastInteractionTracker())
    return aList

appending in numba

flag = True
@numba.njit
def aNumbaFunc():
    aList = List()
    if flag:
        for i in range(no_of_packets):
            aList.append(RPacketTracker(10))
    else:
        for i in range(no_of_packets):
            aList.append(RPacketLastInteractionTracker())
    return aList

appending in true python with loop in numba

@numba.njit
def returnRPacketTracker():
    aList = List()
    for i in range(no_of_packets):
        aList.append(RPacketTracker(10))
    return aList

@numba.njit
def returnRPacketLastInteractionTracker():
    aList = List()
    for i in range(no_of_packets):
        aList.append(RPacketLastInteractionTracker())
    return aList

def aNormalFuncWithNumbaLoop(flag):
    if flag:
        aList = returnRPacketTracker()
    else:
        aList = returnRPacketLastInteractionTracker()
    return aList

Checking runtime using timeit

%timeit aNormalFunc(True)
1.27 s ± 110 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit aNumbaFunc()
266 ms ± 22.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit aNormalFuncWithNumbaLoop(True);
243 ms ± 12.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

@Sumit112192 Sumit112192 force-pushed the RestrucutreRPacketTrackers branch from 8c8f186 to ed5fbed Compare July 23, 2024 10:16
andrewfullard
andrewfullard previously approved these changes Jul 24, 2024
benchmarks/transport_montecarlo_main_loop.py Outdated Show resolved Hide resolved
tardis/transport/montecarlo/base.py Show resolved Hide resolved
@wkerzendorf
Copy link
Member

@atharva-2001 @andrewfullard I think this means we do not need a separate RPacketTracker test suite

andrewfullard
andrewfullard previously approved these changes Jul 26, 2024
@wkerzendorf
Copy link
Member

@Sumit112192 please add tests

@tardis-bot
Copy link
Contributor

tardis-bot commented Jul 27, 2024

*beep* *bop*
Hi human,
I ran ruff on the latest commit (55b291e).
Here are the outputs produced.
Results can also be downloaded as artifacts here.
Summarised output:

Complete output(might be large):


def test_generate_rpacket_last_interaction_tracker_list():
no_of_packets = 50
random_index = np.random.randint(0, no_of_packets)
Copy link
Contributor Author

@Sumit112192 Sumit112192 Jul 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the time usage of the random number generator

import numpy as np
no_of_packets = 50
%timeit np.random.randint(0, no_of_packets)
2.89 μs ± 199 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

@Sumit112192
Copy link
Contributor Author

@Sumit112192 please add tests

Done.

@Sumit112192
Copy link
Contributor Author

I have added benchmarks for the packet_tracker utility functions, namely, time_generate_rpacket_tracker_list and time_generate_rpacket_last_interaction_tracker_list.
It's helpful if, in the future, someone adds some attributes to the tracker classes, which might slow it down so the utility benchmark can capture it.

@Sumit112192
Copy link
Contributor Author

Sumit112192 commented Jul 28, 2024

Also, I am parameterizing the benchmarks the way they are given in the documentation as opposed to the parameterized marker, i.e.,
time_benchmark_name.params = ['some_iterable1', 'some_iterable2']
time_benchmark_name.param_names = ['name1', 'name2']
I found that to be easy to follow.

@wkerzendorf wkerzendorf self-requested a review July 29, 2024 16:23
@andrewfullard andrewfullard self-requested a review July 29, 2024 16:23
Copy link
Contributor

@andrewfullard andrewfullard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to merge this if the benchmarks running can be confirmed

@Sumit112192
Copy link
Contributor Author

Sumit112192 commented Jul 29, 2024

Happy to merge this if the benchmarks running can be confirmed

The benchmarks pass if we look at the first 50% of the GitHub benchmark check. It fails on the master branch commit because the utility functions are not currently available in the master branch.

@andrewfullard andrewfullard self-requested a review July 29, 2024 18:43
@andrewfullard andrewfullard merged commit 7231707 into tardis-sn:master Jul 29, 2024
17 of 18 checks passed
@Sumit112192 Sumit112192 deleted the RestrucutreRPacketTrackers branch September 8, 2024 13:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants