Initial version of LavaProfiler user interface #233
base: main
Conversation
Good start, but lots left to do...
src/lava/utils/profiler.py
Outdated
@@ -22,7 +22,8 @@
 from lava.magma.runtime.runtime import Runtime
 from lava.magma.core.process.message_interface_enum import ActorType
 from lava.magma.compiler.compiler import Compiler
-from lava.magma.core.resources import Loihi1NeuroCore, Loihi2NeuroCore
+from lava.magma.core.resources import (
+    AbstractComputeResource, Loihi1NeuroCore, Loihi2NeuroCore)


 class LavaProfiler(ABC):
Always good to have some docstrings by the time of a PR in which you expect feedback from others.
@@ -34,7 +35,7 @@ def __init__(self, start: int = 0, end: int = 0,
         self.end = end
         self.bin_size = bin_size
         self.buffer_size = buffer_size
-        self.loihi_flag = False  # Exchange with list of resources
+        self.used_resources: ty.List[AbstractComputeResource] = []

     def profile(self, proc: AbstractProcess):
docstring
        self.buffer_size = buffer_size
        self.loihi_flag = False  # Exchange with list of resources

    def profile(self, proc: AbstractProcess):
We talked about two API variants: the old one, in which the profiler wraps the Proc, and probably this new version. Did you get feedback on which one is actually better? I believe our conclusion was that the old one seemed more suitable for what we want to do, so you wanted to prepare both side by side to show DR and PS the alternative.
Because one class overwriting a method attribute of another class looks borderline invasive in terms of unexpected side effects. Python allows it, but I'd imagine this would draw the anger of the Python community upon us.
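The side-effect concern can be illustrated with a minimal, self-contained sketch of the rebinding pattern used in this PR. The class names and return values here are hypothetical placeholders, not Lava's actual API:

```python
import types

class Process:
    def run(self):
        return "original run"

class Profiler:
    def run(self, proc_self):
        # Hypothetical replacement that wraps the original behavior.
        return "profiled " + Process.run(proc_self)

    def profile(self, proc):
        # Rebind the instance attribute `run` to the profiler's method.
        # The instance now behaves differently from its class definition.
        proc.run = types.MethodType(self.run, proc)

proc = Process()
Profiler().profile(proc)
print(proc.run())       # "profiled original run" -- instance behavior changed
print(Process().run())  # "original run" -- other instances are unaffected
```

This is legal Python, but anyone reading `Process` alone cannot tell that `run` may have been swapped out at runtime, which is the "unexpected side effects" worry above.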
I still think that the user API would be better if the user could simply define a profiler without having to change their line
proc.run(...),
i.e. the interface that PP coded here. But I also see Andreas' point: overwriting the run function feels bad. In particular, what happens if we ever change the general run method of Processes? Then we may forget to change it in the Profiler, and it will stop working.
What does the Profiler actually do under the hood? Does it just count how often the internal run_spk method is called, and how often spikes are being sent? In that case, could we
- ask people to just add a @counter decorator in front of the run_spk function to count runs:
class MyProcModel(list):
    @counter
    def run_spk(self, *args, **kwargs):
        ...
with
def counter(f):
    def count(*args, **kwargs):
        count.calls += 1
        return f(*args, **kwargs)
    count.calls = 0
    return count
This decorator would mean that we don't need to rewrite the whole run method.
- subscribe to the port of the Process to read its sent spikes?
To me, that would be way less invasive. But I know too little about what's going on under the hood to say if anything like that would work.
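The decorator idea from the comment above can be exercised standalone. This is a sketch with a hypothetical stand-in model class (not a real Lava ProcModel):

```python
def counter(f):
    """Wrap f so the number of calls is recorded on the wrapper itself."""
    def count(*args, **kwargs):
        count.calls += 1
        return f(*args, **kwargs)
    count.calls = 0
    return count

class MyModel:  # hypothetical stand-in for a ProcModel
    @counter
    def run_spk(self):
        pass  # neuron dynamics would go here

m = MyModel()
for _ in range(3):
    m.run_spk()
print(MyModel.run_spk.calls)  # 3
```

One caveat worth noting for the discussion: the counter lives on the function object, so the count is shared across all instances of the class; a per-instance count would need the counter state stored on `self` instead.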
Both versions are side by side in the PowerPoint ;)
I still prefer the black box version, but the functionality is the same either way so switching it should be simple.
@phstratmann We need to do quite a few things under the hood, especially modifying the compilation process (the sketch of what we need to do is in form of comments and mock methods in this PR).
We do not really count how often run_spk is called.
src/lava/utils/profiler.py
Outdated
    Functionally, this method does the same as run(..) of AbstractProcess,
    but modifies the chosen ProcModels and executables to be able to use the
    LavaProfiler. From the user perspective, it should not be noticeable as
It will be noticeable and cause confusion as soon as somebody uses a debugger with breakpoints.
src/lava/utils/profiler.py
Outdated
        ...
        return proc_map

    def _set_profiler(self, proc_map):
The name does not reflect what the method does.
You don't seem to be doing anything with the Profiler itself but with the ProcModels so they can work with the Profiler.
Changed to "_prepare_proc_models"
    def profile(self, proc: AbstractProcess):
        proc.run = types.MethodType(self.run, proc)

    def get_energy(self) -> np.array:
We probably need something more elaborate instead of or besides this method. Yes, we want a total time series, but in simulation we also need the ability to get time series for specific Procs or cores, or for specific contributors to the total energy.
How are we going to do this?
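One possible shape for such a richer query API is sketched below. The class, its method names, and the per-Process bookkeeping are hypothetical illustrations of the requirement (per-Proc series plus an element-wise total), not proposed final code:

```python
import numpy as np

class EnergyLog:
    """Hypothetical per-Process energy recorder (not Lava's actual API)."""

    def __init__(self):
        self._series = {}  # Process name -> list of per-timestep energies (uJ)

    def record(self, proc_name, value):
        self._series.setdefault(proc_name, []).append(value)

    def get_energy(self, proc_name=None):
        """Time series for one Process, or the element-wise total over all."""
        if proc_name is not None:
            return np.asarray(self._series[proc_name])
        return np.sum([np.asarray(v) for v in self._series.values()], axis=0)

log = EnergyLog()
for t in range(4):
    log.record("lif", 1.0)    # example contributor names are made up
    log.record("dense", 0.5)

print(log.get_energy("lif"))  # [1. 1. 1. 1.]
print(log.get_energy())       # [1.5 1.5 1.5 1.5]
```

The same keyed structure could be extended with keys for cores or energy contributors (synaptic ops, neuron updates, ...) without changing the query interface.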
src/lava/utils/profiler.py
Outdated
@@ -12,3 +12,125 @@
 elementary operation has a defined execution time and energy cost, which is
 used in a performance model to calculate execution time and energy.
 """
Docstring:
will contain -> contains [otherwise, this needs to be rewritten once it works next month, and we will forget it]
power and performance of workloads -> isn't power an aspect of performance?
Maybe a few lines on what the Profiler simulation is good for?
- Simulate a network on CPU before it is being deployed - in which situations is that feature helpful? If people don't have access to Loihi yet. If they want to understand the performance on a fine-granular layer. ...
I moved this Docstring to the Profiler class docstring without the "will".
        exchange the process models accordingly.
        Tell the user which Processes will not be profiled, as they lack a
        profileable ProcModel."""
        ...
Should we automatically switch process models? I could imagine that people may get confused in some cases. Let's assume the user first runs a process without profiler and the compiler chooses a process model that is not profilable. Then the user runs the same process with profiler. The profiler will automatically switch the process model to one that can be profiled. But these process models may differ - maybe because of a bug, maybe because of other reasons. The user will not have expected to see any different process behavior just because (s)he activated the profiler. Instead, I would expect that I receive an error message if the default process model cannot be profiled.
My approach would be that each ProcModel has a profileable counterpart ProcModel, which only adds the operation counters.
For the LIF neuron, for example, we would need a profileable ProcModel for "fixed_pt" and "floating_pt", and there is a 1:1 mapping to exchange them if the profiler is present. I do not mess with the compiler, but only afterwards look for ProcModels which can be exchanged.
If there is no such ProcModel, the user will be informed that this Process is not considered by the Profiler. If no chosen ProcModel has a profileable version and we run on simulation only, then an error states that no Process is considered by the Profiler.
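The exchange-after-compilation idea described above can be sketched as follows. All names and the shape of `proc_map` are hypothetical simplifications of Lava's internal structures, used only to make the warn/error behavior concrete:

```python
import warnings

# Hypothetical 1:1 mapping from a chosen ProcModel class to its
# profileable counterpart; populated by whoever provides the models.
PROFILEABLE_MAP = {}

def exchange_proc_models(proc_map):
    """Swap each ProcModel for its profileable counterpart, if one exists.

    `proc_map` maps Process names to ProcModel classes. Processes without
    a profileable counterpart are kept as-is but reported to the user.
    """
    exchanged = {}
    any_profiled = False
    for name, model_cls in proc_map.items():
        counterpart = PROFILEABLE_MAP.get(model_cls)
        if counterpart is None:
            warnings.warn(f"Process '{name}' will not be profiled: "
                          "no profileable ProcModel available.")
            exchanged[name] = model_cls
        else:
            exchanged[name] = counterpart
            any_profiled = True
    if not any_profiled:
        raise RuntimeError("No Process is considered by the Profiler.")
    return exchanged

# Hypothetical example models:
class PyLifFloat:
    pass

class ProfileablePyLifFloat(PyLifFloat):
    pass

PROFILEABLE_MAP[PyLifFloat] = ProfileablePyLifFloat
out = exchange_proc_models({"lif": PyLifFloat})
print(out["lif"].__name__)  # ProfileablePyLifFloat
```

Because the exchange runs after ProcModel selection, the compiler's own selection logic stays untouched, matching the "I do not mess with the compiler" intent.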
What does this mean in terms of code duplication? Does it mean there is a ProcModel class and then an almost identical ProcModel class that just adds counters?
Before we go into a whole lot of implementation, as usual, we should first write down an end-to-end (mock) example of what we are trying to enable and then agree that this is the best way to go. Such decisions are best made not in the abstract, for people who have not thought about the pros and cons deeply before, but using a concrete example.
Do we have such an example already?
If not I suggest you draft one and share it. We are at an important fork in the road, so we should get some wider input.
I added an example in proc/lif/models.py
We can inherit the ProcModel and add the code for operation counters.
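The inheritance approach keeps code duplication small: the counterpart only overrides what it counts. A minimal sketch, with hypothetical class names standing in for the real LIF ProcModels in proc/lif/models.py:

```python
class PyLifModelFloat:
    """Hypothetical stand-in for the floating-point LIF ProcModel."""

    def run_spk(self):
        # ... neuron dynamics would go here ...
        return "spikes"

class ProfileablePyLifModelFloat(PyLifModelFloat):
    """Same behavior as the parent, plus an operation counter."""

    def __init__(self):
        self.neuron_update_count = 0

    def run_spk(self):
        self.neuron_update_count += 1  # count the operations performed
        return super().run_spk()       # behavior itself is unchanged

m = ProfileablePyLifModelFloat()
m.run_spk()
m.run_spk()
print(m.neuron_update_count)  # 2
```

Since the subclass delegates to `super().run_spk()`, a behavioral divergence between the two models can only come from the counting code itself, which addresses the duplication concern raised above.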
        """Returns the energy estimate per time step in µJ."""
        ...
        return 0
Should we give the user the option to either receive the whole time series or just the total time/energy? I could imagine that for particularly long runs we run into memory or runtime problems if we store one value for each time step. If we just accumulate the values, that may often suffice.
In addition, what about a parameter to control the frequency of measurements? We may often not need a value for each time step, but only for each 1000th time step.
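Both options (running total only, or a strided time series) can be combined in one recorder. A sketch with hypothetical names, purely to make the memory trade-off concrete:

```python
class EnergyAccumulator:
    """Hypothetical recorder: always keeps a total, optionally a strided series."""

    def __init__(self, stride=1, keep_series=True):
        self.stride = stride          # record one sample every `stride` steps
        self.keep_series = keep_series
        self.total = 0.0
        self.series = []
        self._t = 0

    def record(self, value):
        self.total += value           # accumulation is always cheap
        if self.keep_series and self._t % self.stride == 0:
            self.series.append(value)  # memory grows only every `stride` steps
        self._t += 1

acc = EnergyAccumulator(stride=1000, keep_series=True)
for t in range(5000):
    acc.record(1.0)

print(acc.total)        # 5000.0
print(len(acc.series))  # 5 -- one sample per 1000 steps
```

With `keep_series=False`, memory use is constant regardless of run length, which covers the long-run concern above.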
Sure, this is just a first example of getting results. We can offer more sophisticated options, including standard plots etc.
Great work! Some comments and questions, though.
@PhilippPlank I propose to close this unmerged, as it has grown too far out of sync with our current plan for PnP. Are you okay with that?
Issue Number: #232
Objective of pull request: Implement a universal tool to capture power, energy and execution time in Lava, which works on all backends with one interface across backends and gives insights about the power and performance of the components. This tool will be the LavaProfiler.
Pull request checklist
Your PR fulfills the following requirements:
(flakeheaven lint src/lava tests/) and (bandit -r src/lava/.) pass locally
(pytest) passes locally
Pull request type
Please check your PR type:
What is the current behavior?
What is the new behavior?
Does this introduce a breaking change?
Supplemental information