
Initial version of LavaProfiler user interface #233

Open

wants to merge 48 commits into base: main
Conversation

Contributor

@PhilippPlank PhilippPlank commented Mar 21, 2022

Issue Number: #232

Objective of pull request: Implement a universal tool to capture power, energy, and execution time in Lava, which works on all backends with a single interface and gives insights into the power and performance of the components. This tool will be the LavaProfiler.

Pull request checklist

Your PR fulfills the following requirements:

  • Issue created that explains the change and why it's needed
  • Tests are part of the PR (for bug fixes / features)
  • Docs reviewed and added / updated if needed (for bug fixes / features)
  • PR conforms to Coding Conventions
  • PR applies the BSD 3-clause or LGPL-2.1+ license to all code files
  • Lint (flakeheaven lint src/lava tests/) and (bandit -r src/lava/.) pass locally
  • Build tests (pytest) pass locally

Pull request type

Please check your PR type:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no API changes)
  • Build related changes
  • Documentation changes
  • Other (please describe):

What is the current behavior?

  • No LavaProfiler

What is the new behavior?

  • LavaProfiler user interface defined and sketch of the main methods in the profiler class.

Does this introduce a breaking change?

  • Yes
  • No

Supplemental information

PhilippPlank and others added 30 commits November 12, 2021 14:53
Contributor

@awintel awintel left a comment

Good start, but lots left to do...

src/lava/utils/profiler.py (outdated thread, resolved)
@@ -22,7 +22,8 @@
from lava.magma.runtime.runtime import Runtime
from lava.magma.core.process.message_interface_enum import ActorType
from lava.magma.compiler.compiler import Compiler
from lava.magma.core.resources import Loihi1NeuroCore, Loihi2NeuroCore
from lava.magma.core.resources import (
AbstractComputeResource, Loihi1NeuroCore, Loihi2NeuroCore)


class LavaProfiler(ABC):
Contributor

Always good to have some docstrings by the time of a PR in which you expect feedback from others.

@@ -34,7 +35,7 @@ def __init__(self, start: int = 0, end: int = 0,
self.end = end
self.bin_size = bin_size
self.buffer_size = buffer_size
self.loihi_flag = False # Exchange with list of resources
self.used_resources: ty.List[AbstractComputeResource] = []

def profile(self, proc: AbstractProcess):
Contributor

docstring

tests/lava/utils/__init__.py (outdated thread, resolved)
self.buffer_size = buffer_size
self.loihi_flag = False # Exchange with list of resources

def profile(self, proc: AbstractProcess):
Contributor

We talked about two API variants: the old one, in which the profiler wraps the Proc, and probably this new version. Did you get feedback on which one is actually better? I believe our conclusion was that the old one seemed more suitable for what we want to do, so you wanted to prepare both side by side to show DR and PS the alternative.

Because one class overwriting a method attribute of another class looks borderline invasive in terms of unexpected side effects. Python allows it, but I'd imagine this would draw the anger of the Python community upon us.
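For context, the pattern under discussion can be sketched as follows (a minimal stand-alone sketch; Proc and Profiler here are simplified stand-ins for AbstractProcess and the PR's LavaProfiler, not the actual classes):

```python
import types

class Proc:
    """Hypothetical stand-in for Lava's AbstractProcess."""
    def run(self):
        return "normal run"

class Profiler:
    """Hypothetical stand-in for the PR's LavaProfiler."""
    def run(self, proc):
        # Instrumented replacement that still calls the original method.
        return "profiled " + Proc.run(proc)

    def profile(self, proc):
        # Rebind run() on this one instance. Legal Python, but invasive:
        # anyone inspecting proc later sees behavior its class never defined.
        proc.run = types.MethodType(self.run, proc)

p = Proc()
Profiler().profile(p)
print(p.run())       # "profiled normal run"
print(Proc().run())  # other instances unaffected: "normal run"
```

The surprise factor is exactly the reviewer's point: a debugger breakpoint set on `Proc.run` never fires for `p`, because the instance attribute shadows the class method.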

Contributor

I still think that the user API would be better if the user could simply define a profiler without having to change the line
proc.run(...),
i.e. the interface that PP coded here. But I also see Andreas' point: overwriting the run function feels bad. In particular, what happens if we ever change the general run method of Processes? Then we may forget to change it in the Profiler, and it will stop working.
What does the Profiler actually do under the hood? Does it just count how often the internal run_spk method is called, and how often spikes are being sent? In that case, could we

  1. ask people to just add a @counter decorator in front of the run_spk function:

         class MyProcModel(...):
             @counter
             def run_spk(self, *args, **kwargs):
                 ...

     with

         def counter(f):
             def count(*args, **kwargs):
                 count.calls += 1
                 return f(*args, **kwargs)
             count.calls = 0
             return count

     This decorator would mean that we don't need to rewrite the whole run method.
  2. subscribe to the port of the Process to read its sent spikes?

To me, that would be way less invasive. But I know too little about what's going on under the hood to say whether anything like that would work.

Contributor Author

@PhilippPlank PhilippPlank Mar 23, 2022

Both versions are slide by slide in the PowerPoint ;)
I still prefer the black box version, but the functionality is the same either way so switching it should be simple.

@phstratmann We need to do quite a few things under the hood, especially modifying the compilation process (a sketch of what we need to do is in this PR in the form of comments and mock methods).
We do not really count how often run_spk is called.


Functionally, this method does the same as run(..) of AbstractProcess,
but modifies the chosen ProcModels and executables to be able to use the
LavaProfiler. From the user perspective, it should not be noticeable as
Contributor

It will be noticeable and cause confusion as soon as somebody uses a debugger with breakpoints.

src/lava/utils/profiler.py (outdated thread, resolved)
...
return proc_map

def _set_profiler(self, proc_map):
Contributor

The name does not reflect what the method does.
You don't seem to be doing anything with the Profiler itself, but with the ProcModels, so that they can work with the Profiler.

Contributor Author

Changed to "_prepare_proc_models"

tests/lava/utils/test_profiler.py (outdated thread, resolved)
def profile(self, proc: AbstractProcess):
proc.run = types.MethodType(self.run, proc)

def get_energy(self) -> np.array:
Contributor

We probably need something more elaborate instead of, or besides, this method. Yes, we want a total time series, but in simulation we also need the ability to get time series for specific Procs or cores, or for specific contributors to the total energy.
How are we going to do this?
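One way such a query interface might look (a hedged sketch only; EnergyStore, record, and the contributor names are illustrative, not the PR's API):

```python
import numpy as np

class EnergyStore:
    """Illustrative per-Process, per-contributor energy store."""
    def __init__(self):
        # (proc_name, contributor) -> list of per-time-step energies in µJ
        self.traces = {}

    def record(self, proc, contributor, value):
        self.traces.setdefault((proc, contributor), []).append(value)

    def get_energy(self, proc=None, contributor=None):
        """Sum all matching traces into one time series; None is a wildcard."""
        selected = [v for (p, c), v in self.traces.items()
                    if (proc is None or p == proc)
                    and (contributor is None or c == contributor)]
        if not selected:
            return np.zeros(0)
        total = np.zeros(max(len(v) for v in selected))
        for v in selected:
            total[:len(v)] += v
        return total

store = EnergyStore()
store.record("lif0", "synops", 1.0)    # time step 1
store.record("lif0", "synops", 1.0)    # time step 2
store.record("dense0", "neuron", 2.0)  # time step 1
print(store.get_energy())              # total series: [3. 1.]
print(store.get_energy(proc="lif0"))   # per-Proc series: [1. 1.]
```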

@@ -12,3 +12,125 @@
elementary operation has a defined execution time and energy cost, which is
used in a performance model to calculate execution time and energy.
"""
Contributor

Docstring:
will contain -> contains [otherwise, this needs to be rewritten once it works next month, and we will forget it]
power and performance of workloads -> isn't power an aspect of performance?

Maybe a few lines on what the Profiler simulation is good for?

  • Simulate a network on CPU before it is being deployed - in which situations is that feature helpful? If people don't have access to Loihi yet. If they want to understand the performance on a fine-granular layer. ...

Contributor Author

I moved this Docstring to the Profiler class docstring without the "will".

exchange the process models accordingly.
Tell the user which Processes will not be profiled, as they lack a
profileable ProcModel."""
...
Contributor

Should we automatically switch process models? I could imagine that people may get confused in some cases. Let's assume the user first runs a process without profiler and the compiler chooses a process model that is not profilable. Then the user runs the same process with profiler. The profiler will automatically switch the process model to one that can be profiled. But these process models may differ - maybe because of a bug, maybe because of other reasons. The user will not have expected to see any different process behavior just because (s)he activated the profiler. Instead, I would expect that I receive an error message if the default process model cannot be profiled.

Contributor Author

@PhilippPlank PhilippPlank Mar 23, 2022

My approach would be that each ProcModel has a profileable counterpart ProcModel, which only adds the operation counters.
For the LIF neuron, for example, we would need a profileable ProcModel for "fixed_pt" and for "floating_pt", and there is a 1:1 mapping to exchange them if the profiler is present. I do not mess with the compiler, but only afterwards look for ProcModels which can be exchanged.

If there is no such ProcModel, the user is informed that this Process is not considered by the Profiler. If no chosen ProcModel has a profileable version and we run in simulation only, then an error is raised stating that no Process is considered by the Profiler.
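That exchange step might look roughly like this (all names here are hypothetical; the PR's actual code would map ProcModel classes inside the proc_map, while strings keep the sketch short):

```python
# Hypothetical 1:1 mapping from a ProcModel to its profileable counterpart.
PROFILEABLE_MAP = {
    "PyLifModelFixed": "PyLifModelFixedProfiled",
    "PyLifModelFloat": "PyLifModelFloatProfiled",
}

def exchange_proc_models(proc_map, profileable_map):
    """Swap in profileable ProcModels after compilation.

    Informs the user about Processes lacking a profileable counterpart,
    and raises if no Process at all can be profiled."""
    exchanged, skipped = {}, []
    for proc, model in proc_map.items():
        if model in profileable_map:
            exchanged[proc] = profileable_map[model]
        else:
            exchanged[proc] = model
            skipped.append(proc)
    if len(skipped) == len(proc_map):
        raise RuntimeError("No Process is considered by the Profiler.")
    for proc in skipped:
        print(f"Process '{proc}' will not be profiled.")
    return exchanged

result = exchange_proc_models(
    {"lif0": "PyLifModelFixed", "dense0": "PyDenseModel"}, PROFILEABLE_MAP)
# lif0 is exchanged; dense0 keeps its model and triggers a warning.
```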

Contributor

What does this mean in terms of code duplication? Does it mean there is a ProcModel class and then an almost identical ProcModel class that just adds counters?

Before we go into a whole lot of implementation, as usual, we should first write down an end-to-end (mock) example of what we are trying to enable and then agree that this is the best way to go. Such decisions are best made not in the abstract, for people who have not thought deeply about the pros and cons before, but using a concrete example.

Do we have such an example already?

If not I suggest you draft one and share it. We are at an important fork in the road, so we should get some wider input.

Contributor Author

I added an example in proc/lif/models.py

We can inherit the ProcModel and add the code for operation counters.
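As a minimal sketch of that inheritance pattern (PyLifModel below is a toy stand-in, not Lava's actual LIF ProcModel, and the counted quantities are illustrative):

```python
import numpy as np

class PyLifModel:
    """Toy LIF ProcModel stand-in with a simplified current dynamic."""
    def __init__(self, du=0.1):
        self.u = np.zeros(4)
        self.du = du

    def run_spk(self, a_in):
        self.u = self.u * (1 - self.du) + a_in
        return self.u > 1.0

class PyLifModelProfiled(PyLifModel):
    """Inherits the behavior unchanged and only adds operation counters."""
    def __init__(self, du=0.1):
        super().__init__(du)
        self.synops = 0  # non-zero synaptic inputs processed
        self.spikes = 0  # spikes emitted

    def run_spk(self, a_in):
        s_out = super().run_spk(a_in)
        self.synops += int(np.count_nonzero(a_in))
        self.spikes += int(np.count_nonzero(s_out))
        return s_out

model = PyLifModelProfiled()
model.run_spk(np.array([2.0, 0.0, 0.0, 0.0]))
print(model.synops, model.spikes)  # 1 1
```

The duplication is limited to one thin subclass per ProcModel, which addresses the code-duplication concern above: the parent's dynamics are reused, not copied.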

"""Returns the energy estimate per time step in µJ."""
...
return 0

Contributor

Should we give the user the option to receive either the whole time series or just the total time / energy? I could imagine that for particularly long runs, we run into memory or runtime problems if we store one value for each time step. If we just accumulate the values, that may often suffice.
In addition, what about a parameter to set the frequency of measurements? We may often not need a value for each time step, but only for every 1000th time step.
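The accumulation idea can be sketched independently of the PR (bin_energy below is a hypothetical helper, though the profiler's constructor already takes a bin_size argument that could drive it):

```python
import numpy as np

def bin_energy(per_step_energy, bin_size):
    """Accumulate a per-time-step energy trace into coarser bins, so long
    runs store one value per bin instead of one value per time step."""
    e = np.asarray(per_step_energy, dtype=float)
    n_bins = int(np.ceil(e.size / bin_size))
    padded = np.zeros(n_bins * bin_size)  # zero-pad the last partial bin
    padded[:e.size] = e
    return padded.reshape(n_bins, bin_size).sum(axis=1)

print(bin_energy([1, 1, 1, 1, 1], bin_size=2))  # [2. 2. 1.]
```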

Contributor Author

Sure, this is just a first example of getting results. We can offer more sophisticated options, including standard plots etc.

Contributor

@phstratmann phstratmann left a comment

Great work! Some comments and questions, though.

@tim-shea
Contributor

@PhilippPlank I propose to close this unmerged as it has grown too far out of sync with our current plan for PnP, are you okay with that?
