Profiling code #44

Open
dpshelio opened this issue Dec 3, 2024 · 2 comments
Labels
preparation Exercises to do before the class week10 Performance

Comments

Contributor

dpshelio commented Dec 3, 2024

Even when we measure the total time that a function takes to run (#43), that doesn't help us with knowing which parts of the code are slow!

To look into that, we need a different tool called a profiler. Python comes with its own profiler, but we will use a more convenient tool.
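For reference, here is a minimal sketch of what using Python's built-in profiler directly could look like, with the standard-library `cProfile` and `pstats` modules. The function `slow_square_sum` is a hypothetical stand-in invented for this illustration and is not part of the exercise.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Hypothetical toy function: build a list of squares, then sum it."""
    squares = [k * k for k in range(n)]
    return sum(squares)


# Run the function under the profiler and keep its return value.
profiler = cProfile.Profile()
result = profiler.runcall(slow_square_sum, 1000)

# Write the report into a string, sorted by the time spent inside each
# function itself (the "tottime" column), and keep only the top entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("tottime").print_stats(5)
report = stream.getvalue()
print(report)
```

The report has one row per function, which is the same kind of per-function table this exercise asks you to read.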

Setup

This exercise works with IPython or Jupyter notebooks, and uses two "magic" commands available there. You may need to set them up first.

If you use Anaconda, you should already have access to Jupyter. If you don't, let us know on Moodle or use pip install ipython to install IPython.

The %prun magic should already be available with every installation of IPython/Jupyter. However, you may need to install the second magic (%lprun).
If you use Anaconda, run conda install line_profiler from a terminal. Otherwise, use pip install line_profiler.

Using profiling tools in IPython/Jupyter notebook

The %prun magic gives us information about every function called.

  1. Open a Jupyter notebook or an IPython terminal.
  2. Add an interesting function (from Jake VanderPlas's book)
    def sum_of_lists(N):
        total = 0
        for i in range(5):
            L = [j ^ (j >> i) for j in range(N)]
            # j >> i == j // 2 ** i (shift j bits i places to the right)
            # j ^ i -> bitwise exclusive or; j's bit doesn't change if i's = 0, changes to complement if i's = 1
            total += sum(L)
        return total
  3. Run %prun:
    %prun sum_of_lists(10_000_000)
  4. Look at the table of results. What information does it give you? Can you find which operation takes the most time? (You may find it useful to look at the last column first.)
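If the two bitwise operators in `sum_of_lists` are unfamiliar, the short sketch below checks the claims made in the code comments on small numbers. The variable names are chosen for this illustration only.

```python
# j >> i should equal floor division by 2 ** i for the non-negative
# integers used in sum_of_lists.
shift_matches_division = all(
    (j >> i) == j // 2 ** i for j in range(64) for i in range(5)
)

# ^ is bitwise exclusive or: a bit of the left operand is kept where the
# right operand has a 0 and flipped where it has a 1.
xor_example = 0b1100 ^ 0b1010

# With i = 0 every element is j ^ j, so the first list is all zeros.
first_pass = [j ^ (j >> 0) for j in range(8)]

# With i = 1 each element is j ^ (j // 2).
second_pass = [j ^ (j >> 1) for j in range(8)]

print(shift_matches_division, bin(xor_example), first_pass, second_pass)
```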

Using a line profiler in IPython/Jupyter

While prun presents its results by function, the lprun magic gives us line-by-line details.

  1. Load the extension in your IPython shell or Jupyter notebook
    %load_ext line_profiler
  2. Run %lprun
    %lprun -f sum_of_lists sum_of_lists(10_000_000)
  3. Can you interpret the results? On which line is most of the time spent?
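If line_profiler is not available, a rough stand-in is to time the two lines inside the loop by hand with `time.perf_counter`. This is only a sketch of the idea, not how %lprun works internally; `timed_sum_of_lists` is a hypothetical variant written for this illustration, and it uses a smaller N so that it finishes quickly.

```python
import time


def timed_sum_of_lists(N):
    """Hypothetical variant of sum_of_lists that times its two loop lines."""
    total = 0
    build_time = 0.0  # seconds spent building the lists
    sum_time = 0.0    # seconds spent summing them
    for i in range(5):
        start = time.perf_counter()
        L = [j ^ (j >> i) for j in range(N)]
        middle = time.perf_counter()
        total += sum(L)
        end = time.perf_counter()
        build_time += middle - start
        sum_time += end - middle
    return total, build_time, sum_time


total, build_time, sum_time = timed_sum_of_lists(200_000)
elapsed = build_time + sum_time
print(f"list comprehension: {build_time:.4f} s ({100 * build_time / elapsed:.1f}%)")
print(f"sum:                {sum_time:.4f} s ({100 * sum_time / elapsed:.1f}%)")
```

The percentages play the same role as the "% Time" column in the %lprun output, although the exact numbers will differ because the profiler adds its own overhead.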

Finishing up

When you are done, react to this issue using one of the available emojis, and/or comment with your findings: Which function takes the most time? Which line of the code?

@dpshelio dpshelio added preparation Exercises to do before the class week10 Performance labels Dec 3, 2024

rn-phelan commented Dec 8, 2024

Timer unit: 1e-09 s

Total time: 8.73761 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def sum_of_lists(N):
     2         1       1000.0   1000.0      0.0      total = 0
     3         6       9000.0   1500.0      0.0      for i in range(5):
     4  50000005 8471798000.0    169.4     97.0          L = [j ^ (j >> i) for j in range(N)]
     7         5  265802000.0    5e+07      3.0          total += sum(L)
     8         1       2000.0   2000.0      0.0      return total

(lines 5 and 6 are comments)

L = [j ^ (j >> i) for j in range(N)] takes the majority of the time (97%), then total += sum(L) (3%)


r4b61t commented Dec 8, 2024

Line 4 takes the most time

In [3]: %lprun -f sum_of_lists sum_of_lists(10_000_000)
Timer unit: 1e-09 s

Total time: 11.7898 s
File: <ipython-input-2-5de972046bd3>
Function: sum_of_lists at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def sum_of_lists(N):
     2         1        637.0    637.0      0.0      total = 0
     3         6       7876.0   1312.7      0.0      for i in range(5):
     4  50000005        1e+10    230.1     97.6          L = [j ^ (j >> i) for j in range(N)]
     5                                                   # j >> i == j // 2 ** i (shift j bits i places to the right)
     6                                                   # j ^ i -> bitwise exclusive or; j's bit doesn't change if i's = 0, changes to complement if i's = 1
     7         5  284864757.0    6e+07      2.4          total += sum(L)
     8         1       1376.0   1376.0      0.0      return total
