-
Notifications
You must be signed in to change notification settings - Fork 1
/
2013-05-30-speeding-up-matplotlib.html
232 lines (194 loc) · 10 KB
/
2013-05-30-speeding-up-matplotlib.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<link rel="alternate"
type="application/rss+xml"
href="https://bastibe.de/rss.xml"
title="RSS feed for https://bastibe.de/">
<title>Speeding up Matplotlib</title>
<meta name="author" content="Bastian Bechtold">
<meta name="referrer" content="no-referrer">
<link href= "static/style.css" rel="stylesheet" type="text/css" />
<link rel="icon" href="static/favicon.ico">
<link rel="apple-touch-icon-precomposed" href="static/favicon-152.png">
<link rel="msapplication-TitleImage" href="static/favicon-144.png">
<link rel="msapplication-TitleColor" href="#0141ff">
<script src="static/katex.min.js"></script>
<script src="static/auto-render.min.js"></script>
<script src="static/lightbox.js"></script>
<link rel="stylesheet" href="static/katex.min.css">
<script>document.addEventListener("DOMContentLoaded", function() { renderMathInElement(document.body); });</script>
<meta http-equiv="content-type" content="application/xhtml+xml; charset=UTF-8">
<meta name="viewport" content="initial-scale=1,width=device-width,minimum-scale=1"></head>
<body>
<div id="preamble" class="status"><div class="header">
<a href="https://bastibe.de">Basti's Scratchpad on the Internet</a>
<div class="sitelinks">
<a href="https://github.com/bastibe">Github</a> | <a href="https://bastibe.de/projects.html">Projects</a> | <a href="https://bastibe.de/uses.html">Uses</a> | <a href="https://bastibe.de/reviews.html">Reviews</a> | <a href="https://bastibe.de/about.html">About</a>
</div>
</div></div>
<div id="content">
<div class="post-date">30 May 2013</div><h1 class="post-title"><a href="https://bastibe.de/2013-05-30-speeding-up-matplotlib.html">Speeding up Matplotlib</a></h1>
<p>
For the record, <a href="http://matplotlib.org">Matplotlib</a> is awesome! Its output looks amazing, it is extremely configurable and very easy to use. What more could you want?
</p>
<p>
Well… speed. If there is one thing I could criticize about Matplotlib, it is its relative slowness. To measure that, lets make a very simple line plot and draw some random numbers as quickly as possible:
</p>
<div class="org-src-container">
<pre class="src src-python">import matplotlib.pyplot as plt
import numpy as np
import time
fig, ax = plt.subplots()
tstart = time.time()
num_plots = 0
while time.time()-tstart < 1:
ax.clear()
ax.plot(np.random.randn(100))
plt.pause(0.001)
num_plots += 1
print(num_plots)
</pre>
</div>
<p>
On my machine, I get about 11 plots per second. I am using <code>pause()</code> here to update the plot without blocking. The correct way to do this is to use <code>draw()</code> instead, but due to a bug in the Qt4Agg backend, you can't use it there. If you are not using the Qt4Agg backend, <code>draw()</code> is supposedly the correct choice.
</p>
<p>
For a single plot, ten plots per second is not terrible. But then, this is really the simplest case possible, so ten frames per second in the simplest case probably means bad things for not so simple cases.
</p>
<p>
One thing that really takes time here is creating all the axes and text labels over and over again. So let's not do that.
</p>
<p>
Instead of calling <code>clear()</code> and then <code>plot()</code>, thus effectively deleting everything about the plot, then re-creating it for every frame, we can keep an existing plot and only modify its data:
</p>
<div class="org-src-container">
<pre class="src src-python">fig, ax = plt.subplots()
line, = ax.plot(np.random.randn(100))
tstart = time.time()
num_plots = 0
while time.time()-tstart < 1:
line.set_ydata(np.random.randn(100))
plt.pause(0.001)
num_plots += 1
print(num_plots)
</pre>
</div>
<p>
which yields about 26 plots per second. Not bad for a simple change like this. The downside is that the axes are not re-scaled any longer when the data changes. Thus, they won't change their limits based on the data any more.
</p>
<p>
Profiling this yields some interesting results:
</p>
<table>
<colgroup>
<col class="org-right">
<col class="org-right">
<col class="org-right">
<col class="org-right">
<col class="org-right">
<col class="org-left">
</colgroup>
<tbody>
<tr>
<td class="org-right">ncalls</td>
<td class="org-right">tottime</td>
<td class="org-right">percall</td>
<td class="org-right">cumtime</td>
<td class="org-right">percall</td>
<td class="org-left">filename:lineno(function)</td>
</tr>
<tr>
<td class="org-right">15</td>
<td class="org-right">0.167</td>
<td class="org-right">0.011</td>
<td class="org-right">0.167</td>
<td class="org-right">0.011</td>
<td class="org-left">{built-in method sleep)</td>
</tr>
</tbody>
</table>
<p>
The one function that uses the biggest chunk of runtime is <code>sleep()</code>, of all things. Clearly, this is not what we want. Delving deeper into the profiler shows that this is indeed happening in the call do <code>pause()</code>. Then again, I <i>was</i> wondering if using <i>pause</i> really was a great idea for performance…
</p>
<p>
As it turns out, <code>pause()</code> internally calls <code>fig.canvas.draw()</code>, then <code>plt.show()</code>, then <code>fig.canvas.start_event_loop()</code>. The default implementation of <code>fig.canvas.start_event_loop()</code> then calls <code>fig.canvas.flush_events()</code>, then sleeps for the requested time. To add insult to injury, it even insists on sleeping at least one hundredth of a second, which actually explains the profiler output of 0.167 seconds of <code>sleep()</code> for 15 calls very well.
</p>
<p>
Putting this all together now yields:
</p>
<div class="org-src-container">
<pre class="src src-python">fig, ax = plt.subplots()
line, = ax.plot(np.random.randn(100))
tstart = time.time()
num_plots = 0
while time.time()-tstart < 1:
line.set_ydata(np.random.randn(100))
fig.canvas.draw()
fig.canvas.flush_events()
num_plots += 1
print(num_plots)
</pre>
</div>
<p>
which now plots about 40 frames per second. Note that the call to <code>show()</code> mentioned earlier can be omitted since the figure is already on screen. <code>flush_events()</code> just runs the Qt event loop, so there is probably nothing to optimize there.
</p>
<p>
The only thing left to optimize now is thus <code>fig.canvas.draw()</code>. What this really is doing is drawing all the artists contained in the <code>ax</code>. Those artists can be accessed using <code>ax.get_children()</code>. For a simple plot like this, the artists are:
</p>
<ul class="org-ul">
<li>the background <code>ax.patch</code></li>
<li>the line, as returned from the <code>plot()</code> function</li>
<li>the spines <code>ax.spines</code></li>
<li>the axes <code>ax.xaxis</code> and <code>ax.yaxis</code></li>
</ul>
<p>
What we can do here is to selectively draw only the parts that are actually changing. That is, at least the background and the line. To only redraw these, the code now looks like this:
</p>
<div class="org-src-container">
<pre class="src src-python">fig, ax = plt.subplots()
line, = ax.plot(np.random.randn(100))
plt.show(block=False)
tstart = time.time()
num_plots = 0
while time.time()-tstart < 5:
line.set_ydata(np.random.randn(100))
ax.draw_artist(ax.patch)
ax.draw_artist(line)
fig.canvas.update()
fig.canvas.flush_events()
num_plots += 1
print(num_plots/5)
</pre>
</div>
<p>
Note that you have to add <code>fig.canvas.update()</code> to copy the newly rendered lines to the drawing backend.
</p>
<p>
This now plots about 500 frames per second. Five hundred times per second! Frankly, this is quite amazing!
</p>
<p>
Note that since we are only redrawing the background and the line, some detail in the axes will be overwritten. To also draw the spines, use <code>for spine in ax.spines.values(): ax.draw_artist(spine)</code>. To draw the axes, use <code>ax.draw_artist(ax.xaxis)</code> and <code>ax.draw_artist(ax.yaxis)</code>. If you draw all of them, you get roughly the same performance as <code>fig.canvas.draw()</code>. The axes in particular are quite expensive.
</p>
<p>
There is also <a href="http://stackoverflow.com/a/8956211/1034">a way</a> of drawing the complete figure once and copying the complete but empty background, then reinstating that and only plotting a new line on top of it. This is equally fast as the code above without any visual artifacts, but breaks if you resize the plot.
</p>
<p>
In conclusion, I am quite impressed with the flexibility of Matplotlib. Matplotlib by default values quality over performance. But if you really need the performance at some point, it is flexible and hackable enough to let you tweak it to your hearts content. Really, an amazing piece of technology!
</p>
<p>
<b>EDIT</b>: As it turns out, <code>fig.canvas.blit(ax.bbox)</code> is a bad idea since it leaks memory like crazy. What you should use instead is <code>fig.canvas.update()</code>, which is equally fast but does not leak memory.
</p>
<div class="taglist"><a href="https://bastibe.de/tags.html">Tags</a>: <a href="https://bastibe.de/tag-python.html">python</a> </div>
<div id="comments"><script async src="https://talk.hyvor.com/embed/embed.js" type="module"></script>
<hyvor-talk-comments id="hyvorcomments" website-id="3390" page-id=""></hyvor-talk-comments>
<script type="text/javascript">
document.getElementById("hyvorcomments").setAttribute("page-id", location.pathname);
</script></div></div>
<div id="postamble" class="status"><div id="archive">
<a href="https://bastibe.de/archive.html">Other posts</a>
</div>
<center><a rel="license" href="https://creativecommons.org/licenses/by-sa/3.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/3.0/88x31.png" /></a><br /><span xmlns:dct="https://purl.org/dc/terms/" href="https://purl.org/dc/dcmitype/Text" property="dct:title" rel="dct:type">bastibe.de</span> by <a xmlns:cc="https://creativecommons.org/ns#" href="https://bastibe.de" property="cc:attributionName" rel="cc:attributionURL">Bastian Bechtold</a> is licensed under a <a rel="license" href="https://creativecommons.org/licenses/by-sa/3.0/">Creative Commons Attribution-ShareAlike 3.0 Unported License</a>.</center></div>
</body>
</html>