
Speed improvement. #13

Closed
jeammimi opened this issue Mar 7, 2017 · 40 comments
@jeammimi
Contributor

jeammimi commented Mar 7, 2017

I am starting to use it in my work, and in this example case I am working with arrays of size 1000x2000, and it starts taking some time to execute.
Apparently a lot of time is spent on the checking and transfer of the data from python to javascript.
Any ideas on how to improve this part?
I provide the output of prun:

       67394998 function calls (56976840 primitive calls) in 41.346 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
10116086/80    9.798    0.000   35.256    0.441 jsonutil.py:97(json_clean)
 12531198    8.719    0.000   14.622    0.000 abc.py:178(__instancecheck__)
 25266778    8.120    0.000   22.742    0.000 {built-in method builtins.isinstance}
 17361221    5.904    0.000    5.904    0.000 _weakrefset.py:70(__contains__)
  7060/60    2.730    0.000   35.248    0.587 jsonutil.py:153(<listcomp>)
97928/196    1.199    0.000    2.413    0.012 pyparsing.py:943(_parseNoCache)
      160    1.143    0.007    1.144    0.007 encoder.py:203(iterencode)
        6    0.452    0.075    0.454    0.076 traitlets.py:169(repr_type)
    15410    0.383    0.000    0.383    0.000 {method 'tolist' of 'numpy.ndarray' objects}
      286    0.311    0.001    0.311    0.001 {built-in method numpy.core.multiarray.array}
107909/196    0.261    0.000    2.415    0.012 pyparsing.py:1023(_parseCache)
     7701    0.232    0.000    0.232    0.000 {pandas._window.roll_mean}
    40996    0.175    0.000    0.175    0.000 pyparsing.py:171(__init__)
      200    0.143    0.001    0.143    0.001 {method 'update' of '_hashlib.HASH' objects}
16727/1089    0.118    0.000    2.371    0.002 pyparsing.py:2345(parseImpl)
@maartenbreddels
Collaborator

I know that @SylvainCorlay has some idea about it, did you have anything written up Sylvain?

I had an idea based on the image transfer I do for the volume data. You could turn a numpy array into a png image; every pixel (rgba) would be 1 float32 value (for transferring data I think that is precise enough). On the js side you'd have to draw it on a canvas, and then read the pixels. PNG is (or can be) lossless and you'd get compression for free. But the best way would be to do a binary transfer over a websocket, but I guess that requires a separate websocket, otherwise we'd have to do base64 or base85 encoding/decoding.
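The pixel-packing part of that idea can be sketched with numpy alone (no actual PNG encoding involved; this only demonstrates that packing each float32 into one 4-byte rgba "pixel" round-trips losslessly):

```python
import numpy as np

# Sketch: each float32 becomes one rgba "pixel" of 4 bytes.
x = np.linspace(0, 1, 8).astype(np.float32)
pixels = x.view(np.uint8).reshape(-1, 4)    # N pixels, 4 channels each
back = pixels.reshape(-1).view(np.float32)  # js side: read pixels, rebuild floats
assert np.array_equal(back, x)              # lossless round trip
```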

@maartenbreddels
Collaborator

I just noticed this: bqplot/bqplot#150 as an example. I also noticed there is a native atob function, but maybe not all browsers support it, so that may need a fallback.

@jeammimi
Contributor Author

jeammimi commented Mar 7, 2017

It is the json_clean function that takes a lot of time for big arrays.
I did some tests and I get a tenfold improvement if I disable this check for numpy arrays.
However, I am not sure there is a clean way to implement it.

@maartenbreddels
Collaborator

Sounds good! We could even write numpy-to-json code ourselves, but then I'd like to see a unit test on how it handles NaN etc. Can you share your code here?

@SylvainCorlay
Contributor

SylvainCorlay commented Mar 7, 2017

@maartenbreddels we should factor out the custom serializers for numpy arrays in the traittypes project.

@maartenbreddels
Collaborator

True, but I'm happy to test out these ideas in ipyvolume for now. @jeammimi has a direct use case now, so if he figures out a good solution, we should move it from here to traittypes. It would be good to compare (custom) json vs base64 encoding to see if it's worth the effort. I'm pretty sure it is (in favor of base64).

@SylvainCorlay
Contributor

Comms and widgets can handle binary serialization...

@maartenbreddels
Collaborator

Really? And can you get an ArrayBuffer with a handle to the data? Any code examples, or hints on how to do that? That would mean zero overhead, right?

@SylvainCorlay
Contributor

That is right. It is documented in comms and handled when there are memory views in the widgets.

I really think that it is worth the effort to do this at a common level for visualization widget libraries. This is the reason why I moved the traittypes project out of bqplot etc...

@jeammimi
Contributor Author

jeammimi commented Mar 7, 2017

So it is a little bit ugly, because I didn't manage to wrap the recursive function json_clean, and so I redefined it to add only one case at the end:
numpy transfer

@maartenbreddels
Collaborator

@jeammimi could you instead try something along these lines:

import base64
....
a = np.array(value)
data = base64.encodebytes(a.tobytes())
# return data for the js side

And on the js side you'd have to do a custom deserializer (check volume.js for 'serializers'), and define one for 'x', 'y' etc.
What @SylvainCorlay suggests is gonna be another 10x faster, but it will be a bit more work.
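A minimal round-trip sketch of this base64 approach (the function names are illustrative, not ipyvolume's actual API, and float32 is an assumption):

```python
import base64
import numpy as np

def array_to_base64(value):
    # python -> js: flatten to float32 bytes, then base64-encode
    a = np.asarray(value, dtype=np.float32)
    return base64.encodebytes(a.tobytes()).decode('ascii')

def base64_to_array(data):
    # shown only to verify the round trip; on the real js side this
    # would be atob plus a Float32Array view
    return np.frombuffer(base64.decodebytes(data.encode('ascii')),
                         dtype=np.float32)

x = np.linspace(0, 1, 5, dtype=np.float32)
assert np.array_equal(base64_to_array(array_to_base64(x)), x)
```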

@maartenbreddels
Collaborator

@SylvainCorlay ok, I saw this here: https://github.com/jupyter/notebook/blob/master/notebook/tests/test_serialize.py
But how do you use it with a traitlet, I mean, you cannot use to_json/from_json.

@SylvainCorlay
Contributor

Yes, if to_json/from_json returns a memoryview at the top level, it is passed as a buffer in the zmq and websocket message.

@SylvainCorlay
Contributor

please submit it to traittypes! This would help everyone. @vidartf was also interested in binary messaging for numpy arrays.

@maartenbreddels
Collaborator

I see, it's quite simple! The only thing is that the deserialize on the js side seems to be skipped.

@jeammimi
Contributor Author

jeammimi commented Mar 7, 2017

I tried with base64.
It seems faster with arrays of size 100,000, but I don't know why, with arrays of size 1,000,000, it crashed my browser twice.

@maartenbreddels
Collaborator

Ok, deserializers are getting used.

So, I got something working... just to let you know, on the python side:

def array_to_binary(ar, obj=None):
    if ar is not None:
        ar = ar.astype(np.float64)
        mv = memoryview(ar)
        return mv
    else:
        return None

on the js side:

function binary_array(data, manager) {
    console.log("binary array")
    console.log(data)
    if(data) {
        window.ardata = data
        var ar = new Float64Array(data.buffer)
        window.far = ar
        return ar
    }
    return
}

var ScatterModel = widgets.WidgetModel.extend({
    defaults: function() {
        return _.extend(widgets.WidgetModel.prototype.defaults(), {
.....
        })
    }}, {
    serializers: _.extend({
        x: { deserialize: binary_array },
        y: { deserialize: binary_array },
        z: { deserialize: binary_array },
    }, widgets.WidgetModel.serializers)
});

I thought I could return json with memoryviews in it, but only a bare memoryview is accepted; json_clean doesn't seem to accept that.
@SylvainCorlay : any way to work around that? I need to send the type and shape along.
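The zero-copy idea above can be mimicked entirely in Python: `np.frombuffer` plays the role of `new Float64Array(data.buffer)` on the js side (helper names are illustrative, not the widget API):

```python
import numpy as np

def array_to_binary(ar):
    # cast to float64 and expose the raw bytes as a memoryview,
    # which the widget machinery would send as a binary buffer
    if ar is None:
        return None
    return memoryview(np.ascontiguousarray(ar, dtype=np.float64))

def binary_to_array(mv):
    # python stand-in for `new Float64Array(data.buffer)`:
    # reinterpret the received bytes without copying
    return np.frombuffer(mv, dtype=np.float64)

x = np.array([1.0, 2.5, -3.0])
assert np.array_equal(binary_to_array(array_to_binary(x)), x)
```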

@maartenbreddels
Collaborator

maartenbreddels commented Mar 7, 2017

I thought a bit about it, and I think the issue is Widget._split_state_buffers. It needs to go through dicts and lists (similar to _widget_to_json) to see if they contain any _binary_types and .pop those, so the rest can be json'ed. Then buffer_keys can also be a list of paths, say ["x", "data"], so that on the js side, instead of simply assigning to state[buffer_keys[i]], it should do it nested; in the above example it would effectively do state[buffer_keys[i][0]][buffer_keys[i][1]] = buffers[i]. Lists would then also naturally be supported.
This would make things really flexible and fast.
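A rough sketch of that nested splitting (this is not ipywidgets' actual implementation; the names just follow the description above):

```python
# Walk the state, pop out anything binary, and remember the path
# so the js side can put the buffers back in the right place.
_binary_types = (memoryview, bytearray, bytes)

def split_state_buffers(state, path=()):
    paths, buffers = [], []
    if isinstance(state, dict):
        for key in list(state):
            if isinstance(state[key], _binary_types):
                paths.append(list(path) + [key])
                buffers.append(state.pop(key))
            else:
                p, b = split_state_buffers(state[key], path + (key,))
                paths.extend(p); buffers.extend(b)
    elif isinstance(state, list):
        for i, value in enumerate(state):
            if isinstance(value, _binary_types):
                paths.append(list(path) + [i])
                buffers.append(value)
                state[i] = None  # placeholder so the json stays aligned
            else:
                p, b = split_state_buffers(value, path + (i,))
                paths.extend(p); buffers.extend(b)
    return paths, buffers

state = {"x": {"data": memoryview(b"\x00\x01")}, "label": "scatter"}
paths, buffers = split_state_buffers(state)
assert paths == [["x", "data"]]
assert state == {"x": {}, "label": "scatter"}  # json-safe remainder
```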

@SylvainCorlay should I take this discussion to ipywidgets? Or do you think a PR like this can be merged without issues? (The only thing is getting the dev version of ipywidgets to install on my computer.)

@jeammimi what do you think in the meantime, base64, json, custom json, or some hack to get the binary transfer working?

@jeammimi
Contributor Author

jeammimi commented Mar 8, 2017

The base64 transfer seems to be quite fast, but converting the array to base64 takes some time, and since for now I don't decode it on the javascript side, I only see an improvement of a factor of 3.
So to me the easiest is the small modification of json_clean that I did in
numpy transfer,
but I don't know how to incorporate it cleanly.

@maartenbreddels
Collaborator

I think that's gonna be a non-friendly solution; you'd have to monkey patch it, and I think that's a no-go.

What I propose is we do it 2 ways: the (default) json method, and the binary transfer. The latter requires a patch to ipywidgets (I'm working on it now), but this is gonna be the best route anyway. This however means that you'd have to install ipywidgets from git (that's not always easy). Therefore I think we should for the moment default to json (slow), and by setting say ipyvolume.serialize.binary = True, it will transfer in binary. So if you want max performance, you need to get ipyvolume from git, or wait for the next release; otherwise it falls back to json.

@jeammimi
Contributor Author

jeammimi commented Mar 8, 2017

Well, indeed my hack is ugly (but I use it right now because it is ten times faster).
To be frank, I don't really understand what you are proposing with the binary transfer.
Regarding the base64 option, what would it take to decode it on the javascript side?

@maartenbreddels
Collaborator

About the base64: the atob function turns it into a binary string. Then you'd have to manually convert every (say 4) bytes to a float.

About the binary transfer: with this PR jupyter-widgets/ipywidgets#1194
it is possible to transfer the 'raw' data from the numpy array to the js side. There you get a Float32Array, with no memory copies (except when going over the websocket).

I've implemented the basics in d008749, but it (probably) does not work with animations yet. So maybe you can take a look at that.
So to get this working, you need jupyter-widgets/ipywidgets#1194 (or https://github.com/maartenbreddels/ipywidgets/tree/nested-buffers ) (see the installation instructions, or simply run sh dev-install.sh and hope it runs).
Then set ipyvolume.serialize.binary = True, and the data will be transferred in binary. One caveat is that the data is always a 1-dim array, so to detect the sequences we need to inspect the shape. I hope I have some time to work on it this week, otherwise I'm happy to see a PR. I wasn't that impressed by the speedup on my laptop, but maybe it's overhead.
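The "manually convert every 4 bytes to a float" step, sketched here in Python with struct (the js equivalent would use atob plus a DataView or Float32Array):

```python
import base64
import struct

def decode_base64_float32(data):
    # atob-equivalent: base64 -> raw bytes, then every 4 bytes is
    # reinterpreted as one little-endian float32
    raw = base64.b64decode(data)
    return list(struct.unpack('<%df' % (len(raw) // 4), raw))

encoded = base64.b64encode(struct.pack('<3f', 1.0, 2.0, 3.0)).decode('ascii')
assert decode_base64_float32(encoded) == [1.0, 2.0, 3.0]
```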

@maartenbreddels
Collaborator

Just to let you know, there exist these libraries:

They would make life easier on the js side.

@jeammimi
Contributor Author

jeammimi commented Mar 9, 2017

Haha, you have perfect timing.
I got it working on this example with bytes from numpy, and I was going to ask you how to convert it to a 2D array:
numpy transfer
Right now for a 1,000,000-element array it is an 800x improvement!
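A sketch of the .npy-style transfer: np.save writes a small header containing the dtype and shape, which is how the 2D structure can survive the trip (function names are illustrative, not the code in the linked branch):

```python
import io
import numpy as np

def array_to_npy_bytes(ar):
    # python side: serialize the array, header included, to raw bytes
    f = io.BytesIO()
    np.save(f, ar)
    return f.getvalue()

def npy_bytes_to_array(data):
    # shown for the round trip; the js side parses the same header
    return np.load(io.BytesIO(data))

x = np.arange(12, dtype=np.float32).reshape(3, 4)
y = npy_bytes_to_array(array_to_npy_bytes(x))
assert y.shape == (3, 4)
assert np.array_equal(y, x)
```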

@jeammimi
Contributor Author

jeammimi commented Mar 9, 2017

I took some of the code from here:
https://gist.github.com/nvictus/88b3b5bfe587d32ac1ab519fd0009607
not sure about the license

@maartenbreddels
Collaborator

Really? That sounds good! Ok, I gave it a little thought.
My preference goes to ndarray: it's lightweight, everything is split up in packages, so it won't bloat the js file further. Just look at how d3, for instance, is added (add to package.json and run npm install).
We could have, for most of the arrays (x, y, z, vx, vy, vz and size, maybe size_selected), a custom serializer and deserializer that converts it to an ndarray. Color needs special treatment, to be able to send it back to the python side, so I'd focus on everything but color for now (if you can't wait, I'd look at ndarray-fill).
See https://gist.github.com/maartenbreddels/ac41cdcd22c8f84c9fc639dea7c90848 and jupyter-widgets/ipywidgets#1194 for updates on the current state.

@maartenbreddels
Collaborator

This is really looking promising, also for situations where you cannot do the whole animation in the browser but need to feed the widget with data every X msec. If this is fast enough, we could maybe do sth like 10^4-5 particles, animated.

@jeammimi
Contributor Author

jeammimi commented Mar 9, 2017

I am not sure we need either of these libraries.
Apparently in both you need to access the values with the get method, which I implemented:
numpy transfer
I wanted to implement the bracket operator, but apparently that is not possible.
What do you think?

@SylvainCorlay
Contributor

@jeammimi nice experiments! This is really promising.

I like that you are following the npy format spec.

@SylvainCorlay
Contributor

In terms of front-end package for the implementation, ndarray looks promising though.

@maartenbreddels
Collaborator

I would really go for this package: it's lightweight (the source is small: https://github.com/scijs/ndarray/blob/master/ndarray.js ) and they care about performance. They tested out all these things for performance, and it opens up many options (fft, image manipulation, you name it: http://scijs.net/packages/ ).
Furthermore, whatever we do here will go into a more general package in the future; having this mature, well tested library is I think better for general purposes.

The code also now has a license: ndarray is MIT, so we're safe there. What do you think?

@maartenbreddels
Collaborator

It also seems that we can do (http://scijs.net/packages/#scijs/ndarray):

sizes = ndarray(data, shape=[10, 100]) // sequence of 10, with 100 sizes
var sizes_attr = new THREE.InstancedBufferAttribute(sizes.pick(sequence_index, null) , 1, 1);

No loops needed, just plain memcpy's!

@jeammimi
Contributor Author

jeammimi commented Mar 9, 2017

I am really not good with javascript basics:
how do you import this module? (I installed it with npm, but I don't succeed in loading it.)
I tried

define('hello', ["jupyter-js-widgets", "ndarray"], function(widgets, ndarray) {

});

but it does not work

@maartenbreddels
Collaborator

Good question. What I did: npm install ndarray in the js folder, and put ndarray = require("ndarray") somewhere at the top of volume.js. Reload the notebook; now if you do

%%javascript
require(["ipyvolume"], function(ipyvolume) {
    console.log("ipyvolume loaded")
})

you already get ndarray in the global namespace, as I saw from the dev console (actually, that shouldn't happen). If that does not happen, put window.ndarray = ndarray in volume.js to make it global (this is evil, but for dev or debugging it is bliss).

@jeammimi
Contributor Author

jeammimi commented Mar 9, 2017

Is there a way to avoid having to convert the array after every call of
this.model.get("x")? Because if x does not change, there is no need to convert it every time.

@maartenbreddels
Collaborator

Yes, see the (de)serialize here.
The idea is that we have two functions, (de)serialize_ndarray, that will eventually go into a separate jupyter js package, so that other projects can use them.
Returning the ndarray in this function will cause model.get('x') to return an ndarray, as 'expected'. Numpy in, ndarray out, no hassle anymore.
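The effect described here, converting once when the raw buffer arrives and returning the same object on every later get, can be sketched like this (the class and method names are made up for illustration, not the widget API):

```python
import numpy as np

class CachingDeserializer:
    """Illustrative only: convert raw bytes to an array once, then reuse."""
    def __init__(self):
        self._raw = {}
        self._cache = {}

    def set_raw(self, key, raw):
        # new data invalidates the cached conversion
        self._raw[key] = raw
        self._cache.pop(key, None)

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = np.frombuffer(self._raw[key], dtype=np.float64)
        return self._cache[key]

d = CachingDeserializer()
d.set_raw('x', np.array([1.0, 2.0]).tobytes())
a = d.get('x')
assert a is d.get('x')  # no re-conversion on repeated get
assert np.array_equal(a, [1.0, 2.0])
```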

@jeammimi
Contributor Author

Hi,
I think I am too bad with javascript to implement it.
I tried for an hour this morning to implement your doubler on the numpy transfer, and I have no clue why it is not working...
So I don't know if I can be of any help to implement it.
Regarding ndarray, it seems nice, but don't you think it will be easier for people to adopt if we return regular one-dimensional or two-dimensional arrays?

@maartenbreddels
Collaborator

Any JS error to share? Maybe take this here: https://gitter.im/maartenbreddels/ipyvolume

@maartenbreddels
Collaborator

Ok, I now have all three methods implemented, see 88c3ad0. Set ipyvolume.serialize.performance to 0 (default), 1 or 2:

  • ipyvolume.serialize.performance = 0: json for encode and decode
  • ipyvolume.serialize.performance = 1: np.save to a StringIO/BytesIO (credits to @jeammimi)
  • ipyvolume.serialize.performance = 2: uses a list of memoryviews to numpy arrays; this needs "Any descendant of the state can be a binary_type" jupyter-widgets/ipywidgets#1194. It is more flexible, since it allows a list of numpy arrays, so the # of glyphs can change per sequence-element with this.

Color isn't addressed by this though; when performance >= 0, it should be possible to transfer it the same way if it's not a list of strings. PR welcome 😉
Let me know about the performance differences.

Btw, I've added examples/test.ipynb for some basic testing. Don't execute the cells directly after each other, give it a few milliseconds; savefig isn't really working that well yet. The last part in this notebook is kind of a performance test/demo. I saw quite a speedup going from performance=0 to 1.

@maartenbreddels
Collaborator

I consider this closed now; further improvements can go in a separate issue.
