Training is incredibly slow compared to BrainJS #70
Comments
Are you sure that BrainJS is an LSTM-based network?
Hello nemo, can you post your test code so we can make a diagnosis?
@Pummelchen – BrainJS is just a feed-forward network. And I'm not using synaptic's LSTM capabilities. @menduz – here's the network, training is called from here with these options. Thanks for helping out!
Hey @nemo, sorry for the late response. I haven't run your code yet, but something I notice is that you are using cross entropy as the cost function, while BrainJS uses mean squared error. You could try replacing this line so it uses MSE instead.
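For reference, a minimal self-contained sketch of what that change could look like through synaptic's Trainer options (the XOR data and layer sizes here are purely illustrative, and this is not the snippet the original comment linked to):

```js
// Minimal sketch: train a small Perceptron with mean squared error
// instead of cross entropy. The data below is illustrative only.
var synaptic = require('synaptic');
var Architect = synaptic.Architect;
var Trainer = synaptic.Trainer;

// Illustrative XOR data, just to make the example self-contained.
var trainingSet = [
  { input: [0, 0], output: [0] },
  { input: [0, 1], output: [1] },
  { input: [1, 0], output: [1] },
  { input: [1, 1], output: [0] }
];

var network = new Architect.Perceptron(2, 3, 1);
var trainer = new Trainer(network);

trainer.train(trainingSet, {
  rate: 0.1,
  iterations: 20000,
  error: 0.005,
  cost: Trainer.cost.MSE // instead of Trainer.cost.CROSS_ENTROPY
});
```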
@cazala I haven't tested the performance of the optimized activate function yet. Using MSE as the cost function doesn't improve the performance by much either, unfortunately.
Is BrainJS possibly using some form of threading (web workers)? And what about synaptic's use of idle cores?
You may be interested in taking a look at this: https://github.com/arrayfire/arrayfire-js
For this to be feasible, synaptic would need to encapsulate all array handling, so that a different back-end can be provided/used for this.
@cazala: Would it be possible to get a few pointers regarding the feasibility of encapsulating array handling to make use of a library like arrayfire-js?
I am willing to take a look at this. As far as I can tell, the optimize() function may be a good starting point, because it's already converting a whole network, including all neurons, into a hard-coded function, right? https://github.com/cazala/synaptic/blob/master/src/network.js#L123
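For context, a rough illustration of the general "hard-coding" technique, with made-up weights; this is a simplified sketch, not the code that synaptic's optimize() actually generates:

```js
// Hypothetical sketch: a tiny 2-input, 1-output network "compiled" into a
// flat function, with weights and bias baked in as constants, so that no
// per-neuron object traversal happens on each activation.
function compiledActivate(i0, i1) {
  var w0 = 0.5, w1 = -0.3, bias = 0.1; // assumed values, not from synaptic
  var sum = w0 * i0 + w1 * i1 + bias;
  return 1 / (1 + Math.exp(-sum)); // logistic activation
}

console.log(compiledActivate(1, 0)); // ~0.646
```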
I'm currently working on brain.js and am very interested in multithreading; I will do more research and we can help each other out.
Explicit multithreading probably doesn't scale too well. An arrayfire-js based approach would have the advantage that parallelization takes place implicitly, so that even OpenCL is supported (think GPUs, FPGAs). Such hardware is known for speeding up vectorized workloads by a factor of up to 250x.

In conjunction with some kind of clustering module on top of arrayfire-js, a neural network library like Brain.js or synaptic would have no hard-coded limits regarding the degree of parallelization it can use, which would even mean supporting heterogeneous hardware clusters.

For that to happen, all array handling and calculations would need to be moved to a helper object that can serve as the "driver" for different back-ends. Such a "driver" could then be shared with other ANN projects like Brain.js/synaptic, and could even become its own project at some point: think of it as an API for creating ANN frameworks.
arrayfire seems like magic!
It doesn't need to be arrayfire; there are other solutions that allow code to be vectorized automatically to make use of different back-ends, no matter if that means OpenMP (SIMD) or OpenCL. Note that OpenCL would work even for GPU-less systems, as long as an OpenCL environment is installed that exposes the CPU as an OpenCL device (e.g. Intel/AMD).
So who here has the time and knowledge to implement that?
I can offer some time. @UniqueFool do you have an example of your neural net with and without arrayfire? That would be terribly helpful.
I think I posted a comment a few days ago containing a link to the arrayfire example implementing a NN: https://github.com/arrayfire/arrayfire-js/blob/master/examples/es6/machine-learning/neuralNetwork.js

Note that this would require arrayfire and arrayfire-js to be installed first of all (there are different dependencies required depending on the back-ends you want to support/use). And notice that, for the time being, this is unrelated to synaptic, which is why I asked for pointers on refactoring the existing synaptic code to encapsulate array handling and any calculations that would benefit from vectorization.

But like I said, I would not necessarily make this specific to arrayfire. It was really just meant to make the point that OpenCL scales better than OpenMP-level parallelization, because the latter can be used by the former on platforms without hardware-accelerated GPUs/FPGAs, while the opposite is not the case. Which is to say that even heavily multi-threaded code cannot automatically benefit from dedicated vectorization hardware, unless that hardware happens to be a CPU.

My suggestion would be not to actually do any coding until @cazala has left a comment, and preferably a few pointers, here. Personally, I would support something like arrayfire as an optional back-end, and for that, we would need to work out a way to rework the existing code base so that its major computational workhorses are encapsulated into helper functions/classes that can directly deal with arrays in a functional fashion (think map/reduce and filter), at which point it will be much more straightforward to map everything to a different back-end like arrayfire-js.

Obviously, it does not make sense to make such a back-end mandatory, because OpenCL/CUDA (and even C) is really only supported by Node, whereas synaptic has to run in the browser, too. And with WebCL still being experimental, it is rather challenging to make that work in a portable fashion (e.g. see this).
I should have referenced the above link, and asked for a pure JS example. Speaking naively, what I meant was: do you have a version with no arrayfire? So we could see exactly the before-arrayfire and after-arrayfire implementations, for ease of reference.
If you are referring to pure synaptic vs. synaptic using arrayfire: no, I don't have any code doing that; that was the whole point of my original comment. However, looking at the code in question, what we basically need to do to use a different vectorization back-end (like arrayfire) is locate the calculation routines in synaptic and map those to the af.* calls that can be seen in the arrayfire NN example.

There's another example in the machine-learning folder at: https://github.com/arrayfire/arrayfire-js/blob/master/examples/es6/machine-learning/ann.js

In general, the docs are pretty good actually: http://arrayfire.org/arrayfire-js/

Apart from that, I am not sure if there is a side-by-side comparison illustrating how to adopt the library; at least, I haven't seen one yet.
One of the things that makes these libraries so powerful is that they run in the browser, Node, and elsewhere. I bet if we put some thought into it, we can achieve a model that checks for the existence of arrayfire and uses it; if not, it gracefully degrades to the status quo.
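One possible shape for that graceful degradation, as a rough sketch (the backend objects and their tiny interface are hypothetical, not anything synaptic or arrayfire-js provides):

```js
// Sketch: detect arrayfire-js at startup and degrade gracefully.
// Both backends expose the same minimal interface (just a dot product here,
// purely for illustration); a real adapter would cover far more operations.
function makePlainBackend() {
  return {
    name: 'plain-js',
    dot: function (a, b) {
      var sum = 0;
      for (var i = 0; i < a.length; i++) sum += a[i] * b[i];
      return sum;
    }
  };
}

function loadBackend() {
  try {
    require('arrayfire-js'); // throws if ArrayFire/its bindings are not installed
    // A real implementation would return an adapter built on af.* calls here;
    // that adapter is out of scope for this sketch.
    return makePlainBackend();
  } catch (e) {
    return makePlainBackend(); // status quo: pure JS, also the browser path
  }
}

var backend = loadBackend();
console.log(backend.name, backend.dot([1, 2, 3], [4, 5, 6])); // plain-js 32
```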
I believe the only two parts of the code that need to be optimized are the activation and propagation methods from the Neuron class.
Thanks for getting back in touch. I posted a follow-up over at the arrayfire tracker (here), and they generally seem supportive of the idea, but mentioned that some key APIs are going to change due to async-related refactorings. I do agree that a map/reduce approach would be useful to help generalize the existing code, at which point it will be easier to adopt a different back-end like arrayfire.
How about moving the current implementation to a different object, and letting the constructor accept a callback that directly deals with the corresponding arrays in a forEach fashion? That way, we could override the default behavior with a different/hardware-accelerated back-end like arrayfire, just by providing a custom callback that implements the activation/propagation methods using whatever means are available.
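A rough sketch of that idea, using hypothetical names (this is not synaptic's actual API): the object keeps its data as plain arrays and delegates the math to an injected set of operations, which a hardware-accelerated back-end could override.

```js
// Hypothetical sketch of the injectable-callback idea: the default behavior
// is plain JavaScript, but the constructor accepts an override that could
// route the same operations to arrayfire-js, web workers, etc.
function defaultOps() {
  return {
    // Weighted sum over the inputs, in a map/reduce style.
    activate: function (weights, inputs, bias) {
      var sum = weights.reduce(function (acc, w, i) {
        return acc + w * inputs[i];
      }, bias);
      return 1 / (1 + Math.exp(-sum)); // logistic squash
    }
  };
}

function PluggableLayer(weights, bias, ops) {
  this.weights = weights;
  this.bias = bias;
  this.ops = ops || defaultOps(); // override point for other back-ends
}

PluggableLayer.prototype.activate = function (inputs) {
  return this.ops.activate(this.weights, inputs, this.bias);
};

// Default (pure JS) back-end:
var layer = new PluggableLayer([0.5, -0.3], 0.1);
console.log(layer.activate([1, 1])); // ~0.574
```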
I like it!
And that would in fact also work very well for all currently supported use cases, including the browser/non-arrayfire scenario, because the default behavior would be left "as is", whereas we could add some startup/runtime flag (or heuristics) to enable the arrayfire-based version of the activation/propagation functions.

The added benefit is that arrayfire-js optionally supports custom OpenCL kernels, which is to say that existing OpenCL kernels implementing CNNs, RNNs, etc. could be reused for a more aggressively optimized version, despite never having been written with JavaScript use in mind.

Equally, a map/reduce approach would make it much easier to make use of web workers (browser) or other parallelization schemes (think clustering) that may not even have access to OpenCL or the GPU in general.
Just for the sake of completeness: even the training/backpropagation part of the code could in theory make use of parallelization, and thus speed up network training considerably.

Here's a short and very accessible 7-page PDF illustrating the basic concept, based on doing parallel backprop on 1 GHz dual-core systems (note: no GPU use at all): http://www.neuropro.ru/mypapers/lncs3606.pdf

This is something that people experimented with already in the early 90s: http://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=1226&context=ecetr

And here's a more recent paper detailing how OpenCL can be used for parallelizing ANN training: https://bib.irb.hr/datoteka/584308.MIPRO_2011_Nenad.pdf
Just for future reference, here are two GitHub projects which apparently use OpenCL kernels for parallel backpropagation. Besides that, the arrayfire project is currently building a machine-learning library on top of arrayfire, which will include ANN support. For details, see: arrayfire/arrayfire-ml#3
Hey folks,
We have a relatively small dataset (<1000 samples) that, when we use BrainJS, trains in about 2-4 minutes.
However, with synaptic, the same task takes about 45 minutes or so. We're using a one-layer Architect.Perceptron on Node 4.2.2. Changing the learning rate / error threshold has given us some minimal speed improvements, but definitely nothing anywhere close to the BrainJS version.
Any thoughts around this? What else should we try?
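For reference, a minimal sketch of roughly this kind of setup (layer sizes, placeholder data, and option values here are illustrative guesses, not the actual code from the issue):

```js
// Illustrative sketch only: roughly how the issue invokes synaptic.
var synaptic = require('synaptic');
var network = new synaptic.Architect.Perceptron(20, 10, 1); // assumed layer sizes
var trainer = new synaptic.Trainer(network);

// Placeholder data with the same rough shape as the real set
// (<1000 samples, 20 features per sample in this sketch).
var dataset = [];
for (var i = 0; i < 1000; i++) {
  var input = [];
  for (var j = 0; j < 20; j++) input.push(Math.random());
  dataset.push({ input: input, output: [Math.random() > 0.5 ? 1 : 0] });
}

trainer.train(dataset, {
  rate: 0.2,        // tuning this gave only minor speedups
  iterations: 20000,
  error: 0.005,     // relaxing this also helped only marginally
  shuffle: true,
  log: 1000         // print progress every 1000 iterations
});
```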