Improve performance of determinant #908
Comments
Determinants can result in big equations. Is the slowness due to (a) the creation of the equations, or (b) the evaluation of the created equation? If (b), then my Equations module might help, since it only evaluates sub-expressions once. |
There aren't really any equations in play here, it's just numeric number-crunching on a matrix (see det.js). |
Oh. I see. It's numeric, not symbolic. I had a hunch that a symbolic equation representing the determinant could be optimized. However, getting that symbolic equation would take some doing. 😶 |
I agree that mathjs operator polymorphism (e.g., add() vs '+') may be the magnitude killer here. It's the old "interpreted vs. compiled" order of magnitude. If this hypothesis holds, then we might want to consider a sister package, "mathjs-double". Such a package would be tuned for performance but retain most of the mathjs API goodness. An alternative solution would be to go the C++ node extension route, which is a lot of work 😩 |
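A quick way to see the dispatch cost being described is a micro-benchmark along these lines (just a sketch; the loop count is arbitrary and the timings will vary by machine and engine):

// Rough sketch: compare mathjs' polymorphic add() against the native + operator.
// This only illustrates the order-of-magnitude gap discussed above.
const math = require('mathjs');

const N = 1e7;
let sum = 0;

console.time('math.add');
for (let i = 0; i < N; i++) {
  sum = math.add(sum, 1);   // typed-function dispatch on every call
}
console.timeEnd('math.add');

sum = 0;
console.time('native +');
for (let i = 0; i < N; i++) {
  sum = sum + 1;            // plain number arithmetic
}
console.timeEnd('native +');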
(deleted comment. sorry wrong thread) |
For some Matrix operations there is already an optimization in place that checks whether the Matrix does not contain mixed types (i.e. it holds just numbers or just BigNumbers), and in that case doesn't use the generic mixed-type implementation. But at this point I don't know if that's actually the cause, or whether the current algorithm isn't efficient enough and we should look for a different algorithm. We need to figure that out first. |
As a thought exercise, if we expressed any particular determinant as an expanded equation (i.e., wrote out the det()) and compiled that expression to JavaScript, it would be faster because we would have eliminated all the function calls and conditionals. Indeed, we know this works because other JavaScript math packages are faster. An additional performance boost for matrices of size N>3 would be to memoize the sub-products that recur. Such memoization would precede the compilation into JavaScript.

The above exercise leads one to consider an architectural approach to performance overall that would rely on the extensive use of compiled expression trees with fully expanded function nodes. Such an approach could yield blazingly fast evals of all compiled expressions. Some functions like sin() would still be atomic, but others like det() and derivative() would benefit. In other words, my hypothesis is that the symbolic processing of math expressions is the key to performance. Symbolic processing of math expressions allows us to maintain polymorphism via compilation rather than having it burden each evaluation. In this manner, we would compile det([[a,b], [c,d]]) as a*d - b*c. |
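As an illustration of the idea (hypothetical code, not anything mathjs generates today), a fully expanded, "compiled" determinant for the 2x2 and 3x3 cases could look like this:

// Hypothetical sketch: the cofactor expansion written out as plain JavaScript.
// All polymorphic calls (add, multiply, ...) are gone; only number ops remain.
function det2(m) {
  // det([[a, b], [c, d]]) = a*d - b*c
  return m[0][0] * m[1][1] - m[0][1] * m[1][0];
}

function det3(m) {
  const [a, b, c] = m[0];
  const [d, e, f] = m[1];
  const [g, h, i] = m[2];
  // expansion along the first row; for N > 3 the smaller minors start to recur,
  // which is where memoizing sub-products would pay off
  return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
}

console.log(det2([[1, 2], [3, 4]]));                    // -2
console.log(det3([[2, 0, 1], [1, 3, 2], [1, 1, 4]]));   // 18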
Yes, so we can optimize by analyzing the data types beforehand, and optimize evaluation for that. Note that math.js already compiles expressions and typed functions dynamically; that's why the expression parser is very fast once compiled, which allows you to evaluate an expression against a scope very fast. Also, for example, the function …

I don't think we necessarily need to compile dynamic JavaScript, but we need switches which check for mixed-type vs. single-type contents of matrices, and then use either the generic or the type-specific implementation of an operation. Something like:
// pseudo code for adding two matrices:
var A = math.matrix([...]);
var B = math.matrix([...]);
var operation = math.add;

// we need a new method which returns the different types of the values in the matrix
// for example return ['number'] when single type, or ['number', 'BigNumber'] in case
// of mixed types
var typesA = A.contentTypes();
var typesB = B.contentTypes();

if (typesA.length === 1 && typesB.length === 1 && typesA[0] === typesB[0]) {
  // both matrices are single type, and have the same type (for example 'number')
  // -> find the type specific implementation of operation math.add for numbers:
  var signature = typesA[0] + ',' + typesB[0]; // 'number,number'
  var addSingleType = operation.signatures[signature];

  if (addSingleType) {
    // evaluate with the fast, single type implementation of math.add
    return algorithm13(A, B, addSingleType);
  }
}

// evaluate with the generic, multi type implementation of math.add
// (algorithm13 is one of a set of helper functions which applies a
// callback function on two matrices element wise)
return algorithm13(A, B, operation); |
That would be faster, yes. 😀 |
I tried replacing the polymorphic arithmetic in det with plain number arithmetic. Virtually all of the speed up was due to refactoring det itself.

It turns out you can use the LU decomposition to compute the determinant, which is much faster. But the LU decomposition can be numerically unstable, so the computed determinant starts to differ significantly between the two methods once you get larger than about 50x50. Implementing a native version of …

(this is where you nod somberly and ignore me) |
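For reference, a minimal sketch of the LU route described above, built on mathjs' existing math.lup (which returns L, U and a row permutation p with A[p,:] = L * U). This is only an illustration, not the library's det implementation, and it assumes a square, dense Matrix of plain numbers:

const math = require('mathjs');

function detViaLU(A) {
  const { U, p } = math.lup(A);
  const perm = Array.isArray(p) ? p.slice() : p.toArray();
  const n = perm.length;

  // L has a unit diagonal, so det(A) = sign(p) * product of the diagonal of U
  let det = 1;
  for (let i = 0; i < n; i++) {
    det *= U.get([i, i]);
  }

  // sign(p): count the swaps needed to sort the row permutation
  let sign = 1;
  for (let i = 0; i < n; i++) {
    while (perm[i] !== i) {
      const j = perm[i];
      [perm[i], perm[j]] = [perm[j], perm[i]];
      sign = -sign;
    }
  }

  return sign * det;
}

console.log(detViaLU(math.matrix([[1, 2], [3, 4]])));   // -2 (up to rounding)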
ha ha, of course not. I like being able to think aloud and have a sort of brainstorm, which can result in great ideas :)

@ericman314 thanks for doing these experiments. So optimizing for matrices with a single data type can yield an improvement on the order of a factor of 3 or 4, and a much bigger gain could come from replacing the algorithm with LU decomposition, for example. But unstable results for larger matrices are an issue of course :( |
Anyone doing operations on matrices that big should 1) already understand the numerical issues involved, and 2) probably be using a dedicated and optimized linear algebra library.

Maybe we should also be looking at benchmarks for calculating lots of small determinants as well as one big one?
|
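A benchmark for the "lots of small determinants" case could be as simple as the sketch below (not an existing test in the benchmark folder; sizes and counts are arbitrary):

const math = require('mathjs');

// build a random n x n matrix of plain numbers
function randomMatrix(n) {
  return math.matrix(
    Array.from({ length: n }, () =>
      Array.from({ length: n }, () => Math.random())));
}

const small = Array.from({ length: 10000 }, () => randomMatrix(3));
console.time('10000 x det(3x3)');
for (const m of small) {
  math.det(m);
}
console.timeEnd('10000 x det(3x3)');

const big = randomMatrix(100);
console.time('1 x det(100x100)');
math.det(big);
console.timeEnd('1 x det(100x100)');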
So you mean to say that it's acceptable to have a solution that becomes unstable for large matrices? That can be a pragmatic (and very fast) approach. I would love to somehow give a warning or error then. Or automatically fall back to a slow but precise and stable algorithm for large matrices. What do you think?
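The fallback idea could be as simple as a size check in front of the two algorithms; a rough sketch (the 50x50 threshold comes from the instability observed above, and detViaLU refers to the earlier LU sketch, so both are assumptions rather than mathjs code):

const math = require('mathjs');
const SIZE_THRESHOLD = 50;  // where the two methods started to diverge above

function detWithFallback(A) {
  const rows = math.size(A).valueOf()[0];
  if (rows <= SIZE_THRESHOLD) {
    return detViaLU(A);  // fast LU-based path (see the earlier sketch)
  }
  console.warn('det: matrix larger than ' + SIZE_THRESHOLD + 'x' + SIZE_THRESHOLD +
               ', falling back to the slower but stable algorithm');
  return math.det(A);    // current implementation
}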
Yes, that will be best. Calculating it for a 2x2 or 3x3 matrix, for example, can/should be extremely fast :) Feel free to extend the benchmark folder with new tests where needed. |
Are we still interested in this? I stashed my determinant testing code somewhere and now I can't find it. I can recreate it if we still want to use lup decomposition for calculating the determinant. |
Yes I think so! The current performance of det is still really poor compared to other libraries. |
I never did find that code I wrote, but I can do it again. Code always comes out better the second time around, anyway. About falling back to the slow algorithm, I wouldn't recommend that. For matrices of the size where that would matter, the slow algorithm would be much too slow. I will do some research and see if there are any relaxation methods that can improve the accuracy of the LU decomp before evaluating the determinant. |
Well, I don't know if my code was any better the second time around, but it is still faster than the previous implementation. Here are the results of the benchmark before and after the changes in PR #1118 (it doesn't contain any relaxation methods, though).
|
Wow!!! That's huge! Thanks a lot. I would say the new performance is acceptable; it's now on par with other matrix operations like multiplication. Shall we close this issue, or do you want to keep it open for another improvement round? |
Computing determinants isn't very important in linear algebra, but computing the LU decomposition is. If someone would like to profile that function and look for some optimizations I think it could be worth the time. |
That makes sense. I will close this issue now; the improvements are now released. |
@ericman314 I just discovered something awesome (triggered by #1154): the performance of matrix operations is only optimized for matrices containing a single datatype (like numbers) when you explicitly pass the datatype. I wrongly assumed there was code in place to automatically detect the datatype of matrix contents. So:

const A1 = math.zeros(100, 100)
const A2 = math.matrix(A1, 'dense', 'number')

console.time('1');
const R1 = math.multiply(A1, A1);
console.timeEnd('1'); // 147 ms on Firefox, 86 ms on Chrome

console.time('2');
const R2 = math.multiply(A2, A2);
console.timeEnd('2'); // 11 ms on Firefox, 30 ms on Chrome

This means that mathjs is / can be much faster than we described in our paper: at worst only 2 times slower than the fastest JS libraries instead of 10 times (!). I've run the benchmarks again and added test cases for generic (mixed) vs number Matrices:
I came across a bug when trying to add two matrices having a datatype defined; fixed that in the develop branch, see b44ce14 (you have to run …). From this I see two interesting action points:
|
Cool! And you're right, |
Hmm, I'm actually not sure now about … And I found the place where matrices are constructed, but I'm afraid I'll mess something up if I try to figure out how to automatically guess the dataType when it's not supplied. |
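A rough idea of what that guessing could look like, purely as a sketch (contentType is a hypothetical helper, not an existing mathjs API): scan the values with math.typeOf and only report a datatype when every entry has the same one.

const math = require('mathjs');

// Hypothetical helper: detect whether all entries of nested-array data share a
// single datatype ('number', 'BigNumber', ...), returning 'mixed' otherwise.
function contentType(data) {
  let type = null;
  const scan = (value) => {
    if (Array.isArray(value)) {
      value.forEach(scan);
    } else {
      const t = math.typeOf(value);
      if (type === null) type = t;
      else if (type !== t) type = 'mixed';
    }
  };
  scan(data);
  return type;
}

console.log(contentType([[1, 2], [3, 4]]));                   // 'number'
console.log(contentType([[1, 2], [math.bignumber(3), 4]]));   // 'mixed'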
Thanks for checking it out. Looks like we have to dig deeper then to find out why … A different solution would indeed be to have a separate implementation for the different numeric types, but I'm not sure if we should go there right now. In #1154, @JasonShin is working on a new function. |
There is an amazing algorithm for it which takes performance to O(n^2.372). Meanwhile I'll take a look at the current implementation. There probably is some bottleneck or unnecessary condition which breaks something. |
Yes it could very well be an implementation issue and not the algorithm itself. |
@josdejong I have read the code (nothing caught my attention, though) and tested several different algorithms for determinants, even the ones from numericjs which were supposed to be the fastest. They are, but only for normal numbers (I got ~7 times increased speed). Though when I try using BigNumbers, performance drops so drastically that your algorithm is faster (400000 ops/s for normal numbers, 60000 ops/s, 130 ops/s for BigNumbers). Sadly I have to say that I won't help with this issue. It is not my field of expertise and honestly I have no other idea how to touch it :/ |
Thanks for having a look Bartosz. Maybe we should close this issue; a big step was already made by Eric a long time ago. I've created a separate issue for the idea I explained in #1154 (comment); it's separate from improving the determinant. |
This issue was moved to a discussion.
You can continue the conversation there.
When running some simple benchmarks of mathjs, you see that math.js is slower than other alternatives (up to 1 order of magnitude). This is probably caused by math.js dealing with mixed types rather than just numbers, and there is room for improvement there, but that's for another topic.
There is one extreme outlier above: calculating the determinant det(A), which is really hundreds of times slower than other math libraries. Who is interested in diving into this performance issue and trying to improve the performance of the determinant?