-
-
Notifications
You must be signed in to change notification settings - Fork 528
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
random_matrix() with prescribed density buggy #3436
Comments
comment:2
I'll also point out that the documentation for random_matrix says that density should be an integer, not a float -- see also randomize in matrix2.pyx and random_element in matrix_space.py -- while in matrix_integer_dense.pyx, randomize says that density should be a float. |
comment:3
Some more examples with sage 4.1. The last few examples show that the density is about an order of magnitude smaller than what we asked for.
|
comment:4
Replying to @jasongrout:
When I tried this, I got answers like the first two: the ratio of (fraction of nonzero entries) / (density) was about 4/5. This is consistent with the observation in the ticket description that the randomize function doesn't check whether the random element is zero: for integers, the documentation for I would try to write a patch for this but pretty much all of the matrix code is written in Cython, which I don't know, so anything I write would likely be buggy and slow things down by a factor of 100. |
comment:5
Attached a patch. It introduces a new method in On the matrix end, I've introduced a new parameter Finally, this is now also used to ensure that generating a fraction field element does not cause a division by zero error. Sebastian |
Author: spancratz |
Attachment: trac3436_a.patch.gz First patch |
comment:6
I've attached the patch now. I've called it "First patch" since I've only doctested Sebastian |
Attachment: trac3436_b.patch.gz Additional patch (more doctests) |
comment:7
The additional patch takes care of the remaining three doctest failures. This should now pass all doctests done with Sebastian |
comment:8
This looks good with one notable exception. Due to implementation details, Matrix_cyclo_dense.randomize() makes overly dense matrices. This should be pretty easy to fix. |
comment:9
I mentioned this to Tom, but just in case no one gets to it: implementing this for cyclotomic matrices should be pretty easy once you find out about the |
comment:10
Craig: the approach you suggest does not seem to work out quite as nicely, since the Sebastian |
comment:11
I'm a little confused -- why aren't we just using |
comment:12
I think that is the same reason why certain matrices have their Personally, I don't really care about this (the speed of generating random matrices), but since others do, I don't think this patch should make anything significantly slower. |
comment:13
I care because this gets used a LOT in benchmarking and automated testing of linear algebra algorithms. |
comment:14
all that has to happen here is
|
comment:15
Tom, before you wrote your message, I already wrote the code for the randomize method, by including a _randomize_column method in matrix_rational_dense, and this is then used in the cyclotomic matrix code. This way, the code doesn't have to modify any more classes than it does already. I'll do some testing and upload it later. Also, I talked to Robert Miller and he said this patch needs to be re-based to 4.3.1.rc0, so I'll do that, too. Sebastian |
comment:16
I view the fact that the random_element methods don't pass *args and **kwargs down the line as a bug -- and if you have to rebase, we might as well fix those up too. Moreover, I'm not convinced that randomizing columns is the right way to go -- though I haven't seen your implementation, so I might be wrong. |
comment:17
I've now finished the implementation, but I still need to re-base it. Here are some timings. First, the old implementation:
Second, the new implementation:
|
Third patch, for matrices over cyclotomic fields |
Attachment: trac3436_c.patch.gz Attachment: trac3436_rb.patch.gz Rebase of the first three patches to 4.3.1.rc0 |
comment:18
I've now re-based the patch. I still need to run tests, I'll write back later to confirm if everything passes. Tom, which random element methods are you referring to? Sebastian |
comment:19
The re-base makes lots of doctests fail, I'll update the patch and post another one when I am done. Sebastian |
Fixes the doctests for the rebase |
Attachment: trac3436_rb_doctests.patch.gz random_element method in polynomial quotient rings (and number fields) passes on args and kwds |
comment:20
Attachment: trac3436_rb_poly.patch.gz Tom suggested that we should change the code for random matrix generation over cyclotomic fields to rely on the random element generation code in the cyclotomic field, mostly in order to reduce code size. The patch above now ensures that the polynomial quotient rings (and number fields) pass on additional arguments. This patch should be applied in addition to all the other ones, in any case. The next step is to compare the timings of the above code to Tom's suggestion. If there is no noticeable speed difference, we should go with Tom's suggestion as it provides for cleaner code. Otherwise, we should leave it done in the above patches. We'll see. |
Attachment: trac3436-tb.patch.gz |
Addendum to the last patch on polynomial quotients |
comment:21
Attachment: trac3436_rb_poly2.2.patch.gz Here are the timings for the code that Tom wrote, which gets rid of the "ugly" code I wrote for generating the random matrices over cyclotomic fields and instead uses the generic code of random elements in cyclotomic fields (implemented by two of the patches above):
I suspect that the slowness of this code is largely due to the fact that for every random element of the cyclotomic field, two conversions take place. One of a random list of rationals into the polynomial ring, and then another of that polynomial into the polynomial quotient ring. I don't think there is much we can do about this at the moment. Thus, I propose that we should apply all of the patches on this ticket apart from Any comments? PS: Of course, I'll include the fixed doctests (64-bit difference) from Tom's patch. |
comment:22
I'll eat crow here -- spancratz's implementation is so much faster, we'd be dumb not to use it. I'll test a clean build once you post the new patches. |
Main patch |
Attachment: trac3436_main.patch.gz Attachment: trac3436_poly.patch.gz Polynomial part |
comment:23
I've now uploaded all the work we did in two patches, one for the matrix generation (and all the additional code all over the place necessary) as I am running tests now. I've got the feeling that it might fail doctests in two files on the remote machine at Oxford but that it'll pass those on my local machine. But we'll see. I'll report back once that's done. Sebastian |
comment:24
So I'm tempted to (re-)propose a fix in the other direction. As Tom noted, the problem with the "smaller" bit of code is that generating random elements in a cyclotomic field is insanely slow. Rather than program around that, why not fix it? I'm going to sit down and look at this now ... |
comment:25
Attachment: trac3436_whitespace.patch.gz I attached a minor patch that fixes two whitespace problems in the doctests that caused failures. Current patch list: trac3436_main.patch |
comment:26
Okay, so I'm finally convinced. I just spent some time optimizing generation of random number field elements, and got something like a ~150X speedup -- and the resulting randomization for cyclotomic matrices was still something like six times slower than the original version. So I'm going to make a new ticket with that code, but I think the code that Sebastian's been writing is definitely the winner. (Tom said he was going to review it in the morning, so I'll let him do that.) The underlying reason for the speed difference is easy to understand -- we represent elements of number fields as NTL polynomials, so the process of randomly generating the matrix entries involves a huge number of copies of the data: once from sage Integers to NTL integers in generating a random element of the number field, and then two copies (due to the way things are getting moved around) in moving the values from the NTL polynomial into the matrix entries. There's just no way this is going to beat code that just generates entries down the line right in the matrix. I'll add a note on this ticket once I clean up and post the faster number field random element code ... |
comment:27
Looks good / all tests pass. |
comment:28
For the sake of it, I thought I'd mention that I posted my random number field element code at #8007. If we one day move away from NTL for representing elements of polynomials, I think that the "generic" approach Tom and I were trying to push above would probably be a good thing. |
Changed author from spancratz to Sebastian Pancratz |
Merged: sage-4.3.2.alpha0 |
Reviewer: Tom Boothby, Craig Citro |
comment:29
Merged patches in this order: |
Matrices with prescribed density are not generated correctly:
To wit: the actual density of the matrix over GF(2) is only approximately half of what we expect. This happens because the randomize() function populating the entries does not check whether the random element picked actually is non-zero. Apparently, all of the matrix classes are affected by this bug.
CC: @craigcitro
Component: linear algebra
Author: Sebastian Pancratz
Reviewer: Tom Boothby, Craig Citro
Merged: sage-4.3.2.alpha0
Issue created by migration from https://trac.sagemath.org/ticket/3436
The text was updated successfully, but these errors were encountered: