Grisu #7291
Wow!! I hope that test file was automatically generated? :) |
Wow indeed. That is a massive test file. :P If you want to remove Chunks like this can just be removed in their entirety, and when you're removing dependencies you'll need to excise them from makefile rules that depend on them. These two are the only ones you'll need to worry about in You should remove any patches or wrappers we have stored in |
So you mean I've packaged double-conversion in Fedora for nothing? I can't keep up! ;-) |
quelle surprise! |
@nalimilan This is not tagged for 0.3, and I can't see any reason why this should be merged right before a release. That way (I think) you'll have plenty of time to maintain the fedora package until it becomes obsolete. |
@ivarne Glad to hear that! |
Very flashy, posting this the day after the Dates.jl PR :) How does he do it??! |
Well, I figured if I was going to do a PR that was net +3K code and tests (Dates.jl), I should also do one that made up for it! ;) Yeah, there's no reason this needs to be 0.3. The great thing about a change like this is that it's purely internal machinery. A few more thoughts I forgot to share:
|
…nce with the grisu algorithm (printing the shortest form that can be read back in accurately).
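To make the "shortest form that can be read back in accurately" property concrete, here is a minimal brute-force sketch in Python. It is not Grisu and not the code in this PR: it tries increasing digit counts until the correctly rounded string parses back to the same double. The function name `shortest_roundtrip` is hypothetical, and the sketch assumes a finite IEEE 754 double as input.

```python
def shortest_roundtrip(x: float) -> str:
    """Return the correctly rounded string with the fewest significant
    digits that parses back to exactly x (brute force, not Grisu)."""
    # 17 significant digits always suffice for an IEEE 754 double,
    # so the loop returns for every finite input.
    for ndigits in range(1, 18):
        s = f"{x:.{ndigits}g}"
        if float(s) == x:
            return s
    raise ValueError("expected a finite float")


print(shortest_roundtrip(0.1))        # 0.1
print(shortest_roundtrip(0.1 + 0.2))  # 0.30000000000000004
print(shortest_roundtrip(1 / 3))      # 0.3333333333333333
```

Grisu aims at the same read-back guarantee without the repeated format-and-parse loop, which is where the speed comes from.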
Really cool! It would be great to avoid the dependency on double-conversion. |
One other thing to note (now that I fixed the travis failure for it), is that, following up on #5948, |
The use of GMP here seems to be very specific, so we could probably roll out a small pure-Julia quasi-BigInt implementation if desired/needed. |
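As a rough illustration of why arbitrary-precision integers show up in float printing at all, the Python sketch below writes out the exact decimal expansion of a double using only integer arithmetic. It is an assumption-laden illustration, not the bignum code from double-conversion or from this PR; `exact_decimal` is a hypothetical helper and it handles finite values only.

```python
def exact_decimal(x: float) -> str:
    """Exact decimal expansion of a finite float, using integer arithmetic only."""
    num, den = x.as_integer_ratio()   # x == num / den exactly; den is a power of two
    sign = "-" if num < 0 else ""
    num = abs(num)
    shift = den.bit_length() - 1      # den == 2**shift
    # num / 2**shift == (num * 5**shift) / 10**shift, so the digits of
    # num * 5**shift are the decimal digits of x with `shift` decimal places.
    scaled = num * 5 ** shift
    digits = str(scaled).rjust(shift + 1, "0")
    if shift == 0:
        return sign + digits
    int_part, frac_part = digits[:-shift], digits[-shift:].rstrip("0")
    return sign + int_part + ("." + frac_part if frac_part else "")


print(exact_decimal(0.5))
print(exact_decimal(0.1))
# The smallest subnormal needs an integer far wider than 64 bits:
print((5 ** 1074).bit_length())
```

The intermediate `scaled` value does not fit in a machine word for small or large exponents, which is the kind of narrow bignum use (multiply, compare, divide by small values) a minimal pure-Julia replacement would need to cover.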
Yeah, the BigInt that double-conversion rolls doesn't actually look too complicated, so that could be a good template to go off. I did have a question for all the license experts out there: the original library is here, and since I'm almost completely ignorant of license stuff, do I need to copy over the license from the original files? Anything else I need to do? |
/dsfmt-*
/fftw-*
!fftw-config-nopthreads.patch
/git-*
/gmp-*
/grisu-*
/libgrisu.so
It would probably be good to keep these gitignore patterns so that one can switch branches without risking accidentally checking in these files (or being curious as to why they suddenly appeared). It is somewhat annoying when git complains about files not checked in.
Maybe we could have a block at the end of the .gitignore file, something like:
# obsolete dependencies
/double-conversion-*
grisu-*
/libgrisu.so
Some automatic way of deletion would probably also be good. What about adding a few rm -rf commands to make clean?
Good point. I'll revert the .gitignore changes. For the deletion part, you're saying that when someone runs make clean, there should be a rule that gets rid of the old deps/double-conversion stuff?
I wouldn't worry about the automatic deletion. In general, we do not delete old versions when we update to new versions. Over time, people will compile a fresh checkout of julia and it will be clean. @staticfloat Thoughts?
Depends. If you looked at the code while implementing this (rather than the original paper), this probably counts as a derived work, so you'll have to include the license. |
We may be able to ask Florian Loitsch if we can relicense it as MIT since BSD 2-clause is equivalent and link to his library as the original code. |
In the longer term, we have to consider that many people might not want to depend on GMP, but want to print floats. |
@JeffBezanson, for sure, and just to reiterate, we're only talking about 0.6% of all Float64s that can't be reliably printed with the native grisu algorithm, which in my mind, makes it not as big a deal if the fallback is simply I definitely spent a good chunk of time digging through the actual code (as well as reading thru the paper). @StefanKarpinski, that seems like a good idea, so I would just stick a link to his library code at the top of each file? |
Unfortunately, it appears that Google owns the double-conversion code base. Florian asked me to sign a CLA when I added a few lines to make a package. So any relicensing will probably have to be accepted by the Google management... |
In a sense that's good news since there's a single owner. For now, we can just stick the BSD license and on each of those files. BSD isn't viral and is compatible with MIT so it's not problematic. It would just be simpler if we could have these be MIT-licensed. We could also ask for one-off permission to create an MIT-licensed derived work. |
The BSD 2-clause license is fully compatible with MIT. Why bother with relicensing? It's not like the rest of the codebase and dependencies are all MIT licensed anyways. |
The original code is BSD 3-clause |
Ok, make the |
Maybe a comment block on the top of each file that says something like:
# This file implements the Grisu algorithm, but its implementation was
# inspired by the double-conversion library from Google. The original
# copyright and 3-clause BSD licence in base/grisu/LICENSE may apply.
The licence file might also benefit from a header explaining why it appears in the middle of a Julia repository. |
Wooooooooooo!!!! |
I love the dev period. |
|
Well, that's exciting. |
How bad can it be with a travis green light, right? Plus, this crazy-everything's-going-to-break dev period has been far too mild so far IMO. I think I'm the only one who's actually broken anything... 😉 |
Was that not a real warning message, @IainNZ? |
No sorry was just a joke! I'm sure it'll be fine. We'll find out in 12 hours or so! |
I'm skeptical there are many packages relying on/testing how floats are printed anyway. It'll be interesting to see. |
Haha. Ok, I was not sure but even if we did break all packages on 0.4-dev, that would be ok. |
So what library is going to be ported next? :) |
I think good candidates for porting are
That said, I have no idea what other dependencies we'd want to eventually get rid of; libuv? openlibm? pcre? mpfr/gmp for better performance? |
OpenBLAS. You know you want to. |
Single best candidate for replacing next is arpack IMO. It's been causing problems, is not that well maintained, is in Fortran so difficult to work on, and isn't as enormous a library as say OpenBLAS. |
librmath? zlib? gmp/mpfr? and openblas, like Jiahao said. suitesparse would also be a good candidate (due to license). low level libraries (llvm, libunwind, libuv) would probably be the last to go. |
eliminating fortran codes (openspecfun?) would make julia easier to compile |
Not using LLVM and libuv would be actively detrimental – those give us regular free improvements and bug fixes when they get upgraded. |
@tkelman we're working on ARPACK. Promise. |
Are we still depending on librmath? I thought that was pretty much excised. |
We still need librmath. |
@jiahao good, it's obviously not a trivial exercise, that much is clear. Jutho and Viral have been doing good work on that front. The modularity issues apply here - for the remaining large and useful deps, it'll be good to refactor in order to make them optional. Jeff has mentioned doing this for LLVM, FFTW is underway, GMP and MPFR and SuiteSparse are good targets for modularizing. Hell, even trying to make OpenBLAS/Lapack optional is a good long-term goal. Merging pure-Julia Grisu just made GMP more necessary, didn't it? |
@tkelman yes, pure-Julia Grisu uses BigInts. That's another 600 LOC or so to roll a minimal BigInt that's needed. Maybe if the modularizing gets more serious, I'll bump that up on the priority list. |
We will need a lot of compiler improvements and multithreading to get rid of the various linear algebra libraries. It is in sight but not there yet. If we can get a gemm kernel with comparable performance, we can dump OpenBLAS quickly. I feel that LAPACK can just be translated to Julia with an f2j script, but you really want to rewrite it for multithreading. |
First blood! Keno/SIUnits.jl#38 |
Second! Keno/TexExtensions.jl#2 |
Leave it to @Keno to use unexported macros from an unexported module in Base....:) |
And GiovineItalia/Gadfly.jl#409 |
Implement grisu algorithm in native Julia. Fixes #5959. In particular, this creates a more Julian API to grisu in allowing any size float to be printed in its "shortest" form (while still being parsed back to the correct value), which is a surprisingly tricky problem (link).
The net code reduction is nice: from 6K lines of C++ to about 1K of Julia. Initial performance benchmarks show it's just under 2x the native C++.
I would also appreciate someone testing this out on a 32-bit machine; I don't foresee any cross-platform issues, but some of the bit-twiddling may cause problems between 32/64-bit.
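For anyone checking the 32/64-bit concern, the kind of bit-twiddling involved can be sketched as follows. This is an illustrative Python sketch, not the Julia code in this PR; `decompose` is a hypothetical name, and the point is only that the widths are pinned to 64 bits explicitly rather than tied to the platform word size. Infinities and NaNs are not handled.

```python
import math
import struct


def decompose(x: float):
    """Split a finite double into (sign, exponent, significand) such that
    abs(x) == significand * 2**exponent, with an integer significand."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    biased_exp = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    if biased_exp == 0:
        # Zero or subnormal: there is no implicit leading 1 bit.
        return sign, -1074, fraction
    # Normal number: restore the implicit leading bit of the significand.
    return sign, biased_exp - 1075, fraction | (1 << 52)


sign, exponent, significand = decompose(-0.5)
print(sign, exponent, significand)
print(math.ldexp(significand, exponent))  # magnitude reconstructed from the parts
```

If the significand were held in a platform-sized integer instead of an explicit 64-bit one, the shifts and masks above would silently truncate on a 32-bit machine, which is the class of problem worth testing for.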