
Implement Q-based qr_jacobimatrix() in O(n) #138

Merged (33 commits) on Jun 8, 2023

Conversation

TSGut (Member) commented May 22, 2023

@ioannisPApapadopoulos @dlfivefifty

This PR adds an optional method argument to qr_jacobimatrix() which accepts either :Q or :R and performs the QR-based raising via the chosen factor, in O(n) either way. If no method is supplied, the default is :Q.

The Q method more or less matches the R method in efficiency (some efficiency is lost to make the resulting matrix fit into an adaptive SymTridiagonal, but that is also true for the R approach). More optimization is possible in principle, but I am not sure it is worth it, considering we can already do this:

julia> P = Normalized(legendre(0..1));

julia> x = axes(P,1);

julia> J = jacobimatrix(P);

julia> wf(x) = (1-x)^2;

julia> sqrtwf(x) = (1-x);

julia> Jchol = cholesky_jacobimatrix(wf, P);

julia> JqrQ = qr_jacobimatrix(sqrtwf, P);

julia> JqrR = qr_jacobimatrix(sqrtwf, P, :R);

julia> N = 100_000
100000

julia> @time Jchol[1:N,1:N]
  1.015375 seconds (13.59 M allocations: 988.733 MiB, 11.53% gc time)
100000×100000 BandedMatrix{Float64} with bandwidths (1, 1):
 0.25      0.193649
 0.193649  0.416667  0.225374
           0.225374  0.458333  0.236228
                     0.236228  0.475     0.241209
                               0.241209  0.483333  0.243904
                                         0.243904  0.488095  0.245525
                                                   0.245525  0.491071  0.246576
                                                             0.246576  0.493056  0.247296
                                                                       0.247296  0.494444  0.24781
 ⋮                                                                                         ⋱
                                                             0.25  0.5   0.25
                                                                   0.25  0.5   0.25
                                                                         0.25  0.5   0.25
                                                                               0.25  0.5

julia> @time JqrQ[1:N,1:N]
  1.347002 seconds (21.48 M allocations: 1.476 GiB, 12.26% gc time)
100000×100000 BandedMatrix{Float64} with bandwidths (1, 1):
 0.25      0.193649
 0.193649  0.416667  0.225374
           0.225374  0.458333  0.236228
                     0.236228  0.475     0.241209
                               0.241209  0.483333  0.243904
                                         0.243904  0.488095  0.245525
                                                   0.245525  0.491071  0.246576
                                                             0.246576  0.493056  0.247296
                                                                       0.247296  0.494444  0.24781
 ⋮                                                                                         ⋱
                                                             0.25  0.5   0.25
                                                                   0.25  0.5   0.25
                                                                         0.25  0.5   0.25
                                                                               0.25  0.5

julia> @time JqrR[1:N,1:N]
  1.179295 seconds (16.09 M allocations: 1.229 GiB, 11.27% gc time)
100000×100000 BandedMatrix{Float64} with bandwidths (1, 1):
 0.25      0.193649
 0.193649  0.416667  0.225374
           0.225374  0.458333  0.236228
                     0.236228  0.475     0.241209
                               0.241209  0.483333  0.243904
                                         0.243904  0.488095  0.245525
                                                   0.245525  0.491071  0.246576
                                                             0.246576  0.493056  0.247296
                                                                       0.247296  0.494444  0.24781
 ⋮                                                                                         ⋱
                                                             0.25  0.5   0.25
                                                                   0.25  0.5   0.25
                                                                         0.25  0.5   0.25
                                                                               0.25  0.5

Once this is done I will change my PR in SemiclassicalOPs to use this approach for the hierarchy.

TSGut (Member Author) commented May 22, 2023

@dlfivefifty Not sure what's going on here; I think the failing tests have nothing to do with my changes. Did something recently change further upstream in the dependencies?

dlfivefifty (Member) commented:

JuliaArrays/FillArrays.jl#254

MikaelSlevinsky (Member) commented:

Since we know the true result in this case, how is the 2-norm stability of each approach?

Some weight modifications preserve even-odd symmetries, like sqrtw(x) = 1-x^2. How does each approach fare for symmetry preservation?
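
For instance, a check along these lines (just a sketch, assuming the qr_jacobimatrix API above also accepts Legendre on [-1,1]): an even weight on a symmetric interval forces the diagonal of the Jacobi matrix to vanish, so the size of the computed diagonal entries directly measures the symmetry error.

using ClassicalOrthogonalPolynomials, LinearAlgebra
P = Normalized(Legendre())               # Legendre on [-1,1]
Jq = qr_jacobimatrix(x -> 1 - x^2, P)    # raise by w(x) = (1-x^2)^2, i.e. sqrtw(x) = 1 - x^2
norm(diag(Jq[1:1000, 1:1000]), Inf)      # should be ≈ 0 if even-odd symmetry is preserved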

TSGut (Member Author) commented May 22, 2023

Let's see...

julia> Jclass = jacobimatrix(Normalized(jacobi(2,0,0..1)))
ℵ₀×ℵ₀ LazyBandedMatrices.SymTridiagonal{Float64, ApplyArray{Float64, 1, typeof(vcat), Tuple{Float64, BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(-), Tuple{BroadcastVector{Float64, typeof(/), Tuple{Float64, BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Float64, Float64}}}}}, Fill{Float64, 1, Tuple{InfiniteArrays.OneToInf{Int64}}}}}, Float64}}}}, BroadcastVector{Float64, typeof(sqrt), Tuple{BroadcastVector{Float64, typeof(*), Tuple{BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Float64, Float64}}}, BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Float64, Float64}}}}}, Float64}}, ApplyArray{Float64, 1, typeof(vcat), Tuple{Float64, BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Int64, Int64}, InfiniteArrays.InfStepRange{Float64, Float64}}}, BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Float64, Float64}}}}}, Float64}}}}}}}}} with indices OneToInf()×OneToInf():
 0.25      0.193649
 0.193649  0.416667  0.225374
           0.225374  0.458333  0.236228
                     0.236228  0.475     0.241209
                               0.241209  0.483333  0.243904
                                         0.243904  0.488095  0.245525
                                                   0.245525  0.491071  0.246576
                                                             0.246576  0.493056  0.247296
                                                                       0.247296  ⋱
 ⋮                                                                               ⋱

julia> N = 100_000
100000

julia> norm(Jchol[1:N,1:N]-Jclass[1:N,1:N],2)
4.847754503646846e-7

julia> norm(JqrQ[1:N,1:N]-Jclass[1:N,1:N],2)
7.385957369779282e-14

julia> norm(JqrR[1:N,1:N]-Jclass[1:N,1:N],2)
1.5291452704376217e-13

Maybe ever so slightly better stability for Q than for R? But given the roughly equivalent cost (the Q method has a slightly larger constant, I think due to constructing the Givens / Householder reflectors), I don't see a reason not to use it. Keep in mind this comparison is unfair to Cholesky, since here it raises directly rather than step by step; done step by step it performs much closer to the QR methods.

Here is another example (again doing the not-ideal thing of raising directly instead of step by step) which shows similar behavior:

julia> wf(x) = x^2*(1-x)^2;

julia> sqrtwf(x) = x*(1-x);

julia> Jchol = cholesky_jacobimatrix(wf, P);

julia> JqrQ = qr_jacobimatrix(sqrtwf, P);

julia> JqrR = qr_jacobimatrix(sqrtwf, P, :R);

julia> Jclass = jacobimatrix(Normalized(jacobi(2,2,0..1)))
ℵ₀×ℵ₀ LazyBandedMatrices.SymTridiagonal{Float64, ApplyArray{Float64, 1, typeof(vcat), Tuple{Float64, BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(-), Tuple{BroadcastVector{Float64, typeof(/), Tuple{Float64, BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Float64, Float64}}}}}, Fill{Float64, 1, Tuple{InfiniteArrays.OneToInf{Int64}}}}}, Float64}}}}, BroadcastVector{Float64, typeof(sqrt), Tuple{BroadcastVector{Float64, typeof(*), Tuple{BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Float64, Float64}}}, BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Float64, Float64}}}}}, Float64}}, ApplyArray{Float64, 1, typeof(vcat), Tuple{Float64, BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Int64, Int64}, InfiniteArrays.InfStepRange{Float64, Float64}}}, BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Float64, Float64}}}}}, Float64}}}}}}}}} with indices OneToInf()×OneToInf():
 0.5       0.188982
 0.188982  0.5       0.218218
           0.218218  0.5       0.230283
                     0.230283  0.5       0.236525
                               0.236525  0.5       0.240192
                                         0.240192  0.5       0.242536
                                                   0.242536  0.5       0.244126
                                                             0.244126  0.5       ⋱
                                                                       0.245256  ⋱
 ⋮                                                                               ⋱

julia> N = 100_000
100000

julia> norm(Jchol[1:N,1:N]-Jclass[1:N,1:N],2)
8.88926155041543e-5

julia> norm(JqrQ[1:N,1:N]-Jclass[1:N,1:N],2)
7.653723227459066e-11

julia> norm(JqrR[1:N,1:N]-Jclass[1:N,1:N],2)
2.2962555778506003e-10

What would you like to see as a test of symmetry? I think Sheehan's ConvertedOrthogonalPolynomial currently always uses Cholesky (which makes sense, since it is more universal), so proper hierarchy testing may have to wait until we are in the SemiclassicalOPs context.

dlfivefifty (Member) commented:

I can't decide whether to ask why the errors are so large in the second example or so small in the first 😅 If round-off grows like N, we expect errors on the order of:

julia> eps() * 100_000
2.220446049250313e-11
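
For scale (an illustrative comparison only), square-root growth would instead give roughly

eps() * sqrt(100_000)   # ≈ 7.0e-14, closer to the Q and R errors in the first example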

TSGut (Member Author) commented May 22, 2023

I don't know. 😅 But I actually think the first example is more representative of the general behavior, not the second: in the second I raise by x and 1-x in one step, which is worse than doing them one at a time. In the semiclassical hierarchy we would be doing things like the first example.

TSGut (Member Author) commented May 22, 2023

Just to back up my claim with code, here is an example of raising by the other weight term on its own, at a higher point in the Jacobi family. Doing a single step, we see errors like those observed in the first example.

julia> P = Normalized(jacobi(3,3,0..1));

julia> x = axes(P,1);

julia> J = jacobimatrix(P);

julia> sqrtwf(x) = x;

julia> JqrQ = qr_jacobimatrix(sqrtwf, P);

julia> JqrR = qr_jacobimatrix(sqrtwf, P, :R);

julia> N = 100_000
100000
julia> Jclass = jacobimatrix(Normalized(jacobi(3,5,0..1)))
ℵ₀×ℵ₀ LazyBandedMatrices.SymTridiagonal{Float64, ApplyArray{Float64, 1, typeof(vcat), Tuple{Float64, BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(-), Tuple{BroadcastVector{Float64, typeof(/), Tuple{Float64, BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Float64, Float64}}}}}, Fill{Float64, 1, Tuple{InfiniteArrays.OneToInf{Int64}}}}}, Float64}}}}, BroadcastVector{Float64, typeof(sqrt), Tuple{BroadcastVector{Float64, typeof(*), Tuple{BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Float64, Float64}}}, BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Float64, Float64}}}}}, Float64}}, ApplyArray{Float64, 1, typeof(vcat), Tuple{Float64, BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(/), Tuple{BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Int64, Int64}, InfiniteArrays.InfStepRange{Float64, Float64}}}, BroadcastVector{Float64, typeof(*), Tuple{InfiniteArrays.InfStepRange{Float64, Float64}, InfiniteArrays.InfStepRange{Float64, Float64}}}}}, Float64}}}}}}}}} with indices OneToInf()×OneToInf():
 0.6       0.14771
 0.14771   0.566667  0.184374
           0.184374  0.547619  0.203579
                     0.203579  0.535714  0.215229
                               0.215229  0.527778  0.222909
                                         0.222909  0.522222  0.228266
                                                   0.228266  0.518182  0.232161
                                                             0.232161  0.515152  ⋱
                                                                       0.235087  ⋱
 ⋮                                                                               ⋱

julia> norm(JqrQ[1:N,1:N]-Jclass[1:N,1:N],2)
6.784308695781622e-14

julia> norm(JqrR[1:N,1:N]-Jclass[1:N,1:N],2)
1.619843167289299e-13

Same behavior in terms of norms.

TSGut (Member Author) commented May 22, 2023

Ok, actually the constant factor between the Q and R methods is more substantial than the simple timer suggested, because of an implementation detail in how the cached vectors are expanded.

Here are CPU timings showing the actual linear complexity of both methods with the current implementation.

[Plot: CPU time vs. matrix size for the Q and R methods when raising from Legendre, both scaling linearly]

I prematurely stopped optimizing the Q method because I thought it was matching R, but with the difference this big (roughly 5x the CPU time) it is worth taking another shot at bringing the cost down. Based on profiling, I have a few implementation ideas that should do it, so the two methods should end up closer together than they are now.

TSGut (Member Author) commented May 22, 2023

A factor-2 speed-up brings Q closer to the R method. Let's see if there is anything left to do that doesn't involve a new custom struct to replace SymTridiagonal (which is always an option).

[Plot: updated CPU timings for the Q and R methods raising from Legendre after the factor-2 speed-up]

codecov bot commented May 22, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.22% 🎉

Comparison is base (a143ef5) 89.69% compared to head (765d0e2) 89.91%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #138      +/-   ##
==========================================
+ Coverage   89.69%   89.91%   +0.22%     
==========================================
  Files          17       17              
  Lines        1756     1795      +39     
==========================================
+ Hits         1575     1614      +39     
  Misses        181      181              
Impacted Files Coverage Δ
src/ClassicalOrthogonalPolynomials.jl 86.60% <ø> (ø)
src/choleskyQR.jl 100.00% <100.00%> (ø)


TSGut (Member Author) commented May 22, 2023

Okay, I will leave it for now; I think the difference in cost has become acceptable, considering the Q variant has to construct parts of the Householder reflectors. The final comparison for raising from Legendre looks like this:
[Plot: final CPU timing comparison for the Q and R methods raising from Legendre]

Notably, raising from some random Jacobi basis seems to put the two methods closer together, so maybe R has some additional advantage due to how Legendre is implemented somewhere. Here is a more general Jacobi example:
[Plot: CPU timing comparison for the Q and R methods raising from a general Jacobi basis]

In any case, both are easily fast enough to go far beyond 100k entries, so we probably don't need to worry too much. I just didn't like the ~5x cost; <1.5x is something I can stomach given that the Q approach is more involved.

MikaelSlevinsky (Member) commented:

Let's say it's fast enough! There's probably a hard theoretical limit where the Q method is some specific constant slower than the R method.

Are you computing both bands of the modified Jacobi matrix independently? Looks like some code specializes on :dv and :ev.

TSGut (Member Author) commented May 23, 2023

Yes, but R and Q both do this, which makes both of them at least a factor of 2 slower than their theoretical limit, since they have to redo the computations for each band. It's a frustrating consequence of sticking with SymTridiagonal: we want this to be a cached SymTridiagonal, which takes two bands as input. To be consistent with SymTridiagonal, Julia needs to know how to extend each band individually, and since each band is its own object, it cannot tell Julia to simultaneously extend the other one. The two bands of a SymTridiagonal are unaware of each other.

The way around this would be to make a new struct, a subtype of AbstractCachedArray, which gets extended as a whole entity. That would be a factor-2 speed-up for both methods, and I actually had it set up that way before, but after Sheehan's feedback in the previous PR we settled on it being a SymTridiagonal.

This would be easy to do if we want to tease out more efficiency, but it would "fit" less well into existing packages, which expect a SymTridiagonal for a jacobimatrix. A rough sketch of the idea is below.
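
To make that concrete, here is a rough sketch of what such a jointly cached struct could look like (illustrative names only, not what this PR implements; it assumes a user-supplied fillband! that computes both bands over an index range in one pass):

using LinearAlgebra

# Illustrative sketch: a jointly cached symmetric tridiagonal whose two bands are
# expanded together, avoiding the duplicated per-band computations described above.
mutable struct JointCachedSymTridiagonal{T}
    dv::Vector{T}          # cached diagonal entries
    ev::Vector{T}          # cached off-diagonal entries
    datasize::Int          # number of entries filled so far
    fillband!::Function    # fillband!(dv, ev, kr) fills both bands for indices kr in one pass
end

function resizedata!(A::JointCachedSymTridiagonal, n::Int)
    if n > A.datasize
        resize!(A.dv, n); resize!(A.ev, n)
        A.fillband!(A.dv, A.ev, A.datasize+1:n)   # one pass extends both bands together
        A.datasize = n
    end
    A
end

# Return the n×n principal block as a standard SymTridiagonal.
principalblock(A::JointCachedSymTridiagonal, n::Int) =
    (resizedata!(A, n); SymTridiagonal(A.dv[1:n], A.ev[1:n-1]))

# Demo with the plain Legendre Jacobi matrix, whose bands are known in closed form.
function fill_legendre!(dv, ev, kr)
    for k in kr
        dv[k] = 0.0
        ev[k] = k / sqrt(4k^2 - 1)
    end
end
A = JointCachedSymTridiagonal(Float64[], Float64[], 0, fill_legendre!)
principalblock(A, 5)    # zero diagonal, off-diagonal k/sqrt(4k^2-1)

A getindex front-end would then just slice dv and ev, so both bands fill through a single shared cache instead of two independent ones.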

TSGut (Member Author) commented May 23, 2023

I should add: there are certain advantages to the current approach. Resizing a vector is better than resizing a matrix, so if we resize often at small dimensions (arguably the usual use-case) this will be faster. But to reach high N in one swoop, as in the above benchmarks, the current approach underperforms the theoretical limit by a factor of 2.

MikaelSlevinsky (Member) commented:

Maybe the key, then, is for both bands to store the same cached QR factorization of sqrtw? I think this could be done if the cached qr(sqrtw) were computed in qr_jacobimatrix and passed to both band constructors.

TSGut (Member Author) commented May 23, 2023

That would work, yes. It wouldn't affect the factor of 2 I mentioned, since that comes from all the Householder work we have to redo, but it should lead to a speedup nonetheless. Should be a quick change.

TSGut (Member Author) commented May 23, 2023

As a matter of fact, the Cholesky version already does this. So that's an oversight in the QR variant.

TSGut (Member Author) commented May 23, 2023

I am somewhat surprised that this doesn't appear to make a measurable performance impact, but it still makes sense to do it this way regardless (at the least it should save memory). I will check tomorrow whether I missed something, but I verified by selectively expanding that both bands are definitely sharing the same QR now, in the same way the Cholesky approach has been doing. Maybe it just wasn't a significant contribution in the high-N benchmarks, since I pre-fill before the entry-expansion loop.

dlfivefifty (Member) commented:

There are extreme amounts of allocations happening, and you should never form a dense Householder matrix, only apply it to a vector/matrix. So I'd expect the timings could be improved significantly (but it's not urgent).

TSGut (Member Author) commented May 23, 2023

I'll take another stab at reducing the cost in a bit.

TSGut (Member Author) commented May 23, 2023

Ok, so I of course wasn't computing dense Householder matrices per se; I was computing the dense acting block of the Householder reflector, which is an O(b) operation and allocation where b = bandwidths(sqrtW), so it's cheap.

I have benchmarked it now, and pre-allocating the b x b part of H (without ever densely computing H) is faster for computing all the required entries of the block H*M*H, where M is an O(b) x O(b) block that moves down the band as we expand. The reason it's cheaper to pre-allocate the important part of H, I think, is that, written out, the action of H*M*H is something like M - vv'M - Mvv' + vv'Mvv' (omitting the tau constants for simplicity). And we really do need all of those entries, not just one of them, to be adaptive. Applying the reflectors entry by entry unnecessarily recomputes vv' repeatedly, which in benchmarks costs more than pre-computing it once and re-using it (and that is equivalent to computing the acting block of H). So I don't think this path leads to any more speed-ups. To be clear, v, M and H are all only of size b or b x b, so there is very little to gain here either way. But maybe there is another special-case trick I am missing (it's clear to me that a one-sided application of H can be sped up this way; I'm just not sure about the two-sided case, hence why I benchmarked it).
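
For the record, that expansion is easy to sanity-check numerically on a small b x b block (a standalone check keeping the tau constants, not the package code):

using LinearAlgebra

# Check H*M*H == M - τ*v*v'*M - τ*M*v*v' + τ^2*v*v'*M*v*v' for H = I - τ*v*v'.
b = 4
v = randn(b)
τ = 2 / (v'v)       # this choice makes H a genuine (symmetric, orthogonal) Householder reflector
M = randn(b, b)
H = I - τ * v * v'
lhs = H * M * H
rhs = M - τ * v * (v' * M) - τ * (M * v) * v' + τ^2 * (v' * M * v) * v * v'
lhs ≈ rhs           # true: precomputing v*v' (the acting block of H) is exactly the reused piece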

Nevertheless, this tip did lead me to find an operation that was O(n*b^2) for no reason and reduce it to O(n*b), so Q is even closer to R now. The general Jacobi case is now strikingly similar:
[Plot: CPU timing comparison for raising from a general Jacobi basis after the latest optimization]

I think it's fair to say they now perform de facto the same. In any case, these optimizations are mainly for vanity, I think, or, more charitably, so that we can say in the paper that the two methods perform similarly well.

TSGut (Member Author) commented May 23, 2023

@dlfivefifty I am happy for this to be merged if all tests pass.

TSGut (Member Author) commented May 27, 2023

Ok, I reworked it so that both bands are generated together while retaining the nice interaction with SymTridiagonal. It is a strange object, so there may still be some things to tidy further, but it all works at least.

This hasn't affected the gap between R and Q, which is still about as narrow as shown above, but it has given us an approximately 1.5x speed-up (less than 2x because we were already resizing a joint QR / Cholesky, so the workload was already partially shared). Allocations have also been more than halved since the first working implementation.

TSGut requested a review from dlfivefifty on May 27, 2023.
TSGut (Member Author) commented Jun 7, 2023

@dlfivefifty Some mind-bending indexing later, this now just uses reflectorApply!. I don't see any meaningful performance / allocation improvements (but, to be fair, also no regressions; it's pretty much the same). As said before, the Q approach did dip under the R approach in allocations, so perhaps there just isn't much more to be gained. Perhaps there is also some new sneaky allocation somewhere. I will take my eyes off it for a bit and have another look later today.

TSGut (Member Author) commented Jun 7, 2023

Not sure why 1.7 specifically fails. I guess I have to download that version and check what's happening there.

TSGut (Member Author) commented Jun 7, 2023

Ok, so @dlfivefifty: apparently in 1.7 they changed reflectorApply!() to only accept matrices, i.e. the signature is reflectorApply!(x::AbstractVector, τ::Number, A::AbstractMatrix). This was reverted at some point; in 1.9 at least it reads reflectorApply!(x::AbstractVector, τ::Number, A::AbstractVecOrMat).
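
For reference, a minimal standalone illustration of what reflectorApply! does in the real case (my reading of the internal routine, not code from this PR):

using LinearAlgebra

A = randn(4, 3)
x = copy(A[:, 1])
τ = LinearAlgebra.reflector!(x)           # overwrites x with the compact Householder reflector
v = [1.0; x[2:end]]                       # full reflector vector, with an implicit leading 1
H = I - τ * v * v'
B = copy(A)
LinearAlgebra.reflectorApply!(x, τ, view(B, :, 2:3))   # in place: B[:, 2:3] becomes H * A[:, 2:3]
B[:, 2:3] ≈ (H * A)[:, 2:3]               # true up to round-off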

The 1.7 version seems to have generally lost functionality, so I am considering just copying the 1.9 version explicitly into this code. Alternatively, we could special-case 1.7, but I am not a big fan of that.

Your preference?

dlfivefifty (Member) commented:

Why not just make v1.9 required?

That would mean we can start using extensions (it doesn't look like anything currently here can be moved to an extension, but why not?).

TSGut (Member Author) commented Jun 8, 2023

That's alright with me, I just didn't want to make a package-level decision like that for you. I'll change it to do that. 👍

TSGut (Member Author) commented Jun 8, 2023

Ok, I changed it to require 1.9, to only test on 1.9, and bumped the version number, since I guess this requirement is a significant change.

Other than that, I am once again happy with the state of the PR. Ready for review or merge at your discretion, assuming the tests pass.

dlfivefifty (Member) commented:

I think it's fine for now. There are still a few obvious allocations, but it's ok.

@dlfivefifty dlfivefifty merged commit 526dc6d into JuliaApproximation:main Jun 8, 2023