Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rename extensions #476

Closed
wants to merge 1 commit into from
Closed

rename extensions #476

wants to merge 1 commit into from

Conversation

CarloLucibello
Copy link
Member

@CarloLucibello CarloLucibello commented Feb 18, 2023

Renaming extensions based on the discussion in JuliaMath/ChangesOfVariables.jl#13

Also done some extra cleaning of the CUDA extension (just removed all the non-source files and moved the tests to test/ext_cuda/) but still without actually hooking it (to be done in #445).

@CarloLucibello CarloLucibello changed the title remane extensions rename extensions Feb 18, 2023
@ToucheSir
Copy link
Member

I don't think we can rename NNlibCUDA because the extension name has to match the module name, and the module name has to match the registered package name. Likewise we'd need a way to run just the NNlibCUDA tests when someone calls ] test NNlibCUDA from the REPL, and I don't think this allows for that?

@CarloLucibello
Copy link
Member Author

I thought that the plan was to not have a NNlibCUDA package anymore, just an extension of NNlib.

@ToucheSir
Copy link
Member

We can't do that unless we want to stop supporting Julia <1.9 in both NNlib and Flux.

@CarloLucibello
Copy link
Member Author

why not? in the end we will want an extension in any case, it is the right fit for our use case. What are the alternatives? NNlibCUDA just as a subpackage or it is possible to have it as a subpackage and an extension at the same time? What was your intent with #445?

@ToucheSir
Copy link
Member

why not? in the end we will want an extension in any case, it is the right fit for our use case.

It means maintaining backport branches for any bugfixes or enhancements we want to make available for Julia 1.6-1.8 users. My recollection of the last time we tried this for Flux was that it didn't go great.

it is possible to have it as a subpackage and an extension at the same time

I think so, which is why I stopped working on #445 and submitted #471 first. However, I believe this requires us to keep the subpackage name as NNlibCUDA. Which means the extension name has to be NNlibCUDA (no Ext). If anyone knows of a way around that, please do chime in.

@pxl-th
Copy link
Member

pxl-th commented Feb 19, 2023

We can leave NNlibCUDA as is until we drop support for older versions.

@CarloLucibello
Copy link
Member Author

@ToucheSir I'm not sure this subpackage AND extension thing is supported. Extensions are supposed to be located at ext/AAAExt/AAAExt.jl or ext/AAAExt.jl, not at ext/AAAExt/src/AAAExt.jl. Is there any precedent?

With package images and extensions, julia 1.9 brings so much to the table that users and package developers will switch quickly to it, I don't see the need to keep supporting older versions. Also, I don't see such critical hotfixes these days that would make me worry about backporting.

@ToucheSir
Copy link
Member

I think it would be sufficient to add a ext/AAAExt/AAAExt.jl which includes ext/AAAExt/src/AAAExt.jl.

With package images and extensions, julia 1.9 brings so much to the table that users and package developers will switch quickly to it, I don't see the need to keep supporting older versions. Also, I don't see such critical hotfixes these days that would make me worry about backporting.

My worry is that it's just too early to ascertain whether there will be a switch en masse. 1.8 also brought great improvements for inference and TTFX, but as of a couple months ago I was seeing issues/questions from people at organizations using Julia who were running 1.7...

@pxl-th
Copy link
Member

pxl-th commented Feb 20, 2023

My recollection of the last time we tried this for Flux was that it didn't go great.

Was it due to some Julia infrastructure limitation?
Otherwise should be pretty easy to maintain two branches with backport fixes (even if manual) to the second one.


With push to 1.9 we can also start using KernelAbstractions and drastically reduce the amount of work needed to add functionality to different backends.

@ToucheSir
Copy link
Member

ToucheSir commented Feb 20, 2023

Was it due to some Julia infrastructure limitation?
Otherwise should be pretty easy to maintain two branches with backport fixes (even if manual) to the second one.

I don't quite remember and I can't seem to find issues talking about it now. @CarloLucibello would though. The registrator should work, but no idea if TagBot would. My recollection of the problem was that commits either made it onto the wrong branch or not all commits that needed to be backported were backported before a release was cut. Certainly if we wanted to do this going forward, I reckon we'd have to be a little more fastidious about squashing/rebasing PRs with lots of intermediate testing commits.

With push to 1.9 we can also start using KernelAbstractions and drastically reduce the amount of work needed to add functionality to different backends.

This was another factor, but I noticed recently that KA master lowered the Julia compat bound to 1.6? Maybe something to do with using it in GPUArrrays + CUDA.jl.

@pxl-th
Copy link
Member

pxl-th commented Feb 20, 2023

Maybe if we are not sure about the status of NNlibCUDA, we can go ahead only with AMDGPU for now?


I'm not sure that the amount of users that do not want to migrate to newer Julia versions is big and is more likely to be low. They certainly are not relying on some new features to come.

@ToucheSir
Copy link
Member

ToucheSir commented Feb 21, 2023

In hindsight #471 should've been merged into ext/NNlibCUDAExt, but I was ignorant of the discussion on extension naming conventions at the time. However, we may still be able to have our cake and eat it too by doing the following:

  1. Split https://github.com/FluxML/NNlib.jl/blob/master/ext/NNlibCUDA/src/NNlibCUDA.jl into two parts: one which has everything inside the module definition, and one which is just
module NNlibCUDA
  include("everything_inside.jl")
end # module
  1. Create a separate file ext/NNlibCUDAExt.jl or ext/NNlibCUDAExt/NNlibCUDAExt.jl with the following:
module NNlibCUDAExt
  include("[../]NNlibCUDA/src/everything_inside.jl")
end # module

It's not the prettiest since we'll still have this ext/NNlibCUDA directory hanging around, but it should work.

Alternatively, we could soft undo #471 by deleting ext/NNlibCUDA (the commits will remain, but the files on master will not). Then we'd have a cleaner slate with which to evaluate the following options:

  1. Add the extension as a copy-paste of NNlibCUDA source like Re-integrate NNlibCUDA as a package extension #445 and keep the standalone repo for the glue package. This would involve backports but it seems like we're not too worried about those.
  2. Do the same, but re-merge the NNlibCUDA source under lib/NNlibCUDA (i.e. a non-extension path) first. This would bring everything back into one repo and would let us do a sort of "copy-on-write" process where the extension initially just includes everything in the subpackage. Whenever the two diverge, we'd create new files in the extension directory and leave the subpackage ones untouched.
  3. Re-merge the NNlibCUDA source under ext/NNlibCUDAExt and do the two module/separate entry file process described above, but with a saner directory structure. I'm not sure an attempt to re-merge would work, but it's worth a try.

Edit: re-merging does work for option 2/3, but we get basically duplicate commits in the history.

Thoughts?

@CarloLucibello
Copy link
Member Author

Another option is to reintroduce NNlibCUDA in this repo under @require for julia < v1.9, as documented in the pkg extensions docs.
I think that would be the cleanest option since we would have the same behavior on all julia versions, it's easy to do, and it's an "officially supported" pattern.
So if we think that it is worth supporting < v1.9 version (which I don't) the path forward could be:

  • something like this PR, also adding Requires.jl
  • tag NNlib v0.9, so that NNlibCUDA (limited to NNlib v0.8) won't cause conflicts with the newer versions of NNlib

@ToucheSir
Copy link
Member

Don't think that would work because it introduces a circular dependency.

@CarloLucibello
Copy link
Member Author

Don't think that would work because it introduces a circular dependency.

what do you mean?

@ToucheSir
Copy link
Member

Reintroducing NNlibCUDA as a Requires dependency of NNlib introduces a circular dependency because NNlib is a dependency of NNlibCUDA. One way to get around this would be to just copy the NNlibCUDA code and put it behind a @requires block, but that runs into the same lack of precompilation issue #445 ran into.

@CarloLucibello
Copy link
Member Author

CarloLucibello commented Feb 24, 2023

Maybe a good compromise is
https://pkgdocs.julialang.org/dev/creating-packages/#Transition-from-normal-dependency-to-extension
This way, we don't have to drop previous julia versions, but the development of NNlibCUDA continues here and new cuda features will be available only on v1.9+.

@ToucheSir
Copy link
Member

I'm a little confused by your proposal, since the link talks about migrating normal dependencies to weak ones. Since NNlib is a dependency of NNlibCUDA and not the other way around, I don't understand how it applies here (compared to e.g. Flux where it would apply, because Flux depends on NNlibCUDA). Can you elaborate on the concrete changes which would have to be made to this repo?

@CarloLucibello
Copy link
Member Author

CarloLucibello commented Feb 26, 2023

@ToucheSir sorry that was me waking up confused in the middle of the night.
I think we both agree that at some point nnlibcuda will have to end up as an extension here, and the point is how to manage the transition.

The first proposal in #476 (comment) for having both a subpackage and an extension should in principle work but:

  • requires some una tantum logistic work (the new registration in General Registries) and the management of a subpackage
  • If I'm not mistaken it will introduce different behavior across julia versions. Code like
    using NNlib, NNlibCUDA
    ...
    will work fine on julia 1.8 and have issues on julia 1.9.
    On the contrary
    using NNlib, CUDA
    will give some functionalities on 1.9 but not on 1.8

Just dropping <1.9 support seems to me the best option by far. In case of the unlikely event of backports needed, backports will be made.

@ToucheSir
Copy link
Member

We could probably make at least using NNlib, CUDA, NNlibCUDA work on all Julia versions by adding a if VERSION ... block inside the top-level NNlibCUDA code, but I take your point about using NNlib, CUDA behaving differently. That said, isn't it a similar story for most package extensions?

Summarizing the current most popular option to make sure we're all on the same page:

  • NNlibCUDA package source code remains in a separate repo
  • Copy of NNlibCUDA source here will be directly reworked as package extension without backwards compat path for Julia 1.9
  • Further development on NNlib + CUDA integration will happen on the extension
  • Any features or bugfixes we want to make available to Julia <1.9 users will be manually backported to the NNlibCUDA repo.

Does that sound right?

@CarloLucibello
Copy link
Member Author

yes. I will repurpose this PR to do that

@CarloLucibello
Copy link
Member Author

closing in favor of #481

@CarloLucibello CarloLucibello deleted the cl/ext branch June 15, 2023 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants