Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Distributed with Malt soon! #2240

Merged
merged 46 commits into from
Sep 18, 2023
Merged

Replace Distributed with Malt soon! #2240

merged 46 commits into from
Sep 18, 2023

Conversation

savq
Copy link
Contributor

@savq savq commented Aug 5, 2022

After this PR, we will no longer use the Distributed stdlib to start and control the notebook Julia process (where your code runs) but we use our home-brewed package Malt.jl. The advantages are:

  • It is not Distributed, which means that your code does not run as a Distributed worker. This fixes Allow using Distributed #300.
  • The process launches faster, about 1 second with Malt instead of 4 seconds with Distributed.
  • Slightly improved developer experience: the API of Malt is more intuitive for our use-case and we need fewer workarounds.
  • Room for future improvement: splitting Malt into its own package makes it much easier to improve our multiprocessing functionality (because the codebase and test/benchmark suite is much smaller). This might include:

This PR replaces Distributed with Malt.jl to add support for Distributed itself. Fix #300

Test release

After this PR is merged, we have a testing period where Pluto still uses Distributed, but you can enable Malt with:

Pluto.run(workspace_use_distributed_stdlib=false)

This PR is part of GSoC 2022. Some info on that here.

@github-actions
Copy link
Contributor

github-actions bot commented Aug 5, 2022

Try this Pull Request!

Open Julia and type:

julia> import Pkg
julia> Pkg.activate(temp=true)
julia> Pkg.add(url="https://github.com/savq/Pluto.jl", rev="malt")
julia> using Pluto

@savq savq force-pushed the malt branch 3 times, most recently from afd0fa0 to 9c6a566 Compare September 4, 2022 20:50
@savq savq force-pushed the malt branch 4 times, most recently from 344ac94 to 8beea21 Compare September 24, 2022 23:53
@savq savq force-pushed the malt branch 2 times, most recently from e528ace to e948e7b Compare October 1, 2022 07:00
@disberd
Copy link
Contributor

disberd commented Oct 11, 2022

Would this change allow to interrupt cells on windows as well, or is that broken anyhow?

- Prefer quote blocks to colon quote expressions
- Prefer if blocks to ternary operators
- Remove single pipes
- Minor changes to docstrings and comments
- Remove `workspace_use_distributed` flag. Pluto will now always use
  separate processes.
- Replace calls to `remotecall_eval` with the appropiate Malt calls.
- Replace Distributed with Malt in tests to remove it from dependencies.
  - Add `unmake_workspace` calls in `MacroAnalysis` to prevent creating
    too many processes at the same time.
- Replace Distributed exceptions with Malt exceptions.
- Replace `pid::Integer` with `worker::Worker` in WorkspaceManager.
- Rename `workspaces` to `active_workspaces`.
- Rename `distributed_exception_result` to `workspace_exception_result`.
- Rename `make_distributed_serializable` to `make_serializable`.
@savq savq mentioned this pull request Oct 16, 2022
@fonsp fonsp added backend Concerning the julia server and runtime wide audience This affects a wide audience of Pluto users and future Pluto users performance labels Apr 26, 2023
@fonsp
Copy link
Owner

fonsp commented Sep 13, 2023

Wowww well found @Pangoraw !
This is currently set on all distributed procs for every OS, right? Let's do the same by default on Malt directly?

@fonsp
Copy link
Owner

fonsp commented Sep 13, 2023

Not sure about the nightly test, and on 1.6 the test never seems to start...

@Pangoraw
Copy link
Collaborator

This is currently set on all distributed procs for every OS, right? Let's do the same by default on Malt directly?

yes, this is probably the right move.

For nightly, we could test 1.10.0-beta2 instead? and for 1.6, I don't know...

@fonsp
Copy link
Owner

fonsp commented Sep 13, 2023

Which 1.10 are we testing right now? Testing the beta sounds good!

I also made #2643 to avoid the interrupt call. This means avoiding the problem instead of fixing it, but I don't think that supporting schedule(pluto_server_task, InterruptException()) on Windows is something we need to test. It's important that Ctrl+C in the terminal shuts down the server, but I think that we are simulating this wrong with schedule, so I don't really know how to test for that.

@Pangoraw
Copy link
Collaborator

nightly is actually julia master (future 1.11), it would be nice if a docker tag for 1.10 existed so that we can get the alphas/betas & stable release for 1.10 automatically.

@fonsp
Copy link
Owner

fonsp commented Sep 13, 2023

Still failing with out-of-memory on Windows Julia 1.6 :(

@fonsp
Copy link
Owner

fonsp commented Sep 16, 2023

Maybe we just merge for testing (disabled by default), and eventually only enable Malt as default on Julia 1.8 and up?

@Pangoraw wdyt, any other ideas?

@fonsp
Copy link
Owner

fonsp commented Sep 16, 2023

Or @savq any ideas why we have this last failure with Malt julia 1.6 windows, but not with Distributed backend or another julia distro?

@Pangoraw
Copy link
Collaborator

@Pangoraw wdyt, any other ideas?

I don't have more ideas, we might have to test in real windows and see if we can spot the problem here.

@fonsp
Copy link
Owner

fonsp commented Sep 18, 2023

(btw some improvement to Crrl+C on windows in JuliaLang/julia#51307)

@pankgeorg
Copy link
Collaborator

(btw some improvement to Crrl+C on windows in JuliaLang/julia#51307)

I do not see an improvement, see https://julialang.slack.com/archives/C6SMTHQ3T/p1694888153763359?thread_ts=1694639864.397389&cid=C6SMTHQ3T (which contains that PR)

@fonsp
Copy link
Owner

fonsp commented Sep 18, 2023

Heyy so I think I know what happened!

This initial PR removed the workspace_use_distributed option altogether, then later we added it to Malt (JuliaPluto/Malt.jl#39) but some of the tests in the PR still had the workspace_use_distributed=false setting removed.

I fixed this in two recent commits: 7b09e1d (#2240) and a37fd31 (#2240) and then suddenly the tests passed with Malt enabled! fd8c68a (#2240)

I noticed that we were also getting OOM errors with the Distributed backend as default (27307e3 (#2240)) which means that it's not caused by Malt :) It's just some freaky combination of starting/closing an HTTP server while starting a Julia process that does not work on Windows.

Yayy so I will merge this with Malt still disabled by default, for the testing period, and then we can enable it by default soon (for all Julia versions)!

@fonsp fonsp changed the title Replace Distributed with Malt Replace Distributed with Malt soon! Sep 18, 2023
@fonsp fonsp merged commit a7a424c into fonsp:main Sep 18, 2023
14 of 15 checks passed
@Pangoraw
Copy link
Collaborator

Awesome!! 🎉 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend Concerning the julia server and runtime performance wide audience This affects a wide audience of Pluto users and future Pluto users
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow using Distributed
6 participants