limit the number of parallel build process #2767

Closed
shionryuu opened this issue Jan 14, 2023 · 3 comments

@shionryuu

Environment

$ ./rebar3 report "./rebar3 compile"
Rebar3 report
 version 3.20.0
 generated at 2023-01-14T07:08:52+00:00
=================
Please submit this along with your issue at https://github.com/erlang/rebar3/issues (and feel free to edit out private information, if any)
-----------------
Task: ./rebar3
Entered as:
  ./rebar3 compile
-----------------
Operating System: x86_64-pc-linux-gnu
ERTS: Erlang/OTP 25 [erts-13.1.3] [source] [64-bit] [smp:6:6] [ds:6:6:10] [async-threads:1] [jit:ns]
Root Directory: /usr/lib/erlang
Library directory: /usr/lib/erlang/lib
-----------------
Loaded Applications:
bbmustache: 1.12.2
certifi: 2.9.0
cf: 0.3.1
common_test: 1.23.2
compiler: 8.2.2
crypto: 5.1.2
cth_readable: 1.5.1
dialyzer: 5.0.4
edoc: 1.2
erlware_commons: 1.5.0
eunit: 2.8.1
eunit_formatters: 0.5.0
getopt: 1.0.1
inets: 8.2
kernel: 8.5.2
providers: 1.9.0
public_key: 1.13.2
relx: 4.7.0
sasl: 4.2
snmp: 5.13.2
ssl_verify_fun: 1.1.6
stdlib: 4.2
syntax_tools: 3.0
tools: 3.5.3

-----------------
Escript path: /home/user/Workspace/erlang/project/rebar3
Providers:
  app_discovery as clean compile compile cover ct deps dialyzer do edoc escriptize eunit get-deps help install install_deps list lock new path pkgs release relup report repos shell state tar tree unlock update upgrade upgrade upgrade vendor version xref 

Current behaviour

`rebar3 compile` spawns too many processes, consumes too many resources (CPU, memory), and ends up being killed by the system.

$ DEBUG=1 ./rebar3 compile
===> Expanded command sequence to be run: [app_discovery,install_deps,lock,compile]
===> Running provider: app_discovery
===> Found top-level apps: [project]
	using config: [{src_dirs,["src"]},{lib_dirs,["apps/*","lib/*","."]}]
===> Evaluating config script "/home/user/Workspace/erlang/project/_build/default/lib/lager/rebar.config.script"
===> Running provider: install_deps
===> Verifying dependencies...
===> Comparing git ref ...
===> Running provider: lock
===> Running provider: compile
===> Compile (apps)
===> Running hooks for compile with configuration:
===> 	{pre_hooks, []}.
===> Copying existing files from /home/user/Workspace/erlang/project/ebin to /home/user/Workspace/erlang/project/_build/default/lib/project/ebin
===> Compile (project_apps)
===> Running hooks for compile in app project (/home/user/Workspace/erlang/project) with configuration:
===> 	{pre_hooks, []}.
===> Running hooks for erlc_compile in app project (/home/user/Workspace/erlang/project) with configuration:
===> 	{pre_hooks, []}.
===> Analyzing applications...
=ERROR REPORT==== 14-Jan-2023::15:03:08.009214 ===
File operation error: emfile. Target: /home/user/Workspace/erlang/project/_build/default/lib/lager/ebin/erl_posix_msg.beam. Function: get_file. Process: code_server.
=ERROR REPORT==== 14-Jan-2023::15:03:08.039182 ===
File operation error: emfile. Target: /home/user/Workspace/erlang/project/_build/default/lib/mysql_poolboy/ebin/erl_posix_msg.beam. Function: get_file. Process: code_server.
=ERROR REPORT==== 14-Jan-2023::15:03:08.063239 ===
File operation error: emfile. Target: /home/user/Workspace/erlang/project/_build/default/lib/eredis/ebin/erl_posix_msg.beam. Function: get_file. Process: code_server.
=ERROR REPORT==== 14-Jan-2023::15:03:08.458702 ===
Error in process <0.26822.0> with exit value:
{{nocatch,
     {error,
         {rebar_compiler_erl,
             {cannot_read_file,
                 "/home/user/Workspace/erlang/project/src/server/server_timer.erl",
                 "too many open files"}}}},
 [{rebar_compiler_erl,dependencies,4,
      [{file,
           "/home/runner/work/rebar3/rebar3/apps/rebar/src/rebar_compiler_erl.erl"},
       {line,127}]},
  {rebar_compiler_dag,'-prepopulate_deps/5-fun-0-',5,
      [{file,
           "/home/runner/work/rebar3/rebar3/apps/rebar/src/rebar_compiler_dag.erl"},
       {line,461}]}]}

...

[1]    39118 killed     DEBUG=1 ./rebar3 compile
$ find . -name '*.erl' | wc -l
5501

Expected behaviour

Limit the number of parallel build processes.

@ferd
Collaborator

ferd commented Jan 17, 2023

Hm, that number should already be limited, at least in the actual compilation.
The limit is set at

queue(Tasks, WorkF, WArgs, Handler, HArgs) ->
    Parent = self(),
    Worker = fun() -> worker(Parent, WorkF, WArgs) end,
    Jobs = min(length(Tasks), erlang:system_info(schedulers)),
    ?DIAGNOSTIC("Starting ~B worker(s)", [Jobs]),
    Pids = [spawn_monitor(Worker) || _ <- lists:seq(1, Jobs)],
    parallel_dispatch(Tasks, Pids, Handler, HArgs).

The worker queue is set up at:

compile_parallel(Targets, Opts, BaseOpts, Mappings, CompilerMod) ->
    Tracking = erlang:function_exported(CompilerMod, compile_and_track, 4),
    rebar_parallel:queue(
        Targets,
        fun compile_worker/2, [Opts, BaseOpts, Mappings, CompilerMod],
        fun compile_handler/2, [BaseOpts, Tracking]
    ).

This should give you no more compile units than there are schedulers on your VM for normal Erlang code builds.

However, the issue you show comes from the code server in the analysis step. It is possible that it originates in the OTP-provided epp module, which we use to parse source files into their abstract forms:

deps(File, Opts) ->
    {EppOpts, ExtraOpts} = split_opts(Opts),
    {ok, Forms} = epp:parse_file(File, EppOpts),
    normalize(handle_forms(Forms, default_attrs(), ExtraOpts)).

What I've found is that a patch meant to speed up the DAG analysis introduced more concurrency, and unbounded concurrency at that:

%% Add dependencies of a given file to the DAG. If the file is not found yet,
%% mark its timestamp to 0, which means we have no info on it.
%% Source files will be covered at a later point in their own scan, and
%% non-source files are going to be covered by `populate_deps/3'.
prepopulate_deps(Compiler, InDirs, Source, DepOpts, Control) ->
    {Worker, _MRef} = spawn_monitor(
        fun () ->
            SourceDir = filename:dirname(Source),
            AbsIncls = case erlang:function_exported(Compiler, dependencies, 4) of
                false ->
                    Compiler:dependencies(Source, SourceDir, InDirs);
                true ->
                    Compiler:dependencies(Source, SourceDir, InDirs, DepOpts)
            end,
            Control ! {deps, self(), AbsIncls}
        end
    ),
    Worker.

That was added in a performance drive in 2020: #2322

Chances are that the only fix needed is to make the DAG analysis use a worker queue, so it stays fast while being bounded. I can try experimenting with that; it should be workable.
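To make the idea of a bounded worker queue concrete, here is a minimal sketch that runs a function over a list of items with at most a fixed number of concurrent processes. The module name `bounded_scan` and the function `map/3` are hypothetical and chosen for illustration only; this is not rebar3's `rebar_parallel` API, and the fix that landed in rebar3 may be structured differently.

```erlang
-module(bounded_scan).
-export([map/3]).

%% Hypothetical helper (not part of rebar3): apply Fun to every item in
%% Items using at most MaxWorkers concurrent processes. Results come back
%% in the same order as Items.
map(_Fun, [], _MaxWorkers) ->
    [];
map(Fun, Items, MaxWorkers)
  when is_function(Fun, 1), is_integer(MaxWorkers), MaxWorkers > 0 ->
    Indexed = lists:zip(lists:seq(1, length(Items)), Items),
    Results = loop(Fun, Indexed, #{}, MaxWorkers, []),
    [Res || {_Idx, Res} <- lists:keysort(1, Results)].

%% Running maps each worker's monitor reference to the index of its item.
loop(_Fun, [], Running, _Max, Acc) when map_size(Running) =:= 0 ->
    %% Nothing pending and nothing running: we are done.
    Acc;
loop(Fun, [{Idx, Item} | Rest], Running, Max, Acc)
  when map_size(Running) < Max ->
    %% There is a free slot: start one more worker. The worker hands its
    %% result back through its exit reason, which arrives in the 'DOWN'
    %% message of the monitor.
    {_Pid, MRef} = spawn_monitor(fun() -> exit({done, Fun(Item)}) end),
    loop(Fun, Rest, Running#{MRef => Idx}, Max, Acc);
loop(Fun, Pending, Running, Max, Acc) ->
    %% All slots are busy (or only running work remains): wait for a
    %% worker to finish before starting anything else.
    receive
        {'DOWN', MRef, process, _Pid, {done, Res}}
          when is_map_key(MRef, Running) ->
            {Idx, Running1} = maps:take(MRef, Running),
            loop(Fun, Pending, Running1, Max, [{Idx, Res} | Acc]);
        {'DOWN', MRef, process, _Pid, Reason}
          when is_map_key(MRef, Running) ->
            error({worker_failed, maps:get(MRef, Running), Reason})
    end.
```

A call such as `bounded_scan:map(fun file:read_file/1, Files, erlang:system_info(schedulers))` would then keep no more files open at once than there are schedulers, regardless of how many source files the project contains.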

ferd added a commit to ferd/rebar3 that referenced this issue Jan 23, 2023
This addresses erlang#2767 by creating
a pool mechanism in rebar_parallel that keeps its interface as close as
possible to that of the queue mechanism, with the one caveat that it allows
tasks to be created asynchronously rather than requiring them all at
start time.

The mechanism is not tested very deeply, which is probably a mistake,
but I wanted to get a reviewable PR first.

The mechanism is also added to the rebar_compiler_dag module to cover
use cases that were previously handled by spawning an unbounded number of
processes, which would cause problems when few file descriptors are
available and many files are opened in parallel. The pool
mechanism puts an upper bound on processing but also on resource usage.

So this PR may also come with a performance regression, and if so we'd
want to override the default 1-per-scheduler pool options to use a lot
more workers and find a middle ground between performance and resource usage.
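
The difference between a queue that needs every task up front and a pool that accepts tasks as they are discovered can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: the module name `async_pool` and the functions `start/1`, `submit/2` and `drain/1` are hypothetical.

```erlang
-module(async_pool).
-export([start/1, submit/2, drain/1]).

%% Hypothetical API (not rebar_parallel): start a pool process that runs
%% at most Max tasks at the same time.
start(Max) when is_integer(Max), Max > 0 ->
    spawn_link(fun() -> pool(Max, queue:new(), 0, [], undefined) end).

%% Hand a zero-arity fun to the pool. Returns immediately, so new tasks
%% can be submitted while earlier ones are still running.
submit(Pool, Task) when is_function(Task, 0) ->
    Pool ! {task, Task},
    ok.

%% Wait until every submitted task has finished, return the results in
%% completion order, and let the pool process terminate.
drain(Pool) ->
    Ref = make_ref(),
    Pool ! {drain, self(), Ref},
    receive {Ref, Results} -> Results end.

pool(Max, Queue, Active, Results, Waiter) ->
    case {queue:out(Queue), Waiter} of
        {{{value, Task}, Rest}, _} when Active < Max ->
            %% A slot is free and work is queued: start one worker.
            %% Workers are linked, so a crashing task takes the pool down.
            Self = self(),
            spawn_link(fun() -> Self ! {done, Task()} end),
            pool(Max, Rest, Active + 1, Results, Waiter);
        {{empty, _}, {From, Ref}} when Active =:= 0 ->
            %% Drain was requested and nothing is queued or running.
            From ! {Ref, lists:reverse(Results)};
        _ ->
            %% Otherwise wait for new tasks, finished workers, or a drain.
            receive
                {task, NewTask} ->
                    pool(Max, queue:in(NewTask, Queue), Active, Results, Waiter);
                {done, Res} ->
                    pool(Max, Queue, Active - 1, [Res | Results], Waiter);
                {drain, WFrom, WRef} ->
                    pool(Max, Queue, Active, Results, {WFrom, WRef})
            end
    end.
```

Because Erlang preserves message order between a pair of processes, tasks submitted before `drain/1` from the same caller are always queued before the drain request is seen, so none of them is lost.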
@ferd
Collaborator

ferd commented Jan 23, 2023

See #2768 for a potential fix to this. It seems fragile, but may allow checking the early approach.

@shionryuu
Author

My project builds successfully after checking out #2768. Thank you.

ferd added a commit to ferd/rebar3 that referenced this issue Feb 15, 2023