Unbounded %MEM growth #2628

dcnorris · 2024-10-19T11:31:56Z

Below is a self-contained repro for my problem. The unbounded %MEM growth in top apparently happens where I try to use d_n_qs_int/4 'backwards' to convert the integer 4th argument to the list-of-quotients 3rd arg. (I would really like not to have to implement separate 'forward' and 'backward' predicates! 🤡 )

I'm very glad to pare this repro down further, if the problem is not immediately apparent.

% Repro for unbounded mem growth on large inputs
:- use_module(library(lists)).
:- use_module(library(clpz)).
:- use_module(library(lambda)).
:- use_module(library(error)).
:- use_module(library(debug)).

clpz:monotonic.

q_r(T/N, T:U) :- 0 #=< #T, 0 #=< #U, #N #= T + U.

qs_Ts_Us(Qs, ΣTs, ΣUs) :-
    maplist(\Q^T^U^(q_r(Q, T:U)), Qs, Ts, Us),
    intlist_partsums(Ts, ΣTs),
    intlist_partsums(Us, ΣUs).

intlist_partsums([X|Xs], [X|Ss]) :-
    intlist_partsums_acc(Xs, Ss, X).

intlist_partsums_acc([], [], _).
intlist_partsums_acc([X|Xs], [S|Ss], A) :-
    #S #= #X + #A,
    intlist_partsums_acc(Xs, Ss, S).

qs_d_nmax(Qs, D, Nmax) :-
    length(Qs, D),
    maplist(\Q^T^N^(Q = T/N), Qs, Ts, Ns),
    Ns ins 0..Nmax, label(Ns),
    maplist(\T^N^(T in 0..N), Ts, Ns), label(Ts).

% d_n_qs_int(+D, +N, ?Qs, ?K)
d_n_qs_int(D, N, Qs, K) :-
    #B #= #D * #N + 1, % K is a base-(DN+1) number
    #M #= #B ^ #D, % M-1 is maximum D-digit, base-B number
    length(Qs, D),
    qs_Ts_Us(Qs, Ts, Us),
    base_digits_int(B, Ts, TK),
    base_digits_int(B, Us, UK),
    %   Top D digits +  Low D digits
    #K #= (#M * #TK) + (#M - 1 - #UK). % a (2*D)-digit number

horner(X, A, P0, P) :- #P #= #A + #X * #P0. % https://en.wikipedia.org/wiki/Horner%27s_method

base_digits_int(B, Digs, K) :-
    #Bminus1 #= #B-1,
    Digs ins 0..Bminus1,
    foldl(horner(B), Digs, 0, K).

/*
?- D=3, Nmax=6,
   L+\(findall(Q, qs_d_nmax(Q, D, Nmax), Qs),
   maplist(d_n_qs_int(D,Nmax), Qs, Ks),
   sort(Ks, SKs),
   same_length(SQs, Qs),
   maplist(same_length, SQs, Qs),
   maplist(d_n_qs_int(D,Nmax), SQs, SKs),
   length(SKs, L)).
%@    error('$interrupt_thrown',repl/0). %MEM in 'top' grows w/o limit
%@    D = 2, Nmax = 6, L = 784. % D=2 case completes in mere seconds
*/

% For context, the rationale for above is the following predicate,
% which accepts a _partially ordered_ +Qs and returns -SQs sorted
% according to a *complete* order that contains the partial order.
% In my application, this enables certain calculations to be done
% in a single pass through a large list such as the one generated
% by findall/3 above.
qs_sorted(Qs, SQs) :-
    N = 6, % TODO: Generalize
    maplist(d_n_qs_int(D,N), Qs, Ks),
    sort(Ks, SKs),
    same_length(SQs, Qs),
    maplist(same_length, SQs, Qs),
    maplist(d_n_qs_int(D,N), SQs, SKs).

The text was updated successfully, but these errors were encountered:

triska · 2024-10-20T08:15:13Z

One general note, after a brief look at this example: Some computations inherently take a lot of space, this does not necessarily mean that the memory growth is unbounded.

In this concrete example, there seems to be a huge internal "branching" of terms. For instance, if we consider just the following fragment:

?- D=3, Nmax=6,
   L+\(findall(Q, qs_d_nmax(Q, D, Nmax), Qs),
       length(Qs, L)).
   D = 3, Nmax = 6, L = 21952.

The step from D = 2 to D = 3 causes a huge increase in this length, from 784 to more than 20_000 elements.

This list with more than 20_000 elements is represented in memory, and subsequent goals seem to fan this out further.

I once had a similar situation with a huge Binary Decision Diagram (more than 40 GB) I wanted to represent in memory. For such purposes, you can rent a server with 256 GB of RAM for less than 200 USD per month:

https://www.hetzner.com/de/dedicated-rootserver/matrix-ex/

dcnorris · 2024-10-20T09:55:30Z

My little laptop has 64GB ram, I think. I know I will soon want to refine the generation of test cases to something less profligate, and no doubt I will take advantage of CLP(ℤ) for this! But my sense is that it's the "fanning-out" of subsequent goals you reference that really causes the problem. (I conjecture that a more 'eager' clpz might just eat the goals as they are posted, but don't know how to express that with any sophistication. I'm going to try rewriting the encoding predicate in the hope of promoting such eagerness.)

%?- D=2, Nmax=6, time(L+\(findall(Q, qs_d_nmax(Q, D, Nmax), Qs), length(Qs, L))).
%@    % CPU time: 0.068s, 249_976 inferences
%@    D = 2, Nmax = 6, L = 784.

%?- D=3, Nmax=6, time(L+\(findall(Q, qs_d_nmax(Q, D, Nmax), Qs), length(Qs, L))).
%@    % CPU time: 1.519s, 6_660_470 inferences
%@    D = 3, Nmax = 6, L = 21952.

%?- D=4, Nmax=6, time(L+\(findall(Q, qs_d_nmax(Q, D, Nmax), Qs), length(Qs, L))).
%@    % CPU time: 41.612s, 182_781_624 inferences
%@    D = 4, Nmax = 6, L = 614656.

triska · 2024-10-20T17:05:38Z

Also, Scryer Prolog does not yet have a garbage collector and therefore currently reclaims memory only on backtracking.

You can use this to reclaim unneeded memory, by wrapping a computation in findall/3 (which uses a failure-driven loop internally to find all solutions), especially if you need only a small number of terms as the result of the computation. For instance, with findall(N, computation(..., N, ...), [N]), you would get only the integer N (as presumed result of a computation), while the memory that was allocated during that computation is efficiently reclaimed. That's not elegant, but may work as a last resort and should of course ideally be kept as localized as possible.

Another way out could be store what you need more compactly in memory. For instance, you can use the fundamental theorem of arithmetic to store a sequence of positive integers [A,B,C,...] as a single integer by regarding the numbers as powers of different primes and computing a single composite number where the sequence can be read off from the prime decomposition when you need it, i.e., 2^A×3^B×5^C×···. If you can represent a sequence of, say 6 integers, as a single integer that fits in a 64-bit WAM cell, then this will reduce the memory overhead by a factor of 18.

Another approach may be to introduce program-specific "abbreviations" of sequences or other data structures that occur frequently in your program, such as compressed(1), compressed(2), compressed(3) as variants of (say) verbatim([1,2,3]), verbatim(4,3,0) etc.

Yet another way could be to make use of Scryer Prolog's compact internal string representation, and store a sequence of integers as a list of characters with these code points. For instance, we would expect the following to create such a compact string from a list of codes, and also automatically reclaim data that was created during the conversion:

:- use_module(library(lists)).
:- use_module(library(iso_ext)).

codes_to_chars(Codes, Chars) :-
        findall(Cs, (maplist(char_code, Cs0, Codes),
                     partial_string(Cs0, Cs, [])),
                [Chars]).

Example:

?- codes_to_chars([1,2,3,4], Chars).
   Chars = "\x1\\x2\\x3\\x4\".

Chars can be represented extremely compactly in memory, especially with the rebis-dev development branch, which is in fact also needed to reclaim such strings on backtracking.

triska · 2024-10-22T05:58:36Z

One additional point: The rebis-dev branch stores short atoms in a single WAM heap cell (64-bit) without affecting the atom table (and which therefore also yields efficient memory reclamation on backtracking!), so you can store a single atom that takes up to 6 bytes (small integers) very compactly:

?- atom_codes(A, [1,2,3,4,5,6]).
   A = '\x1\\x2\\x3\\x4\\x5\\x6\'.

Also implements what *should* (algorithmically) have been [I think] a more efficient computation d_gs/2 of all D+1 gₓ, but without demonstrated speedup and still very high peak memory usage.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unbounded %MEM growth #2628

Unbounded %MEM growth #2628

dcnorris commented Oct 19, 2024 •

edited

Loading

triska commented Oct 20, 2024

dcnorris commented Oct 20, 2024 •

edited

Loading

triska commented Oct 20, 2024

triska commented Oct 22, 2024

Unbounded %MEM growth #2628

Unbounded %MEM growth #2628

Comments

dcnorris commented Oct 19, 2024 • edited Loading

triska commented Oct 20, 2024

dcnorris commented Oct 20, 2024 • edited Loading

triska commented Oct 20, 2024

triska commented Oct 22, 2024

dcnorris commented Oct 19, 2024 •

edited

Loading

dcnorris commented Oct 20, 2024 •

edited

Loading