Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve control in generating foodwebs from structural models. #117

Merged
merged 1 commit into from
Feb 23, 2023

Conversation

ismael-lajaaiti
Copy link
Collaborator

@ismael-lajaaiti ismael-lajaaiti commented Feb 22, 2023

Address #115 and #116.

Changes

  • when creating a FoodWeb a structural model the user has no access to check_cycle and check_disconnected to choose to discard or not network that respectively contains cycle(s) or disconnected species. By default, check_disconnected = true and check_cycle = false.
  • the maximum number of try before throwing an error for the food web generation with a structural model is now exposed to the user with the kwarg iter_safe
  • the error message have been corrected to include the right kwarg name (see Loosen default tolerance ΔC for large richnesses. #115)
  • change the default tolerance argument when generating food webs with a structural model to 10% (either of the number of links or the connectance, depending on the argument given by the user) and update in the docstring.

@iago-lito
Copy link
Collaborator

Thank you! I'm looking into this rn. But I this is not addressing #114, right? I think you meant #116 ^ ^"

@iago-lito iago-lito changed the title Handle cycles. Improve control in generating foodwebs from structural models. Feb 22, 2023
@ismael-lajaaiti
Copy link
Collaborator Author

Thank you! I'm looking into this rn. But I this is not addressing #114, right? I think you meant #116 ^ ^"

Yes thank you, I just edited the PR message. Sorry for the confusion.

Copy link
Collaborator

@iago-lito iago-lito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this quickfix @ismael-lajaaiti :) I'll let you handle these three concerns before merging this in:

  • English: maybe if you're still around @Thomalpas you can be the second pair of eyes regarding docstring modifications here 0:)

  • I worry that the function warning_disconnected has a messy purpose. Checking is one thing, and warning is another thing. I'm sure you'd also prefer that it be split into is_disconnected and disconnection_warning or something, to ease reuse.

  • I worry that tol is unfortunate naming. And it's an unfortunate pattern to have it either carry a ΔC value or a ΔL value, which are not even the same type. Unless you have a good reason to keep it, I'd suggest you drop it in favour of separate ΔC and ΔL arguments.

This is it. Sorry if my comments are not entirely up to date wrt this concluding message. I write them on the fly and then I summarize all this into that ^ ^"

@@ -111,14 +111,18 @@ true

Moreover, while generating the `FoodWeb` we check that it does not have cycles
or disconnected species.
But this behavior can be changed by setting respectively the keyword arguments
`check_cycle` and `check_disconnected` to `false`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the english here @Thomalpas?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover, while generating the FoodWeb, by default we check that it does not contain cycles or disconnected species.
These behaviours can respectively be changed by setting the keyword arguments check_cycle and check_disconnected to false.

@@ -131,6 +135,22 @@ julia> n_links(foodweb) == 15
true
```

Note that if you provide a number of `L`inks the `tol`erance is interpreted as
as an absolute tolerance on the number of links,
i.e. `L - tol <= n_links(foodweb) L + tol`, where `foodweb` is the output of `FoodWeb`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing <= sign here maybe?

@@ -131,6 +135,22 @@ julia> n_links(foodweb) == 15
true
```

Note that if you provide a number of `L`inks the `tol`erance is interpreted as
as an absolute tolerance on the number of links,
i.e. `L - tol <= n_links(foodweb) L + tol`, where `foodweb` is the output of `FoodWeb`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"where foodweb is the output of FoodWeb".. nah you can drop that, I'm sure they'll get it ;)

i.e. `L - tol <= n_links(foodweb) L + tol`, where `foodweb` is the output of `FoodWeb`.
Similarly, if you provide a `C`onnectance the `tol`erance is interpreted as
as an absolute connectance on the connectance,
i.e. `C - tol <= connectance(foodweb) <= C + tol`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I sort of understand why you need to keep the tol name here because it either means ΔC or ΔL. But I keep thinking that having explicit ΔC and ΔL arguments can be helpful. Maybe you can feature both?

function f(...; tol=nothing, ΔC=nothing, ΔL=nothing)
    # ...
    s = sum(isnothing.((tol, ΔC, ΔL)))
    if s == 0
        # setup the default values
    elseif s > 1
        throw(ArgumentError("Only one argument among (tol, ΔC, ΔL) must be set."))
    else
        if !isnothing(tol)
            # process tol
        elseif !isnothing(ΔC)
            # process ΔC
        elseif !isnothing(ΔL)
            # process ΔL
        end
    end
end

or something?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UPDATE: maybe just drop tol instead.

@@ -198,25 +218,48 @@ function FoodWeb(
L = nothing,
p_forbidden = nothing,
tol = nothing,
check_cycle = false,
check_disconnected = true,
iter_safe = 1e5,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scientific notation yields a floating point number. Maybe try 10^5 to get an integer instead.

@@ -198,25 +218,48 @@ function FoodWeb(
L = nothing,
p_forbidden = nothing,
tol = nothing,
check_cycle = false,
check_disconnected = true,
iter_safe = 1e5,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the naming here. What about iter_max ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(unless you mean something specific by "safe"?)

end
(A, species, M, metabolic_class, method) = structural_foodweb_data(uni_net, M, Z, model)
quiet || warning_disconnected(A)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a pretty unusual pattern I must say. What about:

quiet || is_disconnected(A) || @warn "..."

warning_disconnected(A) does two different things: check + warn. And this makes it hard to reuse elsewhere. A simpler is_disconnected function would be more useful to the rest of the package.

Maybe you ended up with this because you didn't want to duplicate your error message. That's fair. But then you already know how to nicely factorize error messages if I trust src/macros.jl ;)

Maybe macros are overkill for the job here though, a simpler approach like:

const cycle_warning = "<blah>"
function FoodWeb(...)
    # ...
    quiet || is_disconnected(A) || @warn cycle_warning
end

may do the job.

Or if you need the message to be dynamically generated:

cycle_warning(a, b) = @warn "<blah> $a <blah> $b"
function FoodWeb(...)
    # ...
    quiet || is_disconnected(A) || cycle_warning(a, b)
end

C <= 1 || throw(ArgumentError("Connectance `C` should be smaller than 1."))
C >= (S - 1) / S^2 || throw(ArgumentError("Connectance `C` should be \
greater than (S-1)/S^2 ($((S-1)/S^2) for S=$S) \
if C < (S - 1) / S^2 && check_disconnected
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer

if check_disconnected && C < (S - 1) / S^2

That way, if check_disconnected is false, then julia won't even bother doing the calculation on the right operand. This is what "short-circuiting" was initially designed for ;)

iter += 1
end
iter <= iter_safe ||
throw(ErrorException("Could not generate adequate network with C=$C \
and ΔC=$ΔC before the maximum number of iterations ($iter_safe) was reached. \
Is the constraint impossible to solve?"))
and tol=$ΔC before the maximum number of iterations ($iter_safe) was reached. \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arh. With my suggestion above to feature both tol and ΔC/ΔL arguments, this error message would become difficult to get right. But I keep thinking that ΔC/ΔL is better naming than tol.

Would it be an option then to drop tol and to replace it with ΔC/ΔL completely? Is there another reason to have tol being either one or the other?

and ΔC=$ΔC before the maximum number of iterations ($iter_safe) was reached. \
Is the constraint impossible to solve?"))
throw(ErrorException("Could not generate adequate network with L=$L \
and tol=$ΔL before the maximum number of iterations ($iter_safe) was reached. \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, and this message is ΔL-oriented while the other is only ΔC.. how come these two different values ended up in the same variable tol in the first place? Maybe this is the source of troubles here.

@@ -111,14 +111,18 @@ true

Moreover, while generating the `FoodWeb` we check that it does not have cycles
or disconnected species.
But this behavior can be changed by setting respectively the keyword arguments
`check_cycle` and `check_disconnected` to `false`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover, while generating the FoodWeb, by default we check that it does not contain cycles or disconnected species.
These behaviours can respectively be changed by setting the keyword arguments check_cycle and check_disconnected to false.

By default, we set a tolerance of 1 link between the number of links asked by the user
and the number of links of the returned `FoodWeb`.
By default, we set a tolerance corresponding to 10% of the number of links
given in argument.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given in argument L ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"C or L"?

@@ -131,6 +135,22 @@ julia> n_links(foodweb) == 15
true
```

Note that if you provide a number of `L`inks the `tol`erance is interpreted as
as an absolute tolerance on the number of links,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

double 'as' ?

as an absolute connectance on the connectance,
i.e. `C - tol <= connectance(foodweb) <= C + tol`.

`FoodWeb`s are generated randomly and can be rejected if do not verify
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if they do not verify

note: I'm not sure about the use of the word verify, I would prefer satisfy?

`FoodWeb`s are generated randomly and can be rejected if do not verify
one of the properties defined above (number of links not in the tolerance range,
cycles, etc.).
Thus we repeat the generation of the `FoodWeb` until of all criterion are verified
Copy link
Collaborator

@Thomalpas Thomalpas Feb 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

satisfied I would prefer here too I think from an english perspective (but if there is a computational-based definition of verify then I am happy to be corrected)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think people usually pick "satisfy" in constraints-solving contexts.

Thus we repeat the generation of the `FoodWeb` until of all criterion are verified
or until the number of maximum iteraction `iter_safe` is reached.
If the maximum number of iterations is reached an error is thrown.
That number can be controled by the `iter_safe` keyword argument,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

controlled has 2 l's

@ismael-lajaaiti
Copy link
Collaborator Author

Thank you @iago-lito and @Thomalpas for you reviews! I've pushed a commit integrating your suggestions. Let me know what you think about it.

Copy link
Collaborator

@iago-lito iago-lito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great :) Thank you for the fix @ismael-lajaaiti. All my comments below are minor and easy-fix. So I'll go ahead and fix them myself before I squash/merge this in. Good job!

@@ -6,6 +6,9 @@ Generating FoodWeb objects
const AdjacencyMatrix = SparseMatrixCSC{Bool,Int64}
const Label = Union{String,Symbol}

const disconnected_warning = "'A' contains disconnected species \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good. This is a bit far from where it's actually used, and I would reserve this file header for less anectotical definitions like abstract type EcologicalNetwork (BAM!) and mutable struct Foodweb (biim!). For example, this could live just below the FoodWeb constructor methods?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(maybe you thought it needs to be defined prior the function definition where it's used.. but it doesn't ;)

By default, we set a tolerance of 1 link between the number of links asked by the user
and the number of links of the returned `FoodWeb`.
By default, we set a tolerance corresponding to 10% of the number of links
given in argument (`L`), or 10% of the connectance if you give a connectane (`C`).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"connectance"

By default, we set a tolerance corresponding to 10% of the number of links
given in argument (`L`), or 10% of the connectance if you give a connectane (`C`).
For instance, if you create a `FoodWeb` with 20 links,
by default we ensure that `FoodWeb` in output will contain between 18 and 22 links.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Crystal-clear <3

`FoodWeb`s are generated randomly and can be rejected if they do not satisfy
one of the properties defined above (number of links not in the tolerance range,
cycles, etc.).
Thus we repeat the generation of the `FoodWeb` until of all criterion are satisfied
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"until all criteria"

@@ -172,7 +193,7 @@ function FoodWeb(
@check_size_is_richness² A S
metabolic_class = clean_metabolic_class(metabolic_class, A)
species = clean_labels(species, S)
quiet || warning_cyclic_disconnected(A)
quiet || is_connected(SimpleDiGraph(A)) || @warn disconnected_warning
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised you did not just add a new method for that ;)

is_connected(A::AbstractMatrix) = is_connected(SimpleDigraph(A))

Copy link
Collaborator

@iago-lito iago-lito Feb 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, scratch that: you didn't author is_connected, so that would be piracy! Good thing you didn't then ^ ^"

@@ -197,26 +218,54 @@ function FoodWeb(
C = nothing,
L = nothing,
p_forbidden = nothing,
tol = nothing,
tol_C = isnothing(C) ? nothing : 0.1 * C,
tol_L = isnothing(L) ? nothing : round(Integer, 0.1 * L),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I like the idea :D

isnothing(tol_C) || throw(ArgumentError("You gave a number of links (`L = $L`) and \
a tolerance on the connectance (`tol_C = $tol_C`). \
Either provide a connectance (`C`) or \
use `tol_L` to control the tolerance on the number of links."))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat :)

and ΔC=$ΔC before the maximum number of iterations ($iter_safe) was reached. \
Is the constraint impossible to solve?"))
and tol_C=$ΔC before the maximum number of iterations ($iter_max) was reached. \
Consider either increasing the `tol`erance on the connectance \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the tolerance" is enough, because tol is no more.

- Loosen default values for `tol_L` and `tol_C`.
- Users can control cycle checking and disconnection.
@iago-lito iago-lito merged commit 758f64e into BecksLab:dev Feb 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants