Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simulation output utils #87

Merged
merged 4 commits into from
Apr 19, 2023
Merged

Simulation output utils #87

merged 4 commits into from
Apr 19, 2023

Conversation

alaindanet
Copy link
Contributor

This pull request contains refactorisation of output analysis from simulate().

It contains several outputs:

  • Diversity: species_richness(), species_persistence(), living_species(),
    get_extinct_species(), shannon(), simpson(), evenness()
  • Biomass: biomass() returns total biomass but also of each species.
  • Producer growth rate: producer_growth() returns, mean, sd, and all the
    timeseries of producer production
  • Trophic structure: trophic_structure() returns max, mean, biomass weighted
    mean trophic level (keeping only the living species), the list of living species and the living network.

The functions takes a solution object as primary argument but you add arguments such as last
(over the last timestep), threshold (to consider that biomass is 0) for many of them. Now there is also
idxs argument in some functions to pick the species you want to compute the
metric of interest (exemple I want the species persistence of producer only, or of
non-producer only).

@alaindanet alaindanet requested review from iago-lito and ismael-lajaaiti and removed request for iago-lito and ismael-lajaaiti January 12, 2023 10:56
@iago-lito iago-lito linked an issue Jan 13, 2023 that may be closed by this pull request
@iago-lito iago-lito requested review from iago-lito and removed request for ismael-lajaaiti January 13, 2023 09:43
@iago-lito iago-lito removed a link to an issue Jan 13, 2023
@iago-lito iago-lito linked an issue Jan 13, 2023 that may be closed by this pull request
@iago-lito iago-lito requested review from ismael-lajaaiti and iago-lito and removed request for iago-lito and ismael-lajaaiti January 13, 2023 12:39
@iago-lito
Copy link
Collaborator

Hi @alaindanet, and 💔 :'(

The tests in test/measures/test-stability.jl are not run here because they are not yet listed in test/runtests.jl. I have tried to add them and they.. fail :( Sorry for that. Can you fix and then come again?

To avoid this in the future, make sure, at least once per test file, that they are actually run by intentionally breaking them and then watching them fail. If they don't fail when you can tell that they are broken, then maybe it's because they are not actually run.

@alaindanet
Copy link
Contributor Author

It should good now!!

Copy link
Collaborator

@iago-lito iago-lito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @alaindanet and good job for this PR. FWIU you are essentially extending the set of possible post-processing/analysis of the trajectories resulting from simulate(), and you are taking this opportunity to refresh functionning.jl, which was a rather old file (and thank you for that <3). There are a lot of tiny comments in my review here, and I haven't written all of them, but here is the general essence of this review :)

  • I trust you regarding the analyses, you know what a CV is better than me ;)

  • I think your overall design is good. Most analyses are based on the result of filter_sim, extracting only the last few timestep in the solution. I also like how you are returning explicit named tuples from your functions, this is very comfortable for the callers :) Be careful though:

    • If filter_sim is never going to have another use than extracting the last timesteps, maybe it's worth renaming it to something more explicit like last_steps_of(solution; species, n) or tail(solution; n, species) or trajectory_end(solution; species, n) or something?
    • filter_sim currently returns a read/write alias to the underlying solution tables, and is exposed to user. This makes it possible to (inadvertently) modify the solution by modifying the output of filter_sim. If this is undesired, consider returning a copy instead.
    • BONUS: would it be useful that species be alternately requested by name instead of index in filter_sim and all other methods using idxs?
  • I think I understand that functionning.jl and stability.jl were old files not exactly synchronized with the rest of the project in terms of coding style, and you have started fixing that. Unfortunately we are not quite there yet ^ ^" Here are a few common style nitpicks I have spotted throughout the PR:

    • unnecessary type constrainst in methods parameters (like ::Float64 or ::Int64), you can remove them all unless they are truly needed for method disambiguation.
    • unnecessary return keywords at the end of blocks.
    • uninformative docstrings for alternate methods of functions. You can merge them with the "main" docstring to make things clearer.
    • a few old docstrings still violate line length limits (92)
    • missing references in docstrings for literature-based calculations, or missing links/dois.
    • Maybe all doctests can be made more compact/readable the following way: instead of
julia> line 1;

julia> line 2;

julia> line 3
expected_output

Maybe we should turn all these to:

julia> line 1;
       line 2;
       line 3
expected_output

(note that this is not only for this PR but valid throughout the project ^ ^")

Tell me what you think, and when/whether you have time to address all this :)

src/measures/functioning.jl Outdated Show resolved Hide resolved
src/measures/utils.jl Outdated Show resolved Hide resolved
src/measures/utils.jl Outdated Show resolved Hide resolved
src/measures/utils.jl Outdated Show resolved Hide resolved
src/measures/utils.jl Outdated Show resolved Hide resolved
src/measures/functioning.jl Show resolved Hide resolved
src/measures/stability.jl Outdated Show resolved Hide resolved
src/measures/stability.jl Outdated Show resolved Hide resolved
src/measures/stability.jl Show resolved Hide resolved
test/measures/test-stability.jl Outdated Show resolved Hide resolved
@alaindanet
Copy link
Contributor Author

Thank you for the feedbacks, I think that I covered all your comments!

Specifically for the doctests, I agree with your suggestion to gather piece of code that are tested, e.g. FoodWeb(); ModelParameters(). I gave a try on that!

@iago-lito
Copy link
Collaborator

Great, thank you :) Maybe I can review it on Friday 🤞

@alaindanet
Copy link
Contributor Author

This pull request should also solve #97

@alaindanet alaindanet linked an issue Jan 19, 2023 that may be closed by this pull request
Copy link
Collaborator

@iago-lito iago-lito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @alaindanet, and thank you for your modifications. Your PR is becoming nicer every time I read it :) I also appreciate how you've taken this opportunity to address #97 <3

I'm sorry to spot new things in my comments now that I could maybe have seen last time, and I hope you'll forgive me to suggest yet another round. Here is the general trend of my review:

  • Strongest first: resetting the last option without user consent is a rather bad thing to do IMO, especially when it's possible to just silent the associated warning with quiet = true. I suggest you better issue a hard error when you think it's impossible to extract the number of timesteps that the user is asking for.

  • I am not sure whether you are defending that filter_sim is a good name for this function, or if you have reasons to think that my other suggestions here are inadequate?

  • Regardless of its name, filter_sim seems central to this PR since almost every other function depends on it. One problem then is that options to filter_sim are repeated within every other function. You can make the design DRY-er here by making use of kwargs... in functions that would then just forward the options to filter_sim as-is (explanations in the comments).

  • I think your idea of specifying the last timesteps in percentage rather than an absolute number of steps is rather good. And maybe the user could benefit from it. last = "10%" could become the default option, even though last = 10 would still be understood.

  • When type-based method call disambiguation relies on ::Vector or ::Matrix constraint, then it's relying on exactly these two (concrete) types, so the dispatching will not happen as you expect if the caller uses a BitVector value or a SparseMatrix value. Indeed:

julia> BitVector <: Vector
false

julia> SparseMatrix <: Matrix
false

To overcome this, remember to keep type constraints as flexible as possible in Julia's methods arguments. Here for instance with ::AbstractVector and ::AbstractMatrix. If you did not know about them, you can now query Julia with e.g. subtypes(AbstractVector) and supertypes(Vector) ;)

  • The overall files coding style has been improved a lot since I last reviewed this, good job :) You can now take this one step further by making more frequent uses of markdown's * and ` markers to better format the docstrings. Think about how happy you feel when you are reading nice docs online, I'm sure you would be happy to produce the same kind of good-quality docs ;)

I think this PR is getting riper overall, and I guess maybe it'll be ready to land next time and close #79, #87, #97 ^ ^ Keep up with the good pace!

src/measures/functioning.jl Outdated Show resolved Hide resolved
src/measures/functioning.jl Show resolved Hide resolved
src/measures/functioning.jl Outdated Show resolved Hide resolved
src/measures/functioning.jl Outdated Show resolved Hide resolved
src/measures/functioning.jl Outdated Show resolved Hide resolved
src/measures/utils.jl Outdated Show resolved Hide resolved
src/measures/utils.jl Outdated Show resolved Hide resolved
src/measures/utils.jl Outdated Show resolved Hide resolved
src/measures/utils.jl Outdated Show resolved Hide resolved
test/measures/test-utils.jl Show resolved Hide resolved
@alaindanet
Copy link
Contributor Author

Thank you so much @iago-lito for the review and thanks guys for the suggestions regarding naming and featuring of extract_last_timesteps(). I fixed the raised issues and the new features are tested and documented!

Copy link
Collaborator

@iago-lito iago-lito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @alaindanet and thank you for improving your PR again :)

It may not be immediately clear from the number of comments in this new review, but I swear this is becoming better every time ^ ^" In particular, I see no incorrect design accross functions anymore (code style, kwargs..) and all the suggested fixes are now scoped within their own functions now, which I guess is the sign that your code is becoming stable.

  • There is confusion which functions are exported and which are not, leading to confusion which functions need to be prefixed with BEFWM2. and which do not. I suspect that you were initially intending to export all of them, so that BEFWM2. would be not needed anywhere.. but maybe I'm wrong?

  • You have removed too many return keywords and some of them were actually necessary ^ ^" I have marked them. This is a fortunate accident though, because, since you have not noticed the problem, it's revealing that the associated test cases were missing from the doctests or the test/ folder. Better adding them now :P

  • I think there is much confusion regarding type-checking in extract_last_timesteps. This function is complicated indeed because it accepts various possible types of arguments, which requires careful processing. On the other hand, by forwarding arguments to it via the kwargs variables, you have successfully narrowed this complexity down to this function only, so there should be no problem left once this one is fixed. I have alternate designs in my comments below.

  • A few items from previous reviews have not yet been addressed, especially these type constraints which were too hard. Don't hesitate to skim through this entire github thread and use it as a giant TODO list. I think you can even tick the boxes in my own messages :P

Alé, good luck with your last round (maybe? 🤞). Looking forward to merging this soon.

PS: Your branch is late wrt dev and I was expecting many conflicts because of @Thomalpas's pass through the docs. Well it turns out that rebasing it was not that difficult in the end. So I have rebased and squashed your commits if you fancy updating it :) The result can be found at simulation_output_utils-squashed-rebased, and also usecase_competition-rebased if you need.

test/measures/test-functioning.jl Outdated Show resolved Hide resolved
test/measures/test-stability.jl Outdated Show resolved Hide resolved
src/measures/utils.jl Outdated Show resolved Hide resolved
src/measures/utils.jl Outdated Show resolved Hide resolved
src/measures/utils.jl Outdated Show resolved Hide resolved
src/measures/functioning.jl Outdated Show resolved Hide resolved
src/measures/functioning.jl Outdated Show resolved Hide resolved
src/measures/functioning.jl Outdated Show resolved Hide resolved
src/measures/utils.jl Outdated Show resolved Hide resolved
src/measures/functioning.jl Outdated Show resolved Hide resolved
Copy link
Collaborator

@iago-lito iago-lito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @alaindanet and thank you again :)

You have finally agreed to drop the autofix_last argument, and the code in utils.jl has become much clearer! I need to congratulate you because I suppose it hasn't been easy to drop this feature you liked, but I hope you also find like me that the extract_last_timesteps function has become much neater :)

Most my comments here are minor lexical/wording nitpicks that you may choose to address or not.

Only three are non-minor, but they should not require deep refactorings either:

  • autofix_last is no more, but it's still documented as a possible option. You need to update your docstring :P

  • NEW: I am now confident that Julia deprecates abbreviations in naming. As a consequence, maybe consider again renaming idxs to indexes, and named tuples fields like avg, sp, com to average, species, <??>. I think it's okay to keep cv because it's rather an "acronym" than an "abbreviation", and coefficient_of_variation would be not concise enough in a named tuple. Also, std is, well, standard enough to be kept as-is ;)

  • NEW: I worry that it's unclear why you are using the corrected = false variants of std() and cov(). Can you double-check with @skefi / @andbeck / @evadelmas that this is the right way to calculate these statistics? And that it would not incur biases in downstream analyses performed by our users?

See you next round, but I swear it really seems like there should be not many more <3

PS: I have rebased your branch to 157938b to integrate the latest fix on dev. This required a force-push. I know you know how to handle this, but don't hesitate to ping me if you need help :)

test/measures/test-functioning.jl Show resolved Hide resolved
test/measures/test-functioning.jl Show resolved Hide resolved
test/measures/test-functioning.jl Show resolved Hide resolved
test/measures/test-functioning.jl Show resolved Hide resolved
test/measures/test-stability.jl Show resolved Hide resolved
src/measures/functioning.jl Outdated Show resolved Hide resolved
src/measures/stability.jl Outdated Show resolved Hide resolved
src/measures/stability.jl Outdated Show resolved Hide resolved
src/measures/stability.jl Outdated Show resolved Hide resolved
src/measures/stability.jl Outdated Show resolved Hide resolved
@alaindanet
Copy link
Contributor Author

I have completed the review finally, thank you so much for your feedback!

Some introduced changes:

  • Set last = 1 everywhere expect for temporal stability related metrics (last = "10%")
  • Set corrected = true by default in temporal stability related metrics
  • Introduction of max_trophic_level(), mean_trophic_level() for
    weighted_average_trophic_level() specific network metrics (tested).
    (Solve Make subfunctions for trophic structure analysis #111 ?)
  • Set threshold = 0 in all the functions to identify extinct species at each
    timestep

I have some concerns about my code, I would be grateful to receive your feedback on it:

Argument type

I have to specify an argument type to avoid wrong dispatch of
max_trophic_level(), synchrony() and species_cv(). It required
solution::SciMLBase.ODESolution to avoid the confusion with other
AbstractMatrix (sparse matrix, transposed matrix). I might have done something
wrong in my arguments 😅.

function max_trophic_level(solution::SciMLBase.ODESolution; threshold = 0, kwargs...)
   ...
end
max_trophic_level(A::Matrix;) = max_trophic_level(trophic_levels(A))

Filter extinct species

For now, the idenfication of extinct species relies on the fact that there are
defined by a biomass equal to 0, is it okay for now and we switch to more
robust programming when we develop util functions from the extinction
dictionnary?

Tuple names

Some naming are a bit too much again, particularly in trophic_structure(), I
would be happy to have your opinion on this one 😄

(:max,  :average, :weighted_average, :alive_species, :alive_trophic_level , :alive_A)

@iago-lito iago-lito linked an issue Mar 10, 2023 that may be closed by this pull request
src/measures/utils.jl Outdated Show resolved Hide resolved
src/measures/utils.jl Show resolved Hide resolved
src/EcologicalNetworksDynamics.jl Show resolved Hide resolved
src/measures/stability.jl Outdated Show resolved Hide resolved
@iago-lito iago-lito mentioned this pull request Mar 14, 2023
3 tasks
src/measures/functioning.jl Outdated Show resolved Hide resolved
src/measures/functioning.jl Outdated Show resolved Hide resolved
src/measures/stability.jl Outdated Show resolved Hide resolved
test/measures/test-stability.jl Show resolved Hide resolved
test/measures/test-utils.jl Outdated Show resolved Hide resolved
"""
function avg_cv_sp(mat)
function species_cv(solution::SciMLBase.ODESolution; threshold = 0, last = "10%", kwargs...)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at supertypes(SciMLBase.ODESolution) and supertypes(SciMLBase.RODESolution), it looks like the types SciMLBase.AbstractODESolution or SciMLBase.AbstractTimeseriesSolution would be more stable on the long run

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solved by @iago-lito 😄
f413f66

@iago-lito iago-lito force-pushed the simulation_output_utils branch from 3eec5e7 to d5b316c Compare April 18, 2023 13:28
alaindanet and others added 4 commits April 19, 2023 11:11
Add various functions
to post-process the solutions trajectories returned by `simulate`.
Take this opportunity to refresh a few old files under `src/measures/`.
- Vertical whitespace.
- Multiline strings.
- `; arg = arg` elisions.
- ..
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make subfunctions for trophic structure analysis Last = 1000 too high Refactor functioning.jl.
5 participants