I should give a more realistic example in the documentation showing that `boot_specieslevel` and `boot_networklevel` expect a list of one or more data frames of interactions. Each interaction (row in the data frame) must be repeated as many times as it was observed. E.g. if the interaction species_1 x species_2 was observed 5 times, then that row must appear 5 times in the data frame.
One misleading workflow in data preparation right now is to build a web matrix with the `table` function from the raw data, which most likely contains just an enumeration of interactions. The `table` function counts rows, so it does not take the abundance of the interactions into account unless the rows are repeated within the raw data. So there is a high risk of losing data. Then, once that web is built with `table`, the user goes on to use `web_matrix_to_df`, which kind of completes a vicious data processing circle.
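To make the risk concrete, here is a minimal sketch in base R. The data frame `raw` and its column names (`lower`, `higher`, `n_obs`) are hypothetical and only stand in for a typical spreadsheet export with one row per distinct interaction plus an abundance column.

```r
# Hypothetical raw data: one row per distinct interaction,
# with the number of observations stored in a separate column.
raw <- data.frame(
  lower  = c("plant_1", "plant_1", "plant_2"),
  higher = c("insect_1", "insect_2", "insect_1"),
  n_obs  = c(5, 2, 1),
  stringsAsFactors = FALSE
)

# table() counts rows, so every observed interaction is counted once
# and the n_obs column is silently ignored.
web <- table(raw$lower, raw$higher)

total_in_web <- sum(web)        # number of rows in raw
total_in_raw <- sum(raw$n_obs)  # number of interactions actually observed
```

Here `total_in_web` is 3 while `total_in_raw` is 8, so five observations would be lost before the web ever reaches `web_matrix_to_df`.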
So the user tends to build the web matrix from a data frame and then transform the matrix back into an expanded data frame. This is an unnecessary detour, and I guess it was inspired by how I constructed the example with Safariland from `bipartite`. A more realistic case is to take the raw data with the interactions and enumerate/explode the rows based on some abundance column; that data can then be used directly with `boot_specieslevel` or `boot_networklevel`. There is no need to use `table`, especially since it opens a misleading path towards data loss.
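A rough sketch of that "explode" step in base R could look like the following. The column names and the list element name `web_1` are assumptions made for illustration; the column names that `boot_specieslevel` and `boot_networklevel` actually require should be checked against the package documentation.

```r
# Hypothetical raw data with an abundance column (names are illustrative).
raw <- data.frame(
  lower  = c("plant_1", "plant_1", "plant_2"),
  higher = c("insect_1", "insect_2", "insect_1"),
  n_obs  = c(5, 2, 1),
  stringsAsFactors = FALSE
)

# Repeat each row index n_obs times, then keep only the species columns.
row_idx  <- rep(seq_len(nrow(raw)), times = raw$n_obs)
expanded <- raw[row_idx, c("lower", "higher")]
rownames(expanded) <- NULL

# One data frame per web, collected in a list, which is the input shape
# described above for the bootstrap functions (the functions themselves
# are not called here, since their arguments are not shown in this issue).
webs <- list(web_1 = expanded)
```

After this step `expanded` has one row per observed interaction (8 rows for the data above), so no `table` call or `web_matrix_to_df` round trip is involved.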
So, try to make a more realistic, simple usage example of `boot_specieslevel` and `boot_networklevel` that does not need `web_matrix_to_df`, which seems to be useful only in rare cases. Users tend to have their raw data as a data frame from Excel rather than as a web matrix/community matrix.