Errors in Chen_MSB2009 benchmark #175
Comments
Ah, it looks like the benchmark was exported from the Hass (MATLAB) suite, where the same mismatch is present: https://github.com/Benchmarking-Initiative/Benchmark-Models/blob/master/Benchmark-Models/Chen_MSB2009/Data/model1_data4.xlsx
Thanks for raising this issue, and for the thorough feedback! I am currently the only maintainer of this repo -- unfortunately, I haven't worked with this model yet. What I got from this is:
😭
Looking at the Chen_MSB2009 benchmark model, I suspect I may have identified some errors in the measurements table (https://github.com/Benchmarking-Initiative/Benchmark-Models-PEtab/blob/master/Benchmark-Models/Chen_MSB2009/measurementData_Chen_MSB2009.tsv).
The original data is available from the supplement of https://doi.org/10.1038/msb.2008.74 (MSB data) and was reused in https://doi.org/10.1371/journal.pcbi.1005331 (PLoSCB data). The issue with the MSB data is that the standard deviations of the measurements are often 0 (see `_dataset/Chen et al - Experimental Data/A431_experiment.out` in the supplement to https://doi.org/10.1038/msb.2008.74), which makes the data unsuitable for fitting. This is likely why I added 0.1 to the standard deviations in the PLoSCB data (it's been a while ...; see `code/project/data/getData.m`, lines 756-758, in the supplement to https://doi.org/10.1371/journal.pcbi.1005331).
However, I ran into the following discrepancies:
- `ERK_PP` data for the `model1_data3` condition in the benchmark doesn't match the MSB data (Low (1e-11 M) EGF condition) or the PLoSCB data (`D(3)`, lines 687-698). This looks like a copy & paste error in the benchmark data, as the data for `model1_data2` and `model1_data3` are the same. MSB and PLoSCB data match.
- `AKT_PP` data for the `model1_data4` condition in the benchmark does match the MSB data (Low (1e-10 M) HRG condition) but not the PLoSCB data (`D(4)`, lines 704-715). This looks like a copy & paste error in the PLoSCB data, as the data for `model1_data3` and `model1_data4` are the same. This sucks, but shouldn't affect any of the conclusions in the paper.
This of course raises the question of where the benchmark data came from. Since the benchmark data also contains 0.1 values for the standard deviation (as in the PLoSCB data) instead of 0.0 values (as in the MSB data), I believe the measurements file in the benchmark was derived from the PLoSCB data (likely fixing the issue with `model1_data4`, but introducing the issue with `model1_data3` 😢).
I will refrain from making any remarks regarding how much I loathe data that is not available in easily machine-readable formats and data processing pipelines that involve manual steps ...
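For anyone who wants to check for this kind of copy & paste duplication automatically, here is a minimal sketch. It assumes a PEtab-style tab-separated measurement table with the columns `observableId`, `simulationConditionId`, `time` and `measurement`. The toy values below are made up for illustration and are not taken from the benchmark, and the observable and condition IDs in the real file may be spelled differently. The helper names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Toy stand-in for a PEtab measurement table (tab-separated).
# All values are invented for illustration, NOT copied from the benchmark.
TOY_TSV = (
    "observableId\tsimulationConditionId\ttime\tmeasurement\n"
    "ERK_PP\tmodel1_data2\t0\t0.0\n"
    "ERK_PP\tmodel1_data2\t150\t0.8\n"
    "ERK_PP\tmodel1_data3\t0\t0.0\n"
    "ERK_PP\tmodel1_data3\t150\t0.8\n"
    "ERK_PP\tmodel1_data4\t0\t0.0\n"
    "ERK_PP\tmodel1_data4\t150\t0.3\n"
    "AKT_PP\tmodel1_data3\t0\t0.1\n"
    "AKT_PP\tmodel1_data3\t150\t0.5\n"
    "AKT_PP\tmodel1_data4\t0\t0.1\n"
    "AKT_PP\tmodel1_data4\t150\t0.9\n"
)


def series_by_condition(tsv_text, observable_id):
    """Map each condition to its time-sorted (time, measurement) series."""
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        if row["observableId"] == observable_id:
            series[row["simulationConditionId"]].append(
                (float(row["time"]), float(row["measurement"]))
            )
    return {cond: tuple(sorted(points)) for cond, points in series.items()}


def duplicated_conditions(tsv_text, observable_id):
    """Return pairs of conditions whose series for one observable are identical."""
    series = series_by_condition(tsv_text, observable_id)
    conds = sorted(series)
    return [
        (a, b)
        for i, a in enumerate(conds)
        for b in conds[i + 1:]
        if series[a] == series[b]
    ]


print(duplicated_conditions(TOY_TSV, "ERK_PP"))
print(duplicated_conditions(TOY_TSV, "AKT_PP"))
```

To run it against the real file, read `measurementData_Chen_MSB2009.tsv` into a string and pass that instead of `TOY_TSV`. Exact equality is used on purpose: two distinct experimental conditions producing identical values at every time point is a strong hint of a copy & paste error, though any hit still needs to be checked against the original supplement by hand.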