Adding Level 3-5 Cell Painting Data Questions #3
Comments
Not currently blocking 👍
This was the workflow used (repo is private), which predates the profiling handbook.
My suggestion is to use the Level 3 data in
If you have the time, I would suggest comparing the level 3, 4, and 5 data to that generated using
A hack for figuring out which statistic was used to summarize single-cell profiles:
library(tidyverse)
library(magrittr) # provides the %<>% assignment pipe
x <-
list.files(
"../../backend/2016_04_01_a549_48hr_batch1/",
recursive = TRUE,
full.names = TRUE,
pattern = "*augmented.csv"
) %>%
map_df(function(fname) {
read_csv(
fname,
col_types =
cols_only(
Cells_AreaShape_Area = col_double(),
Metadata_Plate = col_character()
)
)
})
# Cell area is a pixel count (assumed integer), so a per-well median is always a multiple of 0.5
x %<>%
mutate(is_median = ceiling(Cells_AreaShape_Area * 2) == Cells_AreaShape_Area * 2)
# Per plate, count the wells whose value is consistent with a median
x %<>%
group_by(Metadata_Plate) %>% summarise(is_median = sum(is_median), n = n())
# Keep only plates where at least one well is not consistent with a median
x %<>%
filter(is_median != n)
x %>%
knitr::kable()
The level 3 data for these plates were created using means, not medians, and should be reprocessed using medians.
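The heuristic in the R snippet above can be illustrated with a small Python sketch. It assumes the feature being checked is integer-valued at the single-cell level (as a pixel-count area would be), so that a median of such values is always a multiple of 0.5, whereas a mean usually is not. The wells and cell areas below are synthetic, and the helper name `looks_like_median` is hypothetical.

```python
import random
import statistics


def looks_like_median(value, tol=1e-9):
    """Return True when value is (numerically) a multiple of 0.5."""
    doubled = value * 2
    return abs(doubled - round(doubled)) < tol


# Synthetic example data: 20 wells, each with 50-200 integer cell areas.
rng = random.Random(0)
wells = [
    [rng.randint(500, 5000) for _ in range(rng.randint(50, 200))]
    for _ in range(20)
]

# Aggregate each well two ways, as a profiling pipeline might.
median_profiles = [statistics.median(w) for w in wells]
mean_profiles = [statistics.fmean(w) for w in wells]

# Count how many aggregated values pass the "multiple of 0.5" check.
n_median_like_in_medians = sum(looks_like_median(v) for v in median_profiles)
n_median_like_in_means = sum(looks_like_median(v) for v in mean_profiles)

print("median-aggregated wells passing:", n_median_like_in_medians, "of", len(wells))
print("mean-aggregated wells passing:", n_median_like_in_means, "of", len(wells))
```

Every median-aggregated well passes the check by construction. A mean-aggregated well can pass only by coincidence, so a plate where the pass count falls short of the well count was probably not summarized with medians.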
|
I am working through confirming pycytominer and cytominer equivalency. I added cytomining/pycytominer#72 to mirror the cytominer "robust" function. I tested this using one example plate (
I am noting here an observation and a potential discrepancy in the cytominer processing details. Specifically, the cytominer processing notes (link above) state that the plate was normalized against DMSO. However, when I compare the cytominer results to the pycytominer results, the cytominer results are closer to the pycytominer results obtained by normalizing against the whole plate. Furthermore, the cytominer level 4a results are more similar to the cytominer level 3 results processed with pycytominer using all samples than to those processed using only DMSO samples.
from pycytominer.cyto_utils import infer_cp_features
from pycytominer import normalize
cytominer_df = data["level_three"]["cytominer"]
pycytominer_df = data["level_three"]["pycytominer"]
pycytominer_df.Metadata_broad_sample = pycytominer_df.Metadata_broad_sample.fillna("DMSO")
cp_cols = infer_cp_features(pycytominer_df)
# Process pycytominer level 3 data with two different normalization strategies
pycytominer_norm_all_df = normalize(
profiles=pycytominer_df,
features="infer",
method="mad_robustize",
samples="all",
output_file="none",
).loc[:, cp_cols]
pycytominer_norm_dmso_df = normalize(
profiles=pycytominer_df,
features="infer",
method="mad_robustize",
samples="Metadata_broad_sample == 'DMSO'",
output_file="none",
).loc[:, cp_cols]
# Process cytominer level 3 data with two different normalization strategies
cytominer_norm_all_df = normalize(
profiles=cytominer_df,
features="infer",
method="mad_robustize",
samples="all",
output_file="none",
).loc[:, cp_cols]
cytominer_norm_dmso_df = normalize(
profiles=cytominer_df,
features="infer",
method="mad_robustize",
samples="Metadata_broad_sample == 'DMSO'",
output_file="none",
).loc[:, cp_cols]
# Also load cytominer level four data
cytominer_level_four_df = data["level_four"]["cytominer"]
# Screenshot of results below
((cytominer_level_four_df - pycytominer_norm_all_df).sum().abs() > 1e-10).sum()
((cytominer_level_four_df - pycytominer_norm_dmso_df).sum().abs() > 1e-10).sum()
((cytominer_level_four_df - cytominer_norm_all_df).sum().abs() > 1e-10).sum()
((cytominer_level_four_df - cytominer_norm_dmso_df).sum().abs() > 1e-10).sum()
Summary
Cytominer normalization seems closer to whole-plate normalization than to DMSO normalization.
Question
Which is the most appropriate normalization strategy? All pycytominer-derived data is based on a DMSO-normalized strategy. This includes the results with the batch effect observed across DMSO wells (broadinstitute/cell-health#84).
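To make the difference between the two normalization strategies concrete, here is a minimal standard-library sketch of a robust z-score computed against either the whole plate or only the DMSO wells. It assumes the common definition (subtract the reference median, divide by 1.4826 times the reference MAD, plus a small epsilon). This is not confirmed to match the pycytominer or cytominer implementations exactly, and the plate values are made up for illustration.

```python
import statistics


def mad_robustize(values, reference, epsilon=1e-18):
    """Robust z-score of values, centered and scaled by the reference samples.

    Assumption: scale = 1.4826 * MAD + epsilon (a common convention).
    """
    med = statistics.median(reference)
    mad = statistics.median([abs(v - med) for v in reference])
    scale = 1.4826 * mad + epsilon
    return [(v - med) / scale for v in values]


# Hypothetical single-feature plate: (treatment label, feature value) per well.
plate = [
    ("DMSO", 10.0), ("DMSO", 11.0), ("DMSO", 9.0), ("DMSO", 10.5),
    ("cpd_a", 14.0), ("cpd_b", 20.0), ("cpd_c", 6.0), ("cpd_d", 30.0),
]
values = [v for _, v in plate]
dmso_values = [v for label, v in plate if label == "DMSO"]

# Strategy 1: normalize against every well on the plate.
norm_all = mad_robustize(values, reference=values)
# Strategy 2: normalize against the DMSO (negative control) wells only.
norm_dmso = mad_robustize(values, reference=dmso_values)

# Compare element-wise, taking the absolute value before aggregating.
n_different = sum(abs(a - b) > 1e-10 for a, b in zip(norm_all, norm_dmso))
print("wells that differ between strategies:", n_different, "of", len(values))
```

Taking the absolute value of each difference before aggregating avoids positive and negative differences cancelling each other out when two profile tables are compared.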
Note an update to |
Closing in favor of #22.
I am in the process of adding level 3-5 profiles to this repo (using git lfs). I will use this issue to document various questions I have about the process.
cytominer profiles here? Should we consider the pycytominer-based profiles less mature (and therefore less stable)? The cytominer profiles are the ones that were originally computed.
/home/ubuntu/bucket/projects/2015_10_05_DrugRepurposing_AravindSubramanian_GolubLab_Broad/workspace/backend/2016_04_01_a549_48hr_batch1