use_pancancer should be false for all_other_cancers in process_data_for_gene() (i.e. there shouldn't be unused dummy/one-hot variables)
Label filtering should happen after we take the intersection of samples between gene expression and mutation (this will make the proportions in 08_cell_line_prediction/download_data.ipynb match what we actually see when the scripts run)
tcga_utilities should probably be renamed to something more general, or split
CNV data for cell lines, in ccle_data_model_generate_labels()
remove unknown/non-cancerous samples in load_sample_info()
maybe try sklearn LogisticRegression with elastic net penalty rather than SGDClassifier
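
For the last item, a minimal sketch of what the swap might look like. It assumes scikit-learn's `LogisticRegression` with the `saga` solver, which is the solver that supports the elastic net penalty. The synthetic data and the hyperparameter values are placeholders, not the project's real inputs or settings. Note that `C` is an inverse regularization strength, so it does not map one-to-one onto `SGDClassifier`'s `alpha`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for the expression matrix and mutation labels.
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=5, random_state=0
)

# Elastic net logistic regression; saga converges faster on standardized features.
# l1_ratio and C are illustrative values and would normally be tuned by cross-validation.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(
        penalty="elasticnet",
        solver="saga",
        l1_ratio=0.5,
        C=0.1,
        max_iter=5000,
        random_state=0,
    ),
)
model.fit(X, y)

# Inspect sparsity of the fitted coefficients (the L1 part can zero some out).
coefs = model.named_steps["logisticregression"].coef_.ravel()
n_nonzero = int(np.count_nonzero(coefs))
print(f"{n_nonzero} of {coefs.size} coefficients are nonzero")
```

One possible reason to prefer this over `SGDClassifier` is that `saga` runs to a convergence tolerance rather than depending on a learning-rate schedule, which may make results less sensitive to the number of epochs; whether that matters here would need to be checked on the real data.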