Concordance of resistance predictions with phenotype #76

Closed
mbhall88 opened this issue Apr 13, 2021 · 15 comments

@mbhall88
Owner

We do not have phenotype data for every sample/drug, so just for those we do have.
Similar to #75, we will have a table showing Very Major Error (missed resistance), Major Error (missed susceptible), PPV (what % of R calls are R) and NPV (what % of S calls are S) for each tool.

@mbhall88 mbhall88 self-assigned this Apr 13, 2021
@mbhall88
Owner Author

mbhall88 commented Jul 7, 2021

First draft of the figure for this analysis


Figure 2

Number of resistant (left) and susceptible (right) phenotypes correctly identified by mykrobe from Illumina (blue) and Nanopore (purple) data from the same samples. The red bars indicate missed (FN) or incorrect (FP) predictions. The x-axis shows the drugs with available phenotype data that mykrobe also makes predictions for. E - ethambutol; H - isoniazid; Z - pyrazinamide; R - rifampicin; S - streptomycin; Km - kanamycin; Am - amikacin; Ofx - ofloxacin; Cm - capreomycin; Mfx - moxifloxacin.

image

@mbhall88
Owner Author

mbhall88 commented Jul 7, 2021

We could conceivably also use a table (or replace the figure with a table if more informative)

| drug | technology | NPV | PPV | sensitivity | specificity | ME/FP | VME/FN | TP | TN |
|---|---|---|---|---|---|---|---|---|---|
| Amikacin | Illumina | 0.974025974025974 | 0.8181818181818182 | 0.8181818181818182 | 0.974026 | 2 | 2 | 9 | 75 |
| Amikacin | Nanopore | 1.0 | 0.8461538461538461 | 1.0 | 0.974026 | 2 | 0 | 11 | 75 |
| Capreomycin | Illumina | 1.0 | 0.0 | - | 0.980392 | 1 | 0 | 0 | 50 |
| Capreomycin | Nanopore | 1.0 | 0.0 | - | 0.980392 | 1 | 0 | 0 | 50 |
| Ethambutol | Illumina | 0.9375 | 0.38461538461538464 | 0.7142857142857143 | 0.789474 | 16 | 4 | 10 | 60 |
| Ethambutol | Nanopore | 0.9393939393939394 | 0.4166666666666667 | 0.7142857142857143 | 0.815789 | 14 | 4 | 10 | 62 |
| Isoniazid | Illumina | 0.8490566037735849 | 0.9333333333333333 | 0.84 | 0.9375 | 3 | 8 | 42 | 45 |
| Isoniazid | Nanopore | 0.8541666666666666 | 0.86 | 0.86 | 0.854167 | 7 | 7 | 43 | 41 |
| Kanamycin | Illumina | 1.0 | 0.0 | - | 0.980392 | 1 | 0 | 0 | 50 |
| Kanamycin | Nanopore | 1.0 | 0.0 | - | 0.980392 | 1 | 0 | 0 | 50 |
| Moxifloxacin | Illumina | - | 0.0 | - | 0 | 1 | 0 | 0 | 0 |
| Moxifloxacin | Nanopore | - | 0.0 | - | 0 | 1 | 0 | 0 | 0 |
| Ofloxacin | Illumina | 1.0 | 0.7142857142857143 | 1.0 | 0.947368 | 4 | 0 | 10 | 72 |
| Ofloxacin | Nanopore | 1.0 | 0.7142857142857143 | 1.0 | 0.947368 | 4 | 0 | 10 | 72 |
| Pyrazinamide | Illumina | 1.0 | - | - | 1 | 0 | 0 | 0 | 1 |
| Pyrazinamide | Nanopore | 1.0 | - | - | 1 | 0 | 0 | 0 | 1 |
| Rifampicin | Illumina | 0.8775510204081632 | 0.9761904761904762 | 0.8723404255319149 | 0.977273 | 1 | 6 | 41 | 43 |
| Rifampicin | Nanopore | 0.8775510204081632 | 0.9761904761904762 | 0.8723404255319149 | 0.977273 | 1 | 6 | 41 | 43 |
| Streptomycin | Illumina | 0.935064935064935 | 0.23076923076923078 | 0.375 | 0.878049 | 10 | 5 | 3 | 72 |
| Streptomycin | Nanopore | 0.9605263157894737 | 0.35714285714285715 | 0.625 | 0.890244 | 9 | 3 | 5 | 73 |
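For anyone wanting to check the columns, here is a minimal sketch of how they can be derived from the four counts. The function names are hypothetical and this is not necessarily how the table was actually generated; undefined ratios (zero denominator) are returned as `None`, which corresponds to the `-` entries.

```python
from typing import Dict, Optional


def safe_ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def concordance_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, Optional[float]]:
    """Summarise prediction-vs-phenotype concordance from confusion-matrix counts.

    FN is a very major error (missed resistance);
    FP is a major error (missed susceptible).
    """
    return {
        "PPV": safe_ratio(tp, tp + fp),          # fraction of R calls that are R
        "NPV": safe_ratio(tn, tn + fn),          # fraction of S calls that are S
        "sensitivity": safe_ratio(tp, tp + fn),  # fraction of R phenotypes found
        "specificity": safe_ratio(tn, tn + fp),  # fraction of S phenotypes found
    }


# Counts taken from the Amikacin and Capreomycin Illumina rows of the table.
amikacin_illumina = concordance_metrics(tp=9, tn=75, fp=2, fn=2)
capreomycin_illumina = concordance_metrics(tp=0, tn=50, fp=1, fn=0)
print(amikacin_illumina)
print(capreomycin_illumina)
```

With no resistant phenotypes (as for capreomycin), sensitivity has a zero denominator and is left undefined rather than reported as 0.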

@iqbal-lab
Collaborator

iqbal-lab commented Jul 7, 2021

Cool! So essentially identical results except slightly better VME for amikacin, isoniazid and streptomycin, and slightly worse ME for isoniazid. I call that a win.

@iqbal-lab
Collaborator

Wait a minute, how come pyrazinamide is in the table, I thought we didn't have any phenotypes for that? Did I forget/get that wrong???

@mbhall88
Owner Author

mbhall88 commented Jul 7, 2021

> Wait a minute, how come pyrazinamide is in the table, I thought we didn't have any phenotypes for that? Did I forget/get that wrong???

We have 1 sample with PZA DST haha. Maybe I just leave it out then?

@mbhall88
Owner Author

mbhall88 commented Jul 7, 2021

Here is another plot that is very insightful

Effect of Nanopore read depth on mykrobe phenotype prediction. Each point indicates the proportion (y-axis) of classifications of that type at the read depth (x-axis). Read depth is "binned". That is, read depth 40 is all samples with a read depth greater than 40 and less than or equal to 50. FP - false positive; TN - true negative; etc.

image
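As a rough illustration of the binning described in the caption, here is a sketch that maps a read depth to its bin label. The bin width of 10 is inferred from the single example given (label 40 covers depths greater than 40 and up to 50), and `depth_bin` is a hypothetical helper, not the code used for the plot.

```python
import math


def depth_bin(depth: float, width: int = 10) -> int:
    """Return the label of the half-open bin (label, label + width] containing depth.

    The width of 10 is an assumption based on the caption's example.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    return (math.ceil(depth / width) - 1) * width


for depth in (40, 40.5, 45, 50, 50.1):
    print(depth, "->", depth_bin(depth))
```

Under this convention a sample at exactly 40x falls in the bin labelled 30, and a sample at exactly 50x falls in the bin labelled 40.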

@iqbal-lab
Collaborator

For this table (#76 (comment)), we will need confidence intervals, e.g. see https://www.nature.com/articles/ncomms10063/tables/1

@mbhall88
Owner Author

mbhall88 commented Jul 7, 2021

I don't really understand where the confidence intervals come from? The values aren't the result of any kind of aggregation/averaging...

@iqbal-lab
Collaborator

The confidence intervals inform you how much you can trust the rate (FPR, VME, whatever) based on the number of samples.
A TPR of 90% is more trustworthy if you find 9000 out of 10000 resistant samples than if you find 9 out of 10.
This stuff (confidence intervals) always does my head in a bit though, so don't worry when you look up the definitions and they confuse the hell out of you.

@mbhall88
Owner Author

I see. Ok, I've added that in using the Wilson score interval, which is the same as was used in the recent mykrobe paper. However, I notice the Nature Comms paper used the Clopper–Pearson confidence interval, although I don't think the two are that different.
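For reference, a minimal sketch of the Wilson score interval (no continuity correction) for a proportion such as FNR or PPV. The function name and the fixed `z = 1.96` for a 95% interval are my own choices for illustration, not necessarily what the analysis code does.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 approximates a two-sided 95% interval (an assumption here).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    # Clamp to [0, 1] so floating-point noise cannot produce a bound like -0.0.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Same rate, very different sample sizes: the interval narrows as n grows.
for found, n in ((9, 10), (9000, 10000)):
    low, high = wilson_interval(found, n)
    print(f"{found}/{n}: {low:.1%} to {high:.1%}")

# 2 missed out of 11 resistant samples (amikacin, Illumina, in the final table).
low, high = wilson_interval(2, 11)
print(f"FNR 2/11: {low:.1%} to {high:.1%}")
```

The last call should agree, to one decimal place, with the 5.1-47.7% interval reported for an FNR of 2 out of 11 later in this thread.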

@mbhall88
Owner Author

I've also overlaid the sample size on the coverage plot

image

@iqbal-lab
Collaborator

Channeling my inner Michael patrolling appropriate Slack channels...
This issue is about concordance with phenotype, and this comment (#76 (comment)) is about concordance with Illumina, which is a different issue.

@mbhall88
Owner Author

Ah, right you are. Thank you!

@mbhall88
Owner Author

Something interesting to know about mykrobe: when using a diploid model for ONT it goes crazy and calls everything resistant to isoniazid. Looking at a few samples, it looks like it calls a whole bunch of indels as HET

image

@mbhall88
Owner Author

mbhall88 commented Jul 12, 2021

I am going to start writing the results section with the following final (pending major issues) plot and table

image

| Drug | Technology | FN(R) | FP(S) | FNR (95% CI) | FPR (95% CI) | PPV (95% CI) | NPV (95% CI) |
|---|---|---|---|---|---|---|---|
| Amikacin | Illumina | 2(11) | 2(77) | 18.2% (5.1-47.7%) | 2.6% (0.7-9.0%) | 81.8% (52.3-94.9%) | 97.4% (91.0-99.3%) |
| Amikacin | Nanopore | 0(11) | 2(77) | 0.0% (0.0-25.9%) | 2.6% (0.7-9.0%) | 84.6% (57.8-95.7%) | 100.0% (95.1-100.0%) |
| Capreomycin | Illumina | 0(0) | 1(51) | - | 2.0% (0.3-10.3%) | 0.0% (0.0-79.3%) | 100.0% (92.9-100.0%) |
| Capreomycin | Nanopore | 0(0) | 1(51) | - | 2.0% (0.3-10.3%) | 0.0% (0.0-79.3%) | 100.0% (92.9-100.0%) |
| Ethambutol | Illumina | 3(14) | 16(76) | 21.4% (7.6-47.6%) | 21.1% (13.4-31.5%) | 40.7% (24.5-59.3%) | 95.2% (86.9-98.4%) |
| Ethambutol | Nanopore | 4(14) | 14(76) | 28.6% (11.7-54.6%) | 18.4% (11.3-28.6%) | 41.7% (24.5-61.2%) | 93.9% (85.4-97.6%) |
| Isoniazid | Illumina | 8(50) | 3(48) | 16.0% (8.3-28.5%) | 6.2% (2.1-16.8%) | 93.3% (82.1-97.7%) | 84.9% (72.9-92.1%) |
| Isoniazid | Nanopore | 7(50) | 7(48) | 14.0% (7.0-26.2%) | 14.6% (7.2-27.2%) | 86.0% (73.8-93.0%) | 85.4% (72.8-92.8%) |
| Kanamycin | Illumina | 0(0) | 1(51) | - | 2.0% (0.3-10.3%) | 0.0% (0.0-79.3%) | 100.0% (92.9-100.0%) |
| Kanamycin | Nanopore | 0(0) | 1(51) | - | 2.0% (0.3-10.3%) | 0.0% (0.0-79.3%) | 100.0% (92.9-100.0%) |
| Ofloxacin | Illumina | 0(10) | 4(76) | 0.0% (0.0-27.8%) | 5.3% (2.1-12.8%) | 71.4% (45.4-88.3%) | 100.0% (94.9-100.0%) |
| Ofloxacin | Nanopore | 0(10) | 4(76) | 0.0% (0.0-27.8%) | 5.3% (2.1-12.8%) | 71.4% (45.4-88.3%) | 100.0% (94.9-100.0%) |
| Rifampicin | Illumina | 5(47) | 1(44) | 10.6% (4.6-22.6%) | 2.3% (0.4-11.8%) | 97.7% (87.9-99.6%) | 89.6% (77.8-95.5%) |
| Rifampicin | Nanopore | 6(47) | 1(44) | 12.8% (6.0-25.2%) | 2.3% (0.4-11.8%) | 97.6% (87.7-99.6%) | 87.8% (75.8-94.3%) |
| Streptomycin | Illumina | 5(8) | 10(82) | 62.5% (30.6-86.3%) | 12.2% (6.8-21.0%) | 23.1% (8.2-50.3%) | 93.5% (85.7-97.2%) |
| Streptomycin | Nanopore | 3(8) | 9(82) | 37.5% (13.7-69.4%) | 11.0% (5.9-19.6%) | 35.7% (16.3-61.2%) | 96.1% (89.0-98.6%) |

image
