question: how was filter of delly calls performed and decided upon? #21

Open
anoronh4 opened this issue Jun 16, 2021 · 1 comment
We recently accessed several of the PCAWG variant calls and realized, based on the file names, that the delly filter step was likely omitted, since the naming pattern was `<id>.embl-delly_<version>-preFilter.<date>.somatic.sv.vcf.gz`. We were wondering what the reason for this was, and how PCAWG filters the calls before merging them with those from other callers.
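For anyone checking their own downloads, here is a minimal sketch that splits a file name following the pattern above into its parts. The function name `parse_pcawg_delly_name` and the regular expression are my own, inferred only from the file names we saw, and the optional suffix after `preFilter` is an assumption rather than a documented PCAWG convention.

```python
import re

# Regex inferred from observed file names; NOT an official PCAWG naming spec.
# The optional "-<suffix>" after "preFilter" is an assumption.
NAME_RE = re.compile(
    r"(?P<id>[^.]+)\.embl-delly_(?P<version>[^.]+?)-preFilter"
    r"(?:-(?P<suffix>[^.]+))?\.(?P<date>\d+)\.somatic\.sv\.vcf\.gz"
)


def parse_pcawg_delly_name(filename):
    """Return the id, version, suffix and date parts, or None if no match."""
    match = NAME_RE.fullmatch(filename)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    # Hypothetical file name built from the pattern, for illustration only.
    print(parse_pcawg_delly_name(
        "sample-id.embl-delly_1-0-0-preFilter.20150101.somatic.sv.vcf.gz"
    ))
```

A `None` result just means the name does not follow the pre-filter pattern, which is a quick way to spot files that came from a different step.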

We did notice that the script `DellySomaticFreqFilter.py` exists in this repo, but were curious why the `*highConf.vcf` files were left out of the shared PCAWG SV calls, and why the workflow deviated from delly's built-in filter algorithm. Also, is there any guidance on how to build the panel of normals (PoN) for delly calls?

anoronh4 changed the title from "question: why not filter delly calls?" to "question: how was filter of delly calls performed and decided upon?" on Jun 16, 2021

anoronh4 commented Jun 25, 2021

I did try the script `DellySomaticFreqFilter.py` on the unfiltered VCF calls that we downloaded from PCAWG, and found it strange that the proportion of VCF records kept varied so widely. I'm now wondering whether some feature of certain VCF files (DO1015_tumor__DO1015_normal, for example) causes variants to be incorrectly skipped. Below is the table of counts:

| id | original number of rows | number of rows after filtering | proportion kept (%) | pcawg vcf file prefix |
| --- | --- | --- | --- | --- |
| DO1015_tumor__DO1015_normal | 432 | 0 | 0 | f393bb08-5b50-e009-e040-11ac0d484537.embl-delly_1-0-0-preFilter.20150618 |
| DO218709_tumor__DO218709_normal | 368 | 0 | 0 | fc8130e0-0f1a-b6eb-e040-11ac0c48328f.embl-delly_1-0-0-preFilter.20150614 |
| DO220878_tumor__DO220878_normal | 189 | 0 | 0 | 9fc5b5c7-3973-42b4-8710-454de0cb5b50.embl-delly_1-0-0-preFilter.20150730 |
| DO32960_tumor__DO32960_normal | 131 | 0 | 0 | b37d6283-6f95-4975-a794-f3d5c4bbc7b3.embl-delly_1-0-0-preFilter.20150705 |
| DO45239_tumor__DO45239_normal | 684 | 0 | 0 | d182b67c-c622-11e3-bf01-24c6515278c0.embl-delly_1-0-0-preFilter.20150712 |
| DO51533_tumor__DO51533_normal | 160 | 0 | 0 | d1804679-e728-4597-ac69-49554c087b9e.embl-delly_1-0-0-preFilter.20150722 |
| DO6580_tumor__DO6580_normal | 95 | 0 | 0 | c642b9cc-bdb1-4796-9692-8be92398be17.embl-delly_1-0-0-preFilter.20150604 |
| DO220885_tumor__DO220885_normal | 127 | 64 | 50.3 | 16d5519e-ecb9-4fc8-81f1-e0e4adf722a8.embl-delly_1-3-0-preFilter.20150817 |
| DO49445_tumor__DO49445_normal | 76 | 76 | 100.0 | b5cabba2-30a4-458e-897c-00ec3fefa6d2.embl-delly_1-0-0-preFilter.20150622 |
| DO6436_tumor__DO6436_normal | 77 | 77 | 100.0 | ef3b454c-b2cf-4f68-a2ab-733620b6714e.embl-delly_1-0-0-preFilter.20150607 |
| DO51484_tumor__DO51484_normal | 142 | 118 | 83.0 | 6ad44218-d34e-4126-bf56-1be2140cd3fb.embl-delly_1-3-0-preFilter.20150826 |
| DO48677_tumor__DO48677_normal | 182 | 151 | 82.9 | 9ba2c970-c622-11e3-bf01-24c6515278c0.embl-delly_1-3-0-preFilter.20150829 |
| DO48545_tumor__DO48545_normal | 155 | 155 | 100.0 | 44406493-37f4-48c7-961b-8714be50773a.embl-delly_1-0-0-preFilter-hpc.150701 |
| DO51490_tumor__DO51490_normal | 159 | 157 | 98.7 | 6297aa77-37a0-4f46-987b-32bd8653c0c2.embl-delly_1-0-0-preFilter.20150722 |
| DO51514_tumor__DO51514_normal | 189 | 188 | 99.4 | 65d2dbc3-a163-4696-b246-47a430e66572.embl-delly_1-0-0-preFilter.20150618 |
| DO51187_tumor__DO51187_normal | 205 | 205 | 100.0 | dc7faf84-4438-447b-abcf-a3af87043654.embl-delly_1-0-0-preFilter-hpc.150702 |
| DO220906_tumor__DO220906_normal | 249 | 248 | 99.5 | deb9fbb6-656b-41ce-8299-554efc2379bd.embl-delly_1-0-0-preFilter.20150805 |
| DO51172_tumor__DO51172_normal | 273 | 273 | 100.0 | 29127cde-548f-4c42-96cf-6f0020c3db9a.embl-delly_1-0-0-preFilter-hpc.150702 |
| DO34504_tumor__DO34504_normal | 275 | 274 | 99.6 | 6bdf00f6-670f-466e-87fb-e853e41f000e.embl-delly_1-0-0-preFilter.20150705 |
| DO49181_tumor__DO49181_normal | 294 | 294 | 100.0 | 88bc38ba-ad1d-431e-a67e-0a5a23678386.embl-delly_1-0-0-preFilter-hpc.150701 |
| DO51497_tumor__DO51497_normal | 299 | 299 | 100.0 | a3210fd0-344c-468e-8ff2-2d0869a2fb75.embl-delly_1-0-0-preFilter-hpc.151230 |
| DO49184_tumor__DO49184_normal | 303 | 303 | 100.0 | d4907a1b-8b06-47c5-8bca-c781d9cddaf8.embl-delly_1-0-0-preFilter-hpc.150630 |
| DO51053_tumor__DO51053_normal | 378 | 337 | 89.1 | 289790a5-77bd-49a9-a1ec-478a8ecacd7f.embl-delly_1-3-0-preFilter.20150817 |
| DO50427_tumor__DO50427_normal | 344 | 339 | 98.5 | f9c52187-2e82-d58a-e040-11ac0d484fc4.embl-delly_1-0-0-preFilter.20150622 |
| DO51528_tumor__DO51528_normal | 341 | 341 | 100.0 | c0523251-3ac2-4292-bb00-9ae9ea9009f6.embl-delly_1-0-0-preFilter-hpc.150605 |
| DO49076_tumor__DO49076_normal | 347 | 347 | 100.0 | 0554ffe5-31f7-43f5-8372-2b73c9cf3527.embl-delly_1-0-0-preFilter-hpc.150630 |
| DO49090_tumor__DO49090_normal | 352 | 352 | 100.0 | 4cbe411b-b05e-46bd-bea8-126289a0866c.embl-delly_1-0-0-preFilter-hpc.150630 |
| DO48541_tumor__DO48541_normal | 375 | 375 | 100.0 | 5702affd-eafe-42a4-8f56-c1f22f8f184d.embl-delly_1-0-0-preFilter-hpc.150701 |
| DO33488_tumor__DO33488_normal | 402 | 402 | 100.0 | 0cf9bbc2-cbd5-4b64-8d90-cfa416307b39.embl-delly_1-0-0-preFilter-hpc.150629 |
| DO45223_tumor__DO45223_normal | 425 | 404 | 95.0 | b67208c4-c622-11e3-bf01-24c6515278c0.embl-delly_1-3-0-preFilter.20150909 |
| DO49129_tumor__DO49129_normal | 429 | 429 | 100.0 | 5f94cb62-4019-47ff-bf6a-eeda8e9e033c.embl-delly_1-0-0-preFilter-hpc.150630 |
| DO32875_tumor__DO32875_normal | 434 | 434 | 100.0 | ee5d5e7d-78cf-4a29-a9ee-56aa3da877dd.embl-delly_1-0-0-preFilter-hpc.150628 |
| DO51541_tumor__DO51541_normal | 689 | 487 | 70.6 | 3933c60d-73d6-4f74-ae02-fd545fc1f092.embl-delly_1-3-0-preFilter.20150801 |
| DO220841_tumor__DO220841_normal | 520 | 518 | 99.6 | 9e0009d1-c993-4247-9706-88ee84591dec.embl-delly_1-0-0-preFilter.20150730 |
| DO45096_tumor__DO45096_normal | 522 | 522 | 100.0 | 27fcccdc-c622-11e3-bf01-24c6515278c0.embl-delly_1-0-0-preFilter-hpc.150901 |
| DO51487_tumor__DO51487_normal | 681 | 681 | 100.0 | d333b55b-8bac-4a99-9d23-3cc0c25057bf.embl-delly_1-0-0-preFilter-hpc.151230 |
| DO49460_tumor__DO49460_normal | 686 | 684 | 99.7 | 9ffe694e-b488-489e-bdbe-0800e505eec4.embl-delly_1-0-0-preFilter.20150622 |
| DO51466_tumor__DO51466_normal | 962 | 879 | 91.3 | c13fb736-614c-4d5f-83bf-2d7586f4fb53.embl-delly_1-3-0-preFilter.20150826 |
| DO51118_tumor__DO51118_normal | 1064 | 1037 | 97.4 | a6045753-60bb-4e65-bc89-1ef0b47aab35.embl-delly_1-3-0-preFilter.20150816 |
| DO50793_tumor__DO50793_normal | 1071 | 1071 | 100.0 | 3f99ae0e-c623-11e3-bf01-24c6515278c0.embl-delly_1-0-0-preFilter-hpc.150901 |
| DO220909_tumor__DO220909_normal | 1622 | 1619 | 99.8 | 142b6dbf-c943-4a7d-8ab6-13a975f48d7a.embl-delly_1-0-0-preFilter.20150805 |
| DO45191_tumor__DO45191_normal | 9148 | 9106 | 99.5 | 9563a264-c622-11e3-bf01-24c6515278c0.embl-delly_1-0-0-preFilter.20150709 |
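In case it helps reproduce the counts, below is a rough sketch of how figures like these can be computed from a pair of VCF files. It is not part of `DellySomaticFreqFilter.py`; the helper names `count_records`, `tally_filters` and `percent_kept` are hypothetical. The percentages above appear to be truncated to one decimal (64 of 127 is shown as 50.3), so the sketch truncates as well, though that is only an inference from the numbers.

```python
import gzip
from collections import Counter


def _open_vcf(path):
    """Open a plain or gzip-compressed VCF as text."""
    opener = gzip.open if str(path).endswith(".gz") else open
    return opener(path, "rt")


def count_records(path):
    """Count non-header, non-empty lines (one per VCF record)."""
    with _open_vcf(path) as handle:
        return sum(1 for line in handle if line.strip() and not line.startswith("#"))


def tally_filters(path):
    """Count records per FILTER value (7th tab-separated VCF column)."""
    tally = Counter()
    with _open_vcf(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if line.startswith("#") or len(fields) < 7:
                continue
            tally[fields[6]] += 1
    return tally


def percent_kept(before, after):
    """Percent of records kept, truncated (not rounded) to one decimal."""
    if before == 0:
        return 0.0
    return (1000 * after) // before / 10
```

Comparing `tally_filters` between a sample where everything was dropped and one where nearly everything was kept might be a quick first check for a difference in the input files, although the cause could equally lie in INFO or FORMAT fields that this sketch does not inspect.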
