Home
Yaobo Xu edited this page Jun 15, 2020 · 20 revisions
A collection of code for checking NGS sequencing results.
- For usage: compareBamGenotypes.pl -h. Compares genotypes across a set of BAM files from the same donor and reports the fraction of matched genotypes. It also checks whether the inferred genders match.
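The comparison described above can be illustrated with a minimal Python sketch. It is not the script's actual implementation: the function names, the dictionary layout (site mapped to a genotype string such as A/G) and the rule that only sites called in both samples are counted are all assumptions made for illustration.

```python
from itertools import combinations


def normalise(genotype):
    """Treat 'A/G' and 'G/A' as the same unphased genotype (assumed format)."""
    return tuple(sorted(genotype.split("/")))


def match_fraction(calls_a, calls_b):
    """Fraction of sites called in both samples whose genotypes agree."""
    shared = [site for site in calls_a if site in calls_b]
    if not shared:
        return None  # no overlapping sites, so no fraction can be computed
    matched = sum(
        1 for site in shared
        if normalise(calls_a[site]) == normalise(calls_b[site])
    )
    return matched / len(shared)


def pairwise_match_fractions(samples):
    """Compare every pair of samples from one donor (hypothetical helper)."""
    return {
        (a, b): match_fraction(samples[a], samples[b])
        for a, b in combinations(sorted(samples), 2)
    }


def genders_match(inferred):
    """True when every sample of the donor has the same inferred gender."""
    return len(set(inferred.values())) <= 1


# Illustrative data only; real calls would come from the BAM files.
samples = {
    "normal": {("chr1", 100): "A/G", ("chr1", 200): "C/C", ("chr2", 50): "T/T"},
    "tumour": {("chr1", 100): "G/A", ("chr1", 200): "C/T", ("chr3", 10): "A/A"},
}
fractions = pairwise_match_fractions(samples)
same_gender = genders_match({"normal": "M", "tumour": "M"})
```

A low fraction for a pair, or a gender mismatch, would suggest that the two BAM files may not come from the same donor.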
- For usage: verifyBamHomChk.pl -h. Runs verify BAM.
- For usage: validate_sample_meta.pl -h. Validates sample metadata and the corresponding BAM files. Upon successful validation, UUIDs are assigned and the md5sum of each BAM file is produced.
If the BAM header also satisfies the pan-prostate BAM header requirements (the SQ lines and CL lines are checked for this), the file is labelled 'pp-remapped'; otherwise it is labelled 'raw'.
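As a rough illustration of the post-validation step, the sketch below computes an md5 checksum, assigns a UUID and picks the label. It is an assumption-laden example rather than the script's own code: the function names and the record layout are hypothetical, a random (version 4) UUID is assumed, and the SQ/CL header check is represented only by a caller-supplied boolean because its exact rules are not given here.

```python
import hashlib
import uuid


def md5sum(path, chunk_size=1024 * 1024):
    """Stream the file in chunks so large BAM files need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def finalise_sample(sample_id, bam_path, header_ok):
    """Build the record for a sample that has already passed validation.

    header_ok is assumed to be the result of the pan-prostate header check,
    which is not implemented in this sketch.
    """
    return {
        "sample_id": sample_id,
        "uuid": str(uuid.uuid4()),  # assumption: random version 4 UUID
        "md5": md5sum(bam_path),
        "bam_state": "pp-remapped" if header_ok else "raw",
    }
```

Reading in fixed-size chunks keeps memory use flat regardless of BAM size, which matters for whole-genome files.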
- Donor ID
- Tissue ID
- Sample ID (should be unique as well)
- is_normal (Y/N or Yes/No; indicates whether the sample is a normal sample or not, i.e. a tumour sample)
- is_normal_for_donor (Y/Yes or blank; indicates whether the sample is used as the matched normal for all other samples of the donor)
- relative_file_path (an existing BAM file with at least a million reads that satisfies the requirements below)
- The BAM should have Read Group (RG) line(s). Each RG line has an ID tag, a library (LB) tag, a platform (PL) tag and a sample (SM) tag
- The ID tag value must be unique, and reads assigned to that ID must exist in the BAM
- The PL tag value must be 'ILLUMINA', as we don't support other platforms yet
- The SM tag value must be the corresponding Sample ID
- All reads in the BAM should have an RG ID assigned, and the assigned ID must be declared in one of the RG lines in the header
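The read group requirements above could be checked along the following lines. This is a minimal sketch, not the validation script itself: it assumes the SAM header is already available as text (for example from samtools view -H) and that the set of RG IDs seen on the reads has been collected beforehand, with None standing for a read that carries no RG tag. The function names and error messages are hypothetical.

```python
def parse_read_groups(header_text):
    """Return one dict of tag -> value for each @RG line in a SAM header."""
    groups = []
    for line in header_text.splitlines():
        if line.startswith("@RG"):
            fields = line.split("\t")[1:]
            groups.append(dict(f.split(":", 1) for f in fields if ":" in f))
    return groups


def check_read_groups(header_text, sample_id, read_rg_ids):
    """Return a list of problems; an empty list means all checks passed."""
    errors = []
    groups = parse_read_groups(header_text)
    if not groups:
        errors.append("no @RG line in header")
    declared = set()
    for rg in groups:
        rg_id = rg.get("ID")
        for tag in ("ID", "LB", "PL", "SM"):
            if tag not in rg:
                errors.append(f"RG {rg_id!r}: missing {tag} tag")
        if rg_id is not None:
            if rg_id in declared:
                errors.append(f"RG {rg_id!r}: ID is not unique")
            declared.add(rg_id)
            if rg_id not in read_rg_ids:
                errors.append(f"RG {rg_id!r}: no reads carry this ID")
        if "PL" in rg and rg["PL"] != "ILLUMINA":
            errors.append(f"RG {rg_id!r}: PL must be ILLUMINA")
        if "SM" in rg and rg["SM"] != sample_id:
            errors.append(f"RG {rg_id!r}: SM does not match Sample ID")
    if None in read_rg_ids:
        errors.append("some reads have no RG ID")
    undeclared = {i for i in read_rg_ids if i is not None} - declared
    for rg_id in sorted(undeclared):
        errors.append(f"reads use undeclared RG ID {rg_id!r}")
    return errors


# Illustrative header; a real one would be read from the BAM file.
good_header = "@HD\tVN:1.6\n@RG\tID:rg1\tLB:lib1\tPL:ILLUMINA\tSM:S1\n"
problems = check_read_groups(good_header, "S1", {"rg1"})
```

Collecting every problem instead of stopping at the first one lets a submitter fix a BAM header in a single pass.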