vcf2smc #264

Open
Risingsun93 opened this issue Aug 25, 2024 · 5 comments

Hi @terhorst @willright28, I'm facing the same issue, "RuntimeError: Distinguished lineages not found in data?", when using the example data from this GitHub repository: https://github.com/popgenmethods/smcpp/blob/master/example/example.vcf.gz

These are the commands I tried:

smc++ vcf2smc example.vcf.gz chr1.smc.gz chr1 CEU:NA12878,NA12879
smc++ vcf2smc -d NA12878 NA12879 example.vcf.gz chr1.smc.gz chr1 CEU:NA12878,NA12879
for i in {7..9};
do smc++ vcf2smc -d NA1287$i NA1287$i example.vcf.gz out.$i.txt chr1 NA12877 NA12878 NA12890;
done
smc++ estimate -o output/ 0.1 out1.txt

Kindly help me solve this.
Please check the header of this file and the sample and population info, and suggest what changes I should make.
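
For anyone who wants to check the header and sample info themselves, here is a minimal sketch using only the Python standard library. `summarize_vcf` is a hypothetical helper, not part of smc++, and it assumes the file is a gzip- or bgzip-compressed, tab-delimited VCF. It returns the sample IDs from the `#CHROM` header line and the contig names seen in the first records.

```python
import gzip


def summarize_vcf(path, max_records=10000):
    """Return (sample_ids, contig_names) from a gzip/bgzip-compressed VCF.

    Hypothetical helper for inspection only; not part of smc++.
    """
    samples = []
    contigs = []
    seen = 0
    with gzip.open(path, "rt") as fh:
        for line in fh:
            if not line.strip():
                continue  # skip blank lines
            if line.startswith("##"):
                continue  # meta-information lines
            if line.startswith("#CHROM"):
                # The first nine columns are fixed; sample IDs start at column 10.
                samples = line.rstrip("\n").split("\t")[9:]
                continue
            # Data record: the first column is the contig (chromosome) name.
            chrom = line.split("\t", 1)[0]
            if chrom not in contigs:
                contigs.append(chrom)
            seen += 1
            if seen >= max_records:
                break  # only sample the start of the file
    return samples, contigs
```

Calling `summarize_vcf("example.vcf.gz")` and comparing the result with the contig argument (`chr1`) and the sample IDs (`NA12878`, `NA12879`) passed to `vcf2smc` would show whether those names occur in the file at all. Whether a mismatch is what triggers the error here is only a guess.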

###########
mylinux@ChiragsPC:~/smcppdata$ smc++ vcf2smc example.vcf.gz chr1.smc.gz chr1 CEU:NA12878,NA12879
2016 smcpp.commands.vcf2smc WARNING Neither missing cutoff (-c) or mask (-m) has been specified. This means that stretches of the chromosome that do not have any VCF entries (for example, centromeres) will be interpreted as homozygous recessive.
2020 smcpp.commands.vcf2smc INFO Population 1:
2020 smcpp.commands.vcf2smc INFO Distinguished lineages: NA12878:0, NA12878:1
2021 smcpp.commands.vcf2smc INFO Undistinguished lineages: NA12879:0, NA12879:1
[E::idx_find_and_load] Could not retrieve index file for 'example.vcf.gz'
Traceback (most recent call last):
File "/home/mylinux/.local/bin/smc++", line 8, in
sys.exit(main())
File "/home/mylinux/.local/lib/python3.10/site-packages/smcpp/frontend/console.py", line 28, in main
cmds[args.command].main(args)
File "/home/mylinux/.local/lib/python3.10/site-packages/smcpp/commands/vcf2smc.py", line 134, in main
raise RuntimeError("Distinguished lineages not found in data?")
RuntimeError: Distinguished lineages not found in data?

mylinux@ChiragsPC:~/smcppdata$ smc++ vcf2smc -d NA12878 NA12879 example.vcf.gz chr1.smc.gz chr1 CEU:NA12878,NA12879
2028 smcpp.commands.vcf2smc WARNING Neither missing cutoff (-c) or mask (-m) has been specified. This means that stretches of the chromosome that do not have any VCF entries (for example, centromeres) will be interpreted as homozygous recessive.
2029 smcpp.commands.vcf2smc INFO Population 1:
2029 smcpp.commands.vcf2smc INFO Distinguished lineages: NA12878:0, NA12879:1
2029 smcpp.commands.vcf2smc INFO Undistinguished lineages: NA12878:1, NA12879:0
[E::idx_find_and_load] Could not retrieve index file for 'example.vcf.gz'
Traceback (most recent call last):
File "/home/mylinux/.local/bin/smc++", line 8, in
sys.exit(main())
File "/home/mylinux/.local/lib/python3.10/site-packages/smcpp/frontend/console.py", line 28, in main
cmds[args.command].main(args)
File "/home/mylinux/.local/lib/python3.10/site-packages/smcpp/commands/vcf2smc.py", line 134, in main
raise RuntimeError("Distinguished lineages not found in data?")
RuntimeError: Distinguished lineages not found in data?

mylinux@ChiragsPC:~/smcppdata$ for i in {7..9};
do smc++ vcf2smc -d NA1287$i NA1287$i example.vcf.gz out.$i.txt chr1 NA12877 NA12878 NA12890;
done
usage: smc++ vcf2smc [-h] [-v] [--cores CORES] [-d sample_id sample_id] [--length LENGTH] [--ignore-missing] [--missing-cutoff c] [--mask MASK] [--drop-first-last] vcf.gz out[.gz] contig pop1 [pop2]
smc++ vcf2smc: error: argument pop1: 'NA12877' should be a comma-separated list of sample ids preceded by a population identifier. See 'smc++ vcf2smc -h'.
usage: smc++ vcf2smc [-h] [-v] [--cores CORES] [-d sample_id sample_id] [--length LENGTH] [--ignore-missing] [--missing-cutoff c] [--mask MASK] [--drop-first-last] vcf.gz out[.gz] contig pop1 [pop2]
smc++ vcf2smc: error: argument pop1: 'NA12877' should be a comma-separated list of sample ids preceded by a population identifier. See 'smc++ vcf2smc -h'.
usage: smc++ vcf2smc [-h] [-v] [--cores CORES] [-d sample_id sample_id] [--length LENGTH] [--ignore-missing] [--missing-cutoff c] [--mask MASK] [--drop-first-last] vcf.gz out[.gz] contig pop1 [pop2]
smc++ vcf2smc: error: argument pop1: 'NA12877' should be a comma-separated list of sample ids preceded by a population identifier. See 'smc++ vcf2smc -h'.
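
The error above says that `pop1` must be a population identifier followed by a comma-separated list of sample IDs, as in `CEU:NA12878,NA12879` from the first command. Below is a minimal sketch of building that argument and the full argument list in the order given by the usage line. Both function names are hypothetical helpers, not part of smc++, and the label `CEU` for the three samples in the loop is an assumption.

```python
def population_spec(pop_id, sample_ids):
    """Build a 'POP:id1,id2,...' string (hypothetical helper, not part of smc++)."""
    sample_ids = list(sample_ids)
    if not pop_id:
        raise ValueError("population identifier must not be empty")
    if not sample_ids:
        raise ValueError("at least one sample id is required")
    return pop_id + ":" + ",".join(sample_ids)


def vcf2smc_argv(vcf, out, contig, pop_id, sample_ids, distinguished=None):
    """Assemble arguments in the order shown by the usage line:
    smc++ vcf2smc [-d sample_id sample_id] vcf.gz out[.gz] contig pop1
    """
    argv = ["smc++", "vcf2smc"]
    if distinguished is not None:
        first, second = distinguished  # -d takes exactly two sample ids
        argv += ["-d", first, second]
    argv += [vcf, out, contig, population_spec(pop_id, sample_ids)]
    return argv


# Prints the corrected form of each command from the loop; nothing is executed.
for i in (7, 8, 9):
    sample = "NA1287%d" % i
    print(" ".join(vcf2smc_argv(
        "example.vcf.gz", "out.%d.txt" % i, "chr1",
        "CEU", ["NA12877", "NA12878", "NA12890"],  # "CEU" is an assumed label
        distinguished=(sample, sample),
    )))
```

This only fixes the shape of the `pop1` argument. It does not check that the sample IDs or the contig exist in the VCF.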

smc++ vcf2smc example.vcf.gz chr1.smc.gz chr1 CEU:NA1885,NA3861
827 smcpp.commands.vcf2smc WARNING Neither missing cutoff (-c) or mask (-m) has been specified. This means that stretches of the chromosome that do not have any VCF entries (for example, centromeres) will be interpreted as homozygous recessive.
827 smcpp.commands.vcf2smc INFO Population 1:
827 smcpp.commands.vcf2smc INFO Distinguished lineages: NA1885:0, NA1885:1
827 smcpp.commands.vcf2smc INFO Undistinguished lineages: NA3861:0, NA3861:1
Traceback (most recent call last):
File "/home/exouser/.local/bin/smc++", line 8, in
sys.exit(main())
File "/home/exouser/.local/lib/python3.8/site-packages/smcpp/frontend/console.py", line 28, in main
cmds[args.command].main(args)
File "/home/exouser/.local/lib/python3.8/site-packages/smcpp/commands/vcf2smc.py", line 128, in main
vcf = VariantFile(args.vcf)
File "pysam/libcbcf.pyx", line 4117, in pysam.libcbcf.VariantFile.init
File "pysam/libcbcf.pyx", line 4347, in pysam.libcbcf.VariantFile.open
ValueError: invalid file b'example.vcf.gz' (mode=b'r') - is it VCF/BCF format?
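
One possibility for the "is it VCF/BCF format?" error is that the file on disk is not really a compressed VCF, for example an HTML page saved from the GitHub "blob" link instead of the raw file. This is an assumption: nothing in the output above confirms it. A minimal sketch to check the first bytes of the file follows. `compression_kind` is a hypothetical helper. It relies on the gzip magic bytes `1f 8b` and on the BGZF convention of a `BC` extra subfield at the start of the gzip extra field.

```python
def compression_kind(path):
    """Classify a file as 'bgzf', 'gzip', or 'not gzip' from its first bytes.

    Hypothetical helper; assumes the BGZF 'BC' subfield comes first in the
    gzip extra field, which is how bgzip writes it.
    """
    with open(path, "rb") as fh:
        head = fh.read(14)
    if head[:2] != b"\x1f\x8b":
        return "not gzip"  # e.g. plain text or an HTML page
    # Byte 3 holds the gzip flags; bit 0x04 (FEXTRA) means an extra field follows.
    has_extra = len(head) >= 4 and bool(head[3] & 0x04)
    # With FEXTRA set, bytes 12-13 are the first subfield identifier.
    if has_extra and head[12:14] == b"BC":
        return "bgzf"
    return "gzip"
```

If `compression_kind("example.vcf.gz")` returns "not gzip", downloading the file again from the raw link would be the first thing to try.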

@willright28, could you kindly send me the header from your vcf.gz file and, if possible, an example data set from your original data, so that I can make the necessary changes?

@terhorst @willright28 I'm using the Ubuntu Linux app on Windows 10.

Regards,
Thank you
