You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In some cases metaber7 flips the sign of the meta-analysis result to keep the effect for the intended A1/A2, but does not also flip signs in the Direction field.
Details
The workflow for metaber7 is roughly:
Built a metal script for the meta-analysis
Read through each input GWAS file to build an index of descriptive info on each variant (e.g. chr/bp, a1/a2, info, freq). For the sake of tracking frequency, A1/A2 are taken from the first time the variant is encountered.
Create a "clean" copy of each input GWAS (excludes extreme ORs and SNPs with A1/A2 discordant with the indexed A1/A2.
Run meta-analysis using metal with the clean GWASs
Read through metal results, and write daner output with the meta-analysis results plus the info that metal isn't set to track (case/control freq, info, sample size, etc).
For step 5, it is possible that the order of A1/A2 reported by metal is not the order of A1/A2 indexed by metaber7 in step 2 (see Background below). In order to ensure that the allele frequency information added to the final output is correct, metaber7 checks for this swap. If the alleles have been swapped, metaber7 flips the sign of the effect in the meta-analysis results to match the A1/A2 order indexed in step 2. This maintains A1/A2 in the order indexed from the output GWAS, and reports the corresponding freq and effect size.
This is all reasonable, but the meta-analysis results also include a "Direction" field that reports the sign of the effect from each cohort in the meta-analysis. This field gets passed through directly from the metal output, and thus always reports the sign for the A1/A2 ordering in metal even when the the final output is reported a swapped order with the sign of the effect size flipped.
Example
Adapted from observed data, with numbers rounded/obscured to protect actual results
Dataset 1:
CHR SNP BP A1 A2 FRQ_A_500 FRQ_U_10000 INFO OR SE P ngt
1 rs123456789 87654321 G T 0.01000000000000000 0.02000000000000000 0.90000 1.0100000000000 0.3000000 0.9000000 0
Dataset 2:
CHR SNP BP A1 A2 FRQ_A_1500 FRQ_U_10000 INFO OR SE P ngt
1 rs123456789 87654321 G T 0.01000000000000000 0.02000000000000000 0.90000 0.9500000000000 0.2000000 0.7000000 0
Metal output:
MarkerName Allele1 Allele2 Effect StdErr P-value Direction HetISq HetChiSq HetDf HetPVal
rs123456789 t g 0.0900 0.2000 0.500 -+ 0.0 0.100 1 0.7000
Note: effect of T allele is positive, with Direction "-+" in the two cohorts, matching the above input with A1/A2 swapped.
Daner output:
CHR SNP BP A1 A2 FRQ_A_5000 FRQ_U_10000 INFO OR SE P ngt Direction HetISqt HetDf HetPVa Nca Nco Neff
1 rs123456789 87654321 G T 0.01000 0.02000 0.900 0.9000 0.2000 0.5000 0 -+ 0.0 1 0.7000 2000 20000 8000.00
Note: A1/A2 swapped to match input GWAS, and reports negative effect (OR < 1) of G allele. Direction field still reports "-+", which is result for T allele rather than G allele.
Possible Solutions
The relevant code block in metaber7 for step 5 is lines 923-1056. Should be relatively straight-forward to fix in one of two ways:
Update the Direction field when flipping the effect size from metal. This update could be done along the effect size flip at line 956. It would require parsing the Direction field string to make the relevant substitutions. Printing this updated string would occur at line 1019 (in place of $cells[6]).
Take A1/A2 order as reported by metal, and flip the indexed allele frequency to match. In place of the effect size flip at 956, frequencies fca{}/nca{} and fco{}/nco{} currently printed lins 996-1000 wound need to be updated. Also, the A1/A2 from metal (e.g. $a1_name) would need to be printed rather than the indexed alleles (e.g. $a1{$snp_name}) at line 975-976.
Not sure which of these might be preferred based on any impacts on downstream analysis/processing.
Additional Background
This issue arises in part due to strange allele swapping behavior from metal. While metal documentation states that the reported Allele1 is the first allele from the first occurrence of the marker in the input files, it appears this isn't actually the case. This alternative instruction file instead says the "lowest numeric reference allele" with be used if the input files differ, which seems closer to accurate, but still not quite right.
The relevant code block from the metal source code is:
This gets run for the first occurrence of every marker.
If it returns true, then FlipAllele(al1) and FlipAllele(al2) are run again prior to final output. Because this code actually alters al1 and al2, any switching from al1.Swap(al2) will propagate though to metal results files.
By my read, this means markers where the first occurrence has alleles C/A, C/T, G/A, G/C, G/T, and T/A markers will have their allele order swapped, while A/C, A/G, A/T, C/G, T/C, and T/G markers will have their allele orders preserved. Looking through some results from metal seems to confirm this. It's also consistent with the motivating example above that swaps G/T to T/G.
The text was updated successfully, but these errors were encountered:
Thank you for the explanation about how METAL actually handles the flipping of the alleles. It was baffling me when I was comparing my results with what I thought they should look like after referencing the METAL documentation. Your interpretation of METAL's behavior is consistent with what I see in my results.
Do you have any sense of why METAL handles the flipping of the alleles this way?
Summary
In some cases
metaber7
flips the sign of the meta-analysis result to keep the effect for the intended A1/A2, but does not also flip signs in the Direction field.Details
The workflow for
metaber7
is roughly:metal
script for the meta-analysismetal
with the clean GWASsmetal
results, and write daner output with the meta-analysis results plus the info thatmetal
isn't set to track (case/control freq, info, sample size, etc).For step 5, it is possible that the order of A1/A2 reported by
metal
is not the order of A1/A2 indexed bymetaber7
in step 2 (see Background below). In order to ensure that the allele frequency information added to the final output is correct,metaber7
checks for this swap. If the alleles have been swapped,metaber7
flips the sign of the effect in the meta-analysis results to match the A1/A2 order indexed in step 2. This maintains A1/A2 in the order indexed from the output GWAS, and reports the corresponding freq and effect size.This is all reasonable, but the meta-analysis results also include a "Direction" field that reports the sign of the effect from each cohort in the meta-analysis. This field gets passed through directly from the
metal
output, and thus always reports the sign for the A1/A2 ordering inmetal
even when the the final output is reported a swapped order with the sign of the effect size flipped.Example
Adapted from observed data, with numbers rounded/obscured to protect actual results
Dataset 1:
Dataset 2:
Metal output:
Note: effect of T allele is positive, with Direction "-+" in the two cohorts, matching the above input with A1/A2 swapped.
Daner output:
Note: A1/A2 swapped to match input GWAS, and reports negative effect (OR < 1) of G allele. Direction field still reports "-+", which is result for T allele rather than G allele.
Possible Solutions
The relevant code block in
metaber7
for step 5 is lines 923-1056. Should be relatively straight-forward to fix in one of two ways:Update the Direction field when flipping the effect size from
metal
. This update could be done along the effect size flip at line 956. It would require parsing the Direction field string to make the relevant substitutions. Printing this updated string would occur at line 1019 (in place of$cells[6]
).Take A1/A2 order as reported by
metal
, and flip the indexed allele frequency to match. In place of the effect size flip at 956, frequenciesfca{}/nca{}
andfco{}/nco{}
currently printed lins 996-1000 wound need to be updated. Also, the A1/A2 frommetal
(e.g.$a1_name
) would need to be printed rather than the indexed alleles (e.g.$a1{$snp_name}
) at line 975-976.Not sure which of these might be preferred based on any impacts on downstream analysis/processing.
Additional Background
This issue arises in part due to strange allele swapping behavior from
metal
. Whilemetal
documentation states that the reportedAllele1
is the first allele from the first occurrence of the marker in the input files, it appears this isn't actually the case. This alternative instruction file instead says the "lowest numeric reference allele" with be used if the input files differ, which seems closer to accurate, but still not quite right.The relevant code block from the
metal
source code is:This gets run for the first occurrence of every marker.
If it returns
true
, thenFlipAllele(al1)
andFlipAllele(al2)
are run again prior to final output. Because this code actually altersal1
andal2
, any switching fromal1.Swap(al2)
will propagate though tometal
results files.By my read, this means markers where the first occurrence has alleles C/A, C/T, G/A, G/C, G/T, and T/A markers will have their allele order swapped, while A/C, A/G, A/T, C/G, T/C, and T/G markers will have their allele orders preserved. Looking through some results from
metal
seems to confirm this. It's also consistent with the motivating example above that swaps G/T to T/G.The text was updated successfully, but these errors were encountered: