Fine-mapping
easyfinemap supports three types of fine-mapping methods: LD-free, LD-based, and function-based. LD-free methods do not require LD information, LD-based methods rely on LD information, and function-based methods require functional annotation information. Below are the specific software supported for each of these three methods in easyfinemap. The principles of these software are not explained here and can be referred to their respective documentation.
- LD-free fine-mapping
- LD-based fine-mapping
- Function-based fine-mapping
LD-free fine-mapping¶
$ easyfinemap fine-mapping \
-m abf \
--credible-threshold 0.95 \
PH251.txt.gz \
PH251.distance.loci.txt \
PH251.distance.leadsnp.txt \
PH251.abf.txt
- GWAS summary statistics file
- loci file
- lead SNPs file
- output file
The ABF method was referenced from the article by Asimit, J. L. et al. in Eur J Hum Genet (2016).
where w is variance prior (parameter --var-prior
), usually set to 0.15 for quantitative traits
and 0.2 for binary traits.
the posterior probability of each variant being causal is calculated
using the formula:
PP(causal) = ABF / sum(all_abfs)
Option --credible-threshold
is used to specify the credible threshold, which is used to determine the credible set. The default value is 0.95. The credible set is determined by the posterior probability of each variant being causal. The variants with the highest posterior probability are included in the credible set until the sum of the posterior probability of the variants in the credible set is greater than or equal to the credible threshold. Will output all variants if the credible threshold is set to None.
LD-based fine-mapping¶
$ easyfinemap fine-mapping \
-m finemap \
--ldref ./EUR.valid.chr{chrom} \
--use-ref-eaf \
--credible-threshold 0.95 \
-n 1000 \
./PH251.txt.gz \
PH251.distance.loci.txt \
PH251.distance.leadsnp.txt \
PH251.abf
Please note that if the summary statistics do not include EAF (Effect Allele Frequency) information, you need to use the --use-ref-eaf
parameter to use the LD reference's EAF as the EAF in order to avoid any errors.
Function-based fine-mapping¶
Users can also perform fine-mapping using the prior probabilities provided by PolyFun. This method requires users to provide functional annotation information, such as gene expression levels and chromatin states. The specific usage details can be found in the PolyFun documentation.
To perform PolyFun's fine-mapping using easyfinemap, it is straightforward. Simply specify the --prior-file
parameter in the command line, for example:
$ easyfinemap fine-mapping \
-m polyfun_finemap \
--prior-file ./prior.txt \
--credible-threshold 0.95 \
PH251.txt.gz \
PH251.distance.loci.txt \
PH251.distance.leadsnp.txt \
PH251.abf.txt
The prior probability file provided by PolyFun is in Parquet format. To facilitate GitHub uploading, it is divided into two files: chr1-7 and chr8-22. Users need to merge these two files into one and then specify the path to this file using the --prior-file
parameter.
The format is as follows:
CHR BP SNP A1 A2 snpvar_bin
1 10177 rs367896724 A AC 0.0
1 10352 rs201106462 T TA 0.0
1 10511 rs534229142 G A 0.0
1 10616 1:10616_CCGCCGTTGCAAAGGCGCGCCG_C CCGCCGTTGCAAAGGCGCGCCG C 0.0
1 11008 rs575272151 C G 0.0
1 11012 rs544419019 C G 0.0
1 13110 rs540538026 G A 0.0
1 13116 rs62635286 T G 0.0
1 13118 rs62028691 A G 0.0
Fine-mapping with conditional analysis¶
If users have used conditional analysis to determine the lead SNP of a locus, they can utilize the --conditional
parameter to perform fine-mapping using the results of conditional analysis. For example:
$ easyfinemap fine-mapping \
-m finemap \
--credible-threshold 0.95 \
--conditional \
--ldref ./EUR.valid.chr{chrom} \
--use-ref-eaf \
-n 1000 \
PH251.txt.gz \
PH251.conditional.loci.txt \
PH251.conditional.leadsnp.txt \
PH251.abf.txt
Fine-mapping with multiple methods¶
In summary, easyfinemap supports multiple fine-mapping methods. Users can specify a single method or multiple methods to perform fine-mapping simultaneously by using multiple -m
parameters in the command line. For example:
$ easyfinemap fine-mapping \
-m abf -m finemap \
--ldref ./EUR.valid.chr{chrom} \
--use-ref-eaf \
--credible-threshold 0.95 \
--credible-method finemap \
-n 1000 \
./PH251.txt.gz \
PH251.distance.loci.txt \
PH251.distance.leadsnp.txt \
PH251.abf
-m
all to perform fine-mapping using all supported methods. For example:
$ easyfinemap fine-mapping \
-m all \
--ldref ./EUR.valid.chr{chrom} \
--use-ref-eaf \
--credible-threshold 0.95 \
--credible-method finemap \
-n 1000 \
./PH251.txt.gz \
PH251.distance.loci.txt \
PH251.distance.leadsnp.txt \
PH251.abf
--credible-method
parameter to determine which method should be used to determine the credible set.
Output¶
The output file is a tab-delimited text file. The first line is the header, which contains the following columns:
$ head PH251.finemap.txt
SNPID CHR BP rsID EA NEA EAF MAF BETA SE P PP_FINEMAP LEAD_SNP
22-29451671-A-G 22 29451671 rs4823006 G A -0.0241 0.0037 7.99e-11 0.917195 22-29451671-A-G
22-29449477-A-G 22 29449477 rs2294239 G A -0.0243 0.004 8.26e-10 0.0721721 22-29451671-A-G
PP_FINEMAP
column is the posterior probability of each variant being causal. The LEAD_SNP
column is the lead SNP of the locus.
Other parameters¶
--max-causal
parameter is used to specify the maximum number of causal SNPs. The default value is 1.
--cond-snps-wind-kb
parameter is used to specify the conditional SNPs window size, in kb. The default value is 10000.
--threads
parameter is used to specify the number of threads. The default value is 1.
Help information¶
$ easyfinemap fine-mapping -h
────────────────────────────────── EasyFinemap ───────────────────────────────────
Version: 0.4.0
Author: Jianhua Wang
Email: jianhua.mert@gmail.com
Usage: easyfinemap fine-mapping [OPTIONS] SUMSTATS_PATH LOCI_PATH
LEAD_SNPS_PATH OUTFILE
Fine mapping.
╭─ Arguments ────────────────────────────────────────────────────────────────────╮
│ * sumstats_path TEXT The path to the GWAS summary statistics file. │
│ [default: None] │
│ [required] │
│ * loci_path TEXT The path to the loci file, generated by │
│ get-loci command. │
│ [default: None] │
│ [required] │
│ * lead_snps_path TEXT The path to the lead SNPs file, generated by │
│ get-loci command. │
│ [default: None] │
│ [required] │
│ * outfile TEXT The output file. [default: None] [required] │
╰────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ──────────────────────────────────────────────────────────────────────╮
│ * --methods -m [abf|finemap|paintor| The methods to use. │
│ caviarbf|susie|polyfu [default: None] │
│ n_susie|polyfun_finem [required] │
│ ap|all] │
│ --var-prior FLOAT The prior variance for │
│ the aBF method. │
│ [default: 0.2] │
│ --conditional -c Whether to use │
│ conditional mode. │
│ --prior-file TEXT The path to the prior │
│ file. │
│ [default: None] │
│ --sample-size -n INTEGER The sample size for │
│ conditional mode. │
│ [default: None] │
│ --ldref TEXT The path to the LD │
│ reference file. │
│ [default: None] │
│ --cond-snps-wind-kb INTEGER The conditional SNPs │
│ window size, in kb. │
│ [default: 10000] │
│ --max-causal INTEGER The maximum number of │
│ causal SNPs. │
│ [default: 1] │
│ --credible-threshold FLOAT The credible │
│ threshold. │
│ [default: None] │
│ --credible-method TEXT The fine-mapping │
│ method for credible │
│ set. │
│ [default: None] │
│ --use-ref-eaf Whether to use the │
│ reference panel EAF. │
│ --threads -t INTEGER The number of threads. │
│ [default: 1] │
│ --help -h Show this message and │
│ exit. │
╰────────────────────────────────────────────────────────────────────────────────╯