credtools
Multi-ancestry fine-mapping pipeline.
- Documentation: https://Jianhua-Wang.github.io/credtools
- GitHub: https://github.com/Jianhua-Wang/credtools
- PyPI: https://pypi.org/project/credtools/
- Free software: MIT
Features
- Whole-genome preprocessing: Start from raw GWAS summary statistics and genotype data
  - Standardize and munge summary statistics from various formats
  - Prepare LD matrices and fine-mapping inputs automatically
- Multi-ancestry fine-mapping: Support for multiple fine-mapping tools (SuSiE, FINEMAP, etc.)
- Meta-analysis capabilities: Combine results across populations and cohorts
- Quality control: Built-in QC metrics and visualizations
- Command-line interface: Easy-to-use CLI for all operations
Installation
Basic Installation
Install the released package from PyPI (see the PyPI link above):
pip install credtools
Install with uv
Quick Start
Command Line Usage
# Complete workflow: from whole-genome data to fine-mapping results
# Step 1: Standardize summary statistics
credtools munge population_config.txt output/munged/
# Step 2: Identify loci, chunk data, and extract LD matrices
credtools chunk output/munged/sumstat_info_updated.txt output/chunks/
# Step 3: Run fine-mapping pipeline
credtools pipeline output/chunks/loci_list.txt output/results/
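For scripted runs, the three commands above can be chained from Python. The following is a minimal sketch and not part of credtools itself. The helper names `build_workflow_commands` and `run_workflow` are hypothetical, and the intermediate file names (`sumstat_info_updated.txt`, `loci_list.txt`) are copied from the example above rather than taken from a documented guarantee.

```python
import subprocess
from pathlib import PurePosixPath


def build_workflow_commands(config, outdir):
    """Return the three CLI invocations from the example above as argv lists."""
    out = PurePosixPath(outdir)
    munged = out / "munged"
    chunks = out / "chunks"
    results = out / "results"
    return [
        # Step 1: standardize summary statistics
        ["credtools", "munge", str(config), f"{munged}/"],
        # Step 2: identify loci, chunk data, extract LD matrices
        ["credtools", "chunk", f"{munged}/sumstat_info_updated.txt", f"{chunks}/"],
        # Step 3: run the fine-mapping pipeline
        ["credtools", "pipeline", f"{chunks}/loci_list.txt", f"{results}/"],
    ]


def run_workflow(config, outdir):
    """Run each step in order, stopping at the first failure."""
    for argv in build_workflow_commands(config, outdir):
        # check=True raises CalledProcessError on a non-zero exit code
        subprocess.run(argv, check=True)
```

Calling `run_workflow("population_config.txt", "output")` would then issue the same three commands as the shell example, assuming `credtools` is on the `PATH`.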
Preprocessing Workflow
credtools now supports starting from whole-genome summary statistics and genotype data, eliminating the need for manual preprocessing:
Step 1: Munge Summary Statistics (credtools munge)
- Purpose: Standardize and clean GWAS summary statistics from various formats
- Features:
  - Automatic header detection and mapping
  - Data validation and quality control
  - Support for multiple file formats
- Input: Raw GWAS files with various column headers
- Output: Standardized .munged.txt.gz files
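To illustrate what header detection and basic validation could look like, here is a small sketch using only the standard library. It is an assumption-laden illustration, not credtools' actual implementation: the alias table `HEADER_ALIASES`, the standardized column names, and the function names are all hypothetical, and the real tool may recognize different headers and apply stricter checks.

```python
import csv
import io

# Hypothetical alias table: standardized name -> lower-cased header spellings.
# credtools' real mapping may use different names and cover more variants.
HEADER_ALIASES = {
    "CHR": {"chr", "chrom", "chromosome", "#chrom"},
    "BP": {"bp", "pos", "position", "base_pair_location"},
    "SNP": {"snp", "rsid", "variant_id", "markername"},
    "BETA": {"beta", "effect", "b"},
    "SE": {"se", "stderr", "standard_error"},
    "P": {"p", "pval", "p_value", "pvalue"},
}


def map_headers(raw_headers):
    """Map raw column names to standardized ones; unknown columns are dropped."""
    mapping = {}
    for raw in raw_headers:
        key = raw.strip().lower()
        for standard, aliases in HEADER_ALIASES.items():
            # Keep only the first raw column that matches each standard name
            if key in aliases and standard not in mapping.values():
                mapping[raw] = standard
                break
    return mapping


def munge_rows(text, delimiter="\t"):
    """Yield rows with standardized keys, skipping rows with an invalid p-value."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    mapping = map_headers(reader.fieldnames or [])
    for row in reader:
        out = {std: row[raw] for raw, std in mapping.items()}
        try:
            p = float(out["P"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric p-value
        if 0.0 < p <= 1.0:
            yield out
```

The same idea extends to other checks (for example, positive standard errors or valid allele codes); only the p-value check is shown to keep the sketch short.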
Step 2: Chunk Loci (credtools chunk)
- Purpose: Identify independent loci, create regional chunks, and extract LD matrices
- Features:
  - Distance-based independent SNP identification
  - Cross-ancestry loci coordination
  - Configurable significance thresholds
  - Automatic LD matrix extraction when ld_ref is provided in population config
- Input: Munged summary statistics files (or population config with ld_ref)
- Output: Locus-specific chunked files, LD matrices, and credtools-ready input files
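Distance-based identification of independent SNPs is commonly done with a greedy pass over the significant variants. The sketch below shows that general technique. It is not credtools' code, and the function names, the `5e-8` threshold, and the 500 kb window are assumptions chosen for illustration; the tool's defaults may differ.

```python
def find_lead_snps(snps, p_threshold=5e-8, window_bp=500_000):
    """Greedy distance-based selection of independent lead SNPs.

    snps: iterable of (chrom, pos, pvalue) tuples.
    A SNP becomes a lead if it is significant and no already chosen lead
    on the same chromosome lies within window_bp of it.
    """
    significant = [s for s in snps if s[2] <= p_threshold]
    significant.sort(key=lambda s: s[2])  # most significant first
    leads = []
    for chrom, pos, p in significant:
        if all(c != chrom or abs(pos - lp) > window_bp for c, lp, _ in leads):
            leads.append((chrom, pos, p))
    # Return leads in genomic order
    return sorted(leads, key=lambda s: (s[0], s[1]))


def leads_to_loci(leads, flank_bp=500_000):
    """Turn each lead SNP into a (chrom, start, end) region around it."""
    return [(chrom, max(0, pos - flank_bp), pos + flank_bp) for chrom, pos, _ in leads]
```

Because the most significant variant is always picked first, weaker signals inside its window are absorbed into the same locus rather than reported separately.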
Multi-Ancestry Support
- Consistent loci definition: Union approach across ancestries
- Flexible input formats: Support for various GWAS summary statistics formats
- Coordinated processing: Ensure compatibility across populations
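The "union approach" is not specified further here. One plausible reading is an interval union: loci found in any ancestry are pooled, and overlapping regions are merged so that every population is analysed on the same boundaries. The sketch below implements that reading. The function name `union_loci` and the ancestry labels are hypothetical, and credtools may define the union differently.

```python
def union_loci(loci_by_ancestry):
    """Merge per-ancestry (chrom, start, end) loci into one shared set.

    Overlapping or touching regions on the same chromosome are merged,
    so every ancestry is later analysed on identical locus boundaries.
    """
    all_loci = sorted(
        locus for loci in loci_by_ancestry.values() for locus in loci
    )
    merged = []
    for chrom, start, end in all_loci:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous region instead of starting a new one
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Example with made-up coordinates for two ancestries
shared = union_loci({
    "EUR": [(1, 100, 200), (2, 50, 80)],
    "EAS": [(1, 150, 300), (1, 500, 600)],
})
```

A locus that is significant in only one ancestry still appears in the shared set, which is what distinguishes a union from an intersection-based definition.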
Documentation
For detailed documentation, see https://Jianhua-Wang.github.io/credtools