Quickstart

This quickstart uses the small files in exampledata/test_mock_data. It starts with raw summary statistics and PLINK reference files, then runs the CREDTOOLS workflow.

Run the commands from the repository root.

1. Create a Population Config

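The population config is a tab-separated file. Each row is one study or cohort. The columns give the population label (popu), the cohort name (cohort), the sample size (sample_size), the path to the raw summary statistics (path), and the LD reference (ld_ref). Columns must be separated by tab characters, so if you copy the block below, make sure the tabs are not converted to spaces.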

cat > /tmp/credtools_population_config.tsv <<'EOF'
popu    cohort  sample_size path    ld_ref
EUR cohort1 10000   exampledata/test_mock_data/EUR_all_loci.sumstats    exampledata/test_mock_data/EUR_all_loci
AFR cohort1 8000    exampledata/test_mock_data/AFR_all_loci.sumstats    exampledata/test_mock_data/AFR_all_loci
EAS cohort1 12000   exampledata/test_mock_data/EAS_all_loci.sumstats    exampledata/test_mock_data/EAS_all_loci
EOF
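
Because copied heredocs often lose their tabs, it can help to confirm the field count before running anything else. The following is a minimal sketch, not part of CREDTOOLS: check_tsv_columns is a hypothetical helper name, and the expected count of 5 simply matches the header shown above.

```shell
# Hypothetical helper (not a CREDTOOLS command): report every line whose
# tab-separated field count differs from the expected number.
# Prints nothing when all lines have the expected number of fields.
check_tsv_columns() {
  file=$1
  expected=$2
  awk -F '\t' -v n="$expected" \
    'NF != n { printf "line %d: %d fields\n", NR, NF }' "$file"
}

# Example call for the config created above:
#   check_tsv_columns /tmp/credtools_population_config.tsv 5
```

If the helper prints any lines, the separators on those lines are probably spaces rather than tabs.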

The ld_ref column is a PLINK prefix. For example, exampledata/test_mock_data/EUR_all_loci points to:

exampledata/test_mock_data/EUR_all_loci.bed
exampledata/test_mock_data/EUR_all_loci.bim
exampledata/test_mock_data/EUR_all_loci.fam
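
A quick way to catch a mistyped prefix is to test for all three files before running the workflow. This is a small sketch under the assumption that each ld_ref is a plain file prefix as described above; missing_plink_files is a hypothetical helper name, not a CREDTOOLS command.

```shell
# Hypothetical helper: print each PLINK file (.bed, .bim, .fam) that is
# missing for the given prefix. Prints nothing when all three exist.
missing_plink_files() {
  prefix=$1
  for ext in bed bim fam; do
    [ -f "$prefix.$ext" ] || echo "$prefix.$ext"
  done
}

# Example: check every ld_ref prefix (column 5) in the config, skipping
# the header row. Assumes the config is tab-separated.
#   awk -F '\t' 'NR > 1 { print $5 }' /tmp/credtools_population_config.tsv |
#     while read -r prefix; do missing_plink_files "$prefix"; done
```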

2. Clean the Summary Statistics

credtools munge /tmp/credtools_population_config.tsv /tmp/credtools_munged --force

This creates one standardized summary statistics file per config row, plus an updated config:

/tmp/credtools_munged/
- EUR_cohort1.munged.txt.gz
- AFR_cohort1.munged.txt.gz
- EAS_cohort1.munged.txt.gz
- sumstat_info_updated.txt
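
To see what the standardized files look like, you can print the first few rows of each one. The sketch below makes no assumption about the column names inside the munged files; preview_gz is a hypothetical helper name, and it only relies on the files being gzip-compressed text, as their .txt.gz suffix suggests.

```shell
# Hypothetical helper: print the first rows of every .gz file in a
# directory. The second argument is the number of rows (default 3).
preview_gz() {
  dir=$1
  rows=${2:-3}
  for f in "$dir"/*.gz; do
    [ -e "$f" ] || continue          # skip if the glob matched nothing
    echo "== $f"
    gzip -cd "$f" | head -n "$rows"  # decompress to stdout, keep the top rows
  done
}

# Example call for the munge output directory:
#   preview_gz /tmp/credtools_munged 5
```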

3. Split the Data Into Loci

credtools chunk \
  /tmp/credtools_munged/sumstat_info_updated.txt \
  /tmp/credtools_chunks \
  --threads 2

This step identifies loci, cuts the summary statistics into locus-sized files, and extracts LD matrices because ld_ref is present.

The most important output is the loci list, which the next step takes as its input:

/tmp/credtools_chunks/loci_list.txt
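
Before passing the loci list on, it can be useful to look at its columns. The sketch below assumes that loci_list.txt is tab-separated with a header row, which this page does not state explicitly; tsv_columns is a hypothetical helper name, not a CREDTOOLS command.

```shell
# Hypothetical helper: print the header fields of a tab-separated file,
# one per line, prefixed with their 1-based column index.
tsv_columns() {
  awk -F '\t' 'NR == 1 { for (i = 1; i <= NF; i++) print i ": " $i; exit }' "$1"
}

# Example call for the chunk output:
#   tsv_columns /tmp/credtools_chunks/loci_list.txt
```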

4. Run the Pipeline

credtools pipeline \
  /tmp/credtools_chunks/loci_list.txt \
  /tmp/credtools_results \
  --tool susie \
  --meta-method meta_all

The pipeline runs meta-analysis, QC, and fine-mapping.

5. Check the Results

The results directory should contain a run-level log and one subdirectory per locus. Look for these files:

/tmp/credtools_results/
- overall_run_summary.log
- <locus_id>/
  - pips.txt.gz
  - causal_variants.txt.gz
  - credible_sets_summary.txt.gz
  - parameters.json
  - run_summary.log
  - expected_z.txt.gz
  - dentist_s.txt.gz
  - compare_maf.txt.gz
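
With many loci, checking each directory by hand gets tedious. The following sketch loops over the locus directories and reports any file from the list above that is absent. It assumes every subdirectory of the results directory is a locus and that all eight files are expected for every locus, which may not hold for every configuration; check_locus_outputs is a hypothetical helper name.

```shell
# Hypothetical helper: for each subdirectory of a results directory,
# print every expected per-locus file that is missing.
check_locus_outputs() {
  results=$1
  for locus in "$results"/*/; do
    [ -d "$locus" ] || continue      # skip if the glob matched nothing
    for f in pips.txt.gz causal_variants.txt.gz \
             credible_sets_summary.txt.gz parameters.json run_summary.log \
             expected_z.txt.gz dentist_s.txt.gz compare_maf.txt.gz; do
      # "$locus" already ends with a slash because of the */ glob
      [ -f "$locus$f" ] || echo "missing: $locus$f"
    done
  done
}

# Example call for the pipeline output:
#   check_locus_outputs /tmp/credtools_results
```

No output means every locus directory contains the full set of files.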

Create a quick QC plot:

credtools plot \
  /tmp/credtools_results \
  --type summary \
  --output /tmp/credtools_results/qc_summary.png

What You Just Did

graph TD
    A[population config] --> B[munge]
    B --> C[clean summary stats]
    C --> D[chunk]
    D --> E[loci_list.txt]
    E --> F[pipeline]
    F --> G[PIPs and credible sets]
    F --> H[QC tables]
    H --> I[plot]

Next Step

If this worked, read Raw GWAS to Results for a slower walkthrough with more context. If you already have prepared locus files, read Existing Loci List.