QC Metrics Dictionary

CREDTOOLS QC is meant to answer one practical question: do the summary statistics and LD reference look consistent enough to trust fine-mapping?

Start with `qc.txt.gz`, then open the detailed files only for suspicious loci.

`qc.txt.gz`

Column	Meaning	How to read it
`popu`	population label	comes from `loci_list.txt`
`cohort`	cohort label	comes from `loci_list.txt`
`n_snps`	variants used in QC	small values can mean poor sumstats/LD overlap
`n_1e-5`	variants with p-value below `1e-5`	quick signal count
`n_5e-8`	variants with p-value below `5e-8`	genome-wide significant count
`maf_corr`	correlation between sumstats MAF and LD-reference MAF	low values suggest allele-frequency mismatch
`lambda_s`	RSS regularization estimate	high values suggest LD or summary-stat mismatch
`n_lambda_s_outlier`	count from LD-mismatch and marginal rules	variants flagged by the kriging RSS rules
`n_dentist_s_outlier`	count from Dentist-S	variants inconsistent with the lead-SNP LD pattern
`n_c1b_outlier`	optional C1b count	appears when C1b/adaptive QC is enabled

Look for patterns, not one number

A single odd metric is a prompt to inspect the locus. A pattern across many loci or one cohort is a stronger sign that the input panel, build, or allele coding needs attention.

`expected_z.txt.gz`

This file comes from the kriging RSS check.

Column	Meaning
`SNPID`	variant identifier
`z`	transformed z-score
`condmean`	expected z-score given other variants and LD
`condvar`	conditional variance
`z_std_diff`	standardized residual between observed and expected z-score
`logLR`	log likelihood ratio for allele-switch-like behavior
`lambda_s`	locus-level regularization estimate
`cohort`	combined `popu_cohort` label

The default C1, C2, and C1b rules are:

Rule	Default condition
C1 LD mismatch	`logLR > 2` and `abs(z) > 2`
C2 marginal	`abs(z) < 2`, `abs(z_std_diff) > 3`, and lead-SNP correlation above `0.8`
C1b high-z residual	`abs(z_std_diff) > 10` and `abs(z) > 2`

C1b is only counted when enabled with `--enable-c1b` or through adaptive QC.
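The default rules in the table can be reproduced on a loaded `expected_z.txt.gz` table. The following is a minimal sketch, not the CREDTOOLS implementation: the function name `flag_rss_rules` is hypothetical, and because the lead-SNP correlation is not a column of this file, it is passed in as a separate series (an assumption about where that value would come from). The toy values are invented for illustration.

```python
import pandas as pd


def flag_rss_rules(expected_z: pd.DataFrame, lead_corr: pd.Series) -> pd.DataFrame:
    """Apply the documented default C1, C2, and C1b conditions.

    `lead_corr` is assumed to hold each variant's correlation with the
    lead SNP, aligned to `expected_z` by index. It is NOT a column of
    expected_z.txt.gz.
    """
    z_abs = expected_z["z"].abs()
    resid_abs = expected_z["z_std_diff"].abs()

    flags = pd.DataFrame({"SNPID": expected_z["SNPID"]})
    # C1 LD mismatch: logLR > 2 and abs(z) > 2
    flags["C1"] = (expected_z["logLR"] > 2) & (z_abs > 2)
    # C2 marginal: abs(z) < 2, abs(z_std_diff) > 3, lead-SNP correlation > 0.8
    flags["C2"] = (z_abs < 2) & (resid_abs > 3) & (lead_corr > 0.8)
    # C1b high-z residual: abs(z_std_diff) > 10 and abs(z) > 2
    flags["C1b"] = (resid_abs > 10) & (z_abs > 2)
    return flags


# Toy data for illustration only.
expected_z = pd.DataFrame(
    {
        "SNPID": ["rs1", "rs2", "rs3", "rs4"],
        "z": [5.0, 1.0, 6.0, 0.5],
        "z_std_diff": [1.0, 4.0, 12.0, 0.2],
        "logLR": [3.0, 0.0, 0.5, 0.0],
    }
)
lead_corr = pd.Series([1.0, 0.9, 0.3, 0.1])

flags = flag_rss_rules(expected_z, lead_corr)
print(flags)
```

Keeping each rule as its own boolean column makes it easy to see which condition flagged a variant before deciding whether to enable C1b.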

`dentist_s.txt.gz`

Dentist-S asks whether a variant's marginal association is consistent with the lead variant and LD.

Column	Meaning
`SNPID`	variant identifier
`t_dentist_s`	Dentist-S test statistic
`-log10p_dentist_s`	evidence against consistency
`r2`	LD r² (squared correlation) with the lead variant
`cohort`	combined `popu_cohort` label

The default Dentist-S outlier rule is:

`-log10p_dentist_s >= 4` and `r2 >= 0.6`
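As a minimal sketch, the default rule can be applied to a loaded `dentist_s.txt.gz` table with a boolean mask. The function name `dentist_s_outliers` and its parameter names are hypothetical, and the toy rows are invented for illustration.

```python
import pandas as pd


def dentist_s_outliers(
    dentist: pd.DataFrame, min_log10p: float = 4.0, min_r2: float = 0.6
) -> pd.DataFrame:
    """Return rows meeting the documented default Dentist-S outlier rule."""
    mask = (dentist["-log10p_dentist_s"] >= min_log10p) & (dentist["r2"] >= min_r2)
    return dentist.loc[mask]


# Toy data for illustration only.
dentist = pd.DataFrame(
    {
        "SNPID": ["rs1", "rs2", "rs3", "rs4"],
        "-log10p_dentist_s": [5.0, 5.0, 4.0, 1.0],
        "r2": [0.9, 0.3, 0.6, 0.95],
    }
)

outliers = dentist_s_outliers(dentist)
print(outliers["SNPID"].tolist())
```

Both comparisons are inclusive, so a variant sitting exactly on both thresholds (like `rs3` in the toy data) is flagged.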

`compare_maf.txt.gz`

Column	Meaning
`SNPID`	variant identifier
`MAF_sumstats`	minor allele frequency from summary statistics
`MAF_ld`	minor allele frequency from the LD map `AF2` column
`cohort`	combined `popu_cohort` label

If the LD map has no `AF2` column, CREDTOOLS cannot compute this comparison and the detailed MAF file may be empty.
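To re-check a low `maf_corr` by hand, the two MAF columns can be correlated directly. This is a sketch under assumptions: it uses a Pearson correlation, which may not match how CREDTOOLS computes `maf_corr`, and the helper name `maf_correlation` is hypothetical. It returns NaN when fewer than two complete pairs exist, which covers the empty-file case described above.

```python
import pandas as pd


def maf_correlation(compare_maf: pd.DataFrame) -> float:
    """Pearson correlation between sumstats MAF and LD-reference MAF.

    Assumption: Pearson is used here for illustration; the tool's own
    definition of `maf_corr` may differ.
    """
    pairs = compare_maf[["MAF_sumstats", "MAF_ld"]].dropna()
    if len(pairs) < 2:
        return float("nan")  # nothing to compare, e.g. LD map without AF2
    return float(pairs["MAF_sumstats"].corr(pairs["MAF_ld"]))


# Toy data for illustration only.
compare_maf = pd.DataFrame(
    {
        "SNPID": ["rs1", "rs2", "rs3", "rs4"],
        "MAF_sumstats": [0.10, 0.20, 0.30, 0.40],
        "MAF_ld": [0.11, 0.19, 0.32, 0.38],
    }
)

corr = maf_correlation(compare_maf)

# Largest per-variant discrepancies are often more informative than the summary.
gap = (compare_maf["MAF_sumstats"] - compare_maf["MAF_ld"]).abs()
worst = compare_maf.assign(abs_diff=gap).sort_values("abs_diff", ascending=False)
print(corr)
print(worst.head())
```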

Outlier Removal Files

When `--remove-outlier` or `--adaptive-qc` is used, CREDTOOLS writes a `cleaned/` folder.

File	Use
`cleaned/outlier_snps.txt.gz`	variant-level outlier calls
`cleaned/outlier_removal_summary.txt.gz`	counts removed per locus and cohort
`cleaned/cleaned_loci_info.txt.gz`	input file for downstream fine-mapping on cleaned data

`--adaptive-qc` implies outlier removal. It first removes baseline C1, C2, and C3 outliers. If the cleaned locus still has high `lambda_s`, it adds C1b outliers from the original locus and recomputes QC.
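The two-stage decision described above can be sketched as a small function. Everything here is hypothetical: the function name, its arguments, and in particular `lambda_s_threshold`, since this page does not state what counts as a "high" `lambda_s`. The sketch only shows the control flow and leaves out the final QC recomputation.

```python
def adaptive_outliers(
    baseline: set[str],
    c1b: set[str],
    lambda_s_after_baseline: float,
    lambda_s_threshold: float,
) -> set[str]:
    """Illustrate the adaptive flow: baseline removal, then optional C1b.

    `lambda_s_threshold` is a placeholder; the real cutoff is not
    documented on this page.
    """
    removed = set(baseline)
    # Escalate only if the locus still looks inconsistent after baseline removal.
    if lambda_s_after_baseline > lambda_s_threshold:
        removed |= set(c1b)  # C1b calls come from the original locus
    return removed


# Toy values for illustration only.
still_high = adaptive_outliers({"rs1"}, {"rs9"}, 0.5, 0.1)
resolved = adaptive_outliers({"rs1"}, {"rs9"}, 0.05, 0.1)
print(sorted(still_high), sorted(resolved))
```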

Heterogeneity Metrics

Meta-analysis writes heterogeneity outputs before combining cohorts.

Metric	File	Meaning
LD fourth moment	`ld_4th_moment.txt.gz`	local LD structure around each variant
LD decay	`ld_decay.txt.gz`	average LD by distance bin
SNP missingness	`snp_missingness.txt.gz`	which cohorts contain each variant
Cochran-Q	`cochran_q.txt.gz`	effect-size heterogeneity across cohorts
Summary	`heterogeneity.txt.gz`	per-cohort rollup

Use `heterogeneity.txt.gz` for the first pass. Open detailed files when one cohort has unusually high missingness, high LD differences, or many heterogeneous SNPs.
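For readers unfamiliar with Cochran's Q, the textbook statistic for a single variant can be computed from per-cohort effect sizes and standard errors. This is a sketch of the standard inverse-variance formula, not a description of how CREDTOOLS fills `cochran_q.txt.gz`; the function name and inputs are assumptions.

```python
import numpy as np


def cochran_q(betas, ses):
    """Textbook Cochran's Q for one variant across cohorts.

    Returns (Q, degrees of freedom). Under homogeneity, Q is approximately
    chi-square distributed with k - 1 degrees of freedom for k cohorts.
    """
    betas = np.asarray(betas, dtype=float)
    ses = np.asarray(ses, dtype=float)
    weights = 1.0 / ses**2  # inverse-variance weights
    pooled = np.sum(weights * betas) / np.sum(weights)  # fixed-effect estimate
    q = float(np.sum(weights * (betas - pooled) ** 2))
    return q, len(betas) - 1


# Toy values: two cohorts with different effects and equal standard errors.
q, dof = cochran_q([0.1, 0.3], [0.1, 0.1])
print(q, dof)
```

A large Q relative to its degrees of freedom indicates that the cohorts disagree on the effect size more than sampling error alone would explain.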

Practical Review Flow

1. Sort `qc.txt.gz` by `lambda_s`, `n_lambda_s_outlier`, and `maf_corr`.
2. Pick the worst locus for each suspicious cohort.
3. Open `expected_z.txt.gz`, `dentist_s.txt.gz`, and `compare_maf.txt.gz`.
4. Plot the locus with `credtools plot`.
5. Decide whether to remove outliers, change LD reference, or drop the locus.
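The first two steps of this flow can be sketched with pandas. The helper names are hypothetical, and the sort directions are an interpretation of the column descriptions (high `lambda_s` and outlier counts are worse, low `maf_corr` is worse). The toy frame stands in for a loaded file; in practice it would likely come from something like `pd.read_csv("qc.txt.gz", sep="\t")`, assuming the file is tab-delimited.

```python
import pandas as pd


def rank_loci(qc: pd.DataFrame) -> pd.DataFrame:
    """Sort rows so the most suspicious loci come first."""
    return qc.sort_values(
        by=["lambda_s", "n_lambda_s_outlier", "maf_corr"],
        ascending=[False, False, True],  # low maf_corr is the bad direction
    )


def worst_per_cohort(qc: pd.DataFrame) -> pd.DataFrame:
    """Keep the top-ranked row for each population/cohort pair."""
    ranked = rank_loci(qc)
    return ranked.groupby(["popu", "cohort"], sort=False).head(1)


# Toy data for illustration only.
qc = pd.DataFrame(
    {
        "popu": ["EUR", "EUR", "EAS"],
        "cohort": ["A", "A", "B"],
        "lambda_s": [0.02, 0.40, 0.10],
        "n_lambda_s_outlier": [0, 7, 1],
        "maf_corr": [0.99, 0.85, 0.97],
    }
)

worst = worst_per_cohort(qc)
print(worst)
```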
