Genomic heterogeneity inflates the performance of variant pathogenicity predictions

This is the official repository for our paper, "Genomic heterogeneity inflates the performance of variant pathogenicity predictions".

It provides a genome-wide, variant-type-stratified benchmark dataset (>250,000 ClinVar variants) and the code to evaluate state-of-the-art DNA-based and protein-based models for variant pathogenicity prediction.
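As a rough illustration of why the benchmark is stratified by variant type, the sketch below computes AUROC within each variant type and on the pooled set. It is not taken from the repository notebooks: the function names `auroc` and `stratified_auroc`, the `(variant_type, label, score)` record layout, and the toy scores are all assumptions made up for this example.

```python
from collections import defaultdict


def auroc(labels, scores):
    """AUROC as the probability that a random pathogenic variant (label 1)
    scores higher than a random benign one (label 0); ties count as 0.5.

    Pairwise and quadratic in size, so only suitable for small examples.
    For benchmark-scale data a rank-based routine such as
    sklearn.metrics.roc_auc_score would be the practical choice.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one pathogenic and one benign variant")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_auroc(records):
    """Group hypothetical (variant_type, label, score) records by variant
    type and report AUROC, percent pathogenic, and group size for each."""
    groups = defaultdict(list)
    for variant_type, label, score in records:
        groups[variant_type].append((label, score))
    summary = {}
    for variant_type, rows in groups.items():
        labels = [y for y, _ in rows]
        scores = [s for _, s in rows]
        summary[variant_type] = {
            "auroc": auroc(labels, scores),
            "pct_pathogenic": 100 * sum(labels) / len(labels),
            "n": len(labels),
        }
    return summary


# Made-up toy data (NOT from the benchmark): a "model" that scores every
# variant purely by its type, so it cannot rank variants within a type.
toy = (
    [("stop_gain", 1, 0.9)] * 3 + [("stop_gain", 0, 0.9)]
    + [("synonymous", 0, 0.1)] * 3 + [("synonymous", 1, 0.1)]
)

per_type = stratified_auroc(toy)
pooled = auroc([y for _, y, _ in toy], [s for _, _, s in toy])

print(per_type["stop_gain"]["auroc"])   # 0.5
print(per_type["synonymous"]["auroc"])  # 0.5
print(pooled)                           # 0.75
```

In this toy case the scores carry no information within either variant type, yet the pooled AUROC is 0.75 simply because the two types differ in their share of pathogenic variants. Evaluating each variant type separately removes that effect.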

Notebooks

We provide one-click Jupyter notebooks for each evaluated model, as well as for benchmark creation and results visualization.

DNA-based models: AlphaGenome, DNABERT2, Evo2, GPN-MSA, Nucleotide Transformer (NT), PhyloGPN, PhyloP
→ Notebooks are available in the DNA-based Models/ directory.

Protein-based models: ESM family models, AlphaMissense, PrimateAI-3D
→ Notebooks are available in the protein_models/ directory.

Benchmark creation:
→ See VEP_ClinVar_Benchmarking_RefSeq.ipynb.

Visualization:
→ See VEP_AUROC_figure.ipynb.

Results

Figure 1. Pathogenicity prediction performance of frontier sequence-based models across variant types.
Comparison of DNA-based and protein-based sequence models by their ability to distinguish pathogenic from benign variants within each variant type, measured by the area under the receiver operating characteristic curve (AUROC). Error bars denote 95% confidence intervals estimated by stratified bootstrap resampling (1,000 iterations) within each variant group.

%P indicates the proportion of pathogenic variants in each group.
Some groups are defined by multiple annotated effects (e.g., both missense and 3′ UTR, with respect to different transcripts).
DNA models are shown as solid bars, protein models as dashed bars.

Note: The evaluation of PrimateAI-3D on stop-gain variants includes only 19,795 variants.

Citation

If you find this benchmark useful for your research, please cite our paper:

@article{genomic2025biorxiv,
  author    = {Baiyu Lu and Xueshen Liu and Po-Yu Lin and Nadav Brandes},
  title     = {Genomic heterogeneity inflates the performance of variant pathogenicity predictions},
  journal   = {bioRxiv},
  year      = {2025},
  doi       = {10.1101/2025.09.05.674459},
  url       = {https://www.biorxiv.org/content/10.1101/2025.09.05.674459v2},
  eprint    = {https://www.biorxiv.org/content/10.1101/2025.09.05.674459v2.full.pdf}
}

Brandes-Lab/VEP-eval
