892 Plasma Proteins Linked to Heart Disease Risk in 53,000-Person Study
A landmark 14-year UK Biobank study maps 3,089 protein-CVD associations, revealing causal biomarkers and drug repurposing opportunities.
Summary
Researchers analyzed 2,920 plasma proteins in 53,026 UK Biobank participants over a median follow-up of 14 years, identifying 892 unique proteins significantly associated with 13 cardiovascular disease (CVD) outcomes. NT-proBNP showed the strongest association with atrial fibrillation, and GDF15 was among the strongest for heart failure. Mendelian randomization supported a causal link to CVDs for 225 proteins, with LPA showing the strongest association with coronary artery disease. A machine-learning prediction model achieved an AUC of 0.86 for abdominal aortic aneurysm. Mediation analysis indicated that modifiable risk factors such as smoking and BMI are involved in many protein-CVD relationships, pointing toward actionable prevention targets.
Detailed Summary
Cardiovascular disease remains the leading cause of death globally, yet comprehensive, longitudinal analyses linking the full plasma proteome to diverse CVD subtypes have been lacking. This study addresses that gap by systematically mapping plasma protein associations across 13 incident CVD outcomes in one of the largest proteomics cohorts ever assembled.
Using data from 53,026 UK Biobank participants with a median 14-year follow-up, researchers measured 2,920 baseline plasma proteins and applied Cox proportional hazard models to identify significant associations after stringent Bonferroni correction. They found 3,089 significant protein-CVD associations involving 892 unique protein analytes. The most striking associations included NT-proBNP for atrial fibrillation (P = 6.31×10⁻³¹³), NPPB and GDF15 for heart failure, and LEP and FABP4 as the strongest correlates of cardiac structure and function measured by cardiovascular magnetic resonance (CMR) imaging.
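To give a sense of how strict a Bonferroni correction is at this scale, the sketch below computes a per-test threshold. It assumes a family-wise alpha of 0.05 and that every protein-outcome pair counts as one test; neither is stated in this summary, so the study's actual threshold may differ. The function names are illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


N_PROTEINS = 2920  # proteins measured, from the summary
N_OUTCOMES = 13    # CVD outcomes, from the summary
ALPHA = 0.05       # assumption: conventional family-wise error rate

# Assumption: one test per protein-outcome pair.
n_tests = N_PROTEINS * N_OUTCOMES                 # 37,960 tests
threshold = bonferroni_threshold(ALPHA, n_tests)  # about 1.32e-06


def is_significant(p_value: float, cutoff: float = threshold) -> bool:
    """True when a p-value passes the Bonferroni-corrected cutoff."""
    return p_value < cutoff


print(f"{n_tests} tests, per-test threshold {threshold:.3g}")
print("NT-proBNP / atrial fibrillation passes:", is_significant(6.31e-313))
```

Under these assumptions a single association must reach roughly P < 1.3×10⁻⁶ to count, which the reported NT-proBNP result clears by a wide margin.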
A machine-learning prediction model trained on 257 pre-selected proteins significantly outperformed the standard SCORE2 cardiovascular risk tool across most outcomes, achieving an AUC of 0.86 for abdominal aortic aneurysm. Adding protein data to SCORE2 further improved discrimination, underscoring the additive predictive value of the plasma proteome beyond conventional clinical risk factors.
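For readers unfamiliar with AUC, it can be read as the probability that a randomly chosen person who developed the disease received a higher risk score than a randomly chosen person who did not. The sketch below computes it that way on made-up toy data; the labels and scores are hypothetical and have nothing to do with the study's model or results.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties between a case and a control count as half a correct pair.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    correct = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                correct += 1.0
            elif case_score == control_score:
                correct += 0.5
    return correct / (len(cases) * len(controls))


# Hypothetical toy data: 1 = developed the outcome, 0 = did not.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported 0.86 means the model ranked a case above a non-case in about 86% of such pairs.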
Two-sample Mendelian randomization analysis, which uses genetic variants as instrumental variables to reduce confounding, identified 225 proteins with evidence of a causal link to CVDs. LPA showed the strongest causal association with coronary artery disease (OR = 1.13, P = 2.38×10⁻¹⁵). Many of these proteins are already targets of existing drugs, which points to concrete drug repurposing opportunities. Mediation analyses revealed broad-spectrum mediators, with IGFBP4 and GDF15 each implicated in 9 cardiovascular outcomes, and pointed to modifiable risk factors such as smoking and BMI as the primary pathways connecting proteins and CVD risk.
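As an illustration of how a two-sample Mendelian randomization estimate turns into an odds ratio, the sketch below applies the fixed-effect inverse-variance weighted (IVW) estimator, one common choice for this kind of analysis. The summary does not say which estimator the study used, and the per-variant effect sizes here are invented for the example.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    beta_exposure: per-variant effects on the protein level
    beta_outcome:  per-variant effects on the outcome (log-odds)
    se_outcome:    standard errors of the outcome effects
    Returns (causal log-odds per unit of exposure, standard error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("inputs must have the same length")
    numerator = sum(
        bx * by / se**2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx**2 / se**2 for bx, se in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical summary statistics for three genetic instruments.
bx = [0.10, 0.20, 0.15]     # variant -> protein
by = [0.02, 0.04, 0.03]     # variant -> disease (log-odds)
se = [0.005, 0.005, 0.005]  # standard errors of by

beta, beta_se = ivw_estimate(bx, by, se)
odds_ratio = math.exp(beta)
ci_low = math.exp(beta - 1.96 * beta_se)
ci_high = math.exp(beta + 1.96 * beta_se)
print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Read this way, the reported OR of 1.13 for LPA corresponds to about 13% higher odds of coronary artery disease per unit increase in genetically predicted protein level, with the unit depending on how the study scaled the exposure.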
The study provides an unprecedented systems-level map of how the plasma proteome interacts with cardiovascular health across disease subtypes, cardiac structure, and function. By combining longitudinal epidemiology, causal inference, and prediction modeling, it offers a foundation for developing protein-based biomarker panels, refining early CVD detection, and identifying novel therapeutic targets.
Key Findings
- 892 unique plasma proteins significantly associated with 13 CVD outcomes across 53,026 participants over 14 years.
- NT-proBNP and atrial fibrillation showed the strongest single association (P = 6.31×10⁻³¹³).
- Machine-learning protein model achieved AUC 0.86 for abdominal aortic aneurysm, outperforming standard SCORE2.
- 225 proteins causally linked to CVDs via Mendelian randomization; LPA strongest for coronary artery disease.
- IGFBP4 and GDF15 each act as mediators for 9 cardiovascular outcomes, with smoking and BMI as the key modifiable factors involved.
Methodology
Prospective cohort study using 53,026 UK Biobank participants followed for a median 14 years. Cox proportional hazard models assessed 2,920 plasma proteins against 13 CVD outcomes with Bonferroni correction. Causal inference used two-sample Mendelian randomization with GWAS-derived genetic instruments; prediction models applied machine learning on 257 pre-selected proteins.
Study Limitations
The UK Biobank cohort is predominantly white British, which limits generalizability to other ethnic populations. Plasma protein levels were measured only at baseline, so changes during follow-up are not captured. Because the study is observational, residual confounding remains possible even though Mendelian randomization was used to strengthen causal inference.
