Abstract
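Zhu Chenxu, Song Yuxin, Hao Yuantao, Chen Feng, Wei Yongyue. Contribution of the large-scale population cohort in disease risk prediction model study: taking United Kingdom Biobank as an example[J]. Chinese Journal of Epidemiology, 2024, 45(10): 1433-1440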
Contribution of the large-scale population cohort in disease risk prediction model study: taking United Kingdom Biobank as an example
Received: May 07, 2024
DOI:10.3760/cma.j.cn112338-20240507-00245
Keywords: Large-scale population cohort; Disease risk prediction model; United Kingdom Biobank; Data sharing
Funding: General Program of the National Natural Science Foundation of China (82473728, 81973142)
Authors, affiliations, and e-mail:
Zhu Chenxu: Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166, China
Song Yuxin: Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing 100191, China
Hao Yuantao: Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing 100191, China; Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China; Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing 100191, China
Chen Feng: Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166, China
Wei Yongyue (e-mail: ywei@pku.edu.cn): Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166, China; Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing 100191, China; Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China; Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing 100191, China
Abstract:
      Disease risk prediction models are the basis of precision prevention and an important reference for clinical diagnosis and treatment decisions. Developing such models requires large amounts of high-quality data, and large population cohort studies are an important foundation for this work. The United Kingdom Biobank (UKB), a large-scale population cohort and biobank, plays an important role in research on disease etiology and on disease prevention and control, owing to its rich baseline and follow-up data and its philosophy and mechanisms of global data sharing. This study followed the PRISMA guidelines and included 210 articles whose corresponding authors came from 18 countries; 58 articles (27.62%) were from the United Kingdom. A total of 491 disease risk prediction models were extracted, covering cancer, cardiovascular and cerebrovascular diseases, endocrine, nutritional and metabolic diseases, respiratory diseases, other diseases, and their subgroup populations. Of these, 132 were developed in UKB without validation, 183 were developed in UKB with internal validation, 17 were developed in UKB with external validation, and 159 were developed externally and validated in UKB. A total of 188 models (38.29%) used only macro-level variables, and 303 models (61.71%) combined macro-level and micro-level variables. Model construction methods included survival outcome models, logistic regression, and machine learning. The survival outcome models were mainly Cox proportional hazards regression models; a small number of models considered competing risks, accelerated failure time models, or different baseline hazard functions. The machine learning models used random forest, extreme gradient boosting (XGBoost), categorical boosting (CatBoost), support vector machines, convolutional neural networks, and other methods. The UKB provides an important resource for research on risk prediction models for a wide range of diseases.
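The counts reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original study: the variable names and the `pct` helper are hypothetical, and the numbers are copied from the abstract as stated.

```python
# Counts copied from the abstract; all names here are illustrative assumptions.
articles_total = 210
articles_from_uk = 58

# Models grouped by where they were developed and how they were validated.
models_by_design = {
    "UKB development, no validation": 132,
    "UKB development, internal validation": 183,
    "UKB development, external validation": 17,
    "external development, UKB validation": 159,
}

# Models grouped by the type of predictors they use.
models_by_variables = {
    "macro-level only": 188,
    "macro-level and micro-level": 303,
}


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


models_total = sum(models_by_design.values())

print("Total models:", models_total)
print("Articles from the United Kingdom (%):", pct(articles_from_uk, articles_total))
for label, count in models_by_variables.items():
    print(f"{label} (%):", pct(count, models_total))
```

Both groupings sum to the same model total, and the rounded percentages agree with those given in the abstract.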