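Zhu Meng, Cheng Yang, Dai Juncheng, Xie Lan, Jin Guangfu, Ma Hongxia, Hu Zhibin, Shi Yongyong, Lin Dongxin, Shen Hongbing. Genome-wide association study based risk prediction model in predicting lung cancer risk in Chinese[J]. Chinese Journal of Epidemiology, 2015, 36(10):1047-1052 |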
Genome-wide association study based risk prediction model for lung cancer in the Chinese population |
Received:June 15, 2015 |
DOI:10.3760/cma.j.issn.0254-6450.2015.10.002 |
Keywords: Lung cancer; Genome-wide association study; Risk prediction model |
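Fund Project: Key Program of the National Natural Science Foundation of China (81230067); Priority Academic Program Development of Jiangsu Higher Education Institutions (Public Health and Preventive Medicine) |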
Author Name | Affiliation | E-mail |
Zhu Meng | Department of Epidemiology, School of Public Health, Nanjing Medical University, Nanjing 211166, China | |
Cheng Yang | Department of Epidemiology, School of Public Health, Nanjing Medical University, Nanjing 211166, China | |
Dai Juncheng | Department of Epidemiology, School of Public Health, Nanjing Medical University, Nanjing 211166, China | |
Xie Lan | Medical Systems Biology Research Center, Tsinghua University School of Medicine | |
Jin Guangfu | Department of Epidemiology, School of Public Health, Nanjing Medical University, Nanjing 211166, China | |
Ma Hongxia | Department of Epidemiology, School of Public Health, Nanjing Medical University, Nanjing 211166, China | |
Hu Zhibin | Department of Epidemiology, School of Public Health, Nanjing Medical University, Nanjing 211166, China | |
Shi Yongyong | Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University | |
Lin Dongxin | State Key Laboratory of Molecular Oncology, Cancer Institute and Hospital, Chinese Academy of Medical Sciences, Peking Union Medical College | |
Shen Hongbing | Department of Epidemiology, School of Public Health, Nanjing Medical University, Nanjing 211166, China | hbshen@njmu.edu.cn |
Abstract: |
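Objective To build a lung cancer risk prediction model for the Han Chinese population by combining genetic factors with smoking information. Methods Using genome-wide association study (GWAS) data from the Han Chinese population, samples were divided by region of origin into a training set (Nanjing and Shanghai: 1 473 cases vs. 1 962 controls) and a testing set (Beijing and Wuhan: 858 cases vs. 1 115 controls). Previously reported lung cancer susceptibility loci were systematically collated, loci with independent effects were selected in the training set by backward stepwise selection, and an individual genetic score was estimated by a weighted method for use in modeling. Three risk prediction models were built in the training set, based on smoking information, on the genetic score, and on both combined (the smoking model, the genetic effect model and the combined model). Their performance in predicting lung cancer risk was evaluated with receiver operating characteristic (ROC) curves, the area under the curve (AUC), net reclassification improvement (NRI) and integrated discrimination improvement (IDI). The models were then validated in the testing set. Results In the training set, the AUCs of the combined, smoking and genetic effect models were 0.69 (0.67-0.71), 0.65 (0.63-0.66) and 0.60 (0.59-0.62), respectively. In both the training and testing sets, the combined model predicted risk better than the smoking model or the genetic model, and the differences were statistically significant (P<0.001). Reclassification analysis showed that, compared with the smoking model, the combined model increased NRI by 4.57% (2.23%-6.91%) and IDI by 3.11% (2.52%-3.69%) in the training set. In the testing set, NRI and IDI increased by 2.77% and 3.16%, respectively. Conclusion The genetic score significantly improves the predictive performance of the traditional lung cancer risk model. The risk prediction model built from genetic factors and smoking information can be used to identify individuals at high risk of lung cancer in the Han Chinese population. |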
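The weighted genetic score described in the Methods can be illustrated with a minimal Python sketch. It assumes the common convention of weighting each locus by the natural log of its per-allele odds ratio; the abstract does not state the weights used, and the odds ratios, genotype and function name below are hypothetical values chosen for illustration, not data from the study.

```python
import math


def weighted_genetic_score(allele_counts, odds_ratios):
    """Sum risk-allele counts (0, 1 or 2 per SNP), each weighted by ln(OR).

    Assumption: weights are per-allele log odds ratios. The abstract only
    says a "weighted method" was used, so this is one plausible reading.
    """
    if len(allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is required per SNP")
    return sum(
        count * math.log(odds_ratio)
        for count, odds_ratio in zip(allele_counts, odds_ratios)
    )


# Hypothetical per-allele odds ratios for three SNPs (not from the study).
example_odds_ratios = [1.20, 1.35, 1.10]
# Hypothetical genotype: two risk alleles at SNP 1, none at SNP 2, one at SNP 3.
example_genotype = [2, 0, 1]

example_score = weighted_genetic_score(example_genotype, example_odds_ratios)
print(round(example_score, 4))
```

A score of this kind could then enter a logistic regression as a single covariate alongside smoking status, which is how a combined model is typically assembled.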
English Abstract: |
Objective To evaluate the predictive power of a risk model that combines traditional epidemiological factors with genetic factors. Methods Our previous GWAS data on lung cancer in Chinese were divided into a training set (Nanjing and Shanghai: 1 473 cases vs. 1 962 controls) and a testing set (Beijing and Wuhan: 858 cases vs. 1 115 controls). Single nucleotide polymorphisms (SNPs) previously reported to be associated with lung cancer risk were systematically collected, and stepwise logistic regression analysis was used to select independent factors in the training set. A weighted genetic risk score (wGRS) was then calculated for each individual. To evaluate the contribution of the genetic factors, 3 risk models were established in the training set: a smoking model (based on smoking status), a genetic risk model (based on the genetic risk score) and a combined model (based on smoking status and the genetic risk score). The predictive performance of the models was evaluated by receiver operating characteristic (ROC) curves, area under the curve (AUC), net reclassification improvement (NRI) and integrated discrimination improvement (IDI). The results were further verified in the testing set. Results In the training set, the AUCs of the smoking, genetic risk and combined models were 0.65 (0.63-0.66), 0.60 (0.59-0.62) and 0.69 (0.67-0.71), respectively. The predictive power of the other two models was significantly lower than that of the combined model (P<0.001). Furthermore, compared with the smoking model, the combined model increased NRI by 4.57% (2.23%-6.91%) and IDI by 3.11% (2.52%-3.69%) in the training set, and both differences were statistically significant (P<0.001). In the testing set, NRI increased by 2.77%, which was not statistically significant (P=0.069), and IDI increased by 3.16%, which was statistically significant (P<0.001).
Conclusion This study showed that combining 14 genetic variants with traditional epidemiological factors could improve the predictive power of the risk model for lung cancer. The model could be used to screen for populations at high risk of lung cancer in Chinese and provide evidence for the early diagnosis and treatment of lung cancer. |
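The three evaluation metrics named above (AUC, NRI and IDI) can be sketched in plain Python as follows. This is an illustrative sketch only, not the authors' analysis code: the predicted risks are made-up toy values, and the single risk cutoff of 0.5 used for the categorical NRI is a hypothetical choice, since the abstract does not report the risk categories that were used.

```python
def auc(scores, labels):
    """Probability that a random case scores higher than a random control
    (ties count as one half). Equivalent to the area under the ROC curve."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def idi(p_old, p_new, labels):
    """Integrated discrimination improvement: mean change in predicted risk
    among cases minus mean change in predicted risk among controls."""
    def mean_gain(label):
        gains = [n - o for o, n, y in zip(p_old, p_new, labels) if y == label]
        return sum(gains) / len(gains)
    return mean_gain(1) - mean_gain(0)


def categorical_nri(p_old, p_new, labels, cutoffs):
    """Net reclassification improvement over risk categories defined by
    `cutoffs`: net upward moves in cases minus net upward moves in controls."""
    def category(p):
        return sum(p >= c for c in cutoffs)

    def net_up(label):
        moves = [
            category(n) - category(o)
            for o, n, y in zip(p_old, p_new, labels)
            if y == label
        ]
        up = sum(m > 0 for m in moves)
        down = sum(m < 0 for m in moves)
        return (up - down) / len(moves)

    return net_up(1) - net_up(0)


# Toy data (hypothetical): 1 = case, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
p_smoking = [0.6, 0.4, 0.4, 0.4, 0.4, 0.2]     # risks from a smoking-only model
p_combined = [0.7, 0.6, 0.4, 0.3, 0.45, 0.1]   # risks after adding a genetic score

print("AUC smoking: ", auc(p_smoking, labels))
print("AUC combined:", auc(p_combined, labels))
print("IDI:", idi(p_smoking, p_combined, labels))
print("NRI:", categorical_nri(p_smoking, p_combined, labels, cutoffs=[0.5]))
```

Positive NRI and IDI values indicate that the second model moves cases toward higher predicted risk and controls toward lower predicted risk relative to the first. Confidence intervals such as those reported in the abstract would need an additional step, for example bootstrapping, which is omitted here.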