Article abstract
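Zhu Meng, Cheng Yang, Dai Juncheng, Xie Lan, Jin Guangfu, Ma Hongxia, Hu Zhibin, Shi Yongyong, Lin Dongxin, Shen Hongbing. Genome-wide association study based risk prediction model in predicting lung cancer risk in Chinese[J]. Chinese Journal of Epidemiology, 2015, 36(10): 1047-1052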
Genome-wide association study based risk prediction model in predicting lung cancer risk in Chinese
Submitted: 2015-06-15
DOI:10.3760/cma.j.issn.0254-6450.2015.10.002
Keywords: Lung cancer; Genome-wide association study; Risk prediction model
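Funding: Key Program of the National Natural Science Foundation of China (81230067); Priority Academic Program Development of Jiangsu Higher Education Institutions (Public Health and Preventive Medicine)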
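Authors, affiliations and e-mail:
Zhu Meng: Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166
Cheng Yang: Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166
Dai Juncheng: Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166
Xie Lan: Medical Systems Biology Research Center, School of Medicine, Tsinghua University
Jin Guangfu: Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166
Ma Hongxia: Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166
Hu Zhibin: Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166
Shi Yongyong: Key Laboratory, Shanghai Mental Health Center, Shanghai Jiao Tong University
Lin Dongxin: State Key Laboratory of Molecular Oncology, Cancer Hospital, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences
Shen Hongbing: Department of Epidemiology and Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166; hbshen@njmu.edu.cn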
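Chinese-language abstract:
      Objective To construct a lung cancer risk prediction model for the Han Chinese population by combining genetic factors and smoking information. Methods Based on genome-wide association study (GWAS) data from the Han Chinese population, samples were divided by region of origin into a training set (Nanjing and Shanghai: 1 473 cases vs. 1 962 controls) and a testing set (Beijing and Wuhan: 858 cases vs. 1 115 controls). Previously reported lung cancer susceptibility loci were systematically collected, loci with independent effects were selected in the training set by backward stepwise selection, and an individual genetic score was estimated by a weighting method for use in modeling. Three risk prediction models were constructed in the training set, based on smoking information, the genetic score, and smoking plus genetic information combined (the smoking model, the genetic effect model and the combined model). Their performance in predicting lung cancer risk was evaluated by receiver operating characteristic (ROC) curves, the area under the curve (AUC), net reclassification improvement (NRI) and integrated discrimination improvement (IDI). The models were then validated in the testing set. Results In the training set, the AUCs of the combined, smoking and genetic effect models were 0.69 (0.67-0.71), 0.65 (0.63-0.66) and 0.60 (0.59-0.62), respectively. In both the training and testing sets, the predictive performance of the combined model was higher than that of the smoking model or the genetic effect model, and the difference was statistically significant (P<0.001). Reclassification results showed that, compared with the smoking model, the combined model increased NRI by 4.57% (2.23%-6.91%) and IDI by 3.11% (2.52%-3.69%) in the training set. In the testing set, NRI and IDI increased by 2.77% and 3.16%, respectively. Conclusion The genetic score can significantly improve the predictive performance of the traditional lung cancer risk model. The lung cancer risk prediction model for the Han Chinese population, constructed by combining genetic factors and smoking information, can be used to identify individuals at high risk of developing lung cancer in this population.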
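The abstract does not spell out how the weighted genetic score is computed. A common convention, assumed here, is to sum each individual's risk-allele counts (0, 1 or 2) weighted by the natural log of the per-allele odds ratio. The sketch below illustrates that convention only; the SNP names and odds ratios are hypothetical placeholders, not the loci or effect sizes from the study.

```python
import math

# Hypothetical per-allele odds ratios, for illustration only.
# These are NOT the loci or effect sizes reported in the study.
ODDS_RATIOS = {"snp_a": 1.20, "snp_b": 1.35, "snp_c": 1.10}


def weighted_genetic_score(genotypes, odds_ratios):
    """Sum of risk-allele counts (0, 1 or 2), each weighted by log(OR).

    Assumed convention: weight = natural log of the per-allele odds ratio.
    """
    score = 0.0
    for snp, odds_ratio in odds_ratios.items():
        allele_count = genotypes[snp]
        if allele_count not in (0, 1, 2):
            raise ValueError(f"{snp}: allele count must be 0, 1 or 2")
        score += math.log(odds_ratio) * allele_count
    return score


# One hypothetical individual: two risk alleles at snp_a, none at snp_b,
# one at snp_c.
person = {"snp_a": 2, "snp_b": 0, "snp_c": 1}
print(weighted_genetic_score(person, ODDS_RATIOS))
```

Under this convention, a score of this kind would enter a logistic regression as a single covariate, alone for a genetic model or together with smoking for a combined model.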
English-language abstract:
      Objective To evaluate the predictive power of a risk model combining traditional epidemiological factors and genetic factors. Methods Our previous GWAS data on lung cancer in Chinese were divided into a training set (Nanjing and Shanghai: 1 473 cases vs. 1 962 controls) and a testing set (Beijing and Wuhan: 858 cases vs. 1 115 controls). Single nucleotide polymorphisms (SNPs) reported to be associated with lung cancer risk were systematically collected, and stepwise logistic regression analysis was used to select independent factors in the training set. A weighted genetic risk score (wGRS) was then calculated for each individual. To evaluate the contribution of the genetic factors, 3 risk models were established in the training set: a smoking model (based on smoking status), a genetic risk model (based on the genetic risk score) and a combined model (based on smoking and the genetic risk score). The predictive performance of the models was evaluated by receiver operating characteristic (ROC) curves, the area under the curve (AUC), net reclassification improvement (NRI) and integrated discrimination improvement (IDI). The results were further verified in the testing set. Results In the training set, the AUCs of the smoking, genetic risk and combined models were 0.65 (0.63-0.66), 0.60 (0.59-0.62) and 0.69 (0.67-0.71), respectively. The predictive power of the other two models was significantly lower than that of the combined model (P<0.001). Compared with the smoking model, the combined model increased NRI by 4.57% (2.23%-6.91%) and IDI by 3.11% (2.52%-3.69%) in the training set; the difference was statistically significant (P<0.001). In the testing set, NRI increased by 2.77%, which was not statistically significant (P=0.069), and IDI increased by 3.16%, which was statistically significant (P<0.001).
Conclusion This study showed that combining 14 genetic variants with traditional epidemiological factors could improve the predictive power of the risk model for lung cancer. The model could be used to screen for populations at high risk of lung cancer among Chinese and provide evidence for the early diagnosis and treatment of lung cancer.
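The three evaluation metrics named in the abstract can be illustrated with a small sketch using their standard textbook definitions. The abstract does not state the risk categories used for NRI, so a single cutoff of 0.5 is assumed here purely for illustration. The predicted risks below are made-up toy values, not study data.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outranks a random control (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def mean(values):
    return sum(values) / len(values)


def idi(old_cases, new_cases, old_controls, new_controls):
    """Gain in mean predicted risk among cases minus the gain among controls."""
    return (mean(new_cases) - mean(old_cases)) - (
        mean(new_controls) - mean(old_controls)
    )


def nri(old_cases, new_cases, old_controls, new_controls, cutoff):
    """Two-category NRI across one risk cutoff (the cutoff is an assumption)."""

    def net_up(old, new):
        up = sum(1 for o, n in zip(old, new) if o < cutoff <= n)
        down = sum(1 for o, n in zip(old, new) if n < cutoff <= o)
        return (up - down) / len(old)

    # Upward moves are correct for cases; downward moves are correct for controls.
    return net_up(old_cases, new_cases) - net_up(old_controls, new_controls)


# Toy predicted risks (hypothetical): "old" = smoking-only model,
# "new" = smoking plus genetic score.
old_cases = [0.40, 0.60, 0.45, 0.70]
new_cases = [0.55, 0.65, 0.40, 0.80]
old_controls = [0.30, 0.55, 0.50, 0.20]
new_controls = [0.25, 0.45, 0.55, 0.15]

print("AUC old:", auc(old_cases, old_controls))
print("AUC new:", auc(new_cases, new_controls))
print("IDI:", idi(old_cases, new_cases, old_controls, new_controls))
print("NRI:", nri(old_cases, new_cases, old_controls, new_controls, cutoff=0.5))
```

The pairwise AUC loop is quadratic in sample size and is written for readability; for samples the size of those in the study, a rank-based computation would be the more practical choice.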