Abstract
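Li Yichong, Zhao Yinjun, Wang Limin, Zhang Mei, Zhou Maigeng. Variance estimation considering multistage sampling design in multistage complex sample analysis[J]. Chinese Journal of Epidemiology, 2016, 37(3): 425-429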
Variance estimation considering multistage sampling design in multistage complex sample analysis
Received: July 27, 2015
DOI:10.3760/cma.j.issn.0254-6450.2016.03.028
Keywords: Sampling studies; Computer simulation; Multistage; Variance estimation; Sampling error
Fund Project: National Natural Science Foundation of China (81202287)
Author Name; Affiliation; E-mail
Li Yichong; Division of Surveillance, National Center for Chronic and Non-communicable Disease Control and Prevention, Beijing 100050, China; alexleeliyichong@gmail.com
Zhao Yinjun; Division of Surveillance, National Center for Chronic and Non-communicable Disease Control and Prevention, Beijing 100050, China
Wang Limin; Division of Surveillance, National Center for Chronic and Non-communicable Disease Control and Prevention, Beijing 100050, China
Zhang Mei; Division of Surveillance, National Center for Chronic and Non-communicable Disease Control and Prevention, Beijing 100050, China
Zhou Maigeng; Chinese Center for Disease Control and Prevention, Beijing 100050, China
Abstract:
      Multistage random sampling is a common design for population-based sample surveys in public health. Samples obtained under a multistage design are complex samples: observations are clustered and not independent. If the sampling design is ignored in the analysis, sampling error is usually underestimated and the probability of type I error in statistical inference increases. Because variance (error) estimators for complex samples are complicated, common statistical software defaults to the ultimate cluster variance estimate (UCVE) to simplify the sample structure: it assumes that the sample comes from one-stage cluster sampling and ignores all sampling stages after the first, yielding an approximate variance estimate. However, when the sampling fraction of primary sampling units (PSUs) is high, the contribution of subsequent sampling stages to the variance is no longer negligible, and UCVE may lead to invalid variance estimates. This paper introduces a method of variance estimation that accounts for the multistage sampling design and compares it with UCVE by simulating multistage random sampling from real-world data under different sampling schemes. The simulation showed that UCVE became biased as the PSU sampling fraction increased, with the bias growing as the fraction rose, whereas the method accounting for the multistage design reflected the level of error accurately and supported accurate statistical inference.
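      The contrast described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use their data: it assumes two-stage simple random sampling without replacement, equal-sized PSUs, a synthetic population, and a UCVE that applies a first-stage finite population correction. The function names `two_stage_variances` and `simulate` and all parameter values are hypothetical. Under these assumptions, the multistage estimate adds a second-stage term that the UCVE leaves out, and the gap should widen as the PSU sampling fraction `f1` approaches 1.

```python
import random
import statistics


def two_stage_variances(N, sampled_psus):
    """Estimate a population total and two variance estimates for it.

    Assumes simple random sampling without replacement at both stages.
    N            -- number of PSUs in the population
    sampled_psus -- list of (M_i, y_values): PSU size and sampled values
    """
    n = len(sampled_psus)
    f1 = n / N  # PSU (first-stage) sampling fraction

    # Estimated PSU totals: t_i = M_i * mean(y_i)
    t_hat = [M * statistics.mean(y) for M, y in sampled_psus]
    total = N / n * sum(t_hat)
    s2_t = statistics.variance(t_hat)  # between-PSU sample variance

    # Ultimate cluster estimate: first stage only (assumed: with fpc)
    v_ucve = N**2 * (1 - f1) * s2_t / n

    # Second-stage contribution, which the ultimate cluster estimate ignores
    v_stage2 = N / n * sum(
        (1 - len(y) / M) * M**2 * statistics.variance(y) / len(y)
        for M, y in sampled_psus
    )
    return total, v_ucve, v_ucve + v_stage2


def simulate(f1, N=40, M=50, m=10, reps=300, seed=1):
    """Compare average variance estimates with the empirical variance."""
    rng = random.Random(seed)
    # Hypothetical population: a PSU-level effect plus within-PSU noise
    population = []
    for _ in range(N):
        mu = rng.gauss(10, 2)
        population.append([rng.gauss(mu, 3) for _ in range(M)])

    n = round(f1 * N)  # number of PSUs to draw
    totals, ucve, multi = [], [], []
    for _ in range(reps):
        psus = rng.sample(population, n)                    # stage 1
        sample = [(M, rng.sample(psu, m)) for psu in psus]  # stage 2
        t, v1, v2 = two_stage_variances(N, sample)
        totals.append(t)
        ucve.append(v1)
        multi.append(v2)
    # Empirical variance of the estimated totals over repeated samples
    return statistics.pvariance(totals), statistics.mean(ucve), statistics.mean(multi)


for f1 in (0.1, 0.5, 1.0):
    empirical, v_ucve, v_multi = simulate(f1)
    print(f"f1={f1:.1f}  empirical={empirical:10.1f}  "
          f"UCVE={v_ucve:10.1f}  multistage={v_multi:10.1f}")
```

      In this sketch, when every PSU is sampled (`f1 = 1.0`) the UCVE with a first-stage correction is exactly zero even though second-stage sampling still contributes error. A UCVE computed without the first-stage correction would instead tend to overestimate the variance, so the direction of bias depends on how the software is configured.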