Abstract
Zhou Jiangjie, Wang Shengfeng, Li Liming. Application of Python web crawler technology in infodemiology [J]. Chinese Journal of Epidemiology, 2020, 41(6): 952-956
Application of Python web crawler technology in infodemiology
Received: September 1, 2019
DOI:10.3760/cma.j.cn112338-20190901-00643
Keywords: Python web crawler technology; Infodemiology; Public health surveillance; Health intervention; Smart doctor seeking
Authors and affiliations:
Zhou Jiangjie, Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China
Wang Shengfeng, Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China
Li Liming, Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China (E-mail: lmlee@bjmu.edu.cn)
Abstract:
      Python web crawler technology automatically extracts large volumes of information from the Internet by simulating users' browsing behavior, and it is a key foundation for collecting and integrating multi-source heterogeneous data in infodemiology research. Python web crawlers can be divided into simple crawlers and large-scale crawlers; they combine data collection with database construction and offer concise syntax, high flexibility, and low learning and maintenance costs. The technology suits a wide range of infodemiology application scenarios: by analyzing health-related information on the Internet, it supports various forms of public health surveillance, the implementation and evaluation of health interventions, and the optimization of smart doctor-seeking strategies. In recent years, the Chinese government has begun to encourage the integrated use of multi-source big data, including Internet information. Against this background, the application scenarios for Python web crawler technology are bound to grow, and the corresponding talent cultivation and technical innovation should be incorporated into the public health education and research system.
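As a rough illustration of what the abstract calls a simple crawler that combines data collection with database construction, the sketch below uses only the Python standard library. It is not taken from the article: the class `TitleLinkParser`, the functions `fetch` and `store_page`, the `pages` table layout, the User-Agent string, and the example URL are all hypothetical names chosen for this example. The demonstration runs on an inline HTML string, so no network access is needed.

```python
import sqlite3
import urllib.request
from html.parser import HTMLParser


class TitleLinkParser(HTMLParser):
    """Collect the page title and every hyperlink target (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fetch(url, timeout=10):
    """Download a page while sending a browser-like User-Agent header.

    Assumption: the target site permits automated access.
    """
    request = urllib.request.Request(
        url, headers={"User-Agent": "Mozilla/5.0 (research crawler sketch)"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def store_page(conn, url, html):
    """Parse one page, save a summary row, and return the links found on it."""
    parser = TitleLinkParser()
    parser.feed(html)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages "
        "(url TEXT PRIMARY KEY, title TEXT, n_links INTEGER)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO pages VALUES (?, ?, ?)",
        (url, parser.title.strip(), len(parser.links)),
    )
    conn.commit()
    return parser.links


# Offline demonstration on an inline page; fetch() is defined but not called here.
sample_html = (
    "<html><head><title>Flu trends</title></head>"
    "<body><a href='/a'>A</a><a href='/b'>B</a></body></html>"
)
conn = sqlite3.connect(":memory:")
links = store_page(conn, "https://example.org/health", sample_html)
row = conn.execute("SELECT title, n_links FROM pages").fetchone()
```

In a real study, `fetch(url)` would supply the HTML passed to `store_page`, and the returned links could feed a queue of pages to visit next. A large-scale crawler would typically add rate limiting, robots.txt checks, and a persistent database file in place of the in-memory one used here.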