Zhou Jiangjie, Wang Shengfeng, Li Liming. Application of Python web crawler technology in infodemiology [J]. Chinese Journal of Epidemiology, 2020, 41(6): 952-956 |
Application of Python web crawler technology in infodemiology |
Received: September 01, 2019 |
DOI:10.3760/cma.j.cn112338-20190901-00643 |
Keywords: Python web crawler technology; Infodemiology; Public health surveillance; Health intervention; Smart doctor seeking |
Author Name | Affiliation | E-mail |
Zhou Jiangjie | Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China | |
Wang Shengfeng | Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China | |
Li Liming | Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China | lmlee@bjmu.edu.cn |
Abstract: |
Python web crawler technology automatically extracts large volumes of information from the Internet by simulating users' browsing behavior, and it is a key foundation for collecting and integrating multi-source heterogeneous data in infodemiology research. Python web crawlers can be divided into simple crawlers and large-scale crawlers. They combine data collection with database construction and offer concise syntax, high flexibility, and low learning and maintenance costs. The technology suits the various application scenarios of infodemiology: by analyzing health-related information on the Internet, it supports many kinds of public health surveillance, the implementation and evaluation of health interventions, and the optimization of smart doctor seeking strategies. In recent years, the Chinese government has begun to encourage the integrated use of multi-source big data, including Internet information. Against this background, the application scenarios for Python crawler technology are bound to multiply, and the corresponding talent training and technical innovation should be incorporated into public health education and research systems. |
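As a rough illustration of what the abstract calls a simple crawler that combines data collection with database construction, the sketch below uses only the Python standard library. It is not the authors' implementation. The class `TitleAndLinkParser`, the functions `fetch`, `parse_page`, and `store_page`, the `pages` table, the User-Agent string, and the sample page are all hypothetical names chosen for this example.

```python
import sqlite3
import urllib.request
from html.parser import HTMLParser


class TitleAndLinkParser(HTMLParser):
    """Collects the <title> text and every <a href> value from one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fetch(url, timeout=10):
    """Download a page while sending a browser-like User-Agent (assumed value)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "Mozilla/5.0 (example-crawler)"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_page(html):
    """Return (title, links) extracted from an HTML string."""
    parser = TitleAndLinkParser()
    parser.feed(html)
    return parser.title.strip(), parser.links


def store_page(conn, url, html):
    """Parse one page and write a summary row into the hypothetical pages table."""
    title, links = parse_page(html)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages "
        "(url TEXT PRIMARY KEY, title TEXT, n_links INTEGER)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO pages VALUES (?, ?, ?)", (url, title, len(links))
    )
    conn.commit()
    return title, links


# Offline demonstration: an inline page stands in for fetch(url),
# so the example runs without network access.
SAMPLE_HTML = (
    "<html><head><title>Flu symptoms</title></head>"
    "<body><a href='/a'>A</a><a href='/b'>B</a></body></html>"
)
conn = sqlite3.connect(":memory:")
store_page(conn, "https://example.org/flu", SAMPLE_HTML)
rows = conn.execute("SELECT url, title, n_links FROM pages").fetchall()
```

In real use, `store_page(conn, url, fetch(url))` would replace the inline sample, and a crawler would normally also respect each site's robots.txt and rate limits. A large-scale crawler would add a URL queue, deduplication, and concurrency on top of this loop.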