ZRXXUAN/news-webscraping
About
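A Scrapy-based news crawler that uses Redis and MongoDB for deduplication and data storage, with a proxy pool to counter anti-scraping measures. The saved fields are title, time, body text, URL, author/source, and source URL. It targets four portal sites (NetEase, Tencent, Sina, and Sohu) and crawls four sections on each: news, technology, entertainment, and finance.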
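The deduplicate-then-store flow described above could look roughly like the following. This is a minimal sketch and not the repository's actual code: it assumes Redis holds a set of URL fingerprints and MongoDB receives the items, and the names `NewsItem`, `url_fingerprint`, `DedupStorePipeline`, and the key `news:seen_urls` are hypothetical. The two clients are passed in, so any object exposing `sadd` (as redis-py clients do) and `insert_one` (as pymongo collections do) will work.

```python
import hashlib
from dataclasses import dataclass, asdict


@dataclass
class NewsItem:
    """Hypothetical item holding the six saved fields listed above."""
    title: str
    time: str
    body: str
    url: str
    source: str       # author or source name
    source_url: str


def url_fingerprint(url: str) -> str:
    """Return a fixed-length SHA-1 hex digest of the URL for the seen-set."""
    return hashlib.sha1(url.strip().encode("utf-8")).hexdigest()


class DedupStorePipeline:
    """Skip items whose URL was already seen; store the rest.

    Assumption: `redis_client.sadd(key, value)` returns the number of
    newly added members (redis-py behaviour), and `collection.insert_one`
    accepts a dict (pymongo behaviour).
    """

    def __init__(self, redis_client, collection, seen_key="news:seen_urls"):
        self.redis = redis_client
        self.collection = collection
        self.seen_key = seen_key  # hypothetical key name

    def process(self, item: NewsItem) -> bool:
        """Return True if the item was stored, False if it was a duplicate."""
        added = self.redis.sadd(self.seen_key, url_fingerprint(item.url))
        if not added:
            return False  # fingerprint already in the set
        self.collection.insert_one(asdict(item))
        return True
```

With real services this might be wired up as `DedupStorePipeline(redis.Redis(), pymongo.MongoClient()["news"]["articles"])`, where the database and collection names are placeholders. Inside Scrapy, the same logic would normally sit in an item pipeline's `process_item` method and raise `DropItem` for duplicates instead of returning `False`.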