crawl_vnexpress

Overview

Uses Scrapy (https://scrapy.org/) to crawl data from Vnexpress.net.
The vnexpress folder crawls with scrapy-splash.
The scrapy_vnexpress folder uses only Scrapy.

Scrapy_vnexpress

This project focuses only on the 'thoi-su' (news) category.
vnexpress_spider.py crawls about 10,000 articles, collecting the title, date, tags, and content of each.
comment_spider.py crawls the comments on each article.
It also crawls all comments from each user whose id vnexpress_spider collected.

Setup for scrapy_vnexpress

pip install scrapy
