
struggling-man/LagouJob

 
 


# Data analysis of Lagou

### Main Function

  1. Scrape job data from Lagou to follow the latest information on Internet careers

  2. Analyze and visualize the data

  3. Crawl job detail pages and generate a word cloud as a "Job Impression"

### Note

Because Lagou's back-end API has changed, this repository may not work correctly.

I will try to fix these problems and publish V2.0 in the near future.

Thanks for starring and watching!

I will do my best to make it better and more robust, with more new features as well.

Sorry for any inconvenience this may cause!

V2.0_ALPHA is under development.

### Install Prerequisites

  1. Python version >= 3.4
  2. Third-party libraries:

    pip install requests
    pip install beautifulsoup4
    pip install jieba
    pip install openpyxl
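Before running the scripts, it can help to confirm that the prerequisites above are actually importable. The following is a minimal sketch using only the standard library. The helper names `missing_packages` and `python_ok` are hypothetical and are not part of this repository. Note that `beautifulsoup4` is imported under the module name `bs4`.

```python
import importlib.util
import sys

# Maps pip package names to the module names used in import statements.
# beautifulsoup4 is the only one whose import name differs (bs4).
REQUIRED = {
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "jieba": "jieba",
    "openpyxl": "openpyxl",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


def python_ok(minimum=(3, 4)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


# Report the interpreter check and any packages that still need installing.
print("Python version OK:", python_ok())
for name in missing_packages():
    print("Missing, install with: pip install " + name)
```

If nothing is printed after the version line, all four libraries were found in the current environment.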

### Basic Usage

  1. Clone this project from GitHub

  2. Set the path to job.xml in the readconfig() call in lagouspider.py: configmap = toolkit.readconfig(YourLocalPath)

  3. Run lagouspider.py to fetch the job data as JSON

  4. Run excelhelper.py to generate an Excel file for each job

  5. Run jobdetailspider.py to fetch the job recruitment details (updated in V1.3)

  6. Run analyser.py to segment the sentences into words and return the top 20 hot words (updated in V1.3)
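The last step, segmenting text and returning the top 20 hot words, can be sketched as a simple frequency count. This is not the code in analyser.py, which is not shown here. It is an illustration under stated assumptions: the function name `top_hot_words`, its parameters, and the sample descriptions are all hypothetical, and the default tokenizer just splits on whitespace so the example runs without extra dependencies.

```python
from collections import Counter


def top_hot_words(texts, tokenize=str.split, top_n=20, min_len=2):
    """Count words across all texts and return the top_n most frequent.

    tokenize is any callable that turns a string into a list of words.
    Words shorter than min_len characters are ignored, which drops most
    punctuation and single-character tokens.
    """
    counts = Counter()
    for text in texts:
        for word in tokenize(text):
            word = word.strip().lower()
            if len(word) >= min_len:
                counts[word] += 1
    return counts.most_common(top_n)


# Hypothetical job descriptions, used only to demonstrate the call.
descriptions = [
    "Python backend developer with SQL experience",
    "Data analyst with Python and SQL",
    "Python crawler engineer",
]
for word, count in top_hot_words(descriptions, top_n=3):
    print(word, count)
```

For Chinese job descriptions, a segmenter such as `jieba.lcut` could be passed as `tokenize` in place of `str.split`, since whitespace splitting does not separate Chinese words.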

### Analysis Results

Image1 Image2 Image3 Image4 Image5

For more information, please see my answer on Zhihu.
