a bot that syncs job postings to a WeChat group
- crawler
- data source is the Alibaba dedicated recruitment event (专场) link
- crawler should be triggered once a day.
- parser is part of the crawler: it takes HTML as input and outputs structured data such as JSON
- storage
- save the crawler's output to a storage medium, such as a DB or the file system
- each record must have a flag that indicates whether it has been sent
- push
- the push module loads all job records that have not been sent and sends them to the WeChat group
- the push module should be triggered once a day, after the crawler has finished; one way is to push immediately after storage
- the WeChat API can be used here
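
The storage flag and the push step described above could look roughly like the sketch below. It is only an illustration under assumptions: `JobRecord`, `JobStore`, and `pushUnsent` are hypothetical names, the store is an in-memory map standing in for the DB or file system, and the `sender` callback is a placeholder for whatever WeChat API call ends up being used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class JobPushSketch {

    /** A stored job record; `sent` is the "sent or not" flag. */
    static final class JobRecord {
        final String url;   // assumed to be unique per job posting
        final String title;
        boolean sent;       // stays false until the record has been pushed

        JobRecord(String url, String title) {
            this.url = url;
            this.title = title;
        }
    }

    /** In-memory stand-in for the real storage medium (assumption). */
    static final class JobStore {
        private final Map<String, JobRecord> byUrl = new LinkedHashMap<>();

        /** Stores a record unless its URL is already known, so a re-crawled job keeps its flag. */
        void save(JobRecord record) {
            byUrl.putIfAbsent(record.url, record);
        }

        /** Returns all records that have not been sent yet, in insertion order. */
        List<JobRecord> findUnsent() {
            List<JobRecord> unsent = new ArrayList<>();
            for (JobRecord record : byUrl.values()) {
                if (!record.sent) {
                    unsent.add(record);
                }
            }
            return unsent;
        }
    }

    /** Sends every unsent record through `sender`, marks it as sent, and returns how many were sent. */
    static int pushUnsent(JobStore store, Consumer<JobRecord> sender) {
        int count = 0;
        for (JobRecord record : store.findUnsent()) {
            sender.accept(record); // placeholder for the WeChat API call
            record.sent = true;    // reached only if the send did not throw
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        JobStore store = new JobStore();
        store.save(new JobRecord("https://example.com/job/1", "Java Engineer"));
        store.save(new JobRecord("https://example.com/job/2", "Data Engineer"));

        int first = pushUnsent(store, r -> System.out.println("send: " + r.title));
        int second = pushUnsent(store, r -> System.out.println("send: " + r.title));
        System.out.println(first + " then " + second); // prints "2 then 0"
    }
}
```

Setting the flag only after the send returns means a failed send leaves the record unsent, so the next daily run picks it up again.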
- Spring
- spring-boot
- Apache HttpClient
- Jsoup - Java HTML Parser
- Quartz - Scheduler
- h2database - small DB
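
The once-a-day trigger and the "push right after storage" ordering from the module notes above could be wired up along these lines. This is a sketch, not the project's implementation: Quartz is the scheduler listed here, but the example uses the JDK's `ScheduledExecutorService` as a stand-in so that it has no dependencies, and `DailySyncSketch` and `runOnce` are hypothetical names.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DailySyncSketch {

    /**
     * Runs one sync cycle in a fixed order: crawl and store first, then push.
     * If the first step throws, the push step is skipped for this cycle.
     */
    static void runOnce(Runnable crawlAndStore, Runnable push) {
        crawlAndStore.run();
        push.run();
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        // First run immediately, then once every day.
        scheduler.scheduleAtFixedRate(() -> {
            try {
                runOnce(
                    () -> System.out.println("crawl and store"),      // placeholder step
                    () -> System.out.println("push unsent records")); // placeholder step
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs of this task.
                System.err.println("sync failed: " + e.getMessage());
            }
        }, 0, 1, TimeUnit.DAYS);

        // Demo only: let the first run finish, then stop.
        // A real service would keep the scheduler alive.
        Thread.sleep(500);
        scheduler.shutdownNow();
    }
}
```

Running both steps inside one scheduled task avoids having to coordinate two separate daily triggers.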
- manager @萌萌哒的John
- developer @魏晋风度