This bot is designed to periodically scrape ads from commercial websites.
When it finds a new ad, the bot sends it to you through Apprise channels.
The package is available on PyPI:
pip install scraper-bot
The package relies heavily on the playwright
package, so before you start using the bot you have to install a Playwright browser:
playwright install --with-deps firefox
You can find further information in the playwright
documentation
(n.b. the bot is not limited to Firefox)
The scraper-bot
package provides the following command to run the bot:
scraper-bot
The CI builds a container image for each version and publishes it to the public GitHub registry:
ghcr.io/robertobochet/scraper-bot
- Create a Telegram bot and retrieve its token
- Download
config.example.yaml
and rename it to config.yaml
- Change the configuration following the guidelines
- Download
docker-compose.yaml
- Start the scraper with
docker-compose
docker-compose up
- Wait for the bot to do its work!
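Since the bot sends notifications through Apprise, the Telegram token from the first step usually ends up inside an Apprise URL. The snippet below is a minimal sketch of how such a URL is put together, assuming Apprise's `tgram://{bot_token}/{chat_id}` form. The helper name `telegram_apprise_url` and the values are hypothetical, and where exactly the URL goes in config.yaml depends on the guidelines in config.example.yaml.

```python
def telegram_apprise_url(bot_token: str, chat_id: str) -> str:
    """Build an Apprise URL of the form tgram://{bot_token}/{chat_id}."""
    if not bot_token or not chat_id:
        raise ValueError("both bot_token and chat_id are required")
    return f"tgram://{bot_token}/{chat_id}"


# Placeholder values for illustration only, not real credentials.
print(telegram_apprise_url("123456789:EXAMPLE_TOKEN", "12345678"))
```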
A Helm chart is also available for deploying the Scraper Bot.
You can find the source code in the repo scraper-bot-chart
The Helm chart package is available in the GitHub OCI registry:
oci://ghcr.io/robertobochet/scraper-bot-chart
You can use it to deploy directly to your Kubernetes cluster:
- Retrieve the default values file
helm show values oci://ghcr.io/robertobochet/scraper-bot-chart > values.yaml
- Customize the
values.yaml
- Install the scraper bot
helm install scraper-bot oci://ghcr.io/robertobochet/scraper-bot-chart -f values.yaml
By default the bot looks for a configuration file in the following paths: ./config.y(a)ml
and /etc/scaraper-bot/config.y(a)ml.
You can override this behavior by passing the --config
argument on the command line, followed by the config file path:
scraper-bot --config /path/to/scraper-bot-config.yaml
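The lookup described above can be pictured with a small sketch. This is not the bot's actual code: the function names `resolve_config_path` and `parse_args`, the caller-supplied `search_dirs`, and the choice to try `.yaml` before `.yml` are all assumptions made for illustration.

```python
import argparse
from pathlib import Path
from typing import Iterable, Optional, Sequence


def resolve_config_path(
    explicit: Optional[str],
    search_dirs: Iterable[Path],
) -> Optional[Path]:
    """Return the config file to use, or None if nothing is found."""
    # An explicit --config value always wins over the default locations.
    if explicit is not None:
        return Path(explicit)
    # Otherwise try config.yaml, then config.yml, in each directory in order
    # (this precedence between the two extensions is an assumption).
    for directory in search_dirs:
        for name in ("config.yaml", "config.yml"):
            candidate = Path(directory) / name
            if candidate.is_file():
                return candidate
    return None


def parse_args(argv: Sequence[str]) -> argparse.Namespace:
    """Parse a hypothetical command line that only knows --config."""
    parser = argparse.ArgumentParser(prog="scraper-bot-sketch")
    parser.add_argument("--config", default=None, help="path to the config file")
    return parser.parse_args(argv)
```

With this sketch, `resolve_config_path(parse_args(["--config", "my.yaml"]).config, [Path(".")])` returns `Path("my.yaml")` without touching the default locations.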
The configuration file has to satisfy the pydantic model, which you can find in scraper_bot.settings.
Furthermore, you can get the config JSON schema from the command line with the --config-schema
argument:
scraper-bot --config-schema
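Once you have the schema as text, a few lines of Python are enough to see which top-level settings exist and which are required. The sketch below assumes the schema is a standard JSON Schema object with `properties` and `required` keys. The helper `summarize_schema` and the example schema with its `interval` and `tasks` fields are hypothetical and do not reflect the bot's real settings.

```python
import json
from typing import Dict, List, Tuple


def summarize_schema(schema_text: str) -> List[Tuple[str, str, bool]]:
    """Return (name, type, required) for each top-level property of a JSON schema."""
    schema: Dict = json.loads(schema_text)
    required = set(schema.get("required", []))
    rows = []
    for name, spec in schema.get("properties", {}).items():
        # Properties defined via $ref or anyOf have no plain "type" key.
        rows.append((name, spec.get("type", "(see schema)"), name in required))
    return rows


# Hypothetical schema used only for illustration; the real one comes from
# `scraper-bot --config-schema` and will have different fields.
EXAMPLE_SCHEMA = """
{
  "type": "object",
  "properties": {
    "interval": {"type": "integer"},
    "tasks": {"type": "array"}
  },
  "required": ["tasks"]
}
"""

for name, type_name, is_required in summarize_schema(EXAMPLE_SCHEMA):
    print(f"{name}: {type_name}{' (required)' if is_required else ''}")
```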
You can also find a configuration example in config.example.yaml.