Ulixee is a scraping engine with a built-in deployment unit that enables out-of-the-box querying across a horizontal deployment.
This repository is the development home to several of the tools that make it easy to build and manage scraper scripts, including Ulixee Desktop, Cloud and Datastore.
- Hero (/hero). The Automated Browser Engine built for scraping (repository home - https://github.com/ulixee/hero).
- Datastore (/datastore). Packaged "database" containing API access to crawler functions and extractor functions.
- Cloud (/cloud). Run Ulixee tooling on a remote machine.
- Stream (/stream). Query, transform and compose Datastores running on any machine.
- Desktop (/desktop). Supercharge scraper script development using a Hero Replay toolset, remote Datastore viewer and Error troubleshooter (repository home - https://github.com/ulixee/desktop).
Try out Ulixee Desktop! It's a helpful tool for developing and managing your Ulixee scripts.
We publish a Docker image of the latest Ulixee Cloud to:
- Github Container Registry:
docker pull ghcr.io/ulixee/ulixee-cloud && docker tag ghcr.io/ulixee/ulixee-cloud ulixee/ulixee-cloud
- DockerHub:
docker pull ulixee/ulixee-cloud
To use the image, we provide a run.sh script that runs the container as a non-root user on the port of your choice. All environment configuration options are listed here.
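If you want to see roughly what starting the image by hand might look like, here is a minimal sketch. It does not reproduce run.sh, and it does not set up the non-root user that script handles. The port value 1818 is an assumption rather than something this README confirms, so check it against the configuration options before relying on it. The sketch only prints the command so you can review it before running it yourself.

```shell
#!/bin/sh
# Hypothetical values: adjust both to match your setup.
IMAGE="ulixee/ulixee-cloud"
PORT=1818   # assumed port, not confirmed by this README

# Build the command as a string so it can be reviewed before running.
# --rm removes the container when it exits; -p publishes the port to the host.
CMD="docker run --rm -p ${PORT}:${PORT} ${IMAGE}"

# Print only; nothing is started by this sketch.
echo "$CMD"
```

Once the printed command looks right for your environment, you can paste it into a terminal, or prefer run.sh if you want the non-root setup.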
This project serves as a Monorepo for developing the Ulixee Datastore and Cloud. If you are developing, you might wish to have hero as a project adjacent to this one.
1. Run yarn build:all from this repository to build all the projects.
Learn more about Ulixee at ulixee.org.
See How to Contribute for ways to get started.
This project has a Code of Conduct. By interacting with this repository, organization, or community you agree to abide by its terms.
We'd love your help in making Ulixee a better set of tools. Please don't hesitate to send a pull request.