Data Labeling, Curation, and Inference Store
Designed for MLOps & Feedback Loops
🆕 🔥 Play with the Argilla UI with this live demo powered by Hugging Face Spaces (login: `argilla`, password: `12345678`)
🆕 🔥 Since `1.2.0`, Argilla supports vector search for finding the most similar records to a given one. This feature combines vector (semantic) search with more traditional keyword- and filter-based search. Learn more in this deep-dive guide.
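
To make the idea concrete, here is a minimal, library-free sketch of combining a keyword filter with vector similarity. It is a conceptual illustration only, not Argilla's implementation or client API: the record layout, the `hybrid_search` helper, and the toy three-dimensional vectors are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_search(records, query_vector, keyword=None, top_k=3):
    """Hypothetical helper: keyword filter first, then rank by similarity."""
    # Step 1: traditional search, keep records whose text contains the keyword.
    candidates = [
        r for r in records
        if keyword is None or keyword.lower() in r["text"].lower()
    ]
    # Step 2: vector search, order the survivors by closeness to the query.
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(r["vector"], query_vector),
        reverse=True,
    )
    return ranked[:top_k]


# Toy records with made-up embeddings (real ones would come from a model).
records = [
    {"id": 1, "text": "Refund my order please", "vector": [0.9, 0.1, 0.0]},
    {"id": 2, "text": "Where is my order?", "vector": [0.7, 0.3, 0.1]},
    {"id": 3, "text": "Great product, thanks", "vector": [0.0, 0.2, 0.9]},
    {"id": 4, "text": "Order arrived broken", "vector": [0.8, 0.0, 0.3]},
]

# Find the records most similar to record 1 among those mentioning "order".
similar = hybrid_search(records, records[0]["vector"], keyword="order")
print([r["id"] for r in similar])
```

In a real deployment the filtering and nearest-neighbour ranking would be delegated to the search backend rather than computed in Python; the sketch only shows how the two kinds of search narrow and order the results.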
- Programmatic labeling using weak supervision. Built-in label models (Snorkel, Flyingsquid)
- Bulk-labeling and search-driven annotation
- Iterate on training data with any pre-trained model or library
- Efficiently review and refine annotations in the UI and with Python
- Use Argilla built-in metrics and methods for finding label and data errors (e.g., cleanlab)
- Simple integration with active learning workflows
- Close the gap between production data and data collection activities
- Auto-monitoring for major NLP libraries and pipelines (spaCy, Hugging Face, FlairNLP)
- ASGI middleware for HTTP endpoints
- Argilla Metrics to understand data and model issues, like entity consistency for NER models
- Integrated with Kibana for custom dashboards
- Bring different users and roles into the NLP data and model lifecycles
- Organize data collection, review and monitoring into different workspaces
- Manage workspace access for different users
Argilla is composed of a Python Server with Elasticsearch as the database layer, and a Python Client to create and manage datasets.
To get started, you just need to run the Docker image with the following command:
docker run -d --name quickstart -p 6900:6900 argilla/argilla-quickstart:latest
This will run the latest quickstart Docker image with two users, `admin` and `argilla`. The password for both users is `12345678`. You can also configure the following environment variables to suit your needs:
- `ADMIN_USERNAME`: The admin username used to log in to Argilla. The default admin username is `admin`. Set this variable to log in to the app with a username of your own.
- `ADMIN_API_KEY`: Argilla provides a Python library to interact with the app (read, write, and update data, log model predictions, etc.). If you don't set this variable, the library and your app will use the default API key, `admin.apikey`. To secure your app for reading and writing data, we recommend setting this variable. The API key can be any string of your choice; you can use an online generator if you like.
- `ADMIN_PASSWORD`: Sets a custom password for logging in to the app with the admin username. The default password is `12345678`.
- `ANNOTATOR_USERNAME`: The annotator username used to log in to Argilla. The default annotator username is `argilla`. Set this variable to log in to the app with a username of your own.
- `ANNOTATOR_PASSWORD`: Sets a custom password for logging in to the app with the annotator username (`argilla` by default). The default password is `12345678`.
- `ARGILLA_WORKSPACE`: The name of a workspace that will be created and used by default for the admin and annotator users. The default value is the one defined by the `ADMIN_USERNAME` environment variable.
- `LOAD_DATASETS`: Controls which sample datasets are loaded. The default value is `full`. The supported values are:
  - `single`: Load a single dataset for the TextClassification task.
  - `full`: Load all the sample datasets for NLP tasks (TokenClassification, TextClassification, Text2Text).
  - `none`: Do not load any datasets.
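
The variables above can be passed to the container with Docker's `-e` flag. The following sketch assembles such a `docker run` command in Python. The helper name `build_quickstart_command` and the example values (`alice`, `a-long-secret`) are hypothetical, and the script only prints the command rather than executing it.

```python
import shlex

# Environment variables documented for the quickstart image.
SUPPORTED_VARIABLES = {
    "ADMIN_USERNAME",
    "ADMIN_API_KEY",
    "ADMIN_PASSWORD",
    "ANNOTATOR_USERNAME",
    "ANNOTATOR_PASSWORD",
    "ARGILLA_WORKSPACE",
    "LOAD_DATASETS",
}
LOAD_DATASETS_VALUES = {"single", "full", "none"}


def build_quickstart_command(overrides, image="argilla/argilla-quickstart:latest"):
    """Hypothetical helper: return the docker run argv with -e overrides."""
    unknown = set(overrides) - SUPPORTED_VARIABLES
    if unknown:
        raise ValueError(f"Unsupported variables: {sorted(unknown)}")

    load_datasets = overrides.get("LOAD_DATASETS")
    if load_datasets is not None and load_datasets not in LOAD_DATASETS_VALUES:
        raise ValueError(f"LOAD_DATASETS must be one of {sorted(LOAD_DATASETS_VALUES)}")

    # Same base command as the quickstart, with one -e flag per override.
    command = ["docker", "run", "-d", "--name", "quickstart", "-p", "6900:6900"]
    for name in sorted(overrides):
        command += ["-e", f"{name}={overrides[name]}"]
    command.append(image)
    return command


# Example values are placeholders; choose your own credentials.
command = build_quickstart_command(
    {
        "ADMIN_USERNAME": "alice",
        "ADMIN_PASSWORD": "a-long-secret",
        "LOAD_DATASETS": "single",
    }
)
print(shlex.join(command))
```

Validating `LOAD_DATASETS` before launching the container catches a mistyped value early, instead of leaving it to surface after startup.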