Datasaurus

Do computer vision with 1000x less data

License: Apache-2.0

Hosted App (coming soon) - Running Locally


Leverage a foundational text-vision model for your computer vision tasks. Instead of training your own models from scratch, rely on pre-trained ones: you can achieve strong performance with no data at all, and even better performance with just a handful of datapoints. If you already have a lot of data, exceed your previous models' performance by fine-tuning a foundational model (coming soon).

Features

  • Fully open-source
    • Fine-tuned model weights can be downloaded.
  • Do computer vision tasks without any data.
  • Promptable system
    • If your requirements change, you only need to adjust your prompt; there is no need to retrain an entire computer vision model.
  • And many more features coming soon...

Examples

Instead of training a model from scratch, you can simply prompt against your images. A hedged sketch of what such a call might look like follows the list below.

  • Color Detection Pipeline
    • Prompt: Determine the main color of specific objects within an image.
  • Count and Action Recognition Pipeline
    • Prompt: Identify the number of people in a scene and their actions.
  • Fruit Ripeness Analysis Pipeline
    • Prompt: Analyze images of fruit to determine their level of ripeness.
  • Dog Breed Identification Pipeline
    • Prompt: Classify the breed of a dog from a given image.
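
The snippet below is a minimal sketch of what one of these prompt-driven pipelines might look like when inference is delegated to Replicate (the backend used in Running Locally). The model slug, its input field names, and the image path are assumptions based on Replicate's public LLaVA listing rather than the Datasaurus codebase, so treat it as illustrative only.

```python
# Minimal sketch: sending the color-detection prompt to a hosted LLaVA model via
# Replicate. Assumptions: the "yorickvp/llava-13b" slug and its "image"/"prompt"
# inputs come from Replicate's public listing, not from Datasaurus itself, and
# "fruit_basket.jpg" is a hypothetical local image.
import replicate  # pip install replicate; requires REPLICATE_API_TOKEN in your environment

with open("fruit_basket.jpg", "rb") as image_file:
    output = replicate.run(
        "yorickvp/llava-13b",  # older client versions may need a pinned ":<version>" suffix
        input={
            "image": image_file,
            "prompt": "Determine the main color of specific objects within an image.",
        },
    )

# LLaVA responses stream back as chunks of text; join them into one answer.
print("".join(output))
```

Because the task lives entirely in the prompt string, swapping in, say, the people-counting prompt above changes the pipeline without any retraining.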

Supported Base Models

  • LLaVA-v1.5-7B
  • LLaVA-v1.5-13B
  • GPT-4V (as soon as the model API is available)

Roadmap

  • v0 launched
  • Add examples section.
  • Dataset importer + associated dashboard.
  • Parameterizable prompt.
  • Fine-tune models.
  • Add visual in-context learning.
  • Support for additional inference backends (ggml).
  • Hosted service deployment (currently waitlist only).
  • Stronger output guidance.

Running Locally

  1. Install NodeJS 20 (earlier versions will very likely work but aren't tested)
  2. Install Supabase with npm i supabase --save-dev
  3. Install Conda
  4. Clone this repository and open it: git clone https://github.com/datasaurus-ai/datasaurus && cd datasaurus
  5. Install the frontend dependencies: cd frontend && npm install && cd ..
  6. Install the backend dependencies: cd backend && virtualenv datasaurus-backend && source datasaurus-backend/bin/activate && pip install -r requirements.txt && cd ..
  7. Start Supabase: cd supabase && supabase start && cd ..
  8. Register for an account on replicate.com and obtain an API key. We use Replicate as our inference backend. (Note: the ability to run models locally is coming soon.)
  9. Create the backend .env file (cd backend && cp .env.example .env && cd ..) and complete it
  10. Create the frontend .env file (cd frontend && cp .env.example .env && cd ..) and complete it
  11. Start the backend: cd backend && source datasaurus-backend/bin/activate && uvicorn src.main:app --reload && cd ..
  12. Start the frontend (in a separate terminal): cd frontend && npm run dev && cd ..
  13. Navigate to http://localhost:3000 (a quick backend sanity check is sketched below)
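
As a quick sanity check that the backend started correctly, you can request FastAPI's auto-generated docs page. This is a minimal sketch assuming the project keeps uvicorn's default port 8000 and FastAPI's default /docs route; adjust it if your local configuration differs.

```python
# Minimal sanity check: assumes uvicorn's default port (8000) and FastAPI's
# auto-generated /docs route; the project may configure these differently.
import urllib.request

response = urllib.request.urlopen("http://localhost:8000/docs", timeout=5)
print(response.status)  # 200 means the backend is up and serving requests
```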

Interested?

If you are interested, please leave us a star and/or sign up for the launch of the hosted version at datasaurus.app.