Replies: 1 comment
-
Hey @adamo1139, I only just noticed this, sorry for the late reply. Can you clarify how this website idea would be a better incentive than the one proposed in the Reddit post? The idea I proposed was to gamify the crowdsourcing of better prompts (and their verification) by effectively serving as a free-to-use Stable Diffusion website. And yes, the goal would be to train open source language models for the task; GPT-3 is only being used here for the initial creation of datasets. Besides GPT-Neo there's also OPT/BLOOM/etc. to consider for various parts of this project, although I'm not familiar enough with those models yet.
As mentioned in this proposal, we can generate the prompt pairs from large sets of existing prompts (such as lexica.art), so asking users to write prompts may not even be necessary. Either way, the general idea of crowdsourcing feedback from people might be useful at some point; Midjourney benefited significantly from this human feedback loop. I think there could also be a place for CLIP and other models (e.g. https://github.com/crowsonkb/simulacra-aesthetic-models) for filtering/sorting the images generated from prompts. Here's a few-shot demo of
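One cheap way to bootstrap pairs from a scraped prompt set could look like this: lexica.art-style prompts typically read "subject, modifier, modifier, …", so a heuristic split can take the clause before the first comma as the simple prompt and the full string as the detailed one. A minimal sketch (the heuristic and the length cap are my assumptions, not part of the proposal):

```python
def make_pair(detailed: str, max_simple_len: int = 100):
    """Derive a (simple, detailed) training pair from one scraped prompt.

    Heuristic: lexica.art-style prompts usually lead with the subject,
    followed by comma-separated style modifiers. Returns None when the
    prompt has no modifiers to learn from.
    """
    detailed = " ".join(detailed.split())   # normalize whitespace
    head, sep, tail = detailed.partition(",")
    if not sep or not tail.strip():         # nothing to expand into
        return None
    simple = head.strip()[:max_simple_len]
    return simple, detailed

pairs = [p for p in map(make_pair, [
    "a red fox in snow, highly detailed, artstation, 8k",
    "portrait photo",                       # no modifiers -> skipped
]) if p]
```

Pairs like these would still need the CLIP/aesthetic filtering mentioned above, since not every scraped prompt produces a good image.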
-
You won't get much quality crowdsourced data for free if you require users to spend a lot of their time improving prompts. It's basically impossible for that to gain traction, because it demands a lot of unfulfilling work.
Do you have enough simple prompt <> detailed prompt pairs to train the LLM?
Have you been successful in getting the model to properly compress a detailed prompt into a simple one?
If the answer to both is "not really", I suggest setting up a website where the user sees a random image from Lexica.art but not its detailed prompt. The user is asked to enter a simple description of the image, up to 100 characters. This simple description is saved together with the prompt scraped from Lexica.art and used for training later. Alternatively, image-recognition software could be used instead of crowdsourced data to get more pairs quickly and cheaply.
I noticed that you seem to be using GPT-3 Davinci. Is it possible to switch to an open source GPT-Neo family model for this project, to make it possible for end users to run the whole thing locally?
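If the project does move to a locally runnable GPT-Neo-class model, the collected pairs would typically be serialized into one training text per example for causal-LM fine-tuning. A sketch, where the exact prompt template and separator are assumptions (only the end-of-text token matches what GPT-Neo's tokenizer actually uses):

```python
EOS = "<|endoftext|>"  # GPT-Neo's end-of-text token

def to_training_text(simple: str, detailed: str) -> str:
    """Format one pair as a causal-LM training example: the model learns
    to continue 'Simple prompt: X' with the detailed version.
    The template wording here is a hypothetical choice, not a spec."""
    return f"Simple prompt: {simple}\nDetailed prompt: {detailed}{EOS}"

example = to_training_text(
    "a fox in snow",
    "a red fox in deep snow, highly detailed, 8k",
)
```

At inference time the user's short description is fed up to "Detailed prompt:" and the model completes the rest, all on local hardware.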