Replies: 1 comment
-
Hey @adamo1139, I only just noticed this, sorry for the late reply. Can you clarify how this website idea would be a better incentive than the one proposed in the Reddit post? The idea I proposed was to gamify the crowdsourcing of better prompts (and their verification) by effectively serving as a free-to-use Stable Diffusion website. And yes, the goal would be to train open source language models for the task; GPT-3 is only being used here for the initial creation of datasets. Besides GPT-Neo there's also OPT/BLOOM/etc. to consider for various parts of this project, although I'm not familiar enough with those models yet.
As mentioned in this proposal, we can generate the prompt pairs from large sets of existing prompts (such as lexica.art), so asking users to write prompts may not even be necessary. Either way, the general idea of crowdsourcing feedback from people might be useful at some point; Midjourney benefited significantly from this human feedback loop. I think there could also be a place for CLIP and other models (e.g. https://github.com/crowsonkb/simulacra-aesthetic-models) for filtering/sorting the images generated from prompts. Here's a few-shot demo of
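One cheap way to bootstrap pairs from a scraped prompt set could look like this: lexica.art-style prompts typically read "subject, modifier, modifier, …", so a heuristic split can take the clause before the first comma as the simple prompt and the full string as the detailed one. A minimal sketch (the heuristic and the length cap are my assumptions, not part of the proposal):

```python
def make_pair(detailed: str, max_simple_len: int = 100):
    """Derive a (simple, detailed) training pair from one scraped prompt.

    Heuristic: lexica.art-style prompts usually lead with the subject,
    followed by comma-separated style modifiers. Returns None when the
    prompt has no modifiers to learn from.
    """
    detailed = " ".join(detailed.split())   # normalize whitespace
    head, sep, tail = detailed.partition(",")
    if not sep or not tail.strip():         # nothing to expand into
        return None
    simple = head.strip()[:max_simple_len]
    return simple, detailed

pairs = [p for p in map(make_pair, [
    "a red fox in snow, highly detailed, artstation, 8k",
    "portrait photo",                       # no modifiers -> skipped
]) if p]
```

Pairs like these would still need the CLIP/aesthetic filtering mentioned above, since not every scraped prompt produces a good image.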
-
You won't get much quality crowdsourced data for free if you require users to spend a lot of their time improving prompts. It's basically impossible for that to gain traction, because it demands a lot of unfulfilling work.
Do you have enough simple prompt <> detailed prompt pairs to train the LLM?
Have you been successful in getting the model to properly compress a detailed prompt into a simple one?
If the answer to both is "not really", I suggest setting up a website where the user sees a random image from Lexica.art but not its detailed prompt. The user is asked to enter a simple description of the image, up to 100 characters. This simple description is saved together with the prompt scraped from Lexica.art and used for training later. Alternatively, image-recognition software could be used instead of crowdsourced data to get more pairs quickly and cheaply.
I noticed that you seem to be using GPT-3 Davinci. Is it possible to switch to an open source GPT-Neo family model for this project, to make it possible for end users to run the whole thing locally?
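If the project does move to a locally runnable GPT-Neo-class model, the collected pairs would typically be serialized into one training text per example for causal-LM fine-tuning. A sketch, where the exact prompt template and separator are assumptions (only the end-of-text token matches what GPT-Neo's tokenizer actually uses):

```python
EOS = "<|endoftext|>"  # GPT-Neo's end-of-text token

def to_training_text(simple: str, detailed: str) -> str:
    """Format one pair as a causal-LM training example: the model learns
    to continue 'Simple prompt: X' with the detailed version.
    The template wording here is a hypothetical choice, not a spec."""
    return f"Simple prompt: {simple}\nDetailed prompt: {detailed}{EOS}"

example = to_training_text(
    "a fox in snow",
    "a red fox in deep snow, highly detailed, 8k",
)
```

At inference time the user's short description is fed up to "Detailed prompt:" and the model completes the rest, all on local hardware.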