[🤗 Demo]
When training PhotoMaker V2, we focused on improving ID fidelity. Compared to PhotoMaker V1, we introduced 1️⃣ new training strategies, incorporated 2️⃣ more portrait datasets, and utilized 3️⃣ a more powerful ID extraction encoder. We will release a technical report soon. Thank you all for your attention.
- ID fidelity has been further improved, especially for single-image inputs and Asian faces. Of course, feeding more facial images still yields better results.
- By integrating with ControlNet, T2I-Adapter, and IP-Adapter, PhotoMaker V2 makes the generation process more controllable. We provide corresponding scripts for reference. Additionally, PhotoMaker V2 allows users to achieve better ID consistency by combining it with IP-Adapter-FaceID, InstantID, and character LoRAs.
- PhotoMaker V2 inherits the promising features of PhotoMaker V1, such as high-quality and diverse generation and powerful text control. It also still supports previous applications such as bringing characters from old photos or paintings back to reality, identity mixing, and changing age or gender.
For comparison, we selected three of the most prevalent ID personalization methods: PhotoMaker V1, IP-Adapter-FaceID-Plus-V2 (the best-performing variant of IP-Adapter-FaceID), and InstantID.
To ensure a fair comparison, we used the same base model (RealVisXL-V4.0) and scheduler (Euler), and selected the best out of four randomly generated images from each method for visualization. The prompts and negative prompts were consistent:
Prompt: instagram photo, portrait photo of a woman img holding two cats, colorful, perfect face, natural skin, hard shadows, film grain
Negative Prompt: (asymmetry, worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), open mouth
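For reference, the sketch below shows how such a comparison setting could be reproduced. It assumes the `PhotoMakerStableDiffusionXLPipeline` API from this repository's V1 release; the V2 checkpoint name, the example image path, and the `start_merge_step` value are illustrative, and V2 additionally relies on an InsightFace-based ID encoder, so the exact loading arguments may differ (please consult the provided inference scripts).

```python
import os

import torch
from diffusers import EulerDiscreteScheduler
from diffusers.utils import load_image
from huggingface_hub import hf_hub_download

# Assumption: the PhotoMaker pipeline class follows the V1-style API of this repository.
from photomaker import PhotoMakerStableDiffusionXLPipeline

# Same base model and scheduler used for all compared methods.
base_model = "SG161222/RealVisXL_V4.0"
# Illustrative checkpoint location; check the model card for the actual V2 weight name.
photomaker_ckpt = hf_hub_download(
    repo_id="TencentARC/PhotoMaker-V2", filename="photomaker-v2.bin"
)

pipe = PhotoMakerStableDiffusionXLPipeline.from_pretrained(
    base_model, torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_photomaker_adapter(
    os.path.dirname(photomaker_ckpt),
    subfolder="",
    weight_name=os.path.basename(photomaker_ckpt),
    trigger_word="img",  # "img" marks where the ID embedding is injected into the prompt
)
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

prompt = (
    "instagram photo, portrait photo of a woman img holding two cats, "
    "colorful, perfect face, natural skin, hard shadows, film grain"
)
negative_prompt = (
    "(asymmetry, worst quality, low quality, illustration, 3d, 2d, "
    "painting, cartoons, sketch), open mouth"
)

# One or more ID images of the same person; more images usually improve fidelity.
id_images = [load_image("./examples/woman.jpg")]  # hypothetical path

# Generate four candidates and keep the best one, as in the comparison above.
images = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    input_id_images=id_images,
    num_images_per_prompt=4,
    num_inference_steps=50,
    start_merge_step=10,  # illustrative; step at which ID merging begins
).images
```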
We can see that our method has advantages both in maintaining ID fidelity and in the quality of the generated images.
PhotoMaker V2 can work with T2I-Adapter's doodle mode, allowing controlled image generation based on user drawings and prompts. This feature can be experienced in [🤗 our official demo]. The following video shows an example of the workflow:
photomaker_v2_demo_small.mp4
Additionally, PhotoMaker V2 can work with ControlNet and T2I-Adapter for layout control, such as edge, pose, and depth conditioning.
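As a rough illustration (the provided scripts are the authoritative reference), the sketch below prepares an OpenPose conditioning image and loads an SDXL OpenPose ControlNet with standard `diffusers`/`controlnet_aux` APIs. The ControlNet-enabled PhotoMaker pipeline named in the comments is an assumption; consult the example scripts for the exact entry point.

```python
import torch
from controlnet_aux import OpenposeDetector
from diffusers import ControlNetModel
from diffusers.utils import load_image

# Extract a pose map from a reference image.
openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
pose_image = openpose(load_image("./examples/pose_reference.jpg"))  # hypothetical path

# A commonly used SDXL OpenPose ControlNet checkpoint.
controlnet = ControlNetModel.from_pretrained(
    "thibaud/controlnet-openpose-sdxl-1.0", torch_dtype=torch.float16
)

# Assumption: the repository provides a ControlNet variant of the PhotoMaker pipeline.
# The class and argument names below are illustrative only:
# pipe = PhotoMakerStableDiffusionXLControlNetPipeline.from_pretrained(
#     base_model, controlnet=controlnet, torch_dtype=torch.float16
# ).to("cuda")
# images = pipe(
#     prompt=prompt,
#     input_id_images=id_images,
#     image=pose_image,
#     controlnet_conditioning_scale=0.7,
# ).images
```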
We provide two example scripts:
The image below is an example of controlled generation using pose through ControlNet:
Please refer to our sample script: inference_pmv2_ip_adapter.py
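As a rough sketch of the idea (not a substitute for the script above), the snippet below attaches a generic SDXL IP-Adapter to the PhotoMaker pipeline from the earlier sketch using diffusers' standard loader; `inference_pmv2_ip_adapter.py` may wire things differently, for example using IP-Adapter-FaceID with precomputed face embeddings.

```python
from diffusers.utils import load_image

# Assumption: `pipe`, `prompt`, `negative_prompt`, and `id_images` come from the earlier
# PhotoMaker V2 sketch; the pipeline inherits diffusers' IP-Adapter loading utilities.
pipe.load_ip_adapter(
    "h94/IP-Adapter", subfolder="sdxl_models", weight_name="ip-adapter_sdxl.bin"
)
pipe.set_ip_adapter_scale(0.6)  # balance the image condition against the text prompt

reference_image = load_image("./examples/reference.jpg")  # hypothetical reference image

images = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    input_id_images=id_images,         # PhotoMaker ID condition
    ip_adapter_image=reference_image,  # additional image condition via IP-Adapter
    num_images_per_prompt=1,
).images
```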
The image below is an example:
PhotoMaker V2, as a plugin, works well with other plugins, such as IP-Adapter-FaceID or InstantID, to further improve ID fidelity, and it can be combined with LCM for acceleration (see the sketch below). We look forward to your exploration of more features, and we welcome PRs and contributions to the open-source community.
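For the LCM acceleration mentioned above, a minimal sketch could look like the following, assuming the PhotoMaker pipeline from the earlier sketch inherits diffusers' LoRA loading so the public SDXL LCM-LoRA can be attached:

```python
from diffusers import LCMScheduler

# Assumption: `pipe`, `prompt`, `negative_prompt`, and `id_images` come from the earlier sketch.
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")
pipe.fuse_lora()
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

# LCM sampling typically uses few steps and a low guidance scale.
images = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    input_id_images=id_images,
    num_inference_steps=8,
    guidance_scale=1.5,
).images
```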
🥳 If you have built or know of repositories or applications around PhotoMaker V2, please leave us a message in the discussion. We will include them in our README.
Since PhotoMaker V2 relies on InsightFace, its use also needs to comply with the InsightFace license.