PhotoMaker
Customizing Realistic Human Photos via Stacked ID Embedding
Usage
Users can input one or a few face photos, along with a text prompt, to receive a customized photo or painting within seconds (no training required!). Additionally, this model can be adapted to any SDXL-based base model or used in conjunction with other LoRA modules.
Realistic results
Stylization results
More results can be found on our project page.
Model Details
The checkpoint mainly contains two parts, corresponding to two keys in the loaded state dict:
- id_encoder: includes a fine-tuned OpenCLIP-ViT-H-14 and a few fuse layers.
- lora_weights: applies to all attention layers in the UNet, with the rank set to 64.
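
The two-key layout described above can be sketched in plain Python. This is only an illustration under stated assumptions: a nested dict stands in for the loaded checkpoint, the helper names (split_checkpoint, lora_param_count) are hypothetical, and the parameter names and shapes in the toy checkpoint are invented. Only the two top-level keys and the rank of 64 come from this README.

```python
# Hypothetical sketch: a plain dict stands in for the loaded state dict.
EXPECTED_KEYS = ("id_encoder", "lora_weights")  # the two keys named in this README
LORA_RANK = 64  # rank stated in this README


def split_checkpoint(state_dict):
    """Return the id_encoder and lora_weights sub-dicts, or raise if one is missing."""
    missing = [key for key in EXPECTED_KEYS if key not in state_dict]
    if missing:
        raise KeyError(f"checkpoint is missing keys: {missing}")
    return state_dict["id_encoder"], state_dict["lora_weights"]


def lora_param_count(d_in, d_out, rank=LORA_RANK):
    """Parameters added by one LoRA pair: down (rank x d_in) plus up (d_out x rank)."""
    return rank * (d_in + d_out)


# Toy stand-in: parameter names and shapes are invented for illustration only.
toy_checkpoint = {
    "id_encoder": {"fuse.weight": (1024, 2048)},
    "lora_weights": {"attn.to_q.down": (64, 640), "attn.to_q.up": (640, 64)},
}

id_encoder_sd, lora_sd = split_checkpoint(toy_checkpoint)
print(sorted(toy_checkpoint))      # the two top-level keys
print(lora_param_count(640, 640))  # extra parameters for one 640x640 projection
```

With a real checkpoint, the same check would run on whatever object the loader returns, provided it behaves like a dict with these two top-level keys.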