Dataset

The dataset used for training is composed of 80 images with a resolution of 512x512 pixels.

Caption method

Initially, the captions were created using taggui: a first pass of auto-generated captions using florence-large-ft, followed by a manual review of each caption.

However, the best results were obtained from a training run without captions. After the first version, all subsequent training runs were done without captions.

Training method

The LoRA was trained using the Replicate ai-toolkit. Training ran for 6000 steps with a learning rate of 0.0004 and a network dimension of 32.
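
To put these numbers in context, the short sketch below collects the stated hyperparameters and estimates how many times each training image was seen. The class and field names are illustrative only and are not ai-toolkit's configuration keys; the batch size of 1 is an assumption, since it is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters reported for this LoRA (field names are hypothetical)."""
    num_images: int = 80           # dataset size
    resolution: int = 512          # images are 512x512 pixels
    steps: int = 6000              # total training steps
    learning_rate: float = 0.0004
    network_dim: int = 32          # LoRA network dimension
    batch_size: int = 1            # ASSUMPTION: batch size is not stated

    def passes_over_dataset(self) -> float:
        """Approximate number of times each image is seen during training."""
        return self.steps * self.batch_size / self.num_images


config = TrainingConfig()
print(config.passes_over_dataset())  # 75.0 under the batch-size-1 assumption
```

With a different batch size the estimate scales proportionally, so the figure is only a rough guide.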

Recommended settings

Experimenting with the model showed that the following settings give good results:

prompt: including “SLOW3D” gives stronger results; follow it with any description you prefer (the model was trained on images of women, so the best results are achieved when generating women)
lora_scale: 0.8
num_inference_steps: 28
model: dev
guidance_scale: 3.5 or 2.5
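
The settings above can be gathered into a single input payload, as in the minimal sketch below. The helper name build_input and the example description are hypothetical; the keys are the setting names listed above.

```python
TRIGGER_WORD = "SLOW3D"  # trigger word from the recommended prompt setting


def build_input(description: str, guidance_scale: float = 3.5) -> dict:
    """Build an input payload from the recommended settings.

    `description` is free text placed after the trigger word.
    `guidance_scale` defaults to 3.5; 2.5 is the other recommended value.
    """
    return {
        "prompt": f"{TRIGGER_WORD} {description}",
        "lora_scale": 0.8,
        "num_inference_steps": 28,
        "model": "dev",
        "guidance_scale": guidance_scale,
    }


# Hypothetical example description; any description can follow the trigger word.
payload = build_input("portrait of a woman in a red coat", guidance_scale=2.5)
print(payload["prompt"])
```

If the model is run through the Replicate Python client, a dictionary like this would be passed as the input argument of replicate.run together with the model identifier, which is not given here.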

Model created over 1 year ago