Explore

Fine-tune FLUX fast

Customize FLUX.1 [dev] with the fast FLUX trainer on Replicate

Train the model on a small set of example images so it can recognize and generate new concepts such as specific styles, characters, or objects. Training is fast (under 2 minutes), cheap (under $2), and gives you a warm, runnable model plus LoRA weights to download.

Official models

Official models are always on, maintained, and have predictable pricing.

I want to…

Upscale images

Models that upscale low-quality images into high-quality ones.

Make 3D stuff

Models that generate 3D objects, scenes, radiance fields, textures and multi-views.

Restore images

Models that improve or restore images by deblurring, colorizing, and removing noise.

Make videos with Wan2.1

Generate videos with Wan2.1, the fastest and highest-quality open-source video generation model.

Control image generation

Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.

Try for free

Get started with these models without adding a credit card. Whether you're making videos, generating images, or upscaling photos, these are great starting points.

Enhance videos

Models that enhance videos with super-resolution, sound effects, motion capture and other useful production effects.

Detect objects

Models that detect or segment objects in images and videos.

Use FLUX fine-tunes

Browse the diverse range of fine-tunes the community has custom-trained on Replicate.

Latest models

Change the strength of the prompt to enable editing style and content. Recommendation: keep the seed constant and tune the strength.

Updated 370 runs

This model lets you change the strength of the Redux image prompt so that the text prompt has a stronger effect. It is particularly useful for taking content from the provided image and applying style or editing changes from the prompt.

Updated 1.9K runs

Microsoft's tool to convert Office documents, PDFs, images, audio, and more to LLM-ready markdown.

Updated 48.4K runs

Real-ESRGAN super-resolution to upsample low-resolution satellite imagery.

Updated 57 runs

Creative Upscale focuses on enhancing details and refining complex elements in the image. It doesn’t just increase resolution but adds depth by improving textures, fine details, and facial features.

Updated 4.7K runs

Designed to make images sharper and cleaner, Crisp Upscale increases overall quality, making visuals suitable for web use or print-ready materials.

Updated 104.2K runs

SVFR: A Unified Framework for Generalized Video Face Restoration

Updated 545 runs

End-to-end AI speech model designed for natural-sounding conversational speech synthesis, with support for context-aware prosody, intonation, and emotional expression.

Updated 26.1K runs

Image generation. Added: inpaint_strength, loras_custom_urls

Updated 326.6K runs

Simple tool to merge separate video snippets

Updated 441 runs

allenai/OLMo-2-1124-13B-Instruct, text generation model

Updated 115 runs

Refinement module to improve satellite-derived shorelines

Updated 5 runs

2025 fork of closed Coqui XTTS-v2: Multilingual Text To Speech Voice Clone

Updated 405 runs

Cog implementation of LTX video from its diffusers pipeline

Updated 70 runs

Cog implementation of LTX image to video from its diffusers pipeline

Updated 119 runs

Island Segmentation!

Updated 15 runs

SoTA depth estimation

Updated 599 runs

SDXL Canny controlnet with LoRA support.

Updated 393.6K runs

test

Updated 16 runs

Whisper model that can be used for adding domain-specific words

Updated 33K runs

Kokoro is a frontier TTS model for its size of 82 million parameters (text in/audio out).

Updated 810 runs

LoRA Inference for hunyuanvideo-community/HunyuanVideo finetunes

Updated 78 runs

LTX-Video is the first DiT-based video generation model capable of generating high-quality videos in real-time. It produces 24 FPS videos at a 768x512 resolution faster than they can be watched.

Updated 109.6K runs

Fine-tune HunyuanVideo LoRAs with kohya-ss/musubi-tuner

Updated 85 runs

Simple tool to merge a foreground and background image

Updated 2K runs

Convert musubi-tuner LoRA to ComfyUI compatible format

Updated 47 runs

Fine-tune HunyuanVideo via a-r-r-o-w/finetrainers (Work In Progress)

Updated 53 runs

Microsoft's Florence 2 Base

Updated 246 runs

Minimal and Universal Control for Diffusion Transformer - demo for Subject-driven generation

Updated 1.9K runs

Minimal and Universal Control for Diffusion Transformer - demo for Spatially aligned control

Updated 106 runs

Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization

Updated 11.1K runs

One Diffusion to Generate Them All

Updated 159 runs

Upscale low resolution images to high resolution images

Updated 3.8K runs

Cog implementation of Diffusers Flux RFInversion Pipeline

Updated 204 runs

Detect deepfake face-swap images

Updated 82 runs

Swap the source face onto the target face

Updated 839 runs

Unofficial community fork and Diffusers formatted weights of tencent/HunyuanVideo

Updated 183 runs

Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Updated 1.1K runs

A simple model to detect and crop a face found in an image, made for https://outfit.fm

Updated 7.9K runs

Fork / Remix of Apollo 7B by Luis C. (https://replicate.com/lucataco/apollo-7b) to support multi-turn conversations.

Updated 24 runs

QVQ-72B-Preview by Qwen is an experimental research model focusing on enhancing visual reasoning capabilities

Updated 272 runs

Remodels interiors

Updated 2.4K runs