Explore

Fine-tune FLUX fast

Customize FLUX.1 [dev] with the fast FLUX trainer on Replicate

Train the model on a small set of example images so it can recognize and generate new concepts such as specific styles, characters, or objects. It's fast (under 2 minutes), cheap (under $2), and gives you a warm, runnable model plus LoRA weights to download.

Official models

Official models are always on, maintained, and have predictable pricing.

I want to…

Upscale images

Upscaling models that create high-quality images from low-quality images.

Make 3D stuff

Models that generate 3D objects, scenes, radiance fields, textures and multi-views.

Restore images

Models that improve or restore images by deblurring, colorizing, and removing noise.

Make videos with Wan2.1

Generate videos with Wan2.1, the fastest and highest-quality open-source video generation model.

Control image generation

Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.

Try for free

Get started with these models without adding a credit card. Whether you're making videos, generating images, or upscaling photos, these are great starting points.

Enhance videos

Models that enhance videos with super-resolution, sound effects, motion capture and other useful production effects.

Detect objects

Models that detect or segment objects in images and videos.

Use FLUX fine-tunes

Browse the diverse range of fine-tunes the community has custom-trained on Replicate.

Latest models

ECCV2022 Quick background removal

Updated 48 runs

A good anime merge from 12 other models

Updated 1.2K runs

Great text-to-image model by Cagliostro Lab

Updated 3.4K runs

⚡️ Blazing fast audio transcription with speaker diarization | Whisper Large V3 Turbo | word & sentence level timestamps | prompt

Updated 1.6M runs

OmniParser is a screen-parsing tool that converts general GUI screens into structured elements.

Updated 59.2K runs

Use a mask to inpaint the image or generate a prompt based on the mask.

Updated 76.4K runs

Place items in a scene without needing to train on them

Updated 2.7K runs

Cogified implementation of OminiControl

Updated 75 runs

Regression of musical arousal and valence values

Updated 8.8K runs

Step-Audio-TTS-3B represents the industry's first Text-to-Speech (TTS) model trained on a large-scale synthetic dataset utilizing the LLM-Chat paradigm

Updated 1.1K runs

Tiled inference implementation of PLKSR

Updated 69 runs

VideoLLaMA 3: Frontier Multimodal Foundation Models for Video Understanding

Updated 2.4K runs

Flex.1 alpha is a pre-trained base 8 billion parameter rectified flow transformer capable of generating images from text descriptions

Updated 317 runs

Zonos-v0.1 by Zyphra, voice cloning, 5 languages and emotion control

Updated 1.5K runs

Janus-Pro is a novel autoregressive framework for multimodal understanding

Updated 6.7K runs

Anthropic's fastest, most cost-effective model, with a 200K token context window (claude-3-5-haiku-20241022)

Updated 1.5M runs

Anthropic's most intelligent language model to date, with a 200K token context window and image understanding (claude-3-5-sonnet-20241022)

Updated 486.5K runs

Transform Images & Text into 3D Models with AI

Updated 50 runs

DeepSeek-VL2, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL

Updated 57.1K runs

DeepSeek-VL2-small, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL

Updated 1K runs

Zonos-v0.1 beta, a SOTA text-to-speech Transformer model with extraordinary expressive range, built by Zyphra.

Updated 260 runs

Converts a video into a black and white dotted video effect

Updated 1K runs

Hibiki: High-Fidelity Simultaneous Speech-To-Speech Translation

Updated 12 runs

Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Updated 1.7K runs

Upscale images 2x or 4x

Updated 6.2K runs

Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)

Updated 3.2K runs

Have fun by changing the face in a GIF!

Updated 58.9K runs

Rembg implementation with mask output

Updated 46 runs

Janus-Pro is a novel autoregressive framework for multimodal understanding

Updated 12.6K runs

Generate music with YuE-s1-7B (English, chain of thought model)

Updated 2.2K runs

Test deployment of OuteTTS 500M

Updated 1.2K runs