Explore

Fine-tune FLUX fast

Customize FLUX.1 [dev] with the fast FLUX trainer on Replicate

Train the model to recognize and generate new concepts using a small set of example images, for specific styles, characters, or objects. It's fast (under 2 minutes), cheap (under $2), and gives you a warm, runnable model plus LoRA weights to download.


Featured models

minimax / hailuo-02

Hailuo 2 is a text-to-video and image-to-video model that can make 6s or 10s videos at 768p (standard) or 1080p (pro). It excels at real world physics.

22.1K runs

minimax / hailuo-02-fast

A low-cost, fast version of Hailuo 02. Generates 6s and 10s videos at 512p.

481 runs

bytedance / omni-human

Turns your audio/video/images into professional-quality animated videos

534 runs

google / veo-3-fast

A faster and cheaper version of Google’s Veo 3 video model, with audio

14.6K runs

google / veo-3

Sound on: Google’s flagship Veo 3 text-to-video model, with audio

129.5K runs

flux-kontext-apps / kontext-emoji-maker

Use Kontext to turn any image into an emoji, using a LoRA by starsfriday

501 runs

wan-video / wan-2.2-t2v-fast

A very fast and cheap PrunaAI optimized version of Wan 2.2 A14B text-to-video

5.2K runs

black-forest-labs / flux-krea-dev

An opinionated text-to-image model from Black Forest Labs in collaboration with Krea that excels in photorealism. Creates images that avoid the oversaturated "AI look".

13.9K runs

wan-video / wan-2.2-i2v-a14b

Image-to-video at 720p and 480p with Wan 2.2 A14B

2.9K runs

Official models

Official models are always on, maintained, and have predictable pricing.
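
As a rough sketch of what calling one of these models could look like, the snippet below builds (but does not send) an HTTP request that would start a prediction on `minimax/hailuo-02`. It assumes Replicate's HTTP API accepts `POST https://api.replicate.com/v1/models/{owner}/{name}/predictions` with a bearer token. The helper name `build_prediction_request` and the input fields (`prompt`, `duration`, `resolution`) are illustrative guesses, so check the model's own schema before relying on them.

```python
import json
import urllib.request

API_BASE = "https://api.replicate.com/v1"  # assumed base URL


def build_prediction_request(owner, name, model_input, token):
    """Build (without sending) a POST request that would start a prediction."""
    url = f"{API_BASE}/models/{owner}/{name}/predictions"
    body = json.dumps({"input": model_input}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical input fields; the real schema may differ.
request = build_prediction_request(
    "minimax",
    "hailuo-02",
    {
        "prompt": "a paper boat drifting down a rainy street",
        "duration": 6,
        "resolution": "768p",
    },
    token="YOUR_API_TOKEN",  # placeholder, not a real token
)
print(request.get_method(), request.full_url)
```

Nothing here touches the network. To actually run a prediction you would pass the request to `urllib.request.urlopen` with a real token and read the JSON response.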

minimax / hailuo-02

Generate videos, and Videos from images

22.1K runs

bytedance / omni-human

Turns your audio/video/images into professional-quality animated videos

534 runs

openai / clip

Official CLIP models. Generate CLIP (clip-vit-large-patch14) text and image embeddings.

90 runs

ibm-granite / granite-speech-3.3-8b

Granite-speech-3.3-8b is a compact and efficient speech-language model, specifically designed for automatic speech recognition (ASR) and automatic speech translation (AST).

733 runs

black-forest-labs / flux-krea-dev

An opinionated text-to-image model from Black Forest Labs in collaboration with Krea that excels in photorealism. Creates images that avoid the oversaturated "AI look".

13.9K runs

wan-video / wan-2.2-i2v-a14b

Generate videos, Videos from images, and Make videos with Wan

2.9K runs

minimax / video-01

Generate videos, and Videos from images

554.9K runs

ibm-granite / granite-3.3-8b-instruct

Use LLMs

857.8K runs

ibm-granite / granite-vision-3.3-2b

Granite-vision-3.3-2b is a compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more.

5.3K runs

bytedance / seedance-1-lite

Generate videos, and Videos from images

237.8K runs

bytedance / seedance-1-pro

Generate videos, and Videos from images

168.3K runs

luma / photon-flash

Generate images

128.3K runs

luma / ray-2-540p

Generate videos, and Videos from images

9.8K runs

luma / ray-2-720p

Generate videos, and Videos from images

24.2K runs

luma / ray-flash-2-720p

Generate videos, and Videos from images

25.5K runs

luma / reframe-image

Change the aspect ratio of any photo using AI (not cropping)

6.1K runs
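
Embedding models such as `openai / clip` in the list above return vectors for text and images, and a common way to compare two such vectors is cosine similarity. The sketch below uses tiny made-up vectors in place of real model output (actual CLIP embeddings are much longer), and the function and variable names are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Made-up stand-ins for one text embedding and two image embeddings.
text_vec = [0.2, 0.1, 0.7]
image_close = [0.25, 0.05, 0.65]
image_far = [0.9, 0.3, -0.1]

scores = {
    "image_close": cosine_similarity(text_vec, image_close),
    "image_far": cosine_similarity(text_vec, image_far),
}

# Pick the image whose embedding points in the most similar direction.
best_match = max(scores, key=scores.get)
print(best_match, round(scores[best_match], 3))
```

Ranking candidates by this score is the usual basis for text-to-image search over a set of precomputed image embeddings.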


I want to…

Generate images

Use AI To Generate Images & Photos with an API

Caption videos

Use AI To Caption Videos with an API

Generate speech

Convert text to speech

Use a face to make images

Make realistic images of people instantly

Generate videos

Use AI To Generate Videos with an API

Upscale images

Upscaling models that create high-quality images from low-quality images

Generate music

Use AI To Generate Music with an API

Edit images

Use AI To Edit Any Image with an API

Transcribe speech

Models that convert speech to text

Extract text from images

Optical character recognition (OCR) and text extraction

Remove backgrounds

Models that remove backgrounds from images and videos

Use the FLUX family of models

The FLUX family of text-to-image models from Black Forest Labs

Restore images

Models that improve or restore images by deblurring, colorization, and removing noise

Enhance videos

Upscaling models that create high-quality video from low-quality videos

Edit videos

Tools for editing videos.

Videos from images

Use AI To Generate Videos from images with an API

Make videos with Wan

Generate videos with Wan, the fastest and highest quality open-source video generation model.

Use Kontext fine-tunes

Browse the diverse range of fine-tunes the community has custom-trained on Replicate

Caption images

Use AI To Caption Images with an API

Chat with images

Ask language models about images

Use LLMs

Models that can understand and generate text

Make 3D stuff

Models that generate 3D objects, scenes, radiance fields, textures and multi-views.

Use handy tools

Toolbelt-type models for videos and images.

Control image generation

Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.

Sing with voices

Voice-to-voice cloning and musical prosody

Get embeddings

Models that generate embeddings from inputs

Try for free

Get started with these models without adding a credit card. Whether you're making videos, generating images, or upscaling photos, these are great starting points.

Use official models

Official models are always on, maintained, and have predictable pricing.

Detect objects

Models that detect or segment objects in images and videos.

Use FLUX fine-tunes

Browse the diverse range of fine-tunes the community has custom-trained on Replicate

Popular models

openai/whisper

Convert speech in audio to text

Updated 8 months, 1 week ago 110M runs

prunaai/flux.1-dev

This is the fastest Flux Dev endpoint in the world. Contact us for more at pruna.ai

Updated 1 week, 1 day ago 11.9M runs

turian/insanely-fast-whisper-with-video

whisper-large-v3, incredibly fast, with video transcription

Updated 1 year, 6 months ago 2.5M runs

salesforce/blip

Generate image captions

Updated 2 years, 10 months ago 167.3M runs

jaaari/kokoro-82m

Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)

Updated 6 months ago 35.6M runs

beautyyuyanli/multilingual-e5-large

multilingual-e5-large: A multi-language text embedding model

Updated 1 year, 6 months ago 23.7M runs

xinntao/gfpgan

Practical face restoration algorithm for *old photos* or *AI-generated faces*

Updated 2 years, 10 months ago 36.1M runs

851-labs/background-remover

Remove backgrounds from images.

Updated 7 months, 2 weeks ago 5.4M runs

Latest models

vetkastar/image-slicing

Split one or multiple images into four equal parts

Updated 9 months, 3 weeks ago 60 runs

zsxkib/pyramid-flow

Text-to-Video + Image-to-Video: Pyramid Flow Autoregressive Video Generation method based on Flow Matching

Updated 9 months, 3 weeks ago 8.7K runs

lucataco/diffusers-dreambooth-lora

FLUX.1-Dev LoRA Training by Huggingface Diffusers

Updated 9 months, 3 weeks ago 222 runs

visoar/headshots.fun

Fun & Pro for Every Occasion, Just Shoot at https://HeadShots.fun/

Updated 9 months, 3 weeks ago 8.3K runs

aihilums/problem_assessment

Updated 9 months, 3 weeks ago 8 runs

aodianyun/ad-pdf-extract

Updated 9 months, 3 weeks ago 227 runs

xlabs-ai/flux-dev-controlnet

XLabs v3 canny, depth and soft edge controlnets for Flux.1 Dev

Updated 9 months, 3 weeks ago 235.8K runs

aitechtree/test-world

test-world

Updated 9 months, 3 weeks ago 5 runs

aitechtree/test-hello

test-world

Updated 9 months, 3 weeks ago 13 runs

nousresearch/hermes-2-theta-llama-8b

Hermes-2 Θ (Theta) is the first experimental merged model released by Nous Research, in collaboration with Charles Goddard at Arcee, the team behind MergeKit.

Updated 9 months, 3 weeks ago 30K runs

datacte/mobius

Mobius: Redefining State-of-the-Art in Debiased Diffusion Models

Updated 9 months, 3 weeks ago 131 runs

chenxwh/lotus

Diffusion-based Visual Foundation Model for High-quality Dense Prediction

Updated 9 months, 3 weeks ago 371 runs

lucataco/flux-dev-lora

FLUX.1-Dev LoRA Explorer (DEPRECATED Please use: black-forest-labs/flux-dev-lora)

Updated 9 months, 4 weeks ago 3.8M runs

justmalhar/meta-llama-3.2-11b-vision

This is a test version, more updates coming

Updated 9 months, 4 weeks ago 249 runs

villesau/whisper-timestamped

Transcribes audio using Whisper Large V3 with precise word-level timestamps and confidence scores.

Updated 9 months, 4 weeks ago 5.6K runs

dashed/whisperx-subtitles-replicate

Generates subtitles from audio using whisperX (faster-whisper-large-v3)

Updated 9 months, 4 weeks ago 1.2K runs

jimothyjohn/phi3-vision-instruct

A soon-to-be accelerated endpoint for multi-modal inference.

Updated 10 months ago 201 runs

lucataco/flux.1-controlnet-lineart-promeai

Controlnet trained on black-forest-labs/FLUX.1-dev with lineart condition

Updated 10 months ago 353 runs

smoretalk/flamel-inpainting

Add or change what you want on your image

Updated 10 months ago 2.7K runs

coder-pranav/image_better_text

Updated 10 months ago 89 runs

chenxwh/depthcrafter

Generating Consistent Long Depth Sequences for Open-world Videos

Updated 10 months ago 205 runs

smoretalk/flamel-eraser

Erase what you don't want on your image

Updated 10 months ago 383 runs

baaivision/emu3-chat

Emu3-Chat for vision-language understanding

Updated 10 months ago 28 runs

baaivision/emu3-gen

Emu3-Gen for image generation

Updated 10 months ago 54 runs

pikachupichu25/image-faceswap

Updated 10 months ago 12.8K runs

zsxkib/molmo-7b

allenai/Molmo-7B-D-0924. Answers questions about images and captions them.

Updated 10 months, 1 week ago 125.4K runs

zsxkib/flux-music

🎼FluxMusic Text-to-Music Generation with Rectified Flow Transformer🎶

Updated 10 months, 1 week ago 8.5K runs

justmalhar/meta-llama-3.2-3b

Meta Llama 3.2 3B

Updated 10 months, 1 week ago 2.7K runs

justmalhar/meta-llama-3.2-1b

Meta Llama 3.2 1B

Updated 10 months, 1 week ago 198 runs

okaris/omni-zero-couples

Omni-Zero Couples: A diffusion pipeline for zero-shot stylized couples portrait creation.

Updated 10 months, 1 week ago 18.6K runs

aleksanderobuchowski/bielik-11b-v2.3-instruct

Bielik-11B-v2.3-Instruct is a generative text model made by SpeakLeash and Cyfronet featuring 11 billion parameters. It is a linear merge of the Bielik-11B-v2.0-Instruct, Bielik-11B-v2.1-Instruct, and Bielik-11B-v2.2-Instruct models.

Updated 10 months, 1 week ago 1.4K runs
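
The `vetkastar/image-slicing` entry earlier in this list describes splitting an image into four equal parts. A minimal sketch of that idea, assuming the parts are the quadrants of a 2x2 grid (the listing does not say how the model actually slices), is to compute four crop boxes from the image dimensions. The function name `quadrant_boxes` is hypothetical.

```python
def quadrant_boxes(width, height):
    """Return (left, top, right, bottom) boxes for a 2x2 grid over an image."""
    if width < 2 or height < 2:
        raise ValueError("image must be at least 2x2 pixels")
    mid_x, mid_y = width // 2, height // 2
    return [
        (0, 0, mid_x, mid_y),           # top-left
        (mid_x, 0, width, mid_y),       # top-right
        (0, mid_y, mid_x, height),      # bottom-left
        (mid_x, mid_y, width, height),  # bottom-right
    ]


boxes = quadrant_boxes(1024, 768)
for left, top, right, bottom in boxes:
    print(left, top, right, bottom)
```

The boxes use the (left, top, right, bottom) convention that most image libraries accept for cropping. When a dimension is odd, the right or bottom parts come out one pixel larger, so the four parts are only exactly equal for even widths and heights.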