Explore

Generated with davisbrown/flux-half-illustration

Fine-tune FLUX

Customize FLUX.1 [dev] with Ostris's AI Toolkit on Replicate. Train the model to recognize and generate new concepts using a small set of example images, for specific styles, characters, or objects.

I want to…

Make videos with Wan2.1

Generate videos with Wan2.1, the fastest and highest-quality open-source video generation model.

Upscale images

Upscaling models that create high-quality images from low-quality images.

Restore images

Models that improve or restore images by deblurring, colorizing, and removing noise.

Use official models

Official models are always on, maintained, and have predictable pricing.

Enhance videos

Models that enhance videos with super-resolution, sound effects, motion capture, and other useful production effects.

Detect objects

Models that detect or segment objects in images and videos.

Make 3D stuff

Models that generate 3D objects, scenes, radiance fields, textures, and multi-views.

Use FLUX fine-tunes

Browse the diverse range of fine-tunes the community has custom-trained on Replicate.

Control image generation

Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.

Latest models

Make realistic images of real people instantly (w/ ip-adapter-plus-face_sdxl_vit-h)

Updated 4K runs

PixArt Sigma 900M is a text-to-image generation model based on the PixArt Sigma architecture

Updated 2.2K runs

araby.ai oneshot video faceswap

Updated 21.1K runs

MARS5, a fully open-source (commercially usable) voice-cloning/TTS model with breakthrough prosody and realism.

Updated 667 runs

For background sound

Updated 111 runs

Audio to SRT

Updated 30 runs

Cog wrapper for Ollama llama3:70b

Updated 6.6K runs

Cog wrapper for Ollama llama3:8b

Updated 14 runs

Input a video. Ask anything about it.

Updated 3.5K runs

YOLOv10: Real-Time End-to-End Object Detection

Updated 271 runs

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Updated 426 runs

Take audio from one video and add it to a second video. Good for adding audio back to LivePortrait output.

Updated 205 runs

Change the fps of a video without changing its length or speed

Updated 104 runs

Portrait animation using a driving video source

Updated 80.7K runs

Efficient Portrait Animation with Stitching and Retargeting Control

Updated 1.2K runs

Kolors is a SOTA base image model for high-quality image generation

Updated 1.2K runs

The API automatically detects objects in an input image and returns their positional and mask information.

Updated 4.2K runs

Create music for your content

Updated 499.5K runs

Mama ママ 2.0 Shinsei Galverse Anime-themed text-to-image model

Updated 2.4K runs

InternLM2.5 has open-sourced a 7 billion parameter base model and a chat model tailored for practical scenarios.

Updated 58 runs

Create videos from illustrated input images

Updated 50.1K runs

Qwen2 57-billion-parameter language model from Alibaba Cloud, fine-tuned for chat completions

Updated 1.3K runs

Generate clay-style images based on prompts or images

Updated 468 runs

GLM-4V is a multimodal model released by Tsinghua University that is competitive with GPT-4o and establishes a new SOTA on several benchmarks, including OCR.

Updated 86.6K runs

Convert speech in audio to text w/ `tiny`, `small`, `base`, and `large-v3` models

Updated 127 runs

Extended video synthesis model that generates 128 frames

Updated 203 runs

Image generation with inpaint strength, LoRAs via custom_urls, and an enhancer.

Updated 445 runs

Depth estimation with faster inference speed, fewer parameters, and higher depth accuracy.

Updated 196.1K runs

Hermes 2 Pro is trained on an updated and cleaned version of the OpenHermes 2.5 dataset, as well as a newly introduced function-calling and JSON-mode dataset developed in-house

Updated 344 runs

Best Open-Source Model for Function Calling

Updated 33 runs

Speech to speech with any RVC v2 trained AI voice

Updated 663K runs

hello world

Updated 46 runs

Google's Gemma2 27b instruct model

Updated 12.8K runs

AuraSR: GAN-based Super-Resolution for real-world

Updated 2.4K runs

Google's Gemma2 9b instruct model

Updated 21.3K runs

Model

Updated 411 runs

A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Updated 1.1K runs

Model that generates cartoon-like characters

Updated 765 runs

Stable Diffusion 3 with Differential Diffusion inpainting (experimental)

Updated 267 runs

Fork of https://replicate.com/schananas/grounded_sam that uses OWLv2 instead of Grounding DINO

Updated 1.9K runs

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Updated 116.7K runs