Explore

Generated with davisbrown/flux-half-illustration

Fine-tune FLUX

Customize FLUX.1 [dev] with Ostris's AI Toolkit on Replicate. Train the model to recognize and generate new concepts using a small set of example images, for specific styles, characters, or objects.

I want to…

Make videos with Wan2.1

Generate videos with Wan2.1, the fastest and highest-quality open-source video generation model.

Upscale images

Upscaling models that create high-quality images from low-quality images

Restore images

Models that improve or restore images by deblurring, colorization, and removing noise

Use official models

Official models are always on, maintained, and have predictable pricing.

Enhance videos

Models that enhance videos with super-resolution, sound effects, motion capture and other useful production effects.

Detect objects

Models that detect or segment objects in images and videos.

Make 3D stuff

Models that generate 3D objects, scenes, radiance fields, textures and multi-views.

Use FLUX fine-tunes

Browse the diverse range of fine-tunes the community has custom-trained on Replicate

Control image generation

Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.

Latest models

Generate a model with a garment faster if you have a mask image

Updated 588 runs

Only makes segmentations for further processing

Updated 195 runs

Find out how similar Japanese sentences are

Updated 12 runs

Fast text-to-3D Gaussian generation by bridging 2D and 3D diffusion models

Updated 247 runs

Yuan2.0 is a new-generation LLM developed by IEIT System, with enhanced understanding of semantics, mathematics, reasoning, code, knowledge, and other aspects.

Updated 30 runs

Trajectory Consistency Distillation

Updated 569 runs

Removes silence from your audio

Updated 75 runs

A diffusion-based method to enhance visual consistency for I2V generation

Updated 3.2K runs

Rethinking Inductive Biases for Surface Normal Estimation

Updated 72 runs

Updated 1.1K runs

AI Music Structure Analyzer + Stem Splitter using Demucs & Mdx-Net with Python-Audio-Separator

Updated 21K runs

Experimental & for non-commercial use only

Updated 6.6K runs

High-quality multilingual text-to-speech library

Updated 1.4K runs

DUSt3R: Geometric 3D Vision Made Easy

Updated 428 runs

Sentiment Analysis with Texts

Updated 4.9K runs

A wrapper around bel-tts

Updated 1.4K runs

Turn a face into a sticker

Updated 1.4M runs

Updated 253 runs

Surya is a document OCR toolkit that does:

Updated 5.7K runs

Generates 3D assets from images

Updated 2.9K runs

SDXL lightning multi-controlnet, img2img & inpainting

Updated 9.3K runs

dreamshaper-xl-lightning is a Stable Diffusion model that has been fine-tuned on SDXL

Updated 117.2K runs

ProteusV0.4: The Style Update

Updated 110.8K runs

Updated 190 runs

Lightweight multimodal model for visual question answering, reasoning and captioning

Updated 7.8K runs

Updated 192K runs

Simple video chroma keying

Updated 48 runs

Multilingual E5-small language embedding model

Updated 51 runs

Multilingual E5-large language embedding model

Updated 23 runs

Multilingual E5-large language embedding model

Updated 538 runs

Tea Segmentation Demo

Updated 27 runs

Function calling LLM that surpasses the state-of-the-art in function calling capabilities

Updated 65 runs

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

Updated 117 runs

Updated 82 runs

AnimateDiff video to video

Updated 626 runs

Segments an audio recording based on who is speaking

Updated 2.8K runs

Updated 5.1K runs

Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild. This is the SUPIR-v0F model and does NOT use LLaVA-13b.

Updated 14K runs

Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild. This is the SUPIR-v0Q model and does NOT use LLaVA-13b.

Updated 75.6K runs

Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild. This version uses LLaVA-13b for captioning.

Updated 186.6K runs

POC implementation of Depth-anything to produce a 3D SBS video

Updated 197 runs

E5-mistral-7b-instruct language embedding model

Updated 634 runs

Merge two images together with a prompt

Updated 6.1K runs

Honeycomb NLQ Generator

Updated 181 runs

ProteusV0.4: The Style Update - enhances stylistic capabilities, similar to Midjourney's approach, rather than advancing prompt comprehension

Updated 131.5K runs

hello-world from cog example

Updated 34 runs

A collection of anime Stable Diffusion models with VAEs and LoRAs.

Updated 3.7K runs

Get the width, height, and duration in seconds from a video

Updated 220 runs

7B base version of Google’s Gemma model

Updated 7.5K runs

2B base version of Google’s Gemma model

Updated 2.4K runs