Explore
![](https://tjzk.replicate.delivery/models_models_featured_image/db4434b4-7b0f-49f7-b78a-774fe9e630a7/batou.jpeg)
batouresearch/high-resolution-controlnet-tile
UPDATE: new upscaling algorithm for much improved image quality. Fermat.app open-source implementation of an efficient ControlNet 1.1 tile for high-quality upscales. Increase the creativity to encourage hallucination.
![](https://tjzk.replicate.delivery/models_models_featured_image/0411f758-80e6-4794-bd5d-d04198d891a5/image-90.png)
stability-ai/stable-diffusion-3
A text-to-image model with greatly improved performance in image quality, typography, complex prompt understanding, and resource-efficiency
![](https://tjzk.replicate.delivery/models_models_featured_image/8c0e2917-501a-41ed-aadb-65886a34dcf9/ic-light-featured.png)
zsxkib/ic-light
✍️✨ Prompts to auto-magically relight your images
![](https://tjzk.replicate.delivery/models_models_featured_image/793e32b4-913c-4036-a847-4afb38e42fc1/Snowflake_Arctic_Opengraph_120.png)
snowflake/snowflake-arctic-instruct
An efficient, intelligent, and truly open-source language model
![](https://tjzk.replicate.delivery/models_models_featured_image/3dcb020b-1fad-4101-84cf-88af9b20ac21/meta-logo.png)
meta/meta-llama-3-70b-instruct
A 70-billion-parameter language model from Meta, fine-tuned for chat completions
![](https://tjzk.replicate.delivery/models_models_featured_image/831172d8-5976-415b-b8da-8462c9368b7e/fofr_dog.jpg)
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
I want to…
Generate images
Models that generate images from text prompts
Use a language model
Models that can understand and generate text
Caption images
Models that generate text from images
Edit images
Tools for manipulating images.
Restore images
Models that improve or restore images by deblurring, colorization, and removing noise
Upscale images
Upscaling models that create high-quality images from low-quality images
Get embeddings
Models that generate embeddings from inputs
Train a language model
Language models that you can fine-tune using Replicate's training API.
Chat with images
Ask language models about images
Transcribe speech
Models that convert speech to text
Extract text from images
Optical character recognition (OCR) and text extraction
Use a face to make images
Make realistic images of people instantly
Use handy tools
Toolbelt-type models for videos and images.
Generate music
Models to generate and modify music
Generate videos
Models that create and edit videos
Generate speech
Convert text to speech
Make 3D stuff
Models that generate 3D objects, scenes, radiance fields, textures and multi-views.
Get structured data
Language models that support grammar-based decoding as well as JSON Schema constraints.
Popular models
SDXL-Lightning by ByteDance: a fast text-to-image model that makes high-quality images in 4 steps
Visual instruction tuning towards large language and vision models with GPT-4 level capabilities
A text-to-image generative AI model that creates beautiful images
Return CLIP features for the clip-vit-large-patch14 model
Practical face restoration algorithm for *old photos* or *AI-generated faces*
Proteus v0.2 shows subtle yet significant improvements over Version 0.1. It demonstrates enhanced prompt understanding that surpasses MJ6, while also approaching its stylistic capabilities.
Latest models
Kolors is a SOTA base image model for high quality image generation
UPDATE: new upscaling algorithm for much improved image quality. Fermat.app open-source implementation of an efficient ControlNet 1.1 tile for high-quality upscales. Increase the creativity to encourage hallucination.
Bilateral Reference for High-Resolution Dichotomous Image Segmentation (arXiv 2024)
MimicMotion: High-quality human motion video generation with pose-guided control
The InternLM2.5 release open-sources a 7-billion-parameter base model and a chat model tailored for practical scenarios.
Real-ESRGAN with optional face correction and adjustable upscale
Phi-3-Mini-4K-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open model trained with the Phi-3 datasets
Qwen2 57-billion-parameter language model from Alibaba Cloud, fine-tuned for chat completions
Run any ComfyUI workflow. Guide: https://github.com/fofr/cog-comfyui
GLM-4V is a multimodal model released by Tsinghua University that is competitive with GPT-4o and establishes a new SOTA on several benchmarks, including OCR.
GPU accelerated replay renderer / video data clipper for comma.ai connect's openpilot route data. SEE README.
Convert speech in audio to text with `tiny`, `small`, `base`, and `large-v3` models
Dolphin-2.9 has a variety of instruction, conversational, and coding skills. It also has initial agentic abilities and supports function calling
Image generation, Inpaint Strength, loras custom_urls and enhancer.
A powerful LLM competitive with Claude Sonnet and GPT 3.5 but fully open-source and decentralized
Depth estimation with faster inference speed, fewer parameters, and higher depth accuracy.
Hermes-2 Θ (Theta) 70B is the continuation of our experimental merged model released by Nous Research
Hermes 2 Pro is an upgraded, retrained version of Nous Hermes 2, trained on an updated and cleaned version of the OpenHermes 2.5 Dataset as well as a newly introduced Function Calling and JSON Mode dataset developed in-house
Hermes-2 Θ (Theta) is the first experimental merged model released by Nous Research
Hermes 2 Pro is an upgraded, retrained version of Nous Hermes 2, trained on an updated and cleaned version of the OpenHermes 2.5 Dataset as well as a newly introduced Function Calling and JSON Mode dataset developed in-house