Explore

I want to…

Restore images

Models that improve or restore images by deblurring, colorizing, and removing noise

Upscale images

Upscaling models that create high-quality images from low-quality images

Make 3D stuff

Models that generate 3D objects, scenes, radiance fields, textures and multi-views.

Latest models

Yi-VL-34B is the first open-source 34B VL model worldwide. It demonstrates exceptional performance, ranking first among all existing open-source models in the latest benchmarks including MMMU and CMMMU.

Updated 292 runs

🖼️ Super fast 1.5B Image Captioning/VQA Multimodal LLM (Image-to-Text) 🖋️

Updated 2K runs

High-Quality Image Restoration Following Human Instructions

Updated 9.2K runs


Generates speech from text

Updated 121.4K runs

The Segment Anything Model (SAM) is a powerful and versatile image segmentation model. It leverages a "foundation model" approach, meaning it can be used for various segmentation tasks without needing to be specifically trained for each one.

Updated 279 runs

Source: pipizhao/Pandalyst_13B_V1.0 ✦ Quant: TheBloke/Pandalyst_13B_V1.0-AWQ ✦ Pandalyst: A large language model for mastering data analysis using pandas

Updated 14 runs

A better alternative to SDXL refiners, providing a lot of quality and detail. Can also be used for inpainting or upscaling.

Updated 894K runs

Last update: Now supports img2img. SDXL Canny ControlNet with LoRA support.

Updated 663K runs

VideoCrafter2: Text-to-Video and Image-to-Video Generation and Editing

Updated 30K runs

DiffusionLight: Light Probes by Painting a Chrome Ball

Updated 747 runs

Phi-2 by Microsoft

Updated 2.8K runs

A 70 billion parameter Llama tuned for coding and conversation

Updated 21.7K runs

Generate panoramic images with text prompts

Updated 117 runs

Locality-enhanced Projector for Multimodal LLM

Updated 24 runs


A 70 billion parameter Llama tuned for coding with Python

Updated 1.1K runs

A family of multimodal small language models

Updated 66 runs

Undi95's FlatDolphinMaid 8x7B Mixtral Merge, GGUF Q5_K_M quantized by TheBloke.

Updated 412.8K runs

Generate Arab Maqam Melodic Improvisations (Taqasim)

Updated 20 runs


InstantID: Zero-shot Identity-Preserving Generation in Seconds with ⚡️LCM-LoRA⚡️. Using AlbedoBase-XL v2.0 as the base model.

Updated 78.1K runs

Take an image and an audio file and create a video clip

Updated 1.5K runs

Fork of cagliostrolab/animagine-xl-3, an anime-style Stable Diffusion XL model

Updated 5.5K runs

amrul-hzz's fine-tuned version of vit-base-patch16-224-in21k for watermark image detection

Updated 231 runs

Proteus v0.2 Model (Text2Img, Img2Img and Inpainting)

Updated 13.5K runs

Many models: RealVisXL, Juggernaut, Proteus, DreamShaper, etc.

Updated 10.5K runs

I fed the beast my oil paintings, made in the south of France. (version ec0d4305 is my fav)

Updated 3.9K runs

Runs Mixtral 8x7B on a single A40 GPU

Updated 55 runs

Remix music into other styles with MusicGen Chord

Updated 8.3K runs

(Research only) Moondream1 is a vision language model that performs on par with models twice its size

Updated 10.4K runs

Proteus v0.2 shows subtle yet significant improvements over Version 0.1. It demonstrates enhanced prompt understanding that surpasses MJ6, while also approaching its stylistic capabilities.

Updated 7.6M runs

Tiny vision language model

Updated 291 runs

Undi95's Borealis 10.7B Mistral DPO Finetune, GGUF Q5_K_M quantized by Undi95.

Updated 73 runs

InstantID: Zero-shot Identity-Preserving Generation in Seconds. Using Juggernaut-XL v8 as the base model to encourage photorealism

Updated 27K runs

InstantID: Zero-shot Identity-Preserving Generation in Seconds. Using Dreamshaper-XL as the base model to encourage artistic generations

Updated 1.8K runs

Generate song ideas!

Updated 578 runs

Highly practical solution for robust monocular depth estimation by training on a combination of 1.5M labeled images and 62M+ unlabeled images

Updated 4.4K runs

AI powered speech denoising and enhancement

Updated 117 runs

SigLIP proposes replacing the loss function used in CLIP with a simple pairwise sigmoid loss

Updated 136 runs

WizardCoder: Empowering Code Large Language Models with Evol-Instruct

Updated 16.6K runs

Consistent Diffusion Features for Consistent Video Editing

Updated 2K runs

Nebul.Redmond - Stable Diffusion SD XL Finetuned Model

Updated 15.7K runs

Create photos, paintings and avatars for anyone in any style within seconds. (Stylization version)

Updated 746.1K runs

Video Smoother: AMT All-Pairs Multi-Field Transforms for Efficient Frame Interpolation

Updated 15.8K runs

MythoMax-L2-13B-GPTQ from TheBloke

Updated 156 runs


NeuralBeagle14-7B is (probably) the best 7B model you can find!

Updated 12.2K runs
