edoproch/deepseekr1-distilled-llama-8b-ollama | Run with an API on Replicate

DeepSeek-R1 distilled on LLaMA 8B

Public
564 runs

Run time and cost

This model costs approximately $0.0055 to run on Replicate, or 181 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 6 seconds.
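The cost figures above can be turned into a quick budget estimate. The sketch below is only arithmetic on the approximate per-run price quoted on this page; the function names are hypothetical, and actual billing varies with your inputs.

```python
import math

# Approximate per-run price quoted on this page (varies with inputs).
COST_PER_RUN_USD = 0.0055


def runs_per_dollar(cost_per_run: float) -> int:
    """Whole number of runs that fit into one dollar."""
    return math.floor(1 / cost_per_run)


def estimated_cost(num_runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Rough total cost in USD for a batch of runs."""
    return round(num_runs * cost_per_run, 4)


if __name__ == "__main__":
    print(runs_per_dollar(COST_PER_RUN_USD))  # matches the "181 runs per $1" figure
    print(estimated_cost(1000))               # rough cost of 1000 runs
```

Treat the result as a ballpark: longer prompts or outputs take longer to run and cost more.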

Readme

🚀 Meet DeepSeek-R1 distilled on LLaMA 8B! Unlike other similar models on Replicate, this one has its weights cached, so you don’t have to waste time downloading them every time. ⏳💨

But wait, there’s more! 🎉 It’s also quantized, which shrinks the memory footprint and speeds up inference with only a small loss in output quality. Lighter, faster, and optimized just for you! ⚡🔥
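To get a feel for what quantization buys, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone at different precisions. It assumes a nominal 8 billion parameters; the bit widths are illustrative, since this page does not say which quantization level the model uses.

```python
# Assumption: nominal parameter count for an "8B" model.
NOMINAL_PARAMS = 8e9


def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), ignoring overhead."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # Illustrative precisions only; the actual quantization level is not stated here.
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: ~{weight_memory_gb(NOMINAL_PARAMS, bits):.1f} GB")
```

This ignores activations, the KV cache, and any per-block quantization metadata, so real memory use will be somewhat higher.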