Test out fast inference with ExLlama and 4-bit quantization!
Model: https://huggingface.co/TheBloke/airoboros-7B-gpt4-1.4-GPTQ
Fast inference thanks to https://github.com/turboderp/exllama
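To actually try this out, a minimal setup sketch follows. It assumes a CUDA-capable GPU and clones both the exllama repo and the GPTQ weights; the local paths and the use of exllama's bundled `example_chatbot.py` script are assumptions based on the repo's examples, and exact script names or flags may differ between exllama versions.

```shell
# Clone exllama and install its Python requirements
# (assumes a CUDA-capable GPU and a working CUDA toolchain).
git clone https://github.com/turboderp/exllama
cd exllama
pip install -r requirements.txt

# Fetch the 4-bit GPTQ weights; the target directory is an arbitrary choice.
git clone https://huggingface.co/TheBloke/airoboros-7B-gpt4-1.4-GPTQ \
    ./models/airoboros-7B-gpt4-1.4-GPTQ

# Point one of exllama's example scripts at the model directory.
python example_chatbot.py -d ./models/airoboros-7B-gpt4-1.4-GPTQ
```

The `-d` flag points exllama at the directory holding the model's `config.json`, tokenizer, and quantized `.safetensors` weights.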