andreasjansson / llama-2-13b-gguf

Llama-2 13B with support for grammars and jsonschema

  • Public
  • 752 runs
  • L40S

Input

*string

Prompt

*string

Grammar in GBNF format

integer

Max number of tokens to return

Default: 500
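
The inputs listed above (a prompt, a grammar in GBNF format, and a token limit defaulting to 500) could be assembled as in the sketch below. The page does not show the actual input field names, so the keys `prompt`, `grammar`, and `max_tokens` are assumptions, as are the helper `build_input` and the small yes/no grammar, which are purely illustrative.

```python
import json

# Default token limit as stated on the page.
DEFAULT_MAX_TOKENS = 500

# Illustrative GBNF grammar (not from the page): it only admits
# {"answer": "yes"} or {"answer": "no"}, with optional whitespace.
YES_NO_GRAMMAR = r'''
root ::= "{" ws "\"answer\"" ws ":" ws answer ws "}"
answer ::= "\"yes\"" | "\"no\""
ws ::= [ \t\n]*
'''.strip()


def build_input(prompt, grammar, max_tokens=DEFAULT_MAX_TOKENS):
    """Build an input dict for the model.

    The key names are hypothetical; check the model's API schema
    for the real ones before sending a request.
    """
    if not prompt:
        raise ValueError("prompt must be a non-empty string")
    if max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    return {"prompt": prompt, "grammar": grammar, "max_tokens": max_tokens}


payload = build_input("Is Stockholm in Sweden? Answer in JSON.", YES_NO_GRAMMAR)
print(json.dumps(payload, indent=2))
```

If the key names match the model's schema, a dict like this is what would be passed as the `input` argument of the Replicate Python client's `replicate.run` call.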

Output

{"bcc":[],"body":"Hello, today it's rainy and 14 degrees.","cc":[],"from":"andreas86@telia.se","subject":"Today weather in Stockholm","to":["myfriend@telia.se"]}

This output was created using a different version of the model, andreasjansson/llama-2-13b-gguf:19eb0a04.
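
The sample output above is a single JSON object, so a caller can parse it and check its shape before using it. The sketch below does that for the example shown; the expected field types are inferred from that one sample rather than from a published schema, and `parse_email` is a hypothetical helper.

```python
import json

# The sample output from this page, copied verbatim.
SAMPLE_OUTPUT = (
    '{"bcc":[],"body":"Hello, today it\'s rainy and 14 degrees.","cc":[],'
    '"from":"andreas86@telia.se","subject":"Today weather in Stockholm",'
    '"to":["myfriend@telia.se"]}'
)

# Field types inferred from the sample only (an assumption).
LIST_FIELDS = ("to", "cc", "bcc")
STRING_FIELDS = ("from", "subject", "body")


def parse_email(raw):
    """Parse the model output and verify the expected fields and types."""
    data = json.loads(raw)
    for name in LIST_FIELDS:
        if not isinstance(data.get(name), list):
            raise ValueError(f"field {name!r} should be a list")
    for name in STRING_FIELDS:
        if not isinstance(data.get(name), str):
            raise ValueError(f"field {name!r} should be a string")
    return data


email = parse_email(SAMPLE_OUTPUT)
print(email["subject"], "->", ", ".join(email["to"]))
```

A check like this is still worthwhile when a grammar constrains generation, since output cut off by the token limit may not be complete JSON.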

Run time and cost

This model costs approximately $0.015 to run on Replicate, or 66 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
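
The "66 runs per $1" figure follows from dividing the budget by the approximate per-run cost and rounding down. The sketch below reproduces that arithmetic; the function names are illustrative, and the $0.015 figure is the page's approximation, so real costs will differ with the inputs.

```python
from decimal import Decimal

# Approximate per-run cost in USD, as stated on the page.
COST_PER_RUN = Decimal("0.015")


def runs_for_budget(budget_usd, cost_per_run=COST_PER_RUN):
    """Number of whole runs that fit in a budget (rounded down)."""
    return int(Decimal(str(budget_usd)) // cost_per_run)


def cost_for_runs(runs, cost_per_run=COST_PER_RUN):
    """Approximate total cost in USD for a number of runs."""
    return cost_per_run * runs


print(runs_for_budget(1))   # 1 / 0.015 = 66.67, rounded down to 66
print(cost_for_runs(100))
```

Decimal arithmetic is used here so that currency amounts do not pick up binary floating-point rounding.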

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 16 seconds. The predict time for this model varies significantly based on the inputs.

Readme

This model doesn't have a readme.