
peter65374/openbuddy-llemma-34b-gguf:5ebb3f43

Input schema

The fields you can use to run this model with an API. If you don't give a value for a field, its default value will be used.

| Field | Type | Default | Constraints | Description |
| --- | --- | --- | --- | --- |
| `prompt` | string | None | | |
| `max_new_tokens` | integer | 1024 | min 1, max 3500 | The maximum number of tokens the model should generate as output. |
| `temperature` | number | 0.7 | | The value used to modulate the next-token probabilities. |
| `top_p` | number | 0.95 | min 0.01, max 1 | A probability threshold for generating the output. If < 1.0, only the top tokens with cumulative probability >= `top_p` are kept (nucleus filtering). Nucleus filtering is described in [Holtzman et al.](http://arxiv.org/abs/1904.09751). See the sampling sketch after this table. |
| `top_k` | integer | 40 | min 1, max 100 | The number of highest-probability tokens to consider for generating the output. If > 0, only the `top_k` tokens with the highest probability are kept (top-k filtering). |
| `do_sample` | boolean | True | | Whether or not to use sampling; greedy decoding is used otherwise. |
| `num_beams` | integer | 1 | min 1, max 10 | Number of beams for beam search. 1 means no beam search. |
| `repetition_penalty` | number | 1.1 | min 0.01, max 5 | The parameter for repetition penalty. 1.0 means no penalty; values greater than 1 discourage repetition, values less than 1 encourage it. See [this paper](https://arxiv.org/pdf/1909.05858.pdf) for more details, and the penalty sketch after this table. |
| `presence_penalty` | number | 0 | | Repeat-alpha presence penalty (0.0 = disabled). |
| `frequency_penalty` | number | 0 | | Repeat-alpha frequency penalty (0.0 = disabled). |
| `prompt_template` | string | `You are a helpful high school Math tutor. If you don't know the answer to a question, please don't share false information. You can speak fluently in many languages. User: Hi Assistant: Hello, how can I help you?</s> User: {prompt} Assistant:` | | The template used to format the prompt. The input prompt is inserted into the template using the `{prompt}` placeholder. |
| `padding_mode` | boolean | True | | Whether to pad the left side of the prompt with the EOS token. |
| `stop_sequences` | string | | | A comma-separated list of sequences at which to stop generation. For example, `<end>,<stop>` will stop generation at the first instance of `<end>` or `<stop>`. |
| `debug` | boolean | False | | Provide debugging output in the logs. |
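
Taken together, `temperature`, `top_k`, and `top_p` describe a standard filter-then-sample decoding pipeline. The sketch below is a minimal, self-contained numpy illustration of that pipeline, not this model's actual server code:

```python
import numpy as np

def filter_logits(logits, temperature=0.7, top_k=40, top_p=0.95):
    """Apply temperature, top-k, then top-p (nucleus) filtering to raw logits."""
    logits = np.asarray(logits, dtype=np.float64) / temperature

    # Top-k: keep only the k highest-scoring tokens.
    if top_k > 0:
        top_k = min(top_k, logits.size)
        kth_best = np.sort(logits)[-top_k]
        logits = np.where(logits < kth_best, -np.inf, logits)

    # Softmax over the surviving tokens (-inf becomes probability 0).
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Top-p: keep the smallest set of tokens whose cumulative
    # probability reaches top_p (Holtzman et al., 2019).
    if top_p < 1.0:
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        cutoff = np.searchsorted(cumulative, top_p) + 1  # always keep >= 1 token
        keep = order[:cutoff]
        masked = np.zeros_like(probs)
        masked[keep] = probs[keep]
        probs = masked / masked.sum()

    return probs

# Example: sample the next token id from the filtered distribution.
rng = np.random.default_rng(0)
vocab_logits = rng.normal(size=100)
probs = filter_logits(vocab_logits)
next_token = rng.choice(len(probs), p=probs)
```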
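
The three penalty fields come from two traditions: `repetition_penalty` is the multiplicative CTRL-style penalty (Keskar et al., 2019), while `presence_penalty` and `frequency_penalty` are the additive "repeat alpha" penalties as in llama.cpp. Here is a minimal sketch of how such penalties are commonly applied to logits; the exact formulas used by this model's server are an assumption:

```python
import numpy as np
from collections import Counter

def penalize_logits(logits, generated_ids,
                    repetition_penalty=1.1,
                    presence_penalty=0.0,
                    frequency_penalty=0.0):
    """Illustrative logit adjustments. Parameter names match the schema,
    but the server-side formulas are assumed, not confirmed."""
    logits = np.asarray(logits, dtype=np.float64).copy()
    counts = Counter(generated_ids)

    for token_id, count in counts.items():
        # CTRL-style repetition penalty: shrink the score of any token
        # that has already been generated (values > 1 discourage reuse).
        if logits[token_id] > 0:
            logits[token_id] /= repetition_penalty
        else:
            logits[token_id] *= repetition_penalty

        # Additive "repeat alpha" penalties: presence penalizes any reuse,
        # frequency scales with how often the token has appeared.
        logits[token_id] -= presence_penalty + frequency_penalty * count

    return logits
```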

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
```json
{
  "items": { "type": "string" },
  "title": "Output",
  "type": "array",
  "x-cog-array-display": "concatenate",
  "x-cog-array-type": "iterator"
}
```
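
Because `x-cog-array-type` is `iterator` and `x-cog-array-display` is `concatenate`, the output streams as an array of string chunks that the client joins into one completion. A minimal sketch using the Replicate Python client; the prompt and sampling values here are illustrative, not defaults:

```python
import replicate

# Run the model; streamed output arrives as an iterator of string
# chunks, matching the array/iterator output schema above.
output = replicate.run(
    "peter65374/openbuddy-llemma-34b-gguf:5ebb3f43",
    input={
        "prompt": "What is the derivative of x^2?",  # example prompt
        "max_new_tokens": 512,
        "temperature": 0.7,
        "top_p": 0.95,
        "top_k": 40,
    },
)

# Concatenate the chunks into the final completion.
print("".join(output))
```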