
tomasmcm/llama-2-7b-chat-hf:cf245016

Input schema

The fields you can use to run this model with an API. If you don't give a value for a field, its default value will be used.

prompt (string)
Text prompt to send to the model.

n (integer, default: 1)
Number of output sequences to return for the given prompt.

presence_penalty (number, default: 0, min: -5, max: 5)
Float that penalizes new tokens based on whether they appear in the generated text so far. Values > 0 encourage the model to use new tokens, while values < 0 encourage the model to repeat tokens.

frequency_penalty (number, default: 0, min: -5, max: 5)
Float that penalizes new tokens based on their frequency in the generated text so far. Values > 0 encourage the model to use new tokens, while values < 0 encourage the model to repeat tokens.

repetition_penalty (number, default: 1, min: 0.01, max: 5)
Float that penalizes new tokens based on whether they appear in the generated text so far. Values > 1 encourage the model to use new tokens, while values < 1 encourage the model to repeat tokens.

temperature (number, default: 0.8, min: 0.01, max: 5)
Float that controls the randomness of the sampling. Lower values make the model more deterministic, while higher values make the model more random. Zero means greedy sampling.

top_p (number, default: 0.95, min: 0.01, max: 1)
Float that controls the cumulative probability of the top tokens to consider. Must be in (0, 1]. Set to 1 to consider all tokens.

top_k (integer, default: -1)
Integer that controls the number of top tokens to consider. Set to -1 to consider all tokens.

stop (string)
List of strings that stop the generation when they are generated. The returned output will not contain the stop strings.

max_tokens (integer, default: 128)
Maximum number of tokens to generate per output sequence.
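As a sketch of how these fields fit together, the call below uses the Python replicate client (pip install replicate). It is illustrative rather than an official snippet from this page: the model string reuses the truncated version id from the heading, so substitute the full version hash when running it, and the prompt and sampling values are arbitrary.

import replicate

# Illustrative sketch: run this model version with a few of the input fields above.
# The version id after the colon is truncated as in the heading; use the full hash.
output = replicate.run(
    "tomasmcm/llama-2-7b-chat-hf:cf245016",
    input={
        "prompt": "Write a short haiku about llamas.",
        "temperature": 0.8,
        "top_p": 0.95,
        "top_k": -1,
        "max_tokens": 128,
    },
)
print(output)

Fields left out of the input dictionary (n, the penalties, stop) fall back to the defaults listed above.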

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{"title": "Output", "type": "string"}
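In other words, a completed prediction returns one plain string rather than a list of chunks or a file. Continuing the hedged sketch above, and assuming the client hands back the finished output rather than a stream:

import replicate

# Hedged sketch: the finished prediction's output is a single string,
# matching the {"type": "string"} schema above.
result = replicate.run(
    "tomasmcm/llama-2-7b-chat-hf:cf245016",  # truncated version id, as above
    input={"prompt": "Say hello in one sentence.", "max_tokens": 32},
)
assert isinstance(result, str)
print(result)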