adirik / mamba-2.8b-chat

Mamba 2.8B state-space language model fine-tuned for chat

Input

message (string, required)
The message to generate a response for.

message_history (string)
The message history to condition the response on, passed as a JSON string.
Default: "[]"

temperature (number, minimum: 0.1, maximum: 5)
Adjusts the randomness of outputs; higher values are more random, lower values more deterministic. 0.75 is a good starting value.
Default: 0.9

top_p (number, minimum: 0.01, maximum: 1)
When decoding text, samples from the smallest set of most likely tokens whose cumulative probability reaches top_p; lower it to ignore less likely tokens.
Default: 0.7

top_k (integer)
When decoding text, samples from the top k most likely tokens; lower it to ignore less likely tokens.
Default: 1

repetition_penalty (number, minimum: 0.01, maximum: 10)
Penalty for repeated words in generated text; 1 is no penalty, values greater than 1 discourage repetition, and values less than 1 encourage it.
Default: 1

seed (integer)
The seed for the random number generator. Leave blank for a random seed.
Output

I do not have access to specific information about large language models. However, here are some general tips on deployment best practices:

1. Use a cloud-based deployment platform: cloud-based deployment platforms like Azure, AWS, and Google Cloud provide a scalable and reliable environment for deploying large language models.
2. Use a managed service: a managed service provides a pre-built and pre-configured environment for deploying large language models. This can save time and resources for deploying and maintaining the model.
3. Use a reliable and secure network: ensure that the network is secure and reliable so that the model is not compromised.
4. Use a monitoring solution: monitor the model to ensure that it is functioning properly and that there are no issues with the deployment.
5. Use a data protection solution: ensure that the model is protected from unauthorized access and data breaches.
6. Use a data science platform: use a data science platform to manage the deployment and management of the model. This can help with scaling and automating the deployment process.
7. Use a data science tool: use a data science tool to automate the deployment and management of the model. This can help with reducing the time and effort required for deployment.

Overall, deploying a large language model can be a complex process, but with the right tools and best practices, it can be manageable.

Run time and cost

This model costs approximately $0.028 to run on Replicate, or 35 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 29 seconds. The predict time for this model varies significantly based on the inputs.

Readme

Mamba-Chat

Mamba-Chat is the first chat language model based on Mamba, a language model built on a state-space model architecture rather than a transformer. See the original repo and paper for more details.

Basic Usage

The API input arguments are as follows; an example call and a sketch of the decoding math follow the list.
- message: The input message to the chatbot.
- message_history: The chat history to condition the chatbot on, passed as a JSON string.
- temperature: Adjusts the randomness of outputs; higher values are more random, lower values more deterministic. 0.75 is a good starting value.
- top_p: When decoding text, samples from the smallest set of most likely tokens whose cumulative probability reaches top_p; lower it to ignore less likely tokens.
- top_k: When decoding text, samples from the top k most likely tokens; lower it to ignore less likely tokens.
- repetition_penalty: Penalty for repeated words in generated text; 1 is no penalty, values greater than 1 discourage repetition, and values less than 1 encourage it.
- seed: The seed for deterministic text generation. Set a specific seed to reproduce results, or leave it blank for random generation.
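
Below is a minimal sketch of calling the model with the Replicate Python client. The shape of message_history is not documented on this page, so the role/content structure used here is an assumption; the model identifier may also need a pinned :version suffix.

import json
import replicate

# Hypothetical chat history; the exact schema expected by
# message_history is an assumption -- check the model's repo.
history = [
    {"role": "user", "content": "What is a state-space model?"},
    {"role": "assistant", "content": "A sequence model that evolves a hidden state over time."},
]

output = replicate.run(
    "adirik/mamba-2.8b-chat",  # may require a ":<version>" suffix
    input={
        "message": "How does Mamba differ from a transformer?",
        "message_history": json.dumps(history),  # passed as a JSON string
        "temperature": 0.75,
        "top_p": 0.7,
        "top_k": 50,
        "repetition_penalty": 1.2,
        "seed": 42,  # optional; omit for a random seed
    },
)

# The client may return a plain string or an iterator of chunks;
# joining handles both cases.
print("".join(output))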

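To make the decoding parameters concrete, here is a sketch of how temperature, top_k, top_p, and a repetition penalty are conventionally applied to next-token logits. It illustrates the standard technique, not this model's exact implementation.

import numpy as np

def sample_next_token(logits, generated_ids, temperature=0.75,
                      top_k=50, top_p=0.7, repetition_penalty=1.0,
                      seed=None):
    """One decoding step: penalize repeats, rescale, truncate, sample."""
    rng = np.random.default_rng(seed)
    logits = np.asarray(logits, dtype=np.float64).copy()

    # Repetition penalty: dampen tokens that already appeared.
    for t in set(generated_ids):
        if logits[t] > 0:
            logits[t] /= repetition_penalty
        else:
            logits[t] *= repetition_penalty

    # Temperature: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    logits /= temperature

    # Top-k: discard everything outside the k most likely tokens.
    if top_k > 0:
        kth_best = np.sort(logits)[-top_k]
        logits[logits < kth_best] = -np.inf

    # Softmax over the surviving logits.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Top-p (nucleus): keep the smallest set of tokens whose
    # cumulative probability reaches top_p, then renormalize.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    keep = cumulative - probs[order] < top_p  # always keeps the best token
    probs[order[~keep]] = 0.0
    probs /= probs.sum()

    return int(rng.choice(len(probs), p=probs))

Note that with this model's default of top_k = 1, only the single most likely token survives the top-k filter, so decoding in a pipeline like this one is effectively greedy; raise top_k (and top_p) to get more varied samples.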
References

@article{mamba,
  title={Mamba: Linear-Time Sequence Modeling with Selective State Spaces},
  author={Gu, Albert and Dao, Tri},
  journal={arXiv preprint arXiv:2312.00752},
  year={2023}
}