You're looking at a specific version of this model.
Input
Run this model in Node.js with one line of code:
npm install replicate
Then set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
import Replicate from "replicate";
const replicate = new Replicate({
auth: process.env.REPLICATE_API_TOKEN,
});
Run jichengdu/llasa using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
const output = await replicate.run(
"jichengdu/llasa:e159ffbd476eaad8ddc3d05b73074a618a32a0aa4efb2e652aba0268ef506f37",
{
input: {
text: "为所有的猫猫奋斗终身!",
voice_sample: "https://replicate.delivery/pbxt/MiFpnTHt7iIQ8LELP7yEKUvk1yO3HZwz9NquUVpOQ7SNPa74/zero_shot_prompt.wav"
}
}
);
// To access the file URL:
console.log(output.url()); //=> URL of the generated WAV file
// To write the file to disk (add `import { writeFile } from "node:fs/promises"` at the top):
await writeFile("output.wav", output);
To learn more, take a look at the guide on getting started with Node.js.
Install Replicate's Python client library:
pip install replicate
Then set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
import replicate
Run jichengdu/llasa using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
output = replicate.run(
"jichengdu/llasa:e159ffbd476eaad8ddc3d05b73074a618a32a0aa4efb2e652aba0268ef506f37",
input={
"text": "为所有的猫猫奋斗终身!",
"voice_sample": "https://replicate.delivery/pbxt/MiFpnTHt7iIQ8LELP7yEKUvk1yO3HZwz9NquUVpOQ7SNPa74/zero_shot_prompt.wav"
}
)
print(output)
To learn more, take a look at the guide on getting started with Python.
Set the REPLICATE_API_TOKEN environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Find your API token in your account settings.
Run jichengdu/llasa using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.
curl -s -X POST \
-H "Authorization: Bearer $REPLICATE_API_TOKEN" \
-H "Content-Type: application/json" \
-H "Prefer: wait" \
-d $'{
"version": "e159ffbd476eaad8ddc3d05b73074a618a32a0aa4efb2e652aba0268ef506f37",
"input": {
"text": "为所有的猫猫奋斗终身!",
"voice_sample": "https://replicate.delivery/pbxt/MiFpnTHt7iIQ8LELP7yEKUvk1yO3HZwz9NquUVpOQ7SNPa74/zero_shot_prompt.wav"
}
}' \
https://api.replicate.com/v1/predictions
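If you prefer to assemble the request yourself rather than use curl, the same call can be sketched in Python. This is a minimal sketch: `build_prediction_request` is a helper name of my own (not part of any Replicate SDK), and `r8_example_token` is a placeholder token. It only builds the URL, headers, and JSON body; pass them to any HTTP client to actually send the request.

```python
import json

API_URL = "https://api.replicate.com/v1/predictions"

def build_prediction_request(token, version, model_input):
    # Mirror the headers from the curl command above.
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Prefer": "wait",  # ask the API to hold the connection until the prediction finishes
    }
    body = json.dumps({"version": version, "input": model_input})
    return API_URL, headers, body

url, headers, body = build_prediction_request(
    "r8_example_token",  # placeholder; use your real REPLICATE_API_TOKEN
    "e159ffbd476eaad8ddc3d05b73074a618a32a0aa4efb2e652aba0268ef506f37",
    {
        "text": "为所有的猫猫奋斗终身!",
        "voice_sample": "https://replicate.delivery/pbxt/MiFpnTHt7iIQ8LELP7yEKUvk1yO3HZwz9NquUVpOQ7SNPa74/zero_shot_prompt.wav",
    },
)
# `url`, `headers`, and `body` can now be passed to urllib.request, requests, etc.
```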
To learn more, take a look at Replicate’s HTTP API reference docs.
brew install cog
If you don’t have Homebrew, there are other installation options available.
Run this to download the model and run it in your local environment:
cog predict r8.im/jichengdu/llasa@sha256:e159ffbd476eaad8ddc3d05b73074a618a32a0aa4efb2e652aba0268ef506f37 \
-i 'text="为所有的猫猫奋斗终身!"' \
-i 'voice_sample="https://replicate.delivery/pbxt/MiFpnTHt7iIQ8LELP7yEKUvk1yO3HZwz9NquUVpOQ7SNPa74/zero_shot_prompt.wav"'
To learn more, take a look at the Cog documentation.
Run this to download the model and run it in your local environment:
docker run -d -p 5000:5000 --gpus=all r8.im/jichengdu/llasa@sha256:e159ffbd476eaad8ddc3d05b73074a618a32a0aa4efb2e652aba0268ef506f37
curl -s -X POST \
  -H "Content-Type: application/json" \
  -d $'{
    "input": {
      "text": "为所有的猫猫奋斗终身!",
      "voice_sample": "https://replicate.delivery/pbxt/MiFpnTHt7iIQ8LELP7yEKUvk1yO3HZwz9NquUVpOQ7SNPa74/zero_shot_prompt.wav"
    }
  }' \
  http://localhost:5000/predictions
To learn more, take a look at the Cog documentation.
Output
{
"completed_at": "2025-03-24T13:44:11.898278Z",
"created_at": "2025-03-24T13:40:26.986000Z",
"data_removed": false,
"error": null,
"id": "1kqy6vrrx9rme0cns1h8bttbnc",
"input": {
"text": "为所有的猫猫奋斗终身!",
"voice_sample": "https://replicate.delivery/pbxt/MiFpnTHt7iIQ8LELP7yEKUvk1yO3HZwz9NquUVpOQ7SNPa74/zero_shot_prompt.wav"
},
"logs": "/root/.pyenv/versions/3.10.15/lib/python3.10/site-packages/transformers/models/whisper/generation_whisper.py:496: FutureWarning: The input name `inputs` is deprecated. Please make sure to use `input_features` instead.\nwarnings.warn(\nDue to a bug fix in https://github.com/huggingface/transformers/pull/28687 transcription using a multilingual Whisper will default to language detection followed by transcription instead of translation to English.This might be a breaking change for your use case. If you want to instead always translate your audio to English, make sure to pass `language='en'`.\nPassing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.43.0. You should pass an instance of `EncoderDecoderCache` instead, e.g. `past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values)`.\nWhisper transcription: 希望你以后能够做得比我还好哟\nPrompt Vq Code Shape: torch.Size([1, 1, 175])\nThe attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.\nSetting `pad_token_id` to `eos_token_id`:None for open-end generation.\nThe attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.\nStarting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)",
"metrics": {
"predict_time": 6.727991584,
"total_time": 224.912278
},
"output": "https://replicate.delivery/xezq/GaVxDJMlWdqPFdeERPPGEZL7uz4HlfwEd2XGdbt6NGZrOxbUA/output.wav",
"started_at": "2025-03-24T13:44:05.170286Z",
"status": "succeeded",
"urls": {
"stream": "https://stream.replicate.com/v1/files/bcwr-rplxii7pg5ffqfbzl4re7lwa2ms2ifqektavidmhyjdevaqyshaq",
"get": "https://api.replicate.com/v1/predictions/1kqy6vrrx9rme0cns1h8bttbnc",
"cancel": "https://api.replicate.com/v1/predictions/1kqy6vrrx9rme0cns1h8bttbnc/cancel"
},
"version": "e159ffbd476eaad8ddc3d05b73074a618a32a0aa4efb2e652aba0268ef506f37"
}
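The prediction object above carries enough timing information to separate queue/setup time from model runtime. This sketch parses the timestamps from the example response (the field names are real; the JSON below is truncated to only the fields used) and checks them against the reported metrics.

```python
import json
from datetime import datetime

# Fields copied from the example prediction response above.
prediction = json.loads("""
{
  "created_at": "2025-03-24T13:40:26.986000Z",
  "started_at": "2025-03-24T13:44:05.170286Z",
  "completed_at": "2025-03-24T13:44:11.898278Z",
  "status": "succeeded",
  "output": "https://replicate.delivery/xezq/GaVxDJMlWdqPFdeERPPGEZL7uz4HlfwEd2XGdbt6NGZrOxbUA/output.wav",
  "metrics": {"predict_time": 6.727991584, "total_time": 224.912278}
}
""")

def parse_ts(ts):
    # datetime.fromisoformat() only accepts a trailing "Z" on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

# Time spent waiting for the model to start (cold boot + queue).
queue_seconds = (parse_ts(prediction["started_at"])
                 - parse_ts(prediction["created_at"])).total_seconds()
# Time the model actually spent running; should match metrics.predict_time.
run_seconds = (parse_ts(prediction["completed_at"])
               - parse_ts(prediction["started_at"])).total_seconds()
```

Here `run_seconds` comes out to about 6.73 s, matching `metrics.predict_time`, while the remaining ~218 s of `total_time` was spent before the prediction started.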