cjwbw

seamless_communication

Cold

SeamlessM4T—Massively Multilingual & Multimodal Machine Translation

Public

85K runs

Run with an API

Playground API Examples README Versions

Input

Video Player is loading.

Current Time 00:00:000

Duration 00:00:000

Loaded: 0%

Stream Type LIVE

Remaining Time 00:00:000

task_name

string

Choose a task

Default: "S2ST (Speech to Speech translation)"

input_audio

file

Provide input file for tasks with speech input: S2ST, S2TT and ASR.

input_text

string

Shift + Return to add a new line

Provide input for tasks with text: T2ST and T2TT.

input_text_language

string

Specify language of the input_text for T2ST and T2TT

Default: "None"

target_language_with_speech

string

Set target language for tasks with speech output: S2ST or T2ST. Less languages are available for speech compared to text output.

Default: "French"

target_language_text_only

string

Set target language for tasks with text output only: S2TT, T2TT and ASR.

Default: "Norwegian Nynorsk"

max_input_audio_length

number

Set maximum input audio length.

Default: 60

Run this model in Node.js with one line of code:

npx create-replicate --model=cjwbw/seamless_communication

or set up a project from scratch

Install Replicate’s Node.js client library:

npm install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import and set up the client:

import Replicate from "replicate";

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

Run cjwbw/seamless_communication using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

const output = await replicate.run(
  "cjwbw/seamless_communication:668a4fec05a887143e5fe8d45df25ec4c794dd43169b9a11562309b2d45873b0",
  {
    input: {
      task_name: "S2ST (Speech to Speech translation)",
      input_audio: "https://replicate.delivery/pbxt/JWSAJpKxUszI0scNYatExIXZX2rJ78UBilGXCTq4Ct9BDwTA/sample_input_2.mp3",
      input_text_language: "None",
      max_input_audio_length: 60,
      target_language_text_only: "Norwegian Nynorsk",
      target_language_with_speech: "French"
    }
  }
);

console.log(output);

To learn more, take a look at the guide on getting started with Node.js.

Install Replicate’s Python client library:

pip install replicate

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Import the client:

import replicate

Run cjwbw/seamless_communication using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

output = replicate.run(
    "cjwbw/seamless_communication:668a4fec05a887143e5fe8d45df25ec4c794dd43169b9a11562309b2d45873b0",
    input={
        "task_name": "S2ST (Speech to Speech translation)",
        "input_audio": "https://replicate.delivery/pbxt/JWSAJpKxUszI0scNYatExIXZX2rJ78UBilGXCTq4Ct9BDwTA/sample_input_2.mp3",
        "input_text_language": "None",
        "max_input_audio_length": 60,
        "target_language_text_only": "Norwegian Nynorsk",
        "target_language_with_speech": "French"
    }
)

print(output)

To learn more, take a look at the guide on getting started with Python.

Set the REPLICATE_API_TOKEN environment variable:

export REPLICATE_API_TOKEN=<paste-your-token-here>

Find your API token in your account settings.

Run cjwbw/seamless_communication using Replicate’s API. Check out the model's schema for an overview of inputs and outputs.

curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Prefer: wait" \
  -d $'{
    "version": "cjwbw/seamless_communication:668a4fec05a887143e5fe8d45df25ec4c794dd43169b9a11562309b2d45873b0",
    "input": {
      "task_name": "S2ST (Speech to Speech translation)",
      "input_audio": "https://replicate.delivery/pbxt/JWSAJpKxUszI0scNYatExIXZX2rJ78UBilGXCTq4Ct9BDwTA/sample_input_2.mp3",
      "input_text_language": "None",
      "max_input_audio_length": 60,
      "target_language_text_only": "Norwegian Nynorsk",
      "target_language_with_speech": "French"
    }
  }' \
  https://api.replicate.com/v1/predictions

To learn more, take a look at Replicate’s HTTP API reference docs.

Output

text_output

Le modèle M4T sans faille de MetaAI démocratise la communication parlée à travers les barrières linguistiques.

audio_output

Video Player is loading.

Current Time 00:00:000

Duration 00:00:000

Loaded: 0%

Stream Type LIVE

Remaining Time 00:00:000

Generated in

5.1 seconds

Tweak it Report View full prediction

Run time and cost

This model costs approximately $0.069 to run on Replicate, or 14 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 71 seconds. The predict time for this model varies significantly based on the inputs.

Readme

SeamlessM4T

SeamlessM4T is designed to provide high quality translation, allowing people from different linguistic communities to communicate effortlessly through speech and text.

SeamlessM4T covers:
- 📥 101 languages for speech input.
- ⌨️ 96 Languages for text input/output.
- 🗣️ 35 languages for speech output.

This unified model enables multiple tasks without relying on multiple separate models:
- Speech-to-speech translation (S2ST)
- Speech-to-text translation (S2TT)
- Text-to-speech translation (T2ST)
- Text-to-text translation (T2TT)
- Automatic speech recognition (ASR)

Citation

If you use SeamlessM4T in your work or any models/datasets/artifacts published in SeamlessM4T, please cite :

@article{seamlessm4t2023,
  title={SeamlessM4T—Massively Multilingual & Multimodal Machine Translation},
  author={{Seamless Communication}, Lo\"{i}c Barrault, Yu-An Chung, Mariano Cora Meglioli, David Dale, Ning Dong, Paul-Ambroise Duquenne, Hady Elsahar, Hongyu Gong, Kevin Heffernan, John Hoffman, Christopher Klaiber, Pengwei Li, Daniel Licht, Jean Maillard, Alice Rakotoarison, Kaushik Ram Sadagopan, Guillaume Wenzek, Ethan Ye,  Bapi Akula, Peng-Jen Chen, Naji El Hachem, Brian Ellis, Gabriel Mejia Gonzalez, Justin Haaheim, Prangthip Hansanti, Russ Howes, Bernie Huang, Min-Jae Hwang, Hirofumi Inaguma, Somya Jain, Elahe Kalbassi, Amanda Kallet, Ilia Kulikov, Janice Lam, Daniel Li, Xutai Ma, Ruslan Mavlyutov, Benjamin Peloquin, Mohamed Ramadan, Abinesh Ramakrishnan, Anna Sun, Kevin Tran, Tuan Tran, Igor Tufanov, Vish Vogeti, Carleigh Wood, Yilin Yang, Bokai Yu, Pierre Andrews, Can Balioglu, Marta R. Costa-juss\`{a} \footnotemark[3], Onur \,{C}elebi,Maha Elbayad,Cynthia Gao, Francisco Guzm\'an, Justine Kao, Ann Lee, Alexandre Mourachko, Juan Pino, Sravya Popuri, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, Paden Tomasello, Changhan Wang, Jeff Wang, Skyler Wang},
  journal={ArXiv},
  year={2023}
}