prunaai/f-lite-juiced

This is the F-Lite model from FAL and Freepik, optimised for a roughly 2x speedup with Pruna.


Input

Prompt (string, required)

Number of inference steps (integer)
Default: 30

Guidance scale (number)
Default: 3

Seed (integer)
Default: -1

Aspect ratio of the output image (string)
Default: "1:1"

Base image size, longest side (integer)
Default: 1024

Output format (string)
Default: "png"

Output quality, for jpg and webp (integer, minimum: 1, maximum: 100)
Default: 80
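
For reference, here is a minimal sketch of calling the model with the official Replicate Python client. This page does not show the raw API field names, so the input keys below are assumptions inferred from the field list above and from the readme code; check the model's API schema before relying on them.

import replicate

# NOTE: the input keys are assumed from the field list above; verify them
# against the model's API schema on Replicate before relying on this.
output = replicate.run(
    "prunaai/f-lite-juiced",
    input={
        "prompt": "A cake with 'pruna' written on it",
        "num_inference_steps": 30,  # default: 30
        "guidance_scale": 3,        # default: 3
        "seed": -1,                 # default: -1
        "aspect_ratio": "1:1",      # default: "1:1"
        "image_size": 1024,         # assumed name; longest side, default: 1024
        "output_format": "png",     # default: "png"
        "output_quality": 80,       # only used for jpg and webp
    },
)
print(output)  # typically a URL or file object for the generated image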

Output

The generated image.

Run time and cost

This model costs approximately $0.018 to run on Replicate, or about 55 runs per $1, though the exact cost varies with your inputs. It is also open source, and you can run it on your own computer with Docker.

This model runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 14 seconds.
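
As a sanity check, the quoted price is consistent with typical A100 pricing. A quick back-of-the-envelope sketch, assuming a per-second rate of roughly $0.0014 (an assumption, not a figure from this page):

# Rough cost check; the per-second GPU rate is an assumed value.
a100_usd_per_second = 0.0014
seconds_per_run = 14  # typical prediction time quoted above

cost_per_run = a100_usd_per_second * seconds_per_run
print(f"~${cost_per_run:.3f} per run, ~{1 / cost_per_run:.0f} runs per $1")
# prints: ~$0.020 per run, ~51 runs per $1 -- close to the numbers above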

Readme

You can apply the same optimisation with Pruna by following this minimal example:

import torch
from f_lite import FLitePipeline

# Trick required because it is not a native diffusers model
from diffusers.pipelines.pipeline_loading_utils import LOADABLE_CLASSES, ALL_IMPORTABLE_CLASSES
LOADABLE_CLASSES["f_lite"] = LOADABLE_CLASSES["f_lite.model"] = {"DiT": ["save_pretrained", "from_pretrained"]}
ALL_IMPORTABLE_CLASSES["DiT"] = ["save_pretrained", "from_pretrained"]

pipeline = FLitePipeline.from_pretrained("Freepik/F-Lite", torch_dtype=torch.bfloat16).to("cuda")
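# Compile the DiT backbone for an additional speedup.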
pipeline.dit_model = torch.compile(pipeline.dit_model)

from pruna_pro import SmashConfig, smash

# Initialize the SmashConfig
smash_config = SmashConfig()
smash_config["cacher"] = "auto"
# smash_config["auto_cache_mode"] = "taylor"
smash_config["auto_speed_factor"] = 0.8  # Lower is faster, but reduces quality
smash_config["auto_custom_model"] = True

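# Apply the optimisations described by the SmashConfig.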
smashed_pipe = smash(
    model=pipeline,
    smash_config=smash_config,
    experimental=True,
)

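# Point the cache helper at the pipeline's __call__, its step-count argument,
# and the DiT backbone's forward, since this is not a native diffusers pipeline.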
smashed_pipe.cache_helper.configure(
    pipe=pipeline,
    pipe_call_method="__call__",
    step_argument="num_inference_steps",
    backbone=pipeline.dit_model,
    backbone_call_method="forward",
)

# Run the smashed pipeline end to end and keep the resulting image.
image = smashed_pipe(
    prompt="A cake with 'pruna' written on it",
    height=1024,
    width=1024,
    num_inference_steps=30,
    guidance_scale=3.0,
    negative_prompt=None,
    generator=torch.Generator(device="cuda").manual_seed(0),
).images[0]
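
The call returns a diffusers-style output object, so the resulting PIL image can be saved or displayed directly:

image.save("pruna_cake.png")  # any path; format inferred from the extension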