cjwbw/unival:00a9af2b

Input schema

The fields you can use to run this model with an API. If you don’t give a value for a field, its default value is used.

Field: input_image
Type: string
Description: Input image.

Field: input_audio
Type: string
Description: Input audio.

Field: input_video
Type: string
Description: Input video.

Field: task_type
Type: string (enum)
Default value: Image Captioning
Options: Image Captioning, Video Captioning, Audio Captioning, Visual Grounding, General, General Video
Description: Choose a task.

Field: instruction
Type: string
Description: Provide a question for the VQA task, a region for the Visual Grounding task, and an instruction for General tasks. The default instruction for Captioning tasks is ‘What does the image/video/audio describe?’
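As a rough illustration of how the fields above fit together, the sketch below assembles an input payload and rejects a task_type that is not one of the listed options. The helper name `build_input`, the `TASK_TYPES` tuple, and the example URL are hypothetical and are not part of the model's API; only the field names, the options, and the default come from the schema.

```python
# Allowed values for task_type, copied from the input schema.
TASK_TYPES = (
    "Image Captioning",
    "Video Captioning",
    "Audio Captioning",
    "Visual Grounding",
    "General",
    "General Video",
)


def build_input(task_type="Image Captioning", instruction=None,
                input_image=None, input_audio=None, input_video=None):
    """Build an input dict for the model (hypothetical helper).

    Fields left as None are omitted, so the model's own defaults apply.
    """
    if task_type not in TASK_TYPES:
        raise ValueError(f"task_type must be one of {TASK_TYPES}, got {task_type!r}")

    payload = {"task_type": task_type}
    optional = {
        "instruction": instruction,
        "input_image": input_image,
        "input_audio": input_audio,
        "input_video": input_video,
    }
    # Keep only the fields the caller actually supplied.
    for name, value in optional.items():
        if value is not None:
            payload[name] = value
    return payload


# Placeholder URL, used purely for illustration.
example = build_input(input_image="https://example.com/cat.png")
```

A dict built this way is the kind of value a client would pass as the model input; how it is sent depends on the client library in use.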

Output schema

The shape of the response you’ll get when you run this model with an API.

Schema
{'properties': {'answer': {'title': 'Answer', 'type': 'string'},
                'output': {'format': 'uri',
                           'title': 'Output',
                           'type': 'string'}},
 'type': 'object'}
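
The schema above describes an object with two string properties, neither marked as required, where `output` carries the `uri` format. A minimal sketch of checking a response against it, assuming the response has already been decoded into a Python dict; the function name `check_response` and the sample values are hypothetical, and the URI test is a loose scheme check rather than full validation.

```python
from urllib.parse import urlparse

# Output schema as shown above.
OUTPUT_SCHEMA = {
    "properties": {
        "answer": {"title": "Answer", "type": "string"},
        "output": {"format": "uri", "title": "Output", "type": "string"},
    },
    "type": "object",
}


def check_response(response):
    """Loosely check a decoded response against OUTPUT_SCHEMA (hypothetical helper)."""
    if not isinstance(response, dict):
        raise TypeError("response must be an object (dict)")

    for name, spec in OUTPUT_SCHEMA["properties"].items():
        if name not in response:
            # The schema lists no required properties, so absence is allowed.
            continue
        value = response[name]
        if not isinstance(value, str):
            raise TypeError(f"{name!r} must be a string")
        # Loose check for the 'uri' format: the value must carry a scheme.
        if spec.get("format") == "uri" and not urlparse(value).scheme:
            raise ValueError(f"{name!r} must be a URI, got {value!r}")
    return response


# Made-up sample response, for illustration only.
sample = check_response({"answer": "a cat sitting on a sofa"})
```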