jigsawstack / vocr

Recognise, describe and retrieve data within an image with great accuracy.


🧾 JigsawStack vOCR – Vision-Powered OCR & Data Extraction

This model wraps the JigsawStack vOCR API.

vOCR (Vision OCR) by JigsawStack enables intelligent image and document understanding through deep OCR, layout analysis, and targeted data extraction. This model supports both descriptive prompts and precise structured queries, including multi-page PDF handling.


🧠 What It Does

You provide an image (or multi-page PDF) and a prompt, which is either:
- A general prompt like "Describe the image in detail" for scene/image summarization
- An array like ["Invoice Number", "Date"] for targeted information extraction

vOCR uses JigsawStack's visual OCR engine to:
- Recognize text
- Understand structure and layout
- Extract semantic meaning from visuals
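The two prompt styles can be sketched as plain input dictionaries. This is only an illustration: the field names (`prompt`, `url`, `api_key`) come from the inputs listed in this README, while the URL and key values are placeholders, and how the dictionary is submitted to the model is left out.

```python
# Placeholder values; replace with a real public URL and your own key.
IMAGE_URL = "https://example.com/invoice.png"
API_KEY = "YOUR_JIGSAWSTACK_API_KEY"

# General description: prompt is a single string.
describe_input = {
    "prompt": "Describe the image in detail",
    "url": IMAGE_URL,
    "api_key": API_KEY,
}

# Targeted extraction: prompt is an array of field names to look for.
extract_input = {
    "prompt": ["Invoice Number", "Date"],
    "url": IMAGE_URL,
    "api_key": API_KEY,
}
```

The only difference between the two requests is the type of `prompt`: a string asks for a description, an array asks for specific fields.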


📥 Inputs

| Name | Type | Required | Description |
|------|------|----------|-------------|
| prompt | string or array | ❌ No | Prompt for what to extract. Defaults to "Describe the image in detail" |
| url | string | ❌ No | Public URL to the image or PDF |
| file_store_key | string | ❌ No | Key to the file stored in JigsawStack File Storage |
| page_range | array of 2 numbers | ❌ No | For PDFs: [startPage, endPage], max 10 pages per request |
| api_key | string | ✅ Yes | Your JigsawStack API key |

⚠️ Either url or file_store_key must be provided, not both.
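The input rules above can be checked on the client side before a request is sent. The sketch below is a hypothetical helper, not part of the JigsawStack API: `build_vocr_input` is an invented name, and it assumes `page_range` is inclusive on both ends when counting pages against the 10-page limit.

```python
from typing import Optional, Sequence, Union

DEFAULT_PROMPT = "Describe the image in detail"
MAX_PAGES_PER_REQUEST = 10


def build_vocr_input(
    api_key: str,
    prompt: Optional[Union[str, Sequence[str]]] = None,
    url: Optional[str] = None,
    file_store_key: Optional[str] = None,
    page_range: Optional[Sequence[int]] = None,
) -> dict:
    """Validate the documented inputs and return them as a dict (hypothetical helper)."""
    if not api_key:
        raise ValueError("api_key is required")
    # Exactly one of url / file_store_key must be set.
    if (url is None) == (file_store_key is None):
        raise ValueError("provide exactly one of url or file_store_key")

    payload = {
        "api_key": api_key,
        "prompt": DEFAULT_PROMPT if prompt is None else prompt,
    }
    if url is not None:
        payload["url"] = url
    else:
        payload["file_store_key"] = file_store_key

    if page_range is not None:
        if len(page_range) != 2:
            raise ValueError("page_range must be [startPage, endPage]")
        start, end = page_range
        if end < start:
            raise ValueError("endPage must not be smaller than startPage")
        # Assumption: both ends are inclusive, so [1, 10] counts as 10 pages.
        if end - start + 1 > MAX_PAGES_PER_REQUEST:
            raise ValueError("page_range may cover at most 10 pages")
        payload["page_range"] = [start, end]
    return payload
```

For example, `build_vocr_input("KEY", url="https://example.com/doc.pdf", page_range=[1, 10])` returns a dictionary with the default prompt filled in, while passing both `url` and `file_store_key` raises a `ValueError`.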