JigsawStack vOCR: Vision-Powered OCR & Data Extraction
This model wraps the JigsawStack vOCR API.
vOCR (Vision OCR) by JigsawStack enables intelligent image and document understanding through deep OCR, layout analysis, and targeted data extraction. This model supports both descriptive prompts and precise structured queries, including multi-page PDF handling.
What It Does
You provide an image (or multi-page PDF) and a prompt, which is either:
- A general prompt like "Describe the image in detail" for scene/image summarization
- Or an array like ["Invoice Number", "Date"] for targeted information extraction
vOCR uses JigsawStack’s visual OCR engine to:
- Recognize text
- Understand structure and layout
- Extract semantic meaning from visuals
Inputs
Name | Type | Required | Description |
---|---|---|---|
prompt | string or array | No | Prompt for what to extract. Defaults to "Describe the image in detail" |
url | string | No | Public URL to the image or PDF |
file_store_key | string | No | Key to the file stored in JigsawStack File Storage |
page_range | array of 2 numbers | No | For PDFs: [startPage, endPage], max 10 pages per request |
api_key | string | Yes | Your JigsawStack API key |
Note: Exactly one of url or file_store_key must be provided, not both.
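The input rules above can be checked before a request is sent. The following is a minimal sketch, not part of the JigsawStack SDK: build_vocr_inputs is a hypothetical helper that only validates the documented fields and returns them as a dict. It makes no network call. It also assumes that page_range is inclusive at both ends, which the table does not state.

```python
from typing import Optional, Sequence, Union

DEFAULT_PROMPT = "Describe the image in detail"  # default from the inputs table
MAX_PAGES = 10  # documented limit: max 10 pages per request


def build_vocr_inputs(
    api_key: str,
    prompt: Union[str, Sequence[str], None] = None,
    url: Optional[str] = None,
    file_store_key: Optional[str] = None,
    page_range: Optional[Sequence[int]] = None,
) -> dict:
    """Hypothetical helper: validate vOCR inputs and return them as a dict."""
    if not api_key:
        raise ValueError("api_key is required")

    # Exactly one of url / file_store_key must be set.
    if (url is None) == (file_store_key is None):
        raise ValueError("provide either url or file_store_key, not both")

    # A string prompt asks for a description; a list asks for specific fields.
    if prompt is None:
        prompt = DEFAULT_PROMPT
    elif not isinstance(prompt, str):
        prompt = list(prompt)

    inputs = {"api_key": api_key, "prompt": prompt}
    if url is not None:
        inputs["url"] = url
    else:
        inputs["file_store_key"] = file_store_key

    if page_range is not None:
        if len(page_range) != 2:
            raise ValueError("page_range must be [startPage, endPage]")
        start, end = page_range
        # Assumption: the range is inclusive, so [1, 10] covers 10 pages.
        if start > end or end - start + 1 > MAX_PAGES:
            raise ValueError(f"page_range may span at most {MAX_PAGES} pages")
        inputs["page_range"] = [start, end]

    return inputs


if __name__ == "__main__":
    # Targeted extraction from the first three pages of a PDF (placeholder URL).
    print(build_vocr_inputs(
        "YOUR_API_KEY",
        prompt=["Invoice Number", "Date"],
        url="https://example.com/invoice.pdf",
        page_range=[1, 3],
    ))
```

Validating locally keeps malformed requests from spending an API call. The resulting dict mirrors the input names in the table and can be passed to whichever client you use to call the model.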