cudanexus / nougat

Nougat: Neural Optical Understanding for Academic Documents

  • Public
  • 224 runs

Run time and cost

This model costs approximately $0.081 to run on Replicate, or 12 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
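
The "12 runs per $1" figure follows from the quoted per-run cost. As a minimal sketch of that arithmetic (the function names are illustrative, and the per-run cost is only the approximate figure above, which varies with the inputs):

```python
import math

# Approximate per-run cost quoted above; the real cost varies with the inputs.
COST_PER_RUN_USD = 0.081


def runs_per_dollar(cost_per_run: float, budget: float = 1.0) -> int:
    """Whole number of runs that fit into the given budget."""
    return math.floor(budget / cost_per_run)


def estimated_cost(n_runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Rough total cost in USD for n_runs, rounded to cents."""
    return round(n_runs * cost_per_run, 2)


print(runs_per_dollar(COST_PER_RUN_USD))  # 1 / 0.081 is about 12.3, so 12 whole runs
print(estimated_cost(100))
```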

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 7 minutes. The predict time for this model varies significantly based on the inputs.

Readme

Nougat OCR

Introduction

This repository contains the source code for Nougat OCR, a tool for Optical Character Recognition (OCR) using the Nougat model. Follow the instructions below to set up the environment and run the OCR.

Installation

  1. Clone this repository:

```bash
git clone https://github.com/cudanexus/nougat.git
```

     Then download the model files from Hugging Face using Git LFS:

       • Make sure you have Git LFS installed (see the Git LFS installation instructions).
       • Run the following commands:
git lfs install
git clone https://huggingface.co/spaces/tomriddle/nougat

2. After running the commands above, the folder cloned from Hugging Face should look like this:

input
Upload nougat.pdf
nougat
output
Upload nougat.pdf
README.md
app.py
requirements.txt

3. Copy the nougat folder (which contains all model files) to the root of this repository. Your updated structure should look like:

input
nougat
--- config.json
--- pytorch_model.bin
--- special_tokens_map.json
--- tokenizer.json
--- tokenizer_config.json
output
app.py
cog.yaml
output.txt
predict.py
requirements.txt

4. Install the required Python packages:

pip install -r requirements.txt
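
Before moving on to testing, the layout from step 3 can be sanity-checked with a short script. This is a sketch, not part of the repository: `check_model_dir` is a hypothetical helper, the file names are taken from the listing above, and the script assumes it is run from the repository root. It also flags files that are still Git LFS pointer stubs, which is what a clone leaves behind when Git LFS was not installed.

```python
from pathlib import Path
from typing import List

# File names taken from the folder listing in step 3.
EXPECTED_MODEL_FILES = [
    "config.json",
    "pytorch_model.bin",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
]

# A Git LFS pointer file starts with this line instead of the real content.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def check_model_dir(model_dir: Path) -> List[str]:
    """Return a list of problems found; an empty list means the folder looks fine."""
    problems = []
    for name in EXPECTED_MODEL_FILES:
        path = model_dir / name
        if not path.is_file():
            problems.append(f"missing: {path}")
            continue
        # Read only the first bytes, so large weight files are not loaded.
        with path.open("rb") as fh:
            head = fh.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            problems.append(f"Git LFS pointer, not real content: {path}")
    return problems


if __name__ == "__main__":
    for problem in check_model_dir(Path("nougat")):
        print(problem)
```

If the script prints nothing, all five model files are present and none of them is a pointer stub.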

Testing

Ensure that everything is installed correctly by running:

python app.py --pdf_file input/nougat.pdf

If the installation is successful, you should see the OCR output.
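
To run the same test over several PDFs, the command above can be wrapped in a small loop. The following is a sketch under stated assumptions: `build_command` and `run_all` are hypothetical helpers, the script is run from the repository root, and `app.py` is assumed to exit with a non-zero status on failure, which the source does not confirm.

```python
import subprocess
import sys
from pathlib import Path


def build_command(pdf_path: Path) -> list:
    """Build the same command line as the test step above for one PDF."""
    return [sys.executable, "app.py", "--pdf_file", str(pdf_path)]


def run_all(input_dir: Path = Path("input")) -> None:
    """Run app.py once per PDF in input_dir and report success or failure."""
    for pdf_path in sorted(input_dir.glob("*.pdf")):
        result = subprocess.run(
            build_command(pdf_path), capture_output=True, text=True
        )
        # Assumption: app.py signals failure through a non-zero exit code.
        if result.returncode != 0:
            print(f"FAILED {pdf_path}: {result.stderr.strip()}")
        else:
            print(f"OK {pdf_path}")
```

Calling `run_all()` processes every `.pdf` file in `input`. Using `sys.executable` keeps the subprocess on the same Python interpreter, and therefore the same installed packages, as the calling script.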

Additional Information

For any issues or questions, please refer to the repository or contact the repository owner.