donut-tutorial

Donut 🍩 : Document Understanding Transformer

Introduction Donut 🍩, Document understanding transformer, is a new method of document understanding that utilizes an OCR-free end-to-end Transformer model. Donut does not require off-the-shelf OCR engines/APIs, yet it shows state-of-the-art performances on various visual document understanding tasks, such as visual document classification or information extraction (a.k.a. document parsing). In addition, we present SynthDoG 🐶, Synthetic Document Generator, that helps the model pre-training to be flexible on various languages and domains.

Our academic paper, which describes our method in detail and provides full experimental results and analyses, can be found here:

OCR-free Document Understanding Transformer. Geewook Kim, Teakgyu Hong, Moonbin Yim, JeongYeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, Seunghyun Park. In ECCV 2022.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
__pycache__		__pycache__
data_processing		data_processing
misc		misc
.gitignore		.gitignore
README.md		README.md
export.py		export.py
gradio_app.py		gradio_app.py
image_utils.py		image_utils.py
infer_hf.py		infer_hf.py
train-hf.py		train-hf.py
train-pl.py		train-pl.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

donut-tutorial

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

donut-tutorial

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages