Challenges of Layout Analysis across Arabic-Script Training Data

Layout analysis is the process of identifying regions on a page of text (e.g., title, body text, footnotes) before sending the page through the OCR engine. Preparing documents to train our OCR models involves several distinct steps, including adding semantic annotations, fixing segmentation errors, and editing faulty transcriptions. eScriptorium allows users to associate specific labels with regions…
