Projects

OpenITI is involved in a number of projects. Our past and present projects are listed below:

Automatic Collation for Diversifying Corpora (ACDC)

The Automatic Collation for Diversifying Corpora (ACDC) project, funded by a Level III Digital Humanities Advanced Grant from the National Endowment for the Humanities, aims to significantly improve the accuracy of handwritten text recognition (HTR) for Arabic-script manuscripts. Our team will develop a collation tool to automatically create large amounts of training data from existing digital texts and manuscript images without time-consuming human annotation of individual manuscripts.


CorpusBuilder

In 2017, OpenITI joined forces with the SHARIAsource project of the Program in Islamic Law at Harvard Law School to develop CorpusBuilder, a robust and user-friendly OCR pipeline. The Program in Islamic Law funded the project.


Digital Publications

OpenITI has begun piloting the production of the first digital publications of Persian and Arabic works, taken straight from their original manuscript form into a digital publication without a print intermediary. We are developing two projects in collaboration with Carl Ernst for our digital publication pipeline.


Textual Lacunae Reconstruction Tool (TLR)

The Textual Lacunae Reconstruction Tool (TLR), funded by the National Science Foundation, will leverage new techniques for unsupervised transcription to automatically transcribe vast quantities of handwritten Arabic-script text.
