Textual Lacunae Reconstruction Tool (TLR)

The textual lacunae reconstruction tool (TLR), funded by the National Science Foundation, will leverage new techniques for unsupervised transcription to automatically transcribe vast quantities of handwritten Arabic-script text.

OpenITI team members from the University of Maryland, College Park and the University of California, San Diego will lead this project and begin to unlock the Islamicate written tradition, perhaps the largest archive of human cultural production of the premodern world. Through this work, we will train a neural encoder for images of manuscript text lines by learning to reconstruct masked regions (i.e., lacunae) of unlabeled manuscript images. Within the larger world of handwritten text recognition (HTR) research, this approach is completely unexplored. However, in other fields with similar challenges, related approaches have yielded highly promising results.
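
As a rough illustration of the masking idea described above, the following minimal sketch hides random vertical spans of a text-line image and scores a reconstruction only on the hidden pixels. It is not the project's implementation: the function names, the span-based masking scheme, the zero fill value, and the mean-squared-error objective are all assumptions made for illustration, and the neural encoder itself is left out.

```python
import random


def mask_random_spans(line_image, num_spans=2, span_width=4, seed=None):
    """Hide random vertical spans (artificial lacunae) in a text-line image.

    line_image is a list of rows, each a list of pixel intensities.
    Returns the masked copy and the set of masked column indices.
    All names and defaults here are hypothetical.
    """
    rng = random.Random(seed)
    width = len(line_image[0])
    span_width = min(span_width, width)
    masked_columns = set()
    for _ in range(num_spans):
        # Pick a start column so the span fits inside the image.
        start = rng.randint(0, width - span_width)
        masked_columns.update(range(start, start + span_width))
    # Assumption: masked pixels are filled with 0.0.
    masked_image = [
        [0.0 if x in masked_columns else pixel for x, pixel in enumerate(row)]
        for row in line_image
    ]
    return masked_image, masked_columns


def masked_reconstruction_loss(prediction, target, masked_columns):
    """Mean squared error computed over the masked columns only."""
    errors = [
        (prediction[y][x] - target[y][x]) ** 2
        for y in range(len(target))
        for x in masked_columns
    ]
    return sum(errors) / len(errors) if errors else 0.0
```

In a training loop of this kind, a model would receive the masked image and be penalized only for what it predicts inside the hidden regions, so no transcription labels are needed.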

For more information on this project, please see this press release from the University of Maryland, College Park and our team’s recent publication on the efficacy of lacuna reconstruction.

Funding and Project Duration: $597,844.00 from July 2022 to June 2025 (see subgrant one and subgrant two on the National Science Foundation website for more information).

Primary Project Personnel

Jonathan Parkes Allen

Postdoctoral Research Associate, Roshan Institute for Persian Studies, University of Maryland, College Park

Taylor Berg-Kirkpatrick

Associate Professor of Computer Science, Department of Computer Science and Engineering, University of California, San Diego

Danlu Chen

Doctoral Candidate, Department of Computer Science and Engineering, University of California, San Diego

Matthew Thomas Miller

Assistant Professor of Persian Literature & Digital Humanities, Roshan Institute for Persian Studies, University of Maryland, College Park; Director, Roshan Initiative in Persian Digital Humanities; Affiliate, Maryland Institute for Technology in the Humanities

John Mullan

Faculty Assistant, Roshan Institute for Persian Studies, University of Maryland, College Park; Digital Specialist, Roshan Initiative in Persian Digital Humanities

Nikolai Vogler

Doctoral Candidate, Department of Computer Science and Engineering, University of California, San Diego