Funded through two grants from The Andrew W. Mellon Foundation, Phase I of the Open Islamicate Texts Initiative Arabic-script OCR Catalyst Project (OpenITI AOCP) was the first undertaking of its kind to tackle the technical and organizational barriers that have historically stymied the development of Arabic-script OCR and digital text production for Islamicate Studies.
For full detail on the first phase of OpenITI AOCP, please see this article.
Phase II of OpenITI AOCP brings together a highly interdisciplinary team of experts in Islamicate studies, digital humanities, and computer science to build textual resources for the digital study of the Islamicate world and transform open-source optical character recognition (OCR)/handwritten text recognition (HTR) technology for all languages. This project is led by investigators from the Roshan Institute for Persian Studies and the Maryland Institute for Technology in the Humanities at the University of Maryland; Northeastern University’s (NU) NULab for Texts, Maps, and Networks; the University of California, San Diego; and the Centre for Digital Humanities at the Aga Khan University’s Institute for the Study of Muslim Civilisations (London).
OpenITI AOCP Phase II will build on the considerable successes of the Phase I project in piloting corpus production for Persian and Arabic and advancing OCR character accuracy rates (CARs) on the most common typefaces in Persian and Arabic print history. In Phase II, the OpenITI AOCP team will dramatically expand the size of OpenITI’s Persian and Arabic corpus through large-scale OCR work in both languages; extend the linguistic capabilities of OpenITI’s OCR tools into Ottoman Turkish and Urdu; transform its open-source OCR/HTR pipeline by incorporating newly developed unsupervised machine learning tools into its workflow (including into its user-friendly interface, eScriptorium); build individual scholarly and institutional Islamicate manuscript HTR workflows; and convene an expert workshop to critically assess the ethical and technological issues surrounding next-generation digital text dissemination.
Primary Project Personnel
Programme Manager, Institute for the Study of Muslim Civilisations, Aga Khan University, London
Jonathan Parkes Allen
Postdoctoral Research Associate, Roshan Institute for Persian Studies, University of Maryland, College Park
Associate Professor of Computer Science, Department of Computer Science and Engineering, University of California, San Diego
Assistant Research Professor, Roshan Institute for Persian Studies, University of Maryland, College Park
Digital Projects Assistant, Hill Museum and Manuscript Library
Matthew Thomas Miller
Assistant Professor of Persian Literature & Digital Humanities, Roshan Institute for Persian Studies, University of Maryland, College Park; Director, Roshan Initiative in Persian Digital Humanities; Affiliate, Maryland Institute for Technology in the Humanities
Postdoctoral Research Associate, Khoury College of Computer Sciences, Northeastern University
Doctoral Candidate, Khoury College of Computer Sciences, Northeastern University
Assistant Director for Finance and Administration, School of Languages, Literatures, and Cultures, University of Maryland, College Park
Sarah Bowen Savant
Digital Lead, KITAB project
Mellon Islamicate Digital Humanities Postdoctoral Associate, Roshan Institute for Persian Studies, University of Maryland, College Park
Chief Librarian, Forman Christian College University
Senior Research Software Developer, Maryland Institute for Technology in the Humanities, University of Maryland, College Park
Doctoral Candidate, Department of Computer Science and Engineering, University of California, San Diego
OpenITI AOCP Phase II Partner Projects
Professor Emeritus & William R. Kenan, Jr. Distinguished Professor of Islamic Studies, Department of Religious Studies, University of North Carolina at Chapel Hill; Principal Investigator, Omar ibn Said Digitization Project
Intisar A. Rabb
Professor of Law, Harvard Law School; Professor of History, Harvard University; Director, Program in Islamic Law
Khedouri A. Zilkha Professor of Jewish Civilization in the Near East, Princeton University; Director, Princeton Geniza Lab