The Open Islamicate Texts Initiative Arabic-script OCR Catalyst Project (OpenITI AOCP) is hosting an Expert Workshop on Digital Publication of Right-to-Left Script Corpora on June 29-30, 2023, at the University of Maryland, College Park, MD, USA.
This two-day workshop will bring together experts on the creation and publication of annotated textual corpora. Participants will be invited to share and work collaboratively on approaches and solutions to the digital publication of text corpora in general, but with a focus on challenges pertaining to text in right-to-left scripts.
For the purposes of this workshop, we construe “digital publication” as the distribution of annotated digital texts for a variety of purposes, from corpora as data to digital scholarly editions on the web.
The workshop intends to cover three main topics related to digital publication: curation, annotation, and dissemination. What approaches and instruments are available to assist in the tagging and encoding of digitized texts? How can texts be further structured and annotated towards publication? What instruments are available to disseminate text corpora on the web as collections or digital editions? What infrastructure is needed for their long-term availability to the scholarly community and beyond?
We expect a mix of demonstrations, broader state-of-the-art or theoretical presentations, and hands-on sessions with data from the OpenITI corpus in a variety of textual formats.
More information will be posted here.
Preliminary program
Day one
The first day of the workshop will focus on project presentations from invited participants, organized around three main topics:
- Curation: What approaches and instruments are available to assist in the tagging and encoding of digitized texts?
- Annotation: How can texts be further structured and annotated towards publication? What about annotation after publication?
- Dissemination: What instruments are available to disseminate text corpora on the web as collections or digital editions? What infrastructure is needed for their long-term availability to the scholarly community and beyond?
We are planning for 30-minute presentations, including 10 minutes for questions.
9:00 - 9:45 am | Opening remarks, participant introductions, workshop overview |
9:45 - 10:45 am | Project presentations from invited participants |
10:45 - 11:00 am | Break |
11:00 am - 12:30 pm | Project presentations from invited participants |
12:30 - 2:00 pm | Lunch |
2:00 - 3:30 pm | Project presentations from invited participants |
3:30 - 3:45 pm | Break |
3:45 - 5:15 pm | Project presentations from invited participants |
5:15 - 6:30 pm | Break |
6:30 pm | Dinner |
Day two
The second day of the workshop will focus on structured discussion and strategic planning. Participants will be given a series of topics to discuss throughout the day. We will adopt a “design studio” format that combines divergent and convergent thinking through breakout sessions and group discussion. The goal is to explore the wide set of tools presented on day one and to create a shared vision that can accommodate a diversity of approaches.
9:00 - 9:15 am | Day overview and planning |
9:15 - 10:45 am | Design studio 1 (e.g. addressing low bandwidth / low tech access) |
10:45 - 11:00 am | Break |
11:00 am - 12:30 pm | Design studio 2 (e.g. sustainability with and without infrastructure) |
12:30 - 2:00 pm | Lunch |
2:00 - 3:30 pm | Design studio 3 (e.g. strategies for right-to-left encoding) |
3:30 - 3:45 pm | Break |
3:45 - 4:30 pm | Closing discussion and remarks |
4:30 - 6:30 pm | Break |
6:30 pm | Dinner (optional) |