18 OpenITI mARkdown
OpenITI mARkdown is a lightweight markup language designed to add basic tagging in a quick and efficient way to text in right-to-left languages.
Advantages:
- no conflicts between LTR tags and RTL text
- no pairs of closing and opening tags as in TEI XML: only single tags
- OpenITI mARkdown tags can be converted to TEI XML
- highlighting and folding schemes available for EditPad Pro and Kate editors.
- limited but expandable tag set
Full documentation: alraqmiyyat.github.io/mARkdown/
18.1 OpenITI mARkdown header
Each OpenITI mARkdown file has a header that contains three elements:
- the so-called “magic value” ######OpenITI#, which activates the OpenITI mARkdown highlighting scheme in EditPad Pro. It should be on the first line of the file, followed by a line break
- Zero or more metadata fields, each on one line, and each introduced by the tag
#META#. This header is meant for metadata salvaged from digital text files, and is not necessarily machine-readable. Machine-readable metadata to each text file is provided in a separate file, in the YAML format; this file has the same filename as the file it describes, and the file extension.yml.
- Header splitter #META#Header#End#: this tag, preceded and followed by an empty line, splits the metadata header off from the main body of the text.
18.2 Structural text units
18.2.1 Paragraphs
- #: The start of a paragraph in OpenITI texts is tagged with a single hashtag, followed by a space.
- ~~: because many text editors have trouble displaying very long lines of text, paragraphs in OpenITI texts are split into short lines (max. 72 characters), each preceded by two tildes.
 
Paragraphs and page numbers, highlighted in Kate editor
NB: in the near future, OpenITI mARkdown will abandon these tags and will mark
paragraphs as done in standard markdown: double line breaks as paragraph dividers.
18.2.2 Page numbers: PageV##P###
Page numbers in OpenITI mARkdown are formatted in the PageV##P### format;
V## refers to the volume number, P### to the page number. Both are padded
with zeroes (e.g., PageV01P001).
manuscript transcriptions, add A to the page number for recto, and B for verso
(e.g., PageV01P002B).
18.2.3 Section headers: ### |, ### ||, etc.
Section header tags must always start on a new line and consist of two elements:
- three hashtags, followed by a space
- one or more “pipe” symbols (|), followed by a space. The number of pipes defines the level of a header: one pipe is level 1, two pipes is level 2, etc. The highlighting scheme will give each level a different colour.
If the section has a title, put it on the same line immediately after the final space of the tag. If the title does not have a section, the rest of the line after the space may be left blank.
 
Folded OpenITI mARkdown section headers in Kate editor
18.2.4 Dictionary units: ### $
Lemmata in texts that have a dictionary format (lexicographical dictionaries,
biographical dictionaries, ṭabaqāt works, bibliographical works, etc.)
can be tagged using the ### $ tag:
- ### $: simplified lemma tag: can be used if there is only one type of entry in a dictionary-type work, or as a temporary tag
- descriptive unit tags: the type of lemma can be expressed by putting it between
two dollar signs: e.g.,
- dictionary entries:
- ### $DIC_NIS$[a descriptive name entry]
- ### $DIC_TOP$[a toponym entry]
- ### $DIC_LEX$[a lexical entry]
- ### $DIC_BIB$[a book title]
 
- biographical dictionary entities:
- ### $BIO_MAN$[a biography of a man]
- ### $BIO_WOM$[a biography of a woman]
- ### $BIO_REF$[a cross-reference, for both men and women]
- ### $BIO_NLI$[a list of names]
 
 
- dictionary entries:
- shorthand unit tags: if a book contains multiple types of lemmata (e.g.,
biographies of men and women; book titles and genre categories), multiple dollar
signs can be used to differentiate between each type. E.g.,
- ### $[a biography of a man]
- ### $$[a biography of a woman]
- ### $$$[a cross-reference and/or repetition, for both men and women]
- ### $$$$[a list of names]
 
18.2.5 Paratext sections: ### |EDITOR|, ### |PARATEXT|, ### |APPENDIX|
These tags are used to mark sections that do not really form part of the text; for many kinds of computational analysis, it may be useful to disregard these sections:
- ### |EDITOR|: title pages and front matter of the printed editions, editorial introductions, indices, table of contents, etc.
- ### |PARATEXT|: scribal additions to the text, riwāyas at the start of a text, samāʿa statements, …
- ### |APPENDIX|: appendices to the text that are closely related to the text, e.g., known fragments from lost parts of a text, sections from alternative manuscript transmission chains, etc.
All of these tags are put on their own on a new line of text at the beginning
of the section. The section continues until the next ### tag, or the end of
the text
18.2.6 Verses of poetry: # <hemistich_1> %~% <hemistich_2
NB: make sure each verse starts on its own line.
 
Poetry verses tagged in OpenITI mARkdown, highlighted in EditPad Pro
18.3 Semantic patterns
18.3.1 Years: YY####, YB####, YD####, YA####
Years mentioned in a text can be tagged by placing one of the following tags in front of them (separate them from the text with a space):
- YB####: year of birth
- YD####: year of death
- YA####: age in years
- YY####: any other type of year reference
NB: these tags must be preceded and followed by a space.
 
Years tagged in OpenITI mARkdown, as displayed by EditPad Pro
18.3.2 Named entities (toponyms, personal names, …): @XXX##
This category of tags has a special structure: all tags consist of:
- an “at sign” @
- a three-letter code:
- TOPfor toponyms
- SOCfor biographical characteristics
- PERfor persons
- SRCfor sources (book titles, citation statements like qāla fulān, …)
- QURfor Qur'ān verses
- a digit that refers to the length of an attached prefix: e.g.,
- @TOP01 بغداد
- @TOP11 ببغداد
- @TOP21 وببغداد
- one or more digits that refer to the number of tokens of the entity (The highlighting scheme will highlight this number of tokens after the tag): e.g.,
- @TOP12 بمدينة السلام
- @SRC010 قال محمد بن عبد الملك بن علي بن أحمد السندي
NB: in order to speed up your tagging, you may use variants of these same tags
with a single-letter code: @T##, @S##, @P##, @Q##
 
Named entities highlighting in EditPad Pro
18.4 Further reading
Full documentation: alraqmiyyat.github.io/mARkdown/