Using Metadata to Indicate Missing Signs and Lines

By Miller Prosser, October 2013

Updated January 2017

OCHRE's data model encourages a user to indicate missing signs and lines as metadata. This article will explain what that means and how to implement it in your project.

What do we mean by indicating missing signs as metadata?

Instead of entering a dash, X, or some other place-holder for missing signs, where each of these characters represents a separate epigraphic unit, the user can and should indicate a run of missing signs as a single epigraphic unit.

In the following image, each missing sign is represented by a separate epigraphic unit. This will work, but it is cumbersome. Also, by using this method, you lose some of the advanced functionality that OCHRE can provide when it generates a view of the text and when it queries for broken words.

If in the image above, both letters before the s are broken, they should be represented by a single epigraphic unit. The user would then indicate the number of missing signs as follows:

What settings and options do you have?

In short, many. Navigate to your Project in the navigation pane. Choose Preferences, then Epigraphic Sigla. Here you will find a section called Missing Item(s) Notation.

Here is a brief explanation of each of these settings.

Missing sign(s), epigraphic: this single character displays in the transliteration pane. We recommend using a dash or 'x' here.

Missing line(s), epigraphic: when an epigraphic unit is categorized as a line, and the user indicates a specific number of missing lines, this text will appear in the transliteration pane.

Missing sign(s), discourse, single: this text will display in the phonemic pane when the missing sign is neither followed nor preceded by another missing sign.

Missing sign(s), discourse, series: this text will display in the phonemic pane when two or more missing signs are adjacent. If left blank, the default is an em-dash.

Missing line(s), discourse: this text appears in the phonemic pane when an entire line is missing. The number sign (#) can be used as a variable to tell OCHRE where to display the number of missing lines.

Missing sign(s), indeterminate length: this text indicates that an unknown number of signs is missing. This text appears in both panes.

NOTE: enter a space (or double-space) in this field if you wish to add linear space for each missing sign. Leaving this field blank AND entering zero in the Missing signs box will produce an ellipsis.

Missing line(s), indeterminate length: this text indicates that an unknown number of lines is missing. This text appears in both panes.
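As a rough illustration of how these settings relate to one another, the sketch below models them as a small Python class. This is not OCHRE's own data structure or API: the class name, field names, and example values are all hypothetical. Only two behaviors are taken from the descriptions above, namely the em-dash fallback for a blank series field and the use of # as a placeholder for the number of missing lines.

```python
from dataclasses import dataclass


@dataclass
class MissingItemNotation:
    """Hypothetical stand-in for the Missing Item(s) Notation settings."""

    sign_epigraphic: str = "-"                # single character, transliteration pane
    line_epigraphic: str = "[missing line]"   # illustrative value only
    sign_discourse_single: str = "x"          # illustrative value only
    sign_discourse_series: str = ""           # blank: fall back to an em-dash
    line_discourse: str = "# lines missing"   # '#' marks where the count goes
    sign_indeterminate: str = "..."
    line_indeterminate: str = "[unknown number of lines missing]"  # illustrative

    def series_text(self) -> str:
        # A blank series field defaults to an em-dash (U+2014).
        return self.sign_discourse_series or "\u2014"

    def missing_lines_text(self, count: int) -> str:
        # Substitute the number of missing lines for the '#' variable.
        return self.line_discourse.replace("#", str(count))


notation = MissingItemNotation()
print(notation.missing_lines_text(3))
```

With the example values above, the last line prints a summary such as the one a project would see for three missing lines.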

Entering metadata

So how do you enter the metadata for your epigraphic units to create a line of text such as in the following example?

In this line, we have one word completely lost in the break, a second word whose final letter is only partially preserved, followed by a break of indeterminate length. In other words, the tablet is completely broken on the right side at the end of line one.

Using the Missing Item(s) Notation as shown in the image above, the epigraphic units appear as follows in the epigraphic hierarchy.

The first epigraphic unit is configured like this:

With the Damage indicator set to Illegible, the signs will be shown within square brackets (or other project-configured damage notation from Epigraphic Sigla) in the transliteration pane. The number of missing signs is set to 3, by which is meant approximately three. The number of signs is previewed in the Missing signs pane, indicating how it will be shown in the text's View.

The second word begins with three missing and illegible signs but ends with one missing and partially legible sign. The number of signs in each case is indicated in the Missing signs field.


The final epigraphic unit in this line is described as having an Unknown # of missing signs. In other words, it is a series of missing signs of indeterminate length. It is also described as Illegible. Because this project uses "..." to indicate missing signs of indeterminate length and square brackets to denote illegibility, this item will display as [...].
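The display rules described for these three epigraphic units can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the function name and parameters are hypothetical, the project is assumed to use a dash for each missing sign, "..." for indeterminate length, and square brackets for illegible content, and partial legibility is not modeled at all.

```python
from typing import Optional


def render_missing_unit(
    count: Optional[int],
    illegible: bool,
    sign_char: str = "-",         # assumed "Missing sign(s), epigraphic" value
    indeterminate: str = "...",   # assumed "indeterminate length" value
) -> str:
    """Return the transliteration text for one missing-content unit.

    count=None stands for an Unknown # of missing signs.
    """
    text = indeterminate if count is None else sign_char * count
    # Illegible content is wrapped in the (assumed) square-bracket notation.
    return f"[{text}]" if illegible else text


print(render_missing_unit(3, illegible=True))     # three missing, illegible signs
print(render_missing_unit(None, illegible=True))  # indeterminate, illegible
```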

Linking epigraphic units to discourse units

Each of the broken epigraphic units above must be linked to discourse units in such a way that OCHRE knows how many words are present. The first word consists of just the first epigraphic unit, [---]. However, this epigraphic unit stands in for three missing letters. As such, it qualifies as a series of missing signs. So, it is displayed in the phonemic view with an em-dash.

The second discourse unit consists of two epigraphic units, the [---] and the -. Together, these two epigraphic units account for four missing letters. Therefore, the phonemic pane displays them as a series of missing signs (an em-dash again). Notice how the display in the discourse hierarchy shows the em-dash, while the two epigraphic units display as links where expected.
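The choice between the single and series notation in the two examples above comes down to the total number of missing signs linked to the discourse unit. A minimal sketch of that reasoning follows. The function name is hypothetical, the "x" used for a single missing sign is an assumed project setting, and this is not a description of how OCHRE implements the rule internally.

```python
from typing import Sequence


def discourse_missing_text(
    missing_counts: Sequence[int],
    single: str = "x",        # assumed "discourse, single" value
    series: str = "\u2014",   # em-dash, the default when the series field is blank
) -> str:
    """Pick the phonemic-pane text for a discourse unit made only of missing signs."""
    total = sum(missing_counts)
    if total < 1:
        raise ValueError("expected at least one missing sign")
    return single if total == 1 else series


first_word = discourse_missing_text([3])      # one unit standing in for 3 signs
second_word = discourse_missing_text([3, 1])  # two units, 3 + 1 = 4 signs
```

Both words in the example line resolve to the series notation, because each accounts for more than one missing sign.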

Combining known and unknown missing content

If several epigraphic units, each representing an unknown number of missing signs, are used together in a single discourse unit, these runs of missing signs will NOT be joined by the syllable-separator. Only "known" runs of missing signs are treated as if they stand in for actual signs, and are therefore shown joined by the syllable-separator (typically a dash).

That is, any Unknown unit of missing content will, by default, not be treated as "signs" and therefore will not be syllable-separated with respect to adjoining content.

Unknown missing content

If an Unknown unit of missing content should be treated as if it represented signs (even though it is unclear which signs, or how many), override the epigraphic unit's Type, setting this to sign. OCHRE will then know to format this item as if it were representing signs, and will therefore use appropriate syllable separation.

Unknown missing content Type'd as "sign"

If a unit of missing content is known to represent a number (and so should be formatted as such), override the epigraphic unit's Type, setting this to number. OCHRE will then use the appropriate number-separator for formatting (the plus sign in this example). Here, the x's must represent numerals, even though it can't be determined which ones.

Unknown missing content Type'd as "number"
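The separator rules in this section can be summarized as a small decision function. The sketch below is an illustration only: the function name and arguments are hypothetical, the dash and plus sign are the separators mentioned in the examples above, and how OCHRE lays out unseparated units on screen is not modeled here.

```python
from typing import Optional


def separator_for(
    count_known: bool,
    type_override: Optional[str] = None,  # None, "sign", or "number"
    syllable_sep: str = "-",              # typical syllable-separator
    number_sep: str = "+",                # number-separator from the example
) -> str:
    """Return the separator used between a missing-content unit and its neighbors."""
    if type_override == "number":
        # Content known to represent a number uses the number-separator.
        return number_sep
    if type_override == "sign" or count_known:
        # Known runs, or Unknown runs Type'd as sign, are treated as signs.
        return syllable_sep
    # By default, Unknown missing content is not separated at all.
    return ""
```

For example, `separator_for(count_known=False)` yields an empty string, while the same unit with `type_override="sign"` yields the syllable-separator.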

Missing lines

When formatting the number of missing lines in a text, it is possible to display in the transliteration pane a text string for each and every missing line or to present a summary statement for the total number of missing lines. Use the Type setting of the epigraphic unit to control this aspect of the display.

If the missing line is given the Type "line" on the epigraphic unit, then the character string specified as the Missing line(s), epigraphic will display for each missing line.

14. text text text




18. text text text

If the missing line is given the Type "region" on the epigraphic unit, then a summary statement will be displayed using the Missing line(s), discourse notation.

14. text text text

3 lines missing

18. text text text

The displayed text is based on the settings provided at the project level. See the image above.
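The difference between the two Type settings can be pictured with a short sketch. The function name and the default strings below are hypothetical placeholders for whatever a project has configured; only the behavior (one string per line for "line", one summary for "region") follows the description above.

```python
from typing import List


def render_missing_lines(
    count: int,
    unit_type: str,                           # "line" or "region"
    line_epigraphic: str = "[missing line]",  # assumed project setting
    line_discourse: str = "# lines missing",  # assumed project setting
) -> List[str]:
    """Return the display rows for a run of missing lines."""
    if unit_type == "line":
        # One copy of the epigraphic notation for each missing line.
        return [line_epigraphic] * count
    if unit_type == "region":
        # A single summary row; '#' is replaced by the line count.
        return [line_discourse.replace("#", str(count))]
    raise ValueError(f"unsupported Type: {unit_type!r}")


for row in render_missing_lines(3, "region"):
    print(row)
```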

A final note on indicating damage and certainty of missing items

All of the metadata options for damage, emendation, placement, and so on can be applied to missing items. A missing line, for example, can be marked as uncertain, or as partially or entirely illegible.