Creative Commons License


Introduction


Structure/scope

Signs

Forms

Values

Local Values

Instances

Notes

Internal Notes

Lists

Unicode Fields

SL: Sign Lists

(http://oracc.org/ns/sl/1.0)

Steve Tinney
Version of 2017-08-10

Introduction

Sign list source files are maintained as simple plain-text files with @-commands, in the manner of Oracc glossaries. This document gives a simple overview of the SL commands.

Structure/scope

The file is a collection of records, one per sign. Each record contains a partly ordered set of fields: a few ordering constraints matter, but no single strict order is imposed on every field in the record. The ordering restrictions apply to the @form field and to a group of fields whose scope depends on where they occur relative to other fields.

By convention, @form fields come at the end of the @sign block, immediately before the @end sign. This allows a short version and a long version of the @form field. If a @form is followed immediately by another @form or by @end sign, it is considered complete and @end form is silently supplied by the parser. If a @form field is followed by any other sign field, it must be concluded with @end form.
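The short-form rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not the Oracc parser itself; the function name supply_end_form and the second form (~b SU) in the usage note are hypothetical.

```python
def supply_end_form(lines):
    """Return the lines with every implicit '@end form' made explicit."""
    out = []
    short_form_open = False  # True when the previous field was a bare @form
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        is_form = line.split()[0] == "@form"
        # A @form followed at once by another @form or by '@end sign'
        # is complete, so the closing field is supplied here.
        if short_form_open and (is_form or line == "@end sign"):
            out.append("@end form")
        short_form_open = is_form
        out.append(line)
    return out
```

Given the lines @sign BA, @v ba, @form ~a ZU, @form ~b SU, @end sign, the sketch adds an @end form after each of the two forms. A long form, which already carries its own @end form, passes through unchanged.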

The scoped fields apply either to signs, forms or values, depending on where they are given. If they occur before any values or forms, they belong to the sign. If they occur within a form, but before any values which the form may contain, they belong to the form. If they occur after a value, they belong to the value.

The scoped fields are: @note, @inote, and @inst.
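As a rough illustration of the scoping rule, the following Python sketch walks the lines of a record and reports which sign, form or value each scoped field attaches to. The function name scoped_owners is hypothetical, and one behaviour is an assumption not stated in this document: after an explicit @end form, scope is taken to return to the enclosing sign.

```python
SCOPED_FIELDS = {"@note", "@inote", "@inst"}
VALUE_FIELDS = {"@v", "@v?", "@v-"}


def scoped_owners(lines):
    """Return (field, owner_kind, owner_text) for every scoped field."""
    owner = None       # what a scoped field would attach to right now
    sign_owner = None  # remembered so scope can fall back to the sign
    found = []
    for raw in lines:
        parts = raw.strip().split(None, 1)
        if not parts:
            continue
        tag = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        if tag == "@sign":
            sign_owner = owner = ("sign", rest)
        elif tag == "@form":
            owner = ("form", rest)
        elif tag in VALUE_FIELDS:
            owner = ("value", rest)
        elif tag == "@end":
            # Assumption: closing a form returns scope to the sign.
            owner = sign_owner if rest == "form" else None
        elif tag in SCOPED_FIELDS and owner is not None:
            found.append((tag, owner[0], owner[1]))
    return found
```

A @note placed directly after @sign BA is reported as belonging to the sign, one placed after @form ~a ZU but before its first @v belongs to the form, and one placed after a @v line belongs to that value.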

Signs

Each sign is enclosed in an @sign ... @end sign block. After the @sign comes the sign name, which is the standard way of referring to the sign and must be unique. The sign name is constructed using the conventions of GDL, the grapheme description language used by ATF.

Forms

Any sign may contain @form entries, which at their simplest give a form variant code and a name for the variant form. The form variant codes and names must be unique within the current sign. A form may also contain any of the fields which may be part of a sign, except for nested @form entries.

In a @form field, the variant code is given by a tilde followed by one or more lowercase letters (~a ... ~aa etc.).
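A minimal Python sketch of these two constraints follows: the shape of the variant code, and uniqueness of codes and names within one sign. The names parse_form and check_forms are hypothetical, and reading "lowercase letters" as ASCII a-z is an assumption.

```python
import re

# Assumption: "lowercase letters" means ASCII a-z.
FORM_LINE = re.compile(r"^@form\s+(~[a-z]+)\s+(\S.*)$")


def parse_form(line):
    """Split a @form line into (variant_code, form_name)."""
    m = FORM_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"malformed @form line: {line!r}")
    return m.group(1), m.group(2)


def check_forms(form_lines):
    """Parse the @form lines of one sign, rejecting repeated codes or names."""
    codes, names, parsed = set(), set(), []
    for line in form_lines:
        code, name = parse_form(line)
        if code in codes or name in names:
            raise ValueError(f"duplicate form code or name: {line!r}")
        codes.add(code)
        names.add(name)
        parsed.append((code, name))
    return parsed
```

Under these assumptions, @form ~a ZU parses to the pair (~a, ZU), while a line such as @form ~A ZU is rejected because the code is not lowercase.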

Values

The most common field of @sign is the value, and we call the values of a top-level @sign global values. Global values must be unique with respect to each other, i.e., each global value may only occur once as a value of a top-level @sign.
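The uniqueness requirement amounts to a simple check over the whole list. The Python sketch below assumes the sign list has already been read into a mapping from sign name to its global values; that data layout and the function name find_duplicate_globals are hypothetical.

```python
def find_duplicate_globals(signs):
    """Map each global value claimed more than once to the signs claiming it.

    `signs` maps a sign name to the list of its global values.
    """
    owners = {}
    for sign_name, values in signs.items():
        for value in values:
            owners.setdefault(value, []).append(sign_name)
    return {v: names for v, names in owners.items() if len(names) > 1}
```

An empty result means every global value occurs exactly once, as the rule requires.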

Values are usually simply a reading of the sign, but additional information can be given on the @v line:

Questionable value
A questionable value is indicated by putting a question mark after the field name, as in @v? id₅.
Deprecated value
A deprecated value, one which is no longer considered allowable for the sign or form, is indicated by putting a minus sign after the field name, as in @v- gazum₂.
Language restriction
A value which is only valid for one language may be indicated by putting an ATF language code between the field name and the value, as in @v %elx anše@d.
Proof Example
A proof example may be given on the same line, after the value. These are not needed for most values, but where a value is contested and a particular instance is critical to the proof of the value's existence, the instance may be given in square brackets, following the general conventions for instances described elsewhere in this document. E.g., @v nanna₂ [SpTu 2 36 = cams:P348641 o 18, i-nanna₂-ma].
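The four annotations above can be recognised with one regular expression. The Python sketch below is an illustration built only from the examples in this section. The name parse_value_line is hypothetical, and the pattern assumes that the flag is attached directly to @v, that the language code precedes the value, and that the value itself contains no spaces.

```python
import re

V_LINE = re.compile(
    r"^@v(?P<flag>[?-])?\s+"          # field name with optional ? or - flag
    r"(?:%(?P<lang>\S+)\s+)?"         # optional ATF language code, e.g. %elx
    r"(?P<value>\S+)"                 # the value itself
    r"(?:\s+\[(?P<proof>.*)\])?\s*$"  # optional proof example in [...]
)


def parse_value_line(line):
    """Break a @v line into its value and optional annotations."""
    m = V_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"not a recognizable @v line: {line!r}")
    return {
        "value": m.group("value"),
        "questionable": m.group("flag") == "?",
        "deprecated": m.group("flag") == "-",
        "lang": m.group("lang"),
        "proof": m.group("proof"),
    }
```

For @v %elx anše@d this yields the value anše@d with language elx. For the nanna₂ example the proof field holds the bracketed instance text without its brackets.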

Local Values

Forms can contain @v lines: we call these local values. Local values may occur more than once in the signlist, either as values of top-level @signs or as values of other forms. Within the signlist system, local values are always considered to be qualified by their form. Thus, if you have the following mythical sign:

@sign BA
@v ba
@form ~a ZU
@v ba
@end form
@end sign

the actual values as internalized by the signlist system are:

ba
ba(BA~a)
ba(ZU)

Of these, the first two are guaranteed to be unique as a result of the sign/form name and value restrictions on uniqueness described above. The third is potentially ambiguous (imagining how such an ambiguity could arise is left as an exercise for the reader). When entering data, however, it is convenient for users to be able to use the form name as a qualifier and only look up the variant code when the system warns about ambiguity.

Note that this approach means that local values must always be qualified by their form, even when they are globally unique.
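The qualification scheme can be reproduced mechanically. The Python sketch below derives the internalized values for one sign from its name, its global values and its forms. It mirrors only the example given above; the function name internal_values and the tuple layout for forms are hypothetical, and the real signlist system may store these differently.

```python
def internal_values(sign_name, global_values, forms):
    """List the internalized values for one sign.

    `forms` is a list of (variant_code, form_name, local_values) tuples.
    """
    keys = list(global_values)  # global values stand unqualified
    for code, form_name, local_values in forms:
        for value in local_values:
            # Qualified by sign name plus variant code: always unique.
            keys.append(f"{value}({sign_name}{code})")
            # Qualified by form name: convenient, but possibly ambiguous.
            keys.append(f"{value}({form_name})")
    return keys
```

Calling internal_values("BA", ["ba"], [("~a", "ZU", ["ba"])]) gives ba, ba(BA~a) and ba(ZU), matching the listing for the mythical sign.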

Instances

Although the general idea of GSL is that instances should be harvested from the corpora, it is possible to include instances manually if so desired. The format of an instance consists of the field name, @inst, followed by the instance data. The instance data may contain up to three subfields, but no restrictions are imposed on which of them are present. These subfields are:

Citation
A human-readable citation, usually the name of the text. If further subfields are given the citation must be followed by an equals sign (=).
CDL Label
A CDL label is a standard label documented elsewhere. Briefly, a label consists of an optional project abbreviation, a required text ID (P- or Q-number), and an optional line label. An example is given above in the discussion of proof examples.
Word
A word may come at the end of an instance, in which case it must be preceded by a comma followed by a space. The word should be a word form which occurs in the line, omitting any editorial marks such as flags and bracketing (though it may have determinative brackets).
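The three subfields can be separated with ordinary string operations, as in the Python sketch below. The names parse_instance and split_label are hypothetical, and several choices are assumptions: the word is taken to follow the last comma-plus-space, a text ID is taken to be P or Q followed by digits, and instance data without an equals sign is treated as a label only if it looks like one.

```python
import re

# Assumption: a label is [project:]<P- or Q-number>[ line label].
LABEL = re.compile(
    r"^(?:(?P<project>[^\s:]+):)?(?P<text_id>[PQ]\d+)(?:\s+(?P<line>.+))?$"
)


def parse_instance(data):
    """Split instance data into citation, label and word (None if absent)."""
    body = data.strip()
    if body.startswith("@inst"):
        body = body[len("@inst"):].strip()
    citation = label = word = None
    head, sep, tail = body.rpartition(", ")
    if sep:  # a word is present after the last ", "
        body, word = head.strip(), tail.strip()
    if "=" in body:
        left, _, right = body.partition("=")
        citation, label = left.strip() or None, right.strip() or None
    elif LABEL.match(body):
        label = body
    elif body:
        citation = body
    return {"citation": citation, "label": label, "word": word}


def split_label(label):
    """Split a label into (project, text_id, line_label)."""
    m = LABEL.match(label)
    if m is None:
        raise ValueError(f"not a recognizable label: {label!r}")
    return m.group("project"), m.group("text_id"), m.group("line")
```

Applied to the proof example quoted earlier, this gives the citation SpTu 2 36, the label cams:P348641 o 18 and the word i-nanna₂-ma. A citation that itself contains a comma followed by a space, with no word after it, would be misread by this sketch.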

Notes

One or more notes may be given each in their own @note field. Notes of this kind appear in rendered versions of the sign list such as web pages and PDFs.

Internal Notes

Internal notes are given in @inote fields. These are exactly the same as @note fields, but they do not display when the sign list is rendered.

Lists

The numbers used for signs in other signlists are given using the @list field. This contains at least an ATF-style sign list reference, e.g., MZL503. This may optionally be followed by an ATF version of the name used in that signlist for the sign in question.

Unicode Fields

Several fields are used for bookkeeping purposes with respect to the Unicode cuneiform specification. These are not yet documented.


Questions about this document may be directed to the Oracc Steering Committee (osc at oracc dot org).