1 Scope
ISO 24613-4 describes the serialization of the lexical markup framework (LMF) model defined as an XML model compliant with the Text Encoding Initiative (TEI) Guidelines. This serialization covers the classes of ISO 24613-1 (the LMF core model) as well as classes provided by ISO 24613-2 (the machine-readable dictionary, MRD, model) and ISO 24613-3 (the etymological extension).
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of ISO 24613-4. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 24613-1, Language resource management — Lexical markup framework (LMF) — Part 1: Core model
ISO 24613-2, Language resource management — Lexical markup framework (LMF) — Part 2: Machine-readable dictionary (MRD) model
ISO 24613-3, Language resource management — Lexical markup framework (LMF) — Part 3: Etymological extension
terser (see References [4], [5], [9], [11] and [14]), while eliciting specific constraints needed to align with ISO 24613 in general. For instance, precise value lists are given for some attributes such as @type. In addition, this document complies with the cardinalities expressed in ISO 24613-1, ISO 24613-2 and ISO 24613-3.
Unless explicitly stated, all resulting constructs shall be valid TEI representations, which means that the specification described in ISO 24613-4 corresponds to a pure subset of the TEI Guidelines. They shall thus be well-formed XML documents as specified by the W3C XML recommendation.
This document requires compliance with ISO 24613-1, ISO 24613-2, and ISO 24613-3 when implementing data categories referred to in the respective parts.
Nevertheless, ISO 24613-4 does not elaborate on the metadata aspects from LMF, since the TEI header, i.e. the metadata component attached to any TEI document, is in essence rich enough in that it represents all the aspects related to the creation, the content description, the versioning and publishing of a textual document as a whole.
In all XML examples in ISO 24613-4, and in order to simplify the actual representations, it is assumed, unless otherwise stated, that XML elements belong to the TEI namespace, thus assuming that all examples are within the scope of the following XML namespace declaration:
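The TEI namespace URI is http://www.tei-c.org/ns/1.0. As a minimal sketch, a root element carrying it as the default namespace could look as follows; the header and body are elided here and are not taken from this document.

```xml
<!-- Default namespace declaration: every unprefixed element below is a TEI element -->
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- teiHeader and text content omitted in this sketch -->
</TEI>
```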
The elements provided by the TEI Guidelines within <gramGrp> for grammatical description of an associated class are as follows.
— <pos>20) (part of speech) to indicate the grammatical category of the lexical item. This corresponds to the /partOfSpeech/ data category in ISO 24611:2012, Annex A.
NOTE 1 This element is equivalent to the use of the <gram> element with the appropriate @type attribute:
<gram type="partOfSpeech">, as used in the TEI Lex-0 initiative[5].
— <gen>21) (grammatical gender) to indicate the grammatical gender (if relevant) of the lexical item or one of its inflected forms. This corresponds to the /grammaticalGender/ data category in ISO 24611:2012, Annex A.
NOTE 2 This element is equivalent to the use of the <gram> element with the appropriate @type attribute:
<gram type="grammaticalGender">.
— <number>22) (grammatical number) to indicate the grammatical number (if relevant) of the lexical item or one of its inflected forms. This corresponds to the /grammaticalNumber/ data category in ISO 24611:2012, Annex A.
NOTE 3 This element is equivalent to the use of the <gram> element with the appropriate @type attribute:
<gram type="grammaticalNumber">.
— <per>23) (person) to indicate the grammatical person (if relevant) of the lexical item or one of its inflected forms. This corresponds to the /person/ data category in ISO 24611:2012, Annex A.
NOTE 4 This element is equivalent to the use of the <gram> element with the appropriate @type attribute:
<gram type="person">.
— <tns>24) (tense) to indicate the grammatical tense (if relevant) of the lexical item or one of its inflected forms. This corresponds to the /grammaticalTense/ data category in ISO 24611:2012, Annex A.
NOTE 5 This element is equivalent to the use of the <gram> element with the appropriate @type attribute:
<gram type="grammaticalTense">.
— <subc>25) (subcategorization) to indicate subcategorization information (e.g. transitive/intransitive/ditransitive, countable/non-countable, etc.).
— <iType>26) (inflectional class) to indicate the inflectional class associated with the lexical item.
For instance, to indicate that the part of speech is a verb and that it is intransitive, the following construct shall be used, see Examples 1 to 3.
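As an illustrative sketch of such a construct (not a reproduction of the standard's own examples), a <gramGrp> for an intransitive verb could combine <pos> and <subc>. The text values "verb" and "intransitive" are assumed for illustration.

```xml
<!-- Sketch: dedicated TEI elements for part of speech and subcategorization -->
<gramGrp xmlns="http://www.tei-c.org/ns/1.0">
  <pos>verb</pos>
  <subc>intransitive</subc>
</gramGrp>
```

Following the equivalence given in the notes above, the same information could presumably be written with the generic <gram> element. The @type value "subcategorization" is an assumption, since no note gives the value that corresponds to <subc>.

```xml
<!-- Sketch: generic <gram> form; type="subcategorization" is a hypothetical value -->
<gramGrp xmlns="http://www.tei-c.org/ns/1.0">
  <gram type="partOfSpeech">verb</gram>
  <gram type="subcategorization">intransitive</gram>
</gramGrp>
```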
The implicit referencing mechanism can be used in the case of multiword expressions when a form is construed to be made of several sub-forms which can be mapped onto other existing lexical entries. The segmentation of the <form> shall be made by means of the <seg> element, with a @corresp attribute containing a pointer to another entry.
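A minimal sketch of this segmentation for the French multiword expression "pomme de terre" is given below. The entry identifiers and the targets of @corresp are hypothetical, and placing <seg> inside <orth> rather than directly inside <form> is an assumption; TEI permits both.

```xml
<!-- Sketch: each <seg> points to the (hypothetical) entry of the corresponding sub-form -->
<entry xmlns="http://www.tei-c.org/ns/1.0" xml:id="pomme_de_terre" xml:lang="fr">
  <form type="lemma">
    <orth><seg corresp="#pomme">pomme</seg> <seg corresp="#de">de</seg> <seg corresp="#terre">terre</seg></orth>
  </form>
</entry>
```

The pointers use the fragment-identifier syntax (#id), so the target entries are assumed to live in the same document and to carry matching @xml:id values.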
5.12 Data category selection
To establish the link to selected data categories, the TEI Guidelines provide two mechanisms.
For elements having a very precise scope, such as <gen>, <number>, <mood>, two attributes can be activated: @dcr:datcat for data category name, and @dcr:valueDatcat for the corresponding value.
For larger scope TEI elements, such as <form>, @type can replace @dcr:datcat (to carry data category names) for grammatical/syntactic information described within entries.
The following example illustrates the decoration of the representation for gender and number with corresponding data categories.
The namespace associated with the prefix "dcr" shall be conformant to the TEI Guidelines.
[inside entries]
<gen dcr:datcat="gender" dcr:valueDatcat="masculine">masculin</gen>
<number dcr:datcat="number" dcr:valueDatcat="plural">pluriel</number>
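To show where the "dcr" prefix would be bound, the fragment can be placed in a complete entry, as sketched below. The namespace URI http://www.isocat.org/ns/dcr is an assumption based on examples in the TEI Guidelines and should be checked against the edition in use. The entry identifier, the word form and the @type value "inflected" are hypothetical.

```xml
<!-- Sketch: dcr namespace URI assumed from TEI Guidelines examples; verify before use -->
<entry xmlns="http://www.tei-c.org/ns/1.0"
       xmlns:dcr="http://www.isocat.org/ns/dcr"
       xml:id="chevaux_example" xml:lang="fr">
  <form type="inflected">
    <orth>chevaux</orth>
    <gramGrp>
      <gen dcr:datcat="gender" dcr:valueDatcat="masculine">masculin</gen>
      <number dcr:datcat="number" dcr:valueDatcat="plural">pluriel</number>
    </gramGrp>
  </form>
</entry>
```

Here <form> carries its category through @type, while the narrowly scoped <gen> and <number> carry theirs through the two dcr attributes, which mirrors the two mechanisms described above.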
6 Serialization of the MRD model (ISO 24613-2)
6.1 Implementing the form representations for the Form class
When implemented as a subclass of the OrthographicRepresentation class, the FormRepresentation class shall be represented by means of one of the following elements: <orth>, <pron>, <hyph>, <stress>, <syll>, coupled with the appropriate attributes to qualify the content, in particular @type and @notation.
The main two elements that shall be used for indicating the orthographic and the phonetic representation of a form are the following:
— <orth>29) in cases when the transcription is truly orthographic, as defined by the writing convention of the corresponding object language;
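As a short sketch of how an orthographic and a phonetic representation could sit side by side in one form, consider the fragment below. The headword, the transcription and the @notation value "ipa" are illustrative assumptions rather than values prescribed here.

```xml
<!-- Sketch: one orthographic and one phonetic representation of the same form -->
<form xmlns="http://www.tei-c.org/ns/1.0" type="lemma">
  <orth xml:lang="en">dictionary</orth>
  <pron notation="ipa">ˈdɪkʃənəri</pron>
</form>
```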
This element can be associated with an @xml:lang attribute which provides the actual language of the corresponding etymon, and which shall be encoded in accordance with IETF BCP 47.
7.2.2 Representing the meaning of an etymon
The meaning of an etymon can be expressed using the <gloss> (see 7.6) or <def> elements depending on the encoding constraints. <gloss> can be used for a basic summary of the meaning of the etymon, or in the case of bi-/multilingual content for translations. <def> should be used for more detailed descriptions or a definition of the etymon. In both cases, if the <gloss> or <def> is in a language other than that of the etymon, the @xml:lang attribute shall be applied to the given element.
The <usg> element with the @type attribute can be used to declare a number of different features about the etymon's semantic or sociolinguistic usage. It is recommended that the values for @type be compliant with those of the TEI Lex-0 initiative[5].
7.2.3 Representing the language of an etymon
The <lang> element shall be used to encode the explicit descriptive reference to the language associated with an etymon. A @norm attribute can be used to indicate a standardized representation of the corresponding language or language family in compliance with IETF BCP 47.
EXAMPLE <lang expand="Mittelhochdeutsch" norm="gmh">mhd.</lang>
7.2.4 Associating grammatical information to an etymon
The provision of supplemental grammatical information to an etymon shall be made by means of the <gramGrp> element in the same way as for the <form> element (see 5J).
7.2.5 Dating an etymon
The <date>45) element shall be used to mark up the period associated with an etymon, together with the following constraints:
— @type;
— temporal attributes (see below).
Optionally, if the precise date is not known, or the dates concern a span of time, the <date> element can be expressed without text (e.g. <date/>) and the dating information can be specified as attributes. This can be expressed by means of one or more of the following attribute pairs:
— @notBefore, @notAfter.
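A minimal sketch of an empty <date> element bounded by this attribute pair is shown below. The year values are illustrative only (roughly the span usually given for Middle High German), and @type is left out because its permitted values are not listed in this clause.

```xml
<!-- Sketch: empty <date> whose period is carried entirely by attributes -->
<date xmlns="http://www.tei-c.org/ns/1.0" notBefore="1050" notAfter="1350"/>
```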

