MARTIF - Putting Complexity in Perspective

2. MARTIF-Related Criticisms and Controversies

The proposed standard has generated considerable comment and controversy from a number of quarters. Current comments have come from the perspective of generators of terminological databases (TDBs), e.g., the Localization Industry Standards Association (LISA), from the Natural Language Processing (NLP) community, and from people who are already using the standard as a basis for developing a Hypertext Markup Language (HTML) model for distributed terminology networks on the World Wide Web (West and Murray-Rust 1996). There is also serious discussion within the MARTIF work group itself concerning differing fundamental philosophies with respect to interchange, specifically regarding so-called "blind" interchange (see Section 6).

In a recent article in The ELRA Newsletter (reprinted in this issue of TermNet News, pp. 1-3), which he specifically identifies as "designed to stimulate debate", Robin Bonthrone writes:

Where machine-processable terminology is required, past efforts at achieving a common standard have not been particularly successful. The MARTIF standard (ISO DIS 12200) goes some way towards achieving this goal, but it appears to have become somewhat bogged-down in increasingly intricate detail. There seems little point in spending years developing an ISO standard (a process which in itself is hardly market-oriented) unless it gains widespread acceptance in an industry, and MARTIF will certainly require re-engineering before it reaches this stage. What could happen is that one industry leader will adopt a particular set of protocols and the rest will follow suit. Again, time-to-market will be the driver. (Bonthrone 1996)

The authors have hastened to take up the debate because these comments deserve serious attention. They touch on several aspects of the MARTIF discussion and are not Bonthrone's alone. First, it is important to note that the standard is already being used successfully in industry in a variety of environments. Nevertheless, existing critical responses can be classified in three basic categories.

  • Some evaluators have observed that the standard is too complicated. Indeed, it is more complicated than its designers originally anticipated and than most people would like it to be. The question arises in this context, however: for whom is the standard too complicated? Who will be responsible for dealing with these complications? Section 3 below explains why the complexity of MARTIF provides powerful tools for dealing with critical terminological features, and Section 6 proposes solutions for shifting the burden of complexity away from the end-user by hiding it behind user-friendly interfaces.
  • Some critics have been concerned that the standard is not strict enough. They have urged the adoption of a single standard term entry model in order to facilitate so-called "blind" interchange among partners who do not have to examine foreign data before importing it into their own systems. Section 4 outlines proposals designed to address these issues.

<!DOCTYPE martif PUBLIC "ISO 12200:1997//DTD for MARTIF (framework)//EN" [
<!ENTITY % mtf-body PUBLIC "ISO 12200:1997//DTD for MARTIF (base)//EN" >
<!ENTITY % mtf-ents PUBLIC "ISO 12200:1997//ENTITIES for MARTIF (sets)//EN" >
<titleStmt><title>Example 1: a complete martif document</title></titleStmt>
<descrip type='subjectField'>appearance of materials</descrip>
<ntig lang=en>
<termNote type='partOfSpeech'>n</termNote>
<termNote type='termType'>preferred term</termNote>
<ntig lang=de>
<termNote type='partOfSpeech'>n</termNote>
<termNote type='gen'>f</termNote>
<descrip type='definition'>Maß für Lichtundurchlässigkeit</descrip>
<ref type='sourceIdentifier' target='DIN6730.1992-08'>p.5</ref>
<ntig lang=fr>
<termNote type='partOfSpeech'>n</termNote>
<termNote type='gen'>f</termNote>
<bibl id='DIN6730.1992-08'>Papier und Pappe: Begriffe</bibl>

Figure 1: Sample MARTIF Document

  • Others find the standard inadequate because it is not powerful enough, i.e., it does not accommodate word or "lemma"-oriented lexicographical data or mixed systems of terminological and lexicographical data as discussed in Section 5.
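
The second criticism, that "blind" interchange requires partners to agree on an entry model in advance, can be illustrated with a small sketch. The Python fragment below is not part of the standard or of any MARTIF tool. It assumes a hypothetical agreed-upon set of data categories (AGREED_CATEGORIES) and uses a simple pattern match, rather than a real SGML parser, to list the typed elements in a fragment styled after Figure 1 that fall outside that set. The function names and the category list are illustrative assumptions only.

```python
import re

# Hypothetical agreed-upon entry model: the (element, type) pairs that both
# interchange partners accept. Illustrative values, not taken from ISO 12200.
AGREED_CATEGORIES = {
    ("descrip", "subjectField"),
    ("descrip", "definition"),
    ("termNote", "partOfSpeech"),
    ("termNote", "termType"),
    ("termNote", "gen"),
}

# Matches an opening tag whose first attribute is type, e.g. <termNote type='gen'>.
# This is a rough pattern match, not a substitute for SGML parsing.
TYPED_TAG = re.compile(r"<(\w+)\s+type\s*=\s*['\"]([^'\"]+)['\"]")


def categories_used(fragment: str) -> set:
    """Collect every (element, type) pair found in a MARTIF-style fragment."""
    return set(TYPED_TAG.findall(fragment))


def unexpected_categories(fragment: str) -> set:
    """Return the pairs not covered by the agreed model.

    An empty result suggests the fragment could be imported without
    inspection, under the assumptions stated above.
    """
    return categories_used(fragment) - AGREED_CATEGORIES


sample = """
<descrip type='subjectField'>appearance of materials</descrip>
<termNote type='partOfSpeech'>n</termNote>
<ref type='sourceIdentifier' target='DIN6730.1992-08'>p.5</ref>
"""

print(unexpected_categories(sample))
```

In this sketch the importing partner would have to examine the sourceIdentifier reference by hand, since it is the only category outside the assumed model. A stricter, shared entry model is intended to make that kind of inspection unnecessary.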
