Translation, Theory and Technology Homepage

MARTIF - Putting Complexity in Perspective

6. Building Black Boxes in Several Shades of Gray

Numerous aspects of complexity have been outlined above—both the natural complexity that arises from the comparison of variant data structures and the complexity that has been designed into the MARTIF architecture in order to come to terms with the rich variety exhibited in these systems. This article began by citing a call for re-engineering the MARTIF standard in order to remove its complexity and thus to enable it to find wide acceptance. In light of the powerful advantages afforded by the present model, however, it hardly seems logical to make the format simple just so people will like it, if in doing so it is rendered incapable of performing the tasks for which it was designed.

Many other computer-assisted tasks are complicated. For instance, the conversion back and forth between Word™, WordPerfect™, Rich Text Format (RTF), etc., has become relatively simple in recent years, but the conversion programs themselves are very complicated. Even just looking at raw RTF documents in a line editor (rather than importing them into a word-processing system) is not at all pretty. In the same way, the hidden codes in a WordPerfect file can reveal a high degree of complexity, but being able to look at them provides a powerful element of control for the savvy user who knows how to manipulate the codes.

Although the conversion programs cited here are complex, everyone uses them all the time with relatively little difficulty. The reason is that these programs are packaged inside the "black box" of a user-friendly interface. Retooling the MARTIF standard to make it simpler would also make it more restrictive, and therefore potentially less acceptable to a wide range of users and less capable of handling a broad variety of systems. Hence, the logical course of action is to provide users with ways to utilize its capability without having to ponder complexities that only a relatively small number of developers and SGML experts readily understand.

Some MARTIF black-box solutions will be "blacker" than others. For systems with very elaborate, predefined entry structures and sophisticated sets of data categories, system designers can create true black-box routines that will allow users to import and export data that conforms to MARTIF levels 1, 2, or 3 just as easily as they convert word-processing files to RTF. Depending on the MARTIF level that these designers decide to support, interactive tools will query the user about the contents of data categories in order to precondition the data so that it will conform to higher MARTIF levels. For instance, this might involve specifying the permissible values of a data category such as gender, or adapting existing subject-field references to standardized classification systems as described in Section 4.
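The preconditioning step described above can be sketched roughly as follows. This is a minimal illustration, not part of MARTIF itself: the set of permissible gender values, the function name `precondition_genders`, and the sample data are all assumptions made for the example. The point is only that values the tool cannot map automatically are collected so that the user can be asked about them.

```python
# Hypothetical set of permissible values for a gender data category.
# These names are assumptions for illustration, not quoted from MARTIF
# or ISO 12620.
PERMISSIBLE_GENDERS = {"masculine", "feminine", "neuter", "other"}


def precondition_genders(values, user_mapping):
    """Map local gender codes to permissible values.

    Returns (converted, unresolved). `converted` is a list parallel to
    `values`, with None where no mapping is known. `unresolved` is the
    sorted list of local codes the tool would still have to ask about.
    """
    converted = []
    unresolved = set()
    for value in values:
        key = value.strip().lower()
        if key in PERMISSIBLE_GENDERS:
            # Already a permissible value (apart from case or spacing).
            converted.append(key)
        elif user_mapping.get(key) in PERMISSIBLE_GENDERS:
            # The user has already told the tool what this code means.
            converted.append(user_mapping[key])
        else:
            converted.append(None)
            unresolved.add(value)
    return converted, sorted(unresolved)


# Local codes found in a user's database, and the answers given so far.
local_values = ["m", "f", "Neuter", "n", "c"]
answers = {"m": "masculine", "f": "feminine", "n": "neuter"}

converted, unresolved = precondition_genders(local_values, answers)
print(converted)
print(unresolved)
```

In a real interactive tool, the `unresolved` list would drive the next round of questions to the user rather than simply being printed.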

In systems with definable sets of data categories and freely configurable entry structures (such as MultiTerm), users are able to model their own databases to meet their needs within certain technical limitations. Here users are themselves system designers to a certain extent because only they can interpret their data categories. Consequently, actual system designers will only be able to implement a "partially black" box, although it will still be possible to develop tools and standard implementations that will enable users to produce data that will conform to the various MARTIF levels.

Unfortunately, in cases where terminology management is carried out using self-programmed systems or adaptations from off-the-shelf databases, spreadsheets, or word-processing programs, users must develop their own MARTIF interfaces, which of course will not be easy for the "typical user". There is little possibility for a black-box solution for these cases, although simplified guidelines could be produced to facilitate MARTIF implementation in some cases.

Within the framework of the various levels, or even beyond the requirements stated at any given level, individual user groups can agree among themselves to accept additional standard conventions. For instance, they might agree to use a specific subset of the data categories listed in ISO 12620. Models for cohesive user groups (VHG, LISA, etc.) can adopt very similar but simplified structures while maintaining a back-door link to MARTIF. Such solutions may utilize a simplified tag-naming convention that nevertheless remains parallel to MARTIF in order to facilitate easy conversion. HTML implementations are likely to fit into this pattern, with fundamental variations designed, for instance, for read-only use, for interactive use, or even for the creation of data resources on intranets or via the World Wide Web.
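To illustrate why a parallel tag-naming convention makes conversion easy, the following sketch expands a simplified entry into a more verbose form by way of a lookup table. The simplified tag names (`entry`, `def`, `pos`, `subject`), the target element names and `type` values, and the function `expand` are hypothetical choices for this example; they are not taken from the MARTIF specification or from any user group's actual convention.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: simplified tag -> (verbose element, type value).
# Every name here is an assumption made for illustration.
SIMPLE_TO_FULL = {
    "def": ("descrip", "definition"),
    "pos": ("termNote", "partOfSpeech"),
    "subject": ("descrip", "subjectField"),
}


def expand(simple_entry):
    """Convert a simplified <entry> element into its verbose counterpart."""
    full = ET.Element("termEntry")
    for child in simple_entry:
        if child.tag == "term":
            target = ET.SubElement(full, "term")
        elif child.tag in SIMPLE_TO_FULL:
            name, type_value = SIMPLE_TO_FULL[child.tag]
            target = ET.SubElement(full, name, {"type": type_value})
        else:
            # A tag with no parallel cannot be converted automatically.
            raise ValueError(f"no parallel element known for <{child.tag}>")
        target.text = child.text
    return full


simple = ET.fromstring("<entry><term>black box</term><pos>noun</pos></entry>")
full = expand(simple)
result = ET.tostring(full, encoding="unicode")
print(result)
```

Because each simplified tag corresponds to exactly one verbose element, the conversion reduces to a table lookup, and the same table read in reverse would serve for export.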

As well as considering the levels to which MARTIF data may conform, it is also interesting to look at the different ways that importers may use MARTIF files. There are two likely scenarios for importing data. In the first, users obtain data in MARTIF format and create a "MARTIF-friendly" target database structure in their own terminology management system. They can then utilize this data in the software environment to which they are accustomed. In the second, importers wish to integrate the imported data into their own existing data collection.

Obviously, the amount of effort involved in the second scenario will be much greater. In such cases it may be necessary either to obtain data that conforms to a higher MARTIF level (i.e., level 2 or 3), or to examine incoming data structures and modify them in order to ensure data modeling compatibility. In addition, it may be necessary to solve other problems related to database consolidation or handle problems associated with so-called "doublettes" (homographs or duplicate term indexes).
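One part of this consolidation work, flagging candidate "doublettes" before incoming entries are merged, might look something like the sketch below. The entry layout (dictionaries with `id`, `lang`, and `term` keys), the normalization rule, and the function names are assumptions made for the example, not a description of any particular terminology management system.

```python
from collections import defaultdict


def index_key(entry):
    """Build a comparison key: language plus case- and space-normalized term."""
    normalized = " ".join(entry["term"].split()).casefold()
    return (entry["lang"], normalized)


def find_doublettes(existing, incoming):
    """Return {key: [entry ids]} for incoming terms already in the collection."""
    index = defaultdict(list)
    for entry in existing:
        index[index_key(entry)].append(entry["id"])

    collisions = {}
    for entry in incoming:
        key = index_key(entry)
        if key in index:
            # Start from the existing ids, then add each colliding import.
            collisions.setdefault(key, list(index[key])).append(entry["id"])
    return collisions


existing = [
    {"id": "E1", "lang": "en", "term": "Mouse"},
    {"id": "E2", "lang": "de", "term": "Maus"},
]
incoming = [
    {"id": "I1", "lang": "en", "term": "mouse "},
    {"id": "I2", "lang": "en", "term": "keyboard"},
]

collisions = find_doublettes(existing, incoming)
print(collisions)
```

A matching key only shows that two entries share a term index. Whether they are true duplicates to be merged or homographs that denote different concepts still has to be decided by someone who can interpret the entries.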

In addition to the uses noted above, future development may produce a "universal viewer", i.e., a simple user interface that can be used to examine and even condition MARTIF data before the importation phase, or to side-step importation altogether. The more closely the data conforms to the higher MARTIF levels (2 or 3), the easier it would be to process or view it in this way.
