TBX-Glossary is designed to support the interchange of glossary data among several formats: UTX-Simple, GlossML, the TBX family, and OLIF.

Each of the four formats can represent kinds of data that some of the other three cannot [1]. Therefore, not every file in one of these formats can be converted to another format. Moreover, most of the formats require at least one kind of data that another format does not require. To be fully convertible among all four formats, a file must contain every kind of data that any of the formats requires, and it must not contain any kind of data that any of the formats cannot represent.
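The convertibility rule above amounts to two set comparisons, which the following minimal Python sketch illustrates. Everything named here is an assumption made for illustration: FormatProfile, convertibility_problems, and the placeholder formats and categories (format_a, cat_1, and so on) are hypothetical and do not describe the real formats or the converter's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatProfile:
    """Hypothetical description of one format (not taken from any spec)."""
    name: str
    required: frozenset       # kinds of data the format requires
    representable: frozenset  # kinds of data the format can represent


def convertibility_problems(present, profiles):
    """Return (missing, unrepresentable) for a file holding `present` data.

    missing:         required by at least one format but absent from the file
    unrepresentable: in the file but not representable by every format
    Both sets are empty exactly when the file is fully convertible.
    """
    profiles = list(profiles)
    if not profiles:
        raise ValueError("at least one format profile is needed")
    present = set(present)
    required_by_any = set().union(*(p.required for p in profiles))
    representable_by_all = set.intersection(
        *(set(p.representable) for p in profiles)
    )
    return required_by_any - present, present - representable_by_all


# Placeholder profiles: the names and categories are invented.
PROFILES = [
    FormatProfile("format_a", frozenset({"cat_1"}),
                  frozenset({"cat_1", "cat_2", "cat_3"})),
    FormatProfile("format_b", frozenset({"cat_1", "cat_2"}),
                  frozenset({"cat_1", "cat_2"})),
]

missing, unrepresentable = convertibility_problems({"cat_1", "cat_3"}, PROFILES)
print("missing:", sorted(missing))
print("unrepresentable:", sorted(unrepresentable))
```

In this sketch the required data is the union over all formats and the permitted data is the intersection, which is why a file can be valid in one format and still fail to be fully convertible.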

Convertible data categories

The convertible data categories are as follows, where src means 'in the source language' and tgt means 'in the target language'.

Glossary-wide

mandatory:

optional:

Per-entry

mandatory:

optional:

Data placed in these categories should be in plain text, with no XML-like markup and no tab characters. For details on how these data categories are represented in each format, see the corresponding file:
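The plain-text constraint on category values can be checked before conversion. The sketch below is only a heuristic: it assumes that "XML-like markup" can be approximated as anything shaped like an opening or closing tag, and the function name plain_text_problems is hypothetical rather than part of the converter.

```python
import re

# Assumption: a "<" followed by an optional "/" and a letter, up to the next
# ">", is treated as XML-like markup. This is a heuristic, not a parser.
_TAG_LIKE = re.compile(r"</?[A-Za-z][^<>]*>")


def plain_text_problems(value: str) -> list:
    """Return a list of reasons why `value` is not acceptable plain text."""
    problems = []
    if "\t" in value:
        problems.append("contains a tab character")
    if _TAG_LIKE.search(value):
        problems.append("contains XML-like markup")
    return problems


for sample in ["hard disk", "hard\tdisk", "<b>hard</b> disk"]:
    print(repr(sample), plain_text_problems(sample))
```

A comparison such as "a < b" is not flagged, because no letter follows the "<" directly; text that merely resembles a tag may still be flagged, so results are best treated as prompts for manual review.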

Success and failure

If the input file violates these requirements, the converter program will emit a warning. It may then stop the conversion, or it may proceed with a best-effort attempt. Production of an output file should therefore not be taken as evidence of success: the only such evidence is the absence of warnings.
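A caller that automates conversion can apply this rule by judging success on warnings alone and never on the presence of output. The sketch below assumes, purely for illustration, a converter exposed as a Python callable that reports problems through the standard warnings module; the real converter's interface may differ, and convert_and_check and the demo converter are hypothetical names.

```python
import warnings


def convert_and_check(convert, source):
    """Run `convert(source)` and report (output, warning_messages, succeeded).

    `convert` is a hypothetical callable standing in for the converter.
    Success is defined only by the absence of warnings, even if output exists.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        output = convert(source)
    messages = [str(w.message) for w in caught]
    return output, messages, not messages


def best_effort_converter(source):
    # Invented stand-in: warns, then still produces some output.
    warnings.warn("input violates a convertibility requirement")
    return "partial output"


output, messages, succeeded = convert_and_check(best_effort_converter, "input")
print("output produced:", output is not None)
print("succeeded:", succeeded)
```

Here the stand-in converter returns output and still counts as a failure, which mirrors the best-effort behavior described above.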


[1] By "kinds of data" above we mean both data categories and broader, structural qualities of the glossary. The four formats embody different models of what a glossary is, and conversion requires common ground on these modeling concerns just as it requires agreement on required and permitted data categories. Hence the seemingly vague phrase.