B2 Error-Annotation Stage: Annotating Errors with the TRG Annotation Tool
In the Error-Annotation Stage, a professional translator produces error annotations by assigning issues from the metric to textual segments in the bitext.
The TRG Annotation Tool was originally created as a webapp in 2015 as part of a master’s thesis on TQE, using the PHP webapp framework Symfony. It was used in other student theses (such as Marshall Martins, page 17) under the name “MQM Scorecard.” It was further developed by a software engineer at the German Research Center for Artificial Intelligence (DFKI) as part of the EU-funded QT21 project. Upon the release of PHP 7, LTAC Global, together with the Translation Research Group (TRG) at BYU, undertook the major task of rewriting the webapp in React. Today, the TRG Annotation Tool is used by researchers to annotate bitexts in low-resource languages, and the TRG continues to use it in research on a corpus of ATA translator certification exams.
The TRG Annotation Tool is a self-hosted webapp. Anyone building a TQE system can set up their own instance, add users and projects, and upload custom typologies. The technical details of setting up an instance are outside the scope of this tutorial, but are detailed on the tool's page.
Once an instance is set up and a user account is created for the evaluator, they may log in and create a project, uploading the files prepared in the Preliminary Stage as necessary. Note that each account must upload a general error typology to be shared across all of its projects. This should be whatever error typology was used to create the metric (see Sections B1.2.1 and B1.2.4). With a project created, the evaluator annotates the aligned, segmented text and exports the data, which is downloaded as a JSON file.
B2.1 Creating a Project
- Navigate to the “Create project” tab.
- Choose a name.
- Upload the bitext, specifications, and metric file as created in Section B1.
Any of these parameters may be changed at any time from the “View projects” tab by selecting the project’s “Edit” button. An account’s uploaded error typology can only be changed if there are no active projects on that account.
B2.2 The Project Editor
Once a project is created and opened, there will be multiple tabs in the project editor.
Scorecard
This is the main annotation interface. The evaluator scrolls through the bitext and annotates errors. The “Filter” pane at the bottom of the interface allows the evaluator to search for strings in the bitext.
Once an error is identified, the evaluator annotates it:
- Select the segment where the error is found. Double-click anywhere in a segment to select it, use the sidebar arrows to navigate through the segments, or enter a segment number in the Navigation pane in the lower-right corner of the interface. The selected segment is highlighted in red.
- With the segment selected, click the pencil icon to toggle highlighting on or off; the icon is orange when highlighting is enabled.
- With highlighting enabled, click and drag (or double-click a word) as usual to highlight text in the source or target column of the selected segment; this creates an error.
- Select the type and severity level of the error to add. The options available come from the uploaded metric; further information on each error type comes from the uploaded typology and can be inspected by hovering over the error types in the dropdown menu.
- Add any free-text notes on the error.
- Click “Add New Error” to finalize the annotation of this error; a sketch of the resulting annotation data follows this list.
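Under the hood, each annotation ties together a segment, a highlighted span, an error type and severity drawn from the metric, and any note. As a rough illustration, a single annotation might be represented as in the sketch below; the field names here are hypothetical, not the tool's actual export schema.

```python
# Hypothetical shape of one error annotation in the JSON export
# (field names are illustrative, not the tool's documented schema).
annotation = {
    "segment": 14,                                       # selected segment number
    "span": {"side": "target", "start": 7, "end": 19},   # highlighted character range
    "type": "Mistranslation",                            # error type from the metric
    "severity": "Major",                                 # severity level from the metric
    "note": "Idiom rendered literally.",                 # free-text note
}
```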
To select an existing annotation, click its error button underneath the corresponding text segment; the annotated text will then be highlighted. On the right-hand side of the interface, any notes attached to the annotation appear, along with buttons to deselect, edit, or delete it.
Project Specifications
Here, the evaluator can consult the translation project specifications, as they were formalized in the uploaded STS file.
Reports
Here, the evaluator can see a summary of the error count in the translation, split by type and severity. This is similar to the error count table that will be created in Section B3.1. There is also a button to export the project data as a JSON file.
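A reader who wants to reproduce this summary outside the tool can tally the exported annotations directly. The sketch below assumes the hypothetical export layout from B2.2, i.e. a flat `errors` list of records with `type` and `severity` fields; the actual export schema may differ, so the key names should be checked against a real export first.

```python
import json
from collections import Counter

# Load a TRG Annotation Tool JSON export.
# NOTE: a flat "errors" list with "type" and "severity" fields is an
# assumed layout, not the tool's documented schema.
with open("project_export.json", encoding="utf-8") as f:
    project = json.load(f)

# Tally annotations by (type, severity), mirroring the Reports summary.
counts = Counter((err["type"], err["severity"]) for err in project["errors"])

for (err_type, severity), n in sorted(counts.items()):
    print(f"{err_type:<30} {severity:<10} {n}")
```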
Training and Help
In-app tutorials are available in this tab.
About
This tab is the same across every project; it credits the contributors and supporters of the TRG Annotation Tool and provides contact and bug-reporting information.
B2.3 Exporting Project Data
Once the evaluator is satisfied that they have found and annotated all relevant errors in the bitext, they can export the error annotation data as a JSON file.
The JSON file can then be converted into a TEI file. This tutorial provides a webapp that converts TRG Annotation Tool JSON exports into TEI.
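The converter webapp is the recommended route, but the core idea can be sketched briefly. Assuming the same hypothetical export layout as above, plus a `segments` list with `id` and `target` fields, a minimal conversion might wrap segments in TEI `<ab>` elements and annotations in `<note>` elements. The tutorial's actual converter defines the real mapping, which will differ in element choice and attributes.

```python
import json
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)

def tag(name: str) -> str:
    """Qualify a TEI element name with the TEI namespace."""
    return f"{{{TEI_NS}}}{name}"

def json_to_tei(export_path: str, tei_path: str) -> None:
    """Sketch: wrap a hypothetical JSON export in a minimal TEI skeleton.

    The input layout ("segments" with "id"/"target", "errors" with
    "segment"/"type"/"severity"/"note") is assumed for illustration;
    the tutorial's converter webapp defines the real mapping.
    """
    with open(export_path, encoding="utf-8") as f:
        project = json.load(f)

    tei = ET.Element(tag("TEI"))
    text = ET.SubElement(tei, tag("text"))
    body = ET.SubElement(text, tag("body"))

    # One <ab> per segment, carrying the target text.
    for seg in project["segments"]:
        ab = ET.SubElement(body, tag("ab"), {"xml:id": f"seg{seg['id']}"})
        ab.text = seg["target"]

    # One <note> per error annotation, pointing back at its segment.
    for err in project["errors"]:
        note = ET.SubElement(body, tag("note"), {
            "target": f"#seg{err['segment']}",
            "type": err["type"],
            "subtype": err["severity"],
        })
        note.text = err.get("note", "")

    ET.ElementTree(tei).write(tei_path, encoding="utf-8", xml_declaration=True)
```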
To ensure that no data was lost on export or conversion, this tutorial also provides a “reconstructor” for data inspection. This is a webapp that, given a TRG Annotation Tool project’s TEI file, shows a pared-down view of that project’s text and error annotations as an HTML document formatted to resemble the TRG Annotation Tool’s annotation interface.
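The reconstructor's core idea can be sketched in the same spirit. Given a TEI file produced as in the sketch above (and, again, this is not the tutorial webapp's actual implementation), a bare-bones version might simply list the segments as HTML paragraphs:

```python
import html
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"
# After parsing, ElementTree expands xml:id into the XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def tei_to_html(tei_path: str) -> str:
    """Sketch: render the <ab> segments of a TEI file as a plain HTML page."""
    root = ET.parse(tei_path).getroot()
    rows = [
        f"<p><b>{html.escape(ab.get(XML_ID, ''))}</b> "
        f"{html.escape(ab.text or '')}</p>"
        for ab in root.iter(TEI_NS + "ab")
    ]
    return "<!DOCTYPE html>\n<body>\n" + "\n".join(rows) + "\n</body>"
```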