Auxiliary Tools
Trados-to-Translog-II: Adding Gaze and Qualitivity data to the CRITT TPR-DB
Updated: May 23, 2022.
This section describes a new tool in the TPR-DB that converts Trados Studio keylogging data (Qualitivity) into Translog-II format and adds the converted data to the CRITT TPR-DB. The tool can also synchronize the keylogging data with the output of various eye-trackers. The Trados-TPR-DB interface makes it possible to record translation behavior in an ecologically realistic translation environment. We can now investigate patterns of reading and typing activity in a widely used professional CAT tool, and thus gain a better understanding of the factors that affect professional translation activity.
To use the Trados-to-Translog conversion tool, you need the platform and software listed below. Adding gaze and Qualitivity data to the TPR-DB is a six-step procedure.
Tools needed:
- TPR-DB account & management tool.
- Trados Studio & Qualitivity.
- Tobii eye tracker TX300 & Tobii Studio 3.3.2 (check the instructions from Tobii).
See also a video instruction on YouTube.
Posters from EAMT 2022 (pdf)(To appear)
Manual Re-Alignment of Keystrokes to Tokens
In some cases, keystrokes are not mapped to the correct token. The manual re-alignment procedure below helps rectify this issue:
Download the following repository from GitLab: https://git.rwth-aachen.de/arndt.heilmann/kdfixer
Download the Event.xml-files from your study from the Management Tool: https://critt.as.kent.edu/cgi-bin/yawat/yawat.cgi
Move the Event.xml-files to the folder "EventFiles"
Run "1_CreatePuzzleFileFromEventFile.py" from the Scripts folder
In the folder "ManualRealignment" you will find so-called .pzl files. These are essentially tab-separated tables
Open the .pzl file in a spreadsheet program (or copy and paste the content of the .pzl file into the spreadsheet program)
Cut and paste misaligned keystroke ids to the correct token id
Save your changes back to the same .pzl file (tab-separated)
Run "2_CreateFixedEventFile".
Retrieve updated .Event.xml-Files from "FixedEventFiles" Folder
Upload .Event.xml-files via the CRITT TPRDB management tool and re-create the tables.
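Before running the second script, it can help to confirm that the spreadsheet program really saved the edited .pzl file as tab-separated text. The following is a minimal sketch, not part of the kdfixer repository: the helper names read_pzl and looks_comma_separated are hypothetical, the UTF-8 encoding is an assumption, and nothing is assumed about the column layout of a .pzl file beyond what is stated above (a tab-separated table).

```python
def read_pzl(path, encoding="utf-8"):
    """Read a tab-separated .pzl file into a list of rows (lists of strings).

    The encoding is an assumption; adjust it if your files differ.
    """
    rows = []
    with open(path, encoding=encoding, newline="") as handle:
        for line in handle:
            # Strip the line ending only, so that empty cells are preserved.
            rows.append(line.rstrip("\r\n").split("\t"))
    return rows


def looks_comma_separated(rows):
    """Heuristic: every non-empty row has a single cell and a comma appears.

    This pattern suggests the spreadsheet saved the file as CSV
    instead of tab-separated text.
    """
    non_empty = [row for row in rows if row != [""]]
    return (
        bool(non_empty)
        and all(len(row) == 1 for row in non_empty)
        and any("," in row[0] for row in non_empty)
    )
```

Calling read_pzl on each file in the "ManualRealignment" folder and passing the result to looks_comma_separated flags files that probably need to be saved again as tab-separated text. The check is only a heuristic and does not validate keystroke or token ids.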
The tool was developed as part of the TRICKLET project.
Syntactic Similarity Scores
Bram Vanroy has made available his scripts for measuring syntactic similarity on TPR-DB data. The resulting features can be added to the TPR-DB tables in a simple post-processing step that runs after the tables have been created. I’ll describe the steps below.
Install the astred library (https://github.com/BramVanroy/astred#installation)
pip install astred[stanza]
This will also automatically download the stanza parser.
Download the script itself and save it somewhere where you can run it (https://raw.githubusercontent.com/BramVanroy/astred/master/examples/add_features_tprdb.py)
Basic usage
Default usage is as simple as python add_features_tprdb.py <input_directory> --src_lang <lang code for src> --tgt_lang <lang code for tgt>
python add_features_tprdb.py directory/with/tprdb/tables --src_lang en --tgt_lang nl
So, as an example, if you have extracted the attached ZIP file in a directory on your desktop, it would be
python add_features_tprdb.py ~/Desktop/ENDU20-tables --src_lang en --tgt_lang nl
For more options, run python add_features_tprdb.py -h
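If you have tables for several studies, the same call can be repeated from a small wrapper. The sketch below is an illustration only: build_command and run_for_studies are hypothetical helper names, and the only options used are the input directory, --src_lang and --tgt_lang shown above. It uses sys.executable so that the script runs in the same Python environment in which astred was installed.

```python
import subprocess
import sys


def build_command(script, tables_dir, src_lang, tgt_lang):
    """Build the argument list for one run of add_features_tprdb.py."""
    return [
        sys.executable, str(script), str(tables_dir),
        "--src_lang", src_lang,
        "--tgt_lang", tgt_lang,
    ]


def run_for_studies(script, studies):
    """Run the script once per (tables_dir, src_lang, tgt_lang) entry.

    check=True stops at the first study for which the script fails.
    """
    for tables_dir, src_lang, tgt_lang in studies:
        subprocess.run(
            build_command(script, tables_dir, src_lang, tgt_lang),
            check=True,
        )
```

For example, run_for_studies("add_features_tprdb.py", [("ENDU20-tables", "en", "nl")]) is equivalent to the single command shown above, assuming the script and the tables directory are in the current working directory.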
A complete description of the tool can be found in:
Vanroy, B. (2021). Syntactic difficulties in translation. PhD thesis. Ghent University.
InputLog
Inputlog is a tool to observe writing processes unobtrusively. Writing researchers and teachers use keystroke logging to describe and analyze online writing or translation processes. Visit the InputLog website here for more information from the developers.
You can download the current Inputlog version from their website, or an earlier version 7 from here (password for installation: IL7558)
While Inputlog is designed to work in conjunction with MS Word, it also logs keystrokes outside Word. In that case, however, it logs only the keystrokes and cannot know the context in which these keystrokes and mouse clicks are produced. Inputlog can be used in conjunction with Translog-II. While Translog-II can only record keystrokes that are produced inside the Translog-II editor, Inputlog captures all keystrokes that are produced inside and outside Translog-II. Both log files can be merged and synchronized based on the keystrokes that occur in both logging tools. The result is a more complete picture of keyboard activities inside and outside Translog-II. The procedure is as follows:
1) start InputLog
2) then start Translog-II
3) run the translation session
4) stop Translog-II and save the Translog-II *.xml file
5) stop InputLog: an *.idfx file will be automatically produced in the InputLog folder
6) merge the Inputlog *.idfx file into the Translog-II *.xml file
7) upload the merged output to the TPR-DB
The following should be considered:
To run Inputlog with Translog-II you have to change the recording settings, because Word is the default writing environment for Inputlog. Be aware that changing this option limits some of the analyses that Inputlog provides (e.g. revision analysis, process graph, etc.). Change the recording settings as follows:
Select File in the top menu
Options: Change Plugin Session by ticking off the WordLog option
To merge the Inputlog *.idfx file into the Translog-II *.xml file, run the following command:
perl ./InjectIDFX.pl -T <Translog-II>.xml -I <InputLog>.idfx -O <Target_fn>.xml
where options -T and -I specify the two log files and -O specifies the merged output file.
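When several sessions have to be merged, the command can be generated per session. The sketch below is an illustration under stated assumptions: build_merge_command and pair_logs are hypothetical helper names, and pairing files by identical base name (e.g. P01.xml with P01.idfx) is an assumption about how you have named your files, not something Inputlog or Translog-II does for you. Only the -T, -I and -O options described above are used.

```python
import subprocess
from pathlib import Path


def build_merge_command(translog_xml, inputlog_idfx, merged_xml,
                        script="./InjectIDFX.pl"):
    """Build the InjectIDFX.pl call for one session."""
    return [
        "perl", script,
        "-T", str(translog_xml),
        "-I", str(inputlog_idfx),
        "-O", str(merged_xml),
    ]


def pair_logs(translog_dir, inputlog_dir):
    """Pair each Translog-II *.xml file with an *.idfx file of the same base name.

    Assumption: the files were renamed so that base names match.
    Sessions without a matching *.idfx file are skipped.
    """
    pairs = []
    for xml_path in sorted(Path(translog_dir).glob("*.xml")):
        idfx_path = Path(inputlog_dir) / (xml_path.stem + ".idfx")
        if idfx_path.exists():
            pairs.append((xml_path, idfx_path))
    return pairs


def merge_all(pairs, output_dir):
    """Run the merge for every pair and write the results to output_dir."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    for xml_path, idfx_path in pairs:
        merged = output_dir / xml_path.name
        subprocess.run(build_merge_command(xml_path, idfx_path, merged),
                       check=True)
```

Writing the merged files to a separate output folder keeps the original Translog-II logs untouched, so a failed merge can be repeated from the unmodified files.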