Using termbases
Use termbases to improve translations
Termbases are a critical part of modern translation workflows. They allow for consistency of translation across multiple languages.
CaptionHub has simple termbase support. You need to upload your termbase to CaptionHub, and then attach it to a project. Once it's attached to a project, its terms will be pre-applied for Machine Translation when translating with Amazon Translate or DeepL, and they'll also be highlighted in the UI for your linguists.
Prepare your termbase for CaptionHub
CaptionHub supports termbases uploaded in CSV format. Your CSV needs to have at least two columns, each labelled with the language code of the language it contains. For example, English (Global) should be EN, and English (United States) should be EN-US or EN_US. Entries with empty cells will be ignored.
Therefore a termbase with English (United Kingdom), Dutch and French would look like this:
EN-UK | NL | FR |
CaptionHub | CaptionHub | CaptionHub |
computer | ordinateur | |
email | e-mail | courriel |
You can continue to add extra columns to the CSV, for multilingual termbases.
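As a rough illustration, the example table above could be written out as a CSV file with a short script. This is only a sketch: the comma delimiter, the helper name build_termbase_csv and the variable names are assumptions made for the example, not something CaptionHub prescribes.

```python
import csv
import io

# Column headers are the language codes; rows mirror the example table above.
HEADERS = ["EN-UK", "NL", "FR"]
ROWS = [
    ["CaptionHub", "CaptionHub", "CaptionHub"],
    ["computer", "ordinateur", ""],  # the French cell is left empty
    ["email", "e-mail", "courriel"],
]


def build_termbase_csv(headers, rows):
    """Return the termbase as CSV text, one column per language (hypothetical helper)."""
    if len(headers) < 2:
        raise ValueError("a termbase needs at least two language columns")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    for row in rows:
        if len(row) != len(headers):
            raise ValueError(f"row {row!r} does not match the number of headers")
        writer.writerow(row)
    return buffer.getvalue()


csv_text = build_termbase_csv(HEADERS, ROWS)
print(csv_text)
```

Adding another language means adding one more header and one more value per row.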
Creating a termbase
- Navigate to the Termbases page, available from the drop-down menu at the top right
- Give your termbase a name, and choose the language that your termbase contains
- Click "Create termbase"
- When your termbase is created, you need to click on the "Import termbase" button
- Locate the CSV file you prepared earlier, and click on the "Import" button
Attaching a termbase to a project
- From the Project page, click on the drop-down cog
- Choose "Attach termbases"
- Select the termbase(s) you'd like to attach to your project, and click "Done"
Now, when you use Amazon Translate or DeepL, or create blank translations, the terms in your termbase will be pre-applied.
Guidance for using termbases
- Termbases can be no larger than 10MB, and the maximum entry size is 200 bytes.
- Try to keep your custom terminology minimal. Only include terms that you want to control and that are completely unambiguous: terms that you will never want translated with an alternate meaning, and that should only ever be translated in a single way. Ideally, limit the list to proper names, like brand names and product names.
- Custom terminologies are case-sensitive. If you need both capitalised and non-capitalised versions of a word to be included, you must include an entry for each version.
- Do not include different translations for the same source phrase (for example, entry #1 with EN: Amazon, FR: Amazon, and entry #2 with EN: Amazon, FR: Amazone).
- Some languages do not change the shape of a word based on sentence context. With these languages, applying a custom terminology is most likely to improve overall translation quality. However, some languages do have extensive word shape changes. We do not recommend applying the feature to those languages, but we do not restrict you from doing so.
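Some of the guidance above can be checked before you upload. The sketch below is one possible pre-flight check, not part of CaptionHub. It makes several assumptions: "10MB" is read as 10 × 1024 × 1024 bytes, the 200-byte limit is applied to each cell as UTF-8, the first column is treated as the source language, and the function name check_termbase is made up for the example.

```python
import csv
import io

MAX_FILE_BYTES = 10 * 1024 * 1024  # "10MB" read as 10 MiB (assumption)
MAX_ENTRY_BYTES = 200              # applied per cell, UTF-8 encoded (assumption)


def check_termbase(csv_text):
    """Return a list of warnings for a termbase given as CSV text."""
    warnings = []
    if len(csv_text.encode("utf-8")) > MAX_FILE_BYTES:
        warnings.append("file is larger than 10MB")

    rows = list(csv.reader(io.StringIO(csv_text)))
    seen = {}  # source term -> row number where it first appeared
    # Row 1 is the header, so data rows are numbered from 2.
    for row_no, row in enumerate(rows[1:], start=2):
        if not row:
            continue  # skip blank lines
        for cell in row:
            if len(cell.encode("utf-8")) > MAX_ENTRY_BYTES:
                warnings.append(f"row {row_no}: entry exceeds 200 bytes")
        source = row[0]
        # Plain string comparison is case-sensitive, so "Amazon" and
        # "amazon" count as two different terms.
        if source in seen:
            warnings.append(
                f"row {row_no}: source term {source!r} already used on row {seen[source]}"
            )
        else:
            seen[source] = row_no
    return warnings


sample = "EN,FR\nAmazon,Amazon\nAmazon,Amazone\namazon,amazon\n"
for message in check_termbase(sample):
    print(message)
```

In the sample, only the repeated "Amazon" row is reported; the lower-case "amazon" row is a separate entry because terms are case-sensitive.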
Error message | Solution |
The CSV file contains languages we don’t recognise | Please check that each column header matches a supported language. The format should be CODE or CODE_TERRITORY (a hyphen also works in place of the underscore), for example EN for English (Global), or EN-US or EN_US for English (United States). |
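A quick way to spot a badly shaped header before uploading is to test each one against the format described above. This is a sketch under stated assumptions: the pattern only covers the upper-case, two-letter shapes shown in this article (EN, EN-US, EN_US), it cannot tell whether CaptionHub actually supports a given language, and the name malformed_headers is hypothetical.

```python
import re

# Shape check only: a two-letter code, optionally followed by "-" or "_"
# and a two-letter territory. Upper-case matching is an assumption based
# on the examples in this article.
HEADER_SHAPE = re.compile(r"[A-Z]{2}(?:[-_][A-Z]{2})?")


def malformed_headers(headers):
    """Return the headers that do not look like CODE or CODE_TERRITORY."""
    return [h for h in headers if not HEADER_SHAPE.fullmatch(h.strip())]


print(malformed_headers(["EN", "EN-US", "EN_US", "English"]))
```

A header that passes this check can still be rejected if the language itself is not supported.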