Using termbases

📢
Requirements: Enterprise subscription, or Pro Translation bolt-on.

Overview

Termbases are a critical part of modern translation workflows. They allow for consistency of translation across multiple languages.

CaptionHub has simple termbase support. You need to upload your termbase to CaptionHub, and then attach it to a project. Once it's attached to a project, it'll be pre-applied for Machine Translation (currently only via Amazon Translate), and it'll also be highlighted in the UI for your linguists.

Prepare your termbase for CaptionHub

CaptionHub only supports termbases uploaded in CSV format. Your CSV needs at least three columns. The first column must be labelled "Entry_ID", and the remaining columns must be labelled with the ISO codes of the languages in your termbase. Use the format LANGUAGECODE for languages with no territory suffix, and LANGUAGECODE_TERRITORYCODE for languages with a territory suffix. For example, English (Global) should be EN, and English (United States) should be EN_US.

Please note that all cells need to be populated: Amazon Translate doesn't support incomplete termbases.

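Therefore, a termbase with English (United Kingdom) and French (France) should be formatted like this:

[Image: example termbase CSV with an Entry_ID column followed by one column each for English (United Kingdom) and French (France)]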

You can continue to add extra columns to the CSV, for multilingual termbases.
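
If you want to check a file before uploading it, the rules in this section can be sketched in a few lines of Python. This is an illustrative, unofficial check and not part of CaptionHub: the function name check_termbase_csv, the header pattern, and the sample terms are assumptions, and the codes EN_GB and FR_FR are inferred from the ISO format described above. The pattern only checks the shape of each header; it cannot tell you whether CaptionHub supports that language.

```python
import csv
import io
import re

# Assumed header shape: LANGUAGECODE or LANGUAGECODE_TERRITORYCODE, in upper case.
LANG_CODE = re.compile(r"^[A-Z]{2,3}(_[A-Z0-9]{2,4})?$")


def check_termbase_csv(text):
    """Return a list of problems found in termbase CSV text (empty if none)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]

    problems = []
    header = rows[0]
    if len(header) < 3:
        problems.append("need at least three columns")
    if not header or header[0] != "Entry_ID":
        problems.append('first column must be labelled "Entry_ID"')
    for name in header[1:]:
        if not LANG_CODE.match(name):
            problems.append(f"unexpected language header: {name!r}")

    # Every row must have a non-empty cell for every column.
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header) or any(not cell.strip() for cell in row):
            problems.append(f"row {line_no} has a missing or empty cell")
    return problems


# Illustrative termbase: English (United Kingdom) and French (France).
sample = "Entry_ID,EN_GB,FR_FR\n1,CaptionHub,CaptionHub\n2,Amazon,Amazon\n"
print(check_termbase_csv(sample))
```

An empty list means the file passes these basic checks. Each string in a non-empty list describes one problem to fix before importing.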

Creating a termbase

  1. Navigate to the Termbases page, available from the drop-down menu at the top right
  2. Give your termbase a name, and choose the language that your termbase contains
  3. Click "Create termbase"
  4. Once your termbase is created, click the "Import termbase" button
  5. Locate the CSV file you prepared earlier, and click the "Import" button

Attaching a termbase to a project

  1. From the Project page, click on the drop-down cog
  2. Choose "Attach termbases"
  3. Select the termbase(s) you'd like to attach to your project, and click "Done"

Now, when you use Amazon Translate or create blank translations, the terms in your termbase will be pre-applied.

Guidance for using termbases

  • Termbases can be no larger than 10MB, and the maximum entry size is 200 bytes.
  • Try to keep your custom terminology minimal. Only include terms that you want to control and that are completely unambiguous: terms that should only ever be translated one way, with no alternate meaning you might want to use. Ideally, limit the list to proper names, such as brand names and product names.
  • Custom terminologies are case-sensitive. If you need both capitalised and non-capitalised versions of a word to be included, you must include an entry for each version.
  • Do not include different translations for the same source phrase (for example, entry #1 with EN: Amazon, FR: Amazon and entry #2 with EN: Amazon, FR: Amazone).
  • Some languages do not change the shape of a word based on sentence context. With these languages, applying a custom terminology is most likely to improve overall translation quality. However, some languages do have extensive word shape changes. We do not recommend applying the feature to those languages, but we do not restrict you from doing so.
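
Some of the guidance above can be checked automatically before upload. The following is a minimal sketch, not an official tool. It assumes that the 10 MB limit means 10 × 1024 × 1024 bytes, that the 200-byte entry limit applies to each term measured in UTF-8, and that the first language column is the source language. The function name check_guidance is hypothetical.

```python
import csv
import io

MAX_FILE_BYTES = 10 * 1024 * 1024  # assumed interpretation of "10 MB"
MAX_ENTRY_BYTES = 200              # assumed to apply per term, in UTF-8


def check_guidance(text):
    """Return warnings for size limits and conflicting translations."""
    warnings = []
    if len(text.encode("utf-8")) > MAX_FILE_BYTES:
        warnings.append("file is larger than 10 MB")

    rows = list(csv.reader(io.StringIO(text)))
    seen = {}  # source term -> translations from the first row that used it
    for row in rows[1:]:  # skip the header row
        if len(row) < 2:
            continue
        for cell in row[1:]:
            if len(cell.encode("utf-8")) > MAX_ENTRY_BYTES:
                warnings.append(f"entry {row[0]}: term longer than 200 bytes")
        # Comparison is case-sensitive, matching the guidance above.
        source, targets = row[1], row[2:]
        if source in seen and seen[source] != targets:
            warnings.append(
                f"entry {row[0]}: conflicting translations for {source!r}"
            )
        seen.setdefault(source, targets)
    return warnings


conflicting = "Entry_ID,EN,FR\n1,Amazon,Amazon\n2,Amazon,Amazone\n"
print(check_guidance(conflicting))
```

Because the comparison is case-sensitive, "Amazon" and "amazon" are treated as separate terms and are not reported as a conflict.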

Troubleshooting

Error message: The CSV file contains languages we don’t recognised
Solution: Please check that each column header matches a supported language. The format should be CODE_TERRITORY, for example EN_US for English (United States), or EN for English (Global). See https://app.captionhub.com/supported_languages

Error message: The CSV file is missing a translation with entry
Solution: Each row in the CSV must contain translations for all languages. Entries cannot be left blank. Please ensure there are no missing cells in your CSV file.
 