Do’s and Don’ts of Software Localization

Given the continued competitive pressure on software firms to meet market demands quickly, many developers work under tight deadlines to deliver functional software. This software is often prepared for localization only once the source-language version is ready for release.

Keeping these pressures in mind, developers should follow basic internationalization principles during development. Doing so smooths later localization work and helps meet market requirements for all the required languages, not just the source language.

The following are the do’s and don’ts that all developers should know and apply in their work to achieve faster and more cost-effective software localization for various languages:

1. Do externalize messages in message catalogs, resource files, and configuration files: Messages are textual objects and consequently are translatable elements. Message catalogs and resource files are installed in a locale-specific location or named with a locale-specific suffix. This practice eases the localization process, since localizers can work on these resource bundles without modifying the source code. It also makes it possible to use a single code base for all languages, where only the resource bundles differ by language.
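As a rough illustration of externalized messages, the sketch below looks up strings by message ID in per-locale catalogs and falls back to the source language when a translation is missing. The catalog contents, the `CATALOGS` dictionary, and the `get_message` helper are all hypothetical; a real project would typically load each catalog from its own locale-specific file.

```python
# Hypothetical in-memory catalogs. In practice each one would live in its own
# locale-specific resource file (for example, a file with an "_fr" suffix).
CATALOGS = {
    "en": {"greeting": "Welcome", "save": "Save changes"},
    "fr": {"greeting": "Bienvenue", "save": "Enregistrer les modifications"},
    "de": {"greeting": "Willkommen"},  # "save" has not been translated yet
}

DEFAULT_LOCALE = "en"  # assumed source language


def get_message(key: str, locale: str) -> str:
    """Return the message for `key`, falling back to the source language."""
    catalog = CATALOGS.get(locale, {})
    if key in catalog:
        return catalog[key]
    return CATALOGS[DEFAULT_LOCALE][key]


if __name__ == "__main__":
    for loc in ("en", "fr", "de"):
        print(loc, "|", get_message("greeting", loc), "|", get_message("save", loc))
```

The code itself never contains user-visible text, so adding a language means adding a catalog rather than touching the source.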

2. Don’t internationalize fixed textual objects: Fixed textual objects such as comments, commands, and configuration settings should not be translated. Externalize only the strings that need translation.

If these objects appear in resource or configuration files, they should be marked with the tag “NOT_FOR_TRANSLATION.” Here are some examples of fixed textual objects that do not require internationalization:

o User names, group names, and passwords

o System or host names

o Names of terminals, printers, and special devices

o Shell variables and environment variable names

o Message queues, semaphores, and shared memory labels

o UNIX commands and command line options

o Some GUI textual elements, such as keyboard mnemonics and keyboard accelerators
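One possible way to act on the NOT_FOR_TRANSLATION tag is to filter tagged entries out before strings are handed to translators. The sketch below assumes a hypothetical resource format in which each entry carries a `note` field; the entry IDs and the `split_for_translation` helper are illustrative only.

```python
NO_TRANSLATE_TAG = "NOT_FOR_TRANSLATION"

# Hypothetical resource entries; the "note" field is an assumed convention.
RESOURCES = [
    {"id": "msg.disk_full", "text": "The disk is full.", "note": ""},
    {"id": "cmd.list", "text": "ls -la", "note": NO_TRANSLATE_TAG},
    {"id": "env.home", "text": "HOME", "note": NO_TRANSLATE_TAG},
    {"id": "msg.saved", "text": "Your file was saved.", "note": ""},
]


def split_for_translation(entries):
    """Separate entries to translate from entries that must stay fixed."""
    translatable, fixed = [], []
    for entry in entries:
        if entry["note"] == NO_TRANSLATE_TAG:
            fixed.append(entry)
        else:
            translatable.append(entry)
    return translatable, fixed


if __name__ == "__main__":
    to_translate, keep_as_is = split_for_translation(RESOURCES)
    print("Send to translators:", [e["id"] for e in to_translate])
    print("Keep unchanged:", [e["id"] for e in keep_as_is])
```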

3. Do allow for text expansion in messages (especially for GUI items):

Apply the following expansion rules, based on the length of the source text:

o 0 – 10 characters: 101 – 200%

o 11 – 20 characters: 81 – 100%

o 21 – 30 characters: 61 – 80%

o 31 – 50 characters: 41 – 60%

o 51 – 70 characters: 31 – 40%

o Over 70 characters: 30%

Also keep the source string length well below your limit (usually 254 characters) to account for the additional characters needed.
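The expansion rules above can be turned into a quick check, sketched below. It assumes the percentages mean extra space on top of the source length (so 200% expansion means three times the original length) and treats a 50-character string as part of the 31 – 50 band; the function names and the 254-character default are illustrative.

```python
# (upper bound of source length, min expansion %, max expansion %)
EXPANSION_BANDS = [
    (10, 101, 200),
    (20, 81, 100),
    (30, 61, 80),
    (50, 41, 60),
    (70, 31, 40),
]
OVER_70_EXPANSION = (30, 30)


def expansion_range(source_length: int) -> tuple:
    """Return the (min %, max %) expansion expected for a source length."""
    for upper, low, high in EXPANSION_BANDS:
        if source_length <= upper:
            return (low, high)
    return OVER_70_EXPANSION


def worst_case_length(source_text: str) -> int:
    """Estimate the longest translated length, rounded up."""
    length = len(source_text)
    _, max_pct = expansion_range(length)
    # Integer ceiling of length * (1 + max_pct / 100).
    return (length * (100 + max_pct) + 99) // 100


def fits_limit(source_text: str, limit: int = 254) -> bool:
    """Check whether the worst-case translation stays within the limit."""
    return worst_case_length(source_text) <= limit


if __name__ == "__main__":
    for text in ("Save", "Your changes could not be saved."):
        print(repr(text), len(text), worst_case_length(text), fits_limit(text))
```

Running a check like this over a resource file highlights source strings that are already too close to the limit before translation starts.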

4. Don’t use variables when you can avoid them: Variables raise questions in the translator’s mind as to the gender of the term that will be substituted, making it difficult to translate the surrounding sentence correctly. If variables must be used, always provide a list of possible replacement values. Also allow for gender and plural variations in the translation of sentences that incorporate the variable.

Don’t use composite strings. A composite string is an error message or other text that is dynamically generated from partial sentence segments and presented to the user as a complete sentence. Use complete sentences instead, even if that means repeating segments. This helps ensure the accuracy of the translation, regardless of gender, plurality, conjugation, or sentence structure. Also, avoid reusing the same placeholder for multiple variables in the same string, since word order changes between languages; give each variable a distinct placeholder.
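To illustrate complete sentences with distinct placeholders, the sketch below stores one full sentence per locale and fills it with named placeholders, so a translator can reorder the variables freely. The `TEMPLATES` table, the placeholder names, and the `render` helper are hypothetical, and the example deliberately avoids counts so that plural handling does not come into play.

```python
# One complete sentence per locale. Named placeholders let each translation
# place the variables wherever its grammar requires.
TEMPLATES = {
    "en": "{user} moved {file} to {folder}.",
    "de": "{file} wurde von {user} nach {folder} verschoben.",
}


def render(locale: str, **values: str) -> str:
    """Fill the locale's full-sentence template with named values."""
    return TEMPLATES[locale].format(**values)


# Anti-pattern, shown for contrast only: gluing fragments together fixes the
# English word order in code, so the German order above cannot be produced.
def render_composite(user: str, file: str, folder: str) -> str:
    return user + " moved " + file + " to " + folder + "."


if __name__ == "__main__":
    args = {"user": "Ana", "file": "report.txt", "folder": "Archive"}
    print(render("en", **args))
    print(render("de", **args))
```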

5. Do perform pseudo-translation: Pseudo-translation is the process of replacing or adding characters in your software strings to detect character encoding issues and hard-coded text remaining in the source files.
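A minimal pseudo-translation pass might look like the sketch below. The accent mapping, the bracket markers, the `~` padding, and the 30% padding ratio are arbitrary choices for illustration, and the `{name}`-style placeholder syntax is an assumption about the string format.

```python
import re

# Swap ASCII vowels for accented look-alikes to exercise non-ASCII handling.
ACCENTED = str.maketrans("aeiouAEIOU", "àéîõüÀÉÎÕÜ")

# Assumed placeholder syntax: {name}. Placeholders must survive untouched.
PLACEHOLDER = re.compile(r"(\{[^{}]*\})")


def pseudo_translate(text: str, pad_ratio: float = 0.3) -> str:
    """Return an accented, padded, bracketed variant of `text`."""
    parts = PLACEHOLDER.split(text)
    converted = "".join(
        part if PLACEHOLDER.fullmatch(part) else part.translate(ACCENTED)
        for part in parts
    )
    # Padding simulates text expansion; brackets reveal truncated strings.
    padding = "~" * max(1, round(len(text) * pad_ratio))
    return f"[{converted}{padding}]"


if __name__ == "__main__":
    for message in ("Save", "Hello {name}", "File not found"):
        print(pseudo_translate(message))
```

Any string that still appears unaccented and without brackets in the running application is likely hard-coded, and any string that shows junk characters or a missing closing bracket points to an encoding or truncation problem.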

6. Don’t use IF conditions on string values or rely on a sort order in your code to evaluate a string value: For example, avoid (IF Gender = “Female” THEN), because the comparison breaks once the string is translated. Always use enumerations or unique IDs.
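The difference can be sketched as follows: program logic branches on an enumeration value, while the text shown to the user is looked up separately per locale. The `Gender` enum, the label table, and the message IDs are hypothetical names chosen for this example.

```python
from enum import Enum


class Gender(Enum):
    FEMALE = "female"
    MALE = "male"
    UNSPECIFIED = "unspecified"


# Hypothetical display labels; only these are ever translated.
DISPLAY_LABELS = {
    "en": {Gender.FEMALE: "Female", Gender.MALE: "Male",
           Gender.UNSPECIFIED: "Unspecified"},
    "fr": {Gender.FEMALE: "Femme", Gender.MALE: "Homme",
           Gender.UNSPECIFIED: "Non précisé"},
}


def label_for(gender: Gender, locale: str) -> str:
    """Return the translated label shown in the user interface."""
    return DISPLAY_LABELS[locale][gender]


def greeting_message_id(gender: Gender) -> str:
    """Branch on the enum value, never on the displayed text."""
    if gender is Gender.FEMALE:
        return "greeting.female"
    if gender is Gender.MALE:
        return "greeting.male"
    return "greeting.neutral"


if __name__ == "__main__":
    selected = Gender.FEMALE
    print(label_for(selected, "fr"), "->", greeting_message_id(selected))
```

A comparison against the literal "Female" would fail as soon as the French label is displayed, whereas the enum comparison is unaffected by translation.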

7. Do use Unicode functions and methods to support all scripts: Applications that store and retrieve text data need to accept and display the characters of any given language. Using Unicode encoding solves the problem of unsupported character sets and of junk characters being displayed.
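The sketch below shows, under simple assumptions, why a consistent Unicode encoding matters: text written and read as UTF-8 survives unchanged, while the same bytes read with the wrong encoding turn into junk characters. The sample strings and helper names are illustrative.

```python
SAMPLES = ["Größe", "日本語", "Ελληνικά"]


def utf8_round_trip(text: str) -> str:
    """Encode to UTF-8 bytes (as for storage) and decode them back."""
    return text.encode("utf-8").decode("utf-8")


def misread_as_latin1(text: str) -> str:
    """Simulate a reader that wrongly assumes Latin-1 for UTF-8 bytes."""
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    for sample in SAMPLES:
        stored = sample.encode("utf-8")
        # Character count and byte count differ, so size buffers in bytes
        # and measure displayed text in characters.
        print(sample, "| characters:", len(sample), "| bytes:", len(stored))
        print("  round trip intact:", utf8_round_trip(sample) == sample)
        print("  wrong decoding intact:", misread_as_latin1(sample) == sample)
```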

8. Don’t insert hard carriage returns in the middle of sentences: Translation memory tools key off hard returns and assume that the sentence has ended. Inserting a hard return in the middle of a sentence leads to incomplete sentences in the translation database and corrupts the sentence structure in the target-language files. Instead, replace hard returns with soft returns or use a break tag such as [BR]. Also, sentence structure and the length of sentence parts change between languages, so additional breaks may be required in the target languages.

9. Do choose your third-party software provider carefully: Insist that your third-party software supports Unicode and complies with internationalization practices. If problems are encountered with third-party software and you don’t have control over its code to fix them, localization becomes more difficult.

10. Don’t use text in icons and bitmaps: The translated text may be too long to fit. Also, avoid using symbols with cultural connotations and locale-specific idioms.

11. Do use long dates or month abbreviations instead of numbers when identifying dates: The order of month and day varies in different parts of the world (e.g., mm/dd/yy in the US; dd/mm/yy in Europe), so all-numeric dates can be misread.
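The ambiguity is easy to demonstrate. In the sketch below, the same numeric string is parsed under the US and the European convention and yields two different dates, while a format with a month abbreviation leaves no room for misreading. The `MONTH_ABBR` list holds English source abbreviations and the `unambiguous` helper is a hypothetical name.

```python
from datetime import date, datetime

# English source abbreviations; a localized product would look these up
# in the resource bundle for the target language.
MONTH_ABBR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def unambiguous(d: date) -> str:
    """Format a date with a month abbreviation, e.g. 04-Mar-2021."""
    return f"{d.day:02d}-{MONTH_ABBR[d.month - 1]}-{d.year}"


if __name__ == "__main__":
    raw = "03/04/2021"
    as_us = datetime.strptime(raw, "%m/%d/%Y").date()      # month first
    as_europe = datetime.strptime(raw, "%d/%m/%Y").date()  # day first
    print(raw, "read as mm/dd/yyyy:", unambiguous(as_us))
    print(raw, "read as dd/mm/yyyy:", unambiguous(as_europe))
```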

12. Don’t alphabetically sort strings in string tables and resource bundles: Sorting separates strings that belong together and strips away their context. Try to provide as much context as you can with the externalized strings. This will help the translator better adapt the translation to that context. If context is missing, run-time QA will take much longer to correct the translations.
