From 66794749fb7e34b077cff2600e457f2b0a282dd5 Mon Sep 17 00:00:00 2001 From: Ned Batchelder Date: Wed, 15 Jan 2014 09:59:14 -0500 Subject: [PATCH 1/4] Minor tweaks to home page. --- docs/en_us/developers/source/index.rst | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/docs/en_us/developers/source/index.rst b/docs/en_us/developers/source/index.rst index bb36a33f2a..361c25175f 100644 --- a/docs/en_us/developers/source/index.rst +++ b/docs/en_us/developers/source/index.rst @@ -8,13 +8,21 @@ Welcome to EdX's Dev documentation! Contents: -.. toctree:: - :maxdepth: 2 +.. this is wildly disorganized, and is basically just a dumping ground for + .rst files at the moment. - overview.rst - common-lib.rst - djangoapps.rst - i18n_translators_guide.rst +.. toctree:: + :maxdepth: 2 + + overview.rst + common-lib.rst + djangoapps.rst + + overview.rst + common-lib.rst + djangoapps.rst + i18n.rst + i18n_translators_guide.rst Indices and tables ================== @@ -22,4 +30,3 @@ Indices and tables * :ref:`genindex` * :ref:`modindex` * :ref:`search` - From 53de11d4b50f2df40bb76291b99a3dd656527e0f Mon Sep 17 00:00:00 2001 From: Ned Batchelder Date: Wed, 15 Jan 2014 09:59:36 -0500 Subject: [PATCH 2/4] Conversion of page from wiki to .rst --- docs/en_us/developers/source/i18n.rst | 342 ++++++++++++++++++++++++++ 1 file changed, 342 insertions(+) create mode 100644 docs/en_us/developers/source/i18n.rst diff --git a/docs/en_us/developers/source/i18n.rst b/docs/en_us/developers/source/i18n.rst new file mode 100644 index 0000000000..8322d56216 --- /dev/null +++ b/docs/en_us/developers/source/i18n.rst @@ -0,0 +1,342 @@ +###################################### +Internationalization coding guidelines +###################################### + + +See also: + +* `Django Internationalization `_ (overview) +* `Django: Internationalizing Python code `_ +* `Django Translation guidelines `_ +* `Django Format localization `_ + + +General Internationalization 
Rules +********************************** + +In order to localize source files, we need to prepare them so that the +human-readable strings can be extracted by a pre-processing step, and then have +localized strings used at runtime. This requires attention to detail, and +unfortunately limits what you can do with strings in the code. In general: + +1. Always mark complete sentences for translation. If you combine fragments at + runtime, there is no way for the translator to construct a proper sentence + in their language. + +2. Do not join together strings at runtime to create sentences. + +3. Limit the amount of text in strings that is not presented to the user. HTML + markup is better applied after the translation. If you give HTML to the + translators, there's a good chance they will translate your tags or + attributes. + +See the detailed Style Guidelines at the end for details. + + +Editing source files +******************** + +While editing source files (including Python, Javascript, or HTML template +files), use the appropriate conventions. There are a few things to know how to +do: + +1. What has to be at the top of the file (if anything) to prepare it for i18n. + +2. How are strings marked for internationalization? This takes the form of a + function call with the string as an argument. + +3. How are translator comments indicated? These are comments in the file that + will travel with the strings to the translators, giving them context to + produce the best translation. They have a "Translators:" marker. They must + appear on the line preceding the text they describe. + +The code samples below show how to do each of these things. + +Python source code +================== + +.. highlight:: python + +In Python source code (read the django docs for more details):: + + from django.utils.translation import ugettext as _ + + # Translators: This will help the translator + message = _("Welcome!") + +Django template files +===================== + +.. 
highlight:: django + +In Django template files (`templates/*.html`):: + + {% load i18n %} + + {# Translators: this will help the translator. #} + {% trans "Welcome!" %} + +Mako template files +=================== + +.. highlight:: mako + +In Mako template files (`templates/*.html`), you can use all of the tools +available to python programmers. Just make sure to import the relevant +functions first. Here's a mako template example:: + + <%! from django.utils.translation import ugettext as _ %> + + ## Translators: message to the translator + ${_("Welcome!")} + +Javascript files +================ + +.. highlight:: javascript + +In order to internationalize Javascript, first the html template (base.html) +must load a special Javascript library (and Django must be configured to serve +it):: + + + +Then, in javascript files (`*.js`):: + + // Translators: this will help the translator. + var message = gettext('Welcome!'); + +Coffeescript files +================== + +.. highlight:: coffeescript + +Coffeescript files are compiled to Javascript files, so it works mostly like +Javascript:: + + `// Translators: this will help the translator.` + message = gettext('Hey there!') + # Interpolation has to be done in Javascript, not Coffeescript: + message = gettext("Error getting student progress url for '<%= student_id %>'.") + full_message = _.template(message, {student_id: unique_student_identifier}) + +But because we extract strings from the compiled .js files, there are some +native Coffeescript features that break the extraction from the .js files: + +1. You cannot use Coffeescript string interpolation: This results in string + concatenation in the .js file, so string extraction won't work. + +2. You cannot use Coffeescript comments for translator comments, since they are + not passed through to the Javascript file. + +:: + + # NO NO not like this: + # Translators: this won't get to the translators! + message = gettext("Welcome, #{student_name}!") # This won't work! 
+ + ### + Translators: This will work, but takes three lines :( + ### + message = gettext("Hey there") + +.. highlight:: python + +Other kinds of code +=================== + +We have not yet established guidelines for internationalizing the following. +See remaining work for more details. + +* xblocks (in edx-platform/src/xblock) should not depend on django, so we + should use the python gettext library instead. + +* course content (such as subtitles for videos) + +* documentation (written for Sphinx as .rst files) + +* client-side templates written using Underscore. + + +Building and testing your code +****************************** + +These instructions assume you are a developer writing new code to check in to +github. For other use cases in the translation life cycle (such as translating +the strings, or checking the translations into github, see use cases). + +1. Run the rake i18n:extract command to create human-readable .po files. This + command may take a minute or two to complete: + +:: + + $ cd edx-platform + $ rake i18n:extract + +2. Generate dummy strings: run rake i18n:dummy to create fake translations. See + coverage testing (below) for more details. + + a. By default, these are created in the Esperanto language directory. + + 1. This will blow away any actual Esperanto translation files that may be + there. You can revert to the github head after you complete testing. + + 2. You will need to switch your browser to Esperanto in order to view + the dummy text. + + 3. Django's implementation requires us to use a real language (like + Esperanto..) rather than an invented language (like Esperanto.. + er Martian) for this testing. + + b. Do not check in to github the dummy text (in conf/locale/eo/LC_MESSAGES). + +:: + + $ rake i18n:dummy + +3. Run the rake i18n:generate command to create machine-readable .mo files:: + + $ rake i18n:generate + +4. Django should be ready to go. 
The next time you run studio or lms with a + non-English browser, the non-English strings (from step 3, above) should be + displayed. (But be sure that your settings for USE_I18N and USE_L10N are + both set to True. USE_I18N is currently set to False by default in + common.py, but is set to True in lms/envs/dev.py and cms/envs/dev.py) + +5. With your browser set to Esperanto, review the pages affected by your code + and verify that you see fake translations. If you see plain English instead, + your code is not being properly translated. Review the steps in editing + source files (above) + +Coverage testing +**************** + +This tool is used during the bootstrap phase, when presumably (1) there is a +lot of EdX source code to be converted, and (2) there are not a lot of +available translations for externalized EdX strings. At the end of the +bootstrap phase, we will eventually deprecate this tool in favor of other +processes. Once most of the EdX source code has been successfully converted, +and there are several full translations available, it will be easier to detect +and correct specific gaps in compliance. + +Use the coverage tool to generate dummy files:: + + $ rake i18n:dummy + +This will create new dummy translations in the Esperanto directory +(edx-platform/conf/locale/eo/LC_MESSAGES). + +You can then configure your browser preferences to view Esperanto as your +preferred language. Instead of plain English strings, you should see something +like this: + + Thé Fütüré øf Ønlïné Édüçätïøn Ⱡσяєм ι# + Før änýøné, änýwhéré, änýtïmé Ⱡσяєм # + +This dummy text is distinguished by extra accent characters. If you see plain +English instead (without these accents), it most likely means the string has +not been externalized yet. To fix this: + +* Find the string in the source tree (either in python, javascript, or html + template code). + +* Refer to the above coding guidelines to make sure it has been externalized + properly. 
+ +* Rerun the scripts and confirm that the strings are now properly converted + into dummy text. + +This dummy text is also distinguished by Lorem ipsum text at the end of each +string, and is always terminated with "#". The original English string is +padded by about 30% extra characters, to simulate some language (like German) +which tend to have longer strings than English. If you see problems with your +page layout, such as columns that do not fit, or text that is truncated (the # +character should always be displayed on every string), then you will probably +need to fix the page layouts accordingly to accommodate the longer strings. + + +Style guidelines +**************** + +Don't append strings. Interpolate values instead. +================================================= + +It is harder for translators to provide reasonable translations of small +sentence fragments. If your code appends sentence fragments, even if it seems +to work ok for English, the same concatenation is very unlikely to work +properly for other languages. + +Bad:: + + message = _("The directory has ") + len(directory.files) + _(" files.") + +In this scenario, the translator will have to figure out how to translate these +two separate strings. It is very difficult to translate a fragment like "The +directory has." In some languages the fragments will be in different order. For +example, in Japanese, "files" will come before "has." + +It is much easier for a translator to figure out how to translate the entire +sentence, using the pattern "The directory has %d files." + +Good:: + + message = _("The directory has %d files.") % len(directory.files) + + +Use named interpolation fields +============================== + +Named fields are better, especially if there are multiple fields, or if some +fields will be locally formatted (i.e. number, date, or currency). 
+ +Bad:: + + message = _('Today is %s %d.') % (m, d) + +Good:: + + message = _('Today is %(month)s %(day)s.') % {'month': m, 'day': d} + +Notice that in English, the month comes first, but in Spanish the day comes +first. This is reflected in the +edx-platform/conf/locale/es/LC_MESSAGES/django.po file like this:: + + # fragment from edx-platform/conf/locale/es/LC_MESSAGES/django.po + msgid "Today is %(month)s %(day)s." + msgstr "Hoy es %(day) de %(month)s." + +The resulting output is correct in each language:: + + English output: "Today is November 26." + Spanish output: "Hoy es 26 de Noviembre." + + +Singular vs Plural +================== + +It's tempting to improve a message by selecting singular or plural based on a +count:: + + if count == 1: + msg = _("There is 1 file.") + else: + msg = _("There are %d files.") % count + +This is not the correct way to choose a string, because other languages have +different rules for when to use singluar and when plural, and there may be more +than two choices! + +One option is not to use different text for different counts:: + + msg = _("Number of files: %d") % count + +If you want to choose based on number, you need to use another gettext variant +to do it:: + + from django.utils.translation import ungettext + msg = ungettext("There is %d file", "There are %d files", count) + msg = msg % count + +This will properly use count to find a correct string in the translation file, +and then you can use that string to format in the count. From 8c79f13d17f2e3fb6c13bda94c3db2ef6a06a2b5 Mon Sep 17 00:00:00 2001 From: Ned Batchelder Date: Thu, 16 Jan 2014 17:16:27 -0500 Subject: [PATCH 3/4] Added more coding details to i18n doc. 
--- docs/en_us/developers/source/i18n.rst | 187 +++++++++++++++++++++++--- 1 file changed, 167 insertions(+), 20 deletions(-) diff --git a/docs/en_us/developers/source/i18n.rst b/docs/en_us/developers/source/i18n.rst index 8322d56216..b19b8e6117 100644 --- a/docs/en_us/developers/source/i18n.rst +++ b/docs/en_us/developers/source/i18n.rst @@ -2,6 +2,10 @@ Internationalization coding guidelines ###################################### +Preparing code to be presented in many languages can be complex and difficult. +The rules here give the best practices for marking English strings in source +so that it can be extracted, translated, and presented to the user in the +language of their choice. See also: @@ -11,7 +15,7 @@ See also: * `Django Format localization `_ -General Internationalization Rules +General internationalization rules ********************************** In order to localize source files, we need to prepare them so that the @@ -30,7 +34,8 @@ unfortunately limits what you can do with strings in the code. In general: translators, there's a good chance they will translate your tags or attributes. -See the detailed Style Guidelines at the end for details. +See the detailed :ref:`Style Guidelines ` at the end for +details. Editing source files @@ -50,20 +55,37 @@ do: produce the best translation. They have a "Translators:" marker. They must appear on the line preceding the text they describe. -The code samples below show how to do each of these things. +The code samples below show how to do each of these things. Note that you have +to take into account not just the programming language involved, but the type +of file: Javascript embedded in an HTML Mako template is treated differently +than Javascript in a pure .js file. Python source code ================== .. 
highlight:: python -In Python source code (read the django docs for more details):: +In most Python source code (read the Django docs for more details):: from django.utils.translation import ugettext as _ # Translators: This will help the translator message = _("Welcome!") +Some edX code cannot use Django imports. To maintain portability, XBlocks, +XModules, Inputtypes and Responsetypes forbid importing Django. Each of these +has its own way of accessing translations. You'll use lines like these +instead:: + + # for XBlock & XModule: + _ = self.runtime.service(self, "i18n").ugettext + message = _("Welcome!") + + # for InputType and ResponseType: + _ = self.capa_system.i18n.ugettext + message = _("Welcome!") + + Django template files ===================== @@ -101,11 +123,14 @@ it):: -Then, in javascript files (`*.js`):: +Then, in Javascript files (`*.js`):: // Translators: this will help the translator. var message = gettext('Welcome!'); +Note that Javascript embedded in HTML in a Mako template file is handled +differently. There, you use the Mako syntax even within the Javascript. + Coffeescript files ================== @@ -146,10 +171,6 @@ Other kinds of code =================== We have not yet established guidelines for internationalizing the following. -See remaining work for more details. - -* xblocks (in edx-platform/src/xblock) should not depend on django, so we - should use the python gettext library instead. * course content (such as subtitles for videos) @@ -162,8 +183,8 @@ Building and testing your code ****************************** These instructions assume you are a developer writing new code to check in to -github. For other use cases in the translation life cycle (such as translating -the strings, or checking the translations into github, see use cases). +Github. For other use cases in the translation life cycle (such as translating +the strings, or checking the translations into Github, see use cases). 1. 
Run the rake i18n:extract command to create human-readable .po files. This command may take a minute or two to complete: @@ -179,7 +200,7 @@ the strings, or checking the translations into github, see use cases). a. By default, these are created in the Esperanto language directory. 1. This will blow away any actual Esperanto translation files that may be - there. You can revert to the github head after you complete testing. + there. You can revert to the Github head after you complete testing. 2. You will need to switch your browser to Esperanto in order to view the dummy text. @@ -188,7 +209,7 @@ the strings, or checking the translations into github, see use cases). Esperanto..) rather than an invented language (like Esperanto.. er Martian) for this testing. - b. Do not check in to github the dummy text (in conf/locale/eo/LC_MESSAGES). + b. Do not check the dummy text in to Github (in conf/locale/eo/LC_MESSAGES). :: @@ -209,6 +230,7 @@ the strings, or checking the translations into github, see use cases). your code is not being properly translated. Review the steps in editing source files (above) + Coverage testing **************** @@ -238,7 +260,7 @@ This dummy text is distinguished by extra accent characters. If you see plain English instead (without these accents), it most likely means the string has not been externalized yet. To fix this: -* Find the string in the source tree (either in python, javascript, or html +* Find the string in the source tree (either in Python, Javascript, or HTML template code). * Refer to the above coding guidelines to make sure it has been externalized @@ -256,11 +278,13 @@ character should always be displayed on every string), then you will probably need to fix the page layouts accordingly to accommodate the longer strings. +.. _style_guidelines: + Style guidelines **************** -Don't append strings. Interpolate values instead. 
-================================================= +Don't append strings, interpolate values +======================================== It is harder for translators to provide reasonable translations of small sentence fragments. If your code appends sentence fragments, even if it seems @@ -288,7 +312,7 @@ Use named interpolation fields ============================== Named fields are better, especially if there are multiple fields, or if some -fields will be locally formatted (i.e. number, date, or currency). +fields will be locally formatted (for example, number, date, or currency). Bad:: @@ -298,13 +322,17 @@ Good:: message = _('Today is %(month)s %(day)s.') % {'month': m, 'day': d} +Better:: + + message = _('Today is {month} {day}.').format(month=m, day=d) + Notice that in English, the month comes first, but in Spanish the day comes first. This is reflected in the edx-platform/conf/locale/es/LC_MESSAGES/django.po file like this:: # fragment from edx-platform/conf/locale/es/LC_MESSAGES/django.po - msgid "Today is %(month)s %(day)s." - msgstr "Hoy es %(day) de %(month)s." + msgid "Today is {month} {day}." + msgstr "Hoy es {day} de {month}." The resulting output is correct in each language:: @@ -312,7 +340,57 @@ The resulting output is correct in each language:: Spanish output: "Hoy es 26 de Noviembre." -Singular vs Plural +Only translate literal strings +============================== + +As programmers, we're used to using functions in flexible ways. But the +translation functions like ``_()`` and ``gettext()`` can't be used like other +functions. At runtime, they are real functions like any other, but they also +serve as markers for the string extraction process. + +For string extraction to work properly, the translation functions must be +called with only literal strings. If you use them with a computed value, +the string extractor won't have a string to extract. 
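To make the "marker" idea concrete, here is a toy extractor. It is a minimal sketch only, not the real extraction tooling used by the project: it assumes that an extractor reads source code without running it, and the names ``extract_marked_strings`` and ``SAMPLE`` are hypothetical.

```python
import ast


def extract_marked_strings(source):
    """Return the literal strings passed directly to _() in the given source.

    Hypothetical helper: it inspects the syntax tree only and never runs
    the code, which is the assumption made here about real extractors.
    """
    found = []
    for node in ast.walk(ast.parse(source)):
        is_marker_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "_"
            and node.args
        )
        if not is_marker_call:
            continue
        first_arg = node.args[0]
        # Only a plain string literal can be recorded for the translators.
        if isinstance(first_arg, ast.Constant) and isinstance(first_arg.value, str):
            found.append(first_arg.value)
    return found


SAMPLE = '''
kept = _("You have {count} new messages").format(count=n)
lost = _("Your balance is {amount}".format(amount=total))
'''

extracted = extract_marked_strings(SAMPLE)
print(extracted)
```

Only the first string is found. In the second call the argument to ``_()`` is a ``.format()`` expression, so a tool that does not execute the code has no literal to record.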
+ +The difference between the right way and the wrong way can be very subtle: + +:: + + # BAD: This tries to translate the result of .format() + _("Welcome, {name}".format(name=student_name)) + + # GOOD: Translate the literal string, then use it with .format() + _("Welcome, {name}").format(name=student_name) + +:: + + # BAD: The dedent always makes the same string, but the extractor can't find it. + _(dedent(""" + .. very long message .. + """)) + + # GOOD: Dedent the translated string. + dedent(_(""" + .. very long message .. + """)) + +:: + + # BAD: The string is separated from _(), the extractor won't find it. + if hello: + msg = "Welcome!" + else: + msg = "Goodbye." + message = _(msg) + + # GOOD: Each string is wrapped in _() + if hello: + message = _("Welcome!") + else: + message = _("Goodbye.") + + +Singular vs plural ================== It's tempting to improve a message by selecting singular or plural based on a @@ -340,3 +418,72 @@ to do it:: This will properly use count to find a correct string in the translation file, and then you can use that string to format in the count. + + +Translating too early +===================== + +When the ``_()`` function is called, it will fetch a translated string. It +will use the current user's language to decide which string to fetch. If you +invoke it before we know the user, then it will get the wrong language. + +For example:: + + from django.utils.translation import ugettext as _ + + HELLO = _("Hello") + GOODBYE = _("Goodbye") + + def get_greeting(hello): + if hello: + return HELLO + else: + return GOODBYE + +Here the HELLO and GOODBYE constants are assigned when the module is first +imported, at server startup. There is no current user then, so ugettext will +use the server's default language. When we eventually use those constants to +show a message to the user, they won't be looked up again, and the user will +get the wrong language. + +There are a few ways to deal with this. 
The first is to avoid calling ``_()`` +until we have the user:: + + def get_greeting(hello): + if hello: + return _("Hello") + else: + return _("Goodbye") + +Another way is to use Django's ugettext_lazy function. Instead of returning +a string, it returns a lazy object that will wait to do the lookup until it is +actually used as a string:: + + from django.utils.translation import ugettext_lazy as _ + +This can be tricky because the lazy object doesn't act like a string in all +cases. + +The last way to solve the problem is to mark the string so that it will be +extracted properly, but not actually do the lookup when the constant is +defined:: + + from django.utils.translation import ugettext + + _ = lambda text: text + + HELLO = _("Hello") + GOODBYE = _("Goodbye") + + _ = ugettext + + def get_greeting(hello): + if hello: + return _(HELLO) + else: + return _(GOODBYE) + +Here we define ``_()`` as a pass-through function, so the string will be +found during extraction, but won't be translated too early. Then we redefine +``_()`` to be the real translation lookup function, and use it at runtime to +get the localized string. From b8a8575d174e9194358c420c82ee4e3a50d0c637 Mon Sep 17 00:00:00 2001 From: Ned Batchelder Date: Fri, 17 Jan 2014 11:11:42 -0500 Subject: [PATCH 4/4] Further edits --- docs/en_us/developers/source/i18n.rst | 151 +++++++++++++++----------- 1 file changed, 89 insertions(+), 62 deletions(-) diff --git a/docs/en_us/developers/source/i18n.rst b/docs/en_us/developers/source/i18n.rst index b19b8e6117..2a825c3b09 100644 --- a/docs/en_us/developers/source/i18n.rst +++ b/docs/en_us/developers/source/i18n.rst @@ -27,15 +27,17 @@ unfortunately limits what you can do with strings in the code. In general: runtime, there is no way for the translator to construct a proper sentence in their language. -2. Do not join together strings at runtime to create sentences. +2. Don't join strings together at runtime to create sentences. 3. 
Limit the amount of text in strings that is not presented to the user. HTML markup is better applied after the translation. If you give HTML to the translators, there's a good chance they will translate your tags or attributes. -See the detailed :ref:`Style Guidelines ` at the end for -details. +4. Use placeholders with descriptive names: ``"Welcome {student_name}"`` is + much better than ``"Welcome {0}"``. + +See the detailed Style Guidelines at the end for details. Editing source files @@ -77,14 +79,19 @@ XModules, Inputtypes and Responsetypes forbid importing Django. Each of these has its own way of accessing translations. You'll use lines like these instead:: - # for XBlock & XModule: + ### for XBlock & XModule: _ = self.runtime.service(self, "i18n").ugettext + # Translators: a greeting to newly-registered students. message = _("Welcome!") # for InputType and ResponseType: _ = self.capa_system.i18n.ugettext + # Translators: a greeting to newly-registered students. message = _("Welcome!") +"Translators" comments will work in these places too, so don't be shy about +providing clarifying comments to the translators. + Django template files ===================== @@ -105,7 +112,7 @@ Mako template files In Mako template files (`templates/*.html`), you can use all of the tools available to python programmers. Just make sure to import the relevant -functions first. Here's a mako template example:: +functions first. Here's a Mako template example:: <%! from django.utils.translation import ugettext as _ %> @@ -172,11 +179,11 @@ Other kinds of code We have not yet established guidelines for internationalizing the following. -* course content (such as subtitles for videos) +* Course content (such as subtitles for videos) -* documentation (written for Sphinx as .rst files) +* Documentation (written for Sphinx as .rst files) -* client-side templates written using Underscore. +* Client-side templates written using Underscore. 
Building and testing your code @@ -186,32 +193,16 @@ These instructions assume you are a developer writing new code to check in to Github. For other use cases in the translation life cycle (such as translating the strings, or checking the translations into Github, see use cases). -1. Run the rake i18n:extract command to create human-readable .po files. This - command may take a minute or two to complete: +1. Create human-readable .po files with the latest strings. This command may + take a minute or two to complete:: -:: + $ cd edx-platform + $ rake assets + $ rake i18n:extract - $ cd edx-platform - $ rake i18n:extract - -2. Generate dummy strings: run rake i18n:dummy to create fake translations. See - coverage testing (below) for more details. - - a. By default, these are created in the Esperanto language directory. - - 1. This will blow away any actual Esperanto translation files that may be - there. You can revert to the Github head after you complete testing. - - 2. You will need to switch your browser to Esperanto in order to view - the dummy text. - - 3. Django's implementation requires us to use a real language (like - Esperanto..) rather than an invented language (like Esperanto.. - er Martian) for this testing. - - b. Do not check the dummy text in to Github (in conf/locale/eo/LC_MESSAGES). - -:: +2. Generate dummy strings: See coverage testing (below) for more details. This + will create an "Esperanto" translation that is actually over-accented + English. Use this to create fake translations:: $ rake i18n:dummy @@ -219,26 +210,26 @@ the strings, or checking the translations into Github, see use cases). $ rake i18n:generate -4. Django should be ready to go. The next time you run studio or lms with a - non-English browser, the non-English strings (from step 3, above) should be - displayed. (But be sure that your settings for USE_I18N and USE_L10N are - both set to True. 
USE_I18N is currently set to False by default in - common.py, but is set to True in lms/envs/dev.py and cms/envs/dev.py) +4. Django should be ready to go. The next time you run Studio or LMS with a + browser set to Esperanto, the accented-English strings (from step 3, above) + should be displayed. Be sure that your settings for ``USE_I18N`` and + ``USE_L10N`` are both set to True. ``USE_I18N`` is set to False by default + in common.py, but is set to True in development settings files. 5. With your browser set to Esperanto, review the pages affected by your code and verify that you see fake translations. If you see plain English instead, your code is not being properly translated. Review the steps in editing - source files (above) + source files (above). Coverage testing **************** This tool is used during the bootstrap phase, when presumably (1) there is a -lot of EdX source code to be converted, and (2) there are not a lot of -available translations for externalized EdX strings. At the end of the +lot of edX source code to be converted, and (2) there are not a lot of +available translations for externalized edX strings. At the end of the bootstrap phase, we will eventually deprecate this tool in favor of other -processes. Once most of the EdX source code has been successfully converted, +processes. Once most of the edX source code has been successfully converted, and there are several full translations available, it will be easier to detect and correct specific gaps in compliance. @@ -273,13 +264,12 @@ This dummy text is also distinguished by Lorem ipsum text at the end of each string, and is always terminated with "#". The original English string is padded by about 30% extra characters, to simulate some language (like German) which tend to have longer strings than English. 
If you see problems with your -page layout, such as columns that do not fit, or text that is truncated (the # -character should always be displayed on every string), then you will probably -need to fix the page layouts accordingly to accommodate the longer strings. +page layout, such as columns that don't fit, or text that is truncated (the +``#`` character should always be displayed on every string), then you will +probably need to fix the page layouts accordingly to accommodate the longer +strings. -.. _style_guidelines: - Style guidelines **************** @@ -288,7 +278,7 @@ Don't append strings, interpolate values It is harder for translators to provide reasonable translations of small sentence fragments. If your code appends sentence fragments, even if it seems -to work ok for English, the same concatenation is very unlikely to work +to work OK for English, the same concatenation is very unlikely to work properly for other languages. Bad:: @@ -301,34 +291,36 @@ directory has." In some languages the fragments will be in different order. For example, in Japanese, "files" will come before "has." It is much easier for a translator to figure out how to translate the entire -sentence, using the pattern "The directory has %d files." +sentence, using the pattern "The directory has {file_count} files." Good:: - message = _("The directory has %d files.") % len(directory.files) + message = _("The directory has {file_count} files.").format(file_count=len(directory.files)) -Use named interpolation fields -============================== +Use named placeholders +====================== -Named fields are better, especially if there are multiple fields, or if some -fields will be locally formatted (for example, number, date, or currency). +Python string formatting provides both positional and named placeholders. Use +named placeholders, never use positional placeholders. 
Positional placeholders +can't be translated into other languages which may need to re-order them to +make syntactically correct sentences. Even with a single placeholder, a named +placeholder provides more context to the translator. Bad:: message = _('Today is %s %d.') % (m, d) -Good:: +OK:: message = _('Today is %(month)s %(day)s.') % {'month': m, 'day': d} -Better:: +Best:: message = _('Today is {month} {day}.').format(month=m, day=d) Notice that in English, the month comes first, but in Spanish the day comes -first. This is reflected in the -edx-platform/conf/locale/es/LC_MESSAGES/django.po file like this:: +first. This is reflected in the .po file like this:: # fragment from edx-platform/conf/locale/es/LC_MESSAGES/django.po msgid "Today is {month} {day}." @@ -390,6 +382,41 @@ The difference between the right way and the wrong way can be very subtle: message = _("Goodbye.") +Be aware of nested syntax +========================= + +When translating strings in templated files, you have to be careful of nested +syntax. For example, consider this Javascript fragment in a Mako template:: + + + +When rendered for a French speaker, it will produce this:: + + + +which is now invalid Javascript. This can be avoided by using double-quotes +for the Javascript string. The better solution is to use a filtering function +that properly escapes the string for Javascript use:: + + + +which produces:: + + + +Other places that might be problematic are HTML attributes:: + + ${_("I love you.")} + + Singular vs plural ================== @@ -399,22 +426,22 @@ count:: if count == 1: msg = _("There is 1 file.") else: - msg = _("There are %d files.") % count + msg = _("There are {file_count} files.").format(file_count=count) This is not the correct way to choose a string, because other languages have -different rules for when to use singluar and when plural, and there may be more +different rules for when to use singular and when plural, and there may be more than two choices! 
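As an illustration of "more than two choices": Russian catalogs conventionally declare three plural forms. The sketch below hand-codes the selection rule commonly given for Russian in a gettext ``Plural-Forms`` header. The function name is hypothetical, and real code should let gettext apply the rule rather than reimplementing it.

```python
def russian_plural_index(n):
    """Return which of three plural forms a count selects (illustrative only).

    Mirrors the rule commonly listed for Russian in gettext catalogs:
    form 0 for 1, 21, 31, ...; form 1 for 2-4, 22-24, ...; form 2 otherwise.
    """
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and not 10 <= n % 100 < 20:
        return 1
    return 2


# Group a few counts by the form they would select.
forms = {}
for count in (1, 2, 5, 11, 12, 21, 22, 25):
    forms.setdefault(russian_plural_index(count), []).append(count)
print(forms)
```

A test of ``count == 1`` cannot express this: 21 takes the same form as 1, while 11 does not.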
One option is not to use different text for different counts:: - msg = _("Number of files: %d") % count + msg = _("Number of files: {file_count}").format(file_count=count) If you want to choose based on number, you need to use another gettext variant to do it:: from django.utils.translation import ungettext - msg = ungettext("There is %d file", "There are %d files", count) - msg = msg % count + msg = ungettext("There is {file_count} file", "There are {file_count} files", count) + msg = msg.format(file_count=count) This will properly use count to find a correct string in the translation file, and then you can use that string to format in the count.
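The two-step pattern (pick the template by count, then format the count into it) can be tried without a Django project. This is a minimal sketch that swaps in the standard library's ``gettext.NullTranslations`` for Django's ``ungettext``. With no catalog loaded it falls back to the English rule, and ``file_count_message`` is a hypothetical helper name.

```python
import gettext

# Assumption: no translation catalog is installed, so NullTranslations
# applies the plain English rule (singular only when the count is 1).
translations = gettext.NullTranslations()


def file_count_message(count):
    """Choose the singular or plural template, then format the count in."""
    template = translations.ngettext(
        "There is {file_count} file",
        "There are {file_count} files",
        count,
    )
    # Format only after the lookup, so the literal templates stay extractable.
    return template.format(file_count=count)


for n in (0, 1, 3):
    print(file_count_message(n))
```

With a real catalog loaded, the same call would pick among however many plural forms the target language defines.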