Added more coding details to i18n doc.
@@ -2,6 +2,10 @@
Internationalization coding guidelines
######################################

Preparing code to be presented in many languages can be complex and difficult.
The rules here give the best practices for marking English strings in source
so that they can be extracted, translated, and presented to the user in the
language of their choice.

See also:

@@ -11,7 +15,7 @@ See also:
* `Django Format localization <https://docs.djangoproject.com/en/dev/topics/i18n/formatting/>`_


General Internationalization Rules
General internationalization rules
**********************************

In order to localize source files, we need to prepare them so that the
@@ -30,7 +34,8 @@ unfortunately limits what you can do with strings in the code. In general:
translators, there's a good chance they will translate your tags or
attributes.

See the detailed Style Guidelines at the end for details.
See the detailed :ref:`Style Guidelines <style_guidelines>` at the end for
details.


Editing source files
@@ -50,20 +55,37 @@ do:
produce the best translation. They have a "Translators:" marker. They must
appear on the line preceding the text they describe.

The code samples below show how to do each of these things.
The code samples below show how to do each of these things. Note that you have
to take into account not just the programming language involved, but the type
of file: Javascript embedded in an HTML Mako template is treated differently
than Javascript in a pure .js file.

Python source code
==================

.. highlight:: python

In Python source code (read the django docs for more details)::
In most Python source code (read the Django docs for more details)::

    from django.utils.translation import ugettext as _

    # Translators: This will help the translator
    message = _("Welcome!")

Some edX code cannot use Django imports. To maintain portability, XBlocks,
XModules, Inputtypes and Responsetypes forbid importing Django. Each of these
has its own way of accessing translations. You'll use lines like these
instead::

    # for XBlock & XModule:
    _ = self.runtime.service(self, "i18n").ugettext
    message = _("Welcome!")

    # for InputType and ResponseType:
    _ = self.capa_system.i18n.ugettext
    message = _("Welcome!")


Django template files
=====================

@@ -101,11 +123,14 @@ it)::

    <script type="text/javascript" src="jsi18n/"></script>

Then, in javascript files (`*.js`)::
Then, in Javascript files (`*.js`)::

    // Translators: this will help the translator.
    var message = gettext('Welcome!');

Note that Javascript embedded in HTML in a Mako template file is handled
differently. There, you use the Mako syntax even within the Javascript.

Coffeescript files
==================

@@ -146,10 +171,6 @@ Other kinds of code
===================

We have not yet established guidelines for internationalizing the following.
See remaining work for more details.

* xblocks (in edx-platform/src/xblock) should not depend on django, so we
should use the python gettext library instead.

* course content (such as subtitles for videos)

@@ -162,8 +183,8 @@ Building and testing your code
******************************

These instructions assume you are a developer writing new code to check in to
github. For other use cases in the translation life cycle (such as translating
the strings, or checking the translations into github, see use cases).
Github. For other use cases in the translation life cycle (such as translating
the strings, or checking the translations into Github), see use cases.

1. Run the rake i18n:extract command to create human-readable .po files. This
command may take a minute or two to complete:
@@ -179,7 +200,7 @@ the strings, or checking the translations into github, see use cases).
a. By default, these are created in the Esperanto language directory.

1. This will blow away any actual Esperanto translation files that may be
there. You can revert to the github head after you complete testing.
there. You can revert to the Github head after you complete testing.

2. You will need to switch your browser to Esperanto in order to view
the dummy text.
@@ -188,7 +209,7 @@ the strings, or checking the translations into github, see use cases).
Esperanto..) rather than an invented language (like Esperanto..
er Martian) for this testing.

b. Do not check in to github the dummy text (in conf/locale/eo/LC_MESSAGES).
b. Do not check the dummy text in to Github (in conf/locale/eo/LC_MESSAGES).

::

@@ -209,6 +230,7 @@ the strings, or checking the translations into github, see use cases).
your code is not being properly translated. Review the steps in editing
source files (above)


Coverage testing
****************

@@ -238,7 +260,7 @@ This dummy text is distinguished by extra accent characters. If you see plain
English instead (without these accents), it most likely means the string has
not been externalized yet. To fix this:

* Find the string in the source tree (either in python, javascript, or html
* Find the string in the source tree (either in Python, Javascript, or HTML
template code).

* Refer to the above coding guidelines to make sure it has been externalized
@@ -256,11 +278,13 @@ character should always be displayed on every string), then you will probably
need to fix the page layouts accordingly to accommodate the longer strings.


.. _style_guidelines:

Style guidelines
****************

Don't append strings. Interpolate values instead.
=================================================
Don't append strings, interpolate values
========================================

It is harder for translators to provide reasonable translations of small
sentence fragments. If your code appends sentence fragments, even if it seems
@@ -288,7 +312,7 @@ Use named interpolation fields
==============================

Named fields are better, especially if there are multiple fields, or if some
fields will be locally formatted (i.e. number, date, or currency).
fields will be locally formatted (for example, number, date, or currency).

Bad::

@@ -298,13 +322,17 @@ Good::

    message = _('Today is %(month)s %(day)s.') % {'month': m, 'day': d}

Better::

    message = _('Today is {month} {day}.').format(month=m, day=d)

Notice that in English, the month comes first, but in Spanish the day comes
first. This is reflected in the
edx-platform/conf/locale/es/LC_MESSAGES/django.po file like this::

    # fragment from edx-platform/conf/locale/es/LC_MESSAGES/django.po
    msgid "Today is %(month)s %(day)s."
    msgstr "Hoy es %(day) de %(month)s."
    msgid "Today is {month} {day}."
    msgstr "Hoy es {day} de {month}."

The resulting output is correct in each language::

@@ -312,7 +340,57 @@ The resulting output is correct in each language::
    Spanish output: "Hoy es 26 de Noviembre."
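
As a quick check of the reordering point, the sketch below feeds the same
keyword arguments to the English template and to the Spanish ``msgstr`` from the
fragment above. The ``render`` helper and the sample values are hypothetical,
introduced only for illustration; a positional template is included to show
what goes wrong without named fields.

```python
# Illustrative only: `render` and the sample values are hypothetical.
ENGLISH = "Today is {month} {day}."
SPANISH = "Hoy es {day} de {month}."  # msgstr from the django.po fragment


def render(template, month, day):
    """Fill the named fields; their order in the template does not matter."""
    return template.format(month=month, day=day)


english_output = render(ENGLISH, month="November", day=26)
spanish_output = render(SPANISH, month="Noviembre", day=26)

# With positional fields the code fixes the argument order, so a translator
# who needs the day first cannot get it.
positional_spanish = "Hoy es %s de %s." % ("Noviembre", 26)
```

Because the translator controls only the template, named fields are what let
the Spanish string put the day before the month.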


Singular vs Plural
Only translate literal strings
==============================

As programmers, we're used to using functions in flexible ways. But the
translation functions like ``_()`` and ``gettext()`` can't be used like other
functions. At runtime, they are real functions like any other, but they also
serve as markers for the string extraction process.

For string extraction to work properly, the translation functions must be
called with only literal strings. If you use them with a computed value,
the string extractor won't have a string to extract.

The difference between the right way and the wrong way can be very subtle:

::

    # BAD: This tries to translate the result of .format()
    _("Welcome, {name}".format(name=student_name))

    # GOOD: Translate the literal string, then use it with .format()
    _("Welcome, {name}").format(name=student_name)

::

    # BAD: The dedent always makes the same string, but the extractor can't find it.
    _(dedent("""
        .. very long message ..
        """))

    # GOOD: Dedent the translated string.
    dedent(_("""
        .. very long message ..
        """))

::

    # BAD: The string is separated from _(), the extractor won't find it.
    if hello:
        msg = "Welcome!"
    else:
        msg = "Goodbye."
    message = _(msg)

    # GOOD: Each string is wrapped in _()
    if hello:
        message = _("Welcome!")
    else:
        message = _("Goodbye.")
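
To see why only literal arguments work, here is a minimal sketch of what a
static extractor can and cannot see. It is not the real extraction tool: the
``extractable_strings`` helper is hypothetical and uses Python's ``ast`` module
as a simplified stand-in, looking only for calls to ``_()`` whose first
argument is a string literal.

```python
import ast


def extractable_strings(source):
    """Return string literals passed directly as the first argument of _()."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "_"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            found.append(node.args[0].value)
    return found


# The source is only parsed, never executed, just as an extractor would do.
bad = '_("Welcome, {name}".format(name=student_name))'
good = '_("Welcome, {name}").format(name=student_name)'
indirect = 'msg = "Welcome!"\nmessage = _(msg)'

bad_found = extractable_strings(bad)
good_found = extractable_strings(good)
indirect_found = extractable_strings(indirect)
```

In the bad and indirect cases the argument of ``_()`` is an expression or a
variable, so there is no literal for the sketch to collect.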


Singular vs plural
==================

It's tempting to improve a message by selecting singular or plural based on a
@@ -340,3 +418,72 @@ to do it::

This will properly use count to find a correct string in the translation file,
and then you can use that string to format in the count.
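
The two-step pattern (pick the string by count, then format the count into it)
can be tried without Django. The sketch below is an assumption-laden stand-in:
it uses the standard library's ``gettext.NullTranslations``, which has no
catalog and therefore applies English plural rules, and the message strings and
the ``files_message`` helper are hypothetical.

```python
import gettext

# No catalog is loaded, so ngettext falls back to English singular/plural.
translations = gettext.NullTranslations()


def files_message(count):
    """Choose the template by count, then interpolate the count into it."""
    template = translations.ngettext(
        "There is {count} file.",
        "There are {count} files.",
        count,
    )
    return template.format(count=count)


one = files_message(1)
many = files_message(3)
```

With a real catalog, the same call would select among however many plural
forms the target language defines.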


Translating too early
=====================

When the ``_()`` function is called, it will fetch a translated string. It
will use the current user's language to decide which string to fetch. If you
invoke it before we know the user, then it will get the wrong language.

For example::

    from django.utils.translation import ugettext as _

    HELLO = _("Hello")
    GOODBYE = _("Goodbye")

    def get_greeting(hello):
        if hello:
            return HELLO
        else:
            return GOODBYE

Here the HELLO and GOODBYE constants are assigned when the module is first
imported, at server startup. There is no current user then, so ugettext will
use the server's default language. When we eventually use those constants to
show a message to the user, they won't be looked up again, and the user will
get the wrong language.
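
The timing problem can be reproduced in a few lines without Django. Everything
in this sketch is a hypothetical stand-in: ``CATALOG`` plays the role of the
translation files, ``current_language`` the role of the active user's language,
and ``translate`` the role of ``_()``.

```python
# Hypothetical stand-ins for the translation catalog and the active language.
CATALOG = {"es": {"Hello": "Hola"}}
current_language = "en"  # server default while modules are being imported


def translate(text):
    """Look up text in whichever language is active at call time."""
    return CATALOG.get(current_language, {}).get(text, text)


EAGER_HELLO = translate("Hello")  # looked up once, at "import time"


def deferred_hello():
    return translate("Hello")  # looked up each time it is needed


current_language = "es"  # later, a request from a Spanish-speaking user

eager_result = EAGER_HELLO
deferred_result = deferred_hello()
```

The eager constant keeps the English text because its lookup ran before the
language changed; the deferred call sees the user's language.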

There are a few ways to deal with this. The first is to avoid calling ``_()``
until we have the user::

    def get_greeting(hello):
        if hello:
            return _("Hello")
        else:
            return _("Goodbye")

Another way is to use Django's ugettext_lazy function. Instead of returning
a string, it returns a lazy object that will wait to do the lookup until it is
actually used as a string::

    from django.utils.translation import ugettext_lazy as _

This can be tricky because the lazy object doesn't act like a string in all
cases.
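
A rough idea of both the benefit and the pitfall, using a deliberately tiny
stand-in rather than Django's real lazy object (which supports more operations
than this). ``LazyText``, ``CATALOG``, and ``current_language`` are all
hypothetical names for illustration.

```python
# Hypothetical stand-ins for the translation catalog and the active language.
CATALOG = {"es": {"Hello": "Hola"}}
current_language = "en"


class LazyText:
    """Much-simplified lazy translation: the lookup waits until str() is called."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return CATALOG.get(current_language, {}).get(self.text, self.text)


HELLO = LazyText("Hello")   # safe at import time: nothing is looked up yet
current_language = "es"     # later, the user's language becomes known

rendered = str(HELLO)       # the lookup happens here
is_real_string = isinstance(HELLO, str)

try:
    "Greeting: " + HELLO    # this sketch's object cannot be concatenated
    concat_failed = False
except TypeError:
    concat_failed = True
```

Code that expects a genuine ``str`` may need an explicit conversion before it
uses a lazy object.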

The last way to solve the problem is to mark the string so that it will be
extracted properly, but not actually do the lookup when the constant is
defined::

    from django.utils.translation import ugettext

    _ = lambda text: text

    HELLO = _("Hello")
    GOODBYE = _("Goodbye")

    _ = ugettext

    def get_greeting(hello):
        if hello:
            return _(HELLO)
        else:
            return _(GOODBYE)

Here we define ``_()`` as a pass-through function, so the string will be
found during extraction, but won't be translated too early. Then we redefine
``_()`` to be the real translation lookup function, and use it at runtime to
get the localized string.