Localization of Web Applications With Babel

07 Oct 2012

As the authors of Babel say: Babel is a collection of tools for internationalizing Python applications. In this article I show how to internationalize a Python web application with Babel, CherryPy, Mako and Distribute. I assume basic knowledge of these frameworks and libraries, and of Python.

Integration with CherryPy

CherryPy has been my usual choice of Python web framework and WSGI server for years. It is easy to use, reliable and versatile, which makes it a perfect choice for the following demonstration.

To integrate Babel with CherryPy I use a slightly modified I18nTool by Thorsten Weimann (the original can be found here). Note the file i18ntool.py in the listing below, which contains this I18nTool.

Let's look at this example of a CherryPy application.

-- babelexample
    |-- __init__.py
    |-- config
    |   `-- cherrypy.conf
    `-- i18ntool.py

__init__.py:

import os
import cherrypy
from i18ntool import ugettext as _
from i18ntool import I18nTool

#Initialization of I18nTool
cherrypy.tools.I18nTool = I18nTool(os.path.abspath( __file__ ))

class Greeting(object):

    @cherrypy.expose
    def index(self):
        html = []
        html.append('<a href="switch_sk">' + _('slovak') + '</a>')
        html.append('<a href="switch_en">' + _('english') + '</a>')
        return '\n'.join(html)

    @cherrypy.expose
    def switch_sk(self):
        cherrypy.tools.I18nTool.set_custom_language('sk')
        return self.index()

    @cherrypy.expose
    def switch_en(self):
        cherrypy.tools.I18nTool.set_custom_language('en')
        return self.index()

if __name__ == '__main__':
    # root controller, URL path to mount, path to config file
    cherrypy.quickstart(Greeting(),'/','config/cherrypy.conf')

Notice the _("something") calls. The ugettext function looks up the translation of the text supplied as its argument in the chosen language. If no translation exists, it returns the original text.

The application uses this cherrypy.conf file to configure CherryPy and I18nTool.

cherrypy.conf:

[global]
server.socket_host = "127.0.0.1"
server.socket_port = 8003
server.thread_pool = 10

[/]
tools.encode.on = True
tools.encode.encoding = "utf-8"

# I18nTool enabled
tools.I18nTool.on = True

# default language
tools.I18nTool.default = 'en'

# directory where the gettext data are stored
tools.I18nTool.mo_dir = "i18n"

# name of catalog assigned for our application
tools.I18nTool.domain = 'babelexample'

Creating Catalog

To get translation working, Babel needs a catalog of messages for each language. Here is how to create one.

First you need to extract the messages to be translated. This command extracts every message you passed as an argument to ugettext into a catalog template.

$ pybabel extract -o ./babelexample/i18n/dict.pot ./babelexample

The catalog template is now stored in dict.pot. To create a catalog for a specific language, run a command like this.

$ pybabel init -i ./babelexample/i18n/dict.pot -D babelexample -d ./babelexample/i18n -l sk

After that, you will find the catalog at ./babelexample/i18n/sk/LC_MESSAGES/babelexample.po. Open babelexample.po in your favorite text editor and fill in the msgstr entries with the correct translations. When you are finished, delete the line that contains "#, fuzzy" (it should be line 6). I edited my catalog like this.

babelexample.po:

# Slovak translations for PROJECT.
# Copyright (C) 2012 ORGANIZATION
# This file is distributed under the same license as the PROJECT project.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2012.
#
msgid ""
msgstr ""
"Project-Id-Version: PROJECT VERSION\n"
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
"POT-Creation-Date: 2012-09-29 21:54+0200\n"
"PO-Revision-Date: 2012-09-29 21:54+0200\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: sk <LL@li.org>\n"
"Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && "
"n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2)\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 0.9.6\n"

#: babelexample/__init__.py:14
msgid "slovak"
msgstr "slovensky"

#: babelexample/__init__.py:15
msgid "english"
msgstr "anglicky"
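Removing the "#, fuzzy" line matters because pybabel compile skips a catalog whose header is still marked fuzzy, unless you pass the use-fuzzy option. The small check below is a sketch for spotting a forgotten flag before compiling. The function name header_is_fuzzy is made up for this article and the parsing is deliberately naive, it is not part of Babel.

```python
def header_is_fuzzy(po_text):
    """Return True if the header entry (msgid "") carries a fuzzy flag."""
    flags = []
    for line in po_text.splitlines():
        stripped = line.strip()
        if stripped.startswith('#,'):
            # Flag comments apply to the entry that follows them.
            flags.extend(flag.strip() for flag in stripped[2:].split(','))
        elif stripped.startswith('msgid'):
            # The first msgid in a catalog is the header entry.
            return stripped == 'msgid ""' and 'fuzzy' in flags
    return False


fresh = '# Slovak translations for PROJECT.\n#\n#, fuzzy\nmsgid ""\nmsgstr ""\n'
edited = '# Slovak translations for PROJECT.\n#\nmsgid ""\nmsgstr ""\n'

print(header_is_fuzzy(fresh))   # freshly initialized catalog
print(header_is_fuzzy(edited))  # catalog after the flag line was deleted
```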

Finally, run the following command to compile the catalog.

$ pybabel compile -D babelexample -d ./babelexample/i18n -l sk
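The compiled catalog ends up next to its source, following the usual gettext layout of base directory, locale, LC_MESSAGES and domain. The helper below only spells out that naming convention. The function name catalog_paths is an illustration and not something Babel provides.

```python
import os


def catalog_paths(base_dir, domain, locale):
    """Return the (.po, .mo) paths for one locale in the gettext layout."""
    messages_dir = os.path.join(base_dir, locale, 'LC_MESSAGES')
    po_path = os.path.join(messages_dir, domain + '.po')
    mo_path = os.path.join(messages_dir, domain + '.mo')
    return po_path, mo_path


po_path, mo_path = catalog_paths('babelexample/i18n', 'babelexample', 'sk')
print(po_path)
print(mo_path)
```

The domain and directory here are the same values that cherrypy.conf passes to I18nTool as domain and mo_dir.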

Now you can run the application and test whether everything works.

$ python ./__init__.py

The application should now be available at http://localhost:8003/.

Integration with Mako

Most web applications use some template language for HTML rendering. I like to use Mako. Its syntax is similar to Python, it has all the features you usually need, and it is quite fast. Some argue that its syntax is less elegant than Jinja2's, but I am OK with that.

So let's rewrite the HTML generated by the Greeting.index method as a Mako template:

index.html:

<html>

<head>
<title>${_('Babel Example')}</title>
</head>

<body>
<a href="switch_sk">${_('Slovak')}</a> | <a href="switch_en">${_('English')}</a>
<p>${_('Just some text')}</p>
</body>

</html>

In the application you need to set up Mako's TemplateLookup class and rewrite the Greeting.index method to render the page with Mako.

import os
import cherrypy
from i18ntool import ugettext as _
from i18ntool import I18nTool
from mako.lookup import TemplateLookup

#Initialization of I18nTool
cherrypy.tools.I18nTool = I18nTool(os.path.abspath( __file__ ))

class Greeting(object):

    # configure Mako TemplateLookup to search for templates in './templates',
    # use UTF-8 for both input and output encoding and import ugettext as _;
    # Mako compiles the templates and stores the compiled modules
    # in the /tmp/babelexample_templates directory
    lookup = TemplateLookup(directories='./templates',
            module_directory='/tmp/babelexample_templates', collection_size=500,
            disable_unicode=False, input_encoding='utf-8',output_encoding='utf-8',
            default_filters=['decode.utf8'],
            imports=['from i18ntool import ugettext as _'])

    @cherrypy.expose
    def index(self):
        return self.lookup.get_template('index.html').render()

    @cherrypy.expose
    def switch_sk(self):
        cherrypy.tools.I18nTool.set_custom_language('sk')
        return self.index()

    @cherrypy.expose
    def switch_en(self):
        cherrypy.tools.I18nTool.set_custom_language('en')
        return self.index()

if __name__ == '__main__':
    # root controller, URL path to mount, path to config file
    cherrypy.quickstart(Greeting(),'/','config/cherrypy.conf')

When you run the application now, you will find that some messages are not translated. The reason is simple: you need to extract the messages from the Mako template, then update and compile the catalog. For that you need a mapping file that tells Babel where the template files are and that it must use a special method to extract messages from Mako templates.

babel.cfg:

# Extraction from Mako templates ...
[mako: babelexample/templates/**.html]
input_encoding = utf-8

Now you need to extract the messages from the Mako template. The commands differ slightly from the last example. Start by extracting the messages and updating the catalog.

$ pybabel extract -F babel.cfg -o ./babelexample/i18n/dict.pot ./
$ pybabel update -i ./babelexample/i18n/dict.pot -D babelexample -d ./babelexample/i18n -l sk

The last command merged the old catalog with the new messages. In this case the old messages from __init__.py were removed and the new ones from the index.html template were added. My babelexample.po file looks like this after editing:

babelexample.po:

# Slovak translations for PROJECT.
# Copyright (C) 2012 ORGANIZATION
# This file is distributed under the same license as the PROJECT project.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2012.
#
msgid ""
msgstr ""
"Project-Id-Version: PROJECT VERSION\n"
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
"POT-Creation-Date: 2012-10-06 20:25+0200\n"
"PO-Revision-Date: 2012-10-06 19:28+0200\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: sk <LL@li.org>\n"
"Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && "
"n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2)\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 0.9.6\n"

#: babelexample/templates/index.html:4
msgid "Babel Example"
msgstr "Babel ukážka"

#: babelexample/templates/index.html:8
msgid "Slovak"
msgstr "slovensky"

#: babelexample/templates/index.html:8
msgid "English"
msgstr "anglicky"

#: babelexample/templates/index.html:9
msgid "Just some text"
msgstr "Iba nejaký text"
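The merge done by pybabel update can be pictured roughly like this. It is a simplified model written for this article, not Babel's implementation: the real command is more careful and can, for example, match similar messages and mark them as fuzzy. The function name merge_catalog and the sample data are illustrative only.

```python
def merge_catalog(old_translations, template_msgids):
    """Merge existing translations with the message ids of a new template.

    Returns (merged, obsolete): merged holds every template message, with
    its old translation if there was one and an empty string otherwise;
    obsolete holds old entries that no longer appear in the template.
    """
    merged = {msgid: old_translations.get(msgid, '')
              for msgid in template_msgids}
    obsolete = {msgid: text for msgid, text in old_translations.items()
                if msgid not in merged}
    return merged, obsolete


old = {'slovak': 'slovensky', 'english': 'anglicky'}
template = ['Babel Example', 'slovak', 'Just some text']

merged, obsolete = merge_catalog(old, template)
print(merged)    # new messages start out untranslated
print(obsolete)  # messages that disappeared from the sources
```

The entries with an empty translation are the ones you still have to fill in by hand before compiling.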

After you compile the catalog with the following command, translation should work for all messages.

$ pybabel compile -D babelexample -d ./babelexample/i18n -l sk

Integration with Distribute

If you distribute your application with Distribute, or plan to, you will run into a small issue that a simple setup.py file will not solve. The problem is the .mo catalog files, which need to be compiled as part of the build process, before the install command runs.

Luckily, the creators of Babel thought of that, so Babel integrates with Distribute easily. Even so, making the setup.py build command compile the catalogs automatically is tricky. I worked it out this way:

from setuptools import setup
from distutils.command.build import build as build_
from babel.messages.frontend import compile_catalog, extract_messages, update_catalog, init_catalog
from distutils.cmd import Command

# this command compiles all catalogs (dictionaries) for the configured locales
class compile_all_catalogs(Command):

    description = 'compile message catalogs for all languages to binary MO files'
    user_options = [
        ('domain=', 'D',
         "domain of PO file (default 'messages')"),
        ('directory=', 'd',
         'path to base directory containing the catalogs'),
        ('locales=', 'l',
         'comma-separated list of locales of the catalogs to compile'),
        ('use-fuzzy', 'f',
         'also include fuzzy translations'),
        ('statistics', None,
         'print statistics about translations')
    ]
    boolean_options = ['use-fuzzy', 'statistics']

    def initialize_options(self):
        self.domain = None
        self.directory = None
        self.locales = None
        self.use_fuzzy = False
        self.statistics = False

    def finalize_options(self):
        # locales arrives as a comma-separated string such as "sk, en";
        # str.strip() replaces string.strip, which exists only in Python 2
        self.locales = [locale.strip()
                        for locale in (self.locales or '').split(',')
                        if locale.strip()]

    def run(self):
        for locale in self.locales:
            compiler = compile_catalog(self.distribution)

            compiler.initialize_options()
            compiler.domain = self.domain
            compiler.directory = self.directory
            compiler.locale = locale
            compiler.use_fuzzy = self.use_fuzzy
            compiler.statistics = self.statistics
            compiler.finalize_options()

            compiler.run()

# This is a modification of the build command: compile_all_catalogs
# is inserted as its first sub-command
class build(build_):
    sub_commands = build_.sub_commands[:]
    sub_commands.insert(0, ('compile_all_catalogs', None))

setup(
    name = 'babelexample',
    version = "0.1.0",
    packages = [
        'babelexample',
    ],

    # package data, configuration file, templates, catalogs
    package_data = {
        'babelexample': [
            'config/cherrypy.conf',
            'templates/*.html',
            'i18n/sk/LC_MESSAGES/babelexample.mo',
            'i18n/en/LC_MESSAGES/babelexample.mo',
            ],
    },

    # new commands added and build command modified
    cmdclass = {
        'build': build,
        'compile_catalog': compile_catalog,
        'extract_messages': extract_messages,
        'update_catalog': update_catalog,
        'init_catalog': init_catalog,
        'compile_all_catalogs': compile_all_catalogs
    },

    # dependencies of application
    install_requires = [
        'cherrypy>=3.1.2',
        'mako>=0.3.3',
        'babel>=0.9',
        ],
    # required packages for build process
    setup_requires = [
        'babel>=0.9',
    ],
)

The newly added commands take a lot of arguments. It would be tedious to type them every time, and, what is worse, the modified build command would not work at all without them. To solve this I added a setup.cfg configuration file that supplies the correct arguments for the added commands.

setup.cfg:

[extract_messages]
keywords = _
mapping_file = babel.cfg
output-file = babelexample/i18n/dict.pot
input-dirs = .

[init_catalog]
domain = babelexample
input-file = babelexample/i18n/dict.pot
output-dir = babelexample/i18n

[update_catalog]
domain = babelexample
input-file = babelexample/i18n/dict.pot
output-dir = babelexample/i18n

[compile_catalog]
domain = babelexample
directory = babelexample/i18n

[compile_all_catalogs]
locales = sk
domain = babelexample
directory = babelexample/i18n
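The locales value in the [compile_all_catalogs] section is a plain string, so several locales can be given separated by commas, for example "sk, en". The sketch below shows how such a value can be split into clean locale codes, in the same spirit as finalize_options in the setup.py above. The function name parse_locales exists only for this illustration.

```python
def parse_locales(value):
    """Split a comma-separated option value into a list of locale codes."""
    if not value:
        # The option was not set at all.
        return []
    # Strip surrounding whitespace and drop empty items such as a
    # trailing comma would produce.
    return [item.strip() for item in value.split(',') if item.strip()]


print(parse_locales('sk'))
print(parse_locales('sk, en'))
```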

The tree structure of package before build looks like this:

|-- babel.cfg
|-- babelexample
|   |-- config
|   |   `-- cherrypy.conf
|   |-- i18n
|   |   |-- dict.pot
|   |   `-- sk
|   |       `-- LC_MESSAGES
|   |           `-- babelexample.po
|   |-- i18ntool.py
|   |-- __init__.py
|   `-- templates
|       `-- index.html
|-- setup.cfg
`-- setup.py

Finally, run:

$ python setup.py build

This should now install Babel if needed and compile the catalogs before building the package. The finished application can be downloaded here.
