NextGenerationLocalizationSystem

Summary

Development and deployment of a new i18n data distribution system.

Rationale

The current i18n distribution system is a compromise between usefulness and the technical difficulty of large-scale deployment via apt. It has many shortcomings that can be fixed. Some are technical, but the more important ones are ideological.

Ideological limitations

  • Users are bound to one source of translations
  • Universe packages are updated with new translations only at each release. This causes long delays and leaves packages useless for non-English-speaking users, as new releases bring more new strings to translate.
  • Users are discouraged from getting involved in translation once they understand how long it takes for their changes to propagate. They choose other methods, which undermines the spirit of Launchpad.

Technical limitations

  • Language packs are both too big, since they contain files for applications that are not necessarily installed (thus wasting space), and too small, since they only target packages from main (for technical reasons, universe packages were too large).
  • Language packs were chosen because distributing each translation for each package as another package would be impossible via apt/dpkg. Those systems were not designed to process that kind of information and they would do so inefficiently. Dpkg has huge overhead for one-file packages.

Use cases

Joe is a typical Ubuntu user, but he does not speak English at all. He has a default setup that he rarely changes, plus some extra packages from Universe and from Main. He is happy with it most of the time, but there are many untranslated gaps in the applications he uses. Joe has a friend, Bart, who is fluent in both English and their common language. Bart is a Launchpad activist who translates applications to help people like Joe. Joe will not see most of Bart's work for a long time, and possibly never. Both would like translations to reach users faster.

gnomepl.org is a group of translators that focuses on high-quality translations of GNOME applications. They have active contributors, and updates are available daily. They update upstream translations at every upstream release, similar to the translations for Universe. They would like closer access to their users. Setting up an l10n repository and distributing translations through it would fulfill that need.

Ubuntu ships modified packages as a result of branding. If the proposed system could be implemented upstream-wide, Ubuntu could just provide a tiny repository with Ubuntu-specific strings that would complement existing translations.

Scope

This idea touches nearly every single package from universe and some packages in main.

Design

A system with a front end similar to that of apt is proposed.

l10n-get

l10n-get would provide the following operations:

  • install translations for a given language
  • remove translations for a given language
  • update status of translation providers
  • upgrade translations for all installed languages
  • display which translations have available upgrades
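The operations listed above could map onto an apt-style command line. The following is only a sketch: the subcommand names (in particular `list-upgrades`) and the `language` argument are assumptions made for illustration, since the spec does not fix a command syntax.

```python
import argparse


def build_parser():
    """Build a hypothetical l10n-get command line (names are assumptions)."""
    parser = argparse.ArgumentParser(prog="l10n-get")
    sub = parser.add_subparsers(dest="command", required=True)

    # install/remove act on a single language code, e.g. "pl"
    install = sub.add_parser("install", help="install translations for a language")
    install.add_argument("language")
    remove = sub.add_parser("remove", help="remove translations for a language")
    remove.add_argument("language")

    # update/upgrade/list-upgrades take no arguments in this sketch
    sub.add_parser("update", help="update status of translation providers")
    sub.add_parser("upgrade", help="upgrade translations for all installed languages")
    sub.add_parser("list-upgrades", help="display translations with available upgrades")
    return parser


def parse(argv):
    """Return (command, language); language is None for argument-less commands."""
    args = build_parser().parse_args(argv)
    return args.command, getattr(args, "language", None)
```

With this sketch, `parse(["install", "pl"])` yields the command name and the language code, which a real implementation would then dispatch to the matching operation.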

This application would be driven by a configuration file in /etc/l10n/providers. In general, the following information could be stored there:

i18n http://i18n.ubuntu.com dapper          # The default Distro repository
i18n http://gnomepl.org/i18n gnome2.14      # Translations for 2.14 packages (for example)
i18n http://www.suxx.pl/ubuntu/i18n/ dapper # Translations for a few apps my dad is using
                                            # updated daily
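A provider file in this form could be read with a few lines of code. The sketch below assumes three whitespace-separated fields per entry and `#` starting a comment that runs to the end of the line; the field names `kind`, `url` and `suite` are hypothetical, as the spec does not name the columns.

```python
from typing import List, NamedTuple


class Provider(NamedTuple):
    kind: str   # first field, "i18n" in the examples (field name is an assumption)
    url: str    # base URL of the translation repository
    suite: str  # third field, e.g. "dapper" (field name is an assumption)


def parse_providers(text: str) -> List[Provider]:
    """Parse provider entries, skipping blank and comment-only lines."""
    providers = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        # Assumption: '#' always starts a comment (URLs carry no fragments).
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        providers.append(Provider(*fields))
    return providers


SAMPLE = """\
i18n http://i18n.ubuntu.com dapper          # The default Distro repository
i18n http://gnomepl.org/i18n gnome2.14      # Translations for 2.14 packages
i18n http://www.suxx.pl/ubuntu/i18n/ dapper # Translations for a few apps
                                            # updated daily
"""
```

Calling `parse_providers(SAMPLE)` returns one `Provider` per entry; the comment-only continuation line is ignored.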

l10n-get (server side directory layout)

The directory layout outlined here is not set in stone. If anything is incorrect from a technical point of view, please say why.

Path
    Contained information

/info
    List of supported languages; List of supported update modes (full/daily/periodical)

/full/$lang/package-list
    List of supported packages

/full/$lang/packages/$package.mo
    full .mo file for given language and package

/full/$lang/packages/$package.info
    Last-upgrade timestamp, checksum

/diffs/daily/$lang/$year-$month-$day/$sequence-number.mo.diff
    A special mo-efficient diff against previous day, or against previous sequence number in the same day. This is an incremental diff, similar to --patch-N in arch based systems. All patches are required for an up-to-date version

/diffs/periodical/$lang/$year-$month-$day/to/$year-$month-$day/data.mo.diff
    Periodical diff, useful for rare upgrades. Contains a full diff against one snapshot and another. This might be a bigger file if many things have been modified

/aggregates/$lang/info
    Information about available aggregates and their contents

/aggregates/$lang/info.$aggregate
    External information if the info file should grow too big

/aggregates/$lang/$year-$month-$day/aggregate-pack-$id/
    A package containing multiple .mo files
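To illustrate how a client might address this layout, the sketch below builds URLs for a few of the paths and works out which daily diffs still need to be applied. Several details are assumptions not settled by the layout: dates are written zero-padded as YYYY-MM-DD, sequence numbers are plain integers, and the base URL is the provider root. All function names are hypothetical.

```python
from datetime import date
from typing import Iterable, List, Tuple


def full_mo_url(base: str, lang: str, package: str) -> str:
    """URL of the full .mo file for one language and package."""
    return f"{base.rstrip('/')}/full/{lang}/packages/{package}.mo"


def daily_diff_url(base: str, lang: str, day: date, sequence: int) -> str:
    """URL of one incremental daily diff (date format is an assumption)."""
    return f"{base.rstrip('/')}/diffs/daily/{lang}/{day.isoformat()}/{sequence}.mo.diff"


def periodical_diff_url(base: str, lang: str, start: date, end: date) -> str:
    """URL of the diff between two snapshots."""
    return (f"{base.rstrip('/')}/diffs/periodical/{lang}/"
            f"{start.isoformat()}/to/{end.isoformat()}/data.mo.diff")


def pending_daily_diffs(available: Iterable[Tuple[date, int]],
                        last_applied: Tuple[date, int]) -> List[Tuple[date, int]]:
    """Return the (day, sequence) diffs newer than last_applied, oldest first.

    Daily diffs are incremental, so every one of them has to be applied,
    in order, to reach an up-to-date version.
    """
    return sorted(item for item in available if item > last_applied)
```

A client that has not upgraded for a long time could compare the length of the list returned by `pending_daily_diffs` against some threshold and fetch a single periodical diff or the full .mo file instead.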

Implementation

Code

Data preservation and migration

Outstanding issues

BoF agenda and discussion


CategorySpec

NextGenerationLocalizationSystem (last edited 2008-08-06 16:35:17 by localhost)