[mir-coders] mir, i18n, and a proposal
kellan
kellan at protest.net
Sun Oct 14 14:49:28 PDT 2001
(i thought i sent this out on Friday, but go away for 2 days, come
back today, and it's in my postponed queue, grrrr)
Apologies in advance for a long email. The problem with having
too much time to write without a net connection.
** Intro: Where we are
So Marc seems to have my original assignment (expanded file
uploads) well under control, and with style! Nice job. Besides,
I haven't really gotten a handle on the whole producer
thang.
Which set me wondering what I might be able to add to Mir.
And I decided that, as the non-German speaker among us, perhaps I
should work on improving Mir's internationalization support
(hereafter referred to as i18n: an I, 18 letters, and then an N).
Currently Mir handles i18n by simply having parallel templates
for each supported language. (a separate but equal strategy)
This has some advantages. It's easy to implement and
straightforward to comprehend. It allows for radical customization
on a per-locale basis (e.g. you can have a special photo-heavy
design for countries like the US where they don't teach us how to
read ;) File contention is kept to a minimum both during
deployment and at translation time.
** Problem: An overview
So what's wrong with it? Well, being a non-German speaker, I've
felt acutely the pains of having templates fall out of sync.
The struggles with merely posting have really slowed down my
ability to contribute up until this point. This raised a red
flag for me: if we were already having trouble keeping the
templates in sync, imagine once we had 2 dozen installed sites,
a dozen programmers, and numerous and sundry activist cum web
designers hacking on things.
So I started thinking about all the things wrong with the current
model. (btw. I just realized that raising a red flag might have
different connotations on this project, but stick with me, okay :)
** The Current Problems: a selected list
Bits that need to be translated are hard to find, buried either
in markup or in Java code. This makes the translators' job (human
or automated) very difficult and error-prone.

No clear path for deploying functional improvements from one
localized set of templates to another (for try as we might, we
will always have some blending of presentation and logic).

No reuse. If my English templates look different than your
English templates, I can't gain the benefits of your
translations.

No failover. If I don't have the English template for a section
I'm screwed: I can't get the pieces we do have in English, and the
rest in German.

Duplication scales badly. If we roll out a low-bandwidth version
of the sites (as there is talk of doing) then we'll double the
amount of translation work that needs to be done. If we roll out
a VoiceXML version (purely hypothetical) then we've tripled the
work.
** A Proposal: you hoped there was one coming
So what's the solution? Well, I don't know what *the* solution is,
but *a* solution is to use ResourceBundles (the Java equivalent
of gettext).
I'm going to assume that most of us are acquainted with
ResourceBundles (Mir currently uses them for its config file),
but basically ResourceBundles are an API for handling text lookup
and replacement based on Locales, where the most common
implementation is a set of funnily named flat files of key=value
pairs, one per locale (again, just like Mir's config). (If anybody
needs/wants a quick intro to ResourceBundles, I can probably
provide one.)
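
For anyone who wants to see the lookup in action, here is a minimal sketch using the standard java.util.ResourceBundle API. The bundle name "Messages", the keys, and the texts are made up for illustration, and the two "files" live in an in-memory map (loaded through a custom ResourceBundle.Control) only so the example runs without anything on the classpath. In a real setup they would be Messages.properties and Messages_de.properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class BundleBasics {
    // Stand-ins for properties files, named baseName[_language].
    // Names, keys, and texts are hypothetical.
    static final Map<String, String> FILES = Map.of(
        "Messages",    "hello.world=Hello, world\nbutton.submit=Publish\n",
        "Messages_de", "hello.world=Hallo, Welt\n");

    // Reads the "files" from the map above instead of from disk.
    static final ResourceBundle.Control IN_MEMORY = new ResourceBundle.Control() {
        @Override public List<String> getFormats(String baseName) {
            return FORMAT_PROPERTIES;
        }
        @Override public Locale getFallbackLocale(String baseName, Locale locale) {
            return null; // do not fall back to the JVM's default locale
        }
        @Override public ResourceBundle newBundle(String baseName, Locale locale,
                String format, ClassLoader loader, boolean reload) throws IOException {
            String text = FILES.get(toBundleName(baseName, locale));
            return text == null
                ? null
                : new PropertyResourceBundle(new StringReader(text));
        }
    };

    static String lookup(Locale locale, String key) {
        return ResourceBundle.getBundle("Messages", locale, IN_MEMORY).getString(key);
    }

    public static void main(String[] args) {
        System.out.println(lookup(Locale.GERMAN, "hello.world"));   // from Messages_de
        System.out.println(lookup(Locale.GERMAN, "button.submit")); // missing in de, taken from the base bundle
        System.out.println(lookup(Locale.ENGLISH, "hello.world"));  // no Messages_en, so the base bundle
    }
}
```

Note that a key missing from the German bundle is answered by the base bundle, while a key missing everywhere makes getString throw MissingResourceException.
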
I've got some code, mostly borrowed from the Struts project, to
allow some neat loading, caching, and per-key failover of
ResourceBundles with a clean API
(org.apache.struts.util.MessageResources).
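
To make "loading, caching, and per-key failover" concrete, here is a rough sketch of the idea. It is not the Struts code and does not use its API: the class name FailoverMessages, the loader function, and the locale tags are all assumptions made for illustration. Each locale's messages are loaded at most once, and a key is tried against language_COUNTRY, then language, then the default bundle.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; NOT org.apache.struts.util.MessageResources.
class FailoverMessages {
    // Maps a locale tag ("de_AT", "de", "" for the default) to its
    // messages, or returns null if there is no file for that tag.
    private final Function<String, Map<String, String>> loader;
    private final Map<String, Map<String, String>> cache = new HashMap<>();
    int loads = 0; // exposed only so the caching is observable

    FailoverMessages(Function<String, Map<String, String>> loader) {
        this.loader = loader;
    }

    // Load a locale's messages on first use, then serve them from the cache.
    private Map<String, String> messagesFor(String tag) {
        Map<String, String> messages = cache.get(tag);
        if (messages == null) {
            loads++;
            messages = loader.apply(tag);
            if (messages == null) messages = Map.of();
            cache.put(tag, messages);
        }
        return messages;
    }

    // Per-key failover: most specific locale first, default bundle last.
    // Returns null when no bundle knows the key.
    String getMessage(Locale locale, String key) {
        String lang = locale.getLanguage();
        String full = locale.getCountry().isEmpty()
            ? lang : lang + "_" + locale.getCountry();
        for (String tag : new String[] { full, lang, "" }) {
            String value = messagesFor(tag).get(key);
            if (value != null) return value;
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> files = Map.of(
            "",   Map.of("hello.world", "Hello, world", "button.submit", "Publish"),
            "de", Map.of("hello.world", "Hallo, Welt"));
        FailoverMessages messages = new FailoverMessages(files::get);
        Locale austria = Locale.forLanguageTag("de-AT");
        System.out.println(messages.getMessage(austria, "hello.world"));   // found in "de"
        System.out.println(messages.getMessage(austria, "button.submit")); // found in the default
    }
}
```
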
There are at least 2 possible ways we could implement this.

A. Load all the properly localized strings into FreeMarker's
modelRoot as SimpleScalars. Then when an html author wants to
include some localized text, they simply use ${hello.world}, which
is replaced appropriately.

B. Create a FreeMarker TemplateMethodModel that hooks into the
MessageResources. We then register the method with the
modelRoot, and when the html author wants that localized text
they instead call ${message("hello.world")}

Option B looks a little more cluttered, but I'm leaning towards
it heavily.
Lazy loading seems like a really good idea for what could
potentially be a lot of information.

It keeps the work our servlet does up front to a minimum.

It allows us to do neat things like ${message("hello.world",
"Mar")}, where message("hello.world") actually returns a
java.text.MessageFormat pattern and we substitute the extra
arguments for its placeholders (like ? in SQL statements).

(Of course my concerns about lazy loading and avoiding up-front
work might be way off in a non-dynamic environment, something I
haven't entirely wrapped my head around.)
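
Here is a rough sketch of what option B's method could do with its arguments. The TemplateMethod interface below is a stand-in I made up so the example compiles on its own; it is not FreeMarker's TemplateMethodModel, and the patterns map stands in for whatever MessageResources would return. The only real API used is java.text.MessageFormat, whose patterns mark placeholders as {0}, {1}, and so on.

```java
import java.text.MessageFormat;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class MessageMethodSketch {
    // Stand-in for the template engine's method hook.
    // Hypothetical; NOT FreeMarker's actual interface.
    interface TemplateMethod {
        String exec(List<String> arguments);
    }

    // Builds the "message" method: the first argument is the key, any
    // further arguments fill the {0}, {1}, ... placeholders.
    static TemplateMethod messageMethod(Map<String, String> patterns, Locale locale) {
        return arguments -> {
            String key = arguments.get(0);
            String pattern = patterns.get(key); // looked up only when the template asks
            if (pattern == null) {
                return "???" + key + "???"; // my choice of marker for a missing key
            }
            Object[] params = arguments.subList(1, arguments.size()).toArray();
            return new MessageFormat(pattern, locale).format(params);
        };
    }

    public static void main(String[] args) {
        TemplateMethod message = messageMethod(
            Map.of("hello.world", "Hello, {0}!"), Locale.ENGLISH);
        // Roughly what ${message("hello.world", "Mar")} would render:
        System.out.println(message.exec(List.of("hello.world", "Mar"))); // Hello, Mar!
    }
}
```

Nothing is resolved until a template actually calls the method, which is the lazy behaviour argued for above. One thing to watch for: a single quote is an escape character in MessageFormat patterns, so translators have to double it.
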
** Some related ideas:
I've got a friend at IBM's alphaWorks who wrote a Swing-based app
for handling a tree of ResourceBundles. It includes nice
statistics and visual feedback on what has, and has not, been
translated. I'll see if I can get a copy.
We should probably have a separate ResourceBundle for error
messages that we as developers start using.
As we start using the RBs more, they'll grow large and unwieldy.
I'm not sure what the solution is for this. Perhaps we want to
start from the beginning breaking things out functionally
(FormLabels.properties, GifText.properties,
ErrorMessages.properties). Not quite sure how to support this as
transparently as I would like for the html authors.
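
One possible way to keep the split invisible to html authors, sketched under assumptions: search the functional bundles in a fixed order and return the first hit, so templates only ever deal in keys. BundleGroup is a hypothetical name, and plain maps stand in for the loaded .properties files.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: several functional bundles behind one lookup.
class BundleGroup {
    // Bundle name -> its key=value pairs, kept in registration order.
    private final Map<String, Map<String, String>> bundles = new LinkedHashMap<>();

    BundleGroup add(String name, Map<String, String> messages) {
        bundles.put(name, messages);
        return this;
    }

    // Returns the first match across all bundles, or null if none has the key.
    String get(String key) {
        for (Map<String, String> messages : bundles.values()) {
            String value = messages.get(key);
            if (value != null) return value;
        }
        return null;
    }

    public static void main(String[] args) {
        BundleGroup group = new BundleGroup()
            .add("FormLabels", Map.of("form.title", "Title"))
            .add("ErrorMessages", Map.of("error.notfound", "Article not found"));
        System.out.println(group.get("form.title"));
        System.out.println(group.get("error.notfound"));
    }
}
```

The catch is that keys have to be unique across files (a per-file prefix like form. or error. would do), otherwise the earlier bundle silently wins.
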
If, and when, we get web-based template editing, we'll need
web-based RB editing as well. An alternative I found in one
system, based on Zope's DTML instead of FreeMarker, actually had
a form of the message tag, "message_add" instead of "message", that
took the bundle property and a default value, and inserted the
default if the property wasn't found. That seemed like a neat
approach. (And it was written by a consulting company in Basque
country, where I regularly heard at least 3 languages spoken.)
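
The "message_add" behaviour is small enough to sketch. This is only my reading of the idea, not that system's code; EditableMessages and its method names are hypothetical, and an in-memory map stands in for the bundle that would eventually be written back to a .properties file.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a bundle that learns new keys from templates.
class EditableMessages {
    private final Map<String, String> messages = new TreeMap<>();

    // Plain lookup: returns null for an unknown key.
    String message(String key) {
        return messages.get(key);
    }

    // Like message(), but registers defaultValue the first time a key
    // is seen, so it shows up for translators afterwards.
    String messageAdd(String key, String defaultValue) {
        return messages.computeIfAbsent(key, k -> defaultValue);
    }

    public static void main(String[] args) {
        EditableMessages bundle = new EditableMessages();
        System.out.println(bundle.messageAdd("hello.world", "Hello, world")); // inserted
        System.out.println(bundle.messageAdd("hello.world", "Ignored"));      // existing value kept
    }
}
```
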
Freemarker 1.6.2 has translations. So we could have something
like:
<translation messages>
Lots of text here
</translation messages>
Which might be good to support in some cases, but I think is a
version 2.0 feature :)
** So how is it coming?
http://riseup.net/~kellan/i18n/
Just some quick code I threw together, but I think it works well.
** What next?
Feedback on what you think of my analysis, my proposal, the
implementation details, commitment to the migration to using
ResourceBundles, previous experience with similar ideas,
counter-proposals, or anything else is always welcome.
kellan
--
Dotcoms didn't make the world a better place.
They weren't trying to. Rest in Peace.
kellan at protest.net