gettext handling of LRO and RLO
Yaron Shahrabani
sh.yaron at gmail.com
Tue Dec 28 20:02:06 GMT 2010
Now I get it: this is a really broad issue involving the gettext parser and
the msgfmt compiler. I guess we should continue the discussion somewhere the
gettext devs can see it.
What about parsers such as those in Poedit and Virtaal? They should also be
able to read these files regardless of Unicode directionality characters, or
maybe they already can (I haven't checked yet).
I tried contacting another Israeli developer who is also a gettext developer,
but I haven't received an answer yet. We can wait another day or two before
contacting him again about this issue.
Kind regards,
Yaron Shahrabani
<Hebrew translator>
On Sun, Dec 26, 2010 at 2:41 PM, Chris Scaife <scaife.chris at gmail.com> wrote:
> Entering data in a computer serves more purpose than simply having it
> regurgitated to LOOK the same on the screen to a human reader: We would like
> the computer to actually process it correctly and consistently.
>
> Say a database holds the equivalent of the name "Sprocket123" but using an R2L
> alphabet.
>
> Then thanks to the Unicode bidi algorithm, on the screen a sequence of
> characters appears to be "Sprocket321". I could also enter
> "<RLO>Sprocket123<PDF>" or "<LRO>321tekcorpS<PDF>" to get that very same
> appearance... and quite a few other combinations as well.
>
> To HUMAN readers all of these different combinations are indistinguishable
> and exactly the same thing, yet to the COMPUTER program algorithms they are
> all completely different.
>
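To make the point above concrete, here is a minimal Python sketch. It is not
from the original message: the helper name strip_bidi_controls and the sample
strings are illustrative assumptions. It builds strings that a bidi-aware
renderer would be expected to draw identically, then shows that a plain string
comparison treats them as different.

```python
# Explicit directional formatting characters defined by Unicode.
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Invisible characters that only influence display direction.
BIDI_CONTROLS = {
    "\u200e", "\u200f",                                # LRM, RLM
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
}


def strip_bidi_controls(text: str) -> str:
    """Hypothetical helper: drop directional formatting characters."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


plain = "Sprocket123"
forced_ltr = LRO + "Sprocket123" + PDF    # same logical order, wrapped in LRO
reversed_rtl = RLO + "321tekcorpS" + PDF  # reversed logical order, wrapped in RLO

# All three should look alike on screen, yet none compares equal to another.
print(plain == forced_ltr, plain == reversed_rtl, forced_ltr == reversed_rtl)

# Stripping the controls reconciles only the variant stored in the same order.
print(strip_bidi_controls(forced_ltr) == plain)
print(strip_bidi_controls(reversed_rtl) == plain)
```

Note that stripping the control characters does not help with the RLO variant,
because its characters are stored in reverse order. That is presumably why a
lookup keyed on such text can fail even after naive cleanup.
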
> Software source code, be it in C or any other programming language, is
> mostly written by people in a left-to-right tradition without ANY need to
> embed right-to-left characters or any incentive to consider them in our
> algorithms: a string is just a sequence of characters. Thus consideration of
> how programming languages should handle directionality overrides is not very
> important.
>
> OTOH it is of paramount importance in translation files such as the ones
> submitted to gettext. IMO you really must not treat these two very
> different file types as the same issue, and placing the directionality
> overrides inside the quotes is IMO the worst possible solution.
>
> Anyway as far as I'm concerned I've solved that issue for my own project:
> It now has the capability to handle R2L correctly under full control of the
> translator but transparently to the person doing the programming.
>
> My OWN technique for creating translation files consists of placing LRO or
> RLO at the beginning of each line so that I know exactly what sequences of
> characters I will be generating and then I remove them before submitting to
> msgfmt. Other people can obviously use other tactics... Either way I'm
> finally back onto my bidirectional terminal emulator project :)
>
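The workflow described above (override at the start of each line while
editing, removed again before msgfmt) could be sketched roughly as follows in
Python. This is only an assumption about how such a cleanup step might look,
not Chris's actual tooling; the function name strip_leading_overrides and the
sample catalog text are hypothetical.

```python
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE


def strip_leading_overrides(po_text: str) -> str:
    """Hypothetical helper: remove LRO/RLO found at the very start of a line.

    Only leading overrides are removed, so any directional characters the
    translator placed inside a quoted msgstr are left untouched.
    """
    cleaned = []
    for line in po_text.splitlines(keepends=True):
        cleaned.append(line.lstrip(LRO + RLO))
    return "".join(cleaned)


# Sample catalog fragment with an override in front of each line.
sample = (
    LRO + 'msgid "Open"\n'
    + RLO + 'msgstr "\u05e4\u05ea\u05d7"\n'
)

cleaned = strip_leading_overrides(sample)
print(repr(cleaned))
```

The cleaned text would then be written to a file and compiled as usual, for
example with msgfmt -o messages.mo cleaned.po (both file names are
placeholders).
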
>
> > While there are multiple ways to achieve the very same appearance on the
> > screen, most programs not written with this in mind will consider text with
> > different embedded overrides in different places as completely different
> > text... thus resulting in malfunction on things like a database lookup or
> > even a simple string comparison.
>
> I might need to ask you to explain that again, it could be the late hour
> though?
>
>
> --
> Ubuntu-RTL mailing list
> Ubuntu-RTL at lists.ubuntu.com
> Modify settings or unsubscribe at:
> https://lists.ubuntu.com/mailman/listinfo/ubuntu-rtl
>
>