Mallard and Ubuntu-Translators and Translation Regressions

Thu Feb 25 17:56:35 UTC 2010

Shaun McCance wrote:
> On Thu, 2010-02-25 at 09:44 -0500, Kyle Nitzsche wrote:
>   
>> Milo Casagrande wrote:
>>     
>>> 2010/2/25 Milo Casagrande <milo at casagrande.name>:
>>>   
>>>       
>>>> But if you start having something like:
>>>> <para>
>>>>  Put some text here with a lot of Docbook tags,
>>>> <guimenu>XYZ</guimenu>, <emphasis>YXZ</emphasis>,
>>>> <application>ZYX</application>.
>>>> </para>
>>>> or lists, or tables, you are going to lose your translations.
>>>>     
>>>>         
>>> Hmmm... I think with lists or tables we can be safe (as long as they
>>> are not mixed with other tags), so those were bad examples.
>>> But I hope you got the point.
>>>
>>>   
>>>       
>> Excellent point. The issue is that DocBook tags are (at least sometimes) 
>> embedded (by xml2po) in the text to be translated, and since Mallard 
>> tags are often different, a potentially large number of strings 
>> will be affected (short of a solution).
>>     
>
> I don't know how the LP translation system works, but with xml2po
> you rarely see block or section elements in the PO files.  Unless
> you nest block content, you should mostly see only inline elements.
>   
Launchpad (LP) translates PO files straightforwardly, so I think it is 
orthogonal to the solution.
> The hard part of going from DocBook to Mallard is the structural
> elements.  Converting a DocBook inline context to Mallard should
> be pretty straightforward.
>
> So it occurs to me that somebody could write a tool that reads in
> a DocBook-based PO file and converts the msgid and msgstr of only
> those messages which have tags from a certain well-known set.
>   
I could write that easily (and would be happy to do so) if the data and 
rules (DocBook tag -> Mallard tag) are known and clear. Then there's the 
question of how and where it gets integrated into the transition process.
>
> I just did a quick grep on the Gnome User Guide and found only 46
> elements that appear in the PO files.  Of these, I'd say roughly
> half could be reliably automatically converted.
>   
That list sounds like the start of the data & rules needed for 
auto-conversion.

That still leaves half that would represent manual work for docs folks 
to convert.
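
To make the auto-conversion idea concrete, here is a minimal sketch in 
Python, under some loud assumptions: the tag table below is only an 
illustrative guess at a few DocBook -> Mallard inline pairs (it is not the 
list from the Gnome User Guide grep, and not an agreed rule set), and the 
function name convert_message is hypothetical. A message is converted only 
if every tag in it is in the table and carries no attributes; anything 
else is left for manual work.

```python
import re

# Illustrative DocBook -> Mallard inline mapping. This table is an
# assumption for the sketch, not an agreed or complete rule set.
TAG_MAP = {
    "emphasis": "em",
    "application": "app",
    "guimenu": "gui",
    "guilabel": "gui",
    "filename": "file",
    "command": "cmd",
}

# Matches an opening or closing tag: (slash, name, rest-of-tag).
TAG_RE = re.compile(r"<(/?)([A-Za-z][\w.-]*)([^<>]*)>")


def convert_message(text):
    """Rename DocBook inline tags to Mallard ones.

    Returns the converted string, or None if the message contains a tag
    outside TAG_MAP or a tag with attributes (not safely convertible).
    """
    for _slash, name, rest in TAG_RE.findall(text):
        if name not in TAG_MAP or rest.strip():
            return None
    return TAG_RE.sub(
        lambda m: "<%s%s>" % (m.group(1), TAG_MAP[m.group(2)]), text
    )


def convert_entry(msgid, msgstr):
    """Convert a PO entry only if both msgid and msgstr are convertible."""
    new_id = convert_message(msgid)
    new_str = convert_message(msgstr)
    if new_id is None or new_str is None:
        return None
    return new_id, new_str
```

For example, convert_message("<emphasis>YXZ</emphasis>") gives 
"<em>YXZ</em>", while a message containing <table> or 
<emphasis role="bold"> returns None and stays in the manual pile.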
> The utility of this depends, of course, on writers doing the most
> obvious conversion of their content.  But even if the converted
> messages don't match, merge tools will mark them as either fuzzy
> or unused, so there's no harm in having them there.
>   
But then they would be translation regressions, I think?
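
To illustrate the fuzzy/unused point, here is a rough Python sketch of 
what a merge does with strings that no longer match exactly. It is only an 
approximation: it uses difflib from the standard library with a 0.6 
cutoff, which is an assumption and not the matching algorithm or threshold 
that msgmerge actually uses, and the merge function is hypothetical. The 
shape of the result is the relevant part: a near match carries the old 
translation over flagged as fuzzy, and old entries that nothing matched 
end up unused. Fuzzy entries are skipped by msgfmt by default, so until 
someone reviews them they do not show up as translated.

```python
import difflib


def merge(old, new_msgids, cutoff=0.6):
    """Merge old translations into a new list of msgids.

    old: dict mapping msgid -> msgstr.
    Returns (merged, unused), where merged maps each new msgid to a
    (msgstr, status) pair and unused holds old entries nothing matched.
    The cutoff is an assumed value for this sketch.
    """
    merged = {}
    used = set()
    for msgid in new_msgids:
        if msgid in old:
            # Exact match: translation carries over untouched.
            merged[msgid] = (old[msgid], "ok")
            used.add(msgid)
            continue
        close = difflib.get_close_matches(msgid, list(old), n=1, cutoff=cutoff)
        if close:
            # Near match: reuse the old translation, but flag it for review.
            merged[msgid] = (old[close[0]], "fuzzy")
            used.add(close[0])
        else:
            merged[msgid] = ("", "untranslated")
    unused = {k: v for k, v in old.items() if k not in used}
    return merged, unused
```

With an old entry for "Click <guimenu>File</guimenu>." and a new msgid 
"Click <gui>File</gui>.", the old translation is kept but marked fuzzy, 
which is the case the question above is about.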

Cheers,
Kyle
> --
> Shaun
>
>
>