Multiple "msgid" for an "msgstr" in gettext

Is it possible to map two or more msgids to one msgstr?
For example, both ('list.empty') and ('list.null') should return "There is no any objects yet."
If I write it this way in the .po file:
msgid "list.empty"
msgid "list.null"
msgstr "There is no any objects yet."
it just fails with the error "missing 'msgstr'".
However, this:
msgid "list.empty"
msgstr "There is no any objects yet."
msgid "list.null"
msgstr "There is no any objects yet."
looks and works fine but is fragile: once I change one msgstr without the other, the two ids return different results.
Does anyone have a better hack?

You are approaching gettext the wrong way. Here is how it works:
msgid is required for each entry.
msgctxt is optional and is used to differentiate between two msgid records with the same content that may need different translations.
(msgid, msgctxt) is the unique key for the dictionary; if msgctxt is missing, you can consider it null.
You should read the gettext documentation before implementing this, as it is not always straightforward.
In your case, this is how you are supposed to implement it:
msgctxt "list.empty"
msgid "There is no any objects yet."
msgstr ""

msgctxt "list.null"
msgid "There is no any objects yet."
msgstr ""
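To see the (msgctxt, msgid) key at work, here is a minimal Python sketch. It is not part of the original answer: build_mo is a hypothetical helper that writes a tiny .mo catalog in memory, and the two translated strings are made up for illustration. It assumes Python 3.8 or later, where pgettext is available.

```python
import gettext
import io
import struct


def build_mo(catalog):
    """Serialize {(msgctxt or None, msgid): msgstr} into GNU .mo bytes (hypothetical helper)."""
    entries = {}
    for (ctxt, msgid), msgstr in catalog.items():
        # In a compiled catalog the context is stored as "msgctxt\x04msgid".
        key = msgid if ctxt is None else ctxt + "\x04" + msgid
        entries[key.encode("utf-8")] = msgstr.encode("utf-8")

    keys = sorted(entries)
    count = len(keys)
    header_size = 7 * 4
    strings_start = header_size + 2 * count * 8

    key_table, value_table, blob = [], [], b""
    for key in keys:
        key_table.append((len(key), strings_start + len(blob)))
        blob += key + b"\x00"
    for key in keys:
        value = entries[key]
        value_table.append((len(value), strings_start + len(blob)))
        blob += value + b"\x00"

    # magic, revision, count, key table offset, value table offset, hash size, hash offset
    out = struct.pack("<7I", 0x950412DE, 0, count,
                      header_size, header_size + count * 8, 0, 0)
    for length, offset in key_table + value_table:
        out += struct.pack("<II", length, offset)
    return out + blob


# Same msgid twice, told apart only by msgctxt; the translations are illustrative.
catalog = {
    (None, ""): "Content-Type: text/plain; charset=UTF-8\n",
    ("list.empty", "There is no any objects yet."): "Nothing has been added yet.",
    ("list.null", "There is no any objects yet."): "No list is available.",
}
translations = gettext.GNUTranslations(io.BytesIO(build_mo(catalog)))

empty_text = translations.pgettext("list.empty", "There is no any objects yet.")
null_text = translations.pgettext("list.null", "There is no any objects yet.")
# An unknown context falls back to the msgid itself.
other_text = translations.pgettext("list.other", "There is no any objects yet.")
print(empty_text, null_text, other_text, sep="\n")
```

Both lookups share one source string, so the default text lives in one place, while each context can still be translated separately if that is ever needed.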

Related

Localization via Locale::Maketext::Simple always falls back to default instead of .po entry

In a Perl module I want to use https://metacpan.org/pod/Locale::Maketext::Simple to translate strings into different languages.
My .po files are located under /opt/x/languages, e.g. /opt/x/languages/en.po.
In my module I'm using the following header:
use Locale::Maketext::Simple (
Path => '/opt/x/languages',
Style => 'maketext'
);
loc_lang('en');
An entry in the .po files looks like this:
msgid "string [_1] to be converted"
msgstr "string [_1] is converted"
and the check via console with msgfmt -c en.po throws no errors so far.
But when I'm converting a string with loc() like loc("string [_1] to be converted", "xy") it gives me the output of "string xy to be converted" instead of "string xy is converted" as I would expect it. This looks to me like the .po files are not loaded correctly.
How can I check which .po files are found during maketext instantiation? Or am I mixing things up and there's a general mistake?
Edit 1:
Thanks for the comments, but it still does not work.
I've checked the files with https://poedit.net/ and created the corresponding .mo files (currently for de and en) with this tool as well. They are located next to the .po files (inside /opt/x/languages).
For completeness, my header looks like this:
# MY OWN LANGUAGE FILE (DE)
# 06-2019 by me
#
msgid ""
msgstr ""
"Project-Id-Version: 1.0.0\n"
"POT-Creation-Date: 2019-06-01 00:00+0100\n"
"PO-Revision-Date: 2019-06-02 00:00+0100\n"
"Last-Translator: thatsme <me@me.de>\n"
"Language-Team: unknown\n"
"Language: de\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Generator: Poedit 2.2.3\n"
msgid "string [_1] to be converted"
msgstr "string [_1] is converted"
After some more digging and testing I finally found the issue for this behaviour, so here's my solution so far. Hope this may help others, because you rarely find any good documentation on this topic:
Add libraries
I added the library https://metacpan.org/pod/Locale::Maketext::Simple as stated above, but forgot to add https://metacpan.org/pod/Locale::Maketext::Lexicon.
This took me quite long to see, because there were no exceptions or errors thrown, just... nothing.
In the Maketext::Simple documentation it says
If Locale::Maketext::Lexicon is not present, it implements a minimal localization function by simply interpolating [_1] with the first argument, [_2] with the second, etc.
At first glance this reads as if .po files were loaded even without Locale::Maketext::Lexicon, but in that case the module only substitutes the placeholders and never consults the .po files.
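To make that fallback concrete, here is a rough Python emulation of the behaviour the documentation describes. It is only a sketch, not the module's actual code, and fallback_loc is a hypothetical name.

```python
import re


def fallback_loc(template, *args):
    """Replace [_N] with the N-th argument (1-based); unknown slots stay untouched."""
    def substitute(match):
        index = int(match.group(1)) - 1
        return str(args[index]) if 0 <= index < len(args) else match.group(0)

    return re.sub(r"\[_(\d+)\]", substitute, template)


# No catalog is consulted, so the msgid comes back with only the placeholder filled in.
result = fallback_loc("string [_1] to be converted", "xy")
print(result)
```

This is exactly the symptom from the question: the placeholder is filled, but the msgstr from the .po file never appears.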
Other issues:
I then discovered that all strings are translated, except for the ones with placeholders like [_1]. I could not find a reason for this, but I moved to
Style => 'gettext'
and replaced all [_1], [_2]... with %1, %2... - that works like a charm.
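If there are many entries, the placeholder replacement can be scripted. The following Python sketch is an assumption of mine, not something from the Maketext documentation: it only rewrites plain [_N] placeholders and leaves any other bracket notation (for example [quant,_1,...]) alone, so those would still need manual attention.

```python
import re


def maketext_to_gettext(po_text):
    """Rewrite maketext-style [_N] placeholders as gettext-style %N."""
    return re.sub(r"\[_(\d+)\]", r"%\1", po_text)


sample = (
    'msgid "string [_1] to be converted"\n'
    'msgstr "string [_1] is converted"\n'
)
converted = maketext_to_gettext(sample)
print(converted, end="")
```

Remember that the strings passed to loc() in the Perl code have to be changed the same way, otherwise the msgid lookup no longer matches.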

How to assign particular context to --keyword for proper_name?

When using the xgettext tool, it is possible to automatically add comments that assist translators with proper names (as documented).
The documentation suggests adding the following to the command line:
--keyword='proper_name:1,"This is a proper name. See the gettext manual, section Names."'
Which results in proper names being extracted to the .pot file like this:
#. This is a proper name. See the gettext manual, section Names.
#: ../Foo.cpp:18
msgid "Bob"
msgstr ""
The problem with this is that no particular context has been defined for that string. Ideally, the proper name would be extracted like this:
#. This is a proper name. See the gettext manual, section Names.
#: ../Foo.cpp:18
msgctxt "Proper Name"
msgid "Bob"
msgstr ""
I've tried the following but with no success:
# Hoping that 0 would be the function name 'proper_name'.
--keyword='proper_name:0c,1,"This is a proper name. See the gettext manual, section Names."'
# Hoping that -1 would be the function name 'proper_name'.
--keyword='proper_name:-1c,1,"This is a proper name. See the gettext manual, section Names."'
# Hoping that the string would be used as the context.
--keyword='proper_name:"Proper Name"c,1,"This is a proper name. See the gettext manual, section Names."'
# Hoping that the string would be used as the context.
--keyword='proper_name:c"Proper Name",1,"This is a proper name. See the gettext manual, section Names."'
Is there a way to force a particular msgctxt to be used for all strings extracted with a keyword (such as proper_name from the example above)?
If there is no option to achieve this with xgettext as-is then I considered perhaps using the following:
--keyword='proper_name:1,"<PROPERNAME>"'
Resulting with:
#. <PROPERNAME>
#: ../Foo.cpp:18
msgid "Bob"
msgstr ""
The problem then becomes how to automatically turn all occurrences of this in the resulting .pot file into the following:
#. This is a proper name. See the gettext manual, section Names.
#: ../Foo.cpp:18
msgctxt "Proper Name"
msgid "Bob"
msgstr ""
If you want to extract a message context, it has to be part of the argument list. And the numerical part in "Nc" has to be a positive integer. All your attempts with 0, -1 are fruitless, sorry.
The signature of your function must look like this:
#define PROPER_NAME "Proper Name"
const char *proper_name(const char *ctx, const char *name);
And then call it like this:
proper_name(PROPER_NAME, "Bob");
That repeats PROPER_NAME all over the code, but it's the only way to get it into the message context.
Maybe file a feature request?
There is also a hack that achieves the same without changing your source code. I assume that you're using C and the standard Makefile (but you can do the same in other languages):
Copy the file POTFILES to POTFILES-proper-names and add a line ./proper_names.pot to POTFILES.in.
Then you have to create proper_names.pot:
xgettext --files-from=POTFILES-proper-names \
    --keyword='' \
    --keyword='proper_name:1,"Your comment ..."' \
    --output=proper_names.pox
The output now contains only the entries that were marked with "proper_name()". Now add the context:
msg-add-content proper_names.pox "Proper Name" >proper_names.pot
rm proper_names.pox
Unfortunately, there is no program called "msg-add-content". Grab one of the zillion PO parsers out there and write one yourself (or take mine at the end of this post).
Now update your PACKAGE.pot as usual. Since "proper_names.pot" is an input file for the main xgettext run, all your extracted proper names, with the context added, end up in your .pot file (and their context will be used).
For lack of another script that adds a message context to all entries in a .pot file, use this one:
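#! /usr/bin/env perl
use strict;
use warnings;
use Locale::PO;

die "usage: $0 POFILE CONTEXT\n" unless @ARGV == 2;
my ($input, $context) = @ARGV;

# Load every entry of the input file, including the header entry.
my $entries = Locale::PO->load_file_asarray($input) or die "$input: failure";
foreach my $entry (@$entries) {
    # msgid() returns the quoted form, so the header entry reads '""'; leave it without context.
    $entry->msgctxt($context) unless '""' eq $entry->msgid;
    print $entry->dump;
}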
You have to install the Perl library "Locale::PO" for it, either with "sudo cpan install Locale::PO" or use the pre-built version that your vendor may have.
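If installing a Perl module is not an option, the same post-processing can be approximated with plain text handling. This Python sketch is my own assumption about a workable approach, not a replacement for a real PO parser: add_context is a hypothetical name, and it relies on the header being the only entry whose msgid "" is followed directly by msgstr.

```python
def add_context(pot_text, context):
    """Insert a msgctxt line before every non-header msgid that has none yet."""
    escaped = context.replace("\\", "\\\\").replace('"', '\\"')
    lines = pot_text.splitlines()
    out = []
    for i, line in enumerate(lines):
        if line.startswith("msgid "):
            next_line = lines[i + 1] if i + 1 < len(lines) else ""
            # The header is 'msgid ""' directly followed by msgstr;
            # a wrapped msgid is followed by a quoted continuation line instead.
            is_header = line == 'msgid ""' and next_line.startswith("msgstr")
            has_context = bool(out) and out[-1].startswith("msgctxt ")
            if not is_header and not has_context:
                out.append(f'msgctxt "{escaped}"')
        out.append(line)
    return "\n".join(out) + "\n"


sample = (
    "#. This is a proper name. See the gettext manual, section Names.\n"
    "#: ../Foo.cpp:18\n"
    'msgid "Bob"\n'
    'msgstr ""\n'
)
result = add_context(sample, "Proper Name")
print(result, end="")
```

For the sample entry this yields the shape the question asked for, with msgctxt "Proper Name" placed between the reference comment and the msgid.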

INTLTOOL_EXTRACT breaks translatable lines

I don't know why, but my source code line and the one in the .pot file are not the same. Source code:
#include <libintl.h>
#define _(String) gettext(String)
/* more code */
printf (_("Error while saving file in %s:\n\n%s"), ...);
In the .pot file, it looks like this:
#: ../src/main.c:72
#, c-format
msgid ""
"Error while saving file in %s:\n"
"\n"
"%s"
msgstr ""
Question: Why is the line broken up, and how can I avoid it? The expected output is:
#: ../src/main.c:72
#, c-format
msgid "Error while saving file in %s:\n\n%s"
msgstr ""
PS: I'm using autotools, so everything is generated with gettextize and intltoolize.
Thanks
The two forms are equivalent. Different pot file writers format their pot files differently, and this is just the way that gettext/intltool does it.
I don't know how to avoid the line breaks, but I wouldn't waste time trying.
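A quick way to convince yourself that the two forms are equivalent is to join the quoted chunks the way a PO reader does. The Python sketch below is only an illustration: po_value is a hypothetical helper, and it assumes the entry uses only simple escapes such as \n, \" and \\, which Python decodes the same way.

```python
import ast
import re


def po_value(lines):
    """Join the quoted chunks of one msgid/msgstr field into a single string."""
    chunks = re.findall(r'"((?:[^"\\]|\\.)*)"', "\n".join(lines))
    # Decode the escapes of each chunk, then concatenate the chunks in order.
    return "".join(ast.literal_eval('"' + chunk + '"') for chunk in chunks)


wrapped = [
    'msgid ""',
    r'"Error while saving file in %s:\n"',
    r'"\n"',
    r'"%s"',
]
single = [r'msgid "Error while saving file in %s:\n\n%s"']

same = po_value(wrapped) == po_value(single)
print(same)
```

The leading msgid "" contributes an empty chunk, so the wrapped form and the single-line form decode to the same string.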

Why is msgid_plural necessary in gettext translation files?

I've read the GNU Gettext manual about Translating plural forms and see its example:
#, c-format
msgid "One file removed"
msgid_plural "%d files removed"
msgstr[0] "%d slika je uklonjena"
msgstr[1] "%d datoteke uklonjenih"
msgstr[2] "%d slika uklonjenih"
Why is msgid_plural different from msgid, and doesn't that defeat the purpose of having translations be aware of plural forms?
I'd think that I could do something like this (for English):
#, c-format
msgid "X geese"
msgstr[0] "%d goose"
msgstr[1] "%d geese"
#, c-format
msgid "sentence_about_geese_at_the_lake"
msgstr[0] "There is one goose at the lake."
msgstr[1] "There are %d geese at the lake."
(using just one msgid).
Then in my code, I'd have something like:
<?php echo $this->translate('X geese', $numberA); ?>
<?php echo $this->translate('sentence_about_geese_at_the_lake', $numberB); ?>
If $numberA is 3, it would say "3 geese."
If $numberB is 0, the next line would say "There are 0 geese at the lake."
(because for English the rule is (n != 1), so the plural is used for 0 and for any number greater than 1).
It seems redundant for me to be required to specify 2 msgids for the same collection of phrases.
Thanks for your help!
One of the ideas behind gettext is that the msgid is extracted from source files to create POT files, which are used as base for translations stored in PO files, and later compiled to MO files. A msgid is also used if no suitable translation is found.
A msgid is not a "key" that is non-readable by users; it is a real phrase that can be used in the program. So when in your code you request a translation for a plural (pseudocode here):
ngettext("One file removed", "%d files removed", file_count)
...these two strings will be used (a) to extract messages from the source code, where they serve as a guide for translators, and (b) as the default strings when no suitable translation is found for the current locale.
That's why a plural entry has two msgids: to show how they are defined in the source program (for translators) and to be used as the default (when no translation exists).
In other localization systems, like Android String Resources or Rails YAML files, it works like you imagined: the equivalent of a msgid is a single "key" even for plurals. But then the real phrase is not used in the source code, and defining translations is a two-step action even for the original language.
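The fallback role of the two strings is easy to observe with Python's gettext module, used here only as a stand-in for whatever binding you use. NullTranslations has no catalog at all, so ngettext can only pick between the two source strings; removed_message is a hypothetical wrapper.

```python
import gettext

# A translations object without any catalog: every lookup falls back to the source strings.
untranslated = gettext.NullTranslations()


def removed_message(count):
    template = untranslated.ngettext("One file removed", "%d files removed", count)
    # The singular msgid has no %d, so only format when the placeholder is present.
    return template % count if "%d" in template else template


for n in (0, 1, 3):
    print(removed_message(n))
```

With a key-style msgid such as "sentence_about_geese_at_the_lake", the same fallback would show the raw key to the user whenever a translation is missing.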

How can I convert a Zend localization/translation array to gettext?

My multi-lingual site already successfully uses the "array" method of Zend translations.
I want to convert from that method to the "gettext" method because I've read that gettext is superior.
I've tried using http://docs.translatehouse.org/projects/translate-toolkit/en/latest/commands/php2po.html but can't get it to work.
I think it's not meant to handle Zend arrays as the input.
My Zend file (which works) looks like this:
<?php
return array(
'choose your favorite stores' => 'Choose your %1$sfavorite stores%2$s',
'P.S. If you ever have question' => 'P.S. If you ever have questions, %1$semail us%2$s any time.',
'You can also find quick answer' => 'You can also find quick answers on our %1$sHelp page%2$s.',
'Earn X cash' => '%1$sEarn 1-30%% cash back%2$s, get money-saving coupons, and find the best price on every purchase at %3$s2,500+ stores%4$s.'
);
(But it's much longer, and I have multiple languages, each in their own PHP file.)
With the snippet you have given, the conversion works for me.
$ php2po en.php en.po -t en.php
processing 1 files...
[###########################################] 100%
$ cat en.po
#. extracted from en.php, en.php
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2012-12-19 10:08+0200\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Generator: Translate Toolkit 1.9.1-pre\n"
#: return+array%28-%3E%27choose+your+favorite+stores%27
msgid "Choose your %1$sfavorite stores%2$s"
msgstr "Choose your %1$sfavorite stores%2$s"
#: return+array%28-%3E%27P.S.+If+you+ever+have+question%27
msgid "P.S. If you ever have questions, %1$semail us%2$s any time."
msgstr "P.S. If you ever have questions, %1$semail us%2$s any time."
#: return+array%28-%3E%27You+can+also+find+quick+answer%27
msgid "You can also find quick answers on our %1$sHelp page%2$s."
msgstr "You can also find quick answers on our %1$sHelp page%2$s."
I'm using a Translate Toolkit version from git master; maybe you should try that.