iOS Localization - Updating Localizable.strings with just new strings - iphone

I have searched Google and StackOverflow and still have no clear answer on an easy and automated way of doing this but here is the scenario:
I have an app with 1000 strings localized into en, fr, de, es, it.
I build a new feature that makes 10 distinctly new NSLocalizedString() keys.
I just want those 10 new strings appended onto the ends of the files:
en.lproj/Localizable.strings
fr.lproj/Localizable.strings
es.lproj/Localizable.strings
de.lproj/Localizable.strings
it.lproj/Localizable.strings
genstrings will retrieve all 1010 distinct strings. This is a pain since I'll need to "needle in a haystack" find those 10 strings every time I do an update.
UPDATE 19-SEP-2014 -- XCode 6 - Apple has finally released support for XLIFF export and import of your .strings files
Whats new in XCode 6? Localisation
Linguan (v1.1.3) whilst it is a lovely tool most of the time, it is starting to be a tool in the other sense. It merges the changes but some strings aren't matching correctly when it merges, so everytime it does a Scan Sources it creates 100 new duplicate keys as well as the 10 strings I am after so it is making more work.
FileMerge As suggested below try doing a diff between old and new versions of the genstrings output files. The genstrings output has the strings sorted alphabetically so 10 strings scattered throughout 1000 means that there are 200 differences to review. it keeps matching the /*...*/ and the "..." = "..." and saying that the ... has been updated. It hasn't been updated, just shifted to a new location in the file. More and more it is looking like I am going to have to write a custom tool.
MacHG + FileMerge on a side note, for some strange reason doesn't like doing diffs out of the repository with the working copy of Localizable.strings. Both the left and right panes appear empty.
UPDATE: Turns out variations in some changesets being saved as UTF-16 and some as UTF-8 are screwing with it being able to do a proper diff.
Bash Script + FileMerge I have written the following script to help maintain my english reference file after each time I add new NSLocalizedString entries:
#LOCALISATION UPDATE SCRIPT
#
#This will create a temporary copy of the current 'en' reference file then generate the
#latest reference file using the 'genstrings' tool. Finally forcing FileMerge to launch
#and diff the changes.
#
#Last Updated: 2014-JAN-06
#Author(s): Josh Wilson
clear
#assuming this script is run from $SRCROOT
#Backup Existing 'en' reference
cp "en.lproj/Localizable.strings" "en.lproj/Localizable-src.strings"
#Scan source files for 'NSLocalizableString' macros
genstrings -q -u -o en.lproj Classes/*.{m,mm}
genstrings -q -u -a -o en.lproj Classes/iPad/*.{m,mm}
genstrings -q -u -a -o en.lproj Classes/iPhone/*.{m,mm}
#Force FileMerge to launch and diff the update (NOTE: piping to cat forces GUI to open)
opendiff "en.lproj/Localizable-src.strings" "en.lproj/Localizable.strings" | cat
#Cleanup up temporary file
rm "en.lproj/Localizable-src.strings"
But this only updates the EN file and I am lacking a way of having the other language files updated with the new keys. This one has been good for instances where I don't have an english word as the key and genstrings bombs my
"welcome_message" = "Welcome!" with "welcome_message" = "welcome_message"
POEditor http://poeditor.com/. This is an online tool and subscription based after 1000 strings. Seems to work well but it would be good if there was a non subscription based tool.
Traducto Pro Seems to do an alright job of integrating with XCode and extracting the strings and merging things together. But it is impossible to get anything back out of it until it is fully translated so you are coerced into using their translation services.
Surely this functionality has been implemented before. How does Apple keep their Apps localised?
Script junkies, I call upon thee! iOS development has been going on for some time now and localisation is kind of common, surely there is a mature solution to this by now?
Python Script update_strings.py: Stackoverflow finally recommended a related question and the python script in this answer Best practice using NSLocalizedString looks promising...
Tested it and in its current form (31-MAY-2013) it doesn't handle multiline comments if you have duplicate comments entries (expects single line comments).
Might just need to tweak the regex's a bit.

Checkout BartyCrouch, it perfectly solves your problem. Also it is open source, actively maintained and can be easily installed and integrated within your project.
Install BartyCrouch via Homebrew:
brew install bartycrouch
Alternatively, install it via Mint:
mint install Flinesoft/BartyCrouch
Incrementally update your Localizable.strings files:
$ bartycrouch update
This will do exactly what you were looking for.
In order to keep your Storyboards/XIBs Strings files updated over time I highly recommend adding a build script (instructions on how to add a build script here):
if which bartycrouch > /dev/null; then
bartycrouch update -x
bartycrouch lint -x
else
echo "warning: BartyCrouch not installed, download it from https://github.com/Flinesoft/BartyCrouch"
fi
In addition to incrementally updating your Storyboards/XIBs Strings files this will also make sure your Localizable.strings files stay updated with newly added keys in code using NSLocalizedString and show warnings for duplicate keys or empty values.
Make sure to checkout BartyCrouch on GitHub for additional information.

if you have the genstrings for the previous version, just a "diff" between new and old could do the tricks
EDIT: best use vimdiff to deal with utf-16 files

You can check out this Xcode Plugin I built for OneSky, it aims to improve the localization work flow for iOS/Mac OSX developers.
The string generation feature of the plugin runs genstrings and ibtool --export-strings-file to the selected source/IB files, new files will be added the project and target automatically, new strings will be merged into existing files with comments.
It will only generate/update strings for the base language, but you can make use of other features of the plugin to automate translation export and import with OneSky platform, which is free for crowdsource projects.

You may want to check out my solution here: SwiftyLocalization
With few steps to setup, you will have a very flexible localization in Google Spreadsheet (comment, custom color, highlight, font, multiple sheets, and more).
In short, steps are: Google Spreadsheet --> CSV files --> Localizable.strings
Moreover, it also generates Localizables.swift, a struct that acts like interfaces to a key retrieval & decoding for you (You have to manually specify a way to decode String from key though).
Why is this great?
You no longer need have a key as a plain string all over the places.
Wrong keys are detected at compile time.
Xcode can do autocomplete, so you can do something like this:
// It's defined as computed static var, so it's up-to-date every time you call.
// You can also have your custom retrieval method there.
button.setTitle(Localizables.login.button_title_login, forState: .Normal)
The project uses Google App Script to convert Sheets --> CSV Python script to convert CSV files --> Localizable.strings
You can have a quick look at this example sheet to know what's possible.

Related

Postgresql fulltext search for Czech language (no default language config)

I am trying to setup fulltext search for Czech language. I am little bit confused, because I see some cs_cz.affix and cs_cz.dict files inside tsearch_data folder, but there is no Czech language configuration (it's probably not shipped with Postgres).
So should I create one? Which dics do I have to create/config? Is there some support for Czech language at all?
Should I use all possible dicts? (Synonym Dictionary, Thesaurus Dictionary, Ispell Dictionary, Snowball Dictionary)
I am able to create Czech configuration for ispell dict and it works fine, bud I am not sure if it's enough (just ispell configuration).
Thanks a lot I tried to read https://www.postgresql.org/docs/9.5/static/textsearch.html but I am little bit confused.
I have never tried it, but you should be able to create a Czech Snowball stemmer as long as you are ready to compile PostgreSQL from source.
There is an explanation in src/backend/snowball/README:
The files under src/backend/snowball/libstemmer/ and
src/include/snowball/libstemmer/ are taken directly from their libstemmer_c
distribution, with only some minor adjustments of file inclusions. Note
that most of these files are in fact derived files, not master source.
The master sources are in the Snowball language, and are available along
with the Snowball-to-C compiler from the Snowball project. We choose to
include the derived files in the PostgreSQL distribution because most
installations will not have the Snowball compiler available.
To update the PostgreSQL sources from a new Snowball libstemmer_c
distribution:
Copy the *.c files in libstemmer_c/src_c/ to src/backend/snowball/libstemmer
with replacement of "../runtime/header.h" by "header.h", for example
for f in libstemmer_c/src_c/*.c
do
sed 's|\.\./runtime/header\.h|header.h|' $f >libstemmer/`basename $f`
done
(Alternatively, if you rebuild the stemmer files from the master Snowball
sources, just omit "-r ../runtime" from the Snowball compiler switches.)
Copy the *.c files in libstemmer_c/runtime/ to
src/backend/snowball/libstemmer, and edit them to remove direct inclusions
of system headers such as <stdio.h> – they should only include "header.h".
(This removal avoids portability problems on some platforms where <stdio.h>
is sensitive to largefile compilation options.)
Copy the *.h files in libstemmer_c/src_c/ and libstemmer_c/runtime/
to src/include/snowball/libstemmer. At this writing the header files
do not require any changes.
Check whether any stemmer modules have been added or removed. If so, edit
the OBJS list in Makefile, the list of #include's in dict_snowball.c, and the
stemmer_modules[] table in dict_snowball.c.
The various stopword files in stopwords/ must be downloaded
individually from pages on the snowball.tartarus.org website.
Be careful that these files must be stored in UTF-8 encoding.
Now there is a Czech Snowball stemmer available here, it was contributed to the project. There is no stop word dictionary available, but I am sure you can either find one or create one yourself.
The real work would be to install Snowball and use the Snowball-to-C compiler to create the C and header files to add to the PostgreSQL source.
These files should then remain stable, so it shouldn't be difficult to upgrade to a new PostgreSQL version.
If you are willing to do the work, but don't want to patch PostgreSQL and build it from source every time, you could also consider submitting a patch to PostgreSQL. As long as the stemmer works fine, I don't expect that you will much resistance there (but the patch submission process is still tedious).

OmegaT: How to import already translated files?

My team has been using Notepad for translation purposes so far. Recently, we decided to use one of the CAT tools available on the Internet - OmegaT.
We've got source and manually translated files, and only values were ever touched.
Is it possible to import both to the same project, so that source phrases stay source, and our phrases become their translated counterparts?
Note: I don't know if it matters, but files are formatted as INI (key=value).
What you need is an alignment. It takes source and target files and creates a translation memory.
In your specific case (INI files), you can use OmegaT to do an automatic alignment with a command line:
http://omegat.sourceforge.net/manual-standard/en/chapter.installing.and.running.html#omegat.command.arguments
Sample command line:
java -jar OmegaT.jar "C:\OmegaTProject" --mode=console-align --alignDir="C:\OmegaTProject\align"
For more general purposes, and with a GUI, there's a prototype version of OmegaT with an aligner:
https://omegat.ci.cloudbees.com/job/omegat-prototype/26/
See the OmegaT development mailing list for information about this.
Didier
With currently Beta version of 4.* releases (currently 4.1.5), you can use nice visual aligner - https://www.proz.com/forum/omegat_support/306343-new_interactive_aligner_in_omegat.html

Automating Localizable.strings?

So, in my project I have 10 languages, and 10 Localizable.strings files.
I just created Localizable.strings files, a file for each language. Now they contain "key" = "value" pairs, and both keys and values are in English (default language).
My languages are all translated and stay in Excel files.
The question is, how can I insert all my languages in those files faster than just copying each word manually or writing a script for that?
Maybe there is a existing tool for this already?
Thanks.
I found an easy way to compose localizable.strings files from Excel documents.
In the Excel document, in specific columns I insert " " = " " symbols. It's easy to do for all the words by dragging Excel cell down from the corner, so that it copies stuff from that cell to all the cells you drag it to. (sorry for messy explanation)
Thus the document contains the same symbols and words as localizable.strings does.
Than I just copy everything to the text file, remove tabs, change extension to .strings.
(no comments saved unfortunately).
EDIT:
You can copy the stuff from Excel to Sublime Text, then Find & Replace tabs if any. Copy resulted stuff into proper Xcode .string file.
One application that will really save you a lot of time by automating and streamlining localization procedure is Localization Suite. I do not know if they support importing from excel (to save you time transferring your string pairs) but it's free and seems like a complete solution.
I had an internal script at work for doing that tasks in iOS and Android, and I've just opensourced it as a Gem. You can take a look at it here: http://github.com/mrmans0n/localio
It can open spreadsheets from Google Drive and local Excel files as well, like requested.
You just would have to install the gem
gem install localio
And have a custom DSL file in your project directory, called Locfile, with the info referring to your project and the localization files. An example in your case, where an Excel file is used, could be as simple as:
platform :ios
source :xls, :path => 'YourExcelFileGoesInHere.xls'
output_path 'Resources/Localizables/'
The .xls file should have a certain format, that probably is very similar to what you have right now. You just have to clone the contents of this one and fill it with your translations: https://docs.google.com/spreadsheet/ccc?key=0AmX_w4-5HkOgdFFoZ19iSUlRSERnQTJ4NVZiblo2UXc
Hope this helps.
Here are the steps i followed:
change the extension of .strings to .txt on windows
open excel and go to File > Open
Choose the file to open. This should present an import wizard
Follow the steps and specify the delimiting character as =
You're done

Version control for DOCX and PDF?

I've been playing around with git and hg lately and then suddenly it occurred to me that this kind of thing will be great for documents.
I've a document which I edit in DOCX and export as PDF. I tried using both git and hg to version control it and turns out with hg you end up tracking only binary and diff-ing isn't meaningful. Although with git I can meaningfully diff DOCX (haven't tried on PDF yet) I was wondering if there is a better way to do it than I'm doing it right now. (Ideally, not having to leave Word to diff will be the best solution.)
There are two different concepts here - one is "can the version control system make some intelligent judgements about the contents of files?" - so that it can store just delta information between revisions (and do things like assign responsibility to individual parts of a file).
The other is 'do I have a file comparison tool which is useful for the types of files I have in the version control system'. Version control systems tend to come with file comparison tools which are inferior to dedicated alternatives. But they can pretty much always be linked to better diff programs - either for all file types or specific ones.
So it's common to use, for example, Beyond Compare as a general compare tool, with Word as a dedicated Word document comparer.
Different version control systems differ as to how good people perceive them to be at handling 'binaries', but that's often as much to do with handling huge files and providing exclusive locking as it is to do with file comparison.
http://tortoisehg.bitbucket.io/ includes a plugin called docdiff that integrates Word and Excel diff'ing.
You can use Beyond Compare as external diff tool for hg. Add to/change your user mercurial.ini as:
[extdiff]
cmd.vdiff = c:/path/to/BCompare.exe
Then get Beyond Compare file viewer rule for docx.
Now you should be able to compare two versions of docx in Beyond Compare.
This article outlines the solution for Docx using Pandoc
While this post outlines solution for PDF using pdf2html.
Only for docx, I compiled instructions for multiple places here: https://gist.github.com/nachocab/6429893
# download docx2txt by Sandeep Kumar
wget -O docx2txt.pl http://www.cs.indiana.edu/~kinzler/home/binp/docx2txt
# make a wrapper
echo '#!/bin/bash
docx2txt.pl $1 -' > docx2txt
chmod +x docx2txt
# make sure docx2txt.pl and docx2txt are your current PATH. Here's a guide
http://shapeshed.com/using_custom_shell_scripts_on_osx_or_linux/
mv docx2txt docx2txt.pl ~/bin/
# set .gitattributes (unfortunately I don't this can't be set by default, you have to create it for every project)
echo "*.docx diff=word" > .git/info/attributes
# add the following to ~/.gitconfig
[diff "word"]
binary = true
textconv = docx2txt
# add a new alias
[alias]
wdiff = diff --color-words
# try it
git init
# create my_file.docx, add some content
git add my_file.docx
git commit -m "Initial commit"
# change something in my_file.docx
git wdiff my_file.docx
# awesome!
It works great on OSX
If you happen to use a Mac, I wrote a git merge driver that can use Microsoft Word and tracked changes to merge and show conflicts between any file types Word can read & write.
http://github.com/jasmas/wordMerge
I say 'if you happen to use a Mac' because the driver I wrote uses AppleScript, primarily to accomplish this task.
It'd be nice to add a vbscript version to the project, but at the moment I don't have a Windows environment for testing. Anyone with some basic scripting knowledge should be able to take a look at what I'm doing and duplicate it in vbscript, powershell or whatever on Windows.
I used SVN (yes, in 2020 :-)) with TortoiseSVN on Windows. It has a built-in function to compare DOCX files (it opens Microsoft Word in a mode where your screen is divided into four parts: the file after the changes, before the changes, with changes highlighted and a list of changes). Screenshot below (sorry for the Polish version of MS Word). I also checked TortoiseGIT and it also has this functionality. I've read that TortoiseHG has it as well.

Getting more out of *.diff -files

I wonder if there are tools to show *.diff files used in patching related to debian packaging. What I need from the tool is that it could just read the diff file and show the actual files changed with changed rows, like kdiff or meld would do when comparing directly 2 different files. Or maybe I have totally wrong kind of approach to this, maybe I should ask how can I get more out of diff-files?
Kompare is able to open a .diff, and it shows you the files changed at the top, alist of changes of the selected file, and a side by side diff (for the lines that it is able to extract from the .diff.
However, when I feed it a debdiff, it got confused. The diff did not have === file headers, only --- and +++ headers, and so it included the changes from the /debian/changelog, /debian/copyright, and /debian/rules with in the /debian/control file. Ymmv.
Screenshot: http://imagebin.ca/view/fNWEzx.html
The Debian diff format seems to be a special diff format. As my short google search didn't result in a graphical tool, which could handle these files in the way normal diff tools do, I'm not sure, if such a tool exists. Perhaps you could try to convert these debiff files to normal diff files (I didn't find a tool, which would do that, either).
There is a tool to visualize changes in Linux packages (Deb, RPM, TAR.GZ, etc.) - pkgdiff.
Usage:
pkgdiff -old OLD.deb -new NEW.deb
Sample reports:
http://lvc.github.com/pkgdiff/pkgdiff_reports/libqb/0.4.1_to_0.8.1/changes_report.html
http://lvc.github.com/pkgdiff/pkgdiff_reports/gstreamer/0.10.23-i486-1_to_0.10.32-i486-1/changes_report.html