OmegaT: How to import already translated files? - import

My team has been using Notepad for translation purposes so far. Recently, we decided to use one of the CAT tools available on the Internet - OmegaT.
We've got source and manually translated files, and only values were ever touched.
Is it possible to import both to the same project, so that source phrases stay source, and our phrases become their translated counterparts?
Note: I don't know if it matters, but files are formatted as INI (key=value).

What you need is an alignment. It takes source and target files and creates a translation memory.
In your specific case (INI files), you can use OmegaT to do an automatic alignment with a command line:
http://omegat.sourceforge.net/manual-standard/en/chapter.installing.and.running.html#omegat.command.arguments
Sample command line:
java -jar OmegaT.jar "C:\OmegaTProject" --mode=console-align --alignDir="C:\OmegaTProject\align"
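To illustrate what a key-based alignment of INI files amounts to, here is a minimal Python sketch. It is not OmegaT's implementation; the function names and the sample data are hypothetical, and it assumes that source and target files share the same keys and differ only in their values.

```python
def parse_ini_pairs(text):
    """Return {key: value} for every key=value line; skip blanks, comments, sections."""
    pairs = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith((";", "#", "[")) or "=" not in stripped:
            continue
        key, value = stripped.split("=", 1)
        pairs[key.strip()] = value.strip()
    return pairs


def align_by_key(source_text, target_text):
    """Pair source and target values that share a key, in source order."""
    source = parse_ini_pairs(source_text)
    target = parse_ini_pairs(target_text)
    return [(source[key], target[key]) for key in source if key in target]


# Hypothetical sample data: the same keys, values translated by hand.
source_ini = "greeting=Hello\nfarewell=Goodbye\nunused=Only in source\n"
target_ini = "greeting=Bonjour\nfarewell=Au revoir\n"

for src, tgt in align_by_key(source_ini, target_ini):
    print(f"{src} -> {tgt}")
```

Each resulting pair corresponds to one source segment and its translation, which is the kind of information a translation memory stores.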
For more general purposes, and with a GUI, there's a prototype version of OmegaT with an aligner:
https://omegat.ci.cloudbees.com/job/omegat-prototype/26/
See the OmegaT development mailing list for information about this.
Didier

With the current beta of the 4.* releases (currently 4.1.5), you can use a nice visual aligner: https://www.proz.com/forum/omegat_support/306343-new_interactive_aligner_in_omegat.html

Step my script not its imports - functionality?

It is actually a credit to the strength of PyDev/Eclipse that the debugger also steps through the corresponding parts of the imported numpy/pandas at the places where their functionality is used by my script, e.g. df = pandas.DataFrame({...
But if I am confident that the imports work OK, is there a way for the debugger to step only through my own 10 lines of script and not its imports? It would save a lot of inspection time.
(Eclipse for C/C++ on Windows 10 64bit)
Thank you!
Such functionality is actually available in the debugger, but it currently doesn't have a UI (I haven't had time to implement it yet).
Still, you can set an environment variable to use it.
I.e.: add an environment variable named PYDEVD_FILTERS (you can add it in the interpreter configuration or by editing your launch) and set it to a list of fnmatch-style patterns, separated by semicolons, that match the directories you want to ignore. Files matching those patterns will be skipped by the debugger.
See: https://github.com/fabioz/PyDev.Debugger/blob/master/_pydevd_bundle/pydevd_utils.py#L191 as a reference for this (i.e.: pydevd_utils.is_ignored_by_filter).
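As a rough illustration of what fnmatch-style filtering of a semicolon-separated list means, here is a small Python sketch. It is not the code from pydevd_utils; the function name and the example patterns are hypothetical, and the debugger's actual matching rules may differ in detail.

```python
import fnmatch


def is_ignored(path, filters_value):
    """Return True if path matches any pattern in a ';'-separated filter string."""
    patterns = [pattern for pattern in filters_value.split(";") if pattern]
    # fnmatchcase avoids OS-specific case and separator normalization,
    # so the result is the same on every platform.
    return any(fnmatch.fnmatchcase(path, pattern) for pattern in patterns)


# Hypothetical value for the PYDEVD_FILTERS environment variable.
filters = "*/site-packages/*;*/lib/python*"

print(is_ignored("/usr/lib/python3/site-packages/pandas/core/frame.py", filters))
print(is_ignored("/home/me/project/script.py", filters))
```

Note that in fnmatch patterns `*` also matches path separators, so a pattern such as `*/site-packages/*` covers files at any depth below that directory.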

current scctext replacement for textual representation of vfp binary files

What are people using in vfp 9 for a replacement for the built-in scctext.prg that translates binary files in vfp to a textual representation?
We’re moving an existing project that’s in vfp 9 sp1 into tfs source control, but we need a way to make sure that the non-textual files get the benefits of comparison that only text files allow. We plan to check both the textual representation and the binary file into source control (the binary is more for the “just in case” scenario).
According to the document at
http://www.ita-software.com/papers/Borup_Mercurial_Published.pdf
there are at least three options for converting .scx, .frx, .lbx, .prj and other non-prg dbf files in visual foxpro (vfp) to a textual representation. Only some of them allow for converting the textual information back to binary - not sure how often we’d really use that or not.
ALTERNATE SCCTEXT
This one seems older with latest version in 2009 - not sure if it’s still the preferred tool - and it seems to have no way to take the textual representation and convert it back to a binary file.
http://vfpx.codeplex.com/releases/view/12955
TWOFOX
This one seems similar to foxbin2prg except that it creates xml files. It seems like only one dev is working on it, unlike the others, which are open to contributions, so I'm not sure how current it is or how much it’s being used by other developers. It does have two-way conversion like foxbin2prg has.
http://www.foxpert.com/downloads.htm
FOXBIN2PRG
This one is fairly recent, but I'm not sure if it’s production ready enough to use for production coding work. It does have two-way conversion.
http://vfpx.codeplex.com/releases/view/116407
TRIGGER INVOKE ONE OF THE ABOVE ON CHANGE OF BINARY FILES IN VFP IDE
What are people using to invoke these textual representation options?
I’ve seen this class that was created to run one of the programs listed above for all files in the project. Apparently it does it when the date time of the last generate is older than the date time on the textual version of the file. One detriment I’ve read about is that it also generates for foundation classes and other things that really are not items that a dev is working on (code that is referenced by but not included in your project).
http://codepaste.net/9yy1gm
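A timestamp-based trigger of this kind can be sketched in a few lines of Python. This is only an illustration, not the linked class: the function name is hypothetical, and it assumes "out of date" means that the binary file was modified more recently than its textual counterpart (or that no textual file exists yet).

```python
import os


def needs_regeneration(binary_path, text_path):
    """Return True when the textual file is missing or older than the binary."""
    if not os.path.exists(text_path):
        return True
    return os.path.getmtime(binary_path) > os.path.getmtime(text_path)
```

A project-wide run would call this for each binary in the project and invoke the chosen converter only for files where it returns True, which also makes it easy to skip directories such as the foundation classes.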
Thanks for any advice from those that are using vfp 9 with source control out there!
You should check out the scX library written by Paul McNett which is published on Ed Leafe's web site. I haven't used it in a mission-critical software project yet, but I have tested it out. It seemed to catch all the potential problems I've encountered with other scctext replacements.
I haven't used it in a big project for a couple of reasons:
1. It is a breaking change for source control history. So, comparing source code in your current SCA or VCA files with the new files generated by scX isn't going to be simple.
2. It isn't a drop-in replacement for scctext. Instead of checking files into and out of source control directly from the IDE, you'll have an intermediary folder.
   You'll check your files out of source control into one folder, convert them to FoxPro format, and then edit them in the FoxPro IDE.
   Then, you'll save your changes in the FoxPro IDE, convert them to scX format, and then check them into source control.
I'm sure much of #2 can be automated; but combined with #1, making the change to scX wasn't worth it for me.
FoxBin2Prg is production ready, and AFAIK it's the only tool that allows diff and merge of the generated text (tx2) files and can regenerate the binaries from them.
The generated files are PRG style, so developers can read them as if they were modifying a PRG (with PROC/ENDPROC structures and such), but they aren't meant to be compiled. The primary use is for SCM tools, but it can be used separately.
I'm actually using it on production code with a 10-member team making concurrent modifications to forms and classes.
Some documentation is available on VFPx in English and Spanish. Internal messages are available in both languages, and from version v1.19.24 a new translation to German is available too.
More info on the VFPx site.
Best regards!

iOS Localization - Updating Localizable.strings with just new strings

I have searched Google and StackOverflow and still have no clear answer on an easy and automated way of doing this but here is the scenario:
I have an app with 1000 strings localized into en, fr, de, es, it.
I build a new feature that makes 10 distinctly new NSLocalizedString() keys.
I just want those 10 new strings appended onto the ends of the files:
en.lproj/Localizable.strings
fr.lproj/Localizable.strings
es.lproj/Localizable.strings
de.lproj/Localizable.strings
it.lproj/Localizable.strings
genstrings will retrieve all 1010 distinct strings. This is a pain since I'll need to "needle in a haystack" find those 10 strings every time I do an update.
UPDATE 19-SEP-2014 -- XCode 6 - Apple has finally released support for XLIFF export and import of your .strings files
Whats new in XCode 6? Localisation
Linguan (v1.1.3) whilst it is a lovely tool most of the time, it is starting to be a tool in the other sense. It merges the changes but some strings aren't matching correctly when it merges, so every time it does a Scan Sources it creates 100 new duplicate keys as well as the 10 strings I am after, so it is making more work.
FileMerge As suggested below, try doing a diff between old and new versions of the genstrings output files. The genstrings output has the strings sorted alphabetically, so 10 strings scattered throughout 1000 means that there are 200 differences to review. It keeps matching the /*...*/ and the "..." = "..." and saying that the ... has been updated. It hasn't been updated, just shifted to a new location in the file. More and more it is looking like I am going to have to write a custom tool.
MacHG + FileMerge on a side note, for some strange reason doesn't like doing diffs out of the repository with the working copy of Localizable.strings. Both the left and right panes appear empty.
UPDATE: Turns out variations in some changesets being saved as UTF-16 and some as UTF-8 are screwing with it being able to do a proper diff.
Bash Script + FileMerge I have written the following script to help maintain my English reference file each time I add new NSLocalizedString entries:
#LOCALISATION UPDATE SCRIPT
#
#This will create a temporary copy of the current 'en' reference file then generate the
#latest reference file using the 'genstrings' tool. Finally forcing FileMerge to launch
#and diff the changes.
#
#Last Updated: 2014-JAN-06
#Author(s): Josh Wilson
clear
#assuming this script is run from $SRCROOT
#Backup Existing 'en' reference
cp "en.lproj/Localizable.strings" "en.lproj/Localizable-src.strings"
#Scan source files for 'NSLocalizedString' macros
genstrings -q -u -o en.lproj Classes/*.{m,mm}
genstrings -q -u -a -o en.lproj Classes/iPad/*.{m,mm}
genstrings -q -u -a -o en.lproj Classes/iPhone/*.{m,mm}
#Force FileMerge to launch and diff the update (NOTE: piping to cat forces GUI to open)
opendiff "en.lproj/Localizable-src.strings" "en.lproj/Localizable.strings" | cat
#Clean up temporary file
rm "en.lproj/Localizable-src.strings"
But this only updates the EN file, and I am lacking a way of having the other language files updated with the new keys. This one has been good for instances where I don't have an English word as the key and genstrings overwrites my
"welcome_message" = "Welcome!" with "welcome_message" = "welcome_message"
POEditor http://poeditor.com/. This is an online tool and subscription based after 1000 strings. Seems to work well but it would be good if there was a non subscription based tool.
Traducto Pro Seems to do an alright job of integrating with XCode and extracting the strings and merging things together. But it is impossible to get anything back out of it until it is fully translated so you are coerced into using their translation services.
Surely this functionality has been implemented before. How does Apple keep their Apps localised?
Script junkies, I call upon thee! iOS development has been going on for some time now and localisation is kind of common, surely there is a mature solution to this by now?
Python Script update_strings.py: Stack Overflow finally recommended a related question, and the Python script in this answer, Best practice using NSLocalizedString, looks promising...
I tested it, and in its current form (31-MAY-2013) it doesn't handle multiline comments if you have duplicate comment entries (it expects single-line comments).
It might just need a few tweaks to the regexes.
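The core of such a custom tool, appending only the keys that a localized file is missing, can be sketched in Python as follows. This is a minimal illustration rather than any of the tools mentioned above: the function names are hypothetical, it works on already-decoded text (so UTF-16 files would need to be read with the right encoding first), it ignores comments, and it assumes each entry sits on one line in the form "key" = "value";.

```python
import re

# One entry per line: "key" = "value";  (escaped quotes inside strings are allowed)
ENTRY = re.compile(r'^"((?:[^"\\]|\\.)*)"\s*=\s*"((?:[^"\\]|\\.)*)"\s*;', re.MULTILINE)


def parse_strings(text):
    """Return {key: value} for every entry found in a .strings file's text."""
    return {match.group(1): match.group(2) for match in ENTRY.finditer(text)}


def append_new_keys(reference_text, localized_text):
    """Append entries present in the reference but missing from the localized text."""
    reference = parse_strings(reference_text)
    existing = parse_strings(localized_text)
    missing = [(key, value) for key, value in reference.items() if key not in existing]
    if not missing:
        return localized_text
    lines = [localized_text.rstrip("\n")] if localized_text.strip() else []
    # New entries keep the reference value as a placeholder for the translator.
    lines.extend(f'"{key}" = "{value}";' for key, value in missing)
    return "\n".join(lines) + "\n"


en = '/* Greeting */\n"welcome_message" = "Welcome!";\n"new_feature" = "New feature";\n'
fr = '"welcome_message" = "Bienvenue !";\n'
print(append_new_keys(en, fr))
```

Existing translations are left untouched and in place, so a diff of the localized file shows only the appended entries.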
Check out BartyCrouch; it solves exactly this problem. It is also open source, actively maintained, and can be easily installed and integrated within your project.
Install BartyCrouch via Homebrew:
brew install bartycrouch
Alternatively, install it via Mint:
mint install Flinesoft/BartyCrouch
Incrementally update your Localizable.strings files:
$ bartycrouch update
This will do exactly what you were looking for.
In order to keep your Storyboards/XIBs Strings files updated over time I highly recommend adding a build script (instructions on how to add a build script here):
if which bartycrouch > /dev/null; then
bartycrouch update -x
bartycrouch lint -x
else
echo "warning: BartyCrouch not installed, download it from https://github.com/Flinesoft/BartyCrouch"
fi
In addition to incrementally updating your Storyboards/XIBs Strings files this will also make sure your Localizable.strings files stay updated with newly added keys in code using NSLocalizedString and show warnings for duplicate keys or empty values.
Make sure to check out BartyCrouch on GitHub for additional information.
If you have the genstrings output for the previous version, a plain "diff" between the new and old output could do the trick.
EDIT: vimdiff is best for dealing with UTF-16 files.
You can check out this Xcode Plugin I built for OneSky, it aims to improve the localization work flow for iOS/Mac OSX developers.
The string generation feature of the plugin runs genstrings and ibtool --export-strings-file on the selected source/IB files. New files are added to the project and target automatically, and new strings are merged into existing files with comments.
It will only generate/update strings for the base language, but you can make use of other features of the plugin to automate translation export and import with OneSky platform, which is free for crowdsource projects.
You may want to check out my solution here: SwiftyLocalization
With a few steps to set up, you will have very flexible localization in a Google Spreadsheet (comments, custom colors, highlights, fonts, multiple sheets, and more).
In short, the steps are: Google Spreadsheet --> CSV files --> Localizable.strings
Moreover, it also generates Localizables.swift, a struct that acts as an interface for key retrieval and decoding (you have to manually specify how to decode a String from a key, though).
Why is this great?
You no longer need to have keys as plain strings all over the place.
Wrong keys are detected at compile time.
Xcode can do autocomplete, so you can do something like this:
// It's defined as computed static var, so it's up-to-date every time you call.
// You can also have your custom retrieval method there.
button.setTitle(Localizables.login.button_title_login, forState: .Normal)
The project uses a Google Apps Script to convert Sheets --> CSV and a Python script to convert CSV files --> Localizable.strings
You can have a quick look at this example sheet to know what's possible.

Validate against an Eclipse formatting profile from command line

I'm looking for a way to verify Java code against an Eclipse code formatting profile from the command line. The goal is to create a Mercurial hook which rejects any commit that doesn't match the profile. Is there a way to do this?
I'm aware of the possibility to call Eclipse's formatter from the command line. What I'm looking for is something which just validates (yes/no). I guess I could use the formatter and then compare the two, but it seems like a clumsy approach.
Background: We want to try this because we currently get many unnecessary merge conflicts caused by formatting differences. We have an environment where multiple IDEs are used, although only one is officially supported. We want to enforce the official profile, and everyone can continue using the tools they prefer as long as they set them up to format the code correctly.
In brief, follow these steps:
Duplicate the original Java file in a temporary place;
Format the temporary duplicate using the Eclipse Java code formatter;
Check whether the files are identical or not.
Tricks to help you out:
To call the Eclipse Java code formatter from the command line, see Formatting your code using the Eclipse code formatter.
To find out whether the files are identical, use the diff utility: diff --text --quiet >/dev/null; the exit code will tell you what you want to know (0 if the files are identical, 1 if they differ).
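The duplicate, format, compare steps above can be sketched in Python as follows. This is only a sketch under stated assumptions: matches_profile and its format_in_place parameter are hypothetical names, and the whitespace-stripping formatter below is a stand-in for the demo, not the Eclipse formatter. In a real hook, format_in_place would invoke the Eclipse command-line formatter on the temporary copy.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def matches_profile(java_file, format_in_place):
    """Return True if formatting a temporary copy leaves the file unchanged."""
    with tempfile.TemporaryDirectory() as tmp:
        duplicate = Path(tmp) / Path(java_file).name
        shutil.copyfile(java_file, duplicate)      # step 1: duplicate
        format_in_place(duplicate)                 # step 2: format the duplicate
        # step 3: byte-for-byte comparison of original and formatted copy
        return filecmp.cmp(java_file, duplicate, shallow=False)


def strip_trailing_whitespace(path):
    """Stand-in formatter for the demo only; NOT the Eclipse formatter."""
    text = Path(path).read_text()
    Path(path).write_text("\n".join(line.rstrip() for line in text.splitlines()) + "\n")
```

A commit hook would call matches_profile for every changed Java file and reject the commit as soon as one call returns False, which gives the yes/no answer without touching the working copy.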

Version control for DOCX and PDF?

I've been playing around with git and hg lately, and then it suddenly occurred to me that this kind of thing would be great for documents.
I have a document which I edit in DOCX and export as PDF. I tried using both git and hg to version control it, and it turns out that with hg you end up tracking only a binary, and diffing isn't meaningful. Although with git I can meaningfully diff DOCX (I haven't tried it on PDF yet), I was wondering if there is a better way to do it than the way I'm doing it right now. (Ideally, not having to leave Word to diff would be the best solution.)
There are two different concepts here - one is "can the version control system make some intelligent judgements about the contents of files?" - so that it can store just delta information between revisions (and do things like assign responsibility to individual parts of a file).
The other is 'do I have a file comparison tool which is useful for the types of files I have in the version control system'. Version control systems tend to come with file comparison tools which are inferior to dedicated alternatives. But they can pretty much always be linked to better diff programs - either for all file types or specific ones.
So it's common to use, for example, Beyond Compare as a general compare tool, with Word as a dedicated Word document comparer.
Different version control systems differ as to how good people perceive them to be at handling 'binaries', but that's often as much to do with handling huge files and providing exclusive locking as it is to do with file comparison.
http://tortoisehg.bitbucket.io/ includes a plugin called docdiff that integrates Word and Excel diff'ing.
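You can use Beyond Compare as an external diff tool for hg. Add the following to (or change it in) your user mercurial.ini; the extdiff extension has to be enabled for the cmd.vdiff entry to take effect:
[extensions]
extdiff =
[extdiff]
cmd.vdiff = c:/path/to/BCompare.exe
Then get the Beyond Compare file viewer rule for docx.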
Now you should be able to compare two versions of docx in Beyond Compare.
This article outlines a solution for DOCX using Pandoc,
while this post outlines a solution for PDF using pdf2html.
For docx only, I compiled instructions from multiple places here: https://gist.github.com/nachocab/6429893
# download docx2txt by Sandeep Kumar
wget -O docx2txt.pl http://www.cs.indiana.edu/~kinzler/home/binp/docx2txt
# make a wrapper
echo '#!/bin/bash
docx2txt.pl "$1" -' > docx2txt
chmod +x docx2txt
# make sure docx2txt.pl and docx2txt are in your current PATH. Here's a guide:
# http://shapeshed.com/using_custom_shell_scripts_on_osx_or_linux/
mv docx2txt docx2txt.pl ~/bin/
# set .gitattributes (unfortunately I don't think this can be set by default, you have to create it for every project)
echo "*.docx diff=word" > .git/info/attributes
# add the following to ~/.gitconfig
[diff "word"]
binary = true
textconv = docx2txt
# add a new alias
[alias]
wdiff = diff --color-words
# try it
git init
# create my_file.docx, add some content
git add my_file.docx
git commit -m "Initial commit"
# change something in my_file.docx
git wdiff my_file.docx
# awesome!
It works great on OSX
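To show what a textconv converter has to do, here is a minimal Python sketch that extracts the plain text of a docx. It is not docx2txt and makes no attempt to match its output: the function name is hypothetical, and it relies only on the facts that a .docx file is a ZIP archive whose main body lives in word/document.xml, with paragraphs in w:p elements and text runs in w:t elements.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(docx_file):
    """Return the document body as plain text, one line per paragraph."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(W + "p"):
        # Join all text runs of the paragraph; formatting is discarded.
        paragraphs.append("".join(run.text or "" for run in paragraph.iter(W + "t")))
    return "\n".join(paragraphs)


# Build a tiny in-memory docx for demonstration purposes.
body = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:body><w:p><w:r><w:t>Hello</w:t></w:r><w:r><w:t> world</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p></w:body></w:document>"
)
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("word/document.xml", body)
sample.seek(0)
print(docx_to_text(sample))
```

Git calls a textconv program with the file name as its single argument and reads the converted text from standard output, so a small wrapper that prints docx_to_text for that path could stand in for the docx2txt wrapper. Headers, footnotes and other parts stored outside word/document.xml are not covered by this sketch.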
If you happen to use a Mac, I wrote a git merge driver that can use Microsoft Word and tracked changes to merge and show conflicts between any file types Word can read & write.
http://github.com/jasmas/wordMerge
I say 'if you happen to use a Mac' because the driver I wrote uses AppleScript, primarily to accomplish this task.
It'd be nice to add a vbscript version to the project, but at the moment I don't have a Windows environment for testing. Anyone with some basic scripting knowledge should be able to take a look at what I'm doing and duplicate it in vbscript, powershell or whatever on Windows.
I used SVN (yes, in 2020 :-)) with TortoiseSVN on Windows. It has a built-in function to compare DOCX files (it opens Microsoft Word in a mode where your screen is divided into four parts: the file after the changes, before the changes, with changes highlighted and a list of changes). Screenshot below (sorry for the Polish version of MS Word). I also checked TortoiseGIT and it also has this functionality. I've read that TortoiseHG has it as well.