How to change encoding of EMF model instances from ASCII to UTF-8 in Eclipse? - eclipse

I work on a Mac most of the time. I have changed the "Text file encoding" of my Eclipse instance to UTF-8 under "Properties->Workspace". Still, when I create EMF model instances (dynamic or from registered examples), the created *.xmi files always have "ASCII" as their encoding. When I share the projects containing these models and instances over SVN, this causes problems if I later work with them on a Windows machine. It is probably a Windows problem related to SVN. I can post the exact error code here later.
Still, from now on, I would like to work with UTF-8 anyway. So the question is: why are my created files still encoded in ASCII, even though my Eclipse workspace property has been set to UTF-8?
I have also tried to change the property globally in eclipse.ini with the following line: -Dfile.encoding=UTF-8
Another thing I was wondering: could the "New text file line delimiter" setting under "Properties->Workspace", which is left at its default (in my case Unix), have anything to do with this issue?
Thank you!

If you have an XML resource, the following might help:
XMLResource JavaDoc
-Martin
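The pointer above is presumably about the save options of EMF's XMLResource, which defines an option named OPTION_ENCODING; the workspace "Text file encoding" preference does not necessarily reach the EMF serializer. Before changing anything, it can help to confirm what a generated file actually declares. The following is only a minimal sketch using the JDK's StAX parser, not EMF itself; the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: reports the encoding named in the XML declaration
// of an .xmi (or any other XML) file.
public final class XmiEncodingCheck {

    private XmiEncodingCheck() {
    }

    /** Returns the encoding written in the XML declaration of the stream. */
    public static String declaredEncoding(InputStream in) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(in);
            try {
                // Valid in the initial START_DOCUMENT state of the reader.
                return reader.getCharacterEncodingScheme();
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Input is not well-formed XML", e);
        }
    }

    /** Convenience overload for small in-memory documents. */
    public static String declaredEncoding(String xml) {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return declaredEncoding(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) {
        String sample = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>";
        System.out.println(declaredEncoding(sample));
    }
}
```

For a real model file, pass a FileInputStream for the .xmi file to declaredEncoding. If the declaration still names ASCII after changing the workspace setting, the encoding is being chosen by the serializer rather than by Eclipse's text file preference.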

Related

Persian font encoding in m-files on Matlab [duplicate]

I'd like to use Unicode characters in comments in a MATLAB source file. This seems to work when I write the text; however, if I close the file and reload it, "unusual" characters have been turned into question marks. I guess MATLAB is saving the file as ASCII.
Is there any way to tell MATLAB to use UTF-8 instead?
According to http://www.mathworks.de/matlabcentral/newsreader/view_thread/238995
feature('DefaultCharacterSet', 'UTF8')
will change the encoding to UTF-8. You can put the line above in your startup.m file.
How the MATLAB Process Uses Locale Settings shows how to set the encoding for different platforms. Use
feature('DefaultCharacterSet')
You can read more about this undocumented function here. See also this Matlab Central thread for other options.
Mac OSX only!
I found a solution that worked in my case and want to share it.
Mathworks advises here to use slCharacterEncoding(encoding) to change the encoding as desired, but on OS X this does not solve the issue, just as feature('DefaultCharacterSet') from the accepted answer does not. What helped me to get the UTF-8 encoding set for opening and saving .m files was the following link on MATLAB Answers:
https://www.mathworks.com/matlabcentral/answers/12422-macosx-encoding-problem
Matlab seems to ignore any value set with slCharacterEncoding(encoding) or feature('DefaultCharacterSet') and instead uses the region set in System Preferences -> Language & Region. After checking which region is selected, you can define the actual encoding in the hidden configuration file
$matlabroot/bin/lcdata.xml
To open this directory, go to Applications, right-click on Matlab and select Show Package Contents.
For example, for the German default ISO-8859-1, change the respective line in lcdata.xml from:
<locale name="de_DE" encoding="ISO-8859-1" xpg_name="de_DE.ISO8859-1">
to:
<locale name="de_DE" encoding="UTF-8" xpg_name="de_DE.UTF-8">
If the selected region is not present in lcdata.xml, this will not work.
Hope this helps!
The solution provided here worked for me on Windows with R2018a.
In case the link doesn't work: the idea is to use the file matlabroot/bin/lcdata.xml to configure an alias for the encoding name (some explanation can be found in the comments of that file):
<codeset>
<encoding name="UTF-8">
<encoding_alias name="windows-1252" />
</encoding>
</codeset>
Use your own value instead of windows-1252; the encoding currently in use can be obtained by running feature('locale').
Note, however, that if you use Unicode characters in help comments, neither the help browser nor the console window output displays them correctly.
For Mac OS users, Jendker's solution really helps!!! Thanks a lot first.
Recap here.
Check the default language in Matlab by typing in the command window getenv('LANG'). Mine returned en_US.ISO8859-1.
In the Application directory find Matlab, show its package contents. Go to bin, open lcdata.xml as an administrator, locate the corresponding xpg_name, in my case en_US.ISO8859-1. Change encoding in the same line to UTF-8. Save it.
Restart Matlab, and it's all done!

Eclipse won't ignore CRLF in team synchronization

First, let me explain what I am doing. I have a CVS repository in which I store 5,000 Data Definition Language files. These 5,000 files are generated by an external data modeling application; they are text files with Windows CRLF line endings. During development, if I need to make a change, I regenerate the 5,000 files and then overwrite the contents of my local CVS workspace in Eclipse. The full overwrite/replacement is to make sure that I don't miss any updates to files. After overwriting/replacing the files, I use Eclipse to do a Team > Synchronize with Repository. When I do this, the comparison flags every single file as an outgoing change because it does not appear to ignore CRLFs in its comparison. I have "Ignore white space" checked off and the Eclipse documentation states that it should be ignoring CRLFs:
Ignore whitespace option:
Causes the comparison to ignore differences which are whitespace characters
(spaces, tabs, etc.). Also causes differences in line terminators ( LF
versus CRLF) to be ignored.
When I open the files in text compare, it shows no diffs but there is an extra CRLF at top of one of the files. Is this a bug or is there an option I am missing in eclipse? It looks like the problem is that it doesn't ignore CRLFs that are on their own line.
The Eclipse compare dialog doesn't have a bug; you're just confused because you're seeing the output of several, independent problems.
The option "ignore whitespace" only reduces the amount of changes that the compare dialog shows; it has no effect whatsoever on the differences that CVS sees. So as long as the files have the wrong line ending, CVS will complain.
Some version control systems allow you to specify converters to solve this issue, CVS doesn't. So you really need to generate files with the correct line endings.
The "single file with extra CRLF" really has an extra CRLF. Find out why and fix that to make the difference go away.
When generating files, you should never use PrintStream or PrintWriter. It is tempting but these two have many bugs (like close() doesn't flush(), violating their API contract) plus they use platform-dependent line endings, which is almost never what you want. Yes, it might work by accident but trust me on this, that's not what you want. You don't want your paycheck filed by accident, either, right?
If you use neither PrintStream nor PrintWriter, then also avoid the system property line.separator for the same reasons.
I suggest writing a helper class which has many of the methods of PrintStream / PrintWriter but none of the bugs. Plus it should allow you to set the line delimiter to whatever you need.
Note: If you use a Writer, make sure you also specify the charset / encoding or the "UTF-8 to bytes" conversion will be as random as the line endings.
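A minimal sketch of such a helper could look like the following. The class name FixedEolWriter and its methods are invented for illustration; the point is only that both the charset and the line delimiter are passed in explicitly, so neither the platform default encoding nor line.separator is ever consulted.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: a small println-style writer with an explicit
// charset and an explicit line delimiter.
public final class FixedEolWriter implements AutoCloseable {
    private final Writer out;
    private final String eol;

    public FixedEolWriter(OutputStream stream, Charset charset, String eol) {
        this.out = new OutputStreamWriter(stream, charset);
        this.eol = eol;
    }

    /** Writes the text without a line delimiter. */
    public FixedEolWriter print(String text) throws IOException {
        out.write(text);
        return this;
    }

    /** Writes the text followed by the configured line delimiter. */
    public FixedEolWriter println(String text) throws IOException {
        out.write(text);
        out.write(eol);
        return this;
    }

    public void flush() throws IOException {
        out.flush();
    }

    @Override
    public void close() throws IOException {
        out.close(); // Writer.close() flushes before closing
    }

    /** Demo helper: renders the lines into a byte array. */
    public static byte[] render(Charset charset, String eol, String... lines) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (FixedEolWriter writer = new FixedEolWriter(buffer, charset, eol)) {
            for (String line : lines) {
                writer.println(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = render(StandardCharsets.UTF_8, "\r\n", "CREATE TABLE t;");
        System.out.println(bytes.length + " bytes written");
    }
}
```

For the DDL files in the question, the generator would pass whatever delimiter the repository copies use, so regenerated files compare as byte-identical in CVS.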

Localizable.strings woes

My Localizable.strings file has somehow been corrupted and I don't know how to restore it.
If I open it as a Plain Text File it starts with weird characters that I can't copy here.
If I leave the file be the app builds. If I make any changes either the values aren't interpreted properly or I get an error at compile time.
Localizable.strings: Conversion of string failed. The string is empty.
Command /Developer/Library/Xcode/Plug-ins/CoreBuildTasks.xcplugin/Contents/Resources/copystrings failed with exit code 1
I suspect this is an encoding problem but I don't know how it happened (maybe SVN is to blame?) nor how to solve it. Any tips will be much appreciated.
I have issues with the same file that sound very similar to your own. What happens for me is that Xcode doesn't know the correct file encoding. I often get this when rearranging the project and I remove and re-add this file to the Xcode project. When I re-add the file, its encoding gets set to something like Western Roman which can't seem to render anything other than ASCII.
Here's what I do to fix the problem:
In Xcode select the Localizable.strings file in the Groups & Files panel.
Do a Get Info on that file.
On the info panel select the General tab.
In that tab go to the File Encoding and change its value.
The last step is where the trick lies, as you now have to guess the right encoding. I find that "Unicode (UTF-8)" works for most European languages. For Asian languages, "Unicode (UTF-16/32)" are the ones to try.
I just had that error because I forgot a semicolon. Took me a while to figure it out. Seems like a really ambiguous compiler error but the fix was simple.
Make sure in File-Get Info, that UTF-16 is selected. If it's set to none or UTF-8 as encoding then you need to change it. If your characters have spaces between them then you choose to "re-interpret" the file as UTF-16. If there are weird characters in the file, then you need to remove them.
Apart from the UTF-8 problem, you sometimes still have to check the content for syntax problems.
Use the following regular expression to verify your text line by line; if any line does not match, there must be a problem.
"(.+?)"="(.+?)";
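The line-by-line check can be sketched in Java as follows. The class and method names are made up for illustration, and the pattern is the expression above taken unchanged, so it assumes entries are written without spaces around the equals sign.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: lists the lines of a .strings file that do not
// match the expression quoted above.
public final class StringsLineCheck {
    // Same expression as in the answer: "key"="value";
    private static final Pattern ENTRY =
            Pattern.compile("\"(.+?)\"=\"(.+?)\";");

    private StringsLineCheck() {
    }

    /** Returns the 1-based numbers of non-blank lines that fail to match. */
    public static List<Integer> badLines(String content) {
        List<Integer> bad = new ArrayList<>();
        String[] lines = content.split("\\R"); // any line break sequence
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i].trim();
            if (!line.isEmpty() && !ENTRY.matcher(line).matches()) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        String sample = "\"title\"=\"Hello\";\n\"body\"=\"World\"\n";
        System.out.println("Suspicious lines: " + badLines(sample));
    }
}
```

Because the expression is used as is, comment lines and entries written as "key" = "value"; (with spaces) are reported too. Loosening the pattern, for example with \s* around the equals sign, would be a deliberate change to the check described above.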
You can use the plutil command line tool. Without options or with the -lint option, it checks the syntax of the file given as argument. It will tell you more precisely where the error is.
This happens to me when there is a missing quote or something else wrong with the file. Most commonly, since my language files are done by another team member, he tends to forget a quote or something. Usually Xcode shows an error on that line; sometimes it doesn't and just throws a "Corrupted data" error.
Double check if all your strings are properly closed in quotes
Open the file in Xcode.
Right click it in Project Navigator.
Select Open as -> ASCII Property List

What is the default VB6 charset?

We have an application written in Java which reads some text generated by a VB6 application.
The problem is that this VB6 application generates its output using some special characters, like ç, ã, á, and we don't know which charset they are encoded in.
So the question is: is there a default charset used by VB6? Which is it?
How do you transfer the data from one to the other? Via a file? If so, it uses the machine's default encoding. I don't know the Java code to get it, but in C# it's Encoding.Default...
Well, here is what we discovered. We don't know if that was because our VB6 application was executed on the command line, but it was actually using the MS-DOS environment default charset, which in our case was windows-1252.
So, all we had to do was to change our Java code to something like this:
InputStreamReader inputReader = new InputStreamReader(input, "windows-1252");
and it just worked fine!
Maybe it's not even because of the MS-DOS environment, but because we are getting this data from a Microsoft Access database. Personally, I think that this is the most probable explanation for our problem.
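To illustrate why the charset passed to the reader matters, here is a small self-contained sketch. The class name and the sample bytes are made up for the demonstration: 0xE7, 0xE3 and 0xE1 are the windows-1252 codes for ç, ã and á, and the same bytes are not valid UTF-8.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: decodes the same bytes with two different charsets.
public final class Cp1252Demo {
    public static final Charset WINDOWS_1252 = Charset.forName("windows-1252");

    private Cp1252Demo() {
    }

    /** Reads the whole stream as text using the given charset. */
    public static String readAll(InputStream input, Charset charset) {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(input, charset))) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Sample bytes as a VB6 program might have written them.
        byte[] raw = {(byte) 0xE7, (byte) 0xE3, (byte) 0xE1};

        String good = readAll(new ByteArrayInputStream(raw), WINDOWS_1252);
        String bad = readAll(new ByteArrayInputStream(raw), StandardCharsets.UTF_8);

        System.out.println("windows-1252 decoded correctly: "
                + good.equals("\u00e7\u00e3\u00e1"));
        System.out.println("UTF-8 produced replacement characters: "
                + bad.contains("\uFFFD"));
    }
}
```

Passing a Charset object instead of a charset name also avoids the checked UnsupportedEncodingException of the String-based InputStreamReader constructor.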