I run some MATLAB functions, and after upgrading to a new computer, MATLAB 2017b can no longer do string comparisons involving umlauts because every umlaut is displayed as ?. See below:
strcmp(factor_struct.conditions(i,k),'gr?sser als')
Changing ? to ö does not work, as MATLAB seems unable to display this character properly.
Is there a setting I can change so that MATLAB can read this type of character?
I managed to solve the problem by changing the OS language format.
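To see why the comparison fails, here is a minimal sketch in Python rather than MATLAB. It assumes the source file was saved as Windows-1252 and then read by a system that cannot map the ö byte. That assumption is illustrative and not confirmed in the thread.

```python
# The literal from the question, stored as Windows-1252 bytes (assumed encoding).
expected = "grösser als"
raw = expected.encode("cp1252")

# A reader that assumes plain ASCII cannot map byte 0xF6 and substitutes a placeholder.
misread = raw.decode("ascii", errors="replace")

# A reader that assumes the matching encoding recovers the original text.
reread = raw.decode("cp1252")

print(misread == expected)  # False: the placeholder is not "ö"
print(reread == expected)   # True
```

Once the placeholder is in the string, no comparison against the real character can succeed, which is consistent with the fix being a change to the system's language settings rather than to the code.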
I have a dataset, but its variable labels are in Unicode like this:
I tried typing:
unicode
However, this simply displays the following:
How can I display the Unicode correctly?
Or, at least, is there any way to view the labels using another program?
Assuming your data are in a file data.dta in your working directory (where Stata should point):
clear
unicode encoding set euc-kr
unicode translate data.dta
Type help encodings from Stata's command prompt for details regarding the different formats.
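What the translation step does can be sketched outside Stata. The Python snippet below is only an illustration under assumptions: the label text is hypothetical, and Latin-1 stands in for whatever wrong encoding the viewer applies.

```python
# Hypothetical variable label, stored in the legacy EUC-KR encoding.
label = "소득"
raw = label.encode("euc_kr")

# Reading the bytes with the wrong encoding produces unreadable characters.
garbled = raw.decode("latin-1")

# Reading them as EUC-KR recovers the text, which can then be saved as UTF-8.
fixed = raw.decode("euc_kr")
as_utf8 = fixed.encode("utf-8")

print(garbled != label)  # True
print(fixed == label)    # True
```

The same idea applies to any other program used to inspect the labels: it has to be told that the bytes are EUC-KR.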
I am trying to create a PDF file from a MATLAB figure using CMYK colors, but I am running into a problem with umlauts and some other special characters. Is there any way to handle this other than LaTeX? The following example demonstrates the issue.
plot(rand(199,1))
title_string = ['Some text:äö' char(228) ':2005' char(150) '2008:end text'];
title(title_string);
print(gcf,'-dpdf','cmykfile.pdf','-r600','-cmyk');
print(gcf,'-dpdf','rgbfile.pdf','-r600');
As you can see from the pdf-files the RGB-version handles umlauts, but not en-dash, and CMYK skips them all.
PDF is generated in Matlab using Ghostscript, but I have not found how to configure character encoding for GS.
I am using Windows and Matlab R2014.
I'm not completely sure this is the solution you were looking for.
Anyway, if you create an EPS file first and then convert it to PDF, the output file has no issues with the special characters in the title, provided that you don't build your title string using char.
plot(rand(199,1))
title_string = 'Some text:äöä:2005—2008æ:end text';
title(title_string);
print(gcf,'-depsc','cmykfile.eps','-r600','-cmyk');
!ps2pdf cmykfile.eps cmykfile.pdf
The code above works if you have the ps2pdf utility in your system path. You already have ps2pdf on your computer if you have MiKTeX installed, but you might need to update your system path. ps2pdf is essentially a shortcut to gs, so if you have only gs installed and not MiKTeX, you should be able to achieve the same result.
EDIT
On my machine (Windows 7, MATLAB R2014b), this code also works well, without the need for ps2pdf:
plot(rand(199,1))
title_string = 'Some text:äöä:2005—2008æ:end text';
title(title_string);
print(gcf,'-dpdf','cmykfile.pdf','-r600','-cmyk');
It seems that the issue happens when you build the title string using char.
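One plausible explanation for the difference between char(228) and char(150), not confirmed in the thread, is that 150 is an en dash only in Windows-1252, whereas 228 is ä in both Windows-1252 and Unicode. The Python sketch below checks those code points; it says nothing about how MATLAB or Ghostscript handle them internally.

```python
import unicodedata

# Code 228 is "ä" both as a Unicode code point and as a Windows-1252 byte.
umlaut_unicode = chr(228)
umlaut_cp1252 = bytes([228]).decode("cp1252")

# Code 150 is an en dash only as a Windows-1252 byte (0x96).
dash_cp1252 = bytes([150]).decode("cp1252")

# As a Unicode code point, U+0096 is a C1 control character, not a dash.
dash_unicode_category = unicodedata.category(chr(150))

print(umlaut_unicode == umlaut_cp1252)  # True
print(dash_cp1252 == "\u2013")          # True: en dash
print(dash_unicode_category)            # Cc (control character)
```

If that is the cause, using the en dash by its Unicode code point (8211, i.e. U+2013) or typing it literally, as in the answer's code, avoids the ambiguity.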
I am using wxMac 2.8 in a non-Unicode build. I am trying to read a file containing umlauts such as "ü" into a wxTextCtrl. When I do, the data is interpreted in the current encoding, but it is actually a multibyte string. I narrowed the problem down to this:
text_ctrl->Clear();
text_ctrl->SetValue("üüüäääööößßß");
This is the result:
üüüäääööößßß
Note that the character count has doubled - printing the string in gdb displays "\303\274" and similar per original char. Typing "ü" or similar into the textctrl is no problem. I tried various wxMBConv methods but the result is always the same. Is there a way to solve this?
Best regards,
If you use anything but 7-bit ASCII, you must use a Unicode build of wxWidgets. Just do yourself a favour and switch to it. If you have too much existing code that was written for the "ANSI" build of wxWidgets 2.8 and earlier and doesn't compile with the Unicode build, use wxWidgets 2.9 instead, where it will compile and work as intended.
It sounds like your text editor (for program source code) is in a different encoding from the running program.
Suppose for example that your text entry control and the rest of your program are (correctly) using UTF-8. Now if your text editor is using some other encoding, then a string that looks fine on screen will actually contain garbage bytes.
Assuming you are in a position to help create a pure-UTF8 world, then you should:
1) Encode UTF-8 directly into the string literals using escapes, e.g. "\303" or "\xc3". That's annoying to do, but it means you don't have to worry about your text editor (or the editor settings of other developers).
2) Then check that the program is using UTF-8 everywhere.
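The doubling described in the question can be reproduced with a short Python sketch. It assumes the "current encoding" on the Mac is MacRoman, which the thread does not state; any single-byte encoding would double the count in the same way.

```python
text = "üüüäääööößßß"

# Each of these characters takes two bytes in UTF-8.
utf8_bytes = text.encode("utf-8")

# Interpreting the UTF-8 bytes one byte per character (MacRoman is an assumption).
misread = utf8_bytes.decode("mac_roman")

print(len(text), len(misread))                 # 12 24
print(oct(utf8_bytes[0]), oct(utf8_bytes[1]))  # 0o303 0o274, as seen in gdb

# Writing the bytes as escapes keeps the literal independent of the editor's encoding.
escaped = b"\303\274".decode("utf-8")
print(escaped == "ü")                          # True
```

The octal values match what gdb printed, which suggests the source file itself is UTF-8 and only the interpretation at display time is wrong.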
The MATLAB Engine is a C interface to MATLAB. It provides a function engEvalString() which takes some MATLAB code as a C string (char *), evaluates it, then returns MATLAB's output as a C string again.
I need to be able to pass unicode data to MATLAB through engEvalString() and to retrieve the output as unicode. How can I do this? I don't care about the particular encoding (UTF-8, UTF-16, etc.), any will do. I can adapt my program.
More details:
To give a concrete example, if I send the following string, encoded as, say, UTF-8,
s='Paul Erdős'
I would like to get back the following output, encoded again as UTF-8:
s =
Paul Erdős
I hoped to achieve this by sending feature('DefaultCharacterSet', 'UTF-8') (reference) before doing anything else, and this worked fine when working with MATLAB R2012b on OS X. It also works fine with R2013a on Ubuntu Linux. It does not work on R2013a on OS X though. Instead of the character ő in the output of engEvalString(), I get character code 26, which is supposed to mean "I don't know how to represent this". However, if I retrieve the contents of the variable s by other means, I see that MATLAB does correctly store the character ő in the string. This means that it's only the output that didn't work, but MATLAB did interpret the UTF-8 input correctly. If I test this on Windows with R2013a, neither input, nor output works correctly. (Note that the Windows and the Mac/Linux implementations of the MATLAB Engine are different.)
The question is: how can I get unicode input/output working on all platforms (Win/Mac/Linux) with engEvalString()? I need this to work in R2013a, and preferably also in R2012b.
If people are willing to experiment, I can provide some test C code. I'm not posting that yet because it's a lot of work to distill a usable small example from my code that makes it possible to experiment with different encodings.
UPDATE:
I learned about feature('locale') which returns some locale-related data. On Linux, where everything works correctly, all encodings it returns are UTF-8. But not on OS X / Windows. Is there any way I could set the various encodings returned by feature('locale')?
UPDATE 2:
Here's a small test case: download. The zip file contains a MATLAB Engine C program, which reads a file, passes it to engEvalString(), then writes the output to another file. There's a sample file included with the following contents:
feature('DefaultCharacterSet', 'UTF-8')
feature('DefaultCharacterSet')
s='中'
The (last part of the) output I expect is
>>
s =
中
This is what I get with R2012b on OS X. However, R2013a on OS X gives me character code 26 instead of the character 中. Outputs produced by R2012b and R2013a are included in the zip file.
How can I get the expected output with R2013a on all three platforms (Windows, OS X, Linux)?
I strongly urge you to use engPutVariable, engGetVariable, and Matlab's eval instead. What you're trying to do with engEvalString will not work with many unicode strings due to embedded NULL (\0) characters, among other problems. Due to how the Windows COM interface works, the Matlab engine can't really support unicode in interpreted strings. I can't speculate about how the engine works on other platforms.
Your other question had an answer about using mxCreateString_UTF16. Wasn't that sufficient?
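The embedded-NULL concern raised in this answer depends on the encoding. As a rough Python sketch (the helper c_string_view is hypothetical and only mimics how a char * is read up to the first zero byte), UTF-16 text contains zero bytes while UTF-8 text does not:

```python
command = "s='Paul Erdős'"

utf16 = command.encode("utf-16-le")
utf8 = command.encode("utf-8")

def c_string_view(buf: bytes) -> bytes:
    """Hypothetical helper: return what a C function reading a char * would see."""
    return buf.split(b"\x00", 1)[0]

print(c_string_view(utf16))         # b's': truncated at the first zero byte
print(c_string_view(utf8) == utf8)  # True: UTF-8 has no zero bytes here
```

So a char * interface can in principle carry UTF-8 intact, but whether the engine decodes it correctly on a given platform is a separate question, as the platform differences described in the question show.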
I'm stuck on the following problem:
I have a script that retrieves the title from the Firefox window:
tell application "Firefox"
if the (count of windows) is not 0 then
set window_name to name of front window
end if
end tell
It works well as long as the title contains only English characters, but when the title contains non-ASCII characters (Cyrillic in my case), it produces UTF-8 garbage. I analyzed this garbage a bit, and it seems that each Cyrillic character is converted to UTF-8 without regard to the code page; that is, instead of using the Cyrillic code page for the conversion, it uses no code page at all, so I end up with UTF-8 text whose characters differ from those in the window title.
My question is: how can I retrieve the window title in UTF-8 directly, without any conversion?
I can achieve this using AXAPI, but I want to do it with AppleScript because AXAPI requires an option to be turned on in the system.
UPD:
It works fine in the AppleScript Editor. But I'm compiling it through the C++ code via OSACompile->OSAExecute->OSADisplay
I don't know the guts of the AppleScript Editor so maybe it has some inside information about how to encode the characters
I found the answer while writing the update. Sometimes asking a question is a good way to understand it better :)
So, for future searchers: if you want a Unicode result from the script execution, pass typeUnicodeText to OSADisplay; the resulting AEDesc will then contain the text in UTF-16LE.
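Since the original goal was UTF-8, the UTF-16LE result still needs one conversion step. The Python sketch below shows only that step; the title text is hypothetical, and the byte string stands in for data copied out of the AEDesc, which is not shown here.

```python
# Hypothetical window title; the bytes stand in for the AEDesc payload.
title = "Привет, мир"
utf16le_payload = title.encode("utf-16-le")

# Decode the payload as UTF-16LE, then re-encode as UTF-8.
decoded = utf16le_payload.decode("utf-16-le")
utf8_title = decoded.encode("utf-8")

print(decoded == title)                     # True
print(utf8_title.decode("utf-8") == title)  # True
```

Decoding with the encoding the data actually uses is what prevents the kind of garbage described in the question.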