How do you add a macron to a character in an NSString? via Unicode? - iphone

Objective-C iOS Programming:
I need to display a number like 8.33333 as just 8.3, with the three having a macron (a repeating-digit symbol, a bar line) above it. I have done some searching and have not found a solution to this. I have found that the escape in C/C++/Java source code is "\u0304" and the Unicode code point is U+0304. Is there a way that I can create an NSString from a Unicode character? And how would I create a character with a macron?
Thanks.

For combining characters such as U+0304, the string should contain the original letter followed by the combining character. For instance,
NSString *str = #"ca\u0304t";
is a representation of cāt.
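Applied to the asker's case, here is a minimal sketch (the rounding logic is illustrative and assumes a single repeating digit):
double value = 25.0 / 3.0;                         // 8.3333...
int whole = (int)value;                            // 8
int repeating = (int)((value - whole) * 10 + 0.5); // first repeating digit: 3
NSString *str = [NSString stringWithFormat:@"%d.%d\u0304", whole, repeating];
NSLog(@"%@", str);                                 // prints 8.3 with a macron over the 3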

Related

Adding the combining overline unicode character

I'm writing a program that converts an integer to a Roman numeral.
Roman numerals over 3,999 are overlined, so IV overlined is 4,000, CM overlined is 900,000, etc. These lines can stack.
So as not to limit my program, stopping at just 3,999 isn't good enough.
The question is, how do I add the "combining overline" unicode character to my string to achieve this?
My program is written in Rust, but I suspect the solution is similar across most languages that support unicode strings.
Just add the combining mark after each character.
Here's a Python example. What you see depends on support for combining marks in your console/IDE/browser.
with open('test.txt', 'w', encoding='utf-8-sig') as f:
    print('I\u0305V\u0305', file=f)
Output:
I̅V̅
In testing, U+0305 COMBINING OVERLINE stacked correctly up to two deep, but Chrome drew three incorrectly. There is also U+033F COMBINING DOUBLE OVERLINE.
You can just use them in string constants, either with the Unicode escape sequence (here shown for Rust) or directly (as they can be easily represented in UTF-8 source code files):
println!("I\u{0305}V\u{0305} - I̅V̅");
Note, however, that each overlined letter requires two Unicode code points, so it does not fit into a single char; you need to use a string.
The combining overline character itself does fit into a single char:
let combining_overline = '\u{0305}';
To apply it, insert it after the base character that needs the overline.
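Since most of this page concerns Objective-C, here is the same technique there as a minimal sketch (the helper name is hypothetical, and the input is assumed to be a plain ASCII numeral such as @"IV"):
// Append U+0305 COMBINING OVERLINE after each letter of the numeral.
NSString *OverlineRomanNumeral(NSString *numeral) {
    NSMutableString *result = [NSMutableString string];
    [numeral enumerateSubstringsInRange:NSMakeRange(0, numeral.length)
                                options:NSStringEnumerationByComposedCharacterSequences
                             usingBlock:^(NSString *sub, NSRange r, NSRange er, BOOL *stop) {
        [result appendFormat:@"%@\u0305", sub];
    }];
    return result;
}
// OverlineRomanNumeral(@"IV") => I̅V̅ (4,000)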

NSString and no UTF-8 symbols

For some of you (I'm sure) this question will be quite simple to answer, but I'm having some difficulty understanding how to solve the problem.
I have a .txt file containing a table like this:
" 236
? 26
\x00EE 16
As you probably understood, the left column lists symbols and the right one lists codes for them that I defined in my app.
And... you probably also noticed that some of the symbols are "strange". The \x00EE should be the "å" (a with a ring above).
Unfortunately I can't control the left column, i.e. it comes from other software. Making some experiments, I found that:
NSLog( @"\x00ee" );
for example, produces a warning telling me that the code does not belong to the UTF-8 range.
So I was wondering how to convert the NSString @"\x00ee" (which I read from the file, so it is a string composed of 6 chars) into the single Unicode letter "å" (a with a ring above).
Can anyone help me?
Thanks...
You need to find out what character set encoding was used. 0xEE is the Unicode code point for î. In Unicode, å is 0xE5, which is encoded in UTF-8 as the sequence 0xC3 0xA5. The following does the trick for me:
NSLog(#"\xc3\xa5");
If your input string contains only ASCII characters then you can use the fact that
NSNonLossyASCIIStringEncoding decodes \uNNNN to the corresponding Unicode character:
NSString *s = @"\\x00ee"; // from your text file
NSString *s1 = [s stringByReplacingOccurrencesOfString:@"\\x" withString:@"\\u"];
NSData *d = [s1 dataUsingEncoding:NSASCIIStringEncoding];
NSString *s2 = [[NSString alloc] initWithData:d encoding:NSNonLossyASCIIStringEncoding];
NSLog (#"%#", s2);
Output: î, which is U+00EE (LATIN SMALL LETTER I WITH CIRCUMFLEX).
(Remark: å is U+00E5, not U+00EE).
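An alternative sketch that parses the hex digits directly, assuming the escapes always have the form \xNNNN with a code point in the Basic Multilingual Plane:
NSString *s = @"\\x00ee";             // from the text file
unsigned code = 0;
NSScanner *scanner = [NSScanner scannerWithString:[s substringFromIndex:2]];
[scanner scanHexInt:&code];           // code == 0x00EE
NSString *decoded = [NSString stringWithFormat:@"%C", (unichar)code];
NSLog(@"%@", decoded);                // î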

NSString UTF8String mangling unicode characters

When I run -[NSString UTF8String] on certain Unicode characters, the resulting const char* representation is mangled both in NSLog and on the device/simulator display. The NSString itself displays fine, but I need to convert the NSString to a cStr to use it in CGContextShowTextAtPoint.
It's very easy to reproduce (see code below) but I've searched for similar questions without any luck. Must be something basic I'm missing.
const char *cStr = [@"章" UTF8String];
NSLog(@"%s", cStr);
Thanks!
CGContextShowTextAtPoint is only for ASCII chars.
Check this SO question for answers.
When using the %s format specifier, you cannot be guaranteed that the characters of a C string will print correctly if they are not ASCII. A complex character like the one you've defined can be expressed in UTF-8 as a multi-byte sequence, but %s interprets the bytes you pass to the formatter (in this case, NSLog) in the system encoding. See Apple's documentation:
https://developer.apple.com/library/mac/documentation/cocoa/Conceptual/Strings/Articles/formatSpecifiers.html
%s
Null-terminated array of 8-bit unsigned characters. %s interprets its input in the system encoding rather than, for example, UTF-8.
As for CGContextShowTextAtPoint not working: that API supports only the MacRoman character set, which is not the entire Unicode character set.
You'll need to look into another API for showing Unicode characters; Core Text is probably where you'll want to start.
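For instance, a minimal Core Text sketch (assuming you already have a CGContextRef named ctx, e.g. from UIGraphicsGetCurrentContext() inside drawRect:; coordinate flipping is left out):
#import <CoreText/CoreText.h>

NSAttributedString *text = [[NSAttributedString alloc] initWithString:@"章"];
CTLineRef line = CTLineCreateWithAttributedString((__bridge CFAttributedStringRef)text);
CGContextSetTextPosition(ctx, 20.0, 20.0);
CTLineDraw(line, ctx);
CFRelease(line);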
I've never noticed this issue before, but some quick experimentation shows that using printf instead of NSLog will cause the correct Unicode character to show up.
Try:
printf("%s", cStr);
This gives me the desired output ("章") both in the Xcode console and in Terminal. As nob1984 stated in his answer, the interpretation of the character data is up to the callee.

How to get stroke count of Chinese character?

How to get stroke count of Chinese character?
Example:
一 => 1
十 => 2
日 => 4
Short answer: you can't without a hardcoded map of characters to stroke counts. And then you'll have to assume the user is using a particular Chinese variant (e.g. traditional).
Unicode (the basic character set used by NSString) doesn't distinguish between traditional, simplified, Japanese-specific, Korean-specific, etc. hanzi, and it does not encode stroke information directly. Rather, it distinguishes between characters (not their graphical representations), and a character may have a different stroke count depending on the language and font used. So while the character 十 may universally have two strokes, other characters will vary.
The example Wikipedia gives is the character for "grass", U+8279, which has four strokes in traditional Chinese but three in every other variant.
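A minimal sketch of the hardcoded-map approach (the three counts come from the question's examples; a real table would have to be built from external data such as the Unihan kTotalStrokes field):
NSDictionary<NSString *, NSNumber *> *strokeCounts = @{
    @"一": @1, @"十": @2, @"日": @4,
};
NSLog(@"%@", strokeCounts[@"日"]); // 4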
You can use "ssc install cnstroke" STATA command for the said purpose.
First, call
NSInteger section = [[UILocalizedIndexedCollation currentCollation] sectionForObject:yourObject collationStringSelector:@selector(objectsProperty)];
then check index of section in following array
[UILocalizedIndexedCollation currentCollation].sectionTitles
Remember to add
Localized resources can be mixed = YES
(that is, the CFBundleAllowMixedLocalizations key) in your Info.plist.
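Putting the pieces together, a minimal sketch (person and its name property are hypothetical; the section titles reflect stroke counts only under a locale that collates Chinese by strokes, such as Traditional Chinese):
NSInteger section =
    [[UILocalizedIndexedCollation currentCollation] sectionForObject:person
                                            collationStringSelector:@selector(name)];
NSString *title = [UILocalizedIndexedCollation currentCollation].sectionTitles[section];
NSLog(@"section %ld, title %@", (long)section, title);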

Displaying accents and other UTF-8 characters in a UILabel

I have a little app which lists the names of certain people from around the world, and some of those names use characters that are not normal ASCII characters, like Díaz or Thérèse, for example.
The strings show up in Xcode just fine, but when I put them in a UILabel, they behave unexpectedly.
My question is: is there a way to set up a UILabel to take the exact string from Xcode and display it properly, even if it contains UTF-8 characters (or any other character encoding, for that matter)?
UIKit fully supports Unicode; your problem is most likely the encoding of the source file. You can set that in the inspector (Xcode 4: ⌘⌥1) under "Text Settings". Make sure it is UTF-8 as well.
Alternative: use Unicode escapes like @"\u2605" (should display ★).
Try encoding the string (note that NSASCIIStringEncoding would not work here, since ASCII cannot represent accented characters):
NSString *s = [NSString stringWithCString:value encoding:NSUTF8StringEncoding];