How to extract unicode value of an NSString

I'm trying to extract the unicode value of the first character of an NSString for range checking purposes. How can this be accomplished?

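The method you're looking for is -[NSString characterAtIndex:]. It returns a unichar, which is a single UTF-16 code unit, so a character outside the Basic Multilingual Plane comes back as two surrogate values rather than one code point.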
http://developer.apple.com/library/mac/documentation/Cocoa/Reference/Foundation/Classes/NSString_Class/Reference/NSString.html#//apple_ref/occ/instm/NSString/characterAtIndex:
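For range checking on full code points, the UTF-16 units returned by characterAtIndex: may need to be combined. The following is a minimal sketch in plain C (which also compiles as Objective-C), assuming the code units have already been copied into a buffer; the function names code_point_at and is_cjk_unified are hypothetical helpers, not part of Foundation.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the Unicode code point that starts at units[index] in a UTF-16
   buffer of len code units. A valid surrogate pair is combined into one
   code point; anything else is returned as the single code unit. */
uint32_t code_point_at(const uint16_t *units, size_t len, size_t index)
{
    uint16_t first = units[index];
    if (first >= 0xD800 && first <= 0xDBFF && index + 1 < len) {
        uint16_t second = units[index + 1];
        if (second >= 0xDC00 && second <= 0xDFFF) {
            return 0x10000u + (((uint32_t)first - 0xD800u) << 10)
                            + ((uint32_t)second - 0xDC00u);
        }
    }
    return first;
}

/* Example range check: is the code point in the CJK Unified Ideographs
   block (U+4E00 to U+9FFF)? */
int is_cjk_unified(uint32_t cp)
{
    return cp >= 0x4E00 && cp <= 0x9FFF;
}
```

For characters inside the Basic Multilingual Plane, the value from characterAtIndex: can be range-checked directly; the surrogate handling only matters for code points above U+FFFF.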

Related

NSString UTF8String mangling unicode characters

When I run [NSString UTF8String] on certain unicode characters the resulting const char* representation is mangled both in NSLog and on the device/simulator display. The NSString itself displays fine but I need to convert the NSString to a cStr to use it in CGContextShowTextAtPoint.
It's very easy to reproduce (see code below) but I've searched for similar questions without any luck. Must be something basic I'm missing.
const char *cStr = [@"章" UTF8String];
NSLog(@"%s", cStr);
Thanks!

CGContextShowTextAtPoint is only for ASCII chars.
Check this SO question for answers.

With the string format specifier (%s), non-ASCII characters in a C string are not guaranteed to print correctly. A character such as 章 is encoded in UTF-8 as a multi-byte sequence, but %s interprets the bytes you pass to the formatting function (here, NSLog) in the system encoding rather than as UTF-8. See Apple's documentation:
https://developer.apple.com/library/mac/documentation/cocoa/Conceptual/Strings/Articles/formatSpecifiers.html
%s
Null-terminated array of 8-bit unsigned characters. %s interprets its input in the system encoding rather than, for example, UTF-8.
As for CGContextShowTextAtPoint not working: that API supports only the MacRoman character set, which covers just a small part of Unicode.
You'll need to look into another API for showing Unicode characters. Core Text is probably where you'll want to start.

I've never noticed this issue before, but some quick experimentation shows that using printf instead of NSLog will cause the correct Unicode character to show up.
Try:
printf("%s", cStr);
This gives me the desired output ("章") both in the Xcode console and in Terminal. As nob1984 stated in his answer, the interpretation of the character data is up to the callee.
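The mangling is easier to picture at the byte level. 章 is U+7AE0, which UTF-8 encodes as the three bytes E7 AB A0; a consumer that treats every byte as one character of a single-byte encoding therefore sees three characters instead of one. A minimal sketch in plain C, assuming valid UTF-8 input (utf8_code_point_count is a hypothetical helper):

```c
#include <stddef.h>

/* Counts Unicode code points in a NUL-terminated UTF-8 string by skipping
   continuation bytes (those of the form 10xxxxxx). Assumes valid UTF-8. */
size_t utf8_code_point_count(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the string "\xE7\xAB\xA0", strlen reports 3 bytes while this function reports 1 code point, which is the gap a single-byte interpretation falls into.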

How to avoid UTF8 characters inside my NSDictionary?

I'm saving an NSString inside an NSArray, and that NSArray inside an NSDictionary. When the string is something like Hi I'm XYZ, I notice that the single quote is replaced by the corresponding UTF character when it is stored.
How can I avoid this, or how can I get my actual text, including special characters, back from the NSArray or the NSDictionary?
Any help is appreciated.

NSString internally uses Unicode characters. So it easily can handle all sorts of characters from different languages.
You cannot choose the internal encoding of NSString. It's always Unicode. If you have an encoding problem, then you have either created the NSString instance incorrectly or you have output the instance the wrong way.
And there's no such thing as a UTF character.
Please better describe your problem and show the relevant source code.

How do you add a macron to a character in an NSString? via Unicode?

Objective-C iOS Programming:
I need to display a number like 8.33333 just as 8.3, with the three having a macron (the repeating-decimal symbol, a bar) above it. I have done some searching and have not found a solution to this. I have found the encoding for C/C++/Java source code to be "\u0304" and for Unicode to be "U+0304". Is there a way that I can create an NSString from a Unicode character? And how would I create a Unicode character with a macron?
Thanks.

For combining characters such as U+0304, the string should contain the original letter followed by the combining character. For instance,
NSString *str = @"ca\u0304t";
is a representation of cāt.
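Applied to the number in the question, the same rule puts U+0304 right after the digit 3, which in Objective-C would presumably be written @"8.3\u0304". The sketch below shows the same idea in plain C using the UTF-8 bytes of U+0304 (CC 84); append_macron is a hypothetical helper and the buffer handling is an assumption of this example.

```c
#include <stdio.h>

/* Writes digits followed by U+0304 COMBINING MACRON (UTF-8 bytes CC 84)
   into out. Returns the number of bytes written, excluding the NUL,
   or -1 if the buffer is too small. */
int append_macron(const char *digits, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%s\xCC\x84", digits);
    return (n >= 0 && (size_t)n < out_size) ? n : -1;
}
```

Whether the bar is drawn neatly over the digit depends on the font and the text renderer, since the combining character is positioned at display time.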

NSString stringWithCharacters Unicode Problem

This has got to be simple -- surely this method is supposed to work -- but I'm having some kind of two-byte-to-one-byte problem, I think.
The purpose of the code is to generate a string of 0 characters of a certain length (10 minus the number of digits that will be tacked onto the end). It looks like this:
const unichar zero = 0x0030;
NSString *zeroBuffer = [NSString stringWithCharacters:&zero length:(10 - [[NSString stringWithFormat:@"%i", photoID] length])];
Alternate second line (casting the thing at address &zero):
NSString *zeroBuffer = [NSString stringWithCharacters:(unichar *)&zero length:(10 - [[NSString stringWithFormat:@"%i", photoID] length])];
0x0030 is the code point of the numeral 0 in the Basic Latin block of the Unicode table.
If photoID is 123, I'd want zeroBuffer to be @"0000000". What it actually ends up as is a zero followed by some crazy Unicode characters along the lines of (not sure how this will show) this:
0䪨 燱ܾ뿿﹔
I'm assuming that I've got data crossing character boundaries or something. I've temporarily rewritten it as a dumb substring thing, but this seems like it would be more efficient.
What am I doing wrong?

stringWithCharacters:length: expects the first argument to be the address of a buffer containing each of the characters to be inserted in the string in sequence. It's reading your character zero for the first character, then advancing to the following memory address and reading whatever data is there for the next character, and so on. This is not the right method for doing what you're trying to do.
Alas, there isn't a built-in repeat-this-string method. See the answers here for suggestions.
Alternatively, you can avoid the issue completely and just do this:
[NSString stringWithFormat:@"%010i", photoID];
That causes the number formatter to output a decimal number zero-padded to a width of ten digits.
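The %010i conversion behaves the same way in C's printf family, so the padding can be tried outside Cocoa. A minimal sketch, assuming photoID fits in an int (format_photo_id is a hypothetical wrapper):

```c
#include <stdio.h>

/* Formats photo_id as a decimal number zero-padded to ten digits.
   out needs room for at least 11 bytes (ten digits plus the NUL).
   Returns the result of snprintf. */
int format_photo_id(int photo_id, char *out, size_t out_size)
{
    return snprintf(out, out_size, "%010d", photo_id);
}
```

The 0 flag requests zero padding and 10 is the minimum field width, so no separate buffer of zero characters is needed.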

How to convert German characters into a UTF string when parsing a PDF on iPhone?

I have implemented PDF parsing that fetches all the text from the PDF, but it displays junk characters, so I want to convert the text into a UTF string. How is this possible? Please help me with this question.

First, you need to find out which encoding is currently used for the text. I guess it's ISO-8859-1, aka Latin-1, or its variant ISO-8859-15, aka Latin-9.
As soon as you know that, it's a piece of cake. You haven't said in which container you got the text, e.g. whether it's stored in a C string or NSData.
Let's assume you got a C string. In that case you would do:
myString = [[NSString alloc] initWithBytes:myCString
length:strlen(myCString)
encoding:NSISOLatin1StringEncoding];
If you got an NSData, you would use the initWithData:encoding: initializer instead. That's all you need to do because, according to Apple's documentation, "A string object presents itself as an array of Unicode characters". If you need a UTF-8-encoded C string, you can then query it via:
myUTF8CString = [myString UTF8String];
There's also dataUsingEncoding: to get an NSData object instead of a C string.
Have a look at the NSString class reference and the NSStringEncoding constants.
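To make the conversion less of a black box: ISO-8859-1 maps each byte directly to the Unicode code point with the same value, so bytes below 0x80 stay as they are and every other byte becomes two UTF-8 bytes. The sketch below shows that mapping in plain C; it only illustrates what the Latin-1 to UTF-8 round trip amounts to, not how NSString implements it, and it does not cover ISO-8859-15, which differs from Latin-1 in a few positions. latin1_to_utf8 is a hypothetical helper.

```c
#include <stddef.h>

/* Converts NUL-terminated ISO-8859-1 text to NUL-terminated UTF-8.
   Returns the number of bytes written, excluding the NUL, or (size_t)-1
   if out is too small. */
size_t latin1_to_utf8(const char *in, char *out, size_t out_size)
{
    size_t n = 0;
    if (out_size == 0) {
        return (size_t)-1;
    }
    for (; *in != '\0'; ++in) {
        unsigned char c = (unsigned char)*in;
        size_t need = (c < 0x80) ? 1 : 2;
        if (n + need >= out_size) {   /* keep room for the terminating NUL */
            return (size_t)-1;
        }
        if (c < 0x80) {
            out[n++] = (char)c;                    /* ASCII: unchanged */
        } else {
            out[n++] = (char)(0xC0 | (c >> 6));    /* lead byte */
            out[n++] = (char)(0x80 | (c & 0x3F));  /* continuation byte */
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, the Latin-1 byte FC (ü) becomes the UTF-8 pair C3 BC.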