Non-English Characters in JSON - iphone

I'm using a JSON file that contains non-English characters. When I fetch values from this file, the simulator shows Chinese-looking characters. In the console, I get values like:
\U2021\U00c6\U00e1\U2021\U00c6\U00a9\U2021\U00d8\U00e7\U2021\U00c6\U00b1\U2021\U00d8
\U00e0\U2021\U00c6\U00d8\U2021\U00c6\U00d6\U2021\U00c6\U2264\U2021\U00c6\U2122\U2021
\U00d8\U00e7\U2021\U00c6\U2122\U2021\U00c6\U00b1\U2021\U00d8\U00e0\U2021\U00c6\U00ef
\U2021\U00d8\U00e7 \U2021\U00c6\U00ef\U2021\U00d8\U00c7\U2021\U00c6\U00fc...
Any idea?

Try printing it this way:
NSString *currentString = [[[NSString alloc] initWithData:characterBuffer encoding:NSUTF8StringEncoding] autorelease];
NSLog(@"Converted string: %@", currentString);
Here characterBuffer is the buffer in which you collected the received data. Replace NSUTF8StringEncoding with whatever encoding your server actually uses.
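The escapes in the question are consistent with UTF-8 bytes that were decoded as Mac OS Roman: in that encoding the bytes E0 AE 87 map to U+2021, U+00C6 and U+00E1, which is the first triple in the console output. The following is a minimal sketch of that effect, assuming this diagnosis is right (the asker never confirmed it) and assuming ARC; the function names are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decode bytes with the wrong encoding (Mac OS Roman)
// to reproduce the kind of garbage shown in the question.
static NSString *MisdecodeAsMacRoman(NSData *utf8Data) {
    return [[NSString alloc] initWithData:utf8Data
                                 encoding:NSMacOSRomanStringEncoding];
}

// Hypothetical helper: decode the same bytes as UTF-8, the encoding they were written in.
static NSString *DecodeAsUTF8(NSData *utf8Data) {
    return [[NSString alloc] initWithData:utf8Data
                                 encoding:NSUTF8StringEncoding];
}

static void DemoMojibake(void) {
    // E0 AE 87 is the UTF-8 encoding of the single code point U+0B87.
    const unsigned char bytes[] = {0xE0, 0xAE, 0x87};
    NSData *data = [NSData dataWithBytes:bytes length:sizeof(bytes)];

    NSString *wrong = MisdecodeAsMacRoman(data);  // three unrelated characters
    NSString *right = DecodeAsUTF8(data);         // one character

    NSLog(@"wrong length: %lu, right length: %lu",
          (unsigned long)wrong.length, (unsigned long)right.length);
}
```

If the misdecoded string matches what you see in the console, the fix is on the decoding side: pass the encoding the file was really saved in.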

Related

Latin characters display ? objective-c

I am making a call to get a JSON response like this:
NSData *urlData=[NSURLConnection sendSynchronousRequest:serviceRequest returningResponse:&httpResponse error:nil ];
NSString *returnString=[[NSString alloc]initWithData:urlData encoding:NSUTF8StringEncoding];
However, when I print the string using NSLog:
Emiratos �rabes Unidos
When I convert it to NSData like this:
NSData *jsonData = [returnString dataUsingEncoding:NSUTF8StringEncoding];
NSArray * response = [NSJSONSerialization JSONObjectWithData:jsonData options:0 error:nil];
It turns it to be (when I retrieve the value from the array):
Emiratos \Ufffdrabes Unidos
And when I put it in a label it displays it like this:
Emiratos �rabes Unidos
I would like to display in a label like this:
Emiratos Árabes Unidos
How can I do it?
The problem seems to be this line:
NSString *returnString =
[[NSString alloc] initWithData:urlData
encoding:NSUTF8StringEncoding];
You are assuming that the data is a string encoded as UTF8. But apparently it isn't. Therefore you're seeing the "replacement character" (codepoint U+FFFD) at this point.
You'll need to find out what encoding is actually being used. You can probably just experiment with other encodings. Alternatively, use NSLog to look at the data; an NSData object is logged as a sequence of hex bytes, so by looking at the bytes in that position, and by looking up various encodings on the Internet, you may be able to deduce what encoding is being used here.
(But if you use NSLog and the bytes themselves already contain the replacement character at this point, for example EF BF BD in UTF-8, then you've had it; the server itself is supplying the bad data and there's nothing you can do about it, as the good data is lost before you can get at it.)
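The "experiment with other encodings" step can be sketched as follows. This is only an illustration, assuming ARC; the helper name CandidateDecodings and the choice of candidate encodings are assumptions, not something taken from the question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decode the same bytes with a few candidate encodings
// so the results can be compared by eye. Encodings that reject the bytes are skipped.
static NSDictionary<NSString *, NSString *> *CandidateDecodings(NSData *data) {
    NSDictionary<NSString *, NSNumber *> *candidates = @{
        @"UTF-8":        @(NSUTF8StringEncoding),
        @"ISO Latin 1":  @(NSISOLatin1StringEncoding),
        @"Windows-1252": @(NSWindowsCP1252StringEncoding),
    };
    NSMutableDictionary<NSString *, NSString *> *results = [NSMutableDictionary dictionary];
    for (NSString *name in candidates) {
        NSStringEncoding encoding = candidates[name].unsignedIntegerValue;
        NSString *decoded = [[NSString alloc] initWithData:data encoding:encoding];
        if (decoded != nil) {
            results[name] = decoded;  // this encoding accepted the bytes
        }
    }
    return results;
}

static void DemoCandidates(void) {
    // "Árabes" as a server might send it in ISO Latin 1: Á is the single byte C1.
    const unsigned char bytes[] = {0xC1, 'r', 'a', 'b', 'e', 's'};
    NSData *data = [NSData dataWithBytes:bytes length:sizeof(bytes)];
    NSDictionary<NSString *, NSString *> *results = CandidateDecodings(data);
    for (NSString *name in results) {
        NSLog(@"%@: %@", name, results[name]);
    }
}
```

A lone C1 byte is not valid UTF-8, so the UTF-8 entry is missing from the result, while the single-byte encodings decode it to "Á". An encoding that accepts the bytes is not necessarily the right one; the decoded text still has to look correct.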

How to encode some text correctly

I have some text formatting issues that I need to solve. I have some strange characters displaying from the NSString below
the original string:
NSString *descriptionStringPreFormatted = [item objectForKey:@"title"];
the formatted string:
NSString *descriptionLabelStringUTF8 = [descriptionStringPreFormatted stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
NSLog(@"descriptionStringPreFormatted is %@", descriptionStringPreFormatted);
NSLog(@"descriptionLabelStringUTF8 is %@", descriptionLabelStringUTF8);
here's the output which is the same whether I use the UTF8 encoding or not.
the output:
2013-01-05 16:44:51.807 descriptionStringPreFormatted is £144.99...
2013-01-05 16:44:51.810 descriptionLabelStringUTF8 is £144.99...
I think you are receiving the dictionary "item" from a web service. If so, try decoding the raw response from the web service with NSUTF8StringEncoding:
NSString *str=[[NSString alloc] initWithData:responseData encoding:NSUTF8StringEncoding];
Here "responseData" is the raw data coming from the web service.

NSString stringWithContentsOfFile:fileName non-english letters

I have a file with many lines separated by "\n". One of the lines is:
Christian Grundekjøn
I can't read the file unless I delete the line. I use the following code to read line by line:
for (NSString *line in [[NSString stringWithContentsOfFile:fileName encoding:NSUTF8StringEncoding error:NULL] componentsSeparatedByString:@"\n"])
If I don't delete the line, the code wouldn't even go into the for loop at all. Nothing was read. How to handle the non-English letters?
If you are generating the text file from within iOS then you need to make sure you are encoding it with NSUTF8StringEncoding. But given the problem you are reporting, I suspect that you may be pulling in data from another source and that source hasn't encoded the text as UTF8. If this is the case, you may be able to fix the problem outside your app by converting the source file to UTF8.
If you don't know what encoding is used, e.g. because the user has supplied the file, iOS can try to guess it for you. A pattern that I have used successfully is to first try to get the string using UTF8 encoding, using the same approach you use now, and to fall back to usedEncoding only if that fails. A method that takes a file path and returns the string could look something like the following:
- (NSString *)stringFromFile:(NSString *)filePath
{
    NSError *error = nil;
    // First attempt: assume the file is UTF8.
    NSString *stringFromFile = [NSString stringWithContentsOfFile:filePath
                                                         encoding:NSUTF8StringEncoding
                                                            error:&error];
    if (stringFromFile) return stringFromFile; // success
    NSLog(@"String is not UTF8 encoded. Error: %@", [error localizedDescription]);
    // Second attempt: let Foundation guess the encoding and report it back.
    NSStringEncoding encoding = 0;
    NSError *usedEncodingError = nil;
    stringFromFile = [NSString stringWithContentsOfFile:filePath
                                           usedEncoding:&encoding
                                                  error:&usedEncodingError];
    if (stringFromFile)
    {
        NSLog(@"Retrieved string using an alternative encoding. Encoding was: %lu", (unsigned long)encoding);
        return stringFromFile;
    }
    // either handle error or attempt further explicit unencodings here
    return nil;
}
In many cases, usedEncoding works very well. But there are edge cases where trying to figure out an encoding can be very tricky. It all depends on the source file.
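To see why the loop in the question never runs, here is a small sketch, assuming ARC and a writable temporary directory; the file name and the function DemoLatin1File are invented for the example. It writes the problem line as ISO Latin 1 and then reads it back with two different encodings.

```objc
#import <Foundation/Foundation.h>

// Hypothetical demo: returns YES if the UTF8 read fails and the Latin 1 read succeeds.
static BOOL DemoLatin1File(void) {
    NSString *path = [NSTemporaryDirectory() stringByAppendingPathComponent:@"latin1-demo.txt"];
    NSString *original = @"Christian Grundekj\u00f8n\nSecond line";

    // Save the text as ISO Latin 1, where the letter after "j" is the single byte F8.
    if (![original writeToFile:path
                    atomically:YES
                      encoding:NSISOLatin1StringEncoding
                         error:NULL]) {
        return NO;
    }

    // Reading as UTF8 fails: F8 followed by "n" is not a valid UTF8 sequence.
    NSError *utf8Error = nil;
    NSString *asUTF8 = [NSString stringWithContentsOfFile:path
                                                 encoding:NSUTF8StringEncoding
                                                    error:&utf8Error];
    // asUTF8 is nil here. A message sent to nil returns nil, so
    // [asUTF8 componentsSeparatedByString:@"\n"] is nil and a for-in loop over it has no iterations.

    // Reading with the encoding the file was written in works.
    NSString *asLatin1 = [NSString stringWithContentsOfFile:path
                                                   encoding:NSISOLatin1StringEncoding
                                                      error:NULL];
    NSArray<NSString *> *lines = [asLatin1 componentsSeparatedByString:@"\n"];

    return asUTF8 == nil && lines.count == 2 && [asLatin1 isEqualToString:original];
}
```

Passing an NSError pointer instead of NULL, as the method above does, is what turns the silent "nothing was read" into a message that names the encoding problem.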
I had a problem with Japanese characters. My solution was, when saving the file to the documents directory:
NSString *fileData = [NSString stringWithFormat:@"%@", noteContent];
BOOL isWriteToFile = [fileData writeToFile:notePath atomically:YES encoding:NSUTF8StringEncoding error:nil];
When reading file content
[[NSString alloc] initWithContentsOfFile:fullNotePath usedEncoding:nil error:nil];
In the file, store your data in unicode format or you can also store special character in unicode format.

How to find why NSMutableData is invalid

I access a RESTful URL and get back results. The results are in JSON. I turn the response into a string via:
- (void)connectionDidFinishLoading:(NSURLConnection *)connection {
NSString *json = [[NSString alloc] initWithBytes:[self.receivedData mutableBytes] length:[self.receivedData length] encoding:NSUTF8StringEncoding];
The json variable has a value of 0x0. When I mouse over it, I see <Invalid CFStringRef>. How can I debug this to tell why it is invalid? I ran the JSON returned in the browser through a JSON parser, and it checks out fine.
Results are given back by entering an ID in the URL. Other IDs return results without issue. The result set is fairly large.
First I would use initWithData:encoding: to set up the NSString. Small difference, but that method is there for a reason.
Then, I would do a hexdump of self.receivedData to see what is actually in there. If that data is not properly UTF8 encoded then the initWithData:encoding: will fail.
(Google for NSData hex dump to find other people's utility functions to do this)
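A hex dump helper does not need much code. The following is a minimal sketch rather than any particular published utility; the name HexDump is an assumption.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: render NSData as lowercase hex bytes separated by spaces.
static NSString *HexDump(NSData *data) {
    const unsigned char *bytes = data.bytes;
    NSMutableString *hex = [NSMutableString stringWithCapacity:data.length * 3];
    for (NSUInteger i = 0; i < data.length; i++) {
        if (i > 0) {
            [hex appendString:@" "];
        }
        [hex appendFormat:@"%02x", bytes[i]];
    }
    return hex;
}
```

Calling NSLog(@"%@", HexDump(self.receivedData)) then lets you look for byte sequences that are not valid UTF8 around the position where decoding fails.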
I have found that sometimes web services are sloppy with their encoding. So I usually implement a fallback like this:
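// Try the expected encoding first.
NSString* html = [[NSString alloc] initWithData: data encoding: NSUTF8StringEncoding];
if (html == nil) {
    // ISO Latin 1 assigns a character to every byte value, so this decode is not expected to fail.
    html = [[NSString alloc] initWithData: data encoding: NSISOLatin1StringEncoding];
    if (html == nil) {
        // Last resort.
        html = [[NSString alloc] initWithData: data encoding: NSMacOSRomanStringEncoding];
    }
}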
It is kind of sad that this is required but many web services are not written or configured properly.
Use NSLog to look at the bytes.

NSArray from URL encoding problem

So I have the following code:
NSURL *baseURL = [NSURL URLWithString:@"http://www.baseurltoanxmlpage.com"];
NSURL *url = [NSURL URLWithString: @"page.php" relativeToURL:baseURL];
NSArray *array = [NSArray arrayWithContentsOfURL:url];
If the XML page is as follows:
<array><dict><key>City</key><string>Montreal</string></dict></array>
The array returns fine. However, if the XML file is as follows:
<array><dict><key>City</key><string>Montréal</string></dict></array>
The array returns null. I guess this has something to do with the special char "é".
How would I deal with these characters? The XML page is generated with PHP. utf8_encode() function makes the array return but then I don't know how to deal with the encoded "é" character.
Here's the working solution:
NSString *stringArray = [NSString stringWithContentsOfURL:url encoding:NSUTF8StringEncoding error:nil];
NSArray *array = [stringArray propertyList];
NSLog(@"%@", stringArray);
NSLog(@"%@", array);
NSLog(@"%@", [[[array objectAtIndex:0] valueForKey:@"City"] stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding]);
The first log prints out the "é" fine.
In the second log, it's encoded and is printed as "\U00e9".
In the 3rd log, it's decoded and printed as "é" (which is what I was looking for).
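As you noted, you need to return a UTF8- or UTF16-encoded XML document, and then create the NSString with the matching encoding. The stringByReplacingPercentEscapesUsingEncoding: method only matters if the text contains URL percent escapes (such as %C3%A9); it does not change a string that already holds "é".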
NSString has a class method that reads the contents of a URL using a given encoding:
+(id)stringWithContentsOfURL:encoding:error:
Look under the String Encodings section of the documentation to find a suitable encoding.
Looking at the NSString documentation for propertyList, we see:
Parses the receiver as a text representation of a property list, returning an NSString, NSData, NSArray, or NSDictionary object, according to the topmost element.
A property list is an Apple-specific document that stores representations of NSString, NSDictionary, NSArray, and other core types in XML format. These .plist files are usually used for storing preferences or application settings.
This XML-formatted property list document is encoded in UTF-8 by default. Parsing it does not change the "é": the "\U00e9" in the second log is only how NSLog escapes non-ASCII characters when it prints the description of an array or dictionary. Logging the string itself, as in the third log, prints "é".
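The same parse can be done from raw data with NSPropertyListSerialization, which avoids the intermediate string. This is a sketch under a few assumptions: ARC, a complete plist document (XML declaration and plist wrapper included, unlike the bare array in the question), and a made-up function name CityFromPlistData.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: parse an XML property list and return the "City" value
// of the first dictionary in the top-level array, or nil if the shape is different.
static NSString *CityFromPlistData(NSData *plistData) {
    NSError *error = nil;
    id plist = [NSPropertyListSerialization propertyListWithData:plistData
                                                         options:NSPropertyListImmutable
                                                          format:NULL
                                                           error:&error];
    if (![plist isKindOfClass:[NSArray class]]) {
        NSLog(@"Top-level object is not an array: %@", error);
        return nil;
    }
    NSArray *array = (NSArray *)plist;
    id first = array.firstObject;
    if (![first isKindOfClass:[NSDictionary class]]) {
        return nil;
    }
    return ((NSDictionary *)first)[@"City"];
}

static void DemoPlist(void) {
    NSString *xml = @"<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                    @"<plist version=\"1.0\"><array><dict><key>City</key>"
                    @"<string>Montr\u00e9al</string></dict></array></plist>";
    NSString *city = CityFromPlistData([xml dataUsingEncoding:NSUTF8StringEncoding]);
    NSLog(@"%@", city);  // logs the string itself, not the array description
}
```

The returned string holds the real "é" character, so it can be assigned to a label directly without any percent-escape step.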