I have a file with many lines separated by "\n". One of the lines is:
Christian Grundekjøn
I can't read the file unless I delete the line. I use the following code to read line by line:
for (NSString *line in [[NSString stringWithContentsOfFile:fileName encoding:NSUTF8StringEncoding error:NULL] componentsSeparatedByString:@"\n"])
If I don't delete the line, the code never enters the for loop at all; nothing is read. How can I handle the non-English letters?
If you are generating the text file from within iOS, then you need to make sure you are encoding it with NSUTF8StringEncoding. But given the problem you are reporting, I suspect that you may be pulling in data from another source and that source hasn't encoded the text as UTF8. If this is the case, you may be able to fix the problem outside your app by converting the source file to UTF8.
If you don't know what encoding is used, e.g. because the user has supplied the file, iOS can try to guess it for you. A pattern that I have used successfully is to first try to get the string using UTF8 encoding, for example using the same approach you use, and to fall back to guessing only if that fails. Assuming you write a method that takes a file path and returns the string, it could look something like the following:
- (NSString *)stringFromFile:(NSString *)filePath
{
    // First attempt: assume the file is UTF8 encoded.
    NSError *error = nil;
    NSString *stringFromFile = [NSString stringWithContentsOfFile:filePath
                                                         encoding:NSUTF8StringEncoding
                                                            error:&error];
    if (stringFromFile) return stringFromFile; // success
    NSLog(@"String is not UTF8 encoded. Error: %@", [error localizedDescription]);
    // Second attempt: let the system guess the encoding.
    NSStringEncoding encoding = 0;
    NSError *usedEncodingError = nil;
    stringFromFile = [NSString stringWithContentsOfFile:filePath
                                           usedEncoding:&encoding
                                                  error:&usedEncodingError];
    if (stringFromFile)
    {
        NSLog(@"Retrieved string using an alternative encoding. Encoding was: %lu", (unsigned long)encoding);
        return stringFromFile;
    }
    // either handle error or attempt further explicit unencodings here
    return nil;
}
In many cases, usedEncoding works very well. But there are edge cases where trying to figure out an encoding can be very tricky. It all depends on the source file.
I had a problem with Japanese characters. My solution was to specify UTF8 when saving the file to the documents directory:
NSString *fileData = [NSString stringWithFormat:@"%@", noteContent];
BOOL isWriteToFile = [fileData writeToFile:notePath atomically:YES encoding:NSUTF8StringEncoding error:nil];
When reading the file content:
[[NSString alloc] initWithContentsOfFile:fullNotePath usedEncoding:nil error:nil];
Store the data in the file in a Unicode encoding; special characters can be stored that way too.
Related
I am making a call to get a JSON response like this:
NSData *urlData=[NSURLConnection sendSynchronousRequest:serviceRequest returningResponse:&httpResponse error:nil ];
NSString *returnString=[[NSString alloc]initWithData:urlData encoding:NSUTF8StringEncoding];
However, when I print the string using NSLog:
Emiratos �rabes Unidos
When I convert it to NSData like this:
NSData *jsonData = [returnString dataUsingEncoding:NSUTF8StringEncoding];
NSArray * response = [NSJSONSerialization JSONObjectWithData:jsonData options:0 error:nil];
It turns out to be (when I retrieve the value from the array):
Emiratos \Ufffdrabes Unidos
And when I put it in a label it displays it like this:
Emiratos �rabes Unidos
I would like to display in a label like this:
Emiratos Árabes Unidos
How can I do it?
The problem seems to be this line:
NSString *returnString =
[[NSString alloc] initWithData:urlData
encoding:NSUTF8StringEncoding];
You are assuming that the data is a string encoded as UTF8. But apparently it isn't. Therefore you're seeing the "replacement character" (codepoint U+FFFD) at this point.
You'll need to find out what encoding is actually being used. You can probably just experiment with other encodings. Alternatively, use NSLog to look at the data; an NSData object is logged as a sequence of hex bytes, so by looking at the bytes in that position, and by looking up various encodings on the Internet, you may be able to deduce what encoding is being used here.
(But if you use NSLog and you actually see FFFD at this point, then you've had it; the server itself is supplying the bad data and there's nothing you can do about it, as the good data is lost before you can get at it.)
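The "experiment with other encodings" step can be sketched roughly as follows. This is only an illustration, not the answerer's code: the helper name DecodeTryingEncodings and the list of candidate encodings are assumptions, and the sample bytes are a hand-built stand-in for a response in which "Á" arrives as the single byte 0xC1. It assumes ARC.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the first successful decoding of `data`,
// trying each candidate encoding in order. The candidate list is an assumption.
static NSString *DecodeTryingEncodings(NSData *data, NSStringEncoding *usedEncoding) {
    const NSStringEncoding candidates[] = {
        NSUTF8StringEncoding,           // strict: yields nil on invalid byte sequences
        NSWindowsCP1252StringEncoding,  // permissive single-byte encodings go last
        NSISOLatin1StringEncoding,
    };
    for (size_t i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
        NSString *decoded = [[NSString alloc] initWithData:data encoding:candidates[i]];
        if (decoded) {
            if (usedEncoding) *usedEncoding = candidates[i];
            return decoded;
        }
    }
    return nil;
}

int main(void) {
    @autoreleasepool {
        // Stand-in for urlData: "Emiratos Árabes Unidos" with Á as the byte 0xC1.
        const char bytes[] = "Emiratos \xC1rabes Unidos";
        NSData *data = [NSData dataWithBytes:bytes length:sizeof(bytes) - 1];
        NSLog(@"Raw bytes: %@", data); // hex dump, useful for spotting the odd byte

        NSStringEncoding used = 0;
        NSString *decoded = DecodeTryingEncodings(data, &used);
        NSLog(@"Decoded: %@ (NSStringEncoding %lu)", decoded, (unsigned long)used);
    }
    return 0;
}
```

A single-byte encoding accepts almost any input, so a non-nil result from the fallback only shows that the bytes could be decoded, not that the guess is right. Checking the result against a string you know should appear in the response is still needed.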
I am converting NSData to NSString which I got as response of a url using the following method.
NSString *result = [[NSString alloc] initWithData:_Data encoding:NSUTF8StringEncoding];
It works fine and I have been using it for a long time, but today I ran into an issue while loading data (paging): for one page, result is a null string.
So I searched SO and found a method in this link: NSData to NSString converstion problem!
[NSString stringWithCString:[theData bytes] length:[theData length]];
and this works fine.
My queries,
The method was deprecated in iOS 2.0. If I use this will I be facing any issue in future?
I think this is the text that made the method fail. What is this, and is there any way that I can encode this using NSUTF8StringEncoding?
What is the alternative encoding that I can use for encoding all types of characters like those in the above pic?
In order to obtain the type of the content which is sent by the server, you need to inspect the Content-Type header of the response.
The content type's value specifies a "MIME type", e.g.:
Content-Type: text/plain
A Content-Type's value may additionally specify a character encoding, e.g.:
Content-Type: text/plain; charset=utf-8
Each MIME type should define a "default" charset, which is to be used when there is no charset parameter specified.
For text/* media types the default charset is US-ASCII.
(see RFC 6657, §3).
The following code snippet demonstrates how to safely decode the body of a response into a string:
- (NSString*) bodyString {
    // Default to ASCII when the response declares no charset.
    CFStringEncoding cfEncoding = kCFStringEncodingASCII;
    NSString* textEncodingName = self.response.textEncodingName;
    if (textEncodingName) {
        // Map the IANA charset name (e.g. "utf-8") to a CFStringEncoding.
        cfEncoding = CFStringConvertIANACharSetNameToEncoding( (__bridge CFStringRef)(textEncodingName) );
    }
    if (cfEncoding != kCFStringEncodingInvalidId) {
        NSStringEncoding encoding = CFStringConvertEncodingToNSStringEncoding(cfEncoding);
        return [[NSString alloc] initWithData:self.body encoding:encoding];
    }
    else {
        // Unknown charset name: fall back to the data's description.
        return [self.body description];
    }
}
Note:
body is a property returning an NSData object representing the response data.
response is a property returning the NSHTTPURLResponse object.
If
NSString *result = [[NSString alloc] initWithData:_Data encoding:NSUTF8StringEncoding];
returns nil then _Data does not contain a valid string in UTF-8 encoding.
You said that
[NSString stringWithCString:[theData bytes] length:[theData length]];
works fine in your case. This method
interprets the data bytes in the "default C string encoding", but it is unspecified which
encoding that is (and therefore this method is deprecated and should not be used).
I think the default C string encoding is still "Mac Roman". In that case
NSString *result = [[NSString alloc] initWithData:_Data encoding:NSMacOSRomanStringEncoding];
would be the correct solution. But in any case, you should find out which encoding
the web service uses for the response, and specify that in the initWithData:encoding:
method.
Try this
unichar ellipsis = 0x2026; // assumed: the horizontal ellipsis character (U+2026)
NSString *theString = [NSString stringWithFormat:@"To be continued%C", ellipsis];
// Lossy conversion substitutes characters that ASCII cannot represent.
NSData *asciiData = [theString dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES];
NSString *asciiString = [[NSString alloc] initWithData:asciiData encoding:NSASCIIStringEncoding];
NSLog(@"Original: %@ (length %lu)", theString, (unsigned long)[theString length]);
NSLog(@"Converted: %@ (length %lu)", asciiString, (unsigned long)[asciiString length]);
It is due to an incorrect string encoding.
You can try:
save the NSData to the disk with dataPath
use the NSString class method to create the string:
+ (id)stringWithContentsOfURL:(NSURL *)url usedEncoding:(NSStringEncoding *)enc error:(NSError **)error
Notice here:
enc
Upon return, if url is read successfully, contains the encoding used to interpret the data.
So if the method succeeds, you get the correct string, and all the work is done by iOS.
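The two steps above might look roughly like this. It is a sketch under assumptions: the helper name StringByGuessingEncoding and the temporary file name are made up for illustration, the UTF-16 sample data is only a stand-in for whatever NSData you actually received, and ARC is assumed. Whether the guess succeeds depends on the data; text that starts with a byte-order mark is the easy case.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: writes `data` to a temporary file, then lets Foundation
// guess the encoding while reading it back. Returns nil if either step fails.
static NSString *StringByGuessingEncoding(NSData *data,
                                          NSStringEncoding *usedEncoding,
                                          NSError **error) {
    // The file name is an arbitrary choice for this sketch.
    NSString *dataPath = [NSTemporaryDirectory() stringByAppendingPathComponent:@"downloaded-text.tmp"];
    NSURL *url = [NSURL fileURLWithPath:dataPath];
    if (![data writeToURL:url options:NSDataWritingAtomic error:error]) {
        return nil;
    }
    NSString *text = [NSString stringWithContentsOfURL:url
                                          usedEncoding:usedEncoding
                                                 error:error];
    [[NSFileManager defaultManager] removeItemAtURL:url error:NULL]; // clean up
    return text;
}

int main(void) {
    @autoreleasepool {
        // Stand-in for the received data: UTF-16 text, which carries a byte-order mark.
        NSData *data = [@"Christian Grundekj\u00f8n" dataUsingEncoding:NSUTF16StringEncoding];

        NSStringEncoding used = 0;
        NSError *error = nil;
        NSString *text = StringByGuessingEncoding(data, &used, &error);
        if (text) {
            NSLog(@"Read: %@ (NSStringEncoding %lu)", text, (unsigned long)used);
        } else {
            NSLog(@"Could not determine the encoding: %@", [error localizedDescription]);
        }
    }
    return 0;
}
```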
I'm using a JSON file which contains non-English characters. When I fetch values from this file, the simulator shows characters that look like Chinese. In the console, I'm getting values like
\U2021\U00c6\U00e1\U2021\U00c6\U00a9\U2021\U00d8\U00e7\U2021\U00c6\U00b1\U2021\U00d8
\U00e0\U2021\U00c6\U00d8\U2021\U00c6\U00d6\U2021\U00c6\U2264\U2021\U00c6\U2122\U2021
\U00d8\U00e7\U2021\U00c6\U2122\U2021\U00c6\U00b1\U2021\U00d8\U00e0\U2021\U00c6\U00ef
\U2021\U00d8\U00e7 \U2021\U00c6\U00ef\U2021\U00d8\U00c7\U2021\U00c6\U00fc...
Any idea?
Try printing it this way:
NSString *currentString = [[[NSString alloc] initWithData:characterBuffer encoding:NSUTF8StringEncoding] autorelease];
NSLog(@"Converted string: %@", currentString);
where characterBuffer is the buffer in which you've collected the received data. Replace NSUTF8StringEncoding with the encoding your server actually uses.
I am attempting to create a .xml file and set the contents of the file to equal a predetermined string.
I have built the XML and am currently storing it in an NSString.
I want to put the contents of this string into a file with the extension of .xml and send an email with the file as an attachment.
I am able to email PDFs and assumed creating the file with an extension of .xml would be the easy bit, but alas I cannot do it.
If anyone could offer a helping hand I would be much appreciative.
Try something like this:
NSString *path = ...;
NSString *string = ...;
NSError *error;
BOOL ok = [string writeToFile:path atomically:YES encoding:NSUnicodeStringEncoding error:&error];
if (!ok) {
    // an error occurred
    NSLog(@"Error writing file at %@\n%@", path, [error localizedFailureReason]);
    // implementation continues ..
}
Writing to Files and URLs
So I have the following code:
NSURL *baseURL = [NSURL URLWithString:@"http://www.baseurltoanxmlpage.com"];
NSURL *url = [NSURL URLWithString:@"page.php" relativeToURL:baseURL];
NSArray *array = [NSArray arrayWithContentsOfURL:url];
If the XML page is as follows:
<array><dict><key>City</key><string>Montreal</string></dict></array>
The array returns fine. However, if the XML file is as follows:
<array><dict><key>City</key><string>Montréal</string></dict></array>
The array returns null. I guess this has something to do with the special char "é".
How would I deal with these characters? The XML page is generated with PHP. utf8_encode() function makes the array return but then I don't know how to deal with the encoded "é" character.
Here's the working solution:
NSString *stringArray = [NSString stringWithContentsOfURL:url encoding:NSUTF8StringEncoding error:nil];
NSArray *array = [stringArray propertyList];
// Always log through a format string; never pass data as the format itself.
NSLog(@"%@", stringArray);
NSLog(@"%@", array);
NSLog(@"%@", [[[array objectAtIndex:0] valueForKey:@"City"] stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding]);
The first log prints out the "é" fine.
In the second log, it's encoded and is printed as "\U00e9".
In the 3rd log, it's decoded and printed as "é" (which is what I was looking for).
As you noted, you need to return a UTF8- or UTF16-encoded XML document. Then make NSString objects using the stringByReplacingPercentEscapesUsingEncoding: method, using the relevant encoding.
NSString has a method that reads data from a URL and decodes it using the encoding you specify:
+(id)stringWithContentsOfURL:encoding:error:
Look under the String Encodings section for the possible encodings and pick the one that suits your data.
Looking at the NSString documentation for propertyList, we see:
Parses the receiver as a text representation of a property list, returning an NSString, NSData, NSArray, or NSDictionary object, according to the topmost element.
A property list is an Apple-specific document that stores representations of NSString, NSDictionary, NSArray, and other core types in XML format. These .plist files are usually used for storing preferences or application settings.
This XML-formatted property list document is encoded in UTF-8, by default. When you turn your NSString into a property list and log the resulting array, the "é" character is printed as "\U00e9", an escape sequence for the Unicode code point U+00E9. The string stored in the array still contains the "é" itself.
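One way to check whether the string itself was changed, or only how it is printed, is to log the collection and the element separately. The snippet below is a minimal sketch with a hand-built array standing in for the parsed property list; it is not the asker's data. Logging a collection typically shows non-ASCII characters in escaped form, while logging the string element directly shows the character.

```objective-c
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Hand-built stand-in for the array returned by propertyList.
        NSArray *array = @[ @{ @"City" : @"Montr\u00e9al" } ];

        // Logging the collection prints its description, which may escape non-ASCII characters.
        NSLog(@"%@", array);

        // Logging the element prints the string itself.
        NSString *city = array[0][@"City"];
        NSLog(@"%@", city);

        // The stored string holds one character for the accented e, not an escape sequence.
        NSLog(@"length = %lu", (unsigned long)[city length]); // 8
    }
    return 0;
}
```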