TouchJSON - Why do I get lots of \U**** in the results of parsing? - iphone

I am using TouchJSON to parse JSON data.
In the results, the strings in the array all appear in \U**** format.
Yes, they are meant to be in languages other than English.
Why can't TouchJSON just replace them with the real characters via UTF-8?
How should I handle the results if I want to store them as NSStrings and show them in a UILabel?
Thanks

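The leading \U is not part of the string's content; it marks an escape sequence that stands for a single Unicode character. Those escapes typically show up when you NSLog a whole NSArray or NSDictionary, because the collection's description escapes non-ASCII characters. If you pull an individual NSString out of the parsed NSDictionary and log or display it on its own, you should see the real characters.
If you do not, avoid trimming with [theString substringFromIndex: 2]: that removes only the first two characters and leaves the hex digits behind. Literal escapes inside the string more likely mean the JSON was escaped twice before it reached the parser.
Otherwise, take a look at SBJson, an alternative library for Objective-C JSON parsing.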
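To make concrete what such an escape stands for: the six characters \u5988 name the single character U+5988. Below is a minimal sketch in plain C (which also compiles inside an Objective-C file). parse_u_escape is a hypothetical helper, not a TouchJSON or SBJson function, and it assumes a single four-digit escape with no surrogate-pair handling.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of any JSON library): reads one escape
   such as "\u5988" or "\U5988" and stores the code point it names.
   Returns 1 on success, 0 if the text does not start with such an escape.
   Surrogate pairs are not combined here. */
int parse_u_escape(const char *s, unsigned *code_point)
{
    unsigned value = 0;
    int i;

    if (s == NULL || s[0] != '\\' || (s[1] != 'u' && s[1] != 'U'))
        return 0;

    /* Exactly four hex digits must follow the marker. A short string
       stops at its terminating NUL, which is not a hex digit. */
    for (i = 2; i < 6; i++) {
        unsigned char c = (unsigned char)s[i];
        if (!isxdigit(c))
            return 0;
        value = value * 16 + (isdigit(c) ? (unsigned)(c - '0')
                                         : (unsigned)(tolower(c) - 'a' + 10));
    }
    *code_point = value;
    return 1;
}
```

A real parser does this for you, so the sketch is only meant to show that the escape is a notation for one character, not text to be trimmed off.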

Related

NSString to NSData encoding considerations

I understand why when going from NSData to NSString you need to specify encoding.
However I'm finding it frustrating how the reverse (NSString to NSData) needs to have an encoding specified.
In this related question the answers suggested using NSUTF8StringEncoding or defaultCStringEncoding, with the latter not being fully explained.
So I just wanted to ask if the following is correct when converting NSString to NSData:
In cases where you want to be 100% sure the binary representation of the NSString object is UTF-8, use NSUTF8StringEncoding (or whatever encoding is needed).
In cases where the encoding of the NSString object is known/expected to already be of a certain type and no conversion is required, it's safe (and perhaps internally faster) to use defaultCStringEncoding. (From what I have read, Objective-C uses UTF-16 internally; I'm not sure if LE or BE, but I'd assume LE because the platform is LE.)
TIA
The encoding needs to be specified for converting NSString to NSData for the same reason it needs to be specified going from NSData to NSString.
An NSData object is a wrapper for a string of absolutely raw bytes. If the NSString doesn't specify some encoding, it doesn't know what to write, because at the level of ones and zeroes, a UTF-16 encoding looks different from a UTF-8 encoding of the same letter, and of course, if you write UTF-16 as big-endian and read it as little-endian you will get gibberish.
In other words, don't think of it as converting or escaping a string; it's generating a byte buffer, and the encoding tells it which ones and zeroes to write when the next character is "a" and which ones to write when it means "妈".
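As a rough illustration of "which ones and zeroes to write", here is a sketch in plain C that produces the bytes for one character under UTF-8 and under UTF-16 in either byte order. Both function names are hypothetical, and the sketch assumes a non-surrogate code point in the Basic Multilingual Plane (U+0000 to U+FFFF), which covers both "a" (U+0061) and "妈" (U+5988).

```c
#include <stddef.h>

/* Hypothetical helper: writes the UTF-8 bytes for a BMP code point and
   returns how many bytes were written (1 to 3). The caller must pass a
   non-surrogate code point no larger than 0xFFFF. */
size_t encode_utf8_bmp(unsigned cp, unsigned char out[3])
{
    if (cp < 0x80) {                      /* ASCII: one byte, unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* two bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    /* three bytes: 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Hypothetical helper: writes the single 16-bit UTF-16 unit for a BMP
   code point, in big-endian or little-endian byte order. */
void encode_utf16_bmp(unsigned cp, int big_endian, unsigned char out[2])
{
    unsigned char hi = (unsigned char)((cp >> 8) & 0xFF);
    unsigned char lo = (unsigned char)(cp & 0xFF);
    out[0] = big_endian ? hi : lo;
    out[1] = big_endian ? lo : hi;
}
```

For U+5988 this gives the bytes E5 A6 88 in UTF-8, 59 88 in big-endian UTF-16 and 88 59 in little-endian UTF-16: three different buffers for the same character, which is why the reader has to know which encoding the writer used.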
As for your question...here's my two cents.
1) If you are converting an NSString to an NSData so that your same program can convert it back later, and no other software will need to deal with that NSData until after you've read it back into an NSString, then none of this matters. All that matters is that your string-to-data encoding and your data-to-string encoding match.
2) If you are dealing only with ASCII characters, you can probably get away with a lot, just because many kinds of encoding use the same representation for characters under 128. But this breaks easily, even with little things like smart quotes.
3) Despite the name, defaultCStringEncoding is not something you should use as a default. It's designed for special circumstances where you need to deal with system strings and don't otherwise know how the system deals with its internal strings. It refers to the way strings are handled in the default C implementation, NOT in the NSString internals, so there's not necessarily a performance benefit.
4) If you write a string with one string encoding and try to read it back with a different one, your code will fail; in many cases, you will just end up with nil instead of a string.
Bottom line is: who will be trying to interpret your NSData objects? If it's your own app, pick an encoding that makes sense for you (I use UTF8 for everything) and use it for both conversions. Otherwise, figure out what your ecosystem needs to read or write and make that your standard.

How to avoid UTF8 characters inside my NSDictionary?

I'm saving an NSString inside an NSArray, and that NSArray inside an NSDictionary. While doing this, I noticed that if my string is something like Hi I'm XYZ, a Unicode escape gets stored in place of the single quote.
So how can I avoid this, or how can I get my actual text, special characters included, back from the NSArray or the NSDictionary?
Any help is appreciated.
NSString internally uses Unicode characters. So it easily can handle all sorts of characters from different languages.
You cannot choose the internal encoding of NSString. It's always Unicode. If you have an encoding problem, then you have either created the NSString instance incorrectly or you have output the instance the wrong way.
And there's no such thing as a UTF character.
Please better describe your problem and show the relevant source code.

Stig JSON library parse error: How do you accommodate new lines in JSON?

I have some xml that is coming back from a web service. I in turn use xslt to turn that xml into json (I am turning someone else's xml service into a json-based service). My service, which is now outputting JSON, is consumed by my iphone app using the de facto iphone json framework, SBJSON.
The problem is, using the [string JSONValue] method chokes, and I can see that it's due to line breaks. Lo and behold, even the FAQ tells me the problem but I don't know how to fix it.
The parser fails to parse string X
Are you sure it's legal JSON? This framework is really strict, so won't accept stuff that (apparently) several validators accepts. In particular, literal TAB, NEWLINE or CARRIAGE RETURN (and all other control characters) characters in string tokens are disallowed, but can be very difficult to spot. (These characters are allowed between tokens, of course.)
If you get something like the below (the number may vary) then one of your strings has disallowed Unicode control characters in it.
NSLocalizedDescription = "Unescaped control character '0x9'";
I have tried using a line such as: NSString *myString = [myString stringByReplacingOccurrencesOfString:@"\n" withString:@"\\n"];
But that doesn't work. My xml service is not coming back as CDATA. The xml does have a line break in it as far as I can tell (how would I confirm this?). I just want to faithfully transmit the line break into JSON.
I have actually spent an entire day on this, so it's time to ask. I have no pride anymore.
Thanks a lot
Escaping the newline character should work, so a line like the following should do the job. Also check whether your input contains a '\r' character.
NSString *escaped = [myString stringByReplacingOccurrencesOfString:@"\n" withString:@"\\n"];
You can check which control characters are present in the string using any editor that can display all characters, including non-printable ones. For example, Notepad++ can show every character contained in a string.
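Newline is only one of the control characters the FAQ mentions (the quoted error names 0x9, a TAB), so a fuller escape pass may be needed. The following is a sketch in plain C, under the assumption that it is applied to the text of one string value and not to the whole JSON document, since raw line breaks between tokens are legal and must stay unescaped. json_escape is a hypothetical helper, not an SBJSON or TouchJSON function.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies `in` to `out`, escaping the characters that
   JSON does not allow raw inside a string token. Returns the number of
   bytes written (excluding the terminating NUL), or -1 if `out` is too
   small. Bytes of 0x80 and above (e.g. UTF-8 sequences) pass through. */
int json_escape(const char *in, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return -1;

    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;
        char buf[7];               /* longest escape is \u00XX plus NUL */
        int n;

        switch (c) {
        case '"':  n = snprintf(buf, sizeof buf, "\\\""); break;
        case '\\': n = snprintf(buf, sizeof buf, "\\\\"); break;
        case '\n': n = snprintf(buf, sizeof buf, "\\n");  break;
        case '\r': n = snprintf(buf, sizeof buf, "\\r");  break;
        case '\t': n = snprintf(buf, sizeof buf, "\\t");  break;
        default:
            if (c < 0x20)          /* any other control character */
                n = snprintf(buf, sizeof buf, "\\u%04x", (unsigned)c);
            else
                n = snprintf(buf, sizeof buf, "%c", c);
        }

        /* Leave room for the terminating NUL. */
        if (used + (size_t)n + 1 > out_size)
            return -1;
        memcpy(out + used, buf, (size_t)n);
        used += (size_t)n;
    }
    out[used] = '\0';
    return (int)used;
}
```

Running the replacement over the entire document instead would also turn the legal line breaks between tokens into the two characters \n, which may be one reason a blanket replace still fails to parse.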
It sounds like your XSLT is not working, in that it is not producing legal JSON. This is unsurprising, as producing correctly formatted JSON strings is not entirely trivial. I'm wondering if it would be simpler to just use the standard XML library to parse the XML into data structures that your app can consume.
I don't have a solution for you, but I usually use CJSONSerializer and CJSONDeserializer from the TouchJSON project and it is pretty reliable, I have never had a problem with line breaks before. Just a thought.
http://code.google.com/p/touchcode/source/browse/TouchJSON/Source/JSON/CJSONDeserializer.m?r=6294fcb084a8f174e243a68ccfb7e2c519def219
http://code.google.com/p/touchcode/source/browse/TouchJSON/Source/JSON/CJSONSerializer.m?r=3f52118ae2ff60cc34e31dd36d92610c9dd6c306

How to write special characters in an Objective-C NSString

When I try to write this JSON:
{"author":"mehdi","email":"email@hotmail.fr","message":"Hello"}
like this in Objective-C:
NSString *myJson=@"{"author":"mehdi","email":"email@hotmail.fr","message":"Hello"}";
it doesn't work. Can someone help me?
You need to escape quote characters with a backslash:
NSString *myJson = @"{\"author\":\"mehdi\",\"email\":\"email@hotmail.fr\",\"message\":\"Hello\"}";
Otherwise the compiler will think that your string literal ends right after the first {.
The backslashes will not be present as characters in the resulting NSString. They are merely there as hints for the compiler and are removed from the actual string during compilation.
Newbie note: JSON strings that you read directly from a file via Objective C of course do not need any escaping! (JSON itself may need such, but that's about it. No need for additional escaping on the ObjC-side of it.)
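The claim that the backslashes never reach the string can be checked directly. This sketch uses a plain C string literal, whose escaping rules Objective-C string literals share; the JSON is shortened from the question, and kJson and count_char are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* A shortened version of the JSON from the question, written as a C
   string literal. The backslashes are consumed by the compiler and are
   not stored in the resulting string. */
static const char kJson[] = "{\"author\":\"mehdi\",\"message\":\"Hello\"}";

/* Hypothetical helper: counts how often `ch` occurs in `s`. */
size_t count_char(const char *s, char ch)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (*s == ch)
            n++;
    }
    return n;
}
```

Here count_char(kJson, '\\') is 0 while count_char(kJson, '"') is 8: the quotes are in the string and the backslashes are not.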

NSStrings, C strings, pathnames and encodings in iPhone

I am using libxml2 in my iPhone app. I have an NSString that holds the pathname to an XML file. The pathname may include non-ASCII characters. I want to get a C string representation of the NSString to pass to xmlReadFile(). It appears that cStringUsingEncoding gives me the representation I seek. I am not clear on which encoding to use.
I wonder if there is a "default" encoding in iPhone OS that I can use here and ensure that I can roundtrip non-ASCII pathnames.
Use NSString's fileSystemRepresentation. If the string contains characters that are not representable in the file system's encoding then this method will raise an exception.
To convert back, use NSFileManager's stringWithFileSystemRepresentation:length:.