I have an array that contains the description of a route on a map. I got this array by parsing JSON. The array contains strings in this format:
"<b>Sri Krishna Nagar Rd</b> \U306b\U5411\U304b\U3063\U3066<b>\U5317\U6771</b>\U306b\U9032\U3080",
"\U53f3\U6298\U3057\U3066\U305d\U306e\U307e\U307e <b>Sri Krishna Nagar Rd</b> \U3092\U9032\U3080",
"\U5927\U304d\U304f\U5de6\U65b9\U5411\U306b\U66f2\U304c\U308a\U305d\U306e\U307e\U307e <b>Bailey Rd/<wbr/>NH 30</b> \U3092\U9032\U3080<div class=\"\">\U305d\U306e\U307e\U307e NH 30 \U3092\U9032\U3080</div><div class=\"google_note\">\n<b landmarkid=\"0x39ed57bfe47253b7:0x779c8bf48892f269\" class=\"dir-landmark\">Petrol Bunk</b>\U3092\U901a\U904e\U3059\U308b<div class=\"dirseg-sub\">\Uff083.9 km \U5148\U3001\U53f3\U624b\Uff09</div>\n</div>",
Now I want to get the place names from this array, such as Sri Krishna Nagar Rd and NH 30 Petrol Bunk. The first two strings should give Sri Krishna Nagar Rd and the last one should give NH 30 Petrol Bunk.
How can I get a result like this? Any help would be appreciated. Thanks in advance.
Also, suppose I have a string in this format: "\U5de6\U6298\U3059\U308b", which doesn't contain any place name. How should I handle this scenario?
You can get it like this:
NSString *strName = [yourArray objectAtIndex:index];
NSArray *parts = [strName componentsSeparatedByString:@"<b>"];
NSString *yourPlaceString = nil;
if ([parts count] > 1) { // strings without a <b> tag have no place name
    yourPlaceString = [[[parts objectAtIndex:1] componentsSeparatedByString:@"</b>"] objectAtIndex:0];
}
You can get all the places this way.
First of all, check whether the service you query offers a cleaner API for this data. If the service returns such garbage in its JSON response, it shouldn't be your responsibility to clean up that mess: a really clean API should return text that is more usable.
Next, if you really have no other choice and need to clean this text, you have a few options:
If the text is XHTML (I mean real XHTML, conforming to the XML standard), you may use an NSXMLParser to filter out the tags and keep only the text from your string. This is probably overkill here, so I don't really recommend it.
You can use regular expressions. If you are developing for iOS 4.0+ you can use the NSRegularExpression class for this purpose. The tricky part is getting the regex right (I can help you with that if needed).
You can use the NSScanner class (available in iOS since 2.0, IIRC) to scan the characters in your string and parse it. This is probably easier to understand and the way to go if you are not a regex expert, so I recommend this approach.
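For the regular-expression option, a minimal sketch might look like the following. It assumes ARC and iOS 4.0+; the helper name BoldTexts and the pattern are illustrative choices, not anything the service documents. It returns the contents of every <b>…</b> element, so the caller still has to decide which entries are place names (in the first sample string, the second match is Japanese text).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the inner text of every <b ...>...</b> element.
// Nested tags such as <wbr/> are left in the captured text.
static NSArray *BoldTexts(NSString *html) {
    if (html == nil) return @[];
    NSError *error = nil;
    // "<b" followed by optional attributes, then a lazy capture up to "</b>".
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<b(?:\\s[^>]*)?>(.*?)</b>"
                                                  options:NSRegularExpressionDotMatchesLineSeparators
                                                    error:&error];
    if (regex == nil) return @[];

    NSMutableArray *texts = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:html
                                      options:0
                                        range:NSMakeRange(0, [html length])];
    for (NSTextCheckingResult *match in matches) {
        [texts addObject:[html substringWithRange:[match rangeAtIndex:1]]];
    }
    return texts;
}
```

A string with no <b> tag simply yields an empty array, which covers the case of a step that has no place name.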
For example, if you choose the NSScanner solution, you can scan your string for characters in the alphanumeric character set, to scan letters and digits and accumulate them (you may also add punctuation characters to the NSCharacterSet you are using if needed). Have the NSScanner stop when it encounters characters such as the Unicode characters \Uxxxx or < and >. When you encounter <, ask the NSScanner to ignore the characters up to the next >, then start scanning and accumulating alphanumeric characters again, and so on until the end of the string.
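The NSScanner idea can be sketched roughly as below, assuming ARC. One deviation to be aware of: [NSCharacterSet alphanumericCharacterSet] also contains Japanese letters, so this sketch builds its own ASCII-only set instead. The helper name PlainASCIIText and the exact set of kept characters are assumptions made for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: keeps ASCII letters, digits, space, '/' and '.',
// skips everything between '<' and '>', and drops all other characters.
static NSString *PlainASCIIText(NSString *html) {
    if (html == nil) return nil;
    NSMutableCharacterSet *keep =
        [NSMutableCharacterSet characterSetWithRange:NSMakeRange('a', 26)];
    [keep addCharactersInRange:NSMakeRange('A', 26)];
    [keep addCharactersInRange:NSMakeRange('0', 10)];
    [keep addCharactersInString:@" /."];

    NSScanner *scanner = [NSScanner scannerWithString:html];
    [scanner setCharactersToBeSkipped:nil]; // do not silently skip whitespace
    NSMutableString *result = [NSMutableString string];

    while (![scanner isAtEnd]) {
        NSString *chunk = nil;
        if ([scanner scanCharactersFromSet:keep intoString:&chunk]) {
            [result appendString:chunk];                  // accumulate wanted text
        } else if ([scanner scanString:@"<" intoString:NULL]) {
            [scanner scanUpToString:@">" intoString:NULL]; // ignore the tag body
            [scanner scanString:@">" intoString:NULL];
        } else {
            // Any other character (for example Japanese text): drop it.
            [scanner setScanLocation:[scanner scanLocation] + 1];
        }
    }
    return [result stringByTrimmingCharactersInSet:
                       [NSCharacterSet whitespaceCharacterSet]];
}
```

Note that this keeps all ASCII text outside tags, so a fragment like "3.9 km" in the third sample string would survive as well; further filtering is left to the caller.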
Finally, if you really find a pattern in the response strings you are receiving, for example if the place name is always between the first <b> and </b> pair (but you have to be sure of that), you can handle it in other ways, such as:
splitting your string using the <b> text as the separator (e.g. componentsSeparatedByString:)
or calling rangeOfString: for the string <b> and then for the string </b> and, once you have their positions, using substringWithRange: on your original string to extract only the place name (rangeOfString: will be faster than componentsSeparatedByString: because it stops at the first occurrence found)
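The rangeOfString: variant could be sketched like this, assuming the place name really is the first <b>…</b> pair. The function name FirstBoldText is hypothetical; it returns nil when there is no such pair, which is one way to handle steps without a place name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: text between the first "<b>" and the next "</b>",
// or nil if either marker is missing.
static NSString *FirstBoldText(NSString *html) {
    if (html == nil) return nil;
    NSRange open = [html rangeOfString:@"<b>" options:NSLiteralSearch];
    if (open.location == NSNotFound) return nil;

    // Only search for the closing tag after the opening one.
    NSUInteger start = NSMaxRange(open);
    NSRange rest = NSMakeRange(start, [html length] - start);
    NSRange close = [html rangeOfString:@"</b>" options:NSLiteralSearch range:rest];
    if (close.location == NSNotFound) return nil;

    return [html substringWithRange:NSMakeRange(start, close.location - start)];
}
```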
It looks like an encoding problem. Can you change the encoding of the source or target to a different format? I had similar issues with the German characters ö, ä and ü when UTF-8 was turned off.
I have an XML feed that I parse for news. Each news item has a description, and I'm using an NSString to show that description in a UILabel.
But the description comes like this:
Bad news for Windows&#8217; the researches show that Windows&#8217; for years....
And it is shown with those unwanted characters in the UILabel. The numbers change from string to string; they are not always the same.
I want to remove the sequences that begin with &# and the numbers that follow. How can I do that? Which string encoding format should I use?
Thanks a lot.
EDIT: I don't have just one string. If I remove &#8217; from this one, there might be &#7610; in another one, and it won't be removed.
I can remove the &# characters and all numbers too. But when I do that, for a string like "In 1980, Jobs told us to&#2540; do something" the output will be "In , Jobs told us to do something". The 1980 will be gone too, and I don't want that. That's a problem as well.
These are ASCII symbols, so you need to use a UTF-8 string. Check this link.
So use this line of code:
NSString *resultString = [NSString stringWithUTF8String:myAsciiString];
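Another way to approach the question is to decode the numeric references rather than strip digits, which leaves ordinary numbers such as 1980 alone. The following is only a sketch, assuming ARC and that the text contains decimal references of the form &#NNNN;. The helper name DecodeNumericEntities is made up, and named entities such as &amp; or hexadecimal references are not handled.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces decimal references like &#8217; with the
// character they denote. Digits outside "&#...;" are never touched.
static NSString *DecodeNumericEntities(NSString *input) {
    if (input == nil) return nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"&#([0-9]+);"
                                                  options:0
                                                    error:NULL];
    NSArray *matches = [regex matchesInString:input
                                      options:0
                                        range:NSMakeRange(0, [input length])];
    NSMutableString *result = [input mutableCopy];

    // Walk backwards so earlier match ranges stay valid after each replacement.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        long long value = [[input substringWithRange:[match rangeAtIndex:1]] longLongValue];
        BOOL isSurrogate = (value >= 0xD800 && value <= 0xDFFF);
        if (value <= 0 || value > 0x10FFFF || isSurrogate) continue; // not a valid code point

        uint32_t codePoint = CFSwapInt32HostToLittle((uint32_t)value);
        NSString *character =
            [[NSString alloc] initWithBytes:&codePoint
                                     length:sizeof(codePoint)
                                   encoding:NSUTF32LittleEndianStringEncoding];
        if (character == nil) continue;
        [result replaceCharactersInRange:[match range] withString:character];
    }
    return result;
}
```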
I have some XML that is coming back from a web service. I in turn use XSLT to turn that XML into JSON (I am turning someone else's XML service into a JSON-based service). My service, which is now outputting JSON, is consumed by my iPhone app using the de facto iPhone JSON framework, SBJSON.
The problem is, using the [string JSONValue] method chokes, and I can see that it's due to line breaks. Lo and behold, even the FAQ tells me the problem but I don't know how to fix it.
The parser fails to parse string X
Are you sure it's legal JSON? This framework is really strict, so won't accept stuff that (apparently) several validators accepts. In particular, literal TAB, NEWLINE or CARRIAGE RETURN (and all other control characters) characters in string tokens are disallowed, but can be very difficult to spot. (These characters are allowed between tokens, of course.)
If you get something like the below (the number may vary) then one of your strings has disallowed Unicode control characters in it.
NSLocalizedDescription = "Unescaped control character '0x9'";
I have tried using a line such as: myString = [myString stringByReplacingOccurrencesOfString:@"\n" withString:@"\\n"];
But that doesn't work. My XML service is not coming back as CDATA. The XML does have a line break in it as far as I can tell (how would I confirm this?). I just want to faithfully transmit the line break into JSON.
I have actually spent an entire day on this, so it's time to ask. I have no pride anymore.
Thanks a lot.
Escaping the newline character should work, so the following line should ideally work. Also check whether your input contains a '\r' character.
myString = [myString stringByReplacingOccurrencesOfString:@"\n" withString:@"\\n"];
You can check which control character is present in the string using any editor which supports displaying all characters (non-displayable characters as well). e.g. using Notepad++ you can view all characters contained in a string.
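If an editor is not handy, the same check can be done in code. This is a rough sketch, assuming ARC; the function name ControlCharacterOffsets is hypothetical. It logs every character from [NSCharacterSet controlCharacterSet] with its offset, which answers the "how would I confirm this" part. Keep in mind it also reports newlines that sit between JSON tokens, which are legal there.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: logs each control character in `text` and returns
// the offsets (as NSNumber objects) at which they occur.
static NSArray *ControlCharacterOffsets(NSString *text) {
    NSMutableArray *offsets = [NSMutableArray array];
    NSCharacterSet *controls = [NSCharacterSet controlCharacterSet];
    NSUInteger length = [text length];
    NSRange search = NSMakeRange(0, length);

    while (search.length > 0) {
        NSRange found = [text rangeOfCharacterFromSet:controls
                                              options:0
                                                range:search];
        if (found.location == NSNotFound) break;

        unichar c = [text characterAtIndex:found.location];
        NSLog(@"control character 0x%X at offset %lu",
              (unsigned int)c, (unsigned long)found.location);
        [offsets addObject:[NSNumber numberWithUnsignedInteger:found.location]];

        // Continue searching just past the character that was found.
        NSUInteger next = NSMaxRange(found);
        search = NSMakeRange(next, length - next);
    }
    return offsets;
}
```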
It sounds like your XSLT is not working, in that it is not producing legal JSON. This is unsurprising, as producing correctly formatted JSON strings is not entirely trivial. I'm wondering if it would be simpler to just use the standard XML library to parse the XML into data structures that your app can consume.
I don't have a solution for you, but I usually use CJSONSerializer and CJSONDeserializer from the TouchJSON project and it is pretty reliable, I have never had a problem with line breaks before. Just a thought.
http://code.google.com/p/touchcode/source/browse/TouchJSON/Source/JSON/CJSONDeserializer.m?r=6294fcb084a8f174e243a68ccfb7e2c519def219
http://code.google.com/p/touchcode/source/browse/TouchJSON/Source/JSON/CJSONSerializer.m?r=3f52118ae2ff60cc34e31dd36d92610c9dd6c306
I am having a problem parsing XML files that contain special characters such as single quotes and double quotes (', "). I am using NSXMLParser's parser:foundCharacters: method to collect characters in my code.
<synctext type = "word" >They raced to the park Arthur pointed to a sign "Whats that say" he asked Zoo said DW Easy as pie</synctext>
When I parse and save the text from the above tag of my XML file, the resulting string appears in GDB as
"\n\t\tThey raced to the park Arthur pointed to a sign \"Whats that say\" he asked Zoo said DW Easy as pie";
Observe that there are two issues:
1) Unwanted characters at the beginning of the string.
2) The double quotes around Whats that say.
Can anyone please help me get rid of these unwanted characters and read special characters properly?
string = [string stringByTrimmingCharactersInSet:[NSCharacterSet characterSetWithCharactersInString:@" \n\t"]];
The parser is apparently returning exactly what's in the string. That is, the XML was coded with the starting tag on one line, a newline, two tabs, and the start of the string. And quotes in the string are obviously there in the original (and it's not clear in at least this example why you'd want to delete them).
But if you want these characters gone, then you need to post-process the string. You can use Rams' statement to eliminate the newline and tabs, and stringByReplacingOccurrencesOfString:withString: to zap the quotes.
(Note that some XML parsers can be instructed to return strings like this with the leading/trailing stuff stripped, but I'm not sure about this one. The quotes will always be there, though.)
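Put together, the post-processing could look like the sketch below, assuming ARC. The class name SyncTextCollector and the function CollectSyncTexts are invented for illustration. It accumulates the pieces delivered to parser:foundCharacters: and trims the result once the element ends; the quotes are kept, since they are part of the data (the backslashes in the debugger output are most likely just how the debugger prints a quote inside a string).

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate: collects the trimmed text of each <synctext> element.
@interface SyncTextCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *texts;
@property (nonatomic, strong) NSMutableString *buffer;
@end

@implementation SyncTextCollector

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if ([elementName isEqualToString:@"synctext"]) {
        self.buffer = [NSMutableString string]; // start a fresh element
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // May be called several times per element, so append rather than assign.
    [self.buffer appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"synctext"] && self.buffer != nil) {
        NSCharacterSet *ws = [NSCharacterSet whitespaceAndNewlineCharacterSet];
        [self.texts addObject:[self.buffer stringByTrimmingCharactersInSet:ws]];
        self.buffer = nil;
    }
}

@end

// Parses `xml` synchronously and returns the collected strings.
static NSArray *CollectSyncTexts(NSData *xml) {
    SyncTextCollector *collector = [[SyncTextCollector alloc] init];
    collector.texts = [NSMutableArray array];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:xml];
    parser.delegate = collector;
    [parser parse];
    return collector.texts;
}
```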
"artistName":"Travie McCoy", "collectionName":"Billionaire (feat. Bruno Mars) - Single", "trackName":"Billionaire (feat. Bruno Mars)",
I wish to get the artist name, i.e. Travie McCoy, from within that code using regex. Please note I am using RegexKitLite for the iPhone SDK, if this changes things.
Thanks
"?artistName"?\s*:\s*"([^"]*)("|$) should do the trick. It even handles some variations in the string:
White space before and after the :
artistName with and without the quotes
missing " at the end of the artist name if it is the last thing on the line
But there will be many more variations in the input you might encounter that this regex will not match.
Also you don’t want to use a regex for matching this for performance reasons. Right now you might only be interested in the artistName field. But some time later you will want information from the other fields. If you just change the field name in the regex you’ll have to match the whole string again. Much better to use a parser and transform the whole string into a dictionary where you can access the different fields easily. Parsing the whole string shouldn’t take much longer than matching the last key/value pair using a regex.
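As a sketch of the parser route, the snippet below uses NSJSONSerialization (available from iOS 5) rather than a hand-written parser. It assumes the key/value pairs shown in the question sit inside one complete JSON object, which the question does not actually show; the function name ArtistNameFromJSON is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: parses a JSON object and returns its "artistName"
// value, or nil if the text is not a JSON object or the key is missing.
static NSString *ArtistNameFromJSON(NSString *json) {
    NSData *data = [json dataUsingEncoding:NSUTF8StringEncoding];
    if (data == nil) return nil;

    NSError *error = nil;
    id object = [NSJSONSerialization JSONObjectWithData:data
                                                options:0
                                                  error:&error];
    if (![object isKindOfClass:[NSDictionary class]]) return nil;

    // Once parsed, every other field is available from the same dictionary.
    id name = [(NSDictionary *)object objectForKey:@"artistName"];
    return [name isKindOfClass:[NSString class]] ? name : nil;
}
```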
This looks like some kind of JSON, and there are lots of good and complete parsers available. It isn't hard to write one yourself though. You could write a simple recursive descent parser in a couple of hours. I think this is something every programmer should have done at least once.
\"?artistName\"?\\s*:\\s*\"([^\"]*)(\"|$)
That's the pattern escaped for an Objective-C string literal.
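For illustration, here is the same pattern applied with NSRegularExpression. This swaps in Foundation's regex class instead of RegexKitLite, which the question uses, so treat it as a sketch of the pattern rather than of that library's API. The function name ArtistNameByRegex is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the first artistName value matched by the
// pattern from the answer above, or nil if nothing matches.
static NSString *ArtistNameByRegex(NSString *text) {
    if (text == nil) return nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:
                                 @"\"?artistName\"?\\s*:\\s*\"([^\"]*)(\"|$)"
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:text
                          options:0
                            range:NSMakeRange(0, [text length])];
    if (match == nil) return nil;

    // Capture group 1 is the text between the quotes after the colon.
    return [text substringWithRange:[match rangeAtIndex:1]];
}
```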
In my current implementation of a UISearchBarController I'm using [NSString compare:] inside the filterContentForSearchText:scope: delegate method to return relevant objects based on their name property to the results UITableView as you start typing.
So far this works great in English and Korean, but what I'd like to be able to do is search within NSString's defined character clusters. This is only applicable to a handful of languages, of which Korean is one.
In English, compare: returns new results after every letter you enter, but in Korean the results are generated once you complete a recognized grapheme cluster. I would like to be able to search through my Korean objects' name property via the individual elements that make up a syllable.
Can anyone shed any light on how to approach this? I'm sure it has something to do with searching through UTF16 characters manually, or by utilising a lower level class.
Cheers!
Here is a specific example that's just not working:
NSString *string1 = @"이";
NSString *string2 = @"ㅣ";
NSRange resultRange = [[string1 decomposedStringWithCanonicalMapping] rangeOfString: [string2 decomposedStringWithCanonicalMapping] options:(NSLiteralSearch)];
The result is always NSNotFound, with or without decomposedStringWithCanonicalMapping.
Any ideas?
I'm no expert, but I think you're very unlikely to find a clean solution for what you want. There doesn't seem to be any relationship between a Korean character's Unicode value and the graphemes that it's made up of.
e.g. "이" is \uc774 and "ㅣ" is \u3163. From the perspective of the NSString, they're just two different characters with no specific relationship to each other.
I suspect that you will have to find or create an explicit mapping between characters and their graphemes, and then write your own search function that consults this mapping.
This very long page on Unicode Korean can help you, if it comes to that. It has a table of all the characters which suggests some structured relation between the way characters are numbered and their components.
If you use compare:options: with NSLiteralSearch, it should compare character by character, that is, by Unicode code points, regardless of the grapheme. The default behavior of compare: is to use no options. You could use -decomposedStringWithCanonicalMapping to get the decomposed Unicode form of the input string, but I'm not sure how that would interact with compare:.
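One thing that may be worth trying is compatibility decomposition instead of canonical decomposition. As far as I understand Unicode normalization, "이" (U+C774) decomposes into the conjoining jamo U+110B U+1175, while the standalone "ㅣ" (U+3163) is a compatibility jamo that only maps to U+1175 under the compatibility mapping. The sketch below relies on that assumption; the function name SyllableContainsJamo is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if the decomposed form of `text` contains the
// decomposed form of `jamo`, compared literally code unit by code unit.
static BOOL SyllableContainsJamo(NSString *text, NSString *jamo) {
    if (text == nil) return NO;
    // Compatibility mapping turns both precomposed syllables and standalone
    // compatibility jamo into conjoining jamo.
    NSString *decomposedText = [text decomposedStringWithCompatibilityMapping];
    NSString *decomposedJamo = [jamo decomposedStringWithCompatibilityMapping];
    if ([decomposedJamo length] == 0) return NO;

    NSRange found = [decomposedText rangeOfString:decomposedJamo
                                          options:NSLiteralSearch];
    return found.location != NSNotFound;
}
```

A caveat under the same assumption: standalone consonants appear to map to their leading (initial) conjoining forms, so a consonant typed on its own would not match the same consonant in final position without an extra mapping step.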