I am parsing news items from an XML feed. Each item has a description, which I store in an NSString and show in a UILabel.
But the description arrives like this:
Bad news for Windows&#8217; the researches show that Windows&#8217; for years....
The UILabel shows those unwanted characters as they are. The numbers differ from string to string; they are not always the same.
I want to remove the sequences that begin with &# together with the numbers that follow. How can I do that? Which string encoding should I use?
Thanks a lot.
EDIT: I don't have just one string. If I remove &#8217; from this one, there might be &#7610; in another one, and that would not be removed.
I can also remove the &# characters and all numbers. But when I do that, a string like "In 1980, Jobs told us to&#2540; do something" becomes "In , Jobs told us to do something". The 1980 is gone too, which I don't want. That is a problem as well.
These are ASCII symbols, so you need to use a UTF-8 string.
So use this line of code:
NSString *resultString = [NSString stringWithUTF8String:myAsciiString];
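Sequences such as &#8217; are decimal numeric character references, and changing the string encoding alone does not decode them. One way to turn each reference into its character while leaving ordinary numbers such as 1980 alone is sketched below. It assumes ARC and only decimal references; `DecodeNumericReferences` is a hypothetical helper name, and named entities (&amp;) or hex forms (&#x2019;) are not handled.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces decimal references such as &#8217; with the
// character they encode. Digits outside a "&#...;" sequence are untouched.
static NSString *DecodeNumericReferences(NSString *input) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"&#([0-9]+);"
                                                  options:0
                                                    error:NULL];
    NSMutableString *result = [input mutableCopy];
    NSArray *matches = [regex matchesInString:input
                                      options:0
                                        range:NSMakeRange(0, input.length)];
    // Walk backwards so the ranges of earlier matches stay valid.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSString *digits = [input substringWithRange:[match rangeAtIndex:1]];
        long long value = [digits longLongValue];
        if (value <= 0 || value > 0x10FFFF) {
            continue;  // not a valid Unicode code point, leave it as-is
        }
        uint32_t codePoint = CFSwapInt32HostToLittle((uint32_t)value);
        NSString *character =
            [[NSString alloc] initWithBytes:&codePoint
                                     length:sizeof(codePoint)
                                   encoding:NSUTF32LittleEndianStringEncoding];
        if (character == nil) {
            continue;  // for example a lone surrogate value
        }
        [result replaceCharactersInRange:match.range withString:character];
    }
    return result;
}
```

With this, "In 1980, Jobs told us to&#2540; do something" keeps its 1980, and only the reference itself is replaced.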
I am getting RSS feeds from different sources, and I receive about a dozen different types of RSS output. I am using an XML parser to get the <item>, <title>, <link>, and <description> tags.
After getting the description value of each item, I use a regular expression to parse the description field and extract the image link (if any) and the clear text. The following regular expression works for Yahoo/CNN feeds.
@"<p><a.+?><img src=\"(.+?)\".+?<\\/a>(.+?)<\\/p>";
But some unwanted characters are still left behind in the description (the second match in the regex above).
I am looking for suggestions on how to set up different regexes to evaluate the RSS description and get the "clear text" and "image links". Applying many regular expressions and checking each one for success costs performance.
To summarize, I see two problems here.
1) Constructing different regexes, applying each one to the description field, checking for success, and taking the output. Applying 4 or 5 regexes will cost performance. In this step I am trying to separate the description from the image link.
2) The description obtained above is still not clear text; a lot of extra characters and tags need to be removed. I need a regular expression that removes all of those unnecessary things. Can somebody who has already done this help me?
You can put all the unwanted characters in a set and strip them from the string. Check this function:
- (NSString *)stripTags:(NSString *)str {
    // Every character listed here is removed from the string.
    NSCharacterSet *doNotWant = [NSCharacterSet characterSetWithCharactersInString:@"-=+[]{}:/?.><;,!@#$%^&*\n()\r'"];
    // Split on the unwanted characters, then join the remaining pieces.
    return [[str componentsSeparatedByCharactersInSet:doNotWant] componentsJoinedByString:@""];
}
I hope this will be helpful.
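For the tag-removal part of the question, another option is to drop whole tags with a single regular expression instead of removing individual characters, which leaves ordinary punctuation in the text intact. This is only a minimal sketch under the assumption that the description contains simple, well-formed tags; `StripTagsWithRegex` is a hypothetical name, and entities or `>` characters inside attribute values are not handled.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes anything that looks like <...> and trims the
// surrounding whitespace. Text between the tags is kept as-is.
static NSString *StripTagsWithRegex(NSString *html) {
    NSString *withoutTags =
        [html stringByReplacingOccurrencesOfString:@"<[^>]+>"
                                        withString:@""
                                           options:NSRegularExpressionSearch
                                             range:NSMakeRange(0, html.length)];
    return [withoutTags stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}
```

Because the pattern is applied once per description, this avoids trying several feed-specific expressions in turn just to get the clear text.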
I have an array that contains the description of a route on a map. I got this array by parsing JSON. The array contains strings in this format:
"<b>Sri Krishna Nagar Rd</b> \U306b\U5411\U304b\U3063\U3066<b>\U5317\U6771</b>\U306b\U9032\U3080",
"\U53f3\U6298\U3057\U3066\U305d\U306e\U307e\U307e <b>Sri Krishna Nagar Rd</b> \U3092\U9032\U3080",
"\U5927\U304d\U304f\U5de6\U65b9\U5411\U306b\U66f2\U304c\U308a\U305d\U306e\U307e\U307e <b>Bailey Rd/<wbr/>NH 30</b> \U3092\U9032\U3080<div class=\"\">\U305d\U306e\U307e\U307e NH 30 \U3092\U9032\U3080</div><div class=\"google_note\">\n<b landmarkid=\"0x39ed57bfe47253b7:0x779c8bf48892f269\" class=\"dir-landmark\">Petrol Bunk</b>\U3092\U901a\U904e\U3059\U308b<div class=\"dirseg-sub\">\Uff083.9 km \U5148\U3001\U53f3\U624b\Uff09</div>\n</div>",
Now I want to get the names of places from this array, like Sri Krishna Nagar Rd and NH 30 Petrol Bunk. The first two should give Sri Krishna Nagar Rd and the last one should give NH 30 Petrol Bunk.
How can I get a result like this? Any help would be appreciated. Thanks in advance.
Also, suppose I have a string in this format: "\U5de6\U6298\U3059\U308b", which doesn't contain any place name. How should I handle this scenario?
You can get it like this:
NSString *strName = [yourArray objectAtIndex:index];
// Note: this assumes the string contains "<b>"; otherwise objectAtIndex:1 throws.
NSString *yourPlaceString = [[strName componentsSeparatedByString:@"<b>"] objectAtIndex:1];
yourPlaceString = [[yourPlaceString componentsSeparatedByString:@"</b>"] objectAtIndex:0];
You can get all the places this way.
First of all, you should check whether a cleaner API is available for the service you query. If the service returns such garbage in its JSON response, it shouldn't be your responsibility to clean up that mess: a really clean API should return text that is more usable.
Next, if you really don't have any other choice and need to clean this text, you have three options:
If the text is XHTML (I mean real XHTML, conforming to the XML standard), you may use an NSXMLParser to filter out any tags and keep only the text from your string. This may be overkill here, so I don't really recommend it.
You can use regular expressions. If you are developing for iOS 4.0+ you can use the NSRegularExpression class for this purpose. The tricky part is getting the regex right (I can help you with that if needed).
You can use the NSScanner class (which has been available in iOS since 2.0, IIRC) to scan the characters in your string and parse it. This is probably easier to understand and the way to go if you are not a regex expert, so I recommend this approach.
For example, if you choose the NSScanner solution, you can scan your string for characters in the alphanumeric character set, to scan letters and digits and accumulate them (you may also add punctuation characters to the NSCharacterSet you are using if needed). The NSScanner will stop when it encounters characters such as the Unicode characters \Uxxxx or < and >. When you encounter <, you can ask the NSScanner to ignore the characters up to the next >, then start scanning alphanumeric characters again and accumulating, and so on until the end of the string.
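The NSScanner loop described above could look roughly like the sketch below. It is an illustration under stated assumptions rather than a finished parser: `PlaceTextFromMarkup` is a hypothetical name, and the allowed set is restricted to ASCII letters, digits, space, and "/" (an assumption made here because `alphanumericCharacterSet` also matches non-Latin letters, which would let the Japanese text through).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: keeps runs of ASCII letters, digits, spaces and "/",
// skips everything between "<" and ">", and drops any other character.
static NSString *PlaceTextFromMarkup(NSString *markup) {
    NSMutableCharacterSet *allowed =
        [NSMutableCharacterSet characterSetWithRange:NSMakeRange('a', 26)];
    [allowed addCharactersInRange:NSMakeRange('A', 26)];
    [allowed addCharactersInRange:NSMakeRange('0', 10)];
    [allowed addCharactersInString:@" /"];

    // Scanning of unwanted text stops at the next allowed character or tag.
    NSMutableCharacterSet *stopSet = [allowed mutableCopy];
    [stopSet addCharactersInString:@"<"];

    NSScanner *scanner = [NSScanner scannerWithString:markup];
    scanner.charactersToBeSkipped = nil;  // do not silently skip whitespace

    NSMutableString *text = [NSMutableString string];
    while (![scanner isAtEnd]) {
        NSString *chunk = nil;
        if ([scanner scanCharactersFromSet:allowed intoString:&chunk]) {
            [text appendString:chunk];  // accumulate wanted characters
        } else if ([scanner scanString:@"<" intoString:NULL]) {
            // Ignore the tag itself, up to and including the closing ">".
            [scanner scanUpToString:@">" intoString:NULL];
            [scanner scanString:@">" intoString:NULL];
        } else {
            // Anything else (for example non-ASCII text) is discarded.
            [scanner scanUpToCharactersFromSet:stopSet intoString:NULL];
        }
    }
    return [text stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceCharacterSet]];
}
```

A string with no place name, such as one made only of non-ASCII characters, simply yields an empty string, which the caller can check for.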
Finally, if you really find a pattern in the response strings you are receiving, for example if your place name is always between the first <b> and </b> pair (but you have to be sure of that), you can handle it in other ways, such as:
splitting your string using the <b> text as the separator (e.g. componentsSeparatedByString:)
or asking rangeOfString: for the string <b> and then for the string </b>, and once you have their positions, using substringWithRange: on your original string to extract only the place name (rangeOfString: will be faster than componentsSeparatedByString: because it stops at the first occurrence found)
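The rangeOfString: variant can be sketched as follows. `FirstBoldText` is a hypothetical helper name; it returns nil when there is no `<b>`...`</b>` pair, which also covers strings that contain no place name at all. Note the assumption that the opening tag is exactly `<b>`: a tag with attributes, such as `<b landmarkid="...">`, would not be matched by this sketch.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text between the first "<b>" and the
// next "</b>", or nil if either marker is missing.
static NSString *FirstBoldText(NSString *html) {
    if (html.length == 0) {
        return nil;
    }
    NSRange open = [html rangeOfString:@"<b>"];
    if (open.location == NSNotFound) {
        return nil;  // no place name in this string
    }
    NSUInteger start = NSMaxRange(open);
    // Search for the closing tag only after the opening one.
    NSRange close = [html rangeOfString:@"</b>"
                                options:0
                                  range:NSMakeRange(start, html.length - start)];
    if (close.location == NSNotFound) {
        return nil;
    }
    return [html substringWithRange:NSMakeRange(start, close.location - start)];
}
```

Checking for nil before using the result avoids the out-of-bounds exception that indexing into a split array would raise on strings without `<b>`.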
It looks like an encoding problem. Can you change the encoding of the source or target to a different format? I had similar issues with the German ö, ä, and ü characters when UTF-8 was turned off.
I want to display a superscript number in a simple text view. Can anyone help me with that?
UITextView can't handle rich text, so if you want superscripted numbers you have to build up a string using the Unicode characters for superscripted numbers, e.g.
NSString *super0 = @"\u2070";
This gives you a superscripted zero. You can find the rest of the numerals on Wikipedia. You'll have to build up the string yourself from the individual digits, but that will be a nice programming exercise.
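Building the string from individual digits might look like the sketch below. `SuperscriptDigits` is a hypothetical helper name. One detail worth noting is that the superscript forms of 1, 2, and 3 live at U+00B9, U+00B2, and U+00B3 rather than next to the others in the U+2070 block, so a lookup table is simpler than adding an offset.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: converts a non-negative number into a string made of
// Unicode superscript digits.
static NSString *SuperscriptDigits(NSUInteger number) {
    // Index i holds the superscript form of digit i.
    static const unichar kSuperscripts[10] = {
        0x2070, 0x00B9, 0x00B2, 0x00B3, 0x2074,
        0x2075, 0x2076, 0x2077, 0x2078, 0x2079
    };
    NSString *digits =
        [NSString stringWithFormat:@"%lu", (unsigned long)number];
    NSMutableString *result =
        [NSMutableString stringWithCapacity:digits.length];
    for (NSUInteger i = 0; i < digits.length; i++) {
        unichar superscript = kSuperscripts[[digits characterAtIndex:i] - '0'];
        [result appendString:[NSString stringWithCharacters:&superscript
                                                     length:1]];
    }
    return result;
}
```

The result can then be appended to the base text, for example with `[NSString stringWithFormat:@"%@%@", @"x", SuperscriptDigits(2)]`, and assigned to the text view's text property.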
Under the Edit menu in Xcode there is a Special Characters option.
int no = 1;
NSString *str = @"xyz";
yourTextView.text = [NSString stringWithFormat:@"%@%d", str, no];
I am having a problem parsing XML files that contain special characters such as single and double quotes (', "). I am using NSXMLParser's parser:foundCharacters: method to collect characters in my code.
<synctext type = "word" >They raced to the park Arthur pointed to a sign "Whats that say" he asked Zoo said DW Easy as pie</synctext>
When I parse and save the text from the above tag of my XML file, the resulting string appears in GDB as
"\n\t\tThey raced to the park Arthur pointed to a sign \"Whats that say\" he asked Zoo said DW Easy as pie";
Observe there are 2 issues:
1)Unwanted characters at the beginning of the string.
2)The double quotes around Whats that say.
Can anyone please help me get rid of these unwanted characters and read special characters properly?
NSString *trimmed = [string stringByTrimmingCharactersInSet:[NSCharacterSet characterSetWithCharactersInString:@" \n\t"]];
The parser is apparently returning exactly what's in the string. That is, the XML was coded with the starting tag on one line, a newline, two tabs, and the start of the string. And quotes in the string are obviously there in the original (and it's not clear in at least this example why you'd want to delete them).
But if you want these characters gone then you need to post-process the string. You can use Rams' statement to eliminate the newline and tabs, and stringByReplacingOccurrencesOfString:withString: to zap the quotes.
(Note that some XML parsers can be instructed to return strings like this with the leading/trailing stuff stripped, but I'm not sure about this one. The quotes will always be there, though.)
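Put together, the post-processing step could be sketched like this. `CleanParsedText` is a hypothetical helper name, and the sketch assumes the full text of the element has already been accumulated, since parser:foundCharacters: may deliver the text of one element in several pieces.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: trims leading/trailing whitespace and newlines, then
// removes every double quote from the text.
static NSString *CleanParsedText(NSString *raw) {
    NSString *trimmed =
        [raw stringByTrimmingCharactersInSet:
         [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    return [trimmed stringByReplacingOccurrencesOfString:@"\""
                                              withString:@""];
}
```

The backslashes shown by GDB in front of the quotes are only the debugger's escaping of the printed string; they are not part of the stored text, so only the quote character itself needs replacing.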
In my current implementation of a UISearchBarController I'm using [NSString compare:] inside the filterContentForSearchText:scope: delegate method to return relevant objects, based on their name property, to the results UITableView as you start typing.
So far this works great in English and Korean, but what I'd like to be able to do is search within NSString's defined character clusters. This is only applicable to a handful of languages, of which Korean is one.
In English, compare: returns new results after every letter you enter, but in Korean the results are generated once you complete a recognized grapheme cluster. I would like to be able to search through my Korean objects' name property via the individual elements that make up a syllable.
Can anyone shed any light on how to approach this? I'm sure it has something to do with searching through UTF-16 characters manually, or with using a lower-level class.
Cheers!
Here is a specific example that's just not working:
NSString *string1 = @"이";
NSString *string2 = @"ㅣ";
NSRange resultRange = [[string1 decomposedStringWithCanonicalMapping] rangeOfString: [string2 decomposedStringWithCanonicalMapping] options:(NSLiteralSearch)];
The result is always NSNotFound, with or without decomposedStringWithCanonicalMapping.
Any ideas?
I'm no expert, but I think you're very unlikely to find a clean solution for what you want. There doesn't seem to be any relationship between a Korean character's Unicode value and the graphemes that it's made up of.
e.g. "이" is \uc774 and "ㅣ" is \u3163. From the perspective of the NSString, they're just two different characters with no specific relationship to each other.
I suspect that you will have to find or create an explicit mapping between characters and their graphemes, and then write your own search function that consults this mapping.
This very long page on Unicode Korean can help you, if it comes to that. It has a table of all the characters which suggests some structured relation between the way characters are numbered and their components.
If you use compare:options: with NSLiteralSearch, it should compare character by character, that is, by Unicode code points, regardless of the grapheme. The default behavior of compare: is to use no options. You could use -decomposedStringWithCanonicalMapping to get the decomposed Unicode form of the input string, but I'm not sure how that would interact with compare:.
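One thing that may explain the NSNotFound result in the example: "ㅣ" (U+3163) is a Hangul compatibility jamo, and canonical decomposition leaves it unchanged, while "이" canonically decomposes into the conjoining jamo U+110B and U+1175. Compatibility decomposition maps U+3163 to U+1175, so decomposing both strings that way gives the literal search something to match. The sketch below assumes this behaviour of -decomposedStringWithCompatibilityMapping; `TextContainsJamo` is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: reports whether the jamo typed by the user occurs
// anywhere inside the (possibly precomposed) Hangul text.
static BOOL TextContainsJamo(NSString *text, NSString *jamo) {
    // Break syllables into conjoining jamo, and map compatibility jamo
    // (the ones a keyboard usually produces) onto the same code points.
    NSString *haystack = [text decomposedStringWithCompatibilityMapping];
    NSString *needle = [jamo decomposedStringWithCompatibilityMapping];
    // A literal search compares code units instead of whole clusters.
    NSRange found = [haystack rangeOfString:needle options:NSLiteralSearch];
    return found.location != NSNotFound;
}
```

A caveat with this approach: compatibility consonants map to the leading-consonant forms, so a consonant typed on its own would match syllable-initial positions but not the same consonant in final position. Handling that case would need an explicit mapping, as suggested above.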