Hi, I am trying to loop through an XML document using NSXMLParser and am having trouble with the description tag.
Some news websites have strange characters (HTML tags, <, >, a etc.) in the tag, so parsing does not work as expected. Could anyone provide some help?
Thanks
You'll need to convert entity references to the characters that they represent. Any HTML tags would either need to be stripped, or fed into a UIWebView.
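As a rough illustration of the entity conversion mentioned above, here is a minimal sketch that decodes only the five predefined XML entities. The helper name `DecodeBasicEntities` is hypothetical (it is not part of NSXMLParser or of any answer here), and real feeds may contain many more entities, such as `&nbsp;` or numeric references, that this does not handle.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper for illustration only: decodes the five predefined
// XML entities and nothing else.
static NSString *DecodeBasicEntities(NSString *input) {
    // "&amp;" is decoded last so that "&amp;lt;" becomes "&lt;" and not "<".
    NSArray<NSString *> *entities = @[@"&lt;", @"&gt;", @"&quot;", @"&apos;", @"&amp;"];
    NSArray<NSString *> *characters = @[@"<", @">", @"\"", @"'", @"&"];
    NSMutableString *result = [input mutableCopy];
    for (NSUInteger i = 0; i < entities.count; i++) {
        [result replaceOccurrencesOfString:entities[i]
                                withString:characters[i]
                                   options:NSLiteralSearch
                                     range:NSMakeRange(0, result.length)];
    }
    return result;
}
```

You would call it on the string collected for the description element before stripping or displaying it.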
To skip the HTML tags, you can do this:
- (NSString *)flattenHTML:(NSString *)html {
NSScanner *theScanner;
NSString *text = nil;
theScanner = [NSScanner scannerWithString:html];
while ([theScanner isAtEnd] == NO) {
[theScanner scanUpToString:@"<" intoString:NULL] ;
[theScanner scanUpToString:@">" intoString:&text] ;
html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"%@>", text] withString:@""];
}
//
html = [html stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
return html;
}
Then you can simply replace other unwanted characters by string manipulation.
Hope this helps.
Thanks,
Madhup
Related
I have a list of RSS feeds and I need to display the details of each feed on the iPhone. I get all the RSS feeds from the server and display them in a table view. Now, on selecting a row, I need to display the description of the RSS feed, which comes from the server as HTML content like this:
<img border=\"0\" hspace=\"10\" align=\"left\" style=\"margin-top:3px;margin-right:5px;\" src=\"http://timesofindia.indiatimes.com/photo/4881843.cms\" />Cadila Pharmaceutical will seek the govt's nod in two days for initiating clinical trials for a vaccine against swine flu.<img width='1' height='1' src='http://timesofindia.feedsportal.com/c/33039/f/533968/s/1f181b11/mf.gif' border='0'/><div class='mf-viral'><table border='0'><tr><td valign='middle'><img src=\"http://res3.feedsportal.com/images/emailthis2.gif\" border=\"0\" /></td><td valign='middle'><img src=\"http://res3.feedsportal.com/images/bookmark.gif\" border=\"0\" /></td></tr></table></div><br/><br/><img src=\"http://da.feedsportal.com/r/133515347892/u/0/f/533968/c/33039/s/1f181b11/a2.img\" border=\"0\"/><img width=\"1\" height=\"1\" src=\"http://pi.feedsportal.com/r/133515347892/u/0/f/533968/c/33039/s/1f181b11/a2t.img\" border=\"0\"/>
How do I display this HTML content in our iPhone UI, given that it contains text, hyperlinks and images?
Is UIWebView the proper approach in this case, given that UIWebView is heavyweight?
Please read this blog post: http://www.raywenderlich.com/2636/how-to-make-a-simple-rss-reader-iphone-app-tutorial
It should be useful for you.
You can do something like the following.
First of all:
descLbl.text=[self flattenHTML:descLbl.text]; // descLbl.text is the text in which you receive the HTML description
Now write the flattenHTML method like this:
- (NSString *)flattenHTML:(NSString *)html {
NSScanner *thescanner;
NSString *text = nil;
thescanner = [NSScanner scannerWithString:html];
while ([thescanner isAtEnd] == NO) {
[thescanner scanUpToString:@"<" intoString:NULL];
// find end of tag
[thescanner scanUpToString:@">" intoString:&text];
html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"\n"] withString:@" "];
// replace the found tag with a space
//(you can filter multi-spaces out later if you wish)
html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"%@>", text] withString:@" "];
} // while //
return html;
}
Let me know whether it works or not.
Happy coding!!!!
If you are planning to show the HTML entities as they are, I would suggest going with UIWebView. You can pass this HTML as a string to the loadHTMLString:baseURL: method.
If you want the user to view the content as a web page, then loading the URL in a UIWebView is the best and easiest way.
Like this:
UIWebView *webView = [[UIWebView alloc] init];
NSURL *pageurl = [NSURL URLWithString:@"http://www.google.com"];
NSURLRequest *request = [NSURLRequest requestWithURL:pageurl];
[webView loadRequest:request];
You can simply use the RSSKit library.
OK, I have an existing app that I am currently working on an update for. What I am trying to do is this: when the client updates their website, the app pulls the text from a certain page and displays it in a UITextView. I am trying the approach below, which works fine except that it includes the text of the nav bar. How do I get only the text, without the nav bar?
textView.text = [webView stringByEvaluatingJavaScriptFromString:@"document.documentElement.innerText"];
Well, you have two choices as I see it. If you know how long the text in the nav bar is, and it is always the same number of characters, just use:
NSString *webString = [webView stringByEvaluatingJavaScriptFromString:@"document.documentElement.innerText"];
NSUInteger length = /* number of characters to remove from the beginning of the string */;
webString = [webString substringFromIndex:length];
If you don't know how many characters you want to remove, you can use NSScanner, which is a bit more complicated but more flexible.
NSString *webString = [webView stringByEvaluatingJavaScriptFromString:@"document.documentElement.innerText"];
NSScanner *stringScanner = [NSScanner scannerWithString:webString];
NSString *content = [[NSString alloc] init];
while ([stringScanner isAtEnd] == NO) {
[stringScanner scanUpToString:@"Start of the text you want" intoString:NULL];
[stringScanner scanUpToString:@"End of the text you want" intoString:&content];
}
Hope This Helps :D
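For what it's worth, the marker-based idea can be wrapped in a small function. This is only a sketch under assumptions: the name `TextBetween` and its parameters are made up for illustration, both markers are assumed to be non-empty, and unlike the loop above it leaves the start marker out of the result and returns nil when either marker is missing.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text between startMarker and endMarker
// (both assumed non-empty), or nil if either marker cannot be found.
static NSString *TextBetween(NSString *source, NSString *startMarker, NSString *endMarker) {
    NSScanner *scanner = [NSScanner scannerWithString:source];
    // Keep whitespace exactly as it appears instead of silently skipping it.
    scanner.charactersToBeSkipped = nil;

    // Move to the start marker and step over it.
    [scanner scanUpToString:startMarker intoString:NULL];
    if (![scanner scanString:startMarker intoString:NULL]) {
        return nil; // start marker not found
    }

    // Collect everything up to the end marker.
    NSString *content = nil;
    [scanner scanUpToString:endMarker intoString:&content];
    if ([scanner isAtEnd]) {
        return nil; // end marker not found
    }
    return content ?: @""; // markers were adjacent, so there is nothing between them
}
```

The result, not the original page text, is what you would then assign to the text view.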
Here is the code I am trying to use
NSString *webString = [webView stringByEvaluatingJavaScriptFromString:@"document.documentElement.innerText"];
NSScanner *stringScanner = [NSScanner scannerWithString:webString];
NSString *content = [[NSString alloc] init];
while ([stringScanner isAtEnd] == NO) {
[stringScanner scanUpToString:@"Andalee" intoString:NULL];
[stringScanner scanUpToString:@"Eastern Sun Dance Company Rehearsal Mondays 7:00pm @ Cal Arts Academy" intoString:&content];
textView.text = webString; }
Maybe I am approaching it wrong.
Here is the webpage that I am trying to pull from http://andalee.com/andalee/CLASSES.html
I have been searching the net for an example of how to convert an HTML markup string into plain text.
I get my information from a feed which contains HTML, and I then display this information in a text view. Does UITextView have a property to convert HTML, or do I have to do it in code? I tried:
NSString *str = [NSString stringWithCString:self.fullText encoding:NSUTF8StringEndcoding];
but it doesn't seem to work. Does anyone have any ideas?
You can do it by parsing the HTML using the NSScanner class:
- (NSString *)flattenHTML:(NSString *)html {
NSScanner *theScanner;
NSString *text = nil;
theScanner = [NSScanner scannerWithString:html];
while ([theScanner isAtEnd] == NO) {
[theScanner scanUpToString:@"<" intoString:NULL] ;
[theScanner scanUpToString:@">" intoString:&text] ;
html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"%@>", text] withString:@""];
}
//
html = [html stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
return html;
}
Hope this helps.
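A variation on the same NSScanner idea, in case it is useful: instead of searching and replacing each tag in the whole string, this sketch copies the text between tags into a new string in a single pass. The function name `StripTags` is hypothetical, and the code assumes reasonably simple markup; a `>` inside an attribute value, comments, or script blocks would need a real HTML parser.

```objc
#import <Foundation/Foundation.h>

// Hypothetical single-pass variant of flattenHTML: keeps the text between
// tags and drops everything from "<" to the matching ">".
static NSString *StripTags(NSString *html) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    // Do not skip whitespace, so spacing between words is preserved.
    scanner.charactersToBeSkipped = nil;
    NSMutableString *result = [NSMutableString string];

    while (![scanner isAtEnd]) {
        NSString *text = nil;
        // Copy everything up to the next tag.
        if ([scanner scanUpToString:@"<" intoString:&text]) {
            [result appendString:text];
        }
        // Skip the tag itself: "<", its contents, and the closing ">".
        if ([scanner scanString:@"<" intoString:NULL]) {
            [scanner scanUpToString:@">" intoString:NULL];
            [scanner scanString:@">" intoString:NULL];
        }
    }
    return [result stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}
```

Entity references such as `&amp;` are left untouched here and would still need to be converted separately.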
If you are using UIWebView then it will be easier to parse HTML to text:
fullArticle = [webView stringByEvaluatingJavaScriptFromString:@"document.body.getElementsByTagName('article')[0].innerText;"]; // extract the contents by tag
fullArticle = [webView stringByEvaluatingJavaScriptFromString:@"document.body.innerText"]; // extract text inside body part of HTML
I don't think you can do it directly. However, you can use NSXMLParser to parse the HTML and retrieve exactly what you want.
If you need to present the text in read-only fashion, why not use UIWebView?
I'm trying to replace all multiple whitespace in some text with a single space. This should be a very simple task, however for some reason it's returning a different result than expected. I've read the docs on the NSScanner and it seems like it's not working properly!
NSScanner *scanner = [[NSScanner alloc] initWithString:@"This is a test of NSScanner !"];
NSMutableString *result = [[NSMutableString alloc] init];
NSString *temp;
NSCharacterSet *whitespace = [NSCharacterSet whitespaceCharacterSet];
while (![scanner isAtEnd]) {
// Scan up to and stop before any whitespace
[scanner scanUpToCharactersFromSet:whitespace intoString:&temp];
// Add all non-whitespace characters to string
[result appendString:temp];
// Scan past all whitespace and replace with a single space
if ([scanner scanCharactersFromSet:whitespace intoString:NULL]) {
[result appendString:@" "];
}
}
But for some reason the result is @"ThisisatestofNSScanner!" instead of @"This is a test of NSScanner !".
If you read through the comments and what each line should achieve it seems simple enough!? scanUpToCharactersFromSet should stop the scanner just as it encounters whitespace. scanCharactersFromSet should then progress the scanner past the whitespace up to the non-whitespace characters. And then the loop continues to the end.
What am I missing or not understanding?
Ah, I figured it out! By default the NSScanner skips whitespace!
Turns out you just have to set charactersToBeSkipped to nil:
[scanner setCharactersToBeSkipped:nil];
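Putting that together, a minimal sketch of the whole thing might look like this. The function name `CollapseWhitespace` is just for illustration, and it only treats spaces and tabs (whitespaceCharacterSet) as whitespace, as in the question.

```objc
#import <Foundation/Foundation.h>

// Illustrative helper: replaces every run of spaces/tabs with a single space.
static NSString *CollapseWhitespace(NSString *input) {
    NSScanner *scanner = [NSScanner scannerWithString:input];
    // Without this, the scanner silently skips whitespace before each scan.
    scanner.charactersToBeSkipped = nil;

    NSCharacterSet *whitespace = [NSCharacterSet whitespaceCharacterSet];
    NSMutableString *result = [NSMutableString string];

    while (![scanner isAtEnd]) {
        NSString *word = nil;
        // Copy the next run of non-whitespace characters, if there is one.
        if ([scanner scanUpToCharactersFromSet:whitespace intoString:&word]) {
            [result appendString:word];
        }
        // Consume the following run of whitespace and emit one space for it.
        if ([scanner scanCharactersFromSet:whitespace intoString:NULL]) {
            [result appendString:@" "];
        }
    }
    return result;
}
```

Checking the return value of the first scan also avoids appending a stale or nil string when the input starts with whitespace.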
I am working on an iPhone OS application that sends an xml request to a webservice. In order to send the request, the xml is added to an NSString. When doing this I have experienced some trouble with quotation marks " and backslashes \ in the xml file, which have required escaping. Is there a complete list of characters that need to be escaped?
Also, is there an accepted way of doing this escaping (i.e. replacing \ with \\ and " with \"), or is it a case of creating a method myself?
Thanks
NSString *escapedString = [unescapedString stringByReplacingOccurrencesOfString:@"\\" withString:@"\\\\"];
escapedString = [escapedString stringByReplacingOccurrencesOfString:@"\"" withString:@"\\\""];
Doesn't fully answer your question, but seems like it might help with the second part...
You can use a NSScanner that will scan for characters from a character set and if found, it will add the escaping \\ to a new string and copy the next substring from the found special character till the next.
NSString *sourceString = /* Some input String*/;
NSMutableString *destString = [@"" mutableCopy];
NSCharacterSet *escapeCharsSet = [NSCharacterSet characterSetWithCharactersInString:@" ()\\"];
NSScanner *scanner = [NSScanner scannerWithString:sourceString];
// The scanner skips whitespace by default; disable that because a space is one of the characters to escape
[scanner setCharactersToBeSkipped:nil];
while (![scanner isAtEnd]) {
// Start from an empty string so nothing stale or nil is appended when no characters are scanned
NSString *tempString = @"";
[scanner scanUpToCharactersFromSet:escapeCharsSet intoString:&tempString];
if([scanner isAtEnd]){
[destString appendString:tempString];
}
else {
[destString appendFormat:@"%@\\%@", tempString, [sourceString substringWithRange:NSMakeRange([scanner scanLocation], 1)]];
[scanner setScanLocation:[scanner scanLocation]+1];
}
}