How to convert NSString HTML markup to plain text NSString? - iphone

I have been searching the net for an example of how to convert an HTML markup string into plain text.
I get my information from a feed that contains HTML, and I then display it in a text view. Does UITextView have a property to convert HTML, or do I have to do it in code? I tried:
NSString *str = [NSString stringWithCString:self.fullText encoding:NSUTF8StringEncoding];
but it doesn't seem to work. Does anyone have any ideas?

You can do it by parsing the HTML with the NSScanner class:
- (NSString *)flattenHTML:(NSString *)html {
    NSScanner *theScanner;
    NSString *text = nil;
    theScanner = [NSScanner scannerWithString:html];
    while ([theScanner isAtEnd] == NO) {
        [theScanner scanUpToString:@"<" intoString:NULL];
        [theScanner scanUpToString:@">" intoString:&text];
        html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"%@>", text] withString:@""];
    }
    //
    html = [html stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    return html;
}
Hope this helps.
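As a possible alternative to the NSScanner loop above, here is a minimal sketch that strips tags in a single regular-expression pass and then decodes a few common entities. The function name PlainTextFromHTML and the short entity list are assumptions made for illustration; they do not come from the answers here. A pattern like this is not a real HTML parser, so it can be confused by input such as a literal ">" inside an attribute value.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (an illustration, not part of the original answer).
// Replaces anything that looks like a tag with a space, decodes a small,
// assumed list of entities, then collapses runs of whitespace.
static NSString *PlainTextFromHTML(NSString *html) {
    if (html.length == 0) {
        return @"";
    }

    NSRegularExpression *tagPattern =
        [NSRegularExpression regularExpressionWithPattern:@"<[^>]+>"
                                                  options:0
                                                    error:NULL];
    NSRegularExpression *spacePattern =
        [NSRegularExpression regularExpressionWithPattern:@"\\s+"
                                                  options:0
                                                    error:NULL];
    if (tagPattern == nil || spacePattern == nil) {
        return html; // pattern failed to compile; return the input unchanged
    }

    NSString *text = [tagPattern stringByReplacingMatchesInString:html
                                                          options:0
                                                            range:NSMakeRange(0, html.length)
                                                     withTemplate:@" "];

    // "&amp;" is decoded last so that "&amp;lt;" becomes "&lt;" rather than "<".
    NSArray *entities = @[ @[@"&lt;", @"<"], @[@"&gt;", @">"],
                           @[@"&quot;", @"\""], @[@"&#39;", @"'"],
                           @[@"&nbsp;", @" "], @[@"&amp;", @"&"] ];
    for (NSArray *pair in entities) {
        text = [text stringByReplacingOccurrencesOfString:pair[0] withString:pair[1]];
    }

    text = [spacePattern stringByReplacingMatchesInString:text
                                                  options:0
                                                    range:NSMakeRange(0, text.length)
                                             withTemplate:@" "];
    return [text stringByTrimmingCharactersInSet:
                [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}
```

With this sketch, PlainTextFromHTML(@"<p>Hello <b>world</b> &amp; more</p>") returns "Hello world & more". Replacing tags with a space rather than an empty string keeps adjacent block elements from running together, at the cost of splitting a word that has a tag in the middle of it.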

If you are using UIWebView then it will be easier to parse HTML to text:
fullArticle = [webView stringByEvaluatingJavaScriptFromString:@"document.body.getElementsByTagName('article')[0].innerText;"]; // extract the contents by tag
fullArticle = [webView stringByEvaluatingJavaScriptFromString:@"document.body.innerText"]; // extract text inside body part of HTML

I don't think you can do it directly. However, you can use NSXMLParser to parse the HTML and retrieve exactly what you want.

If you need to present the text in read-only fashion, why not use UIWebView?

Related

Display RSS Feed in iPhone

I have a list of RSS feeds and I need to display the detail of each feed on the iPhone. I get all the RSS feeds from the server and display them in a table view. Now, on selecting a row, I need to display the description of the RSS feed, which comes from the server as HTML content like this:
<img border=\"0\" hspace=\"10\" align=\"left\" style=\"margin-top:3px;margin-right:5px;\" src=\"http://timesofindia.indiatimes.com/photo/4881843.cms\" />Cadila Pharmaceutical will seek the govt's nod in two days for initiating clinical trials for a vaccine against swine flu.<img width='1' height='1' src='http://timesofindia.feedsportal.com/c/33039/f/533968/s/1f181b11/mf.gif' border='0'/><div class='mf-viral'><table border='0'><tr><td valign='middle'><img src=\"http://res3.feedsportal.com/images/emailthis2.gif\" border=\"0\" /></td><td valign='middle'><img src=\"http://res3.feedsportal.com/images/bookmark.gif\" border=\"0\" /></td></tr></table></div><br/><br/><img src=\"http://da.feedsportal.com/r/133515347892/u/0/f/533968/c/33039/s/1f181b11/a2.img\" border=\"0\"/><img width=\"1\" height=\"1\" src=\"http://pi.feedsportal.com/r/133515347892/u/0/f/533968/c/33039/s/1f181b11/a2t.img\" border=\"0\"/>
How do I display this HTML content in our iPhone UI, given that it contains text, hyperlinks, and images?
Is UIWebView the proper way to do this, given that UIWebView is heavyweight?
Please read this blog post: http://www.raywenderlich.com/2636/how-to-make-a-simple-rss-reader-iphone-app-tutorial
It should be useful for you.
You can do something like the following.
First of all:
descLbl.text = [self flattenHTML:descLbl.text]; // descLbl.text is the text that holds the HTML description
Then write the flattenHTML: method like this:
- (NSString *)flattenHTML:(NSString *)html {
    NSScanner *thescanner;
    NSString *text = nil;
    thescanner = [NSScanner scannerWithString:html];
    while ([thescanner isAtEnd] == NO) {
        [thescanner scanUpToString:@"<" intoString:NULL];
        // find end of tag
        [thescanner scanUpToString:@">" intoString:&text];
        html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"\n"] withString:@" "];
        // replace the found tag with a space
        // (you can filter multi-spaces out later if you wish)
        html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"%@>", text] withString:@" "];
    } // while //
    return html;
}
Let me know whether it works.
Happy coding!!!!
If you are planning to show the HTML content as it is, I would suggest going with UIWebView. You can pass this HTML as a string to the loadHTMLString:baseURL: method.
If you want the user to view the content as a web page, then loading the url in UIWebView is the best and easiest way.
Like this:
UIWebView *webView = [[UIWebView alloc] init];
NSURL *pageurl = [NSURL URLWithString:@"http://www.google.com"];
NSURLRequest *request = [NSURLRequest requestWithURL:pageurl];
[webView loadRequest:request];
You can simply use RSSKit library.

Trying to get text from website and display it in ios?

OK, I have an existing app that I am currently working on an update for. What I am trying to do is this: when the client updates their website, the app pulls the text from a certain page and displays it in a UITextView. The approach below works fine, except that it includes the text of the nav bar. How do I get only the text, without the nav bar?
textView.text = [webView stringByEvaluatingJavaScriptFromString:@"document.documentElement.innerText"];
As I see it, you have two choices. If you know how long the text in the nav bar is, and it is always the same number of characters, just use:
NSString *webString = [webView stringByEvaluatingJavaScriptFromString:@"document.documentElement.innerText"];
NSUInteger length = 0; // set this to the number of characters to remove from the beginning of the string
webString = [webString substringFromIndex:length];
If you don't know how many characters you want to remove, you can use NSScanner, which is a bit more complicated but more flexible.
NSString *webString = [webView stringByEvaluatingJavaScriptFromString:@"document.documentElement.innerText"];
NSScanner *stringScanner = [NSScanner scannerWithString:webString];
NSString *content = [[NSString alloc] init];
while ([stringScanner isAtEnd] == NO) {
    [stringScanner scanUpToString:@"Start of the text you want" intoString:NULL];
    [stringScanner scanUpToString:@"End of the text you want" intoString:&content];
}
Hope This Helps :D
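The NSScanner idea above can also be wrapped in a small function, sketched below under a few assumptions. The name TextBetweenMarkers is hypothetical, and unlike the loop above this version leaves the start marker out of the result, stops at the first match, and returns nil when either marker is missing.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (an illustration, not part of the original answer).
// Returns the trimmed text found between the first startMarker and the next
// endMarker, or nil if either marker cannot be found.
static NSString *TextBetweenMarkers(NSString *source,
                                    NSString *startMarker,
                                    NSString *endMarker) {
    if (source.length == 0 || startMarker.length == 0 || endMarker.length == 0) {
        return nil;
    }

    NSScanner *scanner = [NSScanner scannerWithString:source];
    // By default NSScanner skips whitespace; turn that off so positions are exact.
    scanner.charactersToBeSkipped = nil;

    // Move to the start marker, then step over it.
    [scanner scanUpToString:startMarker intoString:NULL];
    if (![scanner scanString:startMarker intoString:NULL]) {
        return nil; // start marker not present
    }

    // Collect everything up to the end marker.
    NSString *content = nil;
    [scanner scanUpToString:endMarker intoString:&content];
    if (![scanner scanString:endMarker intoString:NULL]) {
        return nil; // end marker not present after the start marker
    }

    if (content == nil) {
        return @""; // the two markers were adjacent
    }
    return [content stringByTrimmingCharactersInSet:
                [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}
```

For example, TextBetweenMarkers(@"Home | About | Classes: Monday 7pm. Footer", @"Classes:", @"Footer") returns "Monday 7pm.". Returning nil for a missing marker lets the caller fall back to showing the full page text.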
Here is the code I am trying to use
NSString *webString = [webView stringByEvaluatingJavaScriptFromString:@"document.documentElement.innerText"];
NSScanner *stringScanner = [NSScanner scannerWithString:webString];
NSString *content = [[NSString alloc] init];
while ([stringScanner isAtEnd] == NO) {
    [stringScanner scanUpToString:@"Andalee" intoString:NULL];
    [stringScanner scanUpToString:@"Eastern Sun Dance Company Rehearsal Mondays 7:00pm @ Cal Arts Academy" intoString:&content];
    textView.text = webString;
}
Maybe I am approaching it wrong.
Here is the webpage that I am trying to pull from http://andalee.com/andalee/CLASSES.html

"Inject" Objective-C data into a UIWebView that loads a local HTML file?

I am trying to load a UIWebView with local HTML/CSS that is built to look like a nutrition label. The problem is, the data for the food lies inside of my iPhone app. Do I have to put all of my HTML into one enormous NSString object and concatenate my data into it, or is there a way to load the HTML from a local .html file, but somehow "inject" the data that is stored within Objective-C into it?
If the data to be injected is "safe", you could construct your "enormous NSString object" as a format string, sprinkled with %@ markers, and use stringWithFormat: to perform the injection in a single move. This is how I construct the pages in the TidBITS News app, using pieces that all come from RSS. It's really quite painless.
You can load basic html using NSData's method dataWithContentsOfFile and then use javascript to modify html in the way you need.
Code would look something like this (using this example):
NSString *path = [[NSBundle mainBundle] pathForResource:@"food" ofType:@"html"];
NSData *data = [NSData dataWithContentsOfFile:path];
if (data) {
    [webView loadData:data MIMEType:@"text/html" textEncodingName:@"UTF-8" baseURL:[[NSBundle mainBundle] resourceURL]];
}
// Run the JavaScript below only after the page has finished loading,
// for example from the delegate's webViewDidFinishLoad: method.
[webView stringByEvaluatingJavaScriptFromString:@"var script = document.createElement('script');"
    "script.type = 'text/javascript';"
    "script.text = \"function myFunction() { "
    "var field = document.getElementById('field_3');"
    "field.value='Calling function - OK';"
    "}\";"
    "document.getElementsByTagName('head')[0].appendChild(script);"];
[webView stringByEvaluatingJavaScriptFromString:@"myFunction();"];
I would do a hybrid of both: have an HTML file in the app that you load, then replace certain strings in it before giving it to the UIWebView. So, for example, you could have a file like this:
<html>
<head>
<title><!--foodName--></title>
</head>
<body>
<h1><!--foodName--></h1>
<p>Calories / 100g: <!--foodCalories--></p>
</body>
</html>
You'd load that into Cocoa, then replace your special placeholder comments with the actual values you want.
NSDictionary *substitutions = [NSDictionary dictionaryWithObjectsAndKeys:
    @"Carrots", @"foodName",
    [NSNumber numberWithInt:20], @"foodCalories",
    // add more as needed
    nil];
NSMutableString *html = [NSMutableString stringWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"foodCard" ofType:@"html"]
                                                         encoding:NSUTF8StringEncoding
                                                            error:nil];
for (NSString *substitutionKey in substitutions)
{
    // Look the value up in the substitutions dictionary and turn it into a string.
    NSString *substitution = [[substitutions objectForKey:substitutionKey] description];
    NSString *searchTerm = [NSString stringWithFormat:@"<!--%@-->", substitutionKey];
    [html replaceOccurrencesOfString:searchTerm withString:substitution options:0 range:NSMakeRange(0, [html length])];
}
[webView loadHTMLString:html baseURL:[[NSBundle mainBundle] resourceURL]];
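An earlier answer notes that plain substitution assumes the injected data is "safe". Where that is not guaranteed, one option is to escape each value before it goes into the template. The sketch below shows the placeholder replacement with that extra step; EscapeForHTML and FillTemplate are hypothetical names, and the set of escaped characters is an assumption that suits element content rather than every HTML context.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: escapes the characters that would otherwise be read as markup.
// "&" is handled first so the entities produced below are not escaped a second time.
static NSString *EscapeForHTML(NSString *value) {
    NSString *escaped = [value stringByReplacingOccurrencesOfString:@"&" withString:@"&amp;"];
    escaped = [escaped stringByReplacingOccurrencesOfString:@"<" withString:@"&lt;"];
    escaped = [escaped stringByReplacingOccurrencesOfString:@">" withString:@"&gt;"];
    escaped = [escaped stringByReplacingOccurrencesOfString:@"\"" withString:@"&quot;"];
    return escaped;
}

// Hypothetical helper: replaces each <!--key--> placeholder in the template
// with the escaped description of the matching value.
static NSString *FillTemplate(NSString *templateHTML, NSDictionary *substitutions) {
    NSMutableString *html = [templateHTML mutableCopy];
    for (NSString *key in substitutions) {
        NSString *value = EscapeForHTML([[substitutions objectForKey:key] description]);
        NSString *placeholder = [NSString stringWithFormat:@"<!--%@-->", key];
        [html replaceOccurrencesOfString:placeholder
                              withString:value
                                 options:0
                                   range:NSMakeRange(0, html.length)];
    }
    return html;
}
```

For example, filling @"<h1><!--foodName--></h1>" with the value "Mac & Cheese" for foodName gives "<h1>Mac &amp; Cheese</h1>". The resulting string can then be passed to loadHTMLString:baseURL: as in the answer above.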
Since iOS 2 you can call - (NSString *)stringByEvaluatingJavaScriptFromString:(NSString *)script on a UIWebView to execute JavaScript in your web view. This is the best way to inject data from the "Objective-C part" of your application.
Cf: https://developer.apple.com/library/ios/documentation/UIKit/Reference/UIWebView_Class/#//apple_ref/occ/instm/UIWebView/stringByEvaluatingJavaScriptFromString:

How is stringWithFormat used here?

NSString *html = @"html page to parse";
NSString *text = @"some html text";
html = [html stringByReplacingOccurrencesOfString:
    [NSString stringWithFormat:@"%@>", text] withString:@""];
My question is: what does @"%@>" do in stringWithFormat:?
thanks
%@ tells NSString that you will be including an object in your string, so it will try to print it as a string. According to Apple, %@ is:
"Objective-C object, printed as the string returned by descriptionWithLocale: if available, or description otherwise. Also works with CFTypeRef objects, returning the result of the CFCopyDescription function."
The leading @ symbol simply denotes an NSString literal.
Apple documentation
The code
html = [html stringByReplacingOccurrencesOfString:
    [NSString stringWithFormat:@"%@>", text] withString:@""];
replaces every occurrence of "some html text>" in "html page to parse" with an empty string.
Because that substring does not occur in this example, html remains "html page to parse".
Using stringWithFormat: you can easily perform many operations, such as converting an int or float value to a string:
int age = 18;
NSString *myage = [NSString stringWithFormat:@"My age is %d", age];
Here the value of myage is "My age is 18".
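To tie this back to the tag-stripping code the question comes from, here is a small sketch of what the format string does there. When NSScanner scans up to ">", the captured text is something like "<b" without the closing bracket, so @"%@>" appends it to rebuild the complete tag before removing it. The function name StringByRemovingTag is hypothetical and only serves the illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: scannedText is a tag without its closing ">" (for example "<b").
// stringWithFormat: substitutes it for %@ and the literal ">" completes the tag.
static NSString *StringByRemovingTag(NSString *html, NSString *scannedText) {
    NSString *fullTag = [NSString stringWithFormat:@"%@>", scannedText];
    return [html stringByReplacingOccurrencesOfString:fullTag withString:@""];
}
```

For example, StringByRemovingTag(@"<b>bold", @"<b") builds the tag "<b>" and returns "bold".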

NSXMLParser RSS feed strange characters issue

Hi, I am trying to loop through an XML document using NSXMLParser and am having trouble with the description tag.
Some news websites have strange characters (HTML tags, <, >, a etc.) in the tag, and so parsing does not work as expected. Could anyone provide some help?
thanks
You'll need to convert entity references to the characters that they represent. Any HTML tags would either need to be stripped, or fed into a UIWebView.
To skip the HTML tags you need to do this:
- (NSString *)flattenHTML:(NSString *)html {
    NSScanner *theScanner;
    NSString *text = nil;
    theScanner = [NSScanner scannerWithString:html];
    while ([theScanner isAtEnd] == NO) {
        [theScanner scanUpToString:@"<" intoString:NULL];
        [theScanner scanUpToString:@">" intoString:&text];
        html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"%@>", text] withString:@""];
    }
    //
    html = [html stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    return html;
}
Then you can simply replace other unwanted characters by string manipulation.
Hope this helps.
Thanks,
Madhup