I'm trying to parse the HTML below with TouchXML, but it keeps crashing when I try to extract certain attributes. I'm totally new to parsers, so I apologize if this is a basic question. What I'm trying to accomplish is to parse each attribute and value and copy them into strings. I've been looking for a good HTML parser, and I believe TouchXML is the best I've seen because of its Tidy support. Speaking of Tidy, how can I run this HTML through Tidy first and then parse it? I'm not sure how to do this. Below is the code I have so far; it doesn't work because it isn't pulling everything I need from the HTML. Any help or advice would be much appreciated. Thanks.
My current code:
NSMutableArray *res = [[NSMutableArray alloc] init];
// using local resource file
NSString *XMLPath = [[[NSBundle mainBundle] resourcePath] stringByAppendingPathComponent:@"example.html"];
NSData *XMLData = [NSData dataWithContentsOfFile:XMLPath];
CXMLDocument *doc = [[[CXMLDocument alloc] initWithData:XMLData options:0 error:nil] autorelease];
NSArray *nodes = NULL;
nodes = [doc nodesForXPath:@"//div" error:nil];
for (CXMLElement *node in nodes) {
    NSMutableDictionary *item = [[NSMutableDictionary alloc] init];
    [item setObject:[[node attributeForName:@"id"] stringValue] forKey:@"id"];
    [res addObject:item];
    [item release];
}
NSLog(@"%@", res);
[res release];
HTML file that needs to be parsed:
<html>
<head>
<base target="_blank" />
</head>
<body style="margin:2;">
<div id="group">
<div id="groupURL">Group URL</div>
<img id="grouplogo" src="http://images.example.com/groups/image.png" />
<div id="groupcomputer">Group title this would be here</div>
<div id="groupinfos">
<div id="groupinfo-l">Person</div><div id="groupinfo-r">Ralph</div>
<div id="groupinfo-l">Years</div><div id="groupinfo-r">4 years</div>
<div id="groupinfo-l">Salary</div><div id="groupinfo-r">100K</div>
<div id="groupinfo-l">Other</div><div id="groupoth" style="width:15px">other info</div>
</body>
</html>
EDIT: I could use Element Parser but I need to know how to extract the Person's Name from the following example which would be Ralph in this case.
<div id="groupinfo-l">Person</div><div id="groupinfo-r">Ralph</div>
I don't know whether you are doing something wrong, but I recommend using Element Parser, the best parser for XML and HTML I've found. Hope this helps.
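For the specific lookup in the EDIT (the name next to the "Person" label), an XPath query is one possible approach. The following is only a sketch in the same manual reference counting style as the question. It assumes that your TouchXML build was compiled with libtidy support so that the CXMLDocumentTidyHTML option is available, and that the tidied document has no default XML namespace. PersonNameFromHTMLData is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>
#import "TouchXML.h"

// Hypothetical helper: returns the text of the <div> that follows the
// "Person" label, or nil if the document cannot be parsed or has no match.
// Assumption: CXMLDocumentTidyHTML exists in your TouchXML build (libtidy
// enabled) and the tidied markup carries no default XML namespace.
static NSString *PersonNameFromHTMLData(NSData *htmlData) {
    NSError *error = nil;
    CXMLDocument *doc = [[[CXMLDocument alloc] initWithData:htmlData
                                                    options:CXMLDocumentTidyHTML
                                                      error:&error] autorelease];
    if (doc == nil) {
        NSLog(@"Could not parse HTML: %@", error);
        return nil;
    }

    // Find the label div whose text is "Person", then take the first div
    // that follows it at the same nesting level.
    NSString *xpath = @"//div[@id='groupinfo-l'][normalize-space(.)='Person']"
                      @"/following-sibling::div[1]";
    NSArray *nodes = [doc nodesForXPath:xpath error:&error];
    if ([nodes count] == 0) {
        NSLog(@"No match for the Person label: %@", error);
        return nil;
    }
    return [[nodes objectAtIndex:0] stringValue];
}
```

Checking the document and the node array before use also avoids passing nil to setObject:forKey:, which throws an exception when an element has no matching attribute.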
Related
I have to send key-value pairs in the body of a POST request to get a result. I have written this code so far, but I am getting the wrong response.
NSString *urlAsString = @"http://someurl.com";
NSDictionary *params = @{@"request" : @"get_pull_down_menu", @"data" : @"0,0,3,1"};
NSString * parameters = [[NSString alloc] initWithFormat:@"request=get_pull_down_menu&data=0,0,3,1"];
NSURL *serviceURL = [NSURL URLWithString:urlAsString];
NSData *postData = [NSKeyedArchiver archivedDataWithRootObject:params];//TO convert nsdict into acceptable nsdata type
NSData *postData = [parameters dataUsingEncoding:NSUTF8StringEncoding];
NSMutableURLRequest *requestService = [NSMutableURLRequest requestWithURL:serviceURL];
[requestService setTimeoutInterval:30.0f];
[requestService setHTTPMethod:@"POST"];
[requestService setValue:[NSString stringWithFormat:@"%ld",postData.length] forHTTPHeaderField:@"Content-Length"];
[requestService setValue:@"application/x-www-form-urlencoded charset=utf-8" forHTTPHeaderField:@"Content-Type"];
[requestService setHTTPBody:postData ];
NSOperationQueue *queue = [[NSOperationQueue alloc] init];
[NSURLConnection sendAsynchronousRequest:requestService queue:queue completionHandler:^(NSURLResponse *response, NSData *data, NSError *error){
    if ([data length] >0 && error == nil) {
        NSString *html = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
        NSLog(@"%ld lendth of recieved data" , html.length);
        NSLog(@"HTML = %@", html);
    }
    else if ([data length] == 0 && error == nil)
    {
        NSLog(@"Nothing was downloaded");
    }
    else if (error != nil)
    {
        NSLog(@"Here is Error Log = %@", error);
    }
}];
In the code above you might be confused by the two kinds of parameters, one built from an NSDictionary and the other from a string. That is just for testing, because I have doubts about passing the NSDictionary object. One more thing: the return type is plain text, and I have tested the service in Postman (a Chrome plug-in), where it works fine. I think I am making a mistake after receiving the data. Kindly suggest a way out; I would be very thankful.
P.S.: The request returns the same output with both versions of postData (parameters sent as a string and as an NSDictionary), but with some errors. The error log is below:
2014-10-02 13:38:42.227 [5624:1549]1798 lendth of recieved data
2014-10-02 13:38:42.228 [5624:154907] HTML = <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<HTML><head><script src="http://cdn.spotflux.com/service/partners/"></script><script type="text/javascript" src="http://cdn.spotflux.com/service/launcher/partner.js"></script><TITLE>The page cannot be found</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=Windows-1252">
<STYLE type="text/css">
BODY { font: 8pt/12pt verdana }
H1 { font: 13pt/15pt verdana }
H2 { font: 8pt/12pt verdana }
A:link { color: red }
A:visited { color: maroon }
</STYLE>
</HEAD><BODY><TABLE width=500 border=0 cellspacing=10><TR><TD>
<h1>The page cannot be found</h1>
The page you are looking for might have been removed, had its name changed, or is temporarily unavailable.
<hr>
<p>Please try the following:</p>
<ul>
<li>Make sure that the Web site address displayed in the address bar of your browser is spelled and formatted correctly.</li>
<li>If you reached this page by clicking a link, contact
the Web site administrator to alert them that the link is incorrectly formatted.
</li>
<li>Click the Back button to try another link.</li>
</ul>
<h2>HTTP Error 404 - File or directory not found.<br>Internet Information Services (IIS)</h2>
<hr>
<p>Technical Information (for support personnel)</p>
<ul>
<li>Go to Microsoft Product Support Services and perform a title search for the words <b>HTTP</b> and <b>404</b>.</li>
<li>Open <b>IIS Help</b>, which is accessible in IIS Manager (inetmgr),
and search for topics titled <b>Web Site Setup</b>, <b>Common Administrative Tasks</b>, and <b>About Custom Error Messages</b>.</li>
</ul>
</TD></TR></TABLE></BODY></HTML>
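Two things stand out in the code and log above. NSKeyedArchiver produces a binary archive, not form data, so only the string variant matches the declared Content-Type (whose parameters are normally separated with a semicolon, as in "application/x-www-form-urlencoded; charset=utf-8"). And the body that came back is an IIS 404 page, which suggests checking the URL and the HTTP status code before looking at the body. The following is a minimal sketch assuming ARC and iOS 7 or later; FormEncodedBody and LogHTTPResponse are hypothetical helper names.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds a "key=value&key=value" body from a dictionary,
// percent-encoding each key and value. Keys are sorted only to make the
// output order predictable.
static NSData *FormEncodedBody(NSDictionary *params) {
    NSMutableCharacterSet *allowed =
        [[NSCharacterSet URLQueryAllowedCharacterSet] mutableCopy];
    // These characters are legal in a query but have a meaning in form data.
    [allowed removeCharactersInString:@"&=+"];

    NSMutableArray *pairs = [NSMutableArray array];
    NSArray *keys = [[params allKeys] sortedArrayUsingSelector:@selector(compare:)];
    for (NSString *key in keys) {
        NSString *value = [[params objectForKey:key] description];
        NSString *encodedKey =
            [key stringByAddingPercentEncodingWithAllowedCharacters:allowed];
        NSString *encodedValue =
            [value stringByAddingPercentEncodingWithAllowedCharacters:allowed];
        [pairs addObject:[NSString stringWithFormat:@"%@=%@", encodedKey, encodedValue]];
    }
    return [[pairs componentsJoinedByString:@"&"] dataUsingEncoding:NSUTF8StringEncoding];
}

// Hypothetical helper: logs the HTTP status so that an error page such as a
// 404 is not mistaken for a successful plain-text reply.
static void LogHTTPResponse(NSURLResponse *response, NSData *data) {
    if ([response isKindOfClass:[NSHTTPURLResponse class]]) {
        NSInteger status = [(NSHTTPURLResponse *)response statusCode];
        NSLog(@"HTTP status = %ld, %lu bytes received",
              (long)status, (unsigned long)[data length]);
    }
}
```

In the completion handler, calling LogHTTPResponse(response, data) first shows whether the server accepted the request; a non-nil data with a nil error only means the transfer itself succeeded.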
I am working on an XML parser. Using NSLog, I get the following XML string:
<table><tr><td><img src="http://www.24h.com.vn/upload/3-2012/images/2012-09-16/1347762760_bong-da-genoa-juve.jpg"width='80' height='80' /></td><td>(20h, 16/9) Juventus sẽ có trận đấu khó khăn tới sân của Genoa.</td></tr></table>
How can I get the link in the img src attribute?
I tried:
else if ([elementName isEqualToString:@"img"])
{
    currentString = [attributeDict objectForKey:@"src"];
    self.storingCharacter = YES;
}
But it was unsuccessful. Any help?
You need an HTML parser to get the objects you want. I suggest hpple.
I took the snippet from Albaregar's solution in "parsing HTML on the iPhone" and adapted it to your needs. I haven't tested the adapted snippet, but it should work.
#import "TFHpple.h"
NSData *data = [[NSData alloc] initWithContentsOfFile:@"yourfile.html"];
// Create parser
TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:data];
// Get the first img tag (XPath positions start at 1, not 0)
NSArray *elements = [xpathParser searchWithXPathQuery:@"(//img)[1]"];
if ([elements count] > 0) {
    // Access the first img element
    TFHppleElement *element = [elements objectAtIndex:0];
    // Read the src attribute (content would return the element's text instead)
    NSString *src_attr = [element objectForKey:@"src"];
    NSLog(@"%@", src_attr);
}
[xpathParser release];
[data release];
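If you would rather stay with NSXMLParser, reading the attribute in parser:didStartElement:... as in the question is the right idea. Note, though, that the string in the question has no whitespace between the closing quote of src and the width attribute, which XML does not allow, so a strict XML parser is likely to stop with an error before it reports the img element. The following is a sketch assuming ARC; ImageSourceCollector and ImageSourcesInXMLString are hypothetical names.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate that collects the src attribute of every <img>
// element and remembers the parse error, if any.
@interface ImageSourceCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *sources;
@property (nonatomic, strong) NSError *parseError;
@end

@implementation ImageSourceCollector

- (instancetype)init {
    self = [super init];
    if (self) {
        _sources = [NSMutableArray array];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser
didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if ([elementName isEqualToString:@"img"]) {
        NSString *src = [attributeDict objectForKey:@"src"];
        if (src != nil) {
            [self.sources addObject:src];
        }
    }
}

- (void)parser:(NSXMLParser *)parser parseErrorOccurred:(NSError *)parseError {
    // Malformed input ends up here instead of in didStartElement.
    self.parseError = parseError;
}

@end

// Hypothetical helper: returns the collected src values and hands back the
// parse error through outError when parsing fails.
static NSArray *ImageSourcesInXMLString(NSString *xml, NSError **outError) {
    ImageSourceCollector *collector = [[ImageSourceCollector alloc] init];
    NSXMLParser *parser =
        [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = collector;
    if (![parser parse] && outError != NULL) {
        *outError = collector.parseError ?: parser.parserError;
    }
    return collector.sources;
}
```

Logging the error returned through outError is usually the quickest way to see why the delegate method never fires for a given string.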
I tried to read an XML file in Xcode, but the response comes back as a string in which the angle brackets are escaped: it gives &lt; and &gt; in place of < and >.
I am trying to read my web service from a server. I wrote a web service that creates a file in XML format on the website.
When I check that file on the website in a browser, it looks like this:
<NewDataSet>
<Table>
<Column1>Audi</Column1>
</Table>
<Table>
<Column1>BMW</Column1>
</Table>
<Table>
<Column1>MINI</Column1>
</Table>
</NewDataSet>
But when I call that file via SOAP, the response looks like this:
&lt;NewDataSet&gt;
&lt;Table&gt;
&lt;Column1&gt;Audi&lt;/Column1&gt;
&lt;/Table&gt;
&lt;Table&gt;
&lt;Column1&gt;BMW&lt;/Column1&gt;
&lt;/Table&gt;
&lt;Table&gt;
&lt;Column1&gt;MINI&lt;/Column1&gt;
&lt;/Table&gt;
&lt;/NewDataSet&gt;
But I can't read tags such as 'NewDataSet' because the response comes back as a string, and I am new to XML, so please help. I am using Xcode and NSMutableData for this.
I tried stringByReplacingOccurrencesOfString:withString:, but it did not replace &lt; and &gt; with < and >.
Thanks in advance for helping.
I think you should try NSXMLParser.
This might be helpful:
https://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSXMLParser_Class/Reference/Reference.html
The best way to read an XML file is with TouchXML. For a TouchXML tutorial, see this link:
http://dblog.com.au/general/iphone-sdk-tutorial-building-an-advanced-rss...
Once you have the response, you can also read it with TouchXML like this, for example:
- (void) fetchDataSuccessforEvent:(NSMutableData *)data
{
    CXMLDocument *doc = [[[CXMLDocument alloc] initWithData:data options:0 error:nil] autorelease];
    NSArray *nodes = [doc nodesForXPath:@"//Table" error:nil];
    for (CXMLElement *node in nodes)
    {
        for(int counter = 0; counter < [node childCount]; counter++)
        {
            if ([[[node childAtIndex:counter] name] isEqualToString:@"Column1"])
            {
                NSString *string = [[node childAtIndex:counter] stringValue];
                NSLog(@"\n\n Title %@",string);
            }
        }
    }
}
It's that easy. Hope this helps.
This is not a parsing problem. Check the web service and make sure that the XML header is <?xml version="1.0" encoding="utf-8"?> or something similar.
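If the service cannot be changed and the payload really does arrive with escaped angle brackets, one workaround is to unescape the string yourself and then parse the result as XML. Keep in mind that stringByReplacingOccurrencesOfString:withString: returns a new string and leaves the receiver untouched, so the return value has to be used. This is only a sketch; UnescapeXMLEntities is a hypothetical helper name, and it handles just the five predefined XML entities.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: converts the five predefined XML entities back to
// their characters. &amp; is handled last so that text such as "&amp;lt;"
// becomes "&lt;" rather than "<".
static NSString *UnescapeXMLEntities(NSString *escaped) {
    NSString *result = escaped;
    result = [result stringByReplacingOccurrencesOfString:@"&lt;" withString:@"<"];
    result = [result stringByReplacingOccurrencesOfString:@"&gt;" withString:@">"];
    result = [result stringByReplacingOccurrencesOfString:@"&quot;" withString:@"\""];
    result = [result stringByReplacingOccurrencesOfString:@"&apos;" withString:@"'"];
    result = [result stringByReplacingOccurrencesOfString:@"&amp;" withString:@"&"];
    return result;
}
```

A typical use would be to turn the received NSMutableData into an NSString with initWithData:encoding:, pass it through UnescapeXMLEntities, convert it back with dataUsingEncoding:, and hand that data to the XML parser.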
I am trying to load a UIWebView with local HTML/CSS that is built to look like a nutrition label. The problem is that the data for the food lives inside my iPhone app. Do I have to put all of my HTML into one enormous NSString object and concatenate my data into it, or is there a way to load the HTML from a local .html file, but somehow "inject" the data that is stored within Objective-C into it?
If the data to be injected is "safe", you could construct your "enormous NSString object" as a format string, sprinkled with %@ markers, and use stringWithFormat: to perform the injection in a single move. This is how I construct the pages in the TidBITS News app, using pieces that all come from RSS. It's really quite painless.
You can load the basic HTML using NSData's dataWithContentsOfFile: method and then use JavaScript to modify the HTML as needed.
The code would look something like this (using this example):
NSString *path = [[NSBundle mainBundle] pathForResource:@"food" ofType:@"html"];
NSData *data = [NSData dataWithContentsOfFile:path];
if (data) {
    // UIWebView's loadData: method also takes a baseURL: argument
    [webView loadData:data MIMEType:@"text/html" textEncodingName:@"UTF-8" baseURL:[[NSBundle mainBundle] resourceURL]];
}
// Loading is asynchronous: run the calls below once the page has loaded,
// for example from the delegate's webViewDidFinishLoad: method
[webView stringByEvaluatingJavaScriptFromString:@"var script = document.createElement('script');"
    "script.type = 'text/javascript';"
    "script.text = \"function myFunction() { "
    "var field = document.getElementById('field_3');"
    "field.value='Calling function - OK';"
    "}\";"
    "document.getElementsByTagName('head')[0].appendChild(script);"];
[webView stringByEvaluatingJavaScriptFromString:@"myFunction();"];
I would do a hybrid of both: have an HTML file in the app that you load, then replace certain strings in it before giving it to the UIWebView. For example, you could have a file like this:
<html>
<head>
<title><!--foodName--></title>
</head>
<body>
<h1><!--foodName--></h1>
<p>Calories / 100g: <!--foodCalories--></p>
</body>
</html>
You'd load that into Cocoa, then replace your special placeholder comments with the actual values you want.
NSDictionary *substitutions = [NSDictionary dictionaryWithObjectsAndKeys:
    @"Carrots", @"foodName",
    [NSNumber numberWithInt:20], @"foodCalories",
    // add more as needed
    nil];
NSMutableString *html = [NSMutableString stringWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"foodCard" ofType:@"html"]
                                                         encoding:NSUTF8StringEncoding
                                                            error:nil];
for(NSString *substitutionKey in substitutions)
{
    // Look the value up in the substitutions dictionary and turn it into a string
    NSString *substitution = [[substitutions objectForKey:substitutionKey] description];
    NSString *searchTerm = [NSString stringWithFormat:@"<!--%@-->", substitutionKey];
    [html replaceOccurrencesOfString:searchTerm withString:substitution options:0 range:NSMakeRange(0, [html length])];
}
[webView loadHTMLString:html baseURL:[[NSBundle mainBundle] resourceURL]];
Since iOS 2 you can call - (NSString *)stringByEvaluatingJavaScriptFromString:(NSString *)script on a UIWebView to execute JavaScript in your web view. This is the best way to inject data from the "Objective-C part" of your application.
Cf: https://developer.apple.com/library/ios/documentation/UIKit/Reference/UIWebView_Class/#//apple_ref/occ/instm/UIWebView/stringByEvaluatingJavaScriptFromString:
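When the injected values come from app data, it helps to let a JSON encoder do the quoting instead of splicing raw strings into the script. The following is a sketch assuming ARC; renderLabel is a hypothetical JavaScript function that the bundled HTML page would have to define, and JavaScriptCallForFood is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: serializes the food data as JSON and wraps it in a
// call to renderLabel(...), a JavaScript function assumed to exist in the
// loaded page. Returns nil if the dictionary cannot be represented as JSON.
static NSString *JavaScriptCallForFood(NSDictionary *food) {
    if (![NSJSONSerialization isValidJSONObject:food]) {
        NSLog(@"Food data cannot be converted to JSON");
        return nil;
    }
    NSError *error = nil;
    NSData *json = [NSJSONSerialization dataWithJSONObject:food options:0 error:&error];
    if (json == nil) {
        NSLog(@"JSON serialization failed: %@", error);
        return nil;
    }
    NSString *jsonString = [[NSString alloc] initWithData:json
                                                 encoding:NSUTF8StringEncoding];
    return [NSString stringWithFormat:@"renderLabel(%@);", jsonString];
}
```

The resulting string would be passed to stringByEvaluatingJavaScriptFromString: from the web view delegate's webViewDidFinishLoad: method, since the page's scripts are not available until loading has finished.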
I'm trying to parse a Stack Overflow RSS feed of a specific question:
https://stackoverflow.com/feeds/question/2110875
For this I'm using the TouchXML library. There seems to be a problem in the following code:
CXMLDocument *parser = [[CXMLDocument alloc] initWithData:sourceData options:0 error:nil];
NSArray *allEntries = [parser nodesForXPath:@"//entry" error:nil];
NSLog(@"Found entries: %d",[allEntries count]); //Returns 0
The NSLog statement should return the count of all entries in the feed. In this case it should be '3', problem is that it returns 0.
I found that this piece of code does work:
CXMLDocument *preParser = [[CXMLDocument alloc] initWithData:sourceData options:0 error:nil];
NSString *sourceStringUTF8 = [preParser XMLString];
[preParser release];
CXMLDocument *parser = [[CXMLDocument alloc] initWithData:[sourceStringUTF8 dataUsingEncoding:NSUTF8StringEncoding] options:0 error:nil];
NSArray *allEntries = [parser nodesForXPath:@"//entry" error:nil];
NSLog(@"Found entries: %d",[allEntries count]); //Returns 3, which is ok
But this seems hacky (it probably is) and introduces a few other sporadic bugs.
As far as I know, the XPath expression is correct. I've checked it using this page as well.
Can anyone help me with this problem or point me in the right direction?
Thanks.
I had a very similar problem. It has to do with the XML namespace, which TouchXML doesn't support very well (a known issue).
I believe that in your hack the namespace wasn't passed into the second parser, which is why it works.
An easier way is to replace
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
with simply
<html>
and the XPath query now works.
Maybe start by actually using that error argument to nodesForXPath:error to see if it returns an error? And check if allEntries is not nil after making that call?
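Both suggestions can be combined without editing the feed. In XPath 1.0 an unprefixed name such as entry only matches elements in no namespace, so elements under a default namespace (assumed here to be the Atom namespace declared by the feed) are skipped. Matching on local-name() sidesteps that. The following is a sketch in the same manual reference counting style as the question; EntriesInFeedData is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>
#import "TouchXML.h"

// Hypothetical helper: returns all <entry> elements regardless of their
// namespace, and logs any error reported by TouchXML along the way.
static NSArray *EntriesInFeedData(NSData *sourceData) {
    NSError *error = nil;
    CXMLDocument *doc = [[[CXMLDocument alloc] initWithData:sourceData
                                                    options:0
                                                      error:&error] autorelease];
    if (doc == nil) {
        NSLog(@"Could not parse feed: %@", error);
        return nil;
    }

    // local-name() compares the element name without its namespace, so this
    // also matches <entry> elements under a default namespace.
    NSArray *entries = [doc nodesForXPath:@"//*[local-name()='entry']" error:&error];
    if (entries == nil) {
        NSLog(@"XPath query failed: %@", error);
    }
    return entries;
}
```

Some TouchXML versions also offer an XPath method that takes namespace mappings, which would allow a prefixed query instead; check the headers of your copy before relying on it.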