I'm using libxml2 on the iPhone with the handy PerformXMLXPathQuery function from Cocoa with Love. The problem is how to find out which XML document was sent without first parsing the whole document. I tried the @"/" query to retrieve the first element, as described in the introduction on Cocoa with Love, but unfortunately PerformXMLXPathQuery crashes with this query!
When I use the @"/*" query the whole tree gets parsed, which is very inefficient in terms of time and memory consumption.
Any ideas how to do this?
Thanks
Markus
I'm not sure how tight your performance requirements are, but using a SAX parser, such as NSXMLParser, would enable you to quit parsing after you processed the element you're looking for. See the
- (void)abortParsing
method on the parser, and
- parser:didStartElement:namespaceURI:qualifiedName:attributes:
- parser:didEndElement:namespaceURI:qualifiedName:
on the NSXMLParserDelegate protocol.
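A minimal sketch of that idea, assuming ARC and that the XML is already in an NSData object. RootElementSniffer and RootElementName are hypothetical names made up for this example; only the NSXMLParser calls come from Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper class: remembers the first element name and stops the parse.
@interface RootElementSniffer : NSObject <NSXMLParserDelegate>
@property (nonatomic, copy) NSString *rootName;
@end

@implementation RootElementSniffer
- (void)parser:(NSXMLParser *)parser
didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict
{
    // The first start tag the parser reports is the document's root element.
    if (self.rootName == nil) {
        self.rootName = elementName;
    }
    // Ask the parser to stop; the rest of the document is not processed.
    [parser abortParsing];
}
@end

// Returns the name of the root element, or nil if no element was found.
static NSString *RootElementName(NSData *xmlData)
{
    RootElementSniffer *sniffer = [[RootElementSniffer alloc] init];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:xmlData];
    parser.delegate = sniffer;
    // -parse returns NO here because we aborted on purpose, so its result is ignored.
    [parser parse];
    return sniffer.rootName;
}
```

You would then branch on the returned name to decide which full parse to run. Note that the data still has to be in memory; the sketch only avoids building or walking the whole tree.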
I am trying to parse some uncomplicated HTML content from RSS feeds on the iPhone.
So I don't need a heavy HTML parser.
I have searched here and found these two:
https://github.com/topfunky/hpple
https://github.com/zootreeves/Objective-C-HMTL-Parser
Both are simple to use. But I guess they have their problems for my purpose.
TFHpple is good, but its elements do not carry their complete HTML tag string (the full <...> text). I need this complete tag string because I need to remove it from the whole HTML string. It would be more convenient for me if each element had it.
The zootreeves HTML parser is also simple and good, and it does keep the complete tag string with every element, which I am very happy about. However, it seems to be a big memory consumer. I monitored it: if I parse a large number of HTML fragments (say, 1000), it uses about 40 MB of memory, and that memory stays occupied. That is not acceptable on iOS devices. I guess zootreeves uses pure C code and linked lists to organise the tree structure of the HTML, with plain malloc and free for memory. I don't know whether that affects iOS memory.
So, can anyone recommend a better, fast and simple HTML parser for iOS?
Thanks
I'd use libxml2. It's not just for xml; it has an HTML parser too. It's fast and low-memory and is available in iOS. The only drawback is that it's a C-based API, but for all that it's not terribly difficult to work with.
Update
In response to the first comment below: It's been a while, so I'm not sure, but I don't think so. What you get is a data structure with lots of information about the document structure, and each tag has a list of attribute/value pairs. Nowhere is the original HTML string stored (I presume it is considered redundant and is left out to save memory).
However, it doesn't seem like you actually need it for what you want to do. It seems to me that you are using information from the parser to modify the original string, stripping out HTML tags. What you want to do instead is to rebuild the document using information from the parse tree, and when you do this, leave out the tags you want omitted.
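One rough way to do that with libxml2's HTML parser, as a sketch rather than a finished solution: parse the fragment, unlink the unwanted elements from the tree, and serialize what is left. HTMLByRemovingTag and RemoveElementsNamed are hypothetical helper names for this example; the code assumes ARC, that libxml2 is linked, and that its headers are on the header search path (typically /usr/include/libxml2).

```objc
#import <Foundation/Foundation.h>
#import <libxml/HTMLparser.h>
#import <libxml/HTMLtree.h>

// Recursively unlink and free every element named tagName below node.
static void RemoveElementsNamed(xmlNodePtr node, const char *tagName)
{
    xmlNodePtr child = node->children;
    while (child != NULL) {
        xmlNodePtr next = child->next; // saved first, because child may be freed
        if (child->type == XML_ELEMENT_NODE &&
            xmlStrcasecmp(child->name, (const xmlChar *)tagName) == 0) {
            xmlUnlinkNode(child);
            xmlFreeNode(child); // frees the element together with its subtree
        } else {
            RemoveElementsNamed(child, tagName);
        }
        child = next;
    }
}

// Returns the re-serialized HTML without the given tag, or nil on failure.
static NSString *HTMLByRemovingTag(NSString *html, NSString *tagName)
{
    NSData *data = [html dataUsingEncoding:NSUTF8StringEncoding];
    htmlDocPtr doc = htmlReadMemory(data.bytes, (int)data.length, NULL, "UTF-8",
                                    HTML_PARSE_RECOVER | HTML_PARSE_NOERROR | HTML_PARSE_NOWARNING);
    if (doc == NULL) {
        return nil;
    }
    xmlNodePtr root = xmlDocGetRootElement(doc);
    if (root != NULL) {
        RemoveElementsNamed(root, tagName.UTF8String);
    }

    // Serialize the modified tree back into an HTML string.
    xmlChar *buffer = NULL;
    int size = 0;
    htmlDocDumpMemory(doc, &buffer, &size);
    NSString *result = nil;
    if (buffer != NULL) {
        result = [[NSString alloc] initWithBytes:buffer
                                          length:(NSUInteger)size
                                        encoding:NSUTF8StringEncoding];
        xmlFree(buffer);
    }
    xmlFreeDoc(doc);
    return result;
}
```

Be aware that the HTML parser normalizes what it reads, so the output is usually not the input minus the tag: a fragment typically comes back wrapped in html and body elements. If you only want the inner part, you would walk down to the body node and serialize its children instead.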
I have to read the XML at this link:
http://www.ecb.int/stats/eurofxref/eurofxref-daily.xml
and parse it so that I can store it in an array that associates each currency with its rate,
example:
USD = 1.2948
I know I should use NSParser but I don't know how to create a loop to set the array's fields.
Thank you everybody
If you mean NSXMLParser, that is a SAX-style parser, which means you travel the XML tree unidirectionally from beginning to end of file. Every time the parser encounters something significant, it calls one of its delegate methods. In these methods, you read the values and fill your array by adding values one at a time.
It appears awkward at first and the code can become pretty verbose, with lots of conditionals. But SAX parsing is fast and has a small memory footprint.
I strongly recommend studying the examples in Apple's documentation on Event-Driven XML Programming, starting here.
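To make the "one value at a time" idea concrete, here is a small sketch. It assumes ARC and that the feed carries each rate as attributes on a Cube element, for example a Cube with currency="USD" and rate="1.2948"; check the actual file before relying on that. RateCollector and RatesFromXML are hypothetical names for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate: collects currency -> rate pairs from <Cube> elements.
@interface RateCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableDictionary *rates;
@end

@implementation RateCollector
- (instancetype)init
{
    if ((self = [super init])) {
        _rates = [NSMutableDictionary dictionary];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser
didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict
{
    // Assumption: wrapper <Cube> elements have no currency attribute, so they are skipped.
    NSString *currency = attributeDict[@"currency"];
    NSString *rate = attributeDict[@"rate"];
    if ([elementName isEqualToString:@"Cube"] && currency != nil && rate != nil) {
        self.rates[currency] = rate; // rates are kept as strings, e.g. @"1.2948"
    }
}
@end

// Returns a currency -> rate dictionary, or nil if the XML could not be parsed.
static NSDictionary *RatesFromXML(NSData *xmlData)
{
    RateCollector *collector = [[RateCollector alloc] init];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:xmlData];
    parser.delegate = collector;
    return [parser parse] ? collector.rates : nil;
}
```

There is no explicit loop: the parser calls the delegate once per element, and each call adds one entry. A dictionary fits the currency/rate association better than a plain array; convert the strings with doubleValue or NSDecimalNumber when you need numbers.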
This link will be useful. It is very simple and quick to implement:
how to parse xml using libxml2 in iphone
Cheers
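// resultArray is the array you get after parsing with NSXMLParser;
// each entry holds a "currency" and a "rate" value.
// The loop starts at index 6, which assumes the first six entries are not currency entries.
NSMutableDictionary *currencyDict = [NSMutableDictionary dictionary];
for (NSUInteger i = 6; i < [resultArray count]; i++)
{
    id entry = [resultArray objectAtIndex:i];
    [currencyDict setValue:[entry valueForKey:@"rate"]
                    forKey:[entry valueForKey:@"currency"]];
}
Use this loop and you get currencyDict in this format: USD = 1.2948, THB = 39.504, etc.
If anything is unclear, post a comment to ask.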
Is anybody familiar with how to use TTXMLParser? I can't find any documentation or sample code for it.
Is it SAX or DOM?
Does it support XPath?
Can I extract CDATA from elements?
I have an application that already uses several Three20 modules, so it would be a shame to have to use another parser.
The main documentation I've found for TTXMLParser is in the header file. The comment there gives an overview of what TTXMLParser does.
TTXMLParser shouldn't really be thought of as an XML parser in the way you are thinking of it -- in this sense, questions such as "is it SAX or DOM" and "does it support XPath" aren't directly applicable. Instead, think of TTXMLParser as a convenience class to take XML and turn it into a tree of Objective-C objects. For example, this XML node:
<myNode attr1="value1" attr2="value2" />
would be turned into an Objective-C NSDictionary node which maps the key "attr1" to the value "value1" and the key "attr2" to the value "value2".
TTXMLParser internally uses NSXMLParser (which is basically SAX) to build up its tree, but you, as the user of TTXMLParser, don't have to do any SAX-like stuff.
So, no, you will not end up with an XML document on which you can perform XPath queries. Instead, you will end up with an Objective-C tree of objects. If that's what you want, great; if you want a traditional XML parser with XPath, I'm currently working on a project that uses both Three20 and TouchXML. TouchXML supports XPath.
I agree it's hard to find sample code for TTXMLParser. Three20's TTTwitter sample used to use TTXMLParser (well, actually TTURLXMLResponse, which in turn uses TTXMLParser), but at some point it was changed to use TTURLJSONResponse instead, which is a shame, because this was their only XML sample.
You can still see the old XML-based sample code here. Specifically, look at the -[requestDidFinishLoad:] function near the bottom of the file, for an example of some code that takes a TTURLXMLResponse, queries its rootObject member, and then walks down the resulting tree of objects.
I'm trying to get the distance between two places (by road), and I understand that an HTTP request will return XML with the following data:
http://code.google.com/apis/maps/documentation/directions/#XML
but I don't understand how to use NSXMLParser to get the total distance. I assume I should use the
parser:didStartElement:namespaceURI:qualifiedName:attributes:
method, but I'm not sure how it works. The element I want is, I guess, "directions", but there are a couple of elements like that. How do I specify which one I want? Does it have to do with the "attributes" parameter?
NSXMLParser is a SAX parser, and that means you must provide callback methods for various parse events while the document is processed. The Google documentation you link to makes me think these documents are not terribly large, and so I would personally not use a SAX parser.
A DOM parser, such as NSXMLDocument, is much easier to use, but unfortunately Apple did not include NSXMLDocument in the iOS SDK. There are several alternatives, including TouchXML.
If you only care about the total distance, and not the distance of each leg or step, then a simple XPath query will get all of the distances for you: //leg/step/distance/value. Loop through them and sum them up.
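As a rough sketch of that query, here it is run with libxml2's own XPath API instead of TouchXML (a substitution on my part, so no third-party code is needed). TotalDistanceFromXML is a hypothetical helper name; the code assumes libxml2 is linked with its headers on the search path, and that each value element holds a whole number (metres, as far as I know).

```objc
#import <Foundation/Foundation.h>
#import <libxml/parser.h>
#import <libxml/xpath.h>
#include <stdlib.h>

// Sums every node matched by //leg/step/distance/value.
// Returns -1 if the document cannot be parsed.
static long TotalDistanceFromXML(NSData *xmlData)
{
    xmlDocPtr doc = xmlReadMemory(xmlData.bytes, (int)xmlData.length, NULL, NULL,
                                  XML_PARSE_NOERROR | XML_PARSE_NOWARNING);
    if (doc == NULL) {
        return -1;
    }

    long total = 0;
    xmlXPathContextPtr context = xmlXPathNewContext(doc);
    xmlXPathObjectPtr result = NULL;
    if (context != NULL) {
        result = xmlXPathEvalExpression((const xmlChar *)"//leg/step/distance/value", context);
    }

    // Loop through the matched nodes and add up their text content.
    if (result != NULL && result->nodesetval != NULL) {
        for (int i = 0; i < result->nodesetval->nodeNr; i++) {
            xmlChar *text = xmlNodeGetContent(result->nodesetval->nodeTab[i]);
            if (text != NULL) {
                total += strtol((const char *)text, NULL, 10);
                xmlFree(text);
            }
        }
    }

    if (result != NULL) xmlXPathFreeObject(result);
    if (context != NULL) xmlXPathFreeContext(context);
    xmlFreeDoc(doc);
    return total;
}
```

The path deliberately includes step, so a distance element that sits directly under leg is not matched and nothing is counted twice.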
You use parser:didStartElement:... to find the element you need, then you use the parser:foundCharacters: method to gather the content of the tag (it can be called more than once per element). Only when parser:didEndElement:... is called is the content complete, and only then can you use it for whatever you want.
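Those three callbacks fit together roughly like this. The sketch assumes ARC and a response shaped like leg / step / distance / value with whole-number values; DistanceSummer and SumStepDistances are hypothetical names for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate: adds up the text of every leg/step/distance/value element.
@interface DistanceSummer : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *path;   // names of the open elements, outermost first
@property (nonatomic, strong) NSMutableString *text;  // characters gathered for the current element
@property (nonatomic) long total;
@end

@implementation DistanceSummer
- (instancetype)init
{
    if ((self = [super init])) {
        _path = [NSMutableArray array];
        _text = [NSMutableString string];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser
didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict
{
    [self.path addObject:elementName];
    [self.text setString:@""]; // start collecting fresh text for this element
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // May be called several times for one element, so append rather than assign.
    [self.text appendString:string];
}

- (void)parser:(NSXMLParser *)parser
 didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
{
    // Only now is the element's text complete.
    NSString *where = [@"/" stringByAppendingString:[self.path componentsJoinedByString:@"/"]];
    if ([where hasSuffix:@"/leg/step/distance/value"]) {
        self.total += [self.text integerValue];
    }
    [self.path removeLastObject];
    [self.text setString:@""];
}
@end

// Returns the summed step distances, or -1 if parsing failed.
static long SumStepDistances(NSData *xmlData)
{
    DistanceSummer *summer = [[DistanceSummer alloc] init];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:xmlData];
    parser.delegate = summer;
    return [parser parse] ? summer.total : -1;
}
```

Keeping a stack of open element names is one way to answer the "which one do I want" question: the element is identified by where it sits in the document, not by the attributes parameter.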
In a class I'm writing I'll most likely have to use NSXMLParser twice to parse two different XML documents, and I'm wondering which approach I should use:
- release the parser after it has finished parsing the first URL and reinitialize it when I need to parse the second URL?
- use a different class as the delegate for parsing the other URL?
- or something else?
thanks
peter
In my own personal experience, I've commonly had to parse several different REST xml responses and for each of them I inherit a base class and create one class per request/response/parse. IMHO although this isn't clean code, I honestly find it impossible to write clean code when dealing with a SAX-style parser.
My advice would be separate calls and perhaps separate classes if you don't want a bunch of if-else's in your code. Now if the XML is very similar, it could be a different story...
I have written a class which implements the parser methods and you just have to pass it a string (your url). It returns with an array of elements. It may be of use to you.
You can download it here: http://www.kieranmcgrady.me/helper-classes-for-parsing-xml-files
In the past I've often made classes to parse each response type that I expected to see. You can reuse an NSXMLParser, but I really haven't seen a need to.
Depending on your requirements you may want to just read the responses into nested NSDictionaries, then deal with accessing the elements that you need directly from the dictionaries.
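A generic delegate along those lines could look like the sketch below. It assumes ARC; TreeBuilder, TreeFromXML and the dictionary keys "name", "attributes", "text" and "children" are all made up for this example, not an existing API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate that turns any XML document into nested dictionaries.
// Every node has the keys "name", "attributes", "text" and "children".
@interface TreeBuilder : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableDictionary *root;
@property (nonatomic, strong) NSMutableArray *stack; // open nodes, innermost last
@end

@implementation TreeBuilder
- (void)parser:(NSXMLParser *)parser
didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict
{
    NSMutableDictionary *node = [NSMutableDictionary dictionary];
    node[@"name"] = elementName;
    node[@"attributes"] = attributeDict ?: @{};
    node[@"text"] = [NSMutableString string];
    node[@"children"] = [NSMutableArray array];

    if (self.stack == nil) {
        self.stack = [NSMutableArray array];
    }
    NSMutableDictionary *parent = [self.stack lastObject];
    if (parent != nil) {
        NSMutableArray *siblings = parent[@"children"];
        [siblings addObject:node];
    } else {
        self.root = node; // the first element becomes the root of the tree
    }
    [self.stack addObject:node];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // Text belongs to the innermost open element.
    NSMutableDictionary *current = [self.stack lastObject];
    NSMutableString *text = current[@"text"];
    [text appendString:string];
}

- (void)parser:(NSXMLParser *)parser
 didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
{
    [self.stack removeLastObject];
}
@end

// Returns the root node, or nil if the XML could not be parsed.
static NSDictionary *TreeFromXML(NSData *xmlData)
{
    TreeBuilder *builder = [[TreeBuilder alloc] init];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:xmlData];
    parser.delegate = builder;
    return [parser parse] ? builder.root : nil;
}
```

Because a new NSXMLParser and a new TreeBuilder are created per call, the same function can be used for both documents, and the code that interprets each response stays in the class that asked for it. The trade-off is that the whole document ends up in memory, so this suits small responses.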