NSXMLParser processing complex units of information - nsxmlparser

I'm successfully processing a response from the server using NSXMLParser.
The response looks something like this:
<data>
<company id="">
<name>XXX</name>
<latitude></latitude>
<longitude></longitude>
</company>
<company id="">
<name>XXX</name>
<latitude></latitude>
<longitude></longitude>
</company>
</data>
I've been using the following delegate methods:
didStartElement:namespaceURI: ... detects when a new company needs to be parsed, at which point I allocate a new instance. It also detects when an attribute starts.
foundCharacters: processes the content of each attribute.
didEndElement: ... the company has been parsed completely and can be added to the internal list. It also detects when an attribute has been processed, and then sets the value collected in foundCharacters:.
Now I also need to get the complete XML for one company and store it in a local cache. Does anybody know whether NSXMLParser offers a way to get all the content for just one company? Or is there a way to do it without NSXMLParser?
<company id="">
<name>XXX</name>
<latitude></latitude>
<longitude></longitude>
</company>
Thank you,

In the end I decided to re-create the XML using the SAX methods:
parser:didStartElement:
// Adding the initial TAG of the xml
accountXML = [[NSString alloc] initWithFormat:@"<%@", elementName];
for (NSString *key in [attributeDict allKeys]) {
    accountXML = [accountXML stringByAppendingFormat:@" %@=\"%@\"", key, [attributeDict valueForKey:key]];
}
accountXML = [accountXML stringByAppendingString:@">\n"];
and
parser:didEndElement:
// Add the xml to the account and release it
accountXML = [accountXML stringByAppendingFormat:@"</%@>\n", elementName];
[account setCompleteXML:accountXML];
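The snippets above rebuild the opening and closing tags but not the nested elements or their text. A fuller sketch of the same idea follows, assuming ARC; the class name CompanyXMLRecorder and the helper EscapeXML are made up for illustration and are not part of NSXMLParser. It records everything between a company start tag and its end tag, re-escaping the characters the parser has already decoded.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate: rebuilds the XML text of each <company> element.
@interface CompanyXMLRecorder : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *companyXML;  // one NSString per company
@property (nonatomic, strong) NSMutableString *current;    // nil while outside <company>
@end

// Re-escape characters that are special in XML text and attribute values.
static NSString *EscapeXML(NSString *s) {
    s = [s stringByReplacingOccurrencesOfString:@"&" withString:@"&amp;"];
    s = [s stringByReplacingOccurrencesOfString:@"<" withString:@"&lt;"];
    s = [s stringByReplacingOccurrencesOfString:@">" withString:@"&gt;"];
    return [s stringByReplacingOccurrencesOfString:@"\"" withString:@"&quot;"];
}

@implementation CompanyXMLRecorder

- (instancetype)init {
    if ((self = [super init])) {
        _companyXML = [NSMutableArray array];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if ([elementName isEqualToString:@"company"]) {
        self.current = [NSMutableString string];  // start recording
    }
    if (self.current == nil) return;               // not inside a company
    [self.current appendFormat:@"<%@", elementName];
    for (NSString *key in attributeDict) {
        [self.current appendFormat:@" %@=\"%@\"", key, EscapeXML(attributeDict[key])];
    }
    [self.current appendString:@">"];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // Messaging nil is a no-op, so text outside <company> is ignored.
    [self.current appendString:EscapeXML(string)];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if (self.current == nil) return;
    [self.current appendFormat:@"</%@>", elementName];
    if ([elementName isEqualToString:@"company"]) {
        [self.companyXML addObject:[self.current copy]];
        self.current = nil;                        // stop recording
    }
}

@end

int main(void) {
    @autoreleasepool {
        NSString *xml = @"<data><company id=\"1\"><name>A &amp; B</name></company></data>";
        NSXMLParser *parser = [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
        CompanyXMLRecorder *recorder = [[CompanyXMLRecorder alloc] init];
        parser.delegate = recorder;  // the parser does not retain its delegate
        [parser parse];
        NSLog(@"%@", recorder.companyXML);
    }
    return 0;
}
```

The rebuilt string is equivalent to the original element rather than byte-for-byte identical: comments, CDATA markers and the original quoting style are not preserved.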

Related

CDATA Block Parsing

I have searched for this at length without success.
I am getting
<description><![CDATA[<img src='http://behance.vo.llnwd.net/profiles22/700504/projects/2335700.jpg' style='float:left; margin-right:15px;' /><br /> NIL]]></description>
I don't know how to extract the image link (http://behance.vo.llnwd.net/profiles22/700504/projects.jpg) from it.
I have tried using
- (void)parser:(NSXMLParser *)parser foundCDATA:(NSData *)CDATABlock
{
if ([sElementName isEqualToString:@"description"])
{
    NSMutableString *someString = [[NSMutableString alloc] initWithData:CDATABlock encoding:NSUTF8StringEncoding];
    NSLog(@"%@", someString);
}
}
but it prints the whole block:
<img src='http://behance.vo.llnwd.net/profiles22/700504/projects/2335700.jpg' style='float:left; margin-right:15px;' /><br /> NIL
Please help me extract just the link. Any links or answers would help.
Thanks in advance.
The CDATA section exists exactly for this purpose: embedding markup inside an XML document as text, as opposed to nested XML that changes the structure of the document itself. So, after obtaining this string (the <img> tag), you can run another XML parser over it to obtain the value of the src attribute.
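A minimal sketch of that second parsing pass, assuming ARC and assuming the CDATA content is itself well-formed markup (as it is in the sample above). The class ImageSourceFinder and the function FirstImageSource are hypothetical names. The fragment is wrapped in a dummy root element because an XML document needs exactly one root.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate: remembers the src attribute of the first <img> it sees.
@interface ImageSourceFinder : NSObject <NSXMLParserDelegate>
@property (nonatomic, copy) NSString *src;
@end

@implementation ImageSourceFinder
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if (self.src == nil && [elementName isEqualToString:@"img"]) {
        self.src = attributeDict[@"src"];
    }
}
@end

// Parses the text taken from the CDATA block and returns the first image URL, or nil.
static NSString *FirstImageSource(NSString *fragment) {
    // Wrap the fragment so that it has a single root element.
    NSString *wrapped = [NSString stringWithFormat:@"<root>%@</root>", fragment];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:[wrapped dataUsingEncoding:NSUTF8StringEncoding]];
    ImageSourceFinder *finder = [[ImageSourceFinder alloc] init];
    parser.delegate = finder;
    [parser parse];
    return finder.src;
}

int main(void) {
    @autoreleasepool {
        NSString *cdata = @"<img src='http://behance.vo.llnwd.net/profiles22/700504/projects/2335700.jpg' "
                          @"style='float:left; margin-right:15px;' /><br /> NIL";
        NSLog(@"%@", FirstImageSource(cdata));
    }
    return 0;
}
```

If the embedded HTML is not well-formed (unclosed tags, undeclared entities), this pass will fail and a more forgiving HTML parser would be needed instead.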

Parsing XML Inner XML tags Objective-C

I would like to create an NSDictionary or (NSArray) full of NSDictionary objects for each station in the following XML:
<stations lastUpdate="1328986911319" version="2.0">
<station>
<id>1</id>
<name>River Street , Clerkenwell</name>
<terminalName>001023</terminalName>
<lat>51.52916347</lat>
<long>-0.109970527</long>
<installed>true</installed>
<locked>false</locked>
<installDate>1278947280000</installDate>
<removalDate/>
<temporary>false</temporary>
<nbBikes>12</nbBikes>
<nbEmptyDocks>7</nbEmptyDocks>
<nbDocks>19</nbDocks>
</station>
... more...
<station>
<id>260</id>
<name>Broadwick Street, Soho</name>
<terminalName>003489</terminalName>
<lat>51.5136846</lat>
<long>-0.135580879</long>
<installed>true</installed>
<locked>false</locked>
<installDate>1279711020000</installDate>
<removalDate/>
<temporary>false</temporary>
<nbBikes>12</nbBikes>
<nbEmptyDocks>4</nbEmptyDocks>
<nbDocks>18</nbDocks>
</station>
...
</stations>
What's the best way to achieve this? Right now I have an NSDictionary with one object and one key - "stations", but I want the NSDictionary (or NSArray) of NSDictionarys.
I'm using an XML parser by Troy Brant - http://troybrant.net/blog/2010/09/simple-xml-to-nsdictionary-converter/
I'm guessing it's going to involve some looping of some sort but I'm not really sure how to approach this problem. Any help or pointers in the right direction would be much appreciated.
Thanks.
I also use XMLReader; it is very easy to understand.
I have looked at your XML, and I am assuming you want the array of station tags.
Here is my solution:
NSDictionary *dictXML = [XMLReader dictionaryForXMLString:testXMLString error:&parseError];
NSArray *arrStation = [[dictXML objectForKey:@"stations"] objectForKey:@"station"]; // this returns the array of station dictionaries
Now that you have the array of station tags, you can do what you want with it, for example display every id:
for(int i=0;i<[arrStation count];i++){
NSDictionary *aStation = [arrStation objectAtIndex:i];
    NSLog(@"id = %@", [aStation objectForKey:@"id"]);
}
You can also write less code using fast enumeration:
for (NSDictionary *aStation in arrStation) {
    NSLog(@"id = %@", [aStation objectForKey:@"id"]);
}
hope that helps :)
The best way would be to create subclasses of NSObject that map to each tag. That would be much cleaner than using dictionaries and arrays.
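A rough sketch of such a model class, assuming ARC. The Station class, its property names and the TextValue helper are invented for illustration. The helper also assumes that the converter may nest an element's text under a "text" key; check what your converter actually produces before relying on that.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical model object for one <station> element.
@interface Station : NSObject
@property (nonatomic) NSInteger stationID;
@property (nonatomic, copy) NSString *name;
@property (nonatomic) double latitude;
@property (nonatomic) double longitude;
@property (nonatomic) NSInteger bikes;
- (instancetype)initWithDictionary:(NSDictionary *)dict;
@end

// Accepts either a plain string or a dictionary that holds the string
// under the key "text" (an assumption about the converter's output).
static NSString *TextValue(id value) {
    if ([value isKindOfClass:[NSDictionary class]]) {
        value = [(NSDictionary *)value objectForKey:@"text"];
    }
    return [value isKindOfClass:[NSString class]] ? value : nil;
}

@implementation Station
- (instancetype)initWithDictionary:(NSDictionary *)dict {
    if ((self = [super init])) {
        // Messaging nil returns 0, so missing tags simply leave the default value.
        _stationID = [TextValue(dict[@"id"]) integerValue];
        _name      = [TextValue(dict[@"name"]) copy];
        _latitude  = [TextValue(dict[@"lat"]) doubleValue];
        _longitude = [TextValue(dict[@"long"]) doubleValue];
        _bikes     = [TextValue(dict[@"nbBikes"]) integerValue];
    }
    return self;
}
@end

int main(void) {
    @autoreleasepool {
        // Stand-in for the array of station dictionaries produced by the converter.
        NSArray *arrStation = @[
            @{ @"id": @"1", @"name": @"River Street , Clerkenwell",
               @"lat": @"51.52916347", @"long": @"-0.109970527", @"nbBikes": @"12" }
        ];
        NSMutableArray *stations = [NSMutableArray array];
        for (NSDictionary *d in arrStation) {
            [stations addObject:[[Station alloc] initWithDictionary:d]];
        }
        Station *first = stations.firstObject;
        NSLog(@"%ld %@ %f", (long)first.stationID, first.name, first.latitude);
    }
    return 0;
}
```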

Get objects from a NSDictionary

I get this result from a URL:
NSString *result = [[NSString alloc] initWithData:responseData encoding:NSUTF8StringEncoding];
It looks like this:
[{"modele":"Audi TT Coup\u00e9 2.0 TFSI","modele_annee":null,"annee":"2007","cilindre":"4 cyl","boite":"BVM","transmision":"Traction","carburant":"ES"},
{"modele":"Audi TT Coup\u00e9 2.0 TFSI","modele_annee":null,"annee":"2007","cilindre":"4 cyl","boite":"BVM","transmision":"Traction","carburant":"ES"}]
So it contains 2 dictionaries. I need to get the objects for all the keys in this result. How can I do this?
I tried this: NSDictionary *vehiculesPossedeDictionary = (NSDictionary *)result;
and then this: [vehiculesPossedeDictionary objectForKey:@"modele"]; but it does not work.
Please help me. Thanks in advance.
What you have is a JSON string which describes an "array" containing two "objects". This needs to be converted to Objective-C objects using a JSON parser, and when converted will be an NSArray containing two NSDictionarys.
You aren't going to be able to get your dictionary directly from a string of JSON. You are going to have to run it through a JSON parser first.
At this point, there is not one built into the iOS SDK, so you will have to download a third-party tool and include it in your project.
There are a number of different JSON parsers, including TouchJSON, YAJL, etc., that you can find and compare. Personally, I am using JSONKit.
@MatthewGillingham suggests JSONKit. I imagine it does fine, but I've always used its competitor json-framework. No real reason, I just found it first and learned it first. I do think its interface is somewhat simpler, but plenty of people do fine with JSONKit too.
Using json-framework:
#import "JSON.h"
...and then
NSString *myJsonString = @"[{\"whatever\": \"this contains\"}, {\"whatever\": \"more content\"}]";
NSArray *data = [myJsonString JSONValue];
for (NSDictionary *item in data) {
    NSString *val = [item objectForKey:@"whatever"];
//val will contain "this contains" on the first time through
//this loop, then "more content" the second time.
}
If you have an array of dictionaries, just assign an object from the array to a dictionary, like
NSDictionary *dictionary = [array objectAtIndex:0];
and then use this dictionary to get the values.
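For readers on iOS 5 or later, Foundation ships its own parser, NSJSONSerialization, so a third-party library is no longer strictly required. A minimal sketch, assuming ARC and using shortened sample data rather than the real server response:

```objective-c
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Shortened stand-in for the server response shown in the question.
        NSString *json = @"[{\"modele\":\"Audi TT\",\"modele_annee\":null,\"annee\":\"2007\"},"
                         @"{\"modele\":\"Audi A3\",\"modele_annee\":null,\"annee\":\"2009\"}]";
        NSData *data = [json dataUsingEncoding:NSUTF8StringEncoding];

        NSError *error = nil;
        id parsed = [NSJSONSerialization JSONObjectWithData:data options:0 error:&error];
        if (![parsed isKindOfClass:[NSArray class]]) {
            NSLog(@"Unexpected JSON: %@", error);
            return 1;
        }

        // The top-level JSON array becomes an NSArray of NSDictionary objects.
        for (NSDictionary *vehicle in (NSArray *)parsed) {
            NSLog(@"%@ (%@)", vehicle[@"modele"], vehicle[@"annee"]);
        }
    }
    return 0;
}
```

In the question's code the NSData in responseData could be passed to JSONObjectWithData:options:error: directly, without first converting it to an NSString. JSON null values come back as NSNull, so check for that before treating a value as a string.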

iPhone SDK: XML mystery, after adding child nodeforXPath returns nothing (found a hacky solution)

I have a big mystery here.
I have a GDataXML document property:
GDataXMLDocument *doc;
I'm adding a new element to doc. Interestingly, the approach below works perfectly for other elements, but not for the element I just added:
GDataXMLElement *newValueDefElement = [GDataXMLNode elementWithName:@"valuedefinition"];
[variableElement addChild:newValueDefElement];
and now when I query:
NSString *path = [NSString stringWithFormat:@"//inferenceresponse/state/variable[pageId=%d]/valuedefinition", pageID];
NSArray *valueElement = [self.doc nodesForXPath:path error:nil];
Now the array comes back with zero objects: the newly added element is NOT found, even though I can see it in the debugger as an XML string. How can the query fail to find something I can see in the log? Is it a cache problem, a namespace problem, or a bug in GDataXML? Again, the problem is that after adding a new child it is somehow not reflected in the doc, yet I can still get the other elements under the same root with the same style of XPath query.
In NSLog I can see that the new element has been added to doc:
NSData *xmlData2 = self.doc.XMLData;
NSString *s= [[[NSString alloc] initWithBytes:[xmlData2 bytes] length:[xmlData2 length] encoding:NSUTF8StringEncoding] autorelease];
NSLog(@"%@", s);
Also, how can self.doc.XMLData give something different from [self.doc nodesForXPath]? It fools me into thinking my doc is OK, but maybe I corrupted the doc or introduced a wrong namespace while adding or removing elements in a previous method?
my xml starts like this:
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header/>
<SOAP-ENV:Body>
<inferenceresponse xmlns="">
<state goalreached="false">
..
..
Update
I just found a (hacky) solution: when I convert "doc" to NSData with "doc.XMLData" and then convert it back into a document, it works. But this should not be the real solution; converting back and forth just to get a correct document object is lame. What is the problem here? I guess it cannot fix the namespaces for the new child.
Your problem is here:
<inferenceresponse xmlns="">
The empty namespace attribute is evidently confusing the libxml XPath evaluation. If you step through GDataXMLNode's nodesForXPath:namespaces:error:, xmlXPathEval does indeed return an empty node set.
If you have control over the XML generation: I got correct XPath results after removing the empty attribute.
<inferenceresponse>
If modifying the server response is too hard, you can edit GDataXMLNode.m:
Find the method fixQualifiedNamesForNode:graftingToTreeNode: in GDataXMLNode implementation and replace the line
if (foundNS != NULL) {
// we found a namespace, so fix the ns pointer and the local name
with
if (foundNS != NULL && foundNS->href != NULL && strlen((char *)foundNS->href) != 0) {
// we found a namespace, so fix the ns pointer and the local name

xml Tableview nsxmlparsing [duplicate]

I think I read every single web page relating to this problem but I still cannot find a solution to it, so here I am.
I have an HTML web page which is not under my control and I need to parse it from my iPhone application. Here is a sample of the web page I'm talking about:
<HTML>
<HEAD>
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
</HEAD>
<BODY>
<LI class="bye bye" rel="hello 1">
<H5 class="onlytext">
<A name="morning_part">morning</A>
</H5>
<DIV class="mydiv">
<SPAN class="myclass">something about you</SPAN>
<SPAN class="anotherclass">
Bye Bye &egrave; un saluto
</SPAN>
</DIV>
</LI>
</BODY>
</HTML>
I'm using NSXMLParser and it works well until it finds the &egrave; HTML entity. It calls foundCharacters: for "Bye Bye" and then calls resolveExternalEntityName:systemID: with an entityName of "egrave".
In this method I'm just returning the character "è" converted to NSData. foundCharacters: is then called again, appending the string "è" to the previous "Bye Bye ", and then the parser raises the NSXMLParserUndeclaredEntityError error.
I have no DTD and I cannot change the HTML file I'm parsing. Do you have any ideas on this problem?
Update (12/03/2010). After the suggestion of Griffo I ended up with something like this:
data = [self replaceHtmlEntities:data];
NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
[parser setDelegate:self];
[parser parse];
where replaceHtmlEntities:(NSData *) is something like this:
- (NSData *)replaceHtmlEntities:(NSData *)data {
NSString *htmlCode = [[NSString alloc] initWithData:data encoding:NSISOLatin1StringEncoding];
NSMutableString *temp = [NSMutableString stringWithString:htmlCode];
[temp replaceOccurrencesOfString:@"&amp;" withString:@"&" options:NSLiteralSearch range:NSMakeRange(0, [temp length])];
[temp replaceOccurrencesOfString:@"&nbsp;" withString:@" " options:NSLiteralSearch range:NSMakeRange(0, [temp length])];
...
[temp replaceOccurrencesOfString:@"&Agrave;" withString:@"À" options:NSLiteralSearch range:NSMakeRange(0, [temp length])];
NSData *finalData = [temp dataUsingEncoding:NSISOLatin1StringEncoding];
return finalData;
}
But I am still looking for the best way to solve this problem. I will try TouchXML in the next few days, but I still think there should be a way to do this using the NSXMLParser API, so if you know how, feel free to write it here.
After exploring several alternatives, it appears that NSXMLParser will not support entities other than the standard entities &lt;, &gt;, &apos;, &quot; and &amp;.
The code below fails with an NSXMLParserUndeclaredEntityError.
// Create a dictionary to hold the entities and NSString equivalents
// A complete list of entities and unicode values is described in the HTML DTD
// which is available for download http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent
NSDictionary *entityMap = [NSDictionary dictionaryWithObjectsAndKeys:
[NSString stringWithFormat:@"%C", (unichar)0x00E8], @"egrave",
[NSString stringWithFormat:@"%C", (unichar)0x00E0], @"agrave",
...
,nil];
NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
[parser setDelegate:self];
[parser setShouldResolveExternalEntities:YES];
[parser parse];
// NSXMLParser delegate method
- (NSData *)parser:(NSXMLParser *)parser resolveExternalEntityName:(NSString *)entityName systemID:(NSString *)systemID {
return [[entityMap objectForKey:entityName] dataUsingEncoding: NSUTF8StringEncoding];
}
Attempts to declare the entities by prepending ENTITY declarations to the HTML document parse without error; however, the expanded entities are not passed back to parser:foundCharacters: and the è and à characters are dropped.
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
[
<!ENTITY agrave "à">
<!ENTITY egrave "è">
]>
In another experiment, I created a completely valid XML document with an internal DTD:
<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE author [
<!ELEMENT author (#PCDATA)>
<!ENTITY js "Jo Smith">
]>
<author>&lt; &js; &gt;</author>
I implemented the parser:foundInternalEntityDeclarationWithName:value: delegate method and it is clear that the parser is getting the entity data; however, parser:foundCharacters: is only called for the predefined entities.
2010-03-20 12:53:59.871 xmlParsing[1012:207] Parser Did Start Document
2010-03-20 12:53:59.873 xmlParsing[1012:207] Parser foundElementDeclarationWithName: author model:
2010-03-20 12:53:59.873 xmlParsing[1012:207] Parser foundInternalEntityDeclarationWithName: js value: Jo Smith
2010-03-20 12:53:59.874 xmlParsing[1012:207] didStartElement: author type: (null)
2010-03-20 12:53:59.875 xmlParsing[1012:207] parser foundCharacters Before:
2010-03-20 12:53:59.875 xmlParsing[1012:207] parser foundCharacters After: <
2010-03-20 12:53:59.876 xmlParsing[1012:207] parser foundCharacters Before: <
2010-03-20 12:53:59.876 xmlParsing[1012:207] parser foundCharacters After: <
2010-03-20 12:53:59.877 xmlParsing[1012:207] parser foundCharacters Before: <
2010-03-20 12:53:59.878 xmlParsing[1012:207] parser foundCharacters After: <
2010-03-20 12:53:59.879 xmlParsing[1012:207] parser foundCharacters Before: <
2010-03-20 12:53:59.879 xmlParsing[1012:207] parser foundCharacters After: < >
2010-03-20 12:53:59.880 xmlParsing[1012:207] didEndElement: author with content: < >
2010-03-20 12:53:59.880 xmlParsing[1012:207] Parser Did End Document
I found a link to a tutorial on Using the SAX Interface of LibXML. The xmlSAXHandler that is used by NSXMLParser allows for a getEntity callback to be defined. After calling getEntity, the expansion of the entity is passed to the characters callback.
NSXMLParser is missing functionality here. What should happen is that the NSXMLParser or its delegate store the entity definitions and provide them to the xmlSAXHandler getEntity callback. This is clearly not happening. I will file a bug report.
In the meantime, the earlier answer of performing a string replacement is perfectly acceptable if your documents are small. Check out the SAX tutorial mentioned above along with the XMLPerformance sample app from Apple to see if implementing the libxml parser on your own is worthwhile.
This has been fun.
A possibly less hacky solution is to replace the DTD with a locally modified one in which all external entity declarations are replaced with local ones.
This is how I do it:
First, find the document's DTD declaration and point it at a local file. For example, replace this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html><body><a href='a.html'>hi!</a><br><p>Hello</p></body></html>
with this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "file://localhost/Users/siuying/Library/Application%20Support/iPhone%20Simulator/6.1/Applications/17065C0F-6754-4AD0-A1EA-9373F6476F8F/App.app/xhtml1-transitional.dtd">
<html><body><a href='a.html'>hi!</a><br><p>Hello</p></body></html>
Download the DTD from the W3C URL and add it to your app bundle. You can find the path of the file with following code:
NSBundle* bundle = [NSBundle bundleForClass:[self class]];
NSString* path = [[bundle URLForResource:@"xhtml1-transitional" withExtension:@"dtd"] absoluteString];
Open the DTD file, find any external entity reference:
<!ENTITY % HTMLlat1 PUBLIC
"-//W3C//ENTITIES Latin 1 for XHTML//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
%HTMLlat1;
Replace it with the content of the entity file (http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent in the above case).
After replacing all external references, NSXMLParser should handle the entities properly without needing to download every remote DTD or external entity file each time it parses an XML file.
You could do a string replace within the data before you parse it with NSXMLParser. NSXMLParser is UTF-8 only as far as I know.
I think you're going to run into another problem with this example, as it isn't valid XML, which is what NSXMLParser expects.
The exact problem in the above is that the tags META, LI, HTML and BODY aren't closed, so the parser looks all the way through the rest of the document for their closing tags.
The only way around this that I know of, if you can't change the HTML, is to mirror it with the closing tags inserted.
I would try using a different parser, like libxml2; in theory I think that one should be able to handle poor HTML.
Since I've just started doing iOS development, I've been searching for the same thing and found a related mailing list entry: http://www.mail-archive.com/cocoa-dev@lists.apple.com/msg17706.html
- (NSData *)parser:(NSXMLParser *)parser resolveExternalEntityName: (NSString *)entityName systemID:(NSString *)systemID {
    NSAttributedString *entityString = [[[NSAttributedString alloc] initWithHTML:[[NSString stringWithFormat:@"&%@;", entityName] dataUsingEncoding:NSUTF8StringEncoding] documentAttributes:NULL] autorelease];
    NSLog(@"resolved entity name: %@", [entityString string]);
return [[entityString string] dataUsingEncoding:NSUTF8StringEncoding];
}
This is fairly similar to your original solution and also causes a parser error NSXMLParserErrorDomain error 26; but it does continue parsing after that. The problem is, of course, that it's harder to tell real errors apart ;-)