I am using NSXMLParser (iPhone SDK 3.0) to parse an HTML file that is on the server.
My parser stops parsing as soon as it encounters any error and calls the delegate method
- (void)parser:(NSXMLParser *)parser parseErrorOccurred:(NSError *)parseError
My question: how can I continue parsing the file after it encounters the error? Is there any way to do so?
Thanks
You can't. Parsing is stopped when the error is encountered. It would be hard to know what the rest of an erroneous XML document meant, anyway, as the meaning of anything at one location in the document depends on everything that came before it (in this case, including an error).
You're looking for a different kind of parser. An "at-all-costs" parser would be able to do what you want. If you are getting XML from lots of different sources this is ideal.
If you only have a few sources, you can work around their problems. For instance, if the only issue is that a source declares UTF-8 when the data is really ISO-8859-1, you could run through it once, find out that it fails on a character issue, convert the XML from ISO-8859-1 to UTF-8, and try again. Since you know where the error was, you can try to make some sort of fix. It's quite expensive to go this route, though.
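The retry described above can be sketched roughly as follows. This is Python rather than Objective-C, purely to keep the example short and runnable; the function name parse_with_fallback and the choice of ISO-8859-1 as the only fallback are assumptions for illustration, not part of any iPhone API.

```python
import xml.etree.ElementTree as ET


def parse_with_fallback(raw: bytes, fallback: str = "iso-8859-1"):
    """Parse XML bytes; on failure, assume the bytes are really `fallback`.

    Hypothetical helper: it only covers the case where the document claims
    UTF-8 (or claims nothing) but was actually written in `fallback`.
    """
    try:
        # First pass: trust whatever the document says about itself.
        return ET.fromstring(raw)
    except ET.ParseError:
        # Second pass: reinterpret the bytes and re-encode them as UTF-8,
        # so the UTF-8 declaration becomes true, then parse again.
        repaired = raw.decode(fallback).encode("utf-8")
        return ET.fromstring(repaired)


# A document that says UTF-8 but was saved as ISO-8859-1.
mislabelled = '<?xml version="1.0" encoding="UTF-8"?><a>caf\xe9</a>'.encode("latin-1")
root = parse_with_fallback(mislabelled)
print(root.text)
```

The second parse still raises if the document is broken for some other reason, which matches the point above: this only pays off when you know the specific mistake a source tends to make.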
Related
I'm having problems displaying emojis in a UILabel.
In some cases it even causes a crash when the characters in the label are laid out.
The characters are returned from the server as Unicode and are parsed with the AFNetworking framework.
This is an example of how they are returned from the server (console logs):
\U05d4\U05d9\U05d9
I have tried different approaches, like lowercasing this to "\u05d4" or changing the encoding of the returned string.
Nothing seems to work.
I did manage to show a couple of emojis properly (which makes me think it may be a server-related issue). Does the server need to support particular sets of Unicode characters so that it can return them in the appropriate encoding? I'd be happy if someone could clarify this point for me. (By the way, the server is written in Ruby on Rails, I believe.)
Should I parse the data with a different parser (SBJSON)? Switching the networking framework at this point would be impossible given the time and resources available.
What other options do I have?
Thanks
I think you should be able to just paste an emoji character directly into the code as text.
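It may also help to check what the escapes actually decode to. In a JSON payload, \uXXXX sequences are escapes that the JSON parser turns into real characters, and characters outside the Basic Multilingual Plane (which covers most emoji) travel as a surrogate pair of two escapes. A minimal sketch in Python, using the three escapes from the question plus a hypothetical emoji payload that is not from the original post:

```python
import json

# The escapes from the console log, written as a JSON string literal.
logged = json.loads(r'"\u05d4\u05d9\u05d9"')

# Hypothetical payload: one emoji sent as a UTF-16 surrogate pair.
emoji = json.loads(r'"\ud83d\ude00"')

print(len(logged))                        # number of characters decoded
print([hex(ord(ch)) for ch in logged])    # their code points
print(len(emoji), hex(ord(emoji)))        # the pair collapses to one character
```

The code points U+05D4 and U+05D9 are Hebrew letters, so the logged sample on its own does not show whether emoji arrive intact; a payload containing an actual emoji would be the thing to inspect.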
I noticed a strange problem in SDK 3.0.
When I parse XML, everything works fine in any 2.x SDK, but not in SDK 3.0.
I didn't find any difference in NSXMLParser, but every 2.x SDK works fine and 3.0 doesn't.
Has anybody run into this problem, and if so, how did you solve it?
->
rssParser is an NSXMLParser object.
In SDK 3.0
I call this method: [rssParser parse];
Then the first method my parser calls is this:
- (void)parser:(NSXMLParser *)parser parseErrorOccurred:(NSError *)parseError
and after that it does nothing.
When I select SDK 2.2.1
it also calls this method
- (void)parser:(NSXMLParser *)parser parseErrorOccurred:(NSError *)parseError
but the parser doesn't stop parsing; it continues calling the other NSXMLParser delegate methods.
The parse error is the same in both:
Error 65, Description: (null), Line: 1, Column: 60
This is the first line:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
Well, you are trying to parse an HTML file with NSXMLParser. NSXMLParser needs a valid XML file, and HTML files usually aren't valid XML. They could be, but almost all are not. The doctype, for instance, isn't valid XML. You're using the wrong "tool" for the job.
I don't know why it works on 2.x but not on 3.x. It looks like there has been a change in behaviour after all.
I'd suggest using libxml2 to parse the HTML file and not NSXMLParser. Libxml2 can be used to parse "real world" HTML.
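To illustrate the difference between a strict XML parser and a lenient HTML parser, here is a small Python sketch. It uses Python's standard html.parser module as a stand-in for libxml2's HTML parser; it is not libxml2's API, and the sample markup and the TextCollector class are made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Made-up "real world" HTML: an HTML 4.01 doctype and an unclosed <br>.
page = ('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">'
        '<html><body><p>Hi<br>there</body></html>')

# A strict XML parser gives up on this input.
try:
    ET.fromstring(page)
    strict_ok = True
except ET.ParseError:
    strict_ok = False


class TextCollector(HTMLParser):
    """Hypothetical helper that gathers the text between tags."""

    def __init__(self):
        super().__init__()
        self.texts = []

    def handle_data(self, data):
        if data.strip():
            self.texts.append(data.strip())


# A lenient HTML parser keeps going despite the sloppy markup.
collector = TextCollector()
collector.feed(page)
collector.close()

print(strict_ok, collector.texts)
```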
You might want to look at this StackOverflow topic:
parsing HTML on the iPhone
Actually, the problem is that in 3.0 NSXMLParser is stricter and will not parse any file that contains an error. The answer given by Yannick Compernol is correct. To solve the problem I used libxml2. You can follow this link to get the code for the parsing:
http://cocoawithlove.com/2008/10/using-libxml2-for-parsing-and-xpath.html
It sounds like there's an error in your xml that the 2.x parser could get past but the 3.0 parser is more strict and stops.
Can you post the smallest xml you can make that causes this error?
Sam
My xml file contains an error
after removing , the problem is solved.
I don't know why it worked fine with NSXMLParser in OS 2.2.1. Maybe it ignored errors in the XML file.
When parsing XML using NSXMLParser, I ran into a problem when the parser received characters that it couldn't handle, such as the auto-corrected "..." or "--" from MS Word.
My app reads XML that is exported from my database by a PHP file. I wonder whether I should handle this on the server side or in the iPhone SDK, and how?
Any help would be appreciated.
It sounds like an encoding issue.
Are you sure that the XML file is being served as the same encoding as in its header?
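A quick way to test that question is to read the encoding named in the XML declaration and see whether the bytes actually decode under it. The sketch below is Python, and both helper names are hypothetical. It assumes the declaration sits at the very start of the data with no byte order mark in front of it.

```python
import re


def declared_encoding(raw: bytes) -> str:
    """Return the encoding named in the XML declaration, or UTF-8 if none."""
    m = re.match(rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']', raw)
    return m.group(1).decode("ascii") if m else "utf-8"


def matches_declared_encoding(raw: bytes) -> bool:
    """True if the bytes decode cleanly under the encoding they claim."""
    try:
        raw.decode(declared_encoding(raw))
        return True
    except (UnicodeDecodeError, LookupError):
        # LookupError covers an encoding name Python does not recognise.
        return False


# 0x85 is the "..." ellipsis in Windows-1252, but it is invalid in UTF-8.
word_export = b'<?xml version="1.0" encoding="UTF-8"?><a>wait\x85</a>'
print(declared_encoding(word_export), matches_declared_encoding(word_export))
```

This only catches mismatches when the declared encoding is a strict one such as UTF-8. ISO-8859-1 accepts every byte value, so a file wrongly labelled ISO-8859-1 would pass the check and just display the wrong characters.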
I am using the SIMPLE RSS reading example found at http://theappleblog.com/2008/08/04/tutorial-build-a-simple-rss-reader-for-iphone/
It uses parseXML to load the RSS feeds.
Here is the problem I am having. For the following RSS feed example, I am having trouble getting it to load the feed. Comes up with an error that it cannot connect. However on my Mac RSS Reader it works fine, so I know the link is good.
Any ideas on why it cannot load this particular feed but it can load others fine?
http://www.okstate.com/rss.dbml?db_oem_id=200&media=news
Thanks.
I've just released an open source RSS/Atom Parser for iPhone and hopefully it might be of some use.
I'd love to hear your thoughts on it too!
In my experience, HTML markup causes an RSS parser to fail in most cases. I've experienced a problem like this with a lot of parser classes I've come across (in search of the ultimate one, which I didn't find)
My guess is that entities such as
's
are responsible for your crash. That was usually the case with my crashes. This also led to my decision to create a 'proxy server' that pre-parses the XML before sending it to the iPhone (which gives me the advantage of caching, scaling, and some other things). I do believe there are solid solutions out there, but it is always difficult to write a parser for so many RSS implementations.
P.S: W3C validates this feed as 'valid', so it really is 'our' problem..
Your problem could lie with:
Unicode characters (i.e. I see some o's with two dots above them in the feed)
The code you have doesn't respect CDATA sections correctly
To find out which is the case, save the feed file to your local disk and load it via your code to make sure the error happens.
Do a binary search on the file to find out if a particular RSS entry is causing the problem (i.e. remove all but the first rss entry and see if the problem exists. If it does, then the problem is there, if it doesn't put half the rss entries back in the file and repeat)
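The binary search above can be automated once the feed has been split into its entries. The following Python sketch assumes the entries are already available as separate strings (splitting a broken feed is its own problem), that a hypothetical <rss><channel> wrapper is enough to make a prefix parseable, and that first_bad_item is just an illustrative name.

```python
import xml.etree.ElementTree as ET


def parses(items):
    """True if the given entries form a well-formed document."""
    doc = "<rss><channel>" + "".join(items) + "</channel></rss>"
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


def first_bad_item(items):
    """Return the index of the first entry that breaks parsing, or None."""
    if parses(items):
        return None
    lo, hi = 0, len(items)  # invariant: items[:lo] parses, items[:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if parses(items[:mid]):
            lo = mid
        else:
            hi = mid
    return hi - 1


# Made-up entries; the third one has an unescaped ampersand.
entries = [
    "<item><title>a</title></item>",
    "<item><title>b</title></item>",
    "<item><title>R&D</title></item>",
    "<item><title>d</title></item>",
]
print(first_bad_item(entries))
```

Searching over prefixes rather than arbitrary halves keeps the check monotonic: once a prefix contains the first bad entry, every longer prefix fails too.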
I've been experiencing a similar issue. I haven't yet pinned down the answer, but I've noticed that RSS 2 tends to parse more successfully than the rest.
There are many RSS feeds that contain invalid XML, usually because they were hacked together on the server side using HTML templates by somebody who didn't understand XML. I've seen improperly escaped (or non-escaped) HTML post contents, missing close tags, badly nested tags, and so on.
If you want to be able to parse arbitrary feeds, you have to clean up bad XML. The usual way is to use the "htmlTidy" library, which is included in the OS. This can clean up XML as well as HTML.
This example you're following uses NSXMLParser -- I have no idea why. It's a lower-level API and it doesn't support tidying. I would suggest using NSXMLDocument instead. There's a flag in that API that will tell it to use tidy when parsing the XML. This API also returns you the XML as a handy tree of elements that's easy to work with.
I'm writing server-side programs in PHP for an iPhone app. And I have no iPhone. :P
The iPhone app requests XML files from the site whenever a user runs the iPhone app. You may visit http://www.appvee.com/iphone/ads or http://www.appvee.com/iphone/latest for the XML files.
And a message box will show up with the following error messages:
"Web Site Error
Conversion of data failed. The file is not UTF-8, or in the encoding specified in XML header if XML.
"
Maybe I must add header("Content-type: text/xml"); at the beginning of the PHP files? I didn't add this line and it worked well before.
Any help is greatly appreciated.
I agree with ceejayoz, looks like this is a special characters issue.
I would suggest using the htmlentities method to encode the data in the xml file.
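As a language-neutral illustration of encoding the data before it goes into the XML, here is a Python sketch (not PHP, and not the htmlentities function itself). It escapes the XML markup characters and then writes everything outside ASCII as numeric character references, which XML parsers accept without a DTD. The helper name xml_safe and the sample text are made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def xml_safe(text: str) -> bytes:
    """Escape &, <, > and turn non-ASCII characters into &#NNNN; references."""
    return escape(text).encode("ascii", "xmlcharrefreplace")


# Sample text with a curly apostrophe (U+2019) and a bare ampersand.
body = xml_safe("what\u2019s R&D")
doc = b"<description>" + body + b"</description>"

print(body)
print(ET.fromstring(doc).text == "what\u2019s R&D")
```

Because the output is pure ASCII, it is valid under a UTF-8 declaration regardless of how the database stored the original text, as long as the text was decoded correctly before escaping.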
It might be the unescaped special character (looks like it's supposed to be a curly apostrophe) in the 'latest' XML. (in the line that goes "Find out information about what[THIS IS THE CHARACTER]s around you and how...")
Does adding an XML content type header resolve the issue? You ask if it's necessary but give no indication of whether it helps.