Remove HTML entities - iphone

I am developing an application for the iPhone. I need to remove HTML entities such as "<p>" from a parsed XML response. Is there a direct way to remove all such entities?

How is the data formatted? Are the HTML entities escaped in the original XML, something like this:
<xml><content type="html">&lt;p&gt;A paragraph.&lt;/p&gt;</content></xml>
In this case you could just strip the tags with a regular expression.
Otherwise, I would suggest following the DTD of the XML file, and stripping all other tags under the assumption that they don't constitute part of the XML markup.
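As a rough illustration of the regular-expression approach, here is a minimal sketch in JavaScript (chosen only for brevity; on the iPhone the same pattern would be applied with the platform's own string or regular-expression APIs). The helper names decodeBasicEntities and stripTags are hypothetical, and the entity table covers only the five predefined XML entities. A regex like this is not a real HTML parser and will misfire on input such as a ">" inside an attribute value.

```javascript
// Hypothetical helpers; a sketch, not a full HTML parser.
const ENTITIES = { lt: "<", gt: ">", amp: "&", quot: '"', apos: "'" };

// Turn "&lt;p&gt;" back into "<p>"; unknown entities are left untouched.
function decodeBasicEntities(text) {
  return text.replace(/&([a-z]+);/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(ENTITIES, name) ? ENTITIES[name] : match
  );
}

// Remove anything that looks like a tag: "<", then non-">" characters, then ">".
function stripTags(text) {
  return text.replace(/<[^>]*>/g, "");
}

const escaped = "&lt;p&gt;A paragraph.&lt;/p&gt;";
const plain = stripTags(decodeBasicEntities(escaped));
console.log(plain); // A paragraph.
```

If the tags are not escaped in the XML, stripTags alone is enough; the decoding step is only needed for the escaped case.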

Related

How do you use &nbsp; in GWT UiBinder XML? Can you escape it?

In my mark-up I want to add a space (&nbsp;) between elements without always having to use CSS to do so. If I put &nbsp; in my markup, GWT throws errors. Is there a way around it?
For example:
<g:Label>One&nbsp;</g:Label><g:Label>Two</g:Label>
Should show:
One Two
And not:
OneTwo
As documented here, you just have to add this to the top of your XML file and it will work!
<!DOCTYPE ui:UiBinder SYSTEM "http://dl.google.com/gwt/DTD/xhtml.ent">
Note that the GWT compiler won't actually visit this URL to fetch the file, because a copy of it is baked into the compiler. However, your IDE may fetch it.
Rather than use a Label, which to me shouldn't allow character entities at all, I use an HTML widget. To set the content, though, I find I have to do it through the HTML attribute, not the body content (note that the uppercase HTML is important here, since the set method is setHTML, not setHtml):
<g:HTML HTML="One&nbsp;" />

Purpose for Word Open XML and content controls binding

For Word report generation, I am looking at binding XML to content controls to see whether it is any easier than using Word Interop and hard-coding index references to content controls to assign values to them.
However, I don't really understand how to do it.
My workflow is to enter information in Excel and then generate an XML file that populates the content controls. However, what I have read describes the other way round: the Word Content Control Toolkit, and descriptions where the XML is populated by the user entering information in Word, after which the programmer unzips the docx file to retrieve the XML file.
How can I populate content controls with XML?
There are samples on generating Word documents from Word templates, XML and data-bound content controls at http://worddocgenerator.codeplex.com/
Set up the mapped content controls in the 'template' docx using the content control toolkit or similar. Do this using a sample XML file containing your Excel data.
Now that you have the template document, at run time you can inject your XML file into it (i.e. replace the custom XML part it contains with your instance data), in C#, Java or whatever.
When the user opens the document in Word 2007/2010, the information in the custom XML part will automatically be copied into the bound controls, and visible to the user.
Note that content control data binding doesn't easily support repeating data (e.g. populating table rows) in Word 2007/2010, though there are ways to do it.
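To make the run-time step a little more concrete, the sketch below covers only the first half of it: turning one record of spreadsheet data into the XML instance that would replace the custom XML part. It is written in JavaScript for brevity. The element names (report, customer, total) and the function names are made up for illustration; in practice they must match the sample XML that the template's content controls were mapped against. Writing the resulting string into the docx package is left out, since that depends on the zip or Open XML library you use.

```javascript
// Escape the characters that are special in XML text content.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the instance document from a flat record (hypothetical field names).
function buildCustomXml(rootName, record) {
  const fields = Object.keys(record)
    .map((name) => `  <${name}>${escapeXml(record[name])}</${name}>`)
    .join("\n");
  return `<?xml version="1.0" encoding="UTF-8"?>\n<${rootName}>\n${fields}\n</${rootName}>`;
}

const customXml = buildCustomXml("report", { customer: "Smith & Sons", total: 1250 });
console.log(customXml);
```

Escaping matters here because cell values from Excel can easily contain "&" or "<", which would otherwise make the custom XML part invalid.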

Need to find the tags under a tag in an XML using jQuery

I have this xml as part of the responseXml of an Ajax call:
<banner-ad>
<title><span style="color:#ffff00;"><strong>Title</strong></span></title>
</banner-ad>
When I use jQuery(responseXml).find("title").text(); the result is "Title".
I also tried jQuery(responseXml).find("title:first-child"), but the result is [object Object].
I want to get the result:
<span style="color:#ffff00;"><strong>Title</strong></span>
Please let me know how to do this in jQuery.
Thanks in advance for any help.
Regards,
Racs
Your problem is that you cannot simply append nodes from one document (the XML response) to another (your HTML page). The issue is two-fold:
You can use jQuery to append nodes from the XML document to the HTML page. This works; the nodes appear in the HTML DOM, but they stay XML nodes and therefore the browser ignores the style attribute, for example. Consequently the text will not be yellow (#ffff00).
As far as I can see, jQuery offers no built-in way to get the XML string (i.e. a serialized node) from an XML node. jQuery can handle XML documents quite well, but there is no equivalent to what .html() does in HTML documents.
So to make this work we need to extract the XML string from the XML document. Some browsers (namely IE) support the .xml property on XML nodes; the others come with an XMLSerializer object:
// find the proper XML node
var $title = $(doc).find("title");
// either use .xml or, when unavailable, an XMLSerializer
var html = $title[0].xml || (new XMLSerializer()).serializeToString($title[0]);
// result:
// '<title><span style="color:#ffff00;"><strong>Title</strong></span></title>'
Then we have to feed this HTML string to jQuery so new, real HTML elements can be created from it:
$("#target").append(html);
There is a fiddle to show this in action: http://jsfiddle.net/Tomalak/QWHj8/. This example also gets rid of the superfluous <title> element.
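One way to drop the wrapping <title> element, sketched here under the assumption that the node exposes a standard childNodes list, is to serialize each child separately and join the results. The helper name serializeChildren is hypothetical, and this is not necessarily how the fiddle does it. The serializer is passed in as a parameter so the function can be tried with new XMLSerializer() in a browser or with a stand-in object elsewhere.

```javascript
// Serialize every child of `node` and concatenate the results, so the
// wrapping element itself (here <title>) is left out.
function serializeChildren(node, serializer) {
  var parts = [];
  for (var i = 0; i < node.childNodes.length; i++) {
    var child = node.childNodes[i];
    // Same fallback as in the snippet above: .xml where it exists,
    // otherwise the serializer.
    parts.push(child.xml || serializer.serializeToString(child));
  }
  return parts.join("");
}

// In a browser, with $title found as in the snippet above:
//   var inner = serializeChildren($title[0], new XMLSerializer());
//   $("#target").append(inner);
```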
Anyway. If you have a chance to influence the XML itself, it would make sense to change it:
<banner-ad>
<title>&lt;span style="color:#ffff00;"&gt;&lt;strong&gt;Title&lt;/strong&gt;&lt;/span&gt;</title>
</banner-ad>
Just XML-encode the payload of <title> and you can do this in jQuery:
$("#target").append( $(doc).find("title").text() );
This would probably work:
$(responseXml).find("title").html();

Displaying the contents of the xml page

I am new to iPhone development. I want to parse an XML page. The source contains some HTML tags, and these tags are displayed in my simulator. I want to filter out the tags and display only the content. The XML source looks like this:
<description>
<![CDATA[<br /><p class="author"><span class="by">By: </span>By Sydney Ember</p><br><p>In the week since an earthquake devastated Haiti ...</p>]]>
</description>
I want "In the week since an ..." to be displayed and not the HTML tags. Please help me out. Thanks.
As said before in other answers, the data in your xml is inside a CDATA block - this means that when you get the contents of the tag, the XML parser won't be able to get rid of the 'By:' bit for you - as far as it's concerned, it's all just text.
However, if you're going to display it as HTML inside a UIWebView (instead of a UILabel etc.), you can add a style sheet to the start of the string that hides the 'By:'. Something like:
NSString *cssString = @"<style type='text/css'>span.by { display:none; }</style>";
NSString *html = [NSString stringWithFormat:@"<html><head>%@</head><body>%@</body></html>", cssString, descriptionString];
[webView loadHTMLString:html baseURL:nil];
where descriptionString is the contents of the <description> tag in your xml.
However, this approach is a little heavy-handed; I would try very hard to get some cleaner XML from your server!
As for actually parsing the xml, try the NSXMLParser object.
The contents inside a CDATA block are considered as text (xml specific chars like <, &, > etc will be ignored and treated as plain chars). If the text canvas you're using to display the text accepts html, read the text node of description tag and assign it to the innerHTML equivalent of the canvas.
I see that all the tags are HTML. In addition, there is a CDATA section, which declares that its content should be treated as text and not as XML. As for the XML parsing, there are a few XML parsers available for the iPhone:
TouchXML
XPathQuery
I prefer the latter.
I'm not sure how the parsers will treat the CDATA.
Maybe you will have to parse twice - first time for getting the CDATA contents and second time for parsing the content...

Ignore CDATA while xml parsing

I am new to iPhone development. I want to ignore the CDATA tag while parsing, because the parser treats the HTML tags following it as text. Since I want to display the content alone, I want my parser to ignore the CDATA tag. My source is:
<![CDATA[<br /><p class="author"><span class="by">By: </span>By Sydney Ember</p><br><p>In the week since an </p>]]>
Is there any way to ignore CDATA tag?
Is there any way to parse my source twice so it displays only the content?
Please give me some sample code. Please help me out. Thanks.
If you treat the CDATA content as XML instead of CDATA then your parser will throw an error (since your HTML is a weird mix of XHTML and HTML and is not well formed).
If you want to get the HTML, then parse the XML, extract the text content of the node, then parse that text as HTML.
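The two-step idea can be sketched as follows, in JavaScript only to keep the example short (on the iPhone, step 1 is simply whatever text the XML parser hands back for the element). The function names are hypothetical, and the regular expressions assume the markup looks like the sample in the question; a real HTML parser would be more robust.

```javascript
// Step 1: pull the raw text out of a CDATA section. An XML parser would
// normally do this for you and return it as the element's text content.
function extractCdata(xml) {
  const match = /<!\[CDATA\[([\s\S]*?)\]\]>/.exec(xml);
  return match ? match[1] : "";
}

// Step 2: treat that text as HTML. Collect the text of every <p> except
// the author line, stripping any tags nested inside the paragraph.
function paragraphsWithoutAuthor(html) {
  const result = [];
  const paragraph = /<p([^>]*)>([\s\S]*?)<\/p>/gi;
  let m;
  while ((m = paragraph.exec(html)) !== null) {
    if (/class="author"/.test(m[1])) continue; // skip the byline paragraph
    result.push(m[2].replace(/<[^>]*>/g, "").trim());
  }
  return result;
}

const sampleXml =
  '<description><![CDATA[<br /><p class="author"><span class="by">By: </span>' +
  "By Sydney Ember</p><br><p>In the week since an </p>]]></description>";

// One entry: "In the week since an"
console.log(paragraphsWithoutAuthor(extractCdata(sampleXml)));
```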
There is no way to ignore the CDATA tag - it's part of the xml spec and parsers should honour it.
If you don't like the idea of this answer to your earlier question, you could get the contents of the CDATA section and parse it as XML again. However, this is strongly discouraged! You don't know that the contents of the CDATA are going to be valid XML (they're probably not).
If you can 100% guarantee that the CDATA section has the form you show above, you could probably use some string manipulation to get the data out (i.e. string replace '<span class="by">By: </span>' with ''), but again, this will almost certainly break if the CDATA contents change.
Where is the XML coming from? It would be a better idea to talk to the owner of the service and get them to send you, instead of the current description, something like:
<description>
<author>By Sydney Ember</author>
<text>In the week since an </text>
</description>