Can I copy a fragment of the input XML using NSXMLParserDelegate protocol? - iphone

Using the NSXMLParserDelegate protocol for parsing XML is fine; however, I need to copy a chunk of the XML in an answer verbatim. What I would like to do is store everything between the beginning and end XML tags verbatim as an NSString object so I can replay this fragment in a future query.
Is this possible, or is the only solution to parse the tree manually, convert it to a temporary object, and then convert it back to an XML string for the future query?
One thing to note is that I'm not parsing the input incrementally; rather, I'm creating the NSXMLParser object with the complete XML data and then calling parse on it. So maybe there's a way to correlate the position of didStartElement/didEndElement inside the original XML data so I can extract the subrange?

Both didStartElement and didEndElement are passed the NSXMLParser, which tracks the progress of the parsing through its lineNumber and columnNumber properties. Unfortunately, there is no direct way to convert that line/column information into a buffer offset, and you also have to interpret the NSData with a specific encoding before character positions mean anything.
One solution is to decode the NSData into an NSString and copy it into a buffer of unichar elements with -[NSString getCharacters:range:]. The unichar buffer can then be iterated, counting newline characters, until the line/column values stored by the NSXMLParser object are reached. Doing this for the start and end tags gives you the unichar range of the XML contained between them.
This range can then be converted to an NSString and reused in future queries. The advantage is that the XML inside does not need to be parsed again, since it is copied directly and is expected to be well formed.
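The line/column scan described above can be sketched as follows. This is only an illustration under stated assumptions: OffsetForLineColumn and SubstringBetweenPositions are hypothetical helpers (not Foundation API), line and column are taken to be 1-based, lines are assumed to end with "\n", and columns are assumed to count UTF-16 units. Whether NSXMLParser reports the position before or after a tag, and how it counts columns for non-ASCII text, should be checked against real input before relying on this.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): maps a 1-based line/column
// pair to a UTF-16 index in `text`. Assumes "\n" line endings and columns
// counted in UTF-16 units. Returns NSNotFound if the line does not exist or
// the resulting offset runs past the end of the text.
static NSUInteger OffsetForLineColumn(NSString *text, NSInteger line, NSInteger column)
{
    if (line < 1 || column < 1) return NSNotFound;
    NSUInteger length = text.length;
    NSUInteger index = 0;
    // Skip (line - 1) complete lines by searching for their newlines.
    for (NSInteger current = 1; current < line; current++) {
        NSRange newline = [text rangeOfString:@"\n"
                                      options:NSLiteralSearch
                                        range:NSMakeRange(index, length - index)];
        if (newline.location == NSNotFound) return NSNotFound;
        index = NSMaxRange(newline);
    }
    NSUInteger offset = index + (NSUInteger)(column - 1);
    return offset <= length ? offset : NSNotFound;
}

// Hypothetical helper: returns the text between two recorded positions,
// or nil if either position cannot be mapped or the range is inverted.
static NSString *SubstringBetweenPositions(NSString *text,
                                           NSInteger startLine, NSInteger startColumn,
                                           NSInteger endLine, NSInteger endColumn)
{
    NSUInteger start = OffsetForLineColumn(text, startLine, startColumn);
    NSUInteger end = OffsetForLineColumn(text, endLine, endColumn);
    if (start == NSNotFound || end == NSNotFound || end < start) return nil;
    return [text substringWithRange:NSMakeRange(start, end - start)];
}
```

In a delegate, the positions would be recorded from parser.lineNumber and parser.columnNumber inside didStartElement and didEndElement, and the string passed in would be the same NSData decoded with the document's encoding.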

Related

Identifying and formatting XML String to readable format in XMLParser

I am working in Swift and I have a set of Data that I can encode as a String like this:
<CONTAINER><Creator type="NSNull"/><Category type="NSNull"/><UMID type="NSArray"><CHILD>d1980b265cbd415c90f5d5f04efcb5df</CHILD><CHILD>7e0252c137c249fc92bd0f844effe27f</CHILD></UMID><Channels type="NSNumber">1</Channels></CONTAINER>
I am looking for a way to format this string as XML with indents so I can use XMLParser to properly read through it, which it currently does not. I imagine NSNull is when the object is empty, I just haven't seen this format so I don't know what to search for. Could it be closer to a Dictionary object? If so I'd be happy to format it as that as well.
I've also tried to create a XMLDocument from the data, but it doesn't fix the format.
EDIT:
I wanted to add a bit more information to help clarify what I am trying to do. This string above is derived from an encrypted piece of metadata from a file. In my code I identify the chunk of data that is encrypted, then decrypt it, and then convert that data to a string. It's worth noting that the string ends up having null characters in between each valid character, but I strip those out and end up with this string.
Copying this string into an XML validator confirms it is valid XML. What is confusing to me is its format, in which it has object types such as NSNull and NSNumber. My post was originally hoping to identify this type of format. It seems like more than just XML.
In response to some of the comments, I have used XML Parser delegate with other XML strings and have a basic understanding of how it works. I should have originally mentioned that and instead said that XML Parser does not recognize any of these elements or strings within them.
UPDATE:
The issue ended up being the null characters in between each valid character. Stripping those out and then running it through XML Parser worked great. Thanks all.
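Null characters between every valid character are typical of UTF-16 text that has been read one byte at a time. Assuming that is what happened here (the post does not confirm the encoding), decoding the data as UTF-16 avoids stripping the nulls by hand. The sketch below is in Objective-C like the other examples on this page; StringFromUTF16LEData is a hypothetical helper name, and the little-endian byte order is an assumption. From Swift, the equivalent would be String(data:encoding:) with .utf16LittleEndian.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes bytes that are assumed to be UTF-16
// little-endian. Returns nil if the data cannot be decoded that way.
static NSString *StringFromUTF16LEData(NSData *data)
{
    return [[NSString alloc] initWithData:data
                                 encoding:NSUTF16LittleEndianStringEncoding];
}
```

The resulting string can be converted back to data with dataUsingEncoding: and handed to the parser as usual.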

Sending a JSON NSData via URL

I am attempting to send a JSON representation of an object in an email link. The recipient will open the link and my app will respond via a url scheme. It must extract the JSON from the url and re-build the object.
I am serializing my object by building an NSDictionary and using:
return [NSJSONSerialization dataWithJSONObject:dictionary options:NSJSONWritingPrettyPrinted error:&error];
I'm not sure what comes next. Somehow I need to convert this NSData into a string so that I can prefix my url scheme and use it in a link.
On the receiving end, I then need to remove the prefix (which I can do) and turn the string back into an NSData.
What is the correct method for doing this? And how do I make sure that the contents of my data do not interfere with the JSON string encoding (e.g. if my object contains text including special characters)?
You need to do an additional encoding step, since there are characters in encoded JSON that also have significance when they are part of a URL. What we actually want to do is URL-encode the data so none of the characters in the resulting string conflict with what applications expect a URL to look like.
The first step is transforming our data into an NSString. NSJSONSerialization produces UTF-8 encoded data, so decode it as UTF-8:
NSString *jsonString = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
Now, there's a method you might be tempted to use called -stringByAddingPercentEscapesUsingEncoding:, but it doesn't do a thorough enough job of escaping all the relevant characters (it leaves characters such as & and = untouched), so we need to build our own.
I could repeat the code here, but since it's been done many times already, just view this blog that shows how you can add a category to NSString to do proper encoding, after which you can append it and send it on its way. Writing the analogous decoding function with CFURLCreateStringByReplacingPercentEscapesUsingEncoding is an exercise for the reader of which many examples can be found floating around.
Make sure your payloads are quite small (on the order of a couple of kB), by the way, since there is probably an upper bound on how long URLs, even those used locally and with a custom scheme, can be.
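As a rough sketch of the encode/decode round trip, the following uses -stringByAddingPercentEncodingWithAllowedCharacters: and -stringByRemovingPercentEncoding (available since iOS 7) instead of the CFURL functions mentioned above, so it is a substitute technique rather than the code from the linked blog. The function names, the myapp scheme, the import host and the payload parameter are all made up for illustration.

```objective-c
#import <Foundation/Foundation.h>

// Made-up link prefix for this example; a real app would use its own scheme.
static NSString * const kLinkPrefix = @"myapp://import?payload=";

// Hypothetical helper: percent-encodes everything except the RFC 3986
// "unreserved" characters, so &, =, ?, quotes and spaces are all escaped.
static NSString *PercentEncodeForURL(NSString *string)
{
    NSCharacterSet *allowed = [NSCharacterSet characterSetWithCharactersInString:
        @"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"];
    return [string stringByAddingPercentEncodingWithAllowedCharacters:allowed];
}

// Hypothetical helper: builds the link from UTF-8 JSON data.
// Returns nil if the data is not valid UTF-8.
static NSString *LinkForJSONData(NSData *jsonData)
{
    NSString *json = [[NSString alloc] initWithData:jsonData
                                           encoding:NSUTF8StringEncoding];
    if (json == nil) return nil;
    return [kLinkPrefix stringByAppendingString:PercentEncodeForURL(json)];
}

// Hypothetical helper: reverses LinkForJSONData. Returns nil if the prefix
// is missing or the percent-encoding is malformed.
static NSData *JSONDataFromLink(NSString *link)
{
    if (![link hasPrefix:kLinkPrefix]) return nil;
    NSString *encoded = [link substringFromIndex:kLinkPrefix.length];
    NSString *json = [encoded stringByRemovingPercentEncoding];
    return [json dataUsingEncoding:NSUTF8StringEncoding];
}
```

The data returned by JSONDataFromLink can be passed to +[NSJSONSerialization JSONObjectWithData:options:error:] to rebuild the dictionary. Passing 0 instead of NSJSONWritingPrettyPrinted when serializing keeps the payload shorter.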

how to know the PDF file name using a CGPDFDictionaryRef object reference

I have an object of type CGPDFDictionaryRef returned from a method that is part of a static library (so I do not have access to its code to modify it). I want to know the name of the PDF file that this dictionary object holds. How can I query it to get the name of the file?
There are two functions that take a CGPDFDocumentRef and return a CGPDFDictionaryRef. They are CGPDFDocumentGetInfo and CGPDFDocumentGetCatalog. Neither function returns a dictionary that contains the name of the original file. Neither does the array returned by CGPDFDocumentGetID.
This makes sense, because you can create a CGPDFDocumentRef without a file, from data you get over a socket or by drawing into a CGPDFContext with Quartz 2D.
If you want the name of the file, you'll have to get it some other way.
The CGPDF* functions are a function-based mechanism to get to the series of arrays, dictionaries, integers, string and name elements in PDF documents. PDF documents themselves are really just composed of these "basic" elements. If you'd like some light reading check out the ~1500 page PDF specification sometime. As rob mayoff stated, you are basically pointing to memory once you have a CGPDFDocumentRef.
That being said, there is no value that is guaranteed within a PDF structure that will give you the filename. Download Voyeur and dig around your PDF to look around and prove me wrong (I could be).
Here's the sample of the true contents of a PDF:

Objective-C XML manipulate as NSString is safe?

Is it safe to convert XML from NSData to an NSString, manipulate the string, and then convert it back to NSData?
I heard it is always safer to work with XML parsing libraries. Can anyone explain why? At which points should I be careful if I use that method? Is there a possible encoding problem?
The risk is that you'll do this:
Convert an XML to a string
Manipulate the string, and accidentally break the XML.
Convert back, and end up with an invalid XML.
If you work with an XML parsing library, as you can manipulate the elements in a DOM, you won't have the chance of breaking the XML structure and ending up with an invalid XML.
Other than that, if you are careful with the operations you do on the NSString, you'll probably be fine.
Conversion between NSData and NSString is safe as long as you do the conversion with the original encoding.
// Example: NSString to NSData and back, assuming the original message uses UTF-8.
NSData *data = [message dataUsingEncoding:NSUTF8StringEncoding];
NSString *string = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
Parsing XML as a string will only work with naive documents. If your XML structure doesn't change, if the elements you search for are unique, if the contents are only characters, if there are no CDATA sections in the middle, and if there are no namespaces, you are safe. Otherwise your code will easily get confused trying to digest the XML. It's going to be more solid if both the creator and the client of the document abide by the rules set by the XML standard.
If behind all this you are worrying about complexity, it's easy to operate on XML using XPath. If you worry about speed, maybe you could switch to a faster format like JSON if you are in control of XML generation.

iPhone: NSXMLParser's foundCharacters method called multiple time for single tag

I am able to parse the XML file, but I am facing a strange error. My XML file format is like this:
<contact>
<contactServiceTag>3B92E6A7-B599-EAE9-1CB7B5592CB8695C</contactServiceTag>
<contactDeletedBoolean>0</contactDeletedBoolean>
<contactLinkStatus>Stored</contactLinkStatus>
<contactName>P4</contactName>
<contactEmailAddresses>
<workEmail>updatedp4#isol.co.in</workEmail>
<personalEmail/>
<otherEmail/>
</contactEmailAddresses>
<contactLastUpdated>{ts '2010-01-22 10:05:42'}</contactLastUpdated>
<contactPhotoExists>False</contactPhotoExists>
</contact>
During parsing, when the parser reaches the contactLastUpdated element, the foundCharacters method is called multiple times: it returns {ts on the first call, \' on the second, 2010-01-22 10:05:42 on the third, \' on the fourth and finally } on the last. So I only get the last value (}) when didEndElement is called.
Please suggest how I can resolve this type of error.
In your implementation of the <NSXMLParserDelegate> callbacks like parser:foundCharacters:, you should be storing the found characters in instance variables, possibly concatenating a string together, so that when parser:didEndElement:namespaceURI:qualifiedName: is invoked, you have the full element value/body available to your object through its instance variable state.
You might also read up on the difference between SAX and DOM parsers. NSXMLParser is a SAX parser which is less convenient to use, but performs better than DOM parsers.
Create a string when entering an element, append to it when foundCharacters is called and then check its length/value on didEndElement?
Both Jon's and Mobs' answers are correct, that is the way to do it. In order to understand better how it works, I suggest that you take a look at Apple's Seismic XML sample project. It uses the NSXMLParser in a very clear way and also shows how to handle the situation you are in.
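The accumulate-then-read pattern the answers describe could look roughly like this. It is a minimal sketch, assuming ARC; ContactParserDelegate and its properties are hypothetical names, and resetting the buffer on every start tag only suits leaf elements such as contactLastUpdated (text that directly belongs to a parent element would be dropped).

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate that collects the full text of every
// <contactLastUpdated> element, however many chunks it arrives in.
@interface ContactParserDelegate : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *currentText;
@property (nonatomic, strong) NSMutableArray<NSString *> *lastUpdatedValues;
@end

@implementation ContactParserDelegate

- (instancetype)init
{
    self = [super init];
    if (self) {
        _lastUpdatedValues = [NSMutableArray array];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict
{
    // Start a fresh buffer whenever an element opens.
    self.currentText = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    // May be called several times per element, so append instead of assigning.
    [self.currentText appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    // The buffer now holds the complete element body.
    if ([elementName isEqualToString:@"contactLastUpdated"]) {
        [self.lastUpdatedValues addObject:[self.currentText copy]];
    }
    self.currentText = nil;
}

@end
```

To use it, create an NSXMLParser with the XML data, assign an instance of this class as its delegate, call parse, and read lastUpdatedValues afterwards.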