Identifying and formatting XML String to readable format in XMLParser - swift

I am working in Swift and I have a set of Data that I can encode as a String like this:
<CONTAINER><Creator type="NSNull"/><Category type="NSNull"/><UMID type="NSArray"><CHILD>d1980b265cbd415c90f5d5f04efcb5df</CHILD><CHILD>7e0252c137c249fc92bd0f844effe27f</CHILD></UMID><Channels type="NSNumber">1</Channels></CONTAINER>
I am looking for a way to format this string as XML with indents so I can use XMLParser to properly read through it, which it currently does not. I imagine NSNull is when the object is empty, I just haven't seen this format so I don't know what to search for. Could it be closer to a Dictionary object? If so I'd be happy to format it as that as well.
I've also tried to create an XMLDocument from the data, but that doesn't fix the format.
EDIT:
I wanted to add a bit more information to help clarify what I am trying to do. This string above is derived from an encrypted piece of metadata from a file. In my code I identify the chunk of data that is encrypted, then decrypt it, and then convert that data to a string. It's worth noting that the string ends up having null characters in between each valid character, but I strip those out and end up with this string.
Copying this string into an XML validator confirms it is valid XML. What confuses me is its format, which includes object types such as NSNull and NSNumber. My post was originally hoping to identify this type of format. It seems like more than just XML.
In response to some of the comments, I have used XML Parser delegate with other XML strings and have a basic understanding of how it works. I should have originally mentioned that and instead said that XML Parser does not recognize any of these elements or strings within them.
UPDATE:
The issue ended up being the null characters in between each valid character. Stripping those out and then running it through XML Parser worked great. Thanks all.
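The fix described in the update can be sketched as follows. This is a minimal illustration, not the poster's actual code: the names ElementCollector and parseStrippingNulls are hypothetical, and it assumes the decrypted payload is plain ASCII text padded with zero bytes (a pattern that usually means the bytes are UTF-16, in which case decoding with String(data:encoding:) and .utf16LittleEndian may be a cleaner option).

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML   // XMLParser lives here on Linux
#endif

// Hypothetical delegate: records each element name and its "type" attribute.
final class ElementCollector: NSObject, XMLParserDelegate {
    private(set) var elements: [(name: String, type: String?)] = []

    func parser(_ parser: XMLParser,
                didStartElement elementName: String,
                namespaceURI: String?,
                qualifiedName qName: String?,
                attributes attributeDict: [String: String] = [:]) {
        elements.append((name: elementName, type: attributeDict["type"]))
    }
}

// Removes every zero byte, then hands the cleaned bytes to XMLParser.
// Assumption: the real content contains no meaningful zero bytes.
func parseStrippingNulls(_ raw: Data) -> [(name: String, type: String?)]? {
    let cleaned = Data(raw.filter { $0 != 0 })
    let parser = XMLParser(data: cleaned)
    let collector = ElementCollector()
    parser.delegate = collector
    return parser.parse() ? collector.elements : nil
}
```

Called with the decrypted Data, this returns the element names in document order together with the value of each type attribute (NSNull, NSArray, NSNumber and so on), or nil if the parser reports an error.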

Related

Decoding/parsing CSV and CSV-like files in Swift

I'll have to write a very customised CSV-like parser/decoder. I have looked for open source ones on GitHub, but not found any that fit my needs. I can solve this, but my question is whether it would be a total violation of key/value decoding to implement this as a TopLevelDecoder in Swift.
I have keys, but not exactly key/value pairs. In CSV files, there is rather a key for each column of data.
There are a number of problems with the files I need to parse:
Commas are not only for separation of fields, but there are also commas within some fields. Example:
//If I convert to an array
struct Family {
let name: String?
let parents: [String?]
let siblings: [String?]
}
In this example, both parents' names are within the same field and need to be converted into an array, and the same goes for the siblings field.
"Name", "Parents","Siblings"
"Danny", "Margaret, John","Mike, Jim, Jane"
In the case of the parents, I could have split that into two fields in a struct like
struct Family {
let name: String?
let mother: String?
let father: String?
}
but with the Siblings field that doesn't work, since there can be anywhere from zero to many siblings. Therefore I will have to use an array.
There are cases when I will split into two fields though.
Not all the files I need to parse are strictly CSV. All of the files have tabular data (comma- or tab-separated), but some of the files have a few rows of comments (sometimes containing metadata) that I need to consider. Those files have a .txt extension instead of .csv.
## File generated 2020-05-02
"Name", "Parents","Siblings"
"Danny", "Margaret, John","Mike, Jim, Jane"
Therefore I need to peek at the first line(s) to determine if there are such comments, and after that has been parsed I can continue to treat the rest of the file as CSV.
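The peek-then-parse approach described above could look roughly like this. It is only a sketch under stated assumptions: splitFields and parseTable are hypothetical names, comment lines are assumed to start with "##" and to appear only before the header, and whitespace around every field is trimmed.

```swift
import Foundation

// Splits one line into fields. Delimiters inside double quotes are kept,
// and a doubled quote ("") inside a quoted field becomes a literal quote.
func splitFields(_ line: String, delimiter: Character = ",") -> [String] {
    var fields: [String] = []
    var current = ""
    var insideQuotes = false
    let chars = Array(line)
    var i = 0
    while i < chars.count {
        let c = chars[i]
        if c == "\"" {
            if insideQuotes, i + 1 < chars.count, chars[i + 1] == "\"" {
                current.append("\"")      // escaped quote
                i += 1
            } else {
                insideQuotes.toggle()     // opening or closing quote
            }
        } else if c == delimiter, !insideQuotes {
            fields.append(current.trimmingCharacters(in: .whitespaces))
            current = ""
        } else {
            current.append(c)
        }
        i += 1
    }
    fields.append(current.trimmingCharacters(in: .whitespaces))
    return fields
}

// Collects leading comment lines, then treats the next line as the header
// and everything after it as data rows.
func parseTable(_ text: String, commentPrefix: String = "##")
    -> (comments: [String], header: [String], rows: [[String]]) {
    var comments: [String] = []
    var records: [[String]] = []
    var inPreamble = true
    for rawLine in text.split(whereSeparator: { $0.isNewline }) {
        let line = String(rawLine)
        if inPreamble && line.hasPrefix(commentPrefix) {
            comments.append(line)
            continue
        }
        inPreamble = false
        records.append(splitFields(line))
    }
    return (comments, records.first ?? [], Array(records.dropFirst()))
}
```

A field such as "Mike, Jim, Jane" comes back as a single string; calling splitFields on that string again turns it into an array. Tab-separated files would pass delimiter: "\t" to splitFields.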
I plan to make it look like any Decoder from the application's point of view, but internally in my decoder I can handle things as if they were key/value pairs, because there is just one set of keys, and that is the first line in the file (if there are no comments at the beginning). I still want to use CodingKeys though.
What are your thoughts? Should I implement it as a decoder (actually a TopLevelDecoder in Swift), or would that be an abuse of the idea of key/value decoding? The alternative is to implement this as a parser, but I have to handle several types of files (JSON, GraphQL, CSV and CSV-like files), and I think my application code would be a lot simpler if I could use Decoders for all the types of files.
For JSON there's no problem, since there is already a JSON decoder in Swift. For GraphQL it's not a problem either, because I can write a decoder with an unkeyed container. The problem files are those CSV and CSV-like files.
Some of them have everything in double quotes, both the "keys" in the CSV header and the values. Some only have double quotes for the keys, but not for the values. Some have comma-separated fields, and some tab-separated. Some have commas within fields, which need special handling. Some have comments at the beginning of the file, which need to be skipped before parsing the rest of the file as CSV.
Some files have two fields in the first column. I have no influence whatsoever of the format of these files, so I just have to deal with it.
If you wonder what files they are, I can tell you that they are files of raw DNA, files with DNA matches, files with common DNA segments with people I have matching DNA with. It's quite a few slightly different files, from several DNA testing companies. I wish they all had used JSON in a standard format, where all keys also were standard for all the companies. But they all have different CSV headers, and other differences.
I also have to decode Gedcom files, which sort of also has key/value coded pairs, but that format too doesn't conform to a pure key/value coding in the files.
Also: I have searched for others with similar problems, but found none exactly the same, so I didn't want to hijack their threads.
See this thread Advice for going from CSV > JSON > Swift objects
That was more of a question of how to convert from CSV to JSON and then to internal data structs in Swift. I know I can write a parser to solve this, but I think it would be more elegant to handle all these files with decoders, but I want your thoughts about it.
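For comparison, the CSV > JSON > Swift objects route mentioned above can be sketched in a few lines, which keeps Decodable and CodingKeys at the call site without writing a custom Decoder. This is an illustration only: decodeRows and its listColumns parameter are hypothetical, the header and rows are assumed to have been split already, and every non-list value is passed through as a String.

```swift
import Foundation

struct Family: Decodable {
    let name: String
    let parents: [String]
    let siblings: [String]

    // Maps the CSV column titles onto the property names.
    enum CodingKeys: String, CodingKey {
        case name = "Name"
        case parents = "Parents"
        case siblings = "Siblings"
    }
}

// Turns each row into a [column: value] dictionary, serializes the result
// as JSON, and lets JSONDecoder do the actual decoding.
// Columns named in `listColumns` are split on commas into arrays.
func decodeRows<T: Decodable>(_ type: T.Type,
                              header: [String],
                              rows: [[String]],
                              listColumns: Set<String>) throws -> [T] {
    let objects = rows.map { row -> [String: Any] in
        var object: [String: Any] = [:]
        for (key, value) in zip(header, row) {
            if listColumns.contains(key) {
                object[key] = value.split(separator: ",")
                    .map { $0.trimmingCharacters(in: .whitespaces) }
            } else {
                object[key] = value
            }
        }
        return object
    }
    let json = try JSONSerialization.data(withJSONObject: objects, options: [])
    return try JSONDecoder().decode([T].self, from: json)
}
```

The detour through JSON costs an extra serialization pass, so it is more a baseline to compare a dedicated decoder against than a replacement for one.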
I was also thinking of making a new protocol
protocol ColumnCodingKey: CodingKey {
}
I haven't decided yet what to have in the protocol, if anything.
It might work by just having it empty like in the example, and then let my decoder conform to it, then it maybe wouldn't be a very big violation of the key/value decoding.
Thanks in advance!
CSV files could be parsed using a regular expression. This might save you some time getting started. It's hard to know what you really need, because it looks like there are many different scenarios, and the list might grow further.
A regular expression to parse one line of a CSV file might look something like this
(?:(?:"(?:[^"]|"")*"|(?<=,)[^,]*(?=,))|^[^,]+|^(?=,)|[^,]+$|(?<=,)$)
Here is a detailed description on how it works with a javascript sample
Build a CSV parser

How would I convert a valid Javascript JSON into a dictionary in Swift 2?

I am trying to create a dictionary from this JSON in a variable of type [String: AnyObject].
I am using Alamofire to make the request. However, the responseJSON response handler doesn't work since it is not a 'valid' JSON object in Swift. How can I go about tackling this?
Your text is not valid JSON (you can check this here), as it's missing quotation marks around attribute strings. While it might be a JavaScript object, that's not synonymous with valid JSON. NSJSONSerialization (which is surely what's backing that function) will correctly reject the input.
You should fix your JSON - preferably at the source. You could do it by post-processing with string editing functions in Swift, but this is a bad idea.
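The difference is easy to check locally. The following is a small sketch in current Swift (in Swift 2 the class is spelled NSJSONSerialization); the helper name isValidJSON is made up for illustration.

```swift
import Foundation

// Returns true only if Foundation's JSON parser accepts the text.
func isValidJSON(_ text: String) -> Bool {
    guard let data = text.data(using: .utf8) else { return false }
    return (try? JSONSerialization.jsonObject(with: data, options: [])) != nil
}

let strict = "{\"name\": \"value\"}"   // keys quoted: valid JSON
let loose = "{name: \"value\"}"        // JavaScript object literal, not JSON
```

With these inputs, isValidJSON(strict) succeeds while isValidJSON(loose) fails, which is the same rejection the response handler runs into.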

Converting a String to a Splittable in GWT

I'm maintaining a site written in GWT (2.5.0) that is used internally by our development team, and I've been experimenting with using AutoBeans for client side json parsing. I have a few objects with json that is not well defined — a developer can dump whatever json string he wants in there — so I'm using a Splittable property. In order to support editing this arbitrary json I'd like to convert a String into a Splittable, but I haven't found a straight-forward way of accomplishing this. Do I need to implement this interface myself or resort to something hacky like wrapping the json in another json object I can then decode into a throw-away AutoBean just to get a Splittable of the original json?
StringQuoter is the utility class which we do much of our manual Splittable work with.
Just use StringQuoter.create("some string"); to produce a Splittable whose payload is
"some string"
Once you have that splittable, you can assign it to a key in another splittable with the following method:
Splittable.assign(Splittable parent, String propertyName);
However, if you are trying to convert some arbitrary string which contains a JSON structure into a splittable, use StringQuoter.split(..) to create it. The resulting splittable can be queried as normal (i.e. what keys exist/don't exist, etc).

Can I copy a fragment of the input XML using NSXMLParserDelegate protocol?

Using the NSXMLParserDelegate protocol for parsing XML is fine, however I have the need to copy verbatim a chunk of XML in an answer. What I would like to do is store everything between the beginning/end XML tags verbatim as an NSString object so I can replay this fragment in a future query.
Is this possible, or is the only solution to parse the tree manually, convert it to a temporary object, and then convert it back to an XML string for the future query?
One thing to note is that I'm not parsing incrementally the input, rather I'm creating the NSXMLParser object with the complete xml data, then calling parse on it. So maybe there's a way to correlate the position of didStartElement/didEndElement inside the original xml data so I can extract the subrange?
Both didStartElement and didEndElement are being passed an NSXMLParser which tracks the progress of the parsing through the lineNumber and columnNumber properties. Unfortunately there's no direct way to transform those line/column info to a buffer offset, but then as well you have to interpret the NSData with a specific encoding.
A solution is to transform the NSData into a buffer of unichar elements with the -[NSString getCharacters:range:] method. Then the unichar buffer can be iterated, scanning for newline elements until the line/column position matches the values reported by the NSXMLParser object. Doing this for the start and end tags gives you the unichar range of the XML contained between them.
This range can then be turned into an NSString and reused in future queries. The advantage is that the XML inside doesn't need to be parsed, since it is copied directly and is expected to be well formed.
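The line/column scan described above might be sketched like this in Swift. It is an illustration under assumptions rather than a tested recipe: utf16Offset is a hypothetical helper, line and column are treated as 1-based, only "\n" counts as a line break, and exactly where the parser's columnNumber points relative to a tag should be verified against real input before relying on it.

```swift
import Foundation

// Converts a 1-based (line, column) pair into an offset into the
// UTF-16 code units of `text`. Returns nil if the line does not exist
// or the resulting offset lies past the end of the text.
func utf16Offset(in text: String, line: Int, column: Int) -> Int? {
    guard line >= 1, column >= 1 else { return nil }
    let units = Array(text.utf16)
    var currentLine = 1
    var lineStart = 0
    var index = 0
    // Walk forward, counting newlines, until the requested line starts.
    while currentLine < line {
        guard index < units.count else { return nil }
        if units[index] == 0x0A {
            currentLine += 1
            lineStart = index + 1
        }
        index += 1
    }
    let offset = lineStart + column - 1
    return offset <= units.count ? offset : nil
}
```

Two such offsets, one recorded in didStartElement and one in didEndElement, delimit an NSRange that can be passed to NSString's substring(with:) to copy the fragment verbatim.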

Where can I find an Objective C code which parses any XML file without knowing before any tag or attribute?

I would like to find sample code written in Objective C for iPhone that can parse any XML file, even if we don't know the tags or attributes in advance. Does anyone have something like that?
You can probably convert the XML into an NSDictionary, which you can then use however you like.
I have not used this code, but maybe you can try this to convert your xml into dictionary
http://troybrant.net/blog/2010/09/simple-xml-to-nsdictionary-converter/
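The general idea behind such a converter is a delegate that keeps a stack of open elements. The sketch below shows it in Swift rather than Objective C and is not the linked code: GenericNode, TreeBuilder and parseTree are hypothetical names, and the result is a small tree instead of an NSDictionary.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML   // XMLParser lives here on Linux
#endif

// One element of the document: name, attributes, text and child elements.
final class GenericNode {
    let name: String
    let attributes: [String: String]
    var text = ""
    var children: [GenericNode] = []

    init(name: String, attributes: [String: String]) {
        self.name = name
        self.attributes = attributes
    }
}

// Builds the tree without knowing any tag or attribute names in advance.
final class TreeBuilder: NSObject, XMLParserDelegate {
    private(set) var root: GenericNode?
    private var stack: [GenericNode] = []   // currently open elements

    func parser(_ parser: XMLParser,
                didStartElement elementName: String,
                namespaceURI: String?,
                qualifiedName qName: String?,
                attributes attributeDict: [String: String] = [:]) {
        let node = GenericNode(name: elementName, attributes: attributeDict)
        if let parent = stack.last {
            parent.children.append(node)
        } else {
            root = node
        }
        stack.append(node)
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        // Text may arrive in several pieces, so append rather than assign.
        stack.last?.text += string
    }

    func parser(_ parser: XMLParser,
                didEndElement elementName: String,
                namespaceURI: String?,
                qualifiedName qName: String?) {
        _ = stack.popLast()
    }
}

func parseTree(_ data: Data) -> GenericNode? {
    let builder = TreeBuilder()
    let parser = XMLParser(data: data)
    parser.delegate = builder
    return parser.parse() ? builder.root : nil
}
```

Walking the returned tree recursively reaches every element, attribute and text value of the document.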