i want to parse large xml file in iPhone and i have tried touch xml , gData and other xml parsers including SAX parsers .. my app crashes after parsing xml files or during parsing xmls because it keeps 40 MB of data in memory
What would be the best way to parse large xml files ? i want to parse the data and insert it into core data.
Thanks Much
From this article (under "Which to Choose" section):
If you want to read extremely large XML documents, performance is the
critical issue here. You’ll want to consider libxml2 SAX, TBXML, or
libxml DOM for this, depending on what your exact situation is.
If you are parsing properly you shouldn't be holding 40MB of parsed data in memory. Be sure to batch insert the data into Core Data as you are parsing to avoid causing a memory warning and a crash of your app.
Related
I'm building an application which has a "record" feature which records user interaction over time. As time progresses, I fill an array in memory with "state" objects representing the current state of the user input. A typical recording will result in about 5k of these objects.
I then archive this data using NSKeyedArchiver archiveRootObject: toFile:. This works fine, however the file size is very large (3.5 megs or so). My question is this:
Is there any inherent file-size overhead involved in archiving files? Would I be able to save this data using much less disk space if I were to use SQLite, or even roll my own file format? Or is the only way to reduce the disk size of the data going to be to reduce the bit depth of the numbers I'm storing?
If your concern is performance, Core Data gives you more granularity. You can lazy load and save by parts during app execution vs loading/saving the whole 3.5Mb object graph.
If your concern is file size, this is the binary plist format, and this is the SQLite file format. But more important than the overhead, is how complex is the translation between your object graph and the Core Data model.
You may also be interested in this comparison of speed and performance for several file formats: https://github.com/eishay/jvm-serializers/wiki/ Not sure if everything there has an C, C++ or objective-C implementation.
3.5 MB isn't a very large file. However, if your app has to load or save a 3.5 MB file all the time, then using Core Data is a lot smarter as this allows you to save only the data that has changed and retrieve only the parts that you're interested in -- not the whole thing every time.
If storage is the main concern, there would be little difference b/w sqlite and core data.
I had to store UIViewControllers with state in an app, where I ended up not saving the serialized objects but saving only the most specific properties and creating a class which read that data and re-allocated those objects.
The property map was then stored in a csv [admittedly very difficult to manage, but small like anything] and then compressed.
I need a suggest how to operate with large amount of data on iPhone. Let say I have xml file with ~120k text records. I need to perform search on this data. The solution i have tried is to use Core Data to store information in sorted order in caches. And then use binary search which works fast. But the problem is to build this caches. On first launch application takes about 15-25 seconds to build this caches. Maybe I need to use different approach to search the data?
Thanks in advance.
If you're using an XML file with the requirement that you can't cache, then you're not going to succeed unless you somehow carefully format your XML file to have useful data traversal properties -- but then you may as well use a binary file that's more useful unless you have some very esoteric requirements.
Really what you want is one of the typical indexing algorithms (on disk hash, B-tree, etc) from the get-go.
However...
If you have to read in and parse your XML text file, then you can skirt using a typical big and slow generic XML parser and write a fast hackish version since most of the data records you'll need to recognize are probably formatted the same way over and over. Nothing special, just find where the relevant data fields start, grab the data until it ends, move on to the next data field.
Honestly, 120k of text isn't very much-- sounds like whatever XML parser you're using is just slow. (I use this trick all the time for autogenerated XML data that just represents things like tables or simple data records -- my own parser is faster than any generic XML parser.)
This is probably the solution you actually want since you sound fairly attached to the XML file format. It won't be as error-proof as a generic XML parser if you're not careful, however it will eat that 120KB file up like nobody's business. And it's entry level CS work -- read in a file with certain specific formatting and grab the data values from it. Regexps are your friend if you have access to them.
Try storing and doing your searches in the cloud. (using a database stored on a server somewhere)
Unless you specifically need ALL of the information on the device..
You may know here is a page comparing the XML parsers for Iphone,
http://www.raywenderlich.com/553/how-to-chose-the-best-xml-parser-for-your-iphone-project
The comparing words for these parsers are ;
"this one is for small XML files" "for large XML files" "for relatively small XML files"
Well what the heck on earth does that mean? for instance speed is important for me and my XML is expected to be around 300KB so is that small? big or relatively small? and what does large XML files mean in Iphone? 1 MB? 50 MB? 100MB? or even 500KB?
I know there is not a strict distinction exist, but at least I need to have a rough idea what those adjectives means, I need to parse this XML in around 1-2 seconds in IPhone.
how should I choose to use one parser over another by looking at my file size and my speed requirements?
Thanks
We have an app that downloads a configuration file from a server that is frequently in excess of 1 MB. We use GDataXml to parse it, and it's relatively fast. 1MB of XML is kind of large for an XML file, but then again I'm sure large companies like WalMart, Tyson, etc. have apps that use massive XML files (possibly 50 MB). That really is a massive amount of text data though, and JSON may be a better alternative in terms of character use. Additionally, you can read the data straight from the file and shove it in an NSDictionary that you can then query. If you have control of the file output, consider JSON.
is there any easy way of store XML data into core data?
Currently, my app just pulls the values from the XML file directly, however, this isn't efficient for XML files which holds over 100 entries, thus storing the data in Core Data would be the best option. XML file is called/downloaded/parsed ever time the app opens.
With the Core Data, the XML data would be downloaded ever 3600 seconds or so, and refresh the current data in the core data, to reduce the loading time when opening the app.
Any ideas on how I can do this?
Having reviewed the developer documentation, it doesn't look very tasty.
I take that you mean you have to down load an xml file, parse it and then save the data encoded in the file? You have several options for saving such data.
If the data is relatively simple and static e.g. a repeating list of items, then you might just want to use a NSArray, NSSet or NSDictionary (or some nested combination) and then just write the resulting collection to disk as a plist using the collection classes writeToFile: methods. Then when the data is needed you just use one of the initWithFile: methods. The disadvantage of this system is that you have to read the entire file back into memory to use it. This system doesn't scale for very large data sets.
If the data is complex e.g. a bunch of separate but highly interrelated chunks of data, and moderately large, then Core Data would be better.
Of course, you always have the option of writing the downloaded file straight to disk as a string if you want.
How do the memory utilization and performance for XML or SQLite compare on the iPhone?
The initial data set for our application is 500 records with no more than 750 characters each.
How well would XML compare with SQLite for accessing say record 397 without going through the first 396? I know SQLite3 would have a better methods for that, but how is the memory utilization?
When dealing with XML, you'll probably need to read the entire file into memory to parse it, as well as write out the entire file when you want to save. With SQLite and Core Data, you can query the database to extract only certain records, and can write only the records that have been changed or added. Additionally, Core Data makes it easy to do batched fetching.
These limited reads and writes can make your application much faster if it is using SQLite or Core Data for its data store, particularly if you take advantage of Core Data's batched fetching. As Graham says, specific numbers on performance can only be obtained by testing under your specific circumstances, but in general XML is significantly slower for all but the smallest data sets. Memory usage can also be much greater, due to the need to load and parse records you do not need at that instant.
To find out how the memory usage for your application fares, you need to measure your application :). The Instruments tool will help you.