My app temporarily stores a lot of sensitive data, and I want to overwrite that data in memory once I'm done with it.
I found a post earlier suggesting this:
char *block = malloc(200);
// ...fill block with the sensitive bytes...
NSString *string = [[NSString alloc] initWithBytesNoCopy:block
                                                  length:200
                                                encoding:NSUTF8StringEncoding
                                            freeWhenDone:NO]; // NO: we scrub and free it ourselves
// use string
memset(block, 0, 200); // overwrite block with 0
[string release];
free(block);
but this does not work for me, because I collect the data in many different ways. For example:
mySensibleString = [anotherString substringWithRange:NSMakeRange(5,15)];
or even get it through a HTTPS connection:
- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data
{
    // Append the new data to receivedData.
    // receivedData is an instance variable declared elsewhere.
    [receivedData appendData:data];
}
So I am wondering whether there is any way to locate the sensitive data stored within the object in memory, find out its length, and overwrite that memory (no matter how complicated)?
There are far too many unknowns to make such an endeavour worthwhile. You don't know the route the data has taken before it got to you, where copies have been left in memory but not scrubbed, etc.
If you followed any of the news about iPhone security in the last year, you'll see it's got a pretty bad rep at the minute -- pretty flawed encryption, bad things happening like keyboard data possibly being retained for a long, long time, etc. I wouldn't bother if I were you!
Obviously, what data is written to disk is another matter and worth considering.
Bottom line: really, REALLY sensitive important data probably just shouldn't go anywhere near an iPhone (and maybe some other smart devices to boot).
What if you were to encrypt any data that is stored on the heap? Any time you store something in a variable or a data structure, pass it through a cipher, and any time you want to use it, decrypt it.
Saves you the trouble of trying to wipe anything. Even if they can get to the data, if they don't know your encryption phrase/key/method then they can't read it.
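For instance, a minimal sketch of that idea using CommonCrypto (the helper name, the zero IV, and the raw key handling are illustrative only; real code needs a random IV and proper key storage):

#import <Foundation/Foundation.h>
#import <CommonCrypto/CommonCryptor.h>

// Pass kCCEncrypt when storing, kCCDecrypt just before use.
// key must be exactly kCCKeySizeAES128 (16) bytes.
static NSData *transformData(CCOperation op, NSData *input, NSData *key) {
    size_t capacity = [input length] + kCCBlockSizeAES128;   // room for padding
    NSMutableData *output = [NSMutableData dataWithLength:capacity];
    size_t moved = 0;
    CCCryptorStatus status = CCCrypt(op, kCCAlgorithmAES128, kCCOptionPKCS7Padding,
                                     [key bytes], kCCKeySizeAES128,
                                     NULL,                    // zero IV, for brevity only
                                     [input bytes], [input length],
                                     [output mutableBytes], capacity, &moved);
    if (status != kCCSuccess) return nil;
    [output setLength:moved];
    return output;
}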
I'm finishing up my app by running it through Instruments as well as stressing it with large amounts of data. The Instruments tests go fine, but the stress test is where I'm having issues. Without getting into too much detail, I'm giving my app increasing amounts of Core Data events with which it needs to extrapolate data, make graphs, and present locations on a MKMapView instance. I started small and increased to 56000 events, which it handled fine without any leaks or memory warnings (and I was quite proud of it for handling it all).
My app implements the Dropbox API to allow for uploading and downloading templates and data for sync purposes. Files uploaded from my app are converted from Core Data to an NSDictionary, then to NSData. I create a temporary folder for the data, then upload that file to Dropbox, which works fine... normally. If I try to upload my data file with 56000 events, it crashes. I've logged it and watched as the data is converted. It reaches the last event with no issues, but when it's supposed to start uploading to Dropbox, the app crashes and I cannot for the life of me figure out why. I see memory warnings pop up in my log. Typically, it will go Level=1, Level=2, Level=1, Level=2, then crash, which confuses me as it never reaches Level=3.
The majority of the information I've found is in my edit at the bottom. Below is some relevant code:
- (void)uploadSurveys:(NSDictionary *)dict {
    NSArray *templateArray = [dict objectForKey:@"templates"];
    NSArray *dataArray = [dict objectForKey:@"data"];
    NSString *filename;
    NSLog(@"upload called");
    if ([templateArray count] || [dataArray count]) {
        if ([templateArray count]) {
            // irrelevant code;
        }
        if ([dataArray count]) {
            SurveyData *survey;
            for (int i = 0; i < [dataArray count]; i++) {
                BOOL matchExists = NO;
                // ...... code to make sure no file exists in dropbox folder and creates new version if necessary;
                dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
                    NSData *data = [self convertSurvey:survey];
                    dispatch_async(dispatch_get_main_queue(), ^{
                        [self uploadData:data withFilename:filename];
                        NSLog(@"converted and uploading");
                    });
                });
            }
        }
    }
}
[self convertSurvey:survey] simply converts my Core Data object to NSData.
- (void)uploadData:(NSData *)data withFilename:(NSString *)filename {
    NSFileManager *manager = [NSFileManager defaultManager];
    NSString *pathComponent = [NSString stringWithFormat:@"tempData.%@", filename];
    NSString *path = [NSTemporaryDirectory() stringByAppendingPathComponent:pathComponent];
    if ([manager createFileAtPath:path contents:data attributes:nil]) {
        [self.restClient uploadFile:filename toPath:[NSString stringWithFormat:@"/%@", currentSearch] fromPath:path];
        NSLog(@"uploading data");
    }
}
Any help would be much appreciated and I thoroughly thank you in advance. I'm just trying to figure out if I'm either taking the wrong approach for large files or if it's simply not allowed. If I have to split the files, that is fine, but I'd prefer to know what is going on that prevents my app from performing this action before I try to make a workaround. Thank you again.
UPDATE: As this issue is now the only hindrance to the release of my application, I'm adding a bounty to this question to hopefully get a solution or workaround. It will be up for a week, after which I am most likely going to just split up the files as they upload to ensure that this apparent size limit is not reached. This approach is not ideal, which is why a better solution is very welcome, but it is my backup plan if this fails to bring in something more convenient.
EDIT: It appears that NSTemporaryDirectory plays no part in this at all. Here is the new situation. As you can see in the code above, NSData *data = [self convertSurvey:survey]; is called on a secondary thread (which isn't the issue). I had been logging the objects created and knew the last one was reached, but never thought to check whether the NSData object was actually returned. It turns out it isn't. In short, I convert all my Core Data objects into arrays and place them into a dictionary (only for the relevant survey/data to be converted). This does indeed work, and the dictionary is created. Then I create the NSData object using NSData *data = [NSKeyedArchiver archivedDataWithRootObject:d]; where d is my dictionary. Directly after that, I call return data; to set the value of NSData *data = [self convertSurvey:survey];. This being the case, it appears NSData or NSKeyedArchiver is at fault here. According to the Apple documentation:
Using 32-bit Cocoa, the size of the data is subject to a theoretical 2GB limit (in practice, because memory will be used by other objects this limit will be smaller); using 64-bit Cocoa, the size of the data is subject to a theoretical limit of about 8EB (in practice, the limit should not be a factor).
I have checked the file sizes in small increments to see where the failure occurs. I have successfully gotten 48.2MB of data through, but not 51.5MB, which leads me to believe that the issue occurs around 50MB, well below the theoretical limit for NSData (unless there is a discrepancy between iOS and OS X in that respect).
Hopefully this new information will help to solve this problem.
The 2 GB limit for NSData is completely theoretical on iOS: even the iPhone 4 has only 512 MB of RAM, and iOS (unlike Mac OS X) cannot swap, so if your physical RAM fills up, you crash (or your app is terminated before that).
The 50 MB NSData object alone is already very large and it's not the only object you have in memory – given that you convert the data from Core Data to a dictionary representation and then to NSData, you probably consume at least twice as much memory (likely more). The system and other apps also need RAM, so you're probably reaching a limit.
Try running your app in Instruments to see how much memory you actually consume.
To reduce your peak memory usage, you have a couple of options that largely depend on your data model:
As Jason Foreman suggested in his answer, try to avoid having your whole file in memory at once. Using NSFileHandle, you can write chunks of data to a file without needing to have the whole data in memory at once. Of course, this requires that you prepare your data accordingly, so that it can be split into chunks. A higher-level approach might be to serialize your data into an XML format that you could write out as a stream. If your data format is very simple, something like CSV might also work.
Don't use NSData for uploading to Dropbox. Write your data to a file instead (see above) and point the Dropbox SDK to that file. The Dropbox SDK makes it pretty easy to do so (DBRestClient has an uploadFile:toPath:fromPath: method); see the sketch after this list.
If your data model makes it difficult to take a streaming approach, try to segment the data into more manageable parts. You could then use your old method of serializing dictionaries, just with multiple files.
Be careful with Core Data's memory usage. Try to re-fault objects using refreshObject:mergeChanges: if possible to break cyclic references within your data (see the Core Data Programming Guide for details).
Avoid letting autoreleased objects pile up while you're in a long-running loop: create a separate NSAutoreleasePool and drain it in each iteration of your loop.
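To make the first two points concrete, here is a rough sketch (the surveys array and the per-survey chunking via convertSurvey: are assumptions about your model, not tested code):

// Serialize one survey at a time into a temp file, then let the Dropbox SDK
// upload straight from disk so no 50+ MB NSData ever sits in memory.
NSString *path = [NSTemporaryDirectory() stringByAppendingPathComponent:filename];
[[NSFileManager defaultManager] createFileAtPath:path contents:nil attributes:nil];
NSFileHandle *handle = [NSFileHandle fileHandleForWritingAtPath:path];
for (SurveyData *survey in surveys) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    [handle writeData:[self convertSurvey:survey]];  // one small chunk per pass
    [pool drain];                                    // flush temporaries each iteration
}
[handle closeFile];
[self.restClient uploadFile:filename
                     toPath:[NSString stringWithFormat:@"/%@", currentSearch]
                   fromPath:path];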
A way to work around this type of memory pressure is to build your APIs using streams, both for writing your converted data to a file on disk and also for uploading the data to a web service.
During conversion you can use an NSOutputStream to write chunks of data to the file to avoid keeping a large chunk of data in memory at one time. Then, NSMutableURLRequest can accept an NSStream for the body instead of an NSData, so you should create an NSInputStream to read your file back from disk and upload it.
Using streams in this way will ensure you never have 50+ MB of data loaded and should avoid the memory warnings you are seeing.
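A minimal sketch of that pipeline, assuming the converted data has already been written to a file at path (the URL here is a placeholder):

NSMutableURLRequest *request = [NSMutableURLRequest requestWithURL:
    [NSURL URLWithString:@"https://example.com/upload"]];
[request setHTTPMethod:@"POST"];
// The body is read from disk as the upload progresses, not loaded up front:
[request setHTTPBodyStream:[NSInputStream inputStreamWithFileAtPath:path]];
NSDictionary *attrs = [[NSFileManager defaultManager] attributesOfItemAtPath:path
                                                                       error:NULL];
[request setValue:[[attrs objectForKey:NSFileSize] stringValue]
    forHTTPHeaderField:@"Content-Length"];
[NSURLConnection connectionWithRequest:request delegate:self];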
I need to read several dozen files and do some trivial processing with their contents. Each file individually won't cause problems, but having all the data loaded at once will quickly exhaust my memory.
I started with:
for (NSString *filename in filenames)
    do_something([NSData dataWithContentsOfFile:filename]);
Then of course, I remembered that Objective-C on the iPhone is not really garbage collected, and those would all stick around until the end of the frame anyway. Okay:
for (NSString *filename in filenames) {
    NSData *d = [[NSData alloc] initWithContentsOfFile:filename];
    do_something(d);
    [d release];
}
This nominally only uses as much memory as the largest file, but that assumes the allocator is playing friendly at the moment; it could also thrash and fragment everything.
Is there some way I can make an NSMutableData, and keep reusing that Data's buffer, growing it as necessary? I need it as an NSData for other third-party APIs. The best idea I have at the moment is mallocing/reallocing a char* buffer as I go, reading using e.g. stdio, and constructing NSDatas with freeWhenDone:NO backed by that; that way I only thrash/retain a small amount per file.
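For reference, the malloc/realloc idea I have in mind looks roughly like this (error handling elided; do_something must not retain the NSData past the call, since the buffer is reused on the next iteration):

char *buffer = NULL;
size_t capacity = 0;
for (NSString *filename in filenames) {
    FILE *f = fopen([filename fileSystemRepresentation], "rb");
    if (!f) continue;
    fseek(f, 0, SEEK_END);
    size_t size = (size_t)ftell(f);
    rewind(f);
    if (size > capacity) {                       // grow, never shrink
        buffer = realloc(buffer, size);
        capacity = size;
    }
    fread(buffer, 1, size, f);
    fclose(f);
    // freeWhenDone:NO: the NSData borrows the buffer instead of owning it
    do_something([NSData dataWithBytesNoCopy:buffer length:size freeWhenDone:NO]);
}
free(buffer);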
What you are doing in the second example is fine. Even if you reused an NSMutableData object for its capacity, another NSData object would need to be created with the file contents. If you are running into memory issues, consider modifying do_something() to work with NSInputStreams.
You could use -[NSData initWithContentsOfMappedFile:] with your second example to keep the memory usage as low as possible.
From the documentation:
A mapped file uses virtual memory techniques to avoid copying pages of the file into memory until they are actually needed.
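Dropping that into the loop from the question:

for (NSString *filename in filenames) {
    NSData *d = [[NSData alloc] initWithContentsOfMappedFile:filename];
    do_something(d);   // pages are faulted in only as they are touched
    [d release];
}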
I've been looking through the Apple documentation for the NSData class, and I didn't really find it too enlightening. I know how to use the class, but I don't really understand the gravity of the advantages that it may or may not provide. I know it's a simple question, but perhaps it would be good to have such information as a reference.
Advantages over what? Certainly, it's useful to represent an arbitrary block of data as an object just as it's useful to represent a string, a number, or a value as an object. Memory management becomes simpler and is consistent with memory management for all other objects, and there are a number of useful methods defined.
Say you want to read a binary file into memory. We won't worry about the reasons why -- there are as many reasons as there are data file formats. You'll have to:
Check the size of the file
Allocate a block of memory of the proper size
Open the file
Read the contents into memory
Close the file
Remember to free the memory when you're done with it (a condition that can sometimes be tricky to detect)
(Optional) Worry about whether the block of memory has been modified
With NSData, you can just create a new instance from a path or URL and not have to think about the rest.
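For example, the whole checklist above collapses to a line or two (the path is a placeholder):

NSError *error = nil;
NSData *data = [NSData dataWithContentsOfFile:@"/path/to/file"
                                      options:0
                                        error:&error];
// data is released like any other object: no manual free(), no size bookkeeping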
After testing my app with Instruments I realized that the current CSV parser I use has a huge memory footprint. Does anybody have a recommendation for one with a low memory footprint?
You probably should do this row-by-row, rather than reading the whole file, parsing it, and returning an array with all the rows in it. In any case, the code you linked to produces zillions of temporary objects in a loop, which means it'll have very high memory overhead.
A quick fix would be to create an NSAutoreleasePool at the top of the loop, and drain it at the bottom:
while (![scanner isAtEnd]) {
    NSAutoreleasePool *innerPool = [[NSAutoreleasePool alloc] init];
    // ... bunch of code ...
    [innerPool drain];
}
This will wipe out the temporary objects, so your memory usage will be the size of the data, plus an object for each string in the file (roughly 8 bytes * rows * columns).
There are some other CSV parsers to try:
http://michael.stapelberg.de/cCSVParse
http://cocoawithlove.com/2009/11/writing-parser-using-nsscanner-csv.html (my own blog)
You could experiment to see if either is lower memory overhead.
Neither of these supports "event based" parsing. In event based parsing, you never load the whole source file into memory, just enough of the file to read the current row (you can also do this in-progress on a download). You must handle each row as it is read and make certain all data from the source is freed between rows.
This would be the theoretical lowest overhead solution. If you really needed low overhead, you should adapt an existing solution to do that (I don't have any advice on how this would be done).
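A bare-bones sketch of the event-based approach (handleRow and csvPath are placeholders; real CSV also needs quoted-field handling, which this skips):

NSInputStream *stream = [NSInputStream inputStreamWithFileAtPath:csvPath];
[stream open];
NSMutableData *pending = [NSMutableData data];   // holds at most one partial row
uint8_t buffer[4096];
NSInteger n;
while ((n = [stream read:buffer maxLength:sizeof(buffer)]) > 0) {
    [pending appendBytes:buffer length:n];
    const uint8_t *bytes = [pending bytes];
    NSUInteger start = 0;
    for (NSUInteger i = 0; i < [pending length]; i++) {
        if (bytes[i] == '\n') {                  // a complete row is available
            NSString *row = [[[NSString alloc] initWithBytes:bytes + start
                                                      length:i - start
                                                    encoding:NSUTF8StringEncoding] autorelease];
            handleRow([row componentsSeparatedByString:@","]); // your per-row callback
            start = i + 1;
        }
    }
    // Discard consumed rows; keep the trailing partial row for the next read.
    [pending replaceBytesInRange:NSMakeRange(0, start) withBytes:NULL length:0];
}
[stream close];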
It's not a CSV parser, but my open source Cocoa ParseKit framework has a powerful/convenient/configurable string tokenizer which might be handy for CSV or other types of parsing/tokenizing.
The framework:
http://parsekit.com
Some usage documentation:
http://parsekit.com/tokenization.html
The PKTokenizer class:
http://github.com/itod/parsekit/blob/master/include/ParseKit/PKTokenizer.h
http://github.com/itod/parsekit/blob/master/src/PKTokenizer.m
How would I go about using Bluetooth to transfer a Core Data entity with its corresponding relationships? I have three Core Data entities with inverse relationships set up and it all works fine, but I need to transfer these to another iPhone when the entity is not already in the corresponding table of the Core Data entity set on the other iPhone. I know how to transfer simple things such as strings and integers over Bluetooth, but this is on a whole new level, and I only started programming for iPhone around 4 months ago. Thanks for all your help, you experts!
EDIT:
Thanks, but for some reason I keep getting this error! What should I do?
2010-02-12 21:24:14.907 PitScout[92918:207] Failed to call designated initializer on NSManagedObject class 'Team'
2010-02-12 21:24:14.907 PitScout[92918:207] *** -[Team setTeamNumber:]: unrecognized selector sent to instance 0x112b630
2010-02-12 21:24:14.908 PitScout[92918:207] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '*** -[Team setTeamNumber:]: unrecognized selector sent to instance 0x112b630'
Thanks.
You will need to serialize your objects in some way to transfer and then re-insert into a context on the other side. I suggest looking into the NSCoding protocol and examples which will allow you to use NSKeyedArchiver and NSKeyedUnarchiver to serialize your objects to NSData for transfer (or base64 encoded to an NSString if necessary).
First make sure your model object implements NSCoding:
@interface MyObject : NSManagedObject <NSCoding>
And then implement the following methods in your model object to handle the encoding and decoding of the objects:
- (id)initWithCoder:(NSCoder *)coder
{
    if (self = [self init])
    {
        self.myProperty = [coder decodeObjectForKey:@"myProperty"];
    }
    return self;
}

- (void)encodeWithCoder:(NSCoder *)coder
{
    [coder encodeObject:self.myProperty forKey:@"myProperty"];
}
Use NSKeyedArchiver to serialize your object to NSData:
NSData *data = [NSKeyedArchiver archivedDataWithRootObject:myObject];
Use NSKeyedUnarchiver to deserialize:
MyObject *myObject = (MyObject *)[NSKeyedUnarchiver unarchiveObjectWithData:myData];
If a string is required then you'll have to base64 encode and decode the NSData, see this post for details on that: How do I do base64 encoding on iphone-sdk?
Trying to serialize NSManagedObject instances is going to fail because they are tied directly to the NSManagedObjectContext that they come from.
You will need to translate them into another data structure and then transmit them. Both JSON and XML work very well for this, since you can use KVC to get the data out of an NSManagedObject and into an NSDictionary, which can then easily be translated into the intermediate format.
Once you have them in the intermediate format and sent over the wire then you can easily reconstruct them into the destination NSManagedObjectContext without issue.
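For illustration, a sketch of that route using an XML property list as the intermediate format (attribute-only; relationships still need to be flattened by hand, all attribute values must be plist-friendly types, and context is your destination NSManagedObjectContext):

// Sending side: attributes -> dictionary -> XML plist bytes.
// Assumes no nil attributes (KVC returns NSNull for those, which plists reject).
NSArray *keys = [[[myObject entity] attributesByName] allKeys];
NSDictionary *dict = [myObject dictionaryWithValuesForKeys:keys];
NSData *payload = [NSPropertyListSerialization dataWithPropertyList:dict
                                                             format:NSPropertyListXMLFormat_v1_0
                                                            options:0
                                                              error:NULL];
// ...transmit payload over Bluetooth, then on the receiving side...
NSDictionary *incoming = [NSPropertyListSerialization propertyListWithData:payload
                                                                   options:NSPropertyListImmutable
                                                                    format:NULL
                                                                     error:NULL];
Team *team = [NSEntityDescription insertNewObjectForEntityForName:@"Team"
                                            inManagedObjectContext:context];
[team setValuesForKeysWithDictionary:incoming];

Note that insertNewObjectForEntityForName: goes through NSManagedObject's designated initializer, which also sidesteps the "Failed to call designated initializer" error in the question.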
It may be overkill for this, but a method that has yet to fail me is SLIP (RFC 1055, the 1988 version). For years I have used it to map blocks of data into a 7- or 8-bit ASCII stream for transmission over every medium I have encountered, then used the inverse (or some modification of it) to convert the stream back to the needed configuration on the other end. Example code in C is in the RFC. I always used Phil Karn's suggestion to use the same character for both the start and end of a packet.
That way only one routine is needed to deal with the stream: it gobbles up characters until the SOP/EOP is encountered. This was chosen to deal with noise that can accumulate on the input of radio links as they sit idle awaiting data. Phil addresses that in other writings.
I usually use \x0D or \x0A (whichever the system the debugging tools run on uses as a carriage return) and the ever-popular backslash '\' as the escape character. Now and then it is handy to use another control code, or to use different values for the control characters to reduce the packet size. Using the system's own line ending allows a terminal program, with the SLIP code added and a few modifications, to function as a monitor and as a tool to enter packets into the stream by hand.
I have always found I had enough options if the first character in the packet indicated the options to the other end. Of course some form of error checking, and either error recovery or the ability to retransmit a mangled packet, must be provided. For small packets of data sent over highly reliable links a simple checksum might do; in the case of transmissions using three mineralized volcanoes as antenna sites that are a bit farther apart than one would like, a highly redundant Forward Error Correction algorithm is right at home.
SLIP is versatile enough to take data from a 16-bit Motorola 68HC11 and reconstruct it on a 32-bit Intel system, if the programmer reverses the endianness and takes care of the offset between 16- and 32-bit data.
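For reference, the standard RFC 1055 framing looks like this in C (Gordon's variant swaps in CR or LF for END and the backslash for ESC, but the byte-stuffing logic is identical):

#include <stddef.h>

#define SLIP_END     0xC0   /* frame delimiter */
#define SLIP_ESC     0xDB   /* escape */
#define SLIP_ESC_END 0xDC   /* END, escaped */
#define SLIP_ESC_ESC 0xDD   /* ESC, escaped */

/* Emit one framed packet via putbyte() (e.g. a UART write routine). */
void slip_send(const unsigned char *p, size_t len, void (*putbyte)(unsigned char))
{
    putbyte(SLIP_END);                /* flush any line noise, per Phil Karn */
    while (len--) {
        switch (*p) {
        case SLIP_END: putbyte(SLIP_ESC); putbyte(SLIP_ESC_END); break;
        case SLIP_ESC: putbyte(SLIP_ESC); putbyte(SLIP_ESC_ESC); break;
        default:       putbyte(*p);   break;
        }
        p++;
    }
    putbyte(SLIP_END);                /* the same byte also ends the frame */
}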
Gordon Couger
Stillwater, OK