Encrypting SQLite Database file in iPhone OS - iphone

Any SQLite database on the iPhone is simply a file bundled with the application. It is relatively simple for anyone to extract this file and query it.
What are your suggestions for encrypting either the file or the data stored within the database.
Edit: The App is a game that will be played against other users. Information about a users relative strengths and weaknesses will be stored in the DB. I don't want a user to be able to jail-break the phone up their reputation/power etc then win the tournament/league etc (NB: Trying to be vague as the idea is under NDA).
I don't need military encryption, I just don't want to store things in plain text.
Edit 2: A little more clarification, my main goals are
Make it non-trivial to hack sensitive data
Have a simple way to discover if data has been altered (some kind of checksum)

You cannot trust the client, period. If your standalone app can decrypt it, so will they. Either put the data on a server or don't bother, as the number of people who actually crack it to enhance stats will be minuscule, and they should probably be rewarded for the effort anyway!
Put a string in the database saying "please don't cheat".

There are at least two easier approaches here (both complimentary) that avoid encrypting values or in-memory databases:
#1 - ipa crack detection
Avoid the technical (and legal) hassle of encrypting the database and/or the contents and just determine if the app is pirated and disable the network/scoring/ranking aspects of the game. See the following for more details:
http://thwart-ipa-cracks.blogspot.com/2008/11/detection.html
#2 - data integrity verification
Alternatively store a HMAC/salted hash of the important columns in each row when saving your data (and in your initial sqlite db). When loading each row, verify the data against the HMAC/hash and if verification fails act accordingly.
Neither approach will force you to fill out the encryption export forms required by Apple/US government.
Score submission
Don't forget you'll need to do something similar for the actual score submissions to protect against values coming from something other than your app. You can see an implementation of this in the cocos2d-iphone and cocoslive frameworks at http://code.google.com/p/cocos2d-iphone/ and http://code.google.com/p/cocoslive/
Response to comments
There is no solution here that will 100% prevent data tampering. If that is a requirement, the client needs to be view only and all state and logic must be calculated on a trusted server. Depending on the application, extra anti-cheat mechanisms will be required on the client.
There are a number of books on developing massively-multiplayer games that discuss these issues.
Having a hash with a known secret in the code is likely a reasonable approach (at least, when considering the type of applications that generally exist on the App Store).

Like Kendall said, including the key on the device is basically asking to get cracked. However, there are folks who have their reasons for obfuscating data with a key on-device. If you're determined to do it, you might consider using SQLCipher for your implementation. It's a build of SQLite that provides transparent, page-level encryption of the entire DB. There's a tutorial over on Mobile Orchard for using it in iPhone apps.

How likely do you think it is that your normal user will be doing this? I assume you're going through the app store, which means that everything is signed/encrypted before getting on to the user's device. They would have to jailbreak their device to get access to your database.
What sort of data are you storing such that it needs encryption? If it contains passwords that the user entered, then you don't really need to encrypt them; the user will not need to find out their own password. If it's generic BLOB data that you only want the user to access through the application, it could be as simple as storing an encrypted blob using the security API.
If it's the whole database you want secured, then you'd still want to use the security api, but on the whole file instead, and decrypt the file as necessary before opening it. The issue here is that if the application closes without cleanup, you're left with a decrypted file.
You may want to take a look at memory-resident databases, or temporary databases which you can create either using a template db or a hard-coded schema in the program (take a look at the documentation for sqlite3_open). The data could be decrypted, inserted into the temporary database, then delete the decrypted database. Do it in the opposite direction when closing the connection.
Edit:
You can cook up your own encryption scheme I'm sure with just a very simple security system by XOR-ing the data with a value stored in the app, and store a hash somewhere else to make sure it doesn't change, or something.

SQLCipher:
Based on my experience SQLCipher is the best option to encrypt the data base.
Once the key("PRAGMA key") is set SQLCipher will automatically encrypt all data in the database! Note that if you don't set a key then SQLCipher will operate identically to a standard SQLite database.
The call to sqlite3_key or "PRAGMA key" should occur as the first operation after opening the database. In most cases SQLCipher uses PBKDF2, a salted and iterated key derivation function, to obtain the encryption key. Alternately, an application can tell SQLCipher to use a specific binary key in blob notation (note that SQLCipher requires exactly 256 bits of key material), i.e.
Reference:
http://sqlcipher.net/ios-tutorial
I hope someone would save time on exploring about this

Ignoring the philosophical and export issues, I'd suggest that you'd be better off encrypting the data in the table directly.
You need to obfuscate the decryption key(s) in your code. Typically, this means breaking them into pieces and encoding the strings in hex and using functions to assemble the pieces of the key together.
For the algorithm, I'd use a trusted implementation of AES for whatever language you're using.
Maybe this one for C#:
http://msdn.microsoft.com/en-us/magazine/cc164055.aspx
Finally, you need to be aware of the limitations of the approach. Namely, the decryption key is a weak link, it will be available in memory at run-time in clear text. (At a minimum) It has to be so that you can use it. The implementation of your encryption scheme is another weakness--any flaws there are flaws in your code too. As several other people have pointed out your client-server communications are suspect too.
You should remember that your executable can be examined in a hex editor where cleartext strings will leap out of the random junk that is your compiled code. And that many languages (like C# for example) can be reverse-compiled and all that will be missing are the comments.
All that said, encrypting your data will raise the bar for cheating a bit. How much depends on how careful you are; but even so a determined adversary will still break your encryption and cheat. Furthermore, they will probably write a tool to make it easy if your game is popular; leaving you with an arms-race scenario at that point.
Regarding a checksum value, you can compute a checksum based on the sum of the values in a row assuming that you have enough numeric values in your database to do so. Or, for an bunch of boolean values you can store them in a varbinary field and use the bitwise exclusive operator ^ to compare them--you should end up with 0s.
For example,
for numeric columns,
2|3|5|7| with a checksum column | 17 |
for booleans,
0|1|0|1| with a checksum column | 0101 |
If you do this, you can even add a summary row at the end that sums your checksums. Although this can be problematic if you are constantly adding new records. You can also convert strings to their ANSI/UNICODE components and sum these too.
Then when you want to check the checksum simple do a select like so:
Select *
FROM OrigTable
right outer join
(select pk, (col1 + col2 + col3) as OnTheFlyChecksum, PreComputedChecksum from OrigTable) OT on OrigTable.pk = OT.pk
where OT.OnTheFlyChecksum = OT.PreComputedChecksum

It appears to be simplest to sync all tournament results to all iPhones in the tournament. You can do it during every game: before a game, if the databases of two phones contradict each other, the warning is shown.
If the User A falsifies the result if his game with User B, this result will propagate until B eventually sees it with the warning that A's data don't match with his phone. He then can go and beat up explain to A that his behavior isn't right, just the way it is in real life if somebody cheats.
When you compute the final tournament results, show the warning, name names, and throw out all games with contradictory results. This takes away the incentive to cheat.
As said before, encryption won't solve the problem since you can't trust the client. Even if your average person can't use disassembler, all it takes is one motivated person and whatever encryption you have will be broken.

Yet, if on windows platform, you also can select SQLiteEncrypt to satisfy your needs.SQLiteEncrypt extends sqlite encryption support, but you can treat it as original sqlite3 c library.

Related

How do I store Argon2 passwords in my database?

I'm trying to store user passwords in my DB using Argon2 algorithm.
This is what I obtain by using it:
$echo -n "password" | argon2 "smallsalt" -id -t 4 -m 18 -p 4
Type: Argon2id
Iterations: 4
Memory: 262144 KiB
Parallelism: 4
Hash: cb4447d91dd62b085a555e13ebcc6f04f4c666388606b2c401ddf803055f54ac
Encoded: $argon2id$v=19$m=262144,t=4,p=4$c21hbGxzYWx0$y0RH2R3WKwhaVV4T68xvBPTGZjiGBrLEAd34AwVfVKw
1.486 seconds
Verification ok
In this case, what should I store in the DB?
The "encoded" value as shown above?
The "hash" value as shown above?
Neither, but another solution?
Please, could you help me? I'm a newbie with this and I'm a little bit lost.
I'm a bit late to the party, but I disagree with the previous answers.
You should store the field: Encoded
The $argon2id$.... value.
(At least if you are using normal Argon2 libraries having the verify() function.
It does not look like the man-page for argon2 command does this, however.
Only if you are stuck with the command line, you should consider storing each field individually.)
The $argon2id$ encoded hash
The argon2 encoded hash follows the same as its older cousin bcrypt's syntax.
The encoded hash includes all you ever need to verify the hash when the user logs in.
It is most likely more future proof. When a newer and better argon2 comes along: You can upgrade your one column hashed passwords. Just like you could detect bcrypt's $2a$-hashes, and re-hash them as $argon2id$-hashes, next time the user logs in. (If you were moving from bcrypt to agron2.)
TL;DR
Store the $-encoded value encoded_hash in your database.
Use argon2.verify(password, encoded_hash) to verify that the password is correct.
Don't bother about all the values inside the hash. Let the library do that for you. :)
Neither. Save following as a single value:
algorithm ID (e.g. argon2id)
salt
number of iterations (4)
memory usage factor (18)
parallelism (4)
The output of the field "encoded" is misleading because you cannot use it as is for password check (i.e. for hash generation), e.g. m=262144 where as for password check you need the original factor m=18.
Are you going to launch an OS process each time you check password? I would discourage you from doing this. I'd suggest you use a library (C++, Java, ...). They produce a string that contains all these data concatenated and separated with "$".
I'd put the type, iterations, memory, parallelism, hash, salt and corresponding user id into separate columns and leave the encoded bit out, because it's just all the attributes joined together. If they're in separate columns then you can reference the attributes more easily than having to split and index the encoded string.
The other option is to just store the encoded string in 1 column, but as I said its more tedious to look at certain attributes, as you'd have to split the encoded string and then index it.
I had the same question and read this post while gathering some information. Now after some days and thoughts about all this, I'll personally take a different route than the accepted answer and therefore slightly disagree with it. I thought I would share my perspective so that it might help others as well.
I suppose it will depend on everyone's context. I don't think there is a one size fits all answer here. I'm sure there are situations where it is perfectly valid and even better/simpler to store the encoded string ($argon2...).
However, I would argue that depending on the context, storing the encoded string doesn't seem to be the right approach.
First of all, it makes the hashing method very obvious. It is probably not that important but for some reasons it makes me a bit more comfortable not having it ^^. But, more importantly, it means that implementation details are stored in your persistence layer (db or else). At the time of writing, argon2id is the recommended hashing mechanism by OWASP but these things can change (eventually do change...). Some day, it might be considered unsecure, or another function will be considered more secure.
As a result, I would suggest this more function "agnostic" starting point:
The hash (for argon2 -> the hex string)
The salt
The last_modified date
A string with hashing parameters (for argon2, you could put the parameters here in the form of your choosing)
The last_modified allows to know if the hash needs updating or not and the parameters allows to support the verification and update of "old" hashes.
Of course this means that you have to work a bit more in the code and can't simply use every libraries shortcuts straight away. But, I would say that this increased complexity offer more flexibility in other circumstances (like moving away from a given hashing function). As always there are no free lunch.
That's why I suppose it depends on your context and why personally I wouldn't go with the accepted answer in my situation.
PS: I'm no cryptography expert nor some devsecop guru. So feel free to contradict, enrich, agree or disagree. I just like to keep my implementation details contained ;)

iphone game loading/saving/configuring level

I would like to set up the configuration of each level of my iphone game through a plist or some kind of flat file. One drawback of doing this is that user can potentially open up the app and change the flat file. I am now thinking of hard coding it as an instance of say Config class. Is that a good idea? What is the conventional approach for saving/loading/configuring levels?
The approach I've often used is two-fold: First, write the data out as data, rather than text, simply to make it a bit less obvious what it is. If you're using a plist, you can serialize it as an NSData element. Secondly, create a hash (SHA-1, etc) of the data, salted and/or concatenated with some value internal to your program, and store the hash either along side the data or somewhere else. Then when the data is read back in, you can validate that it hasn't been tampered with.
You could obfuscate the data in various ways, or actually encrypt it before storing it. Encryption has export ramifications however.

How practical would it be to repeatedly encrypt a given file?

I'm currently experimenting with both public-key and personal file encryption. The programs I use have 2048 bit RSA and 256 bit AES level encryption respectively. As a newbie to this stuff (I've only been a cypherpunk for about a month now - and am a little new to information systems) I'm not familiar with RSA algorithms, but that's not relevant here.
I know that unless some secret lab or NSA program happens to have a quantum computer, it is currently impossible to brute force hack the level of security these programs provide, but I was wondering how much more secure it would be to encrypt a file over and over again.
In a nutshell, what I would like to know is this:
When I encrypt a file using 256-bit AES, and then encrypt the already encrypted file once more (using 256 again), do I now have the equivalent of 512-bit AES security? This is pretty much a question of whether or not the the number of possible keys a brute force method would potentially have to test would be 2 x 2 to the 256th power or 2 to the 256th power squared. Being pessimistic, I think it is the former but I was wondering if 512-AES really is achievable by simply encrypting with 256-AES twice?
Once a file is encrypted several times so that you must keep using different keys or keep putting in passwords at each level of encryption, would someone** even recognize if they have gotten through the first level of encryption? I was thinking that perhaps - if one were to encrypt a file several times requiring several different passwords - a cracker would not have any way of knowing if they have even broken through the first level of encryption since all they would have would still be an encrypted file.
Here's an example:
Decrypted file
DKE$jptid UiWe
oxfialehv u%uk
Pretend for a moment that the last sequence is what a cracker had to work with - to brute-force their way back to the original file, the result they would have to get (prior to cracking through the next level of encryption) would still appear to be a totally useless file (the second line) once they break through the first level of encryption. Does this mean that anyone attempting to use brute-force would have no way of getting back to the original file since they presumably would still see nothing but encrypted files?
These are basically two questions that deal with the same thing: the effect of encrypting the same file over and over again. I have searched the web to find out what effect repeated encryption has on making a file secure, but aside from reading an anecdote somewhere that the answer to the first question is no, I have found nothing that pertains to the second spin on the same topic. I am especially curious about that last question.
**Assuming hypothetically that they somehow brute-forced their way through weak passwords - since this appears to be a technological possibility with 256-AES right now if you know how to make secure ones...
In general, if you encrypt a file with k-bit AES then again with k-bit AES, you only get (k+1) bits of security, rather than 2k bits of security, with a man-in-the-middle attack. The same holds for most types of encryption, like DES. (Note that triple-DES is not simply three rounds of encryption for this reason.)
Further, encrypting a file with method A and then with method B need not be even as strong as encrypting with method B alone! (This would rarely be the case unless method A is seriously flawed, though.) In contrast, you are guaranteed to be at least as strong as method A. (Anyone remembering the name of this theorem is encouraged to leave a comment; I've forgotten.)
Usually you're much better off simply choosing a single method as strong as possible.
For your second question: Yes, with most methods, an attacker would know that the first layer had been compromised.
More an opinion here...
First, when computer are strong enough to do a brute-force attack on AES-256 for example, it will be also for iterations of the same... doubling or tripling the time or effort is insignificant at that level.
Next, such considerations can be void depending on the application you are trying to use this encryption in... The "secrets" you will need to carry become bigger (number of iterations and all the different keys you will need, if in fact they are different), the time to do the encryption and the decryption will also need to increase.
My hunch is that iterating the encryption does not help much. Either the algorithm is strong enough to sustain a brute-force attach or it is not. The rest is all in the protection of the keys.
More practically, do you think your house is more protected if you have three identical or similar locks on your front door ? (and that includes number of keys for you to carry around, don't loose those keys, make sure windows and back door are secured also...)
Question 1:
The size of the solution space is going to be the same for two passes of the 256-bit key as the 512-bit key, since 2^(256+256) = 2^512
The actual running time of each decrypt() may increase non-linearly as the key-size grows (it would depend on the algorithm), in this case I think brute forcing the 256+256 would run faster than the 2^512, but would still be infeasible.
Question 2:
There are probably ways to identify certain ciphertext. I wouldn't be surprised if many algorithms leave some signature or artifacts that could be used for identification.

iPhone app -- are plists the way to handle default values and other languages?

I wrote my first program almost fifty years ago (yes, coding is still a blast, managing big projects with many programmers was not), but my Von Neumann thinking gets in the way.
I want to (a) load default values and (b) account for multiple languages more elegantly (?) than 60-plus iterations of NSLocalizedString. Can I park all of this data into what amounts to a record with fields like this: (key value stuff), (tweak-able user prompt / screen name / whatever), (tasteful default), (user-supplied value)? NSUserDefault has worked well so far; Core Data looks like overkill (?), and sql lite, well, where's Oracle when you need it?
It is certainly possible to store this information in plists and make them localizable; right-click the plist in the Groups & Files window -> get info then select 'Add Localization' at the bottom left.
Enter the country code you would like to support and xCode will go ahead and create a language specific version of the resource.
Your code doesn't need to know about any of this since your app will know it supports a language (when you make the file localized) so any plist key value requests that exist already will be mapped to the appropriate value (depending on the current language).
Same applies for your xibs etc.
Personally, I use NSLocalizedString for strings generated inside the code and plists for resources, since it is easier to get the strings I need translated to the translators this way (Can't assume they can edit a plist).
Hope this helps

Hashes vs Numeric id's

When creating a web application that some how displays the display of a unique identifier for a recurring entity (videos on YouTube, or book section on a site like mine), would it be better to use a uniform length identifier like a hash or the unique key of the item in the database (1, 2, 3, etc).
Besides revealing a little, what I think is immaterial, information about the internals of your app, why would using a hash be better than just using the unique id?
In short: Which is better to use as a publicly displayed unique identifier - a hash value, or a unique key from the database?
Edit: I'm opening up this question again because Dmitriy brought up the good point of not tying down the naming to db specific property. Will this sort of tie down prevent me from optimizing/normalizing the database in the future?
The platform uses php/python with ISAM /w MySQL.
Unless you're trying to hide the state of your internal object ID counter, hashes are needlessly slow (to generate and to compare), needlessly long, needlessly ugly, and needlessly capable of colliding. GUIDs are also long and ugly, making them just as unsuitable for human consumption as hashes are.
For inventory-like things, just use a sequential (or sharded) counter instead. If you migrate to a different database, you will just have to initialize the new counter to a value at least as large as your largest existing record ID. Pretty much every database server gives you a way to do this.
If you are trying to hide the state of your counter, perhaps because you're counting users and don't want competitors to know how many you have, I suggest avoiding the display of your internal IDs. If you insist on displaying them and don't want the drawbacks of a hash, you might consider using a maximal-period linear feedback shift register to generate IDs.
I typically use hashes if I don't want the user to be able to guess the next ID in the series. But for your book sections, I'd stick with numerical id's.
Using hashes is preferable in case you need to rebuild your database for some reason, for example, and the ordering changes. The ordinal numbers will move around -- but the hashes will stay the same.
Not relying on the order you put things into a box, but on properties of the things, just seems.. safer.
But watch out for collisions, obviously.
With hashes you
Are free to merge the database with a similar one (or a backup), if necessary
Are not doing something that could help some guessing attacks even a bit
Are not disclosing more private information about the user than necessary, e.g. if somebody sees a user number 2 in your current database log in, they're getting information that he is an oldie.
(Provided that you use a long hash or a GUID,) greatly helping youself in case you're bought by YouTube and they decide to integrate your databases.
Helping yourself in case there appears a search engine that indexes by GUID.
Please let us know if the last 6 months brought you some clarity on this question...
Hashes aren't guaranteed to be unique, nor, I believe, consistent.
will your users have to remember/use the value? or are you looking at it from a security POV?
From a security perspective, it shouldn't matter - since you shouldn't just be relying on people not guessing a different but valid ID of something they shouldn't see in order to keep them out.
Yeah, I don't think you're looking for a hash - you're more likely looking for a Guid.If you're on the .Net platform, try System.Guid.
However, the most important reason not to use a Guid is for performance. Doing database joins and lookups on (long) strings is very suboptimal. Numbers are fast. So, unless you really need it, don't do it.
Hashes have the advantage that you can check if they are valid or not BEFORE performing any check to your database whether they exist or not. This can help you to fend off attacks with random hashes as you don't need to burden your database with fake lookups.
Therefor, if your hash has some kind of well-defined format with for example a checksum at the end, you can check if it's correct without needing to go to the database.