How safe is information contained within iPhone app compiled code? - iphone

I was discussing this with some friends and we began to wonder about this. Could someone gain access to URLs or other values that are contained in the actual objective-c code after they purchase your app?
Our initial feeling was no, but I wondered if anyone out there had definitive knowledge one way or the other?
I do know that .plist files are readily available.
Examples could be things like:
-URL values kept in a string
-API key and secret values

Yes, strings and information are easily extractable from compiled applications using the strings tool (see here), and it's actually even pretty easy to extract class information using class-dump-x (check here).
Just some food for thought.
Edit: one easy, albeit insecure, way of keeping your secret information hidden is obfuscating it, or cutting it up into small pieces.
The following code:
NSString *string = #"Hello, World!";
will produce "Hello, World!" using the strings tool.
Writing your code like this:
NSString *string = #"H";
string = [stringByAppendingString:#"el"];
string = [stringByAppendingString:#"lo"];
...
will show the characters typed, but not necessarily in order.
Again: easy to do, but not very secure.

When you purchase an app it is saved on your hard disk as "FooBar.ipa"; that file is actually in Zip format. You can unzip it and inspect the contents, including searching for strings in the executable. Try it! Constant values in your code are not compressed, encrypted, or scrambled in any way.

I know this has already been answered, but I want to give my own suggestion too.
Again, please remember that all obfuscation techniques are never 100% safe, and thus are not the best, but often they are "good enough" (depending on what you want to obfuscate). This means that a determined cracker will be able to read your strings anyways, but these techniques may stop the "casual cracker".
My other suggestion is to "crypt" the strings with a simple XOR. This is incredibly fast, and does not require any authorization if you are selling the app through the App Store (it does not fall into the categories of algorithms that require authorization for exporting them).
There are many snippets around for doing a XOR in Cocoa, see for example: http://iphonedevsdk.com/forum/iphone-sdk-development/11352-doing-an-xor-on-a-string.html
The key you use could be any string, be it a meaningless sequence of characters/bytes or something meaningful to confuse readers (e.g. use name of methods, such as "stringWithContentsOfFile:usedEncoding:error:").

Related

What is multicodec and how it is related to multihash?

I don't have any background with this subject.
To try to understand them better, I read:
Multihash
CIDv1: Multicodec prefix
From what I understand, the multihash is the algorithm used to hash (one way) the value. so it means, we can't go back (we can't decode the hash to the value).
Questions
I don't understand, in simple words, what is multicodec and if it's related to decoding the hash to a value (which makes no sense).
what is the motivation to multicodec prefix?
The multicodec is related to decoding the value the hash points to, if that makes it easier to understand. Don't worry, no magic hash decoding is happening ;). Remember we're making CIDs, and we can use CIDs to lookup content. However then we have the question of "how do we decode this data we just retrieved?", the multicodec solves that problem for us. Reading From Data to Data Structures might help clear up some confusion.
The multicodec prefix allows IPFS to evolve to support new and different encodings for the data that's actually put into IPFS. This refers to IPLD, and you can actually find the answer you're looking for under Links (with information about the codecs under Codecs):
For links we use a CID. A CID is an extension of multihash, in fact a multihash is part of a CID. We simply add a codec to a multihash that tells us what format the data is in (JSON, CBOR, Bitcoin, Ethereum, etc). This way, we can actually link between data in different formats and any link to data anyone ever gives us can be decoded so that it can become more than just a series of bytes.
CID is a standard that anyone can implement, even people that have no other interest in IPLD beyond the need for hash links to different data types can use it.

Should I prefer hashes or hashrefs in Perl?

I'm still learning perl.
For me it feels more "natural" to references to hashes rather than directly access them, because it is easier to pass references to a sub, (one variable can be passed instead of a list). Generally I prefer this approach to that where one directly accesses %hashes".
The question is, where (in what situations) is better to use plain %hashes, so
$hash{key} = $value;
instead of
$href->{key} = $value
Is here any speed, or any other "thing" what prefers to use %hashes and not $hashrefs? Or it is matter only of pure personal taste and TIMTOWTDI? Some examples, when is better to use %hash?
I think this kind of question is very legitimate: Programming languages such as Perl or C++ have come a long way and accumulated a lot of historical baggage, but people typically learn them from ahistorical, synchronous exposés. Hence they keep wondering why TIMTOWDI and WTF all these choices and what is better and what should be preferred?
So, before version 5, Perl didn't have references. It only had value types. References are an add-on to Perl 4, enabling lots more stuff to be written. Value types had to be retained, of course, to keep backward compatibility; and also, for simplicity's sake, because frequently you don't need the indirection that references are.
To answer your question:
Don't waste time thinking about the speed of Perl hash lists. They're fast. They're memory access. Accessing a database or the filesystem or the net, that is where your program will typically spend time.
In theory, a dereference operation should take a tiny bit of time, so tiny it shouldn't matter.
If you're curious, then benchmark. Don't draw too many conclusions from differences you might see. Things could look different on another release.
So there is no speed reason to favour references over value types or vice versa.
Is there any other reason? I'd say it's a question of style and taste. Personally, I prefer the syntax without -> accessors.
If you can use a plain hashes, to describe your data, you use a plain hash. However, when your data structure gets a bit more complex, you will need to use references.
Imagine a program where I'm storing information about inventory items, and how many I have in stock. A simple hash works quite well:
$item{XP232} = 324;
$item{BV348} = 145;
$item{ZZ310} = 485;
If all you're doing is creating quick programs that can read a file and store simple information for a report, there's no need to use references at all.
However, when things get more complex, you need references. For example, my program isn't just tracking my stock, I'm tracking all aspects of my inventory. Inventory items also have names, the company that creates them, etc. In this case, I'll want to have my hashes not pointing to a single data point (the number of items I have in stock), but a reference to a hash:
$item{XP232}->{DESCRIPTION} = "Blue Widget";
$item{XP232}->{IN_STOCK} = 324;
$item{XP232}->{MANUFACTURER} = "The Great American Widget Company";
$item{BV348}->{DESCRIPTION} = "A Small Purple Whatzit";
$item{BV348}->{IN_STOCK} = 145;
$item{BV348}->{MANUFACTURER} = "Acme Whatzit Company";
You can do all sorts of wacky things to do something like this (like have separate hashes for each field or put all fields in a single value separated by colons), but it's simply easier to use references to store these more complex structures.
For me the main reason to use $hashrefs to %hashes is the ability to give them meaningful names (a related idea would be name the references to an anonymous hash) which can help you separate data structures from program logic and make things easier to read and maintain.
If you end up with multiple levels of references (refs to refs?!) you start to loose this clean and readable advantage, though. As well, for short programs or modules, or at earlier stages of development where you are retesting things as you go, directly accessing the %hash can make things easier for simple debugging (print statements and the like) and avoiding accidental "action at a distance" issues so you can focus on "iterating" through your design, using references where appropriate.
In general though I think this is a great question because TIMTOWDI and TIMTOCWDI where C = "correct". Thanks for asking it and thanks for the answers.

How to shift bytes of an NSString?

I have a NSString like #"123456". I want to convert this string into byte array and then I want to shift some bytes using some arithmetic operations. Then I want to apply SHA256Hash on that and finally want to encrypt a string using the final result. I have tried many approaches but still got no success. I am very confused in this.If someone wants to look at code i'll post the code.
Edit:
My actual goal is to encrypt an string using AES256 encryption algorithm. And I want to generate my own key and I want to pass my own IV.
I assume you're trying achieve some kind of security. On the other hand it does not look like you're very familiar with the tools and methods you're using. This is a bad start.
Security is a very difficult thing to do—even for experienced developers. Maybe there's a way to reuse some existing implementation for your security needs.
My advice would be not to reinvent things, especially when they are as hard and as crucial as security.

POST/GET bindings in Racket

Is there a built-in way to get at POST/GET parameters in Racket? extract-binding and friends do what I want, but there's a dire note attached about potential security risks related to file uploads which concludes
Therefore, we recommend against their
use, but they are provided for
compatibility with old code.
The best I can figure is (and forgive me in advance)
(bytes->string/utf-8 (binding:form-value (bindings-assq (string->bytes/utf-8 "[field_name_here]") (request-bindings/raw req))))
but that seems unnecessarily complicated (and it seems like it would suffer from some of the same bugs documented in the Bindings section).
Is there a more-or-less standard, non-buggy way to get the value of a POST/GET-variable, given a field name and request? Or better yet, a way of getting back a collection of the POST/GET values as a list/hash/a-list? Barring either of those, is there a function that would do the same, but only for POST variables, ignoring GETs?
extract-binding is bad because it is case-insensitive, is very messy for inputs that return multiple times, doesn't have a way of dealing with file uploads, and automatically assumes everything is UTF-8, which isn't necessarily true. If you can accept those problems, feel free to use it.
The snippet you wrote works when the data is UTF-8 and when there is only one field return. You can define it is a function and avoid writing it many times.
In general, I recommend using formlets to deal with forms and their values.
Now your questions...
"Is there a more-or-less standard, non-buggy way to get the value of a POST/GET-variable, given a field name and request?"
What you have is the standard thing, although you wrongly assume that there is only one value. When there are multiple, you'll want to filter the bindings on the field name. Similarly, you don't need to turn the value into a string, you can leave it as bytes just fine.
"Or better yet, a way of getting back a collection of the POST/GET values as a list/hash/a-list?"
That's what request-bindings/raw does. It is a list of binding? objects. It doesn't make sense to turn it into a hash due to multiple value returns.
"Barring either of those, is there a function that would do the same, but only for POST variables, ignoring GETs?"
The Web server hides the difference between POSTs and GETs from you. You can inspect uri and raw post data to recover them, but you'd have to parse them yourself. I don't recommend it.
Jay

Basic Profanity Filter in Objective C for iPhone

How have you like minded individuals tackled the basic challenge of filtering profanity, obviously one can't possibly tackle every scenario but it would be nice to have one at the most basic level as a first line of defense.
In Obj-c I've got
NSString *tokens = [text componentsSeparatedByString:#" "];
And then I loop through each token to see if any of the keywords (I've got about 400 in a list) are found within each token.
Realising False positives are also a problem, if the word is a perfect match, its flagged as profanity otherwise if more than 3 words with profanity are found without being perfect matches it is also flagged as profanity.
Later on I will use a webservice that tackles the problem more precisely, but I really just need something basic. So if you wrote the word penis it would go yup naughty naughty, bad word written.
Obscenity Filters: Bad Idea, or Incredibly Intercoursing Bad Idea?
Jeff has an interesting article to consider before embarking on such a piece of code:
http://www.codinghorror.com/blog/2008/10/obscenity-filters-bad-idea-or-incredibly-intercoursing-bad-idea.html
I just have a suggestion for tokenizing the string. Your ways works well if the words are all separated by strings but that is rarely the case in most usage scenarios as you would normally have to deal with newlines, punctuation, etc. Try this if you are interested:
NSMutableCharacterSet *separators = [NSMutableCharacterSet punctuationCharacterSet];
[separators formUnionWithCharacterSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
NSArray *words = [bigString componentsSeparatedByCharactersInSet:separators];
Source: http://www.tech-recipes.com/rx/3418/cocoa-explode-break-nsstring-into-individual-words/
Well, searching in that manner is certainly not the most efficient way to search for profanity... a more efficient approach would be to construct a finite state automaton to detect the words, and run the text once through that FSA. You don't really need to split strings to find profanity, and all that splitting adds extra allocation and copying overhead that you don't need. Also, there may be common patterns in some of the blacklisted words, which you are not exploiting by searching each word individually.
That said, I think 400 words is quite a lot. Who, exactly, is your audience? What if a user has a medical question? Should such questions actually be disallowed? I can only think of a handful of words that would be considered profane in any context, so you might want to rethink the filtering.
A couple of things:
FSA won't necessarily work depending on how intelligent you want the filter to be
Regex are generally extremely slow depending on how many you want to run
400 words is somewhat low, depending on your needs and langauges
There are a number of extremely tricky cases to be careful of when filtering, particularly embedding of words such as "ASSume"
My company, Inversoft, builds a commercial filtering solution and it is quite intelligent. It doesn't use regex or FSA, but has a custom built fast-linear processing technology that makes it extremely fast and accurate (4,000+ messages per second). It also has over 600 English words in a number of categories including Slang, Racial Slurs, Drug, Gang, Religious, etc.
If you are looking for an intelligent context-aware solution with support, you should check out Clean Speak from Inversoft. Hooking it up to Obj-C should be simple using the XML WebService.