Is it possible to safely validate offline license keys clientside? - copy-protection

Is it possible to validate license keys on a client application in such a way that it becomes very difficult to crack?
Consider the following simple example:
var status = secure_function_that_checks_license();
if (status == "REGISTERED")
    print("Welcome, user");
else
    print("Access denied");
No matter how elaborate your function is, in the end you will always have to branch based on the result it gives.
This thread explains a bit more about generating and verifying keys but doesn't explain how to avoid the branching problem.
Is using some sort of online activation scheme the only way to do this securely?

First of all, remember that there is no preventing a crack, only stalling it: if your code is worth cracking, it will be cracked.
Now, in the obfuscation process there is a practice named inlining. It simply replaces your function call with the actual function body. This way your code will be harder to crack, since there is much more code to modify.
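To illustrate the idea, here is a hypothetical sketch in Python (all names, including compute_expected_key, are invented for the example): with inlining, the comparison is copied into every call site, so there is no single branch a cracker can patch once.

```python
def compute_expected_key() -> str:
    # Stand-in for whatever key derivation a real product would use.
    return "SECRET-1234"

# Without inlining: one central check -- a single branch to patch.
def check_license(key: str) -> bool:
    return key == compute_expected_key()

# With inlining: the obfuscator copies the check into every call site,
# so a cracker must find and patch each copy, not one shared function.
def feature_a(key: str) -> str:
    if key == compute_expected_key():  # inlined copy 1
        return "feature A unlocked"
    return "denied"

def feature_b(key: str) -> str:
    if key == compute_expected_key():  # inlined copy 2
        return "feature B unlocked"
    return "denied"
```

This only raises the cost of patching; as noted above, it stalls a determined cracker rather than stopping one.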

Related

Is input validation necessary?

This is a very naive question about input validation in general.
I learned about input validation techniques such as parse and validatestring. In fact, MATLAB's built-in functions are full of these validations and parsers, so I naturally assumed this is the professional way to develop code. With these techniques, you can be sure of the data format of input variables; otherwise your code will reject the inputs and return an error.
However, some people argue that if there is a problem in an input variable, the code will raise an error and stop anyway. You will notice the problem regardless, so what is the point of those complicated validations? Given that the validation code itself takes some effort and time, often with quite complicated flow control, I had to admit this opinion has a point. With massive input validation, the readability of the code may be compromised.
I would like to hear opinions from advanced users on this issue.
Here is my experience; I hope it matches best practice.
First of all, let me mention that I typically work in situations where I have full control and won't build my own UI, as @tom mentioned. In general, if at any point there is a large probability that your program gets junk inputs, it will be worth checking for them.
Some tradeoffs that I typically make to decide whether I should check my inputs:
Development time vs debug time
If erroneous inputs are hard to debug (for example, because they don't cause errors but just undesirable outcomes), the balance will typically be in favor of checking; otherwise not.
If you are not sure where you will end up (re)using the code, it may help to enforce any assumptions that are required on the input.
Development time vs runtime experience
If your code takes an hour to run and will break at the end when an invalid input value occurs, you will want to check for this at the beginning of the code.
If the code runs into an error while opening a file, the user may not immediately understand why; if you instead report that no valid filename was specified, the problem is easier to deal with.
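A minimal sketch of that fail-fast idea (Python, with an invented function name and error messages):

```python
import os

def long_analysis(input_path: str, iterations: int) -> None:
    # Fail fast: validate everything up front. Failing here costs
    # milliseconds; failing an hour into the run costs the whole run.
    if not os.path.isfile(input_path):
        raise ValueError(f"no valid filename specified: {input_path!r}")
    if iterations <= 0:
        raise ValueError(f"iterations must be positive, got {iterations}")
    # ... the hour-long computation itself would go here ...
```

The error messages name the actual problem ("no valid filename specified") rather than surfacing whatever low-level exception the file open would have thrown later.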
The really (really) short story:
Break your design down into user interface, business logic and data - (see MVC pattern)
In your UI layer, do "common sense" validation, e.g. if the input is a $ cost value then it should be >= 0, be able to be parsed into a decimal etc.
In your business logic layer, validate the value, e.g. the $ cost value might not be allowed to be greater than the profit margin (etc.)
In your data layer, validate the data operation, e.g. that insert operation succeeded
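The three layers above can be sketched as follows (a Python illustration with invented names, and an in-memory list standing in for the data layer):

```python
def ui_validate(raw: str) -> float:
    # UI layer: common-sense checks -- parseable, non-negative.
    try:
        cost = float(raw)
    except ValueError:
        raise ValueError("cost must be a number")
    if cost < 0:
        raise ValueError("cost must be >= 0")
    return cost

def business_validate(cost: float, profit_margin: float) -> float:
    # Business layer: domain rule -- cost may not exceed the margin.
    if cost > profit_margin:
        raise ValueError("cost exceeds profit margin")
    return cost

def data_store(cost: float, db: list) -> None:
    # Data layer: verify the insert operation actually succeeded.
    before = len(db)
    db.append(cost)
    if len(db) != before + 1:
        raise RuntimeError("insert failed")
```

Each layer only enforces rules it owns, so a value that passes the UI check can still be rejected by the business rule.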
The extra really short story: YES! Validate all inputs.
For extra reading credits see: this!

Do we use hash functions for anything 'critical' under the assumption that collisions will never occur?

I'm currently learning the basics of cryptography and started to wonder. I understand that if an attacker wanted to 'pretend to be you' they could theoretically find a collision for your password or whatever it may be that identifies you, then authenticate themselves with that hash value.
Are there any other less obvious uses for hash functions perhaps aside from information security where in the almost impossible off chance that a collision occurs something rather strange would happen? Or in fact are there any real world examples of when this has happened?
I wonder because from what I understand if we use a strong enough hash function we pretty much assume that a collision will certainly not happen... but what if it did? Do we ever use hash functions for anything 'critical'?
edit: This is purely a speculative question.
The number of attempts required would be so huge (as would the associated processing time) that a successful login by an unknown user is highly improbable.
To prevent that kind of attack, you can add security measures such as an enforced interval after three failed tries. With that in place, the time needed to carry out a complete attack to a successful result becomes too long for the attacker.
See http://en.wikipedia.org/wiki/Brute-force_attack.
Hash functions can also be used to create checksums; see http://en.wikipedia.org/wiki/Checksum.
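For example, a minimal file-checksum helper (a Python sketch using the standard library's hashlib; the function name is illustrative):

```python
import hashlib

def file_checksum(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks
    so that large files are not loaded into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()
```

Comparing the stored digest against a freshly computed one detects accidental corruption; detecting deliberate tampering additionally requires a collision-resistant hash and a signature over the digest.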
Hash functions are also used to digitally sign documents. Assume you have a document, such as a PDF, that you want your boss to sign. You would send it to him, he would sign it, and you could then send it on to someone else in his name.
Or you could prepare a special document that abuses a hash collision. You do not necessarily need a full hash collision: MD5, for example, processes the plaintext block by block when computing the hash. If you can find a collision for a single block, you have won.
Let's say "asdasd" causes a collision with "qweqwe".
You can create a PDF like so:
Headline
if("asdasd" == "asdasd")
Good text...
else
Evil text...
Your boss will see "Good text...". After you have his digital signature for this document, you replace one "asdasd" with "qweqwe".
Headline
if("asdasd" == "qweqwe")
Good text...
else
Evil text...
Now you can send the evil PDF with a valid signature.
It is not quite as easy as I have described, but I think you get the idea.
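The block-by-block weakness can be demonstrated with a toy hash (a deliberately weak Merkle-Damgard-style sketch in Python; real MD5 shares the chained block structure, but its compression function is far harder to collide):

```python
def toy_hash(data: bytes) -> int:
    # Toy Merkle-Damgard hash: a chaining state updated block by block.
    # MD5 is structurally similar, with 64-byte blocks and a much
    # stronger compression function.
    state = 0
    for block in data:                 # 1-byte "blocks" for simplicity
        state = (state + block) % 256  # deliberately weak compression
    return state

# Two different "blocks" that collide under the weak compression:
a, b = b"ab", b"ba"
assert toy_hash(a) == toy_hash(b)

# The key property: once the chaining states collide, any identical
# suffix keeps the digests equal -- which is why one colliding block
# is enough to swap inside a whole signed document.
suffix = b" ... rest of the signed PDF ..."
assert toy_hash(a + suffix) == toy_hash(b + suffix)
```

This is exactly the structure the forged-PDF trick exploits: the signature covers only the digest, and the digest survives the swap.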

Form-related problems

I am new to Lift and am considering whether I should investigate it more closely and start using it as my main platform for web development. However, I have a few "fears" that I would be happy to have dispelled first.
Security
Assume that I have the following snippet that generates a form. There are several fields and the user is allowed to edit just some of them.
def form(in: NodeSeq): NodeSeq = {
  val data = Data.get(...)
  <lift:children>
    Element 1: { textIf(data.el1, data.el1(_), isEditable("el1")) }<br />
    Element 2: { textIf(data.el2, data.el2(_), isEditable("el2")) }<br />
    Element 3: { textIf(data.el3, data.el3(_), isEditable("el3")) }<br />
    { button("Save", () => data.save) }
  </lift:children>
}

def textIf(label: String, handler: String => Any, editable: Boolean): NodeSeq =
  if (editable) text(label, handler) else Text(label)
Am I right that there is no vulnerability that would allow a user to change a value of some field even though the isEditable method assigned to that field evaluates to false?
Performance
What is the best approach to form processing in Lift? I really like defining anonymous functions as handlers for every field -- but how does it scale? I guess that for every handler a function is added to the session along with its closure, and it stays there until the form is posted back. Doesn't this introduce a potential performance issue for a service under high load (say, 200 requests per second)? And when do these handlers get freed (if the form isn't resubmitted and the user either closes the browser or navigates to another page)?
Thank you!
With regards to security, you are correct. When an input is created, a handler function is generated and stored server-side using a GUID identifier. The function is session specific, and closed over by your code - so it is not accessible by other users and would be hard to replay. In the case of your example, since no input is ever displayed - no function is ever generated, and therefore it would not be possible to change the value if isEditable is false.
As for performance, on a single machine, Lift performs incredibly well. It does however require session-aware load balancing to scale horizontally, since the handler functions do not easily serialize across machines. One thing to remember is that Lift is incredibly flexible, and you can also create stateless form processing if you need to (albeit, it will not be as secure). I have never seen too much of a memory hit with the applications we have created and deployed. I don't have too many hard stats available, but in this thread, David Pollak mentioned that demo.liftweb.net at the time had 214 open sessions consuming about 100MB of ram (500K/session).
Also, here is a link to the Lift book's chapter on Scalability, which also has some more info on security.
The closures and all the associated state are certainly cleaned up at session shutdown. Earlier than that -- I don't know. Anyway, it's not really a theoretical question -- it depends heavily on how users use web forms in practice. So, for a broader answer, I'd ask the question on the main Liftweb channel -- https://groups.google.com/forum/#!forum/liftweb
Also, you can use a "static" form if you want to. But AFAIK there are no problems with memory, and everybody uses the main approach to forms.
If you don't create the handler XML/HTML, the user won't be able to change the data, that's for sure. In your code, if I understood it correctly (I'm not sure), you don't create text(label, handler) when it's not needed, so everything is secure.

GWT RPC server side safe deserialization to check field size

Suppose I send objects of the following type from GWT client to server through RPC. The objects get stored to a database.
public class MyData2Server implements Serializable {
    private String myDataStr;

    public String getMyDataStr() { return myDataStr; }
    public void setMyDataStr(String newVal) { myDataStr = newVal; }
}
On the client side, I constrain the field myDataStr to be, say, 20 characters max.
I have been reading about web-application security. If I learned one thing, it is that client data should not be trusted and the server should therefore check it. So I feel I ought to check on the server that my field is indeed no longer than 20 characters, and otherwise abort the request, since I would know it must be an attack attempt (assuming no bug on the client side, of course).
So my questions are:
How important is it to actually check on the server side that my field is no longer than 20 characters? I mean, what are the chances/risks of an attack, and how bad could the consequences be? From what I have read, it looks like it could go as far as bringing the server down through overflow and denial of service, but not being a security expert, I could be misinterpreting.
Assuming I would not be wasting my time doing the field-size check on the server, how should one accomplish it? I seem to recall reading (sorry I no longer have the reference) that a naive check like
if (myData2ServerObject.getMyDataStr().length() > 20) throw new MyException();
is not the right way. Instead one would need to define (or override?) the method readObject(), something like in here. If so, again how should one do it within the context of an RPC call?
Thank you in advance.
How important is it to actually check on the server side my field is not longer than 20 characters?
It's 100% important, except maybe if you can trust the end-user 100% (e.g. some internal apps).
I mean what are the chances
Generally: increasing. The exact probability can only be assessed for your concrete scenario individually (i.e. no one here will be able to tell you, though I would also be interested in general statistics). What I can say is that tampering is trivially easy. It can be done in the JavaScript code (e.g. using Chrome's built-in dev tools debugger) or by editing the clearly visible HTTP request data.
/risks of an attack and how bad could the consequences be?
The risks can vary. The most direct risk can be evaluated by thinking: "What could you store and do, if you can set any field of any GWT-serializable object to any value?" This is not only about exceeding the size, but maybe tampering with the user ID etc.
From what I have read, it looks like it could go as far as bringing the server down through overflow and denial of service, but not being a security expert, I could be misinterpreting.
This is yet another level to deal with, and cannot be addressed with server side validation within the GWT RPC method implementation.
Instead one would need to define (or override?) the method readObject(), something like in here.
I don't think that's a good approach. It tries to accomplish two things, but can do neither of them very well. There are two kinds of checks on the server side that must be done:
On a low level, when the bytes come in (before they are converted by RemoteServiceServlet to a Java Object). This needs to be dealt with on every server, not only with GWT, and would need to be answered in a separate question (the answer could simply be a server setting for the maximum request size).
On a logical level, after you have the data in a Java object. For this, I would recommend a validation/authorization layer. One of the awesome features of GWT is that you can now use JSR 303 validation on both the server and client side. It doesn't cover every aspect (you would still have to test for user permissions), but it can cover your @Size(max = 20) use case.
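As a sketch of what such a logical-level layer enforces (illustrated here in Python with invented names; in GWT itself this would be the JSR 303 @Size annotation plus a separate permission test):

```python
MAX_LEN = 20

def validate_my_data(my_data_str: str, user_can_write: bool) -> None:
    # Logical-level checks on the already-deserialized value: roughly
    # what a @Size(max = 20) constraint plus a permission test enforce.
    if not user_can_write:
        raise PermissionError("user may not write this field")
    if len(my_data_str) > MAX_LEN:
        raise ValueError(f"field exceeds {MAX_LEN} characters")
```

The low-level request-size limit mentioned above still has to be configured separately on the server; this layer only sees data that already made it through deserialization.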

Which is better in PHP: suppress warnings with '@' or run extra checks with isset()?

For example, if I implement some simple object caching, which method is faster?
1. return isset($cache[$cls]) ? $cache[$cls] : $cache[$cls] = new $cls;
2. return @$cache[$cls] ?: $cache[$cls] = new $cls;
I read somewhere that @ takes significant time to execute (and I wonder why), especially when warnings/notices are actually being issued and suppressed. isset(), on the other hand, means an extra hash lookup. So which is better, and why?
I do want to keep E_NOTICE on globally, on both dev and production servers.
I wouldn't worry about which method is FASTER. That is a micro-optimization. I would worry more about which is more readable code and better coding practice.
I would certainly prefer your first option over the second, as your intent is much clearer. Also, it is best to avoid edge-condition problems by always explicitly testing variables to make sure you are getting what you expect. For example, what if the class stored in $cache[$cls] is not of type $cls?
Personally, if I typically would not expect the index on $cache to be unset, then I would also put error handling in there rather than using ternary operations. If I could reasonably expect that that index would be unset on a regular basis, then I would make class $cls behave as a singleton and have your code be something like
return $cls::get_instance();
The isset() approach is better. It is code that explicitly states the index may be undefined. Suppressing the error is sloppy coding.
According to the article 10 Performance Tips to Speed Up PHP, warnings take additional execution time, and it also claims the @ operator is "expensive."
Cleaning up warnings and errors beforehand can also keep you from
using @ error suppression, which is expensive.
Additionally, @ will not suppress errors with respect to custom error handlers:
http://www.php.net/manual/en/language.operators.errorcontrol.php
If you have set a custom error handler function with
set_error_handler() then it will still get called, but this custom
error handler can (and should) call error_reporting() which will
return 0 when the call that triggered the error was preceded by an @.
If the track_errors feature is enabled, any error message generated by
the expression will be saved in the variable $php_errormsg. This
variable will be overwritten on each error, so check early if you want
to use it.
@ temporarily changes the error_reporting state; that's why it is said to take time.
If you expect a certain value, the first thing to do to validate it, is to check that it is defined. If you have notices, it's probably because you're missing something. Using isset() is, in my opinion, a good practice.
I ran timing tests for both cases, using hash keys of various lengths, also using various hit/miss ratios for the hash table, plus with and without E_NOTICE.
The results were: with error_reporting(E_ALL) the isset() variant was faster than the # by some 20-30%. Platform used: command line PHP 5.4.7 on OS X 10.8.
However, with error_reporting(E_ALL & ~E_NOTICE) the difference was within 1-2% for short hash keys, and up to 10% for longer ones (16 chars).
Note that the first variant performs two hash-table lookups, whereas the variant with @ does only one.
Thus, @ is inferior in all scenarios, and I wonder if there are any plans to optimize it.
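The one-vs-two-lookup tradeoff is not PHP-specific. For comparison, here is a Python sketch (illustrative names) that keeps an explicit check while doing a single lookup on the hit path, with no error suppression at all:

```python
class_cache: dict = {}

def get_cached_two_lookups(cls_name: str, factory):
    # isset()-style: membership test plus read -- two hash lookups
    # on a cache hit.
    if cls_name in class_cache:
        return class_cache[cls_name]
    class_cache[cls_name] = factory()
    return class_cache[cls_name]

def get_cached_one_lookup(cls_name: str, factory):
    # EAFP style: one lookup on a hit, an exception on a miss --
    # explicit about the possibly-missing key, like isset(), but with
    # the single-lookup cost of the suppression variant.
    try:
        return class_cache[cls_name]
    except KeyError:
        obj = class_cache[cls_name] = factory()
        return obj
```

As with the PHP timings above, the difference only matters on very hot paths; readability should usually win.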
I think you have your priorities a little mixed up here.
First of all, if you want to get a real world test of which is faster - load test them. As stated though suppressing will probably be slower.
The problem here is that if you have performance issues with regular code, you should upgrade your hardware or optimize the overall logic of your code, rather than prevent proper execution and error checking.
Suppressing errors to steal the tiniest fraction of a speed gain won't do you any favours in the long run, especially if the error keeps happening time and time again and causes your app to run more slowly than if it were caught and fixed.