May the compiler optimize based on assert(...) expressions/contracts? - compiler-optimization

http://dlang.org/expression.html#AssertExpression
Regarding assert(0): "The optimization and code generation phases of compilation may assume that it is unreachable code."
The same documentation calls assert(0) a 'special case', which leaves it unclear whether other assertions get the same treatment.
Can the D compiler optimize based on general assert-ions made in contracts and elsewhere?
(as if I needed another reason to enjoy the in{} and out{} constructs, but it certainly would make me feel a little more giddy to know that writing them could make things go fwoosh-ier)

In theory, yes; in practice, I don't think it does, especially since the asserts are stripped out before even getting to the optimizer under dmd -release. I'm not sure about gdc and ldc, but I think they share this portion of the code.
The spec's 'special case' reference, by the way, is that assert(0) is still present, in some form, under the -release compile flag. It is translated into an illegal instruction there (asm { hlt; }; non-kernel programs on x86 aren't allowed to use that, so the program will segfault upon hitting it), whereas all other asserts are simply left out of the code entirely in -release mode.
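For illustration, a minimal sketch of the two behaviours (assuming dmd's documented -release handling; the exact halting instruction emitted can vary by compiler and platform):

int divide(int a, int b)
{
    assert(b != 0);   // ordinary assert: checked in debug builds, removed entirely by -release
    return a / b;
}

int sign(int x)
{
    if (x > 0) return 1;
    if (x < 0) return -1;
    if (x == 0) return 0;
    assert(0);        // "unreachable": kept even under -release as a halting/illegal instruction
}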

GDC certainly does optimise based on asserts. The if-conditions they imply make for much better code, even causing unnecessary code to disappear. Unfortunately, the way it is implemented at the moment, the entire assert can disappear in release build mode, so the compiler never sees the beneficial if-condition info and actually generates worse code in release than in debug mode! Ironic.

I have to admit that I've only looked at this effect with if-conditions in asserts in the function body; I haven't checked what effect in and out blocks have. The in and out contract blocks can be turned off with a command-line switch IIRC, so they are not even compiled; I think this possibly means the compiler doesn't even look at them, which is another thing that might affect code generation. I haven't looked at it.

But there is a feature here that I would very much like to see: that the truth of the assert condition (after checking that there is no side-effect code in the assert expression) is always injected into the compiler as an assumption, just as if an if statement had been seen, even in release mode. It would involve pretending you had just seen if (xxx) but suppressing the actual code generation for the test in release mode, with subsequent code feeling the beneficial effects of known truth values, value-range limits and so on.
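To make the "beneficial if-condition info" concrete, here is a rough sketch of the kind of thing meant; the names are made up, and whether the redundant check is actually folded away depends on the compiler version and flags:

immutable int[16] table = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];

int lookup(int i)
{
    assert(i >= 0 && i < 16);   // in a debug build the optimiser can treat this like an if-condition...
    if (i < 0)                  // ...so this redundant range check may be removed
        return -1;
    return table[i];
}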

Related

Why is 'throws' not type safe in Swift?

The biggest misunderstanding for me in Swift is the throws keyword. Consider the following piece of code:
func myUsefulFunction() throws
We cannot really understand what kind of error it will throw. The only thing we know is that it might throw some error. The only way to understand what the error might be is by looking at the documentation or checking the error at runtime.
But isn't this against Swift's nature? Swift has powerful generics and a type system to make the code expressive, yet it feels as if throws is exactly the opposite, because you cannot tell anything about the error from looking at the function signature.
Why is that so? Or have I missed something important and mistook the concept?
I was an early proponent of typed errors in Swift. This is how the Swift team convinced me I was wrong.
Strongly typed errors are fragile in ways that can lead to poor API evolution. If the API promises to throw only one of precisely 3 errors, then when a fourth error condition arises in a later release, I have a choice: I bury it somehow in the existing 3, or I force every caller to rewrite their error handling code to deal with it. Since it wasn't in the original 3, it probably isn't a very common condition, and this puts strong pressure on APIs not to expand their list of errors, particularly once a framework has extensive use over a long time (think: Foundation).
Of course with open enums, we can avoid that, but an open enum achieves none of the goals of a strongly typed error. It is basically an untyped error again because you still need a "default."
You might still say "at least I know where the error comes from with an open enum," but this tends to make things worse. Say I have a logging system and it tries to write and gets an IO error. What should it return? Swift doesn't have algebraic data types (I can't say () -> IOError | LoggingError), so I'd probably have to wrap IOError into LoggingError.IO(IOError) (which forces every layer to explicitly rewrap; you can't have rethrows very often). Even if it did have ADTs, do you really want IOError | MemoryError | LoggingError | UnexpectedError | ...? Once you have a few layers, I wind up with layer upon layer of wrapping of some underlying "root cause" that have to be painfully unwrapped to deal with.
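A sketch of the wrapping being described; the IOError/LoggingError types and the writeToDisk function here are made up for illustration:

enum IOError: Error { case diskFull }

enum LoggingError: Error {
    case io(IOError)           // the lower-level failure has to be re-wrapped
    case messageTooLong
}

func writeToDisk(_ message: String) throws { /* stand-in for the real IO layer */ }

func writeLog(_ message: String) throws {
    do {
        try writeToDisk(message)
    } catch let error as IOError {
        throw LoggingError.io(error)   // every layer repeats this wrap-and-rethrow dance
    }
}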
And how are you going to deal with it? In the overwhelming majority of cases, what do catch blocks look like?
} catch {
    logError(error)
    return
}
It is extremely uncommon for Cocoa programs (i.e. "apps") to dig deeply into the exact root cause of the error and perform different operations based on each precise case. There might be one or two cases that have a recovery, and the rest are things you couldn't do anything about anyway. (This is a common issue in Java with checked exceptions that aren't just Exception; it's not like no one has gone down this path before. I like Yegor Bugayenko's arguments for checked exceptions in Java, which basically argue for exactly the Swift solution as his preferred Java practice.)
This is not to say that there aren't cases where strongly typed errors would be extremely useful. But there are two answers to this: first, you're free to implement strongly typed errors on your own with an enum and get pretty good compiler enforcement. Not perfect (you still need a default catch outside the switch statement, but not inside), but pretty good if you follow some conventions on your own.
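A hedged sketch of that roll-your-own approach from the first point (ParseError and parse are invented names):

enum ParseError: Error {
    case unexpectedToken(String)
    case unexpectedEndOfInput
}

func parse(_ input: String) throws -> Int {
    guard !input.isEmpty else { throw ParseError.unexpectedEndOfInput }
    guard let value = Int(input) else { throw ParseError.unexpectedToken(input) }
    return value
}

do {
    _ = try parse("42")
} catch let error as ParseError {
    switch error {                        // the compiler enforces exhaustiveness inside the switch...
    case .unexpectedToken(let token):
        print("bad token: \(token)")
    case .unexpectedEndOfInput:
        print("input ended early")
    }
} catch {
    print("unexpected error: \(error)")   // ...but a default catch is still needed outside it
}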
Second, if this use case turns out to be important (and it might), it is not difficult to add strongly typed errors later for those cases without breaking the common cases that want fairly generic error handling. They would just add syntax:
func something() throws MyError { }
And callers would have to treat that as a strong type.
Last of all, for strongly typed errors to be of much use, Foundation would need to throw them, since it is the largest producer of errors in the system. (How often do you really create an NSError from scratch compared to dealing with one generated by Foundation?) That would be a massive overhaul of Foundation and very hard to keep compatible with existing code and ObjC. So typed errors would need to be absolutely fantastic at solving very common Cocoa problems to be worth considering as the default behavior. It couldn't be just a little nicer (let alone have the problems described above).
So none of this is to say that untyped errors are the 100% perfect solution to error handling in all cases. But these arguments convinced me that it was the right way to go in Swift today.
The choice is a deliberate design decision.
They did not want the situation where you don't need to declare exception throwing, as in Objective-C, C++ and C#, because that makes callers either assume all functions throw exceptions and include boilerplate to handle exceptions that might not happen, or just ignore the possibility of exceptions. Neither of these is ideal, and the second makes exceptions unusable except for the case when you want to terminate the program, because you can't guarantee that every function in the call stack has correctly deallocated resources when the stack is unwound.
The other extreme is the idea you have advocated, where each type of exception thrown can be declared. Unfortunately, people seem to object to the consequence of this, which is that you end up with large numbers of catch blocks so you can handle each type of exception. So, for instance, in Java, they will throw Exception, reducing the situation to the same as we have in Swift, or worse, they use unchecked exceptions so you can ignore the problem altogether. The GSON library is an example of the latter approach.
We chose to use unchecked exceptions to indicate a parsing failure. This is primarily done because usually the client can not recover from bad input, and hence forcing them to catch a checked exception results in sloppy code in the catch() block.
https://github.com/google/gson/blob/master/GsonDesignDocument.md
That is an egregiously bad decision. "Hi, you can't be trusted to do your own error handling, so your application should crash instead".
Personally, I think Swift gets the balance about right. You have to handle errors, but you don't have to write reams of catch statements to do it. If they went any further, people would find ways to subvert the mechanism.
The full rationale for the design decision is at https://github.com/apple/swift/blob/master/docs/ErrorHandlingRationale.rst
EDIT
There seem to be some people having problems with some of the things I have said. So here is an explanation.
There are two broad categories of reasons why a program might throw an exception.
Unexpected conditions in the environment external to the program, such as an IO error on a file or malformed data. These are errors that the application can usually handle, for example by reporting the error to the user and allowing them to choose a different course of action.
Errors in programming such as null pointer or array bound errors. The proper way to fix these is for the programmer to make a code change.
The second type of error should not, in general, be caught, because it indicates a false assumption about the environment that could mean the program's data is corrupt. There may be no way to continue safely, so you have to abort.
The first type of error usually can be recovered from, but in order to recover safely every stack frame has to be unwound correctly. That means the function corresponding to each stack frame must be aware that the functions it calls may throw an exception, and must take steps to ensure that everything gets cleaned up consistently if an exception is thrown, with, for example, a finally block or equivalent. If the compiler doesn't provide support for telling the programmer they have forgotten to plan for exceptions, the programmer won't always plan for exceptions and will write code that leaks resources or leaves data in an inconsistent state.
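In Swift, that per-frame cleanup is usually written with defer (playing the role of a finally block); a minimal sketch with made-up types:

final class Resource {
    func close() { /* release the underlying handle */ }
}

enum WorkError: Error { case failed }

func doWork(using resource: Resource) throws { /* may throw WorkError.failed */ }

func run() throws {
    let resource = Resource()
    defer { resource.close() }   // runs whether or not doWork throws,
                                 // so this frame always cleans up as the stack unwinds
    try doWork(using: resource)
}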
The reason the gson attitude is so appalling is that they are saying you can't recover from a parse error (actually, worse, they are telling you that you lack the skills to recover from a parse error). That is a ridiculous thing to assert; people attempt to parse invalid JSON files all the time. Is it a good thing that my program crashes if somebody selects an XML file by mistake? No, it isn't. It should report the problem and ask them to select a different file.
And the gson thing was, by the way, just an example of why using unchecked exceptions for errors you can recover from is bad. If I do want to recover from somebody selecting an XML file, I need to catch Java runtime exceptions, but which ones? Well I could look in the Gson docs to find out, assuming they are correct and up to date. If they had gone with checked exceptions, the API would tell me which exceptions to expect and the compiler would tell me if I don't handle them.
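To make that concrete in Java terms: Gson reports a parse failure with the unchecked JsonSyntaxException, so nothing in the signature tells you to catch it. The Config and ConfigLoader classes below are just illustrative:

import com.google.gson.Gson;
import com.google.gson.JsonSyntaxException;

class Config {
    String name;
}

class ConfigLoader {
    Config load(String json) {
        try {
            return new Gson().fromJson(json, Config.class);
        } catch (JsonSyntaxException e) {
            // Unchecked, so the compiler never forced this catch; you only know
            // to write it from reading the documentation.
            return new Config();   // e.g. fall back to defaults instead of crashing
        }
    }
}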

Which is better in PHP: suppress warnings with '@' or run extra checks with isset()?

For example, if I implement some simple object caching, which method is faster?
1. return isset($cache[$cls]) ? $cache[$cls] : $cache[$cls] = new $cls;
2. return @$cache[$cls] ?: $cache[$cls] = new $cls;
I read somewhere that @ takes significant time to execute (and I wonder why), especially when warnings/notices are actually being issued and suppressed. isset(), on the other hand, means an extra hash lookup. So which is better and why?
I do want to keep E_NOTICE on globally, both on dev and production servers.
I wouldn't worry about which method is FASTER. That is a micro-optimization. I would worry more about which is more readable code and better coding practice.
I would certainly prefer your first option over the second, as your intent is much clearer. Also, it's best to keep edge-condition problems away by always explicitly testing variables to make sure you are getting what you expect. For example, what if the class stored in $cache[$cls] is not of type $cls?
Personally, if I typically would not expect the index on $cache to be unset, then I would also put error handling in there rather than using ternary operations. If I could reasonably expect that that index would be unset on a regular basis, then I would make class $cls behave as a singleton and have your code be something like
return $cls::get_instance();
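A minimal sketch of such a singleton (Widget is a made-up class name):

class Widget
{
    private static $instance = null;

    public static function get_instance()
    {
        if (self::$instance === null) {
            self::$instance = new self();
        }
        return self::$instance;
    }

    private function __construct() {}
}

$widget = Widget::get_instance();   // or, with the class name in a variable: $cls::get_instance()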
The isset() approach is better. It is code that explicitly states the index may be undefined. Suppressing the error is sloppy coding.
According to the article 10 Performance Tips to Speed Up PHP, warnings take additional execution time, and it claims the @ operator is "expensive."
Cleaning up warnings and errors beforehand can also keep you from using @ error suppression, which is expensive.
Additionally, the @ will not suppress the errors with respect to custom error handlers:
http://www.php.net/manual/en/language.operators.errorcontrol.php
If you have set a custom error handler function with
set_error_handler() then it will still get called, but this custom
error handler can (and should) call error_reporting() which will
return 0 when the call that triggered the error was preceded by an @.
If the track_errors feature is enabled, any error message generated by
the expression will be saved in the variable $php_errormsg. This
variable will be overwritten on each error, so check early if you want
to use it.
@ temporarily changes the error_reporting state; that's why it is said to take time.
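For example, the handler pattern the manual describes looks roughly like this (note: on the PHP versions this question is about, error_reporting() returns 0 inside an @-suppressed call; PHP 8 returns a reduced mask instead):

set_error_handler(function ($severity, $message, $file, $line) {
    if (error_reporting() === 0) {
        // The statement that triggered this error was prefixed with @,
        // so honour the suppression and bail out cheaply.
        return false;
    }
    error_log("$message in $file on line $line");
    return true;   // true = handled, skip PHP's internal error handler
});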
If you expect a certain value, the first thing to do to validate it, is to check that it is defined. If you have notices, it's probably because you're missing something. Using isset() is, in my opinion, a good practice.
I ran timing tests for both cases, using hash keys of various lengths, also using various hit/miss ratios for the hash table, plus with and without E_NOTICE.
The results were: with error_reporting(E_ALL) the isset() variant was faster than the @ one by some 20-30%. Platform used: command line PHP 5.4.7 on OS X 10.8.
However, with error_reporting(E_ALL & ~E_NOTICE) the difference was within 1-2% for short hash keys, and up to 10% for longer ones (16 chars).
Note that the first variant executes 2 hash table lookups, whereas the variant with @ does only one lookup.
Thus, @ is inferior in all scenarios and I wonder if there are any plans to optimize it.
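For anyone who wants to reproduce this, a rough sketch of that kind of timing harness (not the exact script used above):

error_reporting(E_ALL);

$cache = array();
$keys  = array();
for ($i = 0; $i < 1000; $i++) {
    $keys[$i] = md5((string) $i);
    if ($i % 2 === 0) {              // roughly a 50% hit ratio
        $cache[$keys[$i]] = $i;
    }
}

$start = microtime(true);
for ($n = 0; $n < 100000; $n++) {
    $k = $keys[$n % 1000];
    $v = isset($cache[$k]) ? $cache[$k] : null;   // variant 1; for variant 2 use: $v = @$cache[$k];
}
printf("%.4f seconds\n", microtime(true) - $start);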
I think you have your priorities a little mixed up here.
First of all, if you want a real-world test of which is faster, load test them. As stated, though, suppressing will probably be slower.
The problem here is that if you have performance issues with regular code, you should be upgrading your hardware or optimizing the grand logic of your code, rather than preventing proper execution and error checking.
Suppressing errors to steal the tiniest fraction of a speed gain won't do you any favours in the long run. Especially if you think that this error may keep happening time and time again, and cause your app to run more slowly than if the error was caught and fixed.

Who is affected when bypassing Perl safe signals?

Do the risks caused by bypassing Perl safe signals, for example as shown in the second timeout example in the DBI documentation, concern only the code that does such bypassing?
The code in that example works hard to localize the change to just that section of code, or any code called from it.
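From memory, that DBI example has roughly this shape (connection details are placeholders; see the DBI docs for the exact code):

use DBI;
use POSIX qw(:signal_h);

my ($dsn, $user, $password) = ('dbi:Pg:dbname=test', 'user', 'secret');   # placeholders

# Temporarily install an unsafe (immediate) SIGALRM handler, remembering the old one.
my $mask      = POSIX::SigSet->new(SIGALRM);
my $action    = POSIX::SigAction->new(sub { die "connect timeout\n" }, $mask);
my $oldaction = POSIX::SigAction->new();
sigaction(SIGALRM, $action, $oldaction);

my $dbh;
eval {
    alarm(10);                                    # raise SIGALRM after 10 seconds
    $dbh = DBI->connect($dsn, $user, $password);  # the slow call being timed out
    alarm(0);
};
alarm(0);
sigaction(SIGALRM, $oldaction);                   # restore the previous (safe) handler right away
die $@ if $@ && $@ ne "connect timeout\n";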
There is no 100% guarantee that no code will be affected outside the code that bypasses safe signals, because signals are no longer safe. In the example, the call being timed out is a DBI->connect. For most DBDs this will be implemented mostly in C; unless the C code can handle being aborted and tried again, you might find that some data structures internal to the DBD, or the libraries it uses, are left in an inconsistent state.
The chances of the example code going wrong are probably incredibly tiny. My personal anecdote on the issue is that I had used the traditional Perl signal handling for years before safe signals were introduced, and for a long time I never had a problem. I hadn't even been very cautious about what I did in my signal handlers. Then we managed to hit a data set that actually did trigger memory corruption in about 1 out of every 100 runs. Just modifying the signal handlers to use better practices, similar to those in the example, eliminated our issues.
What does that even mean? By using unsafe signals, you can corrupt Perl's internals and Perl variables. It can also cause problems if a non-reentrant C library call is interrupted.
This can lead to SEGFAULTs and other problems, and those may only manifest themselves outside the block where the timeout is in effect.

Is Either the equivalent to checked exceptions?

Starting out in Scala and reading about Either, I naturally compare new concepts to something I know (in this case, from Java). Are there any differences between the concept of checked exceptions and Either?
In both cases
the possibility of failure is explicitly annotated in the method (throws or returning Either)
the programmer can handle the error case directly when it occurs or move it up (returning again an Either)
there is a way to inform the caller about the reason of the error
I suppose one uses for-comprehensions on Either to write code as if there were no errors, similar to checked exceptions.
I wonder if I am the only beginner who has trouble seeing the difference.
Thanks
Either can be used for more than just exceptions. For example, if you were to have a user either type input for you or specify a file containing that input, you could represent that as Either[String, File].
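For instance, a small sketch of that (inputSource is a made-up method):

import java.io.File

def inputSource(arg: String): Either[String, File] =
  if (new File(arg).exists) Right(new File(arg))   // a file containing the input
  else Left(arg)                                   // otherwise treat the argument itself as the input

inputSource("data.txt") match {
  case Right(file) => println("reading from file: " + file.getPath)
  case Left(text)  => println("using inline input: " + text)
}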
Either is very often used for exception handling. The main difference between Either and checked exceptions is that control flow with Either is always explicit. The compiler really won't let you forget that you are dealing with an Either; it won't collect Eithers from multiple places without you being aware of it, everything that is returned must be an Either, etc. Because of this, you use Either not when maybe something extraordinary will go wrong, but as a normal part of controlling program execution. Also, Either does not capture a stack trace, making it much more efficient than a typical exception.
One other difference is that exceptions can be used for control flow. Need to jump out of three nested loops? No problem--throw an exception (without a stack trace) and catch it on the outside. Need to jump out of five nested method calls? No problem! Either doesn't supply anything like this.
That said, as you've pointed out, there are a number of similarities. You can pass back information (though Either makes that trivial, while checked exceptions make you write your own class to store any extra information you want); you can pass the Either on or you can fold it into something else, etc.
So, in summary: although you can accomplish the same things with Either and checked exceptions with regards to explicit error handling, they are relatively different in practice. In particular, Either makes creating and passing back different states really easy, while checked exceptions are good at bypassing all your normal control flow to get back, hopefully, to somewhere that an extraordinary condition can be sensibly dealt with.
Either is equivalent to a checked exception in terms of the return signature forming an exclusive disjunction. The result can be a thrown exception X or an A. However, throwing an exception isn't equivalent to returning one – the first is not referentially transparent.
Where Scala's Either (as of 2.9) is not equivalent is bias: a return type is positively biased, requiring effort to extract/deconstruct the exception, while Either is unbiased; you need to explicitly ask for the left or right value. This is a topic of some discussion, and in practice a bit of a pain; consider the following three calls to Either-producing methods:
for {
  a <- eitherA("input").right
  b <- eitherB(a).right
  c <- eitherC(b).right
} yield c // Either[Exception, C]
You need to manually thread through the right-hand side by calling .right at each step. This may not seem that onerous, but in practice it is a pain and somewhat surprising to newcomers.
Yes, Either is a way to embed exceptions in a language: a set of operations that can fail can throw an error value to some non-local site.
In addition to the practical issues Rex mentioned, there are some extra things you get from the simple semantics of an Either:
Either forms a monad, so you can use monadic operations over sets of expressions that evaluate to Either, e.g. for short-circuiting evaluation without having to test the result at each step (see the sketch after this list)
Either is in the type -- so the type checker alone is sufficient to track incorrect handling of the value
Once you have the ability to return either an error message (Left s) or a successful value Right v, you can layer exceptions on top, as just Either plus an error handler, as is done for MonadError in Haskell.
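A hedged sketch of that layering in Scala terms (the helper methods are made up; the "error handler" is just the fold at the end):

def parsePort(s: String): Either[String, Int] =
  try Right(s.toInt)
  catch { case _: NumberFormatException => Left("not a number: " + s) }

def checkRange(p: Int): Either[String, Int] =
  if (p > 0 && p < 65536) Right(p) else Left("port out of range: " + p)

// Monadic short-circuiting: the first Left stops the pipeline...
val port = parsePort("8080").right.flatMap(checkRange)

// ...and a single handler at the edge deals with whichever error occurred.
port.fold(err => println("error: " + err), p => println("listening on port " + p))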

My code works in Debug mode, but not in Release mode

I have code in Visual Studio 2008 in C++ that works with files just via fopen and fclose.
Everything works perfectly in Debug mode, and I have tested it with several datasets.
But it doesn't work in Release mode; it crashes all the time.
I have turned off all the optimizations, there is no dependency on anything (in the linker), and I have also set these:
Optimization: Disabled (/Od)
Keep Unreferenced Data.
Do Not Remove Redundant
Optimize for Windows98: NO
I still keep wondering how it can fail to work under these circumstances.
What else should I turn off to let it work as in debug mode?
I think that if it worked in release mode but not in debug mode it might be a coding fault, but the other way around seems weird, doesn't it?
I appreciate any help.
--Nima
Debug modes often initialize heap data allocations. The program might be dependent on this behavior. Look for variables and buffers that are not getting initialized.
1) Double check any and all code that depends on preprocessor macros.
2) Use assert() to verify program state preconditions. Asserts must not be expected to impact program flow (i.e. removing the check would still allow the code to provide the same end result), because assert is a macro that compiles to nothing in release builds. Use regular run-time conditionals when an assert won't do (see the sketch after this list).
3) Indeed, never leave a variable in an uninitialized state.
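A minimal sketch of the assert pitfall from point 2 (Visual Studio's Release configuration defines NDEBUG by default, which makes assert expand to nothing):

#include <cassert>
#include <cstdio>

bool loadConfig()   // stand-in for some call with an important side effect
{
    std::puts("config loaded");
    return true;
}

int main()
{
    // Bad: under NDEBUG the whole expression vanishes, so loadConfig() never runs.
    assert(loadConfig());

    // Better: do the work unconditionally and assert only on the result.
    bool ok = loadConfig();
    assert(ok);
    (void)ok;   // avoid an unused-variable warning in release builds
}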
By far the most likely explanation is differing undefined behavior in the two modes caused by uninitialized memory. Lack of thread safety and problems with synchronization code can also exhibit this kind of behavior because of differing timing environments between debug and release, but if your program isn't multi-threaded then obviously this can't be it.
I experienced this too, and in my case it was because one of my arrays of structs was supposed to have only X entries, but my loop that checked this array was reading up to index X+1. Interestingly, debug mode ran fine despite this; I was on Visual C++ 2005.
I spent a few hours putting printf calls into my code line by line to catch the bug. If anyone has a good way to debug this kind of error, please let me know.