Beginning in Scala and reading about Either I naturally comparing new concepts to something I know (in this case from Java). Are there any differences from the concept of checked exceptions and Either?
In both cases
the possibility of failure is explicitly annotated in the method (throws or returning Either)
the programmer can handle the error case directly when it occurs or move it up (returning again an Either)
there is a way to inform the caller about the reason of the error
I suppose one uses for-comprehensions on Either to write code as there would be no error similar to checked exceptions.
I wonder if I am the only beginner who has problems to see the difference.
Thanks
Either can be used for more than just exceptions. For example, if you were to have a user either type input for you or specify a file containing that input, you could represent that as Either[String, File].
Either is very often used for exception handling. The main difference between Either and checked exceptions is that control flow with Either is always explicit. The compiler really won't let you forget that you are dealing with an Either; it won't collect Eithers from multiple places without you being aware of it, everything that is returned must be an Either, etc.. Because of this, you use Either not when maybe something extraordinary will go wrong, but as a normal part of controlling program execution. Also, Either does not capture a stack trace, making it much more efficient than a typical exception.
One other difference is that exceptions can be used for control flow. Need to jump out of three nested loops? No problem--throw an exception (without a stack trace) and catch it on the outside. Need to jump out of five nested method calls? No problem! Either doesn't supply anything like this.
That said, as you've pointed out there are a number of similarities. You can pass back information (though Either makes that trivial, while checked exceptions make you write your own class to store any extra information you want); you can pass the Either on or you can fold it into something else, etc..
So, in summary: although you can accomplish the same things with Either and checked exceptions with regards to explicit error handling, they are relatively different in practice. In particular, Either makes creating and passing back different states really easy, while checked exceptions are good at bypassing all your normal control flow to get back, hopefully, to somewhere that an extraordinary condition can be sensibly dealt with.
Either is equivalent to a checked exception in terms of the return signature forming an exclusive disjunction. The result can be a thrown exception X or an A. However, throwing an exception isn't equivalent to returning one – the first is not referentially transparent.
Where Scala's Either is not (as of 2.9) equivalent is that a return type is positively biased, and requires effort to extract/deconstruct the Exception, Either is unbiased; you need to explicitly ask for the left or right value. This is a topic of some discussion, and in practice a bit of pain – consider the following three calls to Either producing methods
for {
a <- eitherA("input").right
b <- eitherB(a).right
c <- eitherC(b).right
} yield c // Either[Exception, C]
you need to manually thread through the RHS. This may not seem that onerous, but in practice is a pain and somewhat surprising to new-comers.
Yes, Either is a way to embed exceptions in a language; where a set of operations that can fail can throw an error value to some non-local site.
In addition to the practical issues Rex mentioned, there's some extra things you get from the simple semantics of an Either:
Either forms a monad; so you can use monadic operations over sets of expressions that evaluate to Either. E.g. for short circuiting evaluation without having to test the result
Either is in the type -- so the type checker alone is sufficient to track incorrect handling of the value
Once you have the ability to return either an error message (Left s) or a successful value Right v, you can layer exceptions on top, as just Either plus an error handler, as is done for MonadError in Haskell.
Related
I'm having hard time understanding what value effect systems, like ZIO or Cats Effect.
It does not make code readable, e.g.:
val wrappedB = for {
a <- getA() // : ZIO[R, E, A]
b <- getB(a) // : ZIO[R, E, B]
} yield b
is no more readable to me than:
val a = getA() // : A
val b = getB(a) // : B
I could even argue, that the latter is more straight forward, because calling a function executes it, instead of just creating an effect or execution pipeline.
Delayed execution does not sound convincing, because all examples I've encountered so far are just executing the pipeline right away anyways. Being able to execute effects in parallel or multiple time can be achieved in simpler ways IMHO, e.g. C# has Parallel.ForEach
Composability. Functions can be composed without using effects, e.g. by plain composition.
Pure functional methods. In the end the pure instructions will be executed, so it seems like it's just pretending DB access is pure. It does not help to reason, because while construction of the instructions is pure, executing them is not.
I may be missing something or just downplaying the benefits above or maybe benefits are bigger in certain situations (e.g. complex domain).
What are the biggest selling points to use effect systems?
Because it makes it easy to deal with side effects. From your example:
a <- getA() // ZIO[R, E, A] (doesn't have to be ZIO btw)
val a = getA(): A
The first getA accounts in the effect and the possibility of returning an error, a side effect. This would be like getting an A from some db where the said A may not exist or that you lack permission to access it. The second getA would be like a simple def getA = "A".
How do we put these methods together ? What if one throws an error ? Should we still proceed to the next method or just quit it ? What if one blocks your thread ?
Hopefully that addresses your second point about composability. To quickly address the rest:
Delayed execution. There are probably two reasons for this. The first is you actually don't want to accidentally start an execution. Or just because you write it it starts right away. This breaks what the cool guys refer to as referential transparency. The second is concurrent execution requires a thread pool or execution context. Normally we want to have a centralized place where we can fine tune it for the whole app. And when building a library we can't provide it ourselves. It's the users who provide it. In fact we can also defer the effect. All you do is define how the effect should behave and the users can use ZIO, Monix, etc, it's totally up to them.
Purity. Technically speaking wrapping a process in a pure effect doesn't necessarily mean the underlying process actually uses it. Only the implementation knows if it's really used or not. What we can do is lift it to make it compatible with the composition.
what makes programming with ZIO or Cats great is when it comes to concurrent programming. They are also other reasons but this one is IMHO where I got the "Ah Ah! Now I got it".
Try to write a program that monitor the content of several folders and for each files added to the folders parse their content but not more than 4 files at the same time. (Like the example in the video "What Java developpers could learn from ZIO" By Adam Fraser on youtube https://www.youtube.com/watch?v=wxpkMojvz24 .
I mean this in ZIO is really easy to write :)
The all idea behind the fact that you combine data structure (A ZIO is a data structure) in order to make bigger data structure is so easy to understand that I would not want to code without it for complex problems :)
The two examples are not comparable since an error in the first statement will mark as faulty the value equal to the objectified sequence in the first form while it will halt the whole program in the second. The second form shall then be a function definition to properly encapsulate the two statements, followed by an affectation of the result of its call.
But more than that, in order to completely mimic the first form, some additional code has to be written, to catch exceptions and build a true faulty result, while all these things are made for free by ZIO...
I think that the ability to cleanly propagate the error state between successive statements is the real value of the ZIO approach. Any composite ZIO program fragment is then fully composable itself.
That's the main benefit of any workflow based approach, anyway.
It is this modularity which gives to effect handling its real value.
Since an effect is an action which structurally may produce errors, handling effects like this is an excellent way to handle errors in a composable way. In fact, handling effects consists in handling errors !
ZIO (https://zio.dev/) is a scala framework which has at its core the ZIO[R, E, A] datastructure and its site gives the following information for the three parameters:
ZIO
The ZIO[R, E, A] data type has three type parameters:
R - Environment Type. The effect requires an environment of type R. If this type parameter is Any, it means the effect has no
requirements, because you can run the effect with any value (for
example, the unit value ()).
E - Failure Type. The effect may fail with a value of type E. Some applications will use Throwable. If this type parameter is
Nothing, it means the effect cannot fail, because there are no
values of type Nothing.
A - Success Type. The effect may succeed with a value of type A. If this type parameter is Unit, it means the effect produces no
useful information, while if it is Nothing, it means the effect runs
forever (or until failure).
It's easy to get what A is: it's the value returned by the function in the nominal case, ie why we coded the function for.
R is so kind of dependency injection - an interesting topic, but we can just ignore it to use ZIO by alway setting it to Any (and there is actually a IO[E, A] = ZIO[Any, E, A] alias in the lib).
So, it remains the E type, which is for error (the famous error channel). I roughtly get that IO[E, A] is kind of Either[E, A], but deals with effect (which is great).
My question is: why should I use an error channel EVERYWHERE in my application, and how can I decide what should go in the error channel?
1/ Why effect management with an error channel?
As a developper, one of your hardest task is to decide what is an error and what is not in your application - or more preciselly, to discover failure modes: what the nominal path (ie the goal of that code), what is an expected error that can be dealt with by the application in some way later on, and what are unexpected errors that the application can't deal with. There is no definitive answer for that question, it depends of the application and context, and so it's you, the developper, who needs to decide.
But the hardest task is to build an application that keeps its promises (your promises, since you chose what is an error and what is the nominal path) and that is not surprising so that users, administrators, and dev - including the futur you in two weeks - know what the code do in most cases without having to guess and have agency to adapt to that behavior, including to respond to errors.
This is hard, and you need a systematic process to deals with all the possible cases without going made.
The error channel in IO bi-monad (and thus ZIO) helps you for that task: the IO monad helps you keep track of effects, which are the source of most errors, and the error channel makes explicit what are the possible error cases, and so other parts of the application have agency to deal with them if they can. You will be able to manage your effects in a pure, consistant, composable way with explicit failure modes.
Moreover, in the case of ZIO, you can easely import non-pure code like legacy java extremelly easily:
val pure = ZIO.effect(someJavaCodeThrowingException)
2/ How do I choose what is an error?
So, the error channel provide a way to encode answer to what if? question to futur dev working on that code. "What if database is down?" "there's a DatabaseConnectionError".
But all what if are not alike for YOUR use case, for CURRENT application level. "What if user is not found?" - ah, it may be a totally expected answer at the low, "repository" level (like a "find" that didn't find anything), or it can be an error at an other level (like when you are in the process of authenticating an user, it should really be there). On the first case, you will likely not use the error channel: it's the nominal path, sometimes you don't find things. And in the second case, you will likelly use the error channel (UserNotFoundError).
So as we said, errors in error channel are typically for what if question that you may want to deal with in the application, just not at that function level. The first example of DatabaseConnectionError may be catch higher in the app and lead to an user message like "please try again" and a notification email to sysadmin ("quick, get a look, something if wrong here"). The UserNotFoundError will likely be managed as an error message for the user in the login form, something like "bad login or password, try again or recover credentials with that process".
So these cases (nominal and expected errors) are the easy parts. But there are some what if questions that your application, whatever the level, has no clue how to answer. "What if I get a memory exception when I try to allocate that object?" I don't have any clue, and actually, even if I had a clue, that's out of the scope of the things that I want to deal with for that application. So these errors DON'T go in the error channel. We call them failure and we crash the application when they happens, because it's likely that the application is now in an unknow, dangerous, zombie state.
Again, that choice (nominal path/error channel/failure) is your choice: two applications can make different choices. For example, a one-time-data-processing-app-then-discard-it will likelly treat all non-nominal paths as failures. There is a dev to catch the case in realtime and decide if it's important (see: Shell, Python, and any scripting where that strategy is heavely used - ok, sometimes even when there is no dev to catch errors:). On the other end of the specter, Nasa dev put EVERYTHING in the error channel(+), even memory CORRUPTION. Because it is an expected error, so the application need to know how to deal with that and continue.
(+)NOTE: AFAIK they don't use zio (for now), but the decision process about what is an error is the same, even in C.
To go further, I (#fanf42) gave a talk at Scala.io conference. The talk, "Ssytematic error management in application", is available in French here. Yes, French, I know - but slides are available in English here! And you can ping me (see contact info near the end of slide deck).
The biggest misunderstanding for me in Swift is the throws keyword. Consider the following piece of code:
func myUsefulFunction() throws
We cannot really understand what kind of error it will throw. The only thing we know is that it might throw some error. The only way to understand what the error might be is by looking at the documentation or checking the error at runtime.
But isn't this against Swift's nature? Swift has powerful generics and a type system to make the code expressive, yet it feels as if throws is exactly opposite because you cannot get anything about the error from looking at the function signature.
Why is that so? Or have I missed something important and mistook the concept?
I was an early proponent of typed errors in Swift. This is how the Swift team convinced me I was wrong.
Strongly typed errors are fragile in ways that can lead to poor API evolution. If the API promises to throw only one of precisely 3 errors, then when a fourth error condition arises in a later release, I have a choice: I bury it somehow in the existing 3, or I force every caller to rewrite their error handling code to deal with it. Since it wasn't in the original 3, it probably isn't a very common condition, and this puts strong pressure on APIs not to expand their list of errors, particularly once a framework has extensive use over a long time (think: Foundation).
Of course with open enums, we can avoid that, but an open enum achieves none of the goals of a strongly typed error. It is basically an untyped error again because you still need a "default."
You might still say "at least I know where the error comes from with an open enum," but this tends to make things worse. Say I have a logging system and it tries to write and gets an IO error. What should it return? Swift doesn't have algebraic data types (I can't say () -> IOError | LoggingError), so I'd probably have to wrap IOError into LoggingError.IO(IOError) (which forces every layer to explicitly rewrap; you can't have rethrows very often). Even if it did have ADTs, do you really want IOError | MemoryError | LoggingError | UnexpectedError | ...? Once you have a few layers, I wind up with layer upon layer of wrapping of some underlying "root cause" that have to be painfully unwrapped to deal with.
And how are you going to deal with it? In the overwhelming majority of cases, what do catch blocks look like?
} catch {
logError(error)
return
}
It is extremely uncommon for Cocoa programs (i.e. "apps") to dig deeply into the exact root cause of the error and perform different operations based on each precise case. There might be one or two cases that have a recovery, and the rest are things you couldn't do anything about anyway. (This is a common issue in Java with checked exception that aren't just Exception; it's not like no one has gone down this path before. I like Yegor Bugayenko's arguments for checked exceptions in Java which basically argues as his preferred Java practice exactly the Swift solution.)
This is not to say that there aren't cases where strongly typed errors would be extremely useful. But there are two answers to this: first, you're free to implement strongly typed errors on your own with an enum and get pretty good compiler enforcement. Not perfect (you still need a default catch outside the switch statement, but not inside), but pretty good if you follow some conventions on your own.
Second, if this use case turns out to be important (and it might), it is not difficult to add strongly typed errors later for those cases without breaking the common cases that want fairly generic error handling. They would just add syntax:
func something() throws MyError { }
And callers would have to treat that as a strong type.
Last of all, for strongly typed errors to be of much use, Foundation would need to throw them since it is the largest producer of errors in the system. (How often do you really create an NSError from scratch compared to deal with one generated by Foundation?) That would be a massive overhaul of Foundation and very hard to keep compatible with existing code and ObjC. So typed errors would need to be absolutely fantastic at solving very common Cocoa problems to be worth considering as the default behavior. It couldn't be just a little nicer (let alone have the problems described above).
So none of this is to say that untyped errors are the 100% perfect solution to error handling in all cases. But these arguments convinced me that it was the right way to go in Swift today.
The choice is a deliberate design decision.
They did not want the situation where you don't need to declare exception throwing as in Objective-C, C++ and C# because that makes callers have to either assume all functions throw exceptions and include boilerplate to handle exceptions that might not happen, or to just ignore the possibility of exceptions. Neither of these are ideal and the second makes exceptions unusable except for the case when you want to terminate the program because you can't guarantee that every function in the call stack has correctly deallocated resources when the stack is unwound.
The other extreme is the idea you have advocated and that each type of exception thrown can be declared. Unfortunately, people seem to object to the consequence of this which is that you have large numbers of catch blocks so you can handle each type of exception. So, for instance, in Java, they will throw Exception reducing the situation to the same as we have in Swift or worse, they use unchecked exceptions so you can ignore the problem altogether. The GSON library is an example of the latter approach.
We chose to use unchecked exceptions to indicate a parsing failure. This is primarily done because usually the client can not recover from bad input, and hence forcing them to catch a checked exception results in sloppy code in the catch() block.
https://github.com/google/gson/blob/master/GsonDesignDocument.md
That is an egregiously bad decision. "Hi, you can't be trusted to do your own error handling, so your application should crash instead".
Personally, I think Swift gets the balance about right. You have to handle errors, but you don't have to write reams of catch statements to do it. If they went any further, people would find ways to subvert the mechanism.
The full rationale for the design decision is at https://github.com/apple/swift/blob/master/docs/ErrorHandlingRationale.rst
EDIT
There seems to be some people having problems with some of the things I have said. So here is an explanation.
There are two broad categories of reasons why a program might throw an exception.
unexpected conditions in the environment external to the program such as an IO error on a file or malformed data. These are errors that the application can usually handle, for example by reporting the error to the user and allowing them to choose a different course of action.
Errors in programming such as null pointer or array bound errors. The proper way to fix these is for the programmer to make a code change.
The second type of error should not, in general be caught, because they indicate a false assumption about the environment that could mean the program's data is corrupt. There my be no way to continue safely, so you have to abort.
The first type of error usually can be recovered, but in order to recover safely, every stack frame has to be unwound correctly which means that the function corresponding to each stack frame must be aware that the functions it calls may throw an exception and take steps to ensure that everything gets cleaned up consistently if an exception is thrown, with, for example, a finally block or equivalent. If the compiler doesn't provide support for telling the programmer they have forgotten to plan for exceptions, the programmer won't always plan for exceptions and will write code that leaks resources or leaves data in an inconsistent state.
The reason why the gson attitude is so appalling is because they are saying you can't recover from a parse error (actually, worse, they are telling you that you lack the skills to recover from a parse error). That is a ridiculous thing to assert, people attempt to parse invalid JSON files all the time. Is it a good thing that my program crashes if somebody selects an XML file by mistake? No isn't. It should report the problem and ask them to select a different file.
And the gson thing was, by the way, just an example of why using unchecked exceptions for errors you can recover from is bad. If I do want to recover from somebody selecting an XML file, I need to catch Java runtime exceptions, but which ones? Well I could look in the Gson docs to find out, assuming they are correct and up to date. If they had gone with checked exceptions, the API would tell me which exceptions to expect and the compiler would tell me if I don't handle them.
In regard to potential runtime failures, like database queries, it seems that one must use some form of Either[String, Option[T]] in order to accurately capture the following outcomes:
Some (record(s) found)
None (no record(s) found)
SQL Exception
Option simply does not have enough options.
I guess I need to dive into scalaz, but for now it's straight Either, unless I'm missing something in the above.
Have boxed myself into a corner with my DAO implementation, only employing Either for write operations, but am now seeing that some Either writes depend on Option reads (e.g. checking if email exists on new user signup), which is a majorly bad gamble to make.
Before I go all-in on Either, does anyone have alternate solutions for how to handle the runtime trifecta of success/fail/exception?
Try Box from the fantastic lift framework. It provides exactly what you want.
See this wiki (and the links at the top) for details. Fortunately lift project is well modulized, the only dependency to use Box is net.lift-web % lift-common
Use Option[T] for the cases records found and no records found and throw an exception in the case of SQLException.
Just wrap the exception inside your own exception type, like PersistenceException so that you don't have a leaky abstraction.
We do it like this because we can't and don't want to recover from unexpected database exceptions. The exception gets caught on the top level and our web service returns a 500 Internal server error in such case.
In cases where we want to recover we use Validation from scalaz, which is much like Lift's Box.
Here's my revised approach
Preserve Either returning query write operations (useful for transactional blocks where we want to rollback on for comprehension Left outcome).
For Option returning query reads, however, rather than swallowing the exception with None (and logging it), I have created a 500 error screen, letting the exception bubble up.
Why not just work with Either result type by default when working with runtime failures like query Exceptions? Option[T] reads are a bit more convenient to work with vs Either[Why-Fail, Option[T]], which you have to fold/map through to get at T. Leaving Either to write operations simplifies things (all the more so given that's how the application is currently setup, no refactoring required ;-))
The only other change required is for AJAX requests. Rather than displaying the entire 500 error page response in the AJAX status div container, we check for the status type and display 500 error message accordingly.
if(data.status == 500)
$('#status > div').html("an error occurred, please try again")
Could probably do an isAjax check server-side prior to sending the response; in which case I can send back only status + message rather than the error page itself.
Question:
What is considered to be "Best practice" - and why - of handling errors in a constructor?.
"Best Practice" can be a quote from Schwartz, or 50% of CPAN modules use it, etc...; but I'm happy with well reasoned opinion from anyone even if it explains why the common best practice is not really the best approach.
As far as my own view of the topic (informed by software development in Perl for many years), I have seen three main approaches to error handling in a perl module (listed from best to worst in my opinion):
Construct an object, set an invalid flag (usually "is_valid" method). Often coupled with setting error message via your class's error handling.
Pros:
Allows for standard (compared to other method calls) error handling as it allows to use $obj->errors() type calls after a bad constructor just like after any other method call.
Allows for additional info to be passed (e.g. >1 error, warnings, etc...)
Allows for lightweight "redo"/"fixme" functionality, In other words, if the object that is constructed is very heavy, with many complex attributes that are 100% always OK, and the only reason it is not valid is because someone entered an incorrect date, you can simply do "$obj->setDate()" instead of the overhead of re-executing entire constructor again. This pattern is not always needed, but can be enormously useful in the right design.
Cons: None that I'm aware of.
Return "undef".
Cons: Can not achieve any of the Pros of the first solution (per-object error messages outside of global variables and lightweight "fixme" capability for heavy objects).
Die inside the constructor. Outside of some very narrow edge cases, I personally consider this an awful choice for too many reasons to list on the margins of this question.
UPDATE: Just to be clear, I consider the (otherwise very worthy and a great design) solution of having very simple constructor that can't fail at all and a heavy initializer method where all the error checking occurs to be merely a subset of either case #1 (if initializer sets error flags) or case #3 (if initializer dies) for the purposes of this question. Obviously, choosing such a design, you automatically reject option #2.
It depends on how you want your constructors to behave.
The rest of this response goes into my personal observations, but as with most things Perl, Best Practices really boils down to "Here's one way to do it, which you can take or leave depending on your needs." Your preferences as you described them are totally valid and consistent, and nobody should tell you otherwise.
I actually prefer to die if construction fails, because we set it up so that the only types of errors that can occur during object construction really are big, obvious errors that should halt execution.
On the other hand, if you prefer that doesn't happen, I think I'd prefer 2 over 1, because it's just as easy to check for an undefined object as it is to check for some flag variable. This isn't C, so we don't have a strong typing constraint telling us that our constructor MUST return an object of this type. So returning undef, and checking for that to establish success or failure, is a great choice.
The 'overhead' of construction failure is a consideration in certain edge cases (where you can't quickly fail before incurring overhead), so for those you might prefer method 1. So again, it depends on what semantics you've defined for object construction. For example, I prefer to do heavyweight initialization outside of construction. As to standardization, I think that checking whether a constructor returns a defined object is as good a standard as checking a flag variable.
EDIT: In response to your edit about initializers rejecting case #2, I don't see why an initializer can't simply return a value that indicates success or failure rather than setting a flag variable. Actually, you may want to use both, depending on how much detail you want about the error that occurred. But it would be perfectly valid for an initializer to return true on success and undef on failure.
I prefer:
Do as little initialization as possible in the constructor.
croak with an informative message when something goes wrong.
Use appropriate initialization methods to provide per object error messages etc
In addition, returning undef (instead of croaking) is fine in case the users of the class may not care why exactly the failure occurred, only if they got a valid object or not.
I despise easy to forget is_valid methods or adding extra checks to ensure methods are not called when the internal state of the object is not well defined.
I say these from a very subjective perspective without making any statements about best practices.
I would recommend against #1 simply because it leads to more error handling code which will not be written. For example, if you just return false then this works fine.
my $obj = Class->new or die "Construction failed...";
But if you return an object which is invalid...
my $obj = Class->new;
die "Construction failed #{[ $obj->error_message ]}" if $obj->is_valid;
And as the quantity of error handling code increases the probability of it being written decreases. And its not linear. By increasing the complexity of your error handling system you actually decrease the amount of errors it will catch in practical use.
You also have to be careful that your invalid object in question dies when any method is called (aside from is_valid and error_message) leading to yet more code and opportunities for mistakes.
But I agree there is value in being able to get information about the failure, which makes returning false (just return not return undef) inferior. Traditionally this is done by calling a class method or global variable as in DBI.
my $dbh = DBI->connect($data_source, $username, $password)
or die $DBI::errstr;
But it suffers from A) you still have to write error handling code and B) its only valid for the last operation.
The best thing to do, in general, is throw an exception with croak. Now in the normal case the user writes no special code, the error occurs at the point of the problem, and they get a good error message by default.
my $obj = Class->new;
Perl's traditional recommendations against throwing exceptions in library code as being impolite is outdated. Perl programmers are (finally) embracing exceptions. Rather than writing error handling code ever and over again, badly and often forgetting, exceptions DWIM. If you're not convinced just start using autodie (watch pjf's video about it) and you'll never go back.
Exceptions align Huffman encoding with actual use. The common case of expecting the constructor to just work and wanting an error if it doesn't is now the least code. The uncommon case of wanting to handle that error requires writing special code. And the special code is pretty small.
my $obj = eval { Class->new } or do { something else };
If you find yourself wrapping every call in an eval you are doing it wrong. Exceptions are called that because they are exceptional. If, as in your comment above, you want graceful error handling for the user's sake, then take advantage of the fact that errors bubble up the stack. For example, if you want to provide a nice user error page and also log the error you can do this:
eval {
run_the_main_web_code();
} or do {
log_the_error($#);
print_the_pretty_error_page;
};
You only need it in one place, at top of your call stack, rather than scattered everywhere. You can take advantage of this at smaller increments, for example...
my $users = eval { Users->search({ name => $name }) } or do {
...handle an error while finding a user...
};
There's two things going on. 1) Users->search always returns a true value, in this case an array ref. That makes the simple my $obj = eval { Class->method } or do work. That's optional. But more importantly 2) you only need to put special error handling around Users->search. All the methods called inside Users->search and all the methods they call... they just throw exceptions. And they're all caught at one point and handled the same. Handling the exception at the point which cares about it makes for much neater, compact and flexible error handling code.
You can pack more information into the exception by croaking with a string overloaded object rather than just a string.
my $obj = eval { Class->new }
or die "Construction failed: $# and there were #{[ $#->num_frobnitz ]} frobnitzes";
Exceptions:
Do the right thing without any thought by the caller
Require the least code for the most common case
Provide the most flexibility and information about the failure to the caller
Modules such as Try::Tiny fix most of the hanging issues surrounding using eval as an exception handler.
As for your use case where you might have a very expensive object and want to try and continue with it partially build... smells like YAGNI to me. Do you really need it? Or you have a bloated object design which is doing too much work too early. IF you do need it, you can put the information necessary to continue the construction in the exception object.
First the pompous general observations:
A constructor's job should be: Given valid construction parameters, return a valid object.
A constructor that does not construct a valid object cannot perform its job and is therefore a perfect candidate for exception generation.
Making sure the constructed object is valid is part of the constructor's job. Handing out a known-to-be-bad object and relying on the client to check that the object is valid is a surefire way to wind up with invalid objects that explode in remote places for non-obvious reasons.
Checking that all the correct arguments are in place before the constructor call is the client's job.
Exceptions provide a fine-grained way of propagating the particular error that occurred without needing to have a broken object in hand.
return undef; is always bad[1]
bIlujDI' yIchegh()Qo'; yIHegh()!
Now to the actual question, which I will construe to mean "what do you, darch, consider the best practice and why". First, I'll note that returning a false value on failure has a long Perl history (most of the core works that way, for example), and a lot of modules follow this convention. However, it turns out this convention produces inferior client code and newer modules are moving away from it.[2]
[The supporting argument and code samples for this turn out to be the more general case for exceptions that prompted the creation of autodie, and so I will resist the temptation to make that case here. Instead:]
Having to check for successful creation is actually more onerous than checking for an exception at an appropriate exception-handling level. The other solutions require the immediate client to do more work than it should have to just to obtain an object, work that is not required when the constructor fails by throwing an exception.[3] An exception is vastly more expressive than undef and equally expressive as passing back a broken object for purposes of documenting errors and annotating them at various levels in the call stack.
You can even get the partially-constructed object if you pass it back in the exception. I think this is a bad practice per my belief about what a constructor's contract with its clients ought to be, but the behavior is supported. Awkwardly.
So: A constructor that cannot create a valid object should throw an exception as early as possible. The exceptions a constructor can throw should be documented parts of its interface. Only the calling levels that can meaningfully act on the exception should even look for it; very often, the behavior of "if this construction fails, don't do anything" is exactly correct.
[1]: By which I mean, I am not aware of any use cases where return; is not strictly superior. If someone calls me on this I might have to actually open a question. So please don't. ;)
[2]: Per my extremely unscientific recollection of the module interfaces I've read in the last two years, subject to both selection and confirmation biases.
[3]: Note that throwing an exception does still require error-handling, as would the other proposed solutions. This does not mean wrapping every instantiation in an eval unless you actually want to do complex error-handling around every construction (and if you think you do, you're probably wrong). It means wrapping the call which is able to meaningfully act on the exception in an eval.