What to choose between require and assert in Scala

Both require and assert are used to perform certain checks during runtime to verify certain conditions.
So what is the basic difference between them?
The only one I see is that require throws IllegalArgumentException and assert throws AssertionError.
How do I choose which one to use?

As Kigyo mentioned, there is a semantic difference:
assert means that your program has reached an inconsistent state; this might be a problem with the current method/function (I like to think of it a bit as HTTP 500 InternalServerError).
require means that the caller of the method is at fault and should fix its call (I like to think of it a bit as HTTP 400 BadRequest).
There is also a major technical difference:
assert is annotated with @elidable(ASSERTION)
meaning you can compile your program with -Xelide-below ASSERTION or with -Xdisable-assertions and the compiler will not generate the bytecode for the assertions. This can significantly reduce bytecode size and improve performance if you have a large number of asserts.
Knowing this, you can use an assert to verify all the invariants everywhere in your program (all the preconditions/postconditions for every single method/function calls) and not pay the price in production.
You would usually have a "test" build with all the assertions enabled; it would be slower, as it would verify all the assertions at all times. You could then have a "production" build of your product without the assertions, which eliminates all the internal state checks done through assertions.
require is not elidable, so it makes more sense for use in libraries (including internal libraries) to inform the caller of the preconditions for calling a given method/function.
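A minimal sketch of the distinction described above. The `Account` object and its `withdraw` method are hypothetical names made up for illustration: `require` guards what the caller passes in, while `assert` guards what the method itself computes.

```scala
object Account {
  // Hypothetical example: returns the balance left after a withdrawal.
  def withdraw(balance: Long, amount: Long): Long = {
    // Preconditions: a violation is the caller's fault (IllegalArgumentException).
    require(amount > 0, "amount must be positive")
    require(amount <= balance, "insufficient funds")

    val remaining = balance - amount

    // Internal sanity check: a violation is a bug in this method (AssertionError).
    // This check is the one that may be elided at compile time.
    assert(remaining >= 0, "internal error: negative balance")
    remaining
  }
}
```

Calling `Account.withdraw(100L, -1L)` fails with an IllegalArgumentException whose message is prefixed by "requirement failed: ", which points the reader of the stack trace at the call site rather than at the method body.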

This is only my subjective point of view.
I use require whenever I want a constraint on parameters.
As an example we can take the factorial for natural numbers. As we do not want to address negative numbers, we want to throw an IllegalArgumentException.
I would use assert, whenever you want to make sure some conditions (like invariants) are always true during execution. I see it as a way of testing.
Here is an example implementation of factorial with require and assert
import scala.annotation.tailrec

def fac(i: Int): Long = {
  require(i >= 0, "i must be non negative") // this is for correct input
  @tailrec def loop(k: Int, result: Long = 1): Long = {
    assert(result == 1 || result >= k) // this is only for verification
    if (k > 0) loop(k - 1, result * k) else result
  }
  loop(i)
}
When result > 1 is true, then the loop was executed at least once. So the result has to be bigger or equal to k. That would be a loop invariant.
When you are sure that your code is correct, you can remove the assert, but the require would stay.

You can see here for a detailed discussion within Scala language.
I can add that the key to distinguishing between require and assert is to understand where each one comes from. Both are tools of software quality, but from different toolboxes of different paradigms. In summary, assert is a software testing tool that takes a corrective approach, whereas require is a design by contract tool that takes a preventive approach.
Both require and assert are means of controlling the validity of state. Historically there were two distinct paradigms for dealing with invalid states. The first, which is mainstream, is collectively called software testing (its discipline, methodologies and tools). The other is called design by contract. These two paradigms are not comparable.
Software testing checks that code versatile enough to perform error-prone actions was not misused. Design by contract prevents code from having such a capability in the first place. In other words, software testing is corrective and design by contract is preventive.
assert is used to write unit tests, i.e. if a method passes all tests, each written as an assert expression, the code is qualified as error free. So assert sits beside operational code and is an independent body.
require is embedded within the code and is part of it, to ensure nothing harmful can happen.

In very simple language:
Require is used to enforce a precondition on the caller of a function or the creator of an object of some class. Whereas, assert is used to check the code of the function itself.
So, if a precondition fails, you get an IllegalArgumentException. Whereas if an assertion fails, it is not the caller's fault, and consequently you get an AssertionError.

require, ensure and invariant are concepts in the Design by Contract (DbC) development process.
require checks the preconditions that the caller must satisfy to consume the routine.
ensure checks the correctness of the return value (and also verifies that only the desired change has happened and nothing more).
invariant checks the validity of the class at all critical times.
DbC is a development methodology for building correct/robust software. For more details on DbC, Google it and you should hit a link from Eiffel Software. Hope this helps.
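The three contract concepts above can be sketched in Scala roughly as follows. This is only an illustration under assumptions: `safeAbs` and `Counter` are made-up names, `ensuring` from the standard Predef plays the role of Eiffel's ensure, and the class invariant is simply checked by hand with assert.

```scala
object Contracts {
  // Precondition via require, postcondition via Predef's ensuring.
  def safeAbs(x: Int): Int = {
    require(x != Int.MinValue, "Int.MinValue has no positive counterpart")
    if (x < 0) -x else x
  }.ensuring(_ >= 0)

  // A hand-rolled class invariant: count never becomes negative.
  final class Counter(private var count: Int) {
    require(count >= 0, "initial count must be non-negative")

    private def invariant: Boolean = count >= 0

    def increment(): Int = {
      require(count < Int.MaxValue, "counter is full")
      count += 1
      assert(invariant, "class invariant violated")
      count
    }
  }
}
```

A failed `ensuring` check surfaces as an AssertionError, which matches the idea that a broken postcondition is the routine's fault and not the caller's.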

Scaladocs/javadocs are pretty good as well:
assert()
Tests an expression, throwing an AssertionError if false. Calls to this method will not be generated if -Xelide-below is greater than ASSERTION.
require()
Tests an expression, throwing an IllegalArgumentException if false. This method is similar to assert, but blames the caller of the method for violating the condition.


What is the benefit of effect system (e.g. ZIO)?

I'm having a hard time understanding what value effect systems, like ZIO or Cats Effect, provide.
It does not make code readable, e.g.:
val wrappedB = for {
a <- getA() // : ZIO[R, E, A]
b <- getB(a) // : ZIO[R, E, B]
} yield b
is no more readable to me than:
val a = getA() // : A
val b = getB(a) // : B
I could even argue that the latter is more straightforward, because calling a function executes it, instead of just creating an effect or execution pipeline.
Delayed execution does not sound convincing, because all examples I've encountered so far are just executing the pipeline right away anyway. Being able to execute effects in parallel or multiple times can be achieved in simpler ways IMHO, e.g. C# has Parallel.ForEach.
Composability. Functions can be composed without using effects, e.g. by plain composition.
Pure functional methods. In the end the pure instructions will be executed, so it seems like it's just pretending DB access is pure. It does not help to reason, because while construction of the instructions is pure, executing them is not.
I may be missing something or just downplaying the benefits above or maybe benefits are bigger in certain situations (e.g. complex domain).
What are the biggest selling points to use effect systems?
Because it makes it easy to deal with side effects. From your example:
a <- getA() // ZIO[R, E, A] (doesn't have to be ZIO btw)
val a = getA(): A
The first getA accounts for the effect and for the possibility of returning an error, a side effect. This would be like getting an A from some DB where said A may not exist or where you lack permission to access it. The second getA would be like a simple def getA = "A".
How do we put these methods together? What if one throws an error? Should we still proceed to the next method or just quit? What if one blocks your thread?
Hopefully that addresses your second point about composability. To quickly address the rest:
Delayed execution. There are probably two reasons for this. The first is that you don't want to accidentally start an execution, i.e. have it start right away just because you wrote it. That breaks what the cool guys refer to as referential transparency. The second is that concurrent execution requires a thread pool or execution context. Normally we want to have a centralized place where we can fine-tune it for the whole app. And when building a library we can't provide it ourselves; it's the users who provide it. In fact we can also defer the choice of effect. All you do is define how the effect should behave, and the users can use ZIO, Monix, etc.; it's totally up to them.
Purity. Technically speaking wrapping a process in a pure effect doesn't necessarily mean the underlying process actually uses it. Only the implementation knows if it's really used or not. What we can do is lift it to make it compatible with the composition.
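To make the delayed-execution point concrete, here is a toy sketch of an effect as a plain data structure. `Task` is a hypothetical stand-in written for this example, not the ZIO or Cats Effect type: building and composing a `Task` runs nothing, and the side effect only happens when `unsafeRun` is called.

```scala
object Deferred {
  // A description of a computation; nothing runs until unsafeRun() is called.
  final class Task[A](val unsafeRun: () => A) {
    def map[B](f: A => B): Task[B] =
      new Task(() => f(unsafeRun()))

    def flatMap[B](f: A => Task[B]): Task[B] =
      new Task(() => f(unsafeRun()).unsafeRun())
  }

  object Task {
    // The by-name parameter keeps the body unevaluated at construction time.
    def apply[A](body: => A): Task[A] = new Task(() => body)
  }
}
```

Because a `Task` value is only a description, the same value can be reused several times in a for-comprehension, and each use re-runs the underlying side effect when the whole program is finally executed.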
What makes programming with ZIO or Cats great is concurrent programming. There are also other reasons, but this one is IMHO where I got the "Aha! Now I get it".
Try to write a program that monitors the content of several folders and, for each file added to the folders, parses its content, but with no more than 4 files being parsed at the same time. (Like the example in the video "What Java developpers could learn from ZIO" by Adam Fraser on YouTube: https://www.youtube.com/watch?v=wxpkMojvz24 .)
I mean, this is really easy to write in ZIO :)
The whole idea that you combine data structures (a ZIO is a data structure) in order to make bigger data structures is so easy to understand that I would not want to code without it for complex problems :)
The two examples are not comparable, since an error in the first statement will mark as faulty the value equal to the objectified sequence in the first form, while it will halt the whole program in the second. The second form should then be a function definition that properly encapsulates the two statements, followed by an assignment of the result of its call.
But more than that, in order to completely mimic the first form, some additional code has to be written, to catch exceptions and build a true faulty result, while all these things are made for free by ZIO...
I think that the ability to cleanly propagate the error state between successive statements is the real value of the ZIO approach. Any composite ZIO program fragment is then fully composable itself.
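The error propagation described here can be illustrated without any effect library. The sketch below uses the standard library's Either as a stand-in for the error channel (it is not ZIO, and it ignores the effect-tracking part); `getA`, `getB`, `A` and `B` mirror the names in the question, with made-up bodies. It assumes Scala 2.13 or later for `toIntOption`.

```scala
object Pipeline {
  final case class A(value: Int)
  final case class B(label: String)

  // Each step either succeeds or reports why it failed.
  def getA(raw: String): Either[String, A] =
    raw.toIntOption.map(A(_)).toRight(s"not a number: $raw")

  def getB(a: A): Either[String, B] =
    if (a.value >= 0) Right(B(s"item-${a.value}"))
    else Left(s"negative id: ${a.value}")

  // The first Left short-circuits the rest; no try/catch is needed.
  def run(raw: String): Either[String, B] =
    for {
      a <- getA(raw)
      b <- getB(a)
    } yield b
}
```

`Pipeline.run` is itself a value of the same shape as its parts, so it can be plugged into a larger for-comprehension in exactly the same way.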
That's the main benefit of any workflow based approach, anyway.
It is this modularity which gives to effect handling its real value.
Since an effect is an action which structurally may produce errors, handling effects like this is an excellent way to handle errors in a composable way. In fact, handling effects consists in handling errors !

What is ZIO error channel and how to get a feeling about what to put in it?

ZIO (https://zio.dev/) is a scala framework which has at its core the ZIO[R, E, A] datastructure and its site gives the following information for the three parameters:
ZIO
The ZIO[R, E, A] data type has three type parameters:
R - Environment Type. The effect requires an environment of type R. If this type parameter is Any, it means the effect has no
requirements, because you can run the effect with any value (for
example, the unit value ()).
E - Failure Type. The effect may fail with a value of type E. Some applications will use Throwable. If this type parameter is
Nothing, it means the effect cannot fail, because there are no
values of type Nothing.
A - Success Type. The effect may succeed with a value of type A. If this type parameter is Unit, it means the effect produces no
useful information, while if it is Nothing, it means the effect runs
forever (or until failure).
It's easy to get what A is: it's the value returned by the function in the nominal case, i.e. what we coded the function for.
R is some kind of dependency injection - an interesting topic, but we can just ignore it to use ZIO by always setting it to Any (and there is actually an IO[E, A] = ZIO[Any, E, A] alias in the lib).
So, that leaves the E type, which is for errors (the famous error channel). I roughly get that IO[E, A] is kind of an Either[E, A], but one that deals with effects (which is great).
My question is: why should I use an error channel EVERYWHERE in my application, and how can I decide what should go in the error channel?
1/ Why effect management with an error channel?
As a developer, one of your hardest tasks is to decide what is an error and what is not in your application, or more precisely, to discover failure modes: what is the nominal path (i.e. the goal of that code), what is an expected error that can be dealt with by the application in some way later on, and what are unexpected errors that the application can't deal with. There is no definitive answer to that question; it depends on the application and context, and so it's you, the developer, who needs to decide.
But the hardest task is to build an application that keeps its promises (your promises, since you chose what is an error and what is the nominal path) and that is not surprising, so that users, administrators, and devs (including the future you in two weeks) know what the code does in most cases without having to guess, and have agency to adapt to that behavior, including responding to errors.
This is hard, and you need a systematic process to deal with all the possible cases without going mad.
The error channel in the IO bi-monad (and thus ZIO) helps you with that task: the IO monad helps you keep track of effects, which are the source of most errors, and the error channel makes explicit what the possible error cases are, so other parts of the application have agency to deal with them if they can. You will be able to manage your effects in a pure, consistent, composable way with explicit failure modes.
Moreover, in the case of ZIO, you can import non-pure code like legacy Java extremely easily:
val pure = ZIO.effect(someJavaCodeThrowingException)
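For readers without ZIO at hand, the same idea of capturing a thrown exception as a value can be sketched with the standard library. This is only an analogue: scala.util.Try evaluates eagerly, unlike a ZIO effect, and `parsePort` is a hypothetical wrapper around a Java method that throws.

```scala
import scala.util.Try

object Legacy {
  // Integer.parseInt throws NumberFormatException on bad input;
  // Try captures it and toEither exposes it as a Left value.
  def parsePort(raw: String): Either[Throwable, Int] =
    Try(Integer.parseInt(raw)).toEither
}
```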
2/ How do I choose what is an error?
So, the error channel provides a way to encode the answers to "what if?" questions for future devs working on that code. "What if the database is down?" "There's a DatabaseConnectionError."
But not all "what if" questions are alike for YOUR use case, at the CURRENT application level. "What if the user is not found?" - ah, it may be a totally expected answer at the low, "repository" level (like a "find" that didn't find anything), or it can be an error at another level (like when you are in the process of authenticating a user, who should really be there). In the first case, you will likely not use the error channel: it's the nominal path, sometimes you don't find things. And in the second case, you will likely use the error channel (UserNotFoundError).
So as we said, errors in the error channel are typically for "what if" questions that you may want to deal with in the application, just not at that function's level. The first example, DatabaseConnectionError, may be caught higher in the app and lead to a user message like "please try again" and a notification email to the sysadmin ("quick, take a look, something is wrong here"). The UserNotFoundError will likely be handled as an error message for the user in the login form, something like "bad login or password, try again or recover credentials with that process".
So these cases (nominal and expected errors) are the easy parts. But there are some "what if" questions that your application, whatever the level, has no clue how to answer. "What if I get a memory exception when I try to allocate that object?" I don't have any clue, and actually, even if I had a clue, that's out of the scope of the things that I want to deal with for that application. So these errors DON'T go in the error channel. We call them failures and we crash the application when they happen, because it's likely that the application is now in an unknown, dangerous, zombie state.
Again, that choice (nominal path/error channel/failure) is your choice: two applications can make different choices. For example, a one-time-data-processing-app-then-discard-it will likely treat all non-nominal paths as failures. There is a dev to catch the case in real time and decide if it's important (see: Shell, Python, and any scripting where that strategy is heavily used - ok, sometimes even when there is no dev to catch errors :). At the other end of the spectrum, NASA devs put EVERYTHING in the error channel(+), even memory CORRUPTION. Because it is an expected error, the application needs to know how to deal with it and continue.
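The "user not found" example above can be sketched as follows, again using the standard library's Either as a stand-in for the error channel. The `User` type, the in-memory `users` map and both functions are hypothetical; `UserNotFoundError` is the name used in the text.

```scala
object ErrorChannel {
  final case class User(login: String)

  sealed trait AuthError
  final case class UserNotFoundError(login: String) extends AuthError

  // Made-up data standing in for a real repository.
  private val users = Map("alice" -> User("alice"))

  // Repository level: not finding a user is part of the nominal path.
  def find(login: String): Option[User] = users.get(login)

  // Authentication level: the user should exist, so absence is an expected error.
  def authenticate(login: String): Either[AuthError, User] =
    find(login).toRight(UserNotFoundError(login))
}
```

The same missing user shows up as a plain `None` at one level and as a typed error at another, which is the per-level decision the answer describes. Unexpected failures (such as running out of memory) appear in neither signature.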
(+)NOTE: AFAIK they don't use zio (for now), but the decision process about what is an error is the same, even in C.
To go further, I (@fanf42) gave a talk at the Scala.io conference. The talk, "Systematic error management in application", is available in French here. Yes, French, I know - but slides are available in English here! And you can ping me (see contact info near the end of the slide deck).

may the compiler optimize based on assert(...) expressions/contracts?

http://dlang.org/expression.html#AssertExpression
Regarding assert(0): "The optimization and code generation phases of compilation may assume that it is unreachable code."
The same documentation claims assert(0) is a 'special case', but there are several reasons that follow.
Can the D compiler optimize based on general assert-ions made in contracts and elsewhere?
(as if I needed another reason to enjoy the in{} and out{} constructs, but it certainly would make me feel a little more giddy to know that writing them could make things go fwoosh-ier)
In theory, yes, in practice, I don't think it does, especially since the asserts are killed before even getting to the optimizer on dmd -release. I'm not sure about gdc and ldc, but I think they share this portion of the code.
The spec's special case reference btw is that assert(0) is still present, in some form, with the -release compile flag. It is translated into an illegal instruction there (asm {hlt;} - non-kernel programs on x86 aren't allowed to use that so it will segfault upon hitting it), whereas all other asserts are simply left out of the code entirely in -release mode.
GDC certainly does optimise based on asserts. The if conditions make for much better code, even causing unnecessary code to disappear. However, unfortunately at the moment the way it is implemented is that the entire assert can disappear in release build mode, so then the compiler never sees the beneficial if-condition info and actually generates worse code in release than in debug mode! Ironic.
I have to admit that I've only looked at this effect with if conditions in asserts in the body; I haven't checked what effect in and out blocks have. The in- and out- etc. contract blocks can be turned off based on a command line switch iirc, so they are not even compiled; I think this possibly means the compiler doesn't even look at them. So this is another thing that might possibly affect code generation; I haven't looked at it.
But there is a feature here that I would very much like to see: that the if condition truth values in the assert conditions (checking that there is no side-effect code in the expression for the assert cond) can always be injected into the compiler as an assumption, just as if there had been an if statement, even in release mode. It would involve pretending you had just seen an if ( xxx ) but with the actual code generation for the test suppressed in release mode, and with subsequent code feeling the beneficial effects of, say, known truth values, value-range limits and so on.

Is Either the equivalent to checked exceptions?

Beginning in Scala and reading about Either, I naturally compare new concepts to something I know (in this case from Java). Are there any differences between the concept of checked exceptions and Either?
In both cases
the possibility of failure is explicitly annotated in the method (throws or returning Either)
the programmer can handle the error case directly when it occurs or move it up (returning again an Either)
there is a way to inform the caller about the reason of the error
I suppose one uses for-comprehensions on Either to write code as if there were no errors, similar to checked exceptions.
I wonder if I am the only beginner who has problems seeing the difference.
Thanks
Either can be used for more than just exceptions. For example, if you were to have a user either type input for you or specify a file containing that input, you could represent that as Either[String, File].
Either is very often used for exception handling. The main difference between Either and checked exceptions is that control flow with Either is always explicit. The compiler really won't let you forget that you are dealing with an Either; it won't collect Eithers from multiple places without you being aware of it, everything that is returned must be an Either, etc.. Because of this, you use Either not when maybe something extraordinary will go wrong, but as a normal part of controlling program execution. Also, Either does not capture a stack trace, making it much more efficient than a typical exception.
One other difference is that exceptions can be used for control flow. Need to jump out of three nested loops? No problem--throw an exception (without a stack trace) and catch it on the outside. Need to jump out of five nested method calls? No problem! Either doesn't supply anything like this.
That said, as you've pointed out there are a number of similarities. You can pass back information (though Either makes that trivial, while checked exceptions make you write your own class to store any extra information you want); you can pass the Either on or you can fold it into something else, etc..
So, in summary: although you can accomplish the same things with Either and checked exceptions with regards to explicit error handling, they are relatively different in practice. In particular, Either makes creating and passing back different states really easy, while checked exceptions are good at bypassing all your normal control flow to get back, hopefully, to somewhere that an extraordinary condition can be sensibly dealt with.
Either is equivalent to a checked exception in terms of the return signature forming an exclusive disjunction. The result can be a thrown exception X or an A. However, throwing an exception isn't equivalent to returning one – the first is not referentially transparent.
Where Scala's Either is not (as of 2.9) equivalent is that, with a checked exception, the return type is positively biased and it takes effort to extract/deconstruct the Exception, whereas Either is unbiased; you need to explicitly ask for the left or right value. This is a topic of some discussion, and in practice a bit of a pain – consider the following three calls to Either-producing methods
for {
a <- eitherA("input").right
b <- eitherB(a).right
c <- eitherC(b).right
} yield c // Either[Exception, C]
you need to manually thread through the RHS. This may not seem that onerous, but in practice is a pain and somewhat surprising to new-comers.
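For comparison, since Scala 2.12 Either is right-biased, so the same chain can be written without the `.right` projections. The sketch below assumes Scala 2.13 or later (for `toIntOption`); the bodies of `eitherA`, `eitherB` and `eitherC` are invented for illustration, only their names come from the snippet above.

```scala
object RightBiased {
  def eitherA(input: String): Either[Exception, Int] =
    input.toIntOption.toRight(new NumberFormatException(input))

  def eitherB(a: Int): Either[Exception, Double] =
    if (a != 0) Right(100.0 / a)
    else Left(new ArithmeticException("division by zero"))

  def eitherC(b: Double): Either[Exception, String] =
    Right(b.toString)

  // map/flatMap operate on the Right side directly; a Left stops the chain.
  def run(input: String): Either[Exception, String] =
    for {
      a <- eitherA(input)
      b <- eitherB(a)
      c <- eitherC(b)
    } yield c
}
```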
Yes, Either is a way to embed exceptions in a language; where a set of operations that can fail can throw an error value to some non-local site.
In addition to the practical issues Rex mentioned, there's some extra things you get from the simple semantics of an Either:
Either forms a monad; so you can use monadic operations over sets of expressions that evaluate to Either. E.g. for short circuiting evaluation without having to test the result
Either is in the type -- so the type checker alone is sufficient to track incorrect handling of the value
Once you have the ability to return either an error message (Left s) or a successful value Right v, you can layer exceptions on top, as just Either plus an error handler, as is done for MonadError in Haskell.

Should a Perl constructor return an undef or a "invalid" object?

Question:
What is considered to be "Best practice" - and why - for handling errors in a constructor?
"Best Practice" can be a quote from Schwartz, or 50% of CPAN modules use it, etc...; but I'm happy with well reasoned opinion from anyone even if it explains why the common best practice is not really the best approach.
As far as my own view of the topic (informed by software development in Perl for many years), I have seen three main approaches to error handling in a perl module (listed from best to worst in my opinion):
Construct an object, set an invalid flag (usually "is_valid" method). Often coupled with setting error message via your class's error handling.
Pros:
Allows for standard (compared to other method calls) error handling as it allows to use $obj->errors() type calls after a bad constructor just like after any other method call.
Allows for additional info to be passed (e.g. >1 error, warnings, etc...)
Allows for lightweight "redo"/"fixme" functionality. In other words, if the object that is constructed is very heavy, with many complex attributes that are 100% always OK, and the only reason it is not valid is because someone entered an incorrect date, you can simply do "$obj->setDate()" instead of the overhead of re-executing the entire constructor again. This pattern is not always needed, but can be enormously useful in the right design.
Cons: None that I'm aware of.
Return "undef".
Cons: Can not achieve any of the Pros of the first solution (per-object error messages outside of global variables and lightweight "fixme" capability for heavy objects).
Die inside the constructor. Outside of some very narrow edge cases, I personally consider this an awful choice for too many reasons to list on the margins of this question.
UPDATE: Just to be clear, I consider the (otherwise very worthy and a great design) solution of having very simple constructor that can't fail at all and a heavy initializer method where all the error checking occurs to be merely a subset of either case #1 (if initializer sets error flags) or case #3 (if initializer dies) for the purposes of this question. Obviously, choosing such a design, you automatically reject option #2.
It depends on how you want your constructors to behave.
The rest of this response goes into my personal observations, but as with most things Perl, Best Practices really boils down to "Here's one way to do it, which you can take or leave depending on your needs." Your preferences as you described them are totally valid and consistent, and nobody should tell you otherwise.
I actually prefer to die if construction fails, because we set it up so that the only types of errors that can occur during object construction really are big, obvious errors that should halt execution.
On the other hand, if you prefer that doesn't happen, I think I'd prefer 2 over 1, because it's just as easy to check for an undefined object as it is to check for some flag variable. This isn't C, so we don't have a strong typing constraint telling us that our constructor MUST return an object of this type. So returning undef, and checking for that to establish success or failure, is a great choice.
The 'overhead' of construction failure is a consideration in certain edge cases (where you can't quickly fail before incurring overhead), so for those you might prefer method 1. So again, it depends on what semantics you've defined for object construction. For example, I prefer to do heavyweight initialization outside of construction. As to standardization, I think that checking whether a constructor returns a defined object is as good a standard as checking a flag variable.
EDIT: In response to your edit about initializers rejecting case #2, I don't see why an initializer can't simply return a value that indicates success or failure rather than setting a flag variable. Actually, you may want to use both, depending on how much detail you want about the error that occurred. But it would be perfectly valid for an initializer to return true on success and undef on failure.
I prefer:
Do as little initialization as possible in the constructor.
croak with an informative message when something goes wrong.
Use appropriate initialization methods to provide per object error messages etc
In addition, returning undef (instead of croaking) is fine in case the users of the class may not care why exactly the failure occurred, only if they got a valid object or not.
I despise easy to forget is_valid methods or adding extra checks to ensure methods are not called when the internal state of the object is not well defined.
I say these from a very subjective perspective without making any statements about best practices.
I would recommend against #1 simply because it leads to more error handling code which will not be written. For example, if you just return false then this works fine.
my $obj = Class->new or die "Construction failed...";
But if you return an object which is invalid...
my $obj = Class->new;
die "Construction failed @{[ $obj->error_message ]}" unless $obj->is_valid;
And as the quantity of error handling code increases, the probability of it being written decreases. And it's not linear. By increasing the complexity of your error handling system you actually decrease the number of errors it will catch in practical use.
You also have to be careful that your invalid object in question dies when any method is called (aside from is_valid and error_message) leading to yet more code and opportunities for mistakes.
But I agree there is value in being able to get information about the failure, which makes returning false (just return, not return undef) inferior. Traditionally this is done by calling a class method or global variable as in DBI.
my $dbh = DBI->connect($data_source, $username, $password)
or die $DBI::errstr;
But it suffers from A) you still have to write error handling code and B) it's only valid for the last operation.
The best thing to do, in general, is throw an exception with croak. Now in the normal case the user writes no special code, the error occurs at the point of the problem, and they get a good error message by default.
my $obj = Class->new;
Perl's traditional recommendation against throwing exceptions in library code as being impolite is outdated. Perl programmers are (finally) embracing exceptions. Rather than writing error handling code over and over again, badly and often forgetting, exceptions DWIM. If you're not convinced just start using autodie (watch pjf's video about it) and you'll never go back.
Exceptions align Huffman encoding with actual use. The common case of expecting the constructor to just work and wanting an error if it doesn't is now the least code. The uncommon case of wanting to handle that error requires writing special code. And the special code is pretty small.
my $obj = eval { Class->new } or do { something else };
If you find yourself wrapping every call in an eval you are doing it wrong. Exceptions are called that because they are exceptional. If, as in your comment above, you want graceful error handling for the user's sake, then take advantage of the fact that errors bubble up the stack. For example, if you want to provide a nice user error page and also log the error you can do this:
eval {
    run_the_main_web_code();
} or do {
    log_the_error($@);
    print_the_pretty_error_page;
};
You only need it in one place, at top of your call stack, rather than scattered everywhere. You can take advantage of this at smaller increments, for example...
my $users = eval { Users->search({ name => $name }) } or do {
...handle an error while finding a user...
};
There's two things going on. 1) Users->search always returns a true value, in this case an array ref. That makes the simple my $obj = eval { Class->method } or do work. That's optional. But more importantly 2) you only need to put special error handling around Users->search. All the methods called inside Users->search and all the methods they call... they just throw exceptions. And they're all caught at one point and handled the same. Handling the exception at the point which cares about it makes for much neater, compact and flexible error handling code.
You can pack more information into the exception by croaking with a string overloaded object rather than just a string.
my $obj = eval { Class->new }
    or die "Construction failed: $@ and there were @{[ $@->num_frobnitz ]} frobnitzes";
Exceptions:
Do the right thing without any thought by the caller
Require the least code for the most common case
Provide the most flexibility and information about the failure to the caller
Modules such as Try::Tiny fix most of the hanging issues surrounding using eval as an exception handler.
As for your use case where you might have a very expensive object and want to try and continue with it partially built... smells like YAGNI to me. Do you really need it? Or do you have a bloated object design which is doing too much work too early? If you do need it, you can put the information necessary to continue the construction in the exception object.
First the pompous general observations:
A constructor's job should be: Given valid construction parameters, return a valid object.
A constructor that does not construct a valid object cannot perform its job and is therefore a perfect candidate for exception generation.
Making sure the constructed object is valid is part of the constructor's job. Handing out a known-to-be-bad object and relying on the client to check that the object is valid is a surefire way to wind up with invalid objects that explode in remote places for non-obvious reasons.
Checking that all the correct arguments are in place before the constructor call is the client's job.
Exceptions provide a fine-grained way of propagating the particular error that occurred without needing to have a broken object in hand.
return undef; is always bad[1]
bIlujDI' yIchegh()Qo'; yIHegh()!
Now to the actual question, which I will construe to mean "what do you, darch, consider the best practice and why". First, I'll note that returning a false value on failure has a long Perl history (most of the core works that way, for example), and a lot of modules follow this convention. However, it turns out this convention produces inferior client code and newer modules are moving away from it.[2]
[The supporting argument and code samples for this turn out to be the more general case for exceptions that prompted the creation of autodie, and so I will resist the temptation to make that case here. Instead:]
Having to check for successful creation is actually more onerous than checking for an exception at an appropriate exception-handling level. The other solutions require the immediate client to do more work than it should have to just to obtain an object, work that is not required when the constructor fails by throwing an exception.[3] An exception is vastly more expressive than undef and equally expressive as passing back a broken object for purposes of documenting errors and annotating them at various levels in the call stack.
You can even get the partially-constructed object if you pass it back in the exception. I think this is a bad practice per my belief about what a constructor's contract with its clients ought to be, but the behavior is supported. Awkwardly.
So: A constructor that cannot create a valid object should throw an exception as early as possible. The exceptions a constructor can throw should be documented parts of its interface. Only the calling levels that can meaningfully act on the exception should even look for it; very often, the behavior of "if this construction fails, don't do anything" is exactly correct.
[1]: By which I mean, I am not aware of any use cases where return; is not strictly superior. If someone calls me on this I might have to actually open a question. So please don't. ;)
[2]: Per my extremely unscientific recollection of the module interfaces I've read in the last two years, subject to both selection and confirmation biases.
[3]: Note that throwing an exception does still require error-handling, as would the other proposed solutions. This does not mean wrapping every instantiation in an eval unless you actually want to do complex error-handling around every construction (and if you think you do, you're probably wrong). It means wrapping the call which is able to meaningfully act on the exception in an eval.