Side Effect Tracking in SSA - compiler-optimization

I am working on an optimizer for Java bytecode and decided to use SSA. However, most optimizations require all operations to be purely functional, so in order to handle side effects, I decided to add an extra opaque state parameter and return value for every operation which could potentially have side effects. This will prevent optimizing away or reordering operations with side effects. For example, ignoring exception handling, you'd get something like this pseudocode.
function arguments: x1, e1
if x1 != 0
    x2 = add(x1, 3)
    x3, e2 = invoke(foo, x2, e1)
x4 = phi(x1, x3)
e3 = phi(e1, e2)
return x4, e3
Is there a name for what I'm doing? Is it a good approach? I have heard that functional languages have a concept called Monads, which sounds similar but is not the same. Is using monads a better approach? If so, how can I modify this to use monads?

This was too long to fit in a comment, but it's not really meant to be an answer.
Firm calls these "memory edges"; there may be other names. I've heard it called "explicit state passing", but Google appears to disagree.
But I disagree with your premise - most SSA optimizations work just fine (if at times a little kludgy) in the presence of side effects. What doesn't work is using a graph representation without making side effects explicit (obviously the order would disappear). But you only need something to make that order explicit - SSA also works as a "list of operations", where the order is simply fixed (except perhaps in an explicit reordering phase), but even then it can be easier to make side effects explicit (it leads to fewer special cases in optimizations, etc.).
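As a rough illustration only (a hypothetical toy IR in Scala, not how Firm or any real compiler defines its nodes), making the state explicit turns ordering into an ordinary data dependency:

// Toy graph-IR sketch: Invoke consumes and produces a state value, so ordinary
// def-use edges already pin down the order of side-effecting operations.
sealed trait Node
final case class Param(name: String)                            extends Node
final case class Const(value: Int)                              extends Node
final case class Add(left: Node, right: Node)                   extends Node
final case class Invoke(target: String, arg: Node, state: Node) extends Node  // yields (value, new state)
final case class Result(call: Invoke)                           extends Node  // projects the value
final case class NewState(call: Invoke)                         extends Node  // projects the state
final case class Phi(a: Node, b: Node)                          extends Node

// The example from the question, expressed as nodes:
val x1 = Param("x1"); val e1 = Param("e1")
val x2 = Add(x1, Const(3))
val call = Invoke("foo", x2, e1)
val x3 = Result(call); val e2 = NewState(call)
val x4 = Phi(x1, x3)
val e3 = Phi(e1, e2)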
It's a fine approach. I don't know how it relates to Monads though, I don't understand them and I probably never will.

Related

What is the benefit of an effect system (e.g. ZIO)?

I'm having a hard time understanding what value effect systems, like ZIO or Cats Effect, provide.
They do not make code more readable, e.g.:
val wrappedB = for {
  a <- getA() // : ZIO[R, E, A]
  b <- getB(a) // : ZIO[R, E, B]
} yield b
is no more readable to me than:
val a = getA() // : A
val b = getB(a) // : B
I could even argue that the latter is more straightforward, because calling a function executes it, instead of just creating an effect or execution pipeline.
Delayed execution does not sound convincing, because all the examples I've encountered so far just execute the pipeline right away anyway. Being able to execute effects in parallel or multiple times can be achieved in simpler ways IMHO, e.g. C# has Parallel.ForEach.
Composability. Functions can be composed without using effects, e.g. by plain composition.
Pure functional methods. In the end the pure instructions will be executed, so it seems like it's just pretending DB access is pure. It does not help to reason, because while construction of the instructions is pure, executing them is not.
I may be missing something or just downplaying the benefits above or maybe benefits are bigger in certain situations (e.g. complex domain).
What are the biggest selling points to use effect systems?
Because it makes it easy to deal with side effects. From your example:
a <- getA() // ZIO[R, E, A] (doesn't have to be ZIO btw)
val a = getA(): A
The first getA accounts for the effect and for the possibility of returning an error, a side effect. This would be like getting an A from some DB where the said A may not exist or where you lack permission to access it. The second getA would be like a simple def getA = "A".
How do we put these methods together? What if one throws an error? Should we still proceed to the next method or just stop? What if one blocks your thread?
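To make that concrete, here is a minimal ZIO-style sketch (getA and getB are made up, and the exact ZIO 2.x signatures may differ slightly from what your version exposes):

import zio._

// Hypothetical effects: each may fail with a String error (think DB lookup, permissions, ...).
def getA(): ZIO[Any, String, Int] =
  ZIO.succeed(1)

def getB(a: Int): ZIO[Any, String, String] =
  if (a > 0) ZIO.succeed(s"b-$a")
  else ZIO.fail("a must be positive")       // failure is a value, not a thrown exception

// Composition: if getA fails, getB never runs; the error short-circuits the whole chain.
val wrappedB: ZIO[Any, String, String] =
  for {
    a <- getA()
    b <- getB(a)
  } yield b

// Recovering from the error is explicit and composable.
val withFallback: ZIO[Any, Nothing, String] =
  wrappedB.catchAll(err => ZIO.succeed(s"fallback ($err)"))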
Hopefully that addresses your second point about composability. To quickly address the rest:
Delayed execution. There are probably two reasons for this. The first is that you don't want to accidentally start an execution just because you wrote it down; that would break what the cool guys refer to as referential transparency. The second is that concurrent execution requires a thread pool or execution context. Normally we want a centralized place where we can fine-tune it for the whole app, and when building a library we can't provide it ourselves; it's the users who provide it. In fact we can also defer the effect: all you do is define how the effect should behave, and the users can run it with ZIO, Monix, etc. It's totally up to them.
Purity. Technically speaking, wrapping a process in a pure effect doesn't necessarily mean the underlying process is pure; only the implementation knows whether it really is. What we can do is lift it so that it becomes compatible with the rest of the composition.
What makes programming with ZIO or Cats Effect great is concurrent programming. There are other reasons too, but this one is IMHO where I got the "Aha! Now I get it" moment.
Try to write a program that monitors the contents of several folders and, for each file added to those folders, parses its content, with no more than 4 files being parsed at the same time (like the example in the video "What Java developers could learn from ZIO" by Adam Fraser on YouTube: https://www.youtube.com/watch?v=wxpkMojvz24).
I mean, this is really easy to write in ZIO :)
The whole idea behind combining data structures (a ZIO is a data structure) in order to make bigger data structures is so easy to understand that I would not want to code without it for complex problems :)
The two examples are not comparable: in the first form, an error in the first statement marks the value of the reified sequence as failed, while in the second form it halts the whole program. To be comparable, the second form would have to be a function definition that properly encapsulates the two statements, followed by an assignment of the result of its call.
But more than that, in order to completely mimic the first form, additional code has to be written to catch exceptions and build a proper failed result, while ZIO gives you all of this for free...
I think that the ability to cleanly propagate the error state between successive statements is the real value of the ZIO approach. Any composite ZIO program fragment is then fully composable itself.
That's the main benefit of any workflow based approach, anyway.
It is this modularity which gives to effect handling its real value.
Since an effect is an action which structurally may produce errors, handling effects like this is an excellent way to handle errors in a composable way. In fact, handling effects largely consists of handling errors!

Why variable assignments should not be used in functional programming

I am learning functional programming and I can understand why immutability is preferred over mutable objects.
This article also explains it well.
But I am unable to understand why assignments should not be performed inside pure functions.
One reason that I can understand is that variable mutability leads to locking, and since in a pure function in Scala we mostly use tail recursion, variables/objects are created on the call stack rather than on the heap.
Is there any other reason why one should avoid variable assignment in functional programming?
There is a difference between assignment and re-assignment. Re-assignment is not allowed in functional programming because it is mutation, and mutation is not allowed in pure functions. Variable assignment is allowed.
val a = 1 //assignment allowed
a = 2 //re-assignment not allowed
Reading from the external world in an impure fashion (reading state that can change) is a side effect in functional programming.
So a function accessing a global variable which can potentially be mutated is not pure.
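A tiny sketch of the difference (counter, impureNext and pureNext are made up for illustration):

var counter = 0                          // mutable global state

def impureNext(): Int = {                // not pure: the result depends on hidden, changing state
  counter += 1
  counter
}

def pureNext(current: Int): Int =        // pure: the result depends only on the argument
  current + 1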
Just makes life easy.
Generally
When you are disciplined, life is less chaotic. That's exactly what functional programming advocates. When life is less chaotic you can concentrate on better things in life.
So, the main reason for immutability
It becomes hard to reason about the correctness of the program with mutations; in the case of concurrent programs this is very painful to debug.
That means it becomes hard to keep track of the changes variables undergo in order to understand the code or to debug the program.
Mutation is one of the side effects which makes a program hard to understand and reason about.
Functional programming enforces this discipline (use of immutability) so that code is maintainable, expressive and understandable.
Mutation is one of the side effects.
A pure function is one which does not have side effects.
Side effects:
Mutation of variables
Mutation of mutable data structures
Reading or writing to a file/console (external source)
Throwing exceptions that halt the program
Avoiding the above-mentioned side effects makes a function depend only on its parameters rather than on any outside values or state.
A pure function is the most isolated function: it neither reads from the world nor writes to the world, and it does not halt or break the program's control flow.
These properties make a pure function easy to understand and reason about.
A pure function is a mathematical function
It is a mapping from a domain to a co-domain where every value in the domain is mapped to exactly one value in the co-domain.
That means if f(2) is equal to 4, then f(2) is 4 irrespective of what the state of the world is.
A pure function is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to exactly one output.

What does the "world" mean in functional programming world?

I've been diving into functional programming for more than 3 years and I've been reading and understanding many articles and aspects of functional programming.
But I often stumble on articles about the "world" in side-effecting computations, and also about carrying and copying the "world" in IO monad samples. What does the "world" mean in this context? Is it the same "world" in all side-effecting computation contexts, or does it only apply to IO monads?
Also the documentation and other articles about Haskell mention the "world" many times.
Some reference about this "world":
http://channel9.msdn.com/Shows/Going+Deep/Erik-Meijer-Functional-Programming
and this:
http://www.infoq.com/presentations/Taming-Effect-Simon-Peyton-Jones
I expect a sample, not just an explanation of the world concept. I welcome sample code in Haskell, F#, Scala, or Scheme.
The "world" is just an abstract concept that captures "the state of the world", i.e. the state of everything outside the current computation.
Take this I/O function, for example:
write : Filename -> String -> ()
This is non-functional, as it changes the file (whose content is part of the state of the world) by side effect. If, however, we modelled the world as an explicit object, we could provide this function:
write : World -> Filename -> String -> World
This takes the current world and functionally produces a "new" one, with the file modified, which you can then pass to consecutive calls. The World itself is just an abstract type, there is no way to peek at it directly, except through corresponding functions like read.
Now, there is one problem with the above interface: without further restrictions, it would allow a program to "duplicate" the world. For example:
w1 = write w "file" "yes"
w2 = write w "file" "no"
You've used the same world w twice, producing two different future worlds. Obviously, this makes no sense as a model for physical I/O. To prevent examples like that, a more fancy type system is needed that makes sure that the world is handled linearly, i.e., never used twice. The language Clean is based on a variation of this idea.
Alternatively, you can encapsulate the world such that it never becomes explicit and thereby cannot be duplicated by construction. That is what the I/O monad achieves -- it can be thought of as a state monad whose state is the world, which it threads through the monadic actions implicitly.
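Here is a rough Scala rendering of that explicit-world interface (World, write, read and program are illustrative only; note that plain Scala cannot, by itself, enforce the linearity described above):

final class World private ()                                  // opaque: user code cannot build or inspect it

object UnsafeIO {
  // Stubs: in a real system the actual I/O would happen here.
  def write(w: World, file: String, contents: String): World = w
  def read(w: World, file: String): (String, World)          = ("contents", w)
}

// Linear use: each call consumes the previous world and yields the next one.
def program(w0: World): World = {
  val w1 = UnsafeIO.write(w0, "file", "yes")
  val w2 = UnsafeIO.write(w1, "file", "no")
  w2
}

// Nothing here prevents the forbidden duplication, e.g.
//   val a = UnsafeIO.write(w0, "file", "yes")
//   val b = UnsafeIO.write(w0, "file", "no")
// which is why Clean adds uniqueness types and Haskell hides the world inside the IO monad.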
The "world" is a concept involved in one kind of embedding of imperative programming into a purely functional language.
As you most certainly know, purely functional programming requires the result of a function to depend exclusively on the values of the arguments. So suppose we want to express a typical getLine operation as a pure function. There are two evident problems:
getLine can produce a different result each time it's called with the same arguments (no arguments, in this case).
getLine has the side effect of consuming some portion of a stream. If your program uses getLine, then (a) each invocation of it must consume a different part of the input, (b) each part of the program's input must be consumed by some invocation. (You can't have two calls to getLine reading the same input line twice unless that line occurs twice in the input; you can't have the program randomly skip a line of input either.)
So getLine just can't be a function, right? Well, not so fast, there are some tricks we could do:
Multiple calls to getLine can return different results. To make that compatible with purely functional behavior, this means that a purely functional getLine could take an argument: getLine :: W -> String. Then we can reconcile the idea of different results on each call by stipulating that each call must be made with a different value for the W argument. You could imagine that W represents the state of the input stream.
Multiple calls to getLine must be executed in some definite order, and each must consume the input that was left over from the previous call. Change: give getLine the type W -> (String, W), and forbid programs from using a W value more than once (something that we can check at compilation). Now to use getLine more than once in your program you must take care to feed the earlier call's W result to the succeeding call.
As long as you can guarantee that Ws are not reused, you can use this sort of technique to translate any (single-threaded) imperative program into a purely functional one. You don't even need to have any actual in-memory objects for the W type—you just type-check your program and analyze it to prove that each W is only used once, then emit code that doesn't refer to anything of the sort.
So the "world" is just this idea, but generalized to cover all imperative operations, not just getLine.
Now having explained all that, you may be wondering if you're better off knowing this. My opinion is no, you aren't. See, IMO, the whole "passing the world around" idea is one of those things like monad tutorials, where too many Haskell programmers have chosen to be "helpful" in ways that actually aren't.
"Passing the world around" is routinely offered as an "explanation" to help newbies understand Haskell IO. But the problem is that (a) it's a really exotic concept for many people to wrap their heads around ("what do you mean I'm going to pass the state of the whole world around?"), (b) very abstract (a lot of people can't wrap their head around the idea that nearly every function your program will have an unused dummy parameter that neither appears in the source code nor the object code), and (c) not the easiest, most practical explanation anyway.
The easiest, most practical explanation of Haskell I/O, IMHO, goes like this:
Haskell is purely functional, so things like getLine can't be functions.
But Haskell has things like getLine. This means those things are something else that's not a function. We call them actions.
Haskell allows you to treat actions as values. You can have functions that produce actions (e.g., putStrLn :: String -> IO ()), functions that accept actions as arguments (e.g., (>>) :: IO a -> IO b -> IO b), etc.
Haskell however has no function that executes an action. There can't be an execute :: IO a -> a because it would not be a true function.
Haskell has built-in functions to compose actions: make compound actions out of simple actions. Using basic actions and action combinators, you can describe any imperative program as an action.
Haskell compilers know how to translate actions into executable native code. So you write an executable Haskell program by writing a main :: IO () action in terms of subactions.
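A toy Scala sketch of the "actions are values" idea (Action, PutStrLn, AndThen and run are all made up; real IO types also let later actions depend on earlier results, which this simplification omits):

sealed trait Action                                            // an action is just a description
final case class PutStrLn(line: String)                  extends Action
final case class AndThen(first: Action, second: Action)  extends Action

// Functions can produce actions without executing anything.
def greet(name: String): Action = PutStrLn(s"Hello, $name!")

// Composing actions only builds a bigger description; nothing has printed yet.
val program: Action = AndThen(PutStrLn("What is your name?"), greet("world"))

// Only the interpreter (the "runtime") actually performs the effects.
def run(action: Action): Unit = action match {
  case PutStrLn(line)         => println(line)
  case AndThen(first, second) => run(first); run(second)
}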
Passing around values that represent "the world" is one way to make a pure model for doing IO (and other side effects) in pure declarative programming.
The "problem" with pure declarative (not just functional) programming is obvious. Pure declarative programming provides a model of computation. These models can express any possible computation, but in the real world we use programs to have computers do things that aren't computation in a theoretical sense: taking input, rendering to displays, reading and writing storage, using networks, controlling robots, etc, etc. You can directly model almost all of such programs as computation (e.g. what output should be written to a file given this input is a computation), but the actual interactions with things outside the program just isn't part of the pure model.
That's actually true of imperative programming too. The "model" of computation that is the C programming language provides no way to write to files, read from keyboards, or anything. But the solution in imperative programming is trivial. Performing a computation in the imperative model is executing a sequences of instructions, and what each instruction actually does depends on the whole environment of the program at the time it is executed. So you can just provide "magic" instructions that carry out your IO actions when they are executed. And since imperative programmers are used to thinking about their programs operationally1, this fits very naturally with what they're already doing.
But in all pure models of computation, what a given unit of computation (function, predicate, etc) will do should only depend on its inputs, not on some arbitrary environment that can be different every time. So not only performing IO actions but also implementing computations which depend on the universe outside the program is impossible.
The idea for the solution is fairly simple though. You build a model for how IO actions work within the whole pure model of computation. Then all the principles and theories that apply to the pure model in general will also apply to the part of it that models IO. Then, within the language or library implementation (because it's not expressible in the language itself), you hook up manipulations of the IO model to actual IO actions.
This brings us to passing around a value that represents the world. For example, a "hello world" program in Mercury looks like this:
:- pred main(io::di, io::uo) is det.
main(InitialWorld, FinalWorld) :-
    print("Hello world!", InitialWorld, TmpWorld),
    nl(TmpWorld, FinalWorld).
The program is given InitialWorld, a value in the type io which represents the entire universe outside the program. It passes this world to print, which gives it back TmpWorld, the world that is like InitialWorld but in which "Hello world!" has been printed to the terminal, and whatever else has happened in the meantime since InitialWorld was passed to main is also incorporated. It then passes TmpWorld to nl, which gives back FinalWorld (a world that is very like TmpWorld but it incorporates the printing of the newline, plus any other effects that happened in the meantime). FinalWorld is the final state of the world passed out of main back to the operating system.
Of course, we're not really passing around the entire universe as a value in the program. In the underlying implementation there usually isn't a value of type io at all, because there's no information that's useful to actually pass around; it all exists outside the program. But using the model where we pass around io values allows us to program as if the entire universe was an input and output of every operation that is affected by it (and consequently see that any operation that doesn't take an input and output io argument can't be affected by the external world).
And in fact, usually you wouldn't actually even think of programs that do IO as if they're passing around the universe. In real Mercury code you'd use the "state variable" syntactic sugar, and write the above program like this:
:- pred main(io::di, io::uo) is det.
main(!IO) :-
    print("Hello world!", !IO),
    nl(!IO).
The exclamation point syntax signifies that !IO really stands for two arguments, IO_X and IO_Y, where the X and Y parts are automatically filled in by the compiler such that the state variable is "threaded" through the goals in the order in which they are written. This is not just useful in the context of IO btw, state variables are really handy syntactic sugar to have in Mercury.
So the programmer actually tends to think of this as a sequence of steps (depending on and affecting external state) that are executed in the order in which they are written. !IO almost becomes a magic tag that just marks the calls to which this applies.
In Haskell, the pure model for IO is a monad, and a "hello world" program looks like this:
main :: IO ()
main = putStrLn "Hello world!"
One way to interpret the IO monad is similarly to the State monad: it automatically threads a state value through, and every value in the monad can depend on or affect this state. Only in the case of IO, the state being threaded is the entire universe, as in the Mercury program. With Mercury's state variables and Haskell's do notation, the two approaches end up looking quite similar, with the "world" automatically threaded through in a way that respects the order in which the calls were written in the source code, but still having IO actions explicitly marked.
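A toy Scala sketch of that "state monad whose state is the world" reading (everything here, including World, is purely notional):

object WorldIO {
  final class World                                 // purely notional: never actually constructed

  type IO[A] = World => (A, World)                  // an action maps the world to a result plus the next world

  def pure[A](a: A): IO[A] = w => (a, w)

  def flatMap[A, B](io: IO[A])(f: A => IO[B]): IO[B] =
    w0 => {
      val (a, w1) = io(w0)                          // run the first action, get the updated world
      f(a)(w1)                                      // thread that world into the next action
    }
}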
As explained quite well in sacundim's answer, another way to interpret Haskell's IO monad as a model for IO-ish computations is to imagine that putStrLn "Hello world!" isn't in fact a computation through which "the universe" needs to be threaded, but rather that putStrLn "Hello World!" is itself a data structure describing an IO action that could be taken. On this understanding, what programs in the IO monad are doing is using pure Haskell to generate, at runtime, an imperative program. In pure Haskell there's no way to actually execute that program, but since main is of type IO (), main itself evaluates to such a program, and we just know operationally that the Haskell runtime will execute the main program.
Since we're hooking up these pure models of IO to actual interactions with the outside world, we need to be a little careful. We're programming as if the entire universe was a value we can pass around the same as other values. But other values can be passed into multiple different calls, stored in polymorphic containers, and many other things that don't make any sense in terms of the actual universe. So we need some restrictions that prevent us from doing anything with "the world" in the model that doesn't correspond to anything that can actually be done to the real world.
The approach taken in Mercury is to use unique modes to enforce that the io value remains unique. That's why the input and output world were declared as io::di and io::uo respectively; it's shorthand for declaring that the type of the first parameter is io and its mode is di (short for "destructive input"), while the type of the second parameter is io and its mode is uo (short for "unique output"). Since io is an abstract type, there's no way to construct new ones, so the only way to meet the uniqueness requirement is to always pass the io value to at most one call, which must also give you back a unique io value, and then to output the final io value from the last thing you call.
The approach taken in Haskell is to use the monad interface to allow values in the IO monad to be constructed from pure data and from other IO values, but not expose any functions on IO values that would allow you to "extract" pure data from the IO monad. This means that only the IO values incorporated into main will ever do anything, and those actions must be correctly sequenced.
I mentioned before that programmers doing IO in a pure language still tend to think operationally about most of their IO. So why go to all this trouble to come up with a pure model for IO if we're only going to think about it the same way imperative programmers do? The big advantage is that now all the theories/code/whatever that apply to all of the language apply to IO code as well.
For example, in Mercury the equivalent of fold processes a list element-by-element to build up an accumulator value, which means fold takes an input/output pair of variables of some arbitrary type as the accumulator (this is a very common pattern in the Mercury standard library, and is why I said state variable syntax often turns out to be very handy in other contexts than IO). Since "the world" appears in Mercury programs explicitly as a value in the type io, it's possible to use io values as the accumulator! Printing a list of strings in Mercury is as simple as foldl(print, MyStrings, !IO). Similarly in Haskell, generic monad/functor code works just fine on IO values. We get a whole lot of "higher-order" IO operations that would have to be implemented anew specialised to IO in a language that handles IO by some completely special mechanism.
Also, since we avoid breaking the pure model by IO, theories that are true of the computational model remain true even in the presence of IO. This makes reasoning by the programmer and by program-analysis tools not have to consider whether IO might be involved. In languages like Scala for example, even though much "normal" code is in fact pure, optimizations and implementation techniques that work on pure code are generally inapplicable, because the compiler has to presume that every single call might contain IO or other effects.
1 Thinking about programs operationally means understanding them in terms of the operations the computer will carry out when executing them.
I think the first thing we should read about this subject is Tackling the Awkward Squad. (I didn't do so and I regret it.)
The author actually describes the GHC's internal representation of IO as world -> (a,world) as "a bit of a hack".
I think this "hack" is meant to be a kind of innocent lie. I think there are two kinds of lie here:
GHC pretends the 'world' is representable by some variable.
The type world -> (a,world) basically says that if we could in some way instantiate the world, then the "next state" of our world is functionally determined by some small program running on a computer. Since this is clearly not realizable, the primitives are (of course) implemented as functions with side effects, ignoring the meaningless "world" parameter, just like in most other languages.
The author defends this "hack" on two grounds:
By treating the IO as a thin wrapper of the type world -> (a,world), GHC can reuse many optimizations for the IO code, so this design is very practical and economical.
The operational semantics of the IO computation implemented as above can be proved sound provided the compiler satisfies certain properties. This paper is cited for the proof of this.
The problem (that I wanted to ask about here, but you asked it first, so forgive me for writing it here) is that in the presence of the standard 'lazy IO' functions, I'm no longer sure that GHC's operational semantics remains sound.
The standard 'lazy IO' functions such as hGetContents internally call unsafeInterleaveIO, which in turn is equivalent to unsafeDupableInterleaveIO for single-threaded programs.
unsafeDupableInterleaveIO :: IO a -> IO a
unsafeDupableInterleaveIO (IO m)
  = IO ( \ s -> let r = case m s of (# _, res #) -> res
                in (# s, r #))
Pretending that equational reasoning still works for this kind of program (note that m is an impure function) and ignoring the IO constructor, we have
unsafeDupableInterleaveIO m >>= f ==> \world -> f (snd (m world)) world , which semantically would have the same effect that Andreas Rossberg described above: it "duplicates" the world. Since our world cannot be duplicated this way, and the precise evaluation order of a Haskell program is virtually unpredictable --- what we get is an almost unconstrained and unsynchronized concurrency racing for some precious system resources such as file handles. This kind of operation is of course never considered in Ariola&Sabry. So I disagree with Andreas in this respect -- the IO monad doesn't really thread the world properly even if we restrict ourselves within the limit of the standard library (and this is why some people say lazy IO is bad).
The world means just that - the physical, real world. (There is only one, mind you.)
By neglecting physical processes that are confined to the CPU and memory, one can classify every function:
Those that do not have effects in the physical world (except for ephemeral, mostly unobservable effects in the CPU and RAM)
Those that do have observable effects, for example: printing something on the printer, sending electrons through network cables, launching rockets, or moving disk heads.
The distinction is a bit artificial, insofar as running even the purest Haskell program in reality does have observable effects, like: your CPU getting hotter, which causes the fan to turn on.
Basically every program you write can be divided into 2 parts (in the FP world; in the imperative/OO world there is no such distinction).
Core/Pure part: This is the actual logic/algorithm of the application, the part that solves the problem for which you built the application. (95% of applications today lack this part, as they are just a mess of API calls sprinkled with if/else, and people start calling themselves programmers.) For example, in an image manipulation tool the algorithm that applies various effects to the image belongs to this core part. So in FP, you build this core part using FP concepts like purity: you build functions that take input and return a result, and there is no mutation whatsoever in this part of your application.
The outer layer part: Now let's say you have completed the core part of the image manipulation tool and have tested the algorithms by calling functions with various inputs and checking the output. But this isn't something you can ship: how is the user supposed to use this core part? It has no face; it is just a bunch of functions. To make this core usable from the end user's point of view, you need to build some sort of UI, a way to read files from disk, maybe an embedded database to store user preferences, and the list goes on. This interaction with all the other stuff, which is not the core concept of your application but is still required to make it usable, is called the world in FP.
Exercise: Think about any application you have built earlier and try to divide it into the above-mentioned 2 parts; hopefully that will make things clearer.
The world refers to interacting with the real world / having side effects - for example
fprintf file "hello world"
which has a side effect - the file has had "hello world" added to it.
This is opposed to purely functional code like
let add a b = a + b
which has no side effects

Is Either the equivalent to checked exceptions?

Beginning in Scala and reading about Either, I naturally compare new concepts to something I know (in this case from Java). Are there any differences between the concept of checked exceptions and Either?
In both cases
the possibility of failure is explicitly annotated in the method (throws or returning Either)
the programmer can handle the error case directly when it occurs or move it up (returning again an Either)
there is a way to inform the caller about the reason of the error
I suppose one uses for-comprehensions on Either to write code as if there were no error, similar to checked exceptions.
I wonder if I am the only beginner who has trouble seeing the difference.
Thanks
Either can be used for more than just exceptions. For example, if you were to have a user either type input for you or specify a file containing that input, you could represent that as Either[String, File].
Either is very often used for exception handling. The main difference between Either and checked exceptions is that control flow with Either is always explicit. The compiler really won't let you forget that you are dealing with an Either; it won't collect Eithers from multiple places without you being aware of it, everything that is returned must be an Either, etc. Because of this, you use Either not when maybe something extraordinary will go wrong, but as a normal part of controlling program execution. Also, Either does not capture a stack trace, making it much more efficient than a typical exception.
One other difference is that exceptions can be used for control flow. Need to jump out of three nested loops? No problem--throw an exception (without a stack trace) and catch it on the outside. Need to jump out of five nested method calls? No problem! Either doesn't supply anything like this.
That said, as you've pointed out there are a number of similarities. You can pass back information (though Either makes that trivial, while checked exceptions make you write your own class to store any extra information you want); you can pass the Either on or you can fold it into something else, etc.
So, in summary: although you can accomplish the same things with Either and checked exceptions with regards to explicit error handling, they are relatively different in practice. In particular, Either makes creating and passing back different states really easy, while checked exceptions are good at bypassing all your normal control flow to get back, hopefully, to somewhere that an extraordinary condition can be sensibly dealt with.
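A small concrete illustration (parseAge and validateAge are made-up helpers; assumes Scala 2.13+, where Either is right-biased):

def parseAge(s: String): Either[String, Int] =
  s.toIntOption.toRight(s"not a number: $s")

def validateAge(n: Int): Either[String, Int] =
  if (n >= 0) Right(n) else Left(s"negative age: $n")

// The error type is part of the signature, and the compiler makes you deal with it:
// the for-comprehension short-circuits on the first Left.
val result: Either[String, Int] =
  for {
    n <- parseAge("42")
    v <- validateAge(n)
  } yield v

result.fold(err => println(s"error: $err"), age => println(s"age: $age"))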
Either is equivalent to a checked exception in terms of the return signature forming an exclusive disjunction. The result can be a thrown exception X or an A. However, throwing an exception isn't equivalent to returning one – the first is not referentially transparent.
Where Scala's Either is not (as of 2.9) equivalent is bias: a method's return type is positively biased, and it requires effort to extract/deconstruct the exception, whereas Either is unbiased; you need to explicitly ask for the left or right value. This is a topic of some discussion, and in practice a bit of a pain - consider the following three calls to Either-producing methods:
for {
  a <- eitherA("input").right
  b <- eitherB(a).right
  c <- eitherC(b).right
} yield c // Either[Exception, C]
You need to manually thread through the right-hand side. This may not seem that onerous, but in practice it is a pain and somewhat surprising to newcomers.
Yes, Either is a way to embed exceptions in a language: a set of operations that can fail can throw an error value to some non-local site.
In addition to the practical issues Rex mentioned, there's some extra things you get from the simple semantics of an Either:
Either forms a monad, so you can use monadic operations over sets of expressions that evaluate to Either, e.g. for short-circuiting evaluation without having to test the result at each step.
Either is in the type -- so the type checker alone is sufficient to track incorrect handling of the value
Once you have the ability to return either an error message (Left s) or a successful value Right v, you can layer exceptions on top, as just Either plus an error handler, as is done for MonadError in Haskell.

Should Events be externally mutable?

I am playing around with FRP and was wondering about how the act of an Event 'occurring' should be handled publicly. By this, I mean should a programmer be able to do the following within an FRP context:
event.occur(now, 5)
I have never seen examples of this in any FRP papers and it doesn't feel right to me. I feel that FRP frameworks should really hide this type of action and that occurrences of Events should happen behind the scenes only. Am I correct in thinking this?
To clarify, my approach would be to have 'occur' only accessible to the Event class itself. If an abstraction for some external source was needed (such as a mouse) this could be built by extending the Event class. In this way all the logic dealing with occurrence creation is abstracted.
Well, if the FRP library exposes a way to bind to external events — e.g. an existing event-based framework — then it must provide functionality equivalent to this, or it couldn't interact with the outside world.
However, the question is really: what do you mean by "external"? The FRP system itself is usually taken to be pure, so the idea of executing side-effectful code like event.occur(now, 5) from inside the FRP system isn't even meaningful. Generally, of course, a facility to execute such code in response to FRP events is provided, but this is usually taken not as part of the pure programming model, but as a facility to interface the network as a whole with the outside world.
So, in my opinion, there's two possible ways to interpret this question:
Should it be possible to trigger an event from outside of the FRP system? — definitely yes, as it's required for interfacing with the outside world, but this does not affect the programming model of FRP itself.
Should it be possible to trigger an event from "inside" of the FRP system, assuming some facility for executing side-effectful code in reaction to an event? — also yes, because allowing normal side-effectful code to cause events but forbidding it inside the code executed in response to events seems like a very strange (and circumventable) restriction, given that the intention of the facility is to interface with the outside world.
Indeed, it's possible to cause something just like #2 even if you explicitly forbid it: consider setting things up so that switchToWindow 3 is executed when the event buttonClicked triggers, e.g. (using reactive-banana notation):
reactimate (switchToWindow 3 <$ buttonClicked)
And say that we have an event
newWindowFocused :: Event Int
The reaction we've set up causes the newWindowFocused event to fire, even if firing events from inside code executed due to an event is prevented.
Now, everything I've said so far concerns only "external" events: those not expressed with pure FRP, but explicitly created to represent events that occur in the outside world, beyond the FRP system. If you're asking whether there should be a facility to cause special occurrences in purely-defined events, then my response is: absolutely not! This destroys the meaning of the system, because suddenly fmap f (union e1 e2) doesn't mean "occurs with value f x when either e1 or e2 occurs with value x", but instead "occurs with value f x when either e1 or e2 occurs with value x... or when some external code randomly decides to fire it".
Not only would such a facility make reasoning about the behaviour of an FRP system essentially meaningless,1 it'd also violate referential transparency: if you construct two events equivalent to fmap f (union e1 e2), then you can distinguish them by firing one and noticing that the other doesn't occur. You simply can't prevent this in all cases: imagine fmap g (union e1 e2), where f computes the same function as g; equality on functions is not decidable :)
Of course, it's entirely possible to implement FRP in an impure language, but I think providing a way to violate the referential transparency of the FRP system itself is a very bad thing, as it is, after all, a pure model.
If I understand it correctly, your solution to this flaw in the API (namely, exposing occur publicly, which breaks referential transparency of equivalent events, etc., as I talked about above) would be to make occur internal to your Event class, so that it cannot be used from outside. I agree that, if you need occur internally, this is the correct solution. I also agree that it's reasonable to expose it to subclasses if your implementation of external events is done by subclassing Event. That falls under "outside world glue", which falls outside the purview of the FRP model itself, so it's perfectly OK to give it the ability to "break the rules" in this way - after all, that's essentially what it's for: disturbing the system with side effects :)
So, in conclusion:
No, events should not expose this interface.
Yes, you are correct in thinking this :)
1 Of course, you could argue that external events do this full stop, as the whole behaviour of the system ultimately depends on the "edges" hooked up to the outside world, but this isn't really true: yes, you can't really assume anything about the external events themselves, but you can still rely on everything you build out of them to obey the laws of their constructions. Offering an "external firing" facility to every event means that no construction has any laws.