Tagless-final effect propagation - scala

The tagless-final pattern lets us write pure functional programs which are explicit about the effects they require.
However, scaling this pattern might become challenging. I'll try to demonstrate this with an example. Imagine a simple program that reads records from the database and prints them to the console. We will require some custom typeclasses Database and Console, in addition to Monad from cats/scalaz in order to compose them:
def main[F[_]: Monad: Console: Database]: F[Unit] =
  read[F].flatMap(Console[F].print)

def read[F[_]: Functor: Database]: F[List[String]] =
  Database[F].read.map(_.map(recordToString))
The problem starts when I want to add a new effect to a function in the inner layers. For example, I want my read function to log a message if no records were found:
def read[F[_]: Monad: Database: Logger]: F[List[String]] =
  Database[F].read.flatMap {
    case Nil     => Logger[F].log("no records found") *> List.empty[String].pure[F]
    case records => records.map(recordToString).pure[F]
  }
But now, I have to add the Logger constraint to all the callers of read up the chain. In this contrived example it's just main, but imagine this is several layers down in a complicated real-world application.
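Concretely, main ends up looking something like this, even though it never logs anything itself:

def main[F[_]: Monad: Console: Database: Logger]: F[Unit] =
  read[F].flatMap(Console[F].print)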
We can look at this issue in two ways:
We can say it's a good thing that we're explicit about our effects, and that we know exactly which effects are needed by each layer.
We can also say that this leaks implementation details - main doesn't care about logging, it just needs the result of read. Also, in real applications you see really long chains of effects in the top layers. It feels like a code smell, but I can't put my finger on what other approach I could take.
Would love to get your insights on this.
Thanks.

We can also say that this leaks implementation details - main doesn't care about logging, it just needs the result of read. Also, in real applications you see really long chains of effects in the top layers. It feels like a code smell, but I can't put my finger on what other approach I can take.
I actually believe the contrary is true. One of the key promises of pure FP is equational reasoning as a means of deriving the method implementation from its signature. If read needs a logging effect in order to do its business, then by all means it should be declaratively expressed in the signature. Another advantage of being explicit about your effects is that when they start to accumulate, perhaps we need to rethink what this specific method is doing and split it up into smaller components? Or should this effect really be used here?
It is true that effects stack up, but as @TravisBrown mentioned in the comments, it is usually the highest place in the call stack that has to "suffer the consequence" of actually providing all the implicit evidence for the entire call tree.
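To make that concrete, here is a self-contained sketch of "paying once at the top" with cats-effect's IO. The algebra shapes and instances below are my own assumptions (the question doesn't show them), and Database.read is simplified to return strings directly so recordToString isn't needed:

import cats.Monad
import cats.effect.IO
import cats.syntax.all._

trait Console[F[_]]  { def print(lines: List[String]): F[Unit] }
trait Database[F[_]] { def read: F[List[String]] }
trait Logger[F[_]]   { def log(msg: String): F[Unit] }
object Console  { def apply[F[_]](implicit ev: Console[F]): Console[F] = ev }
object Database { def apply[F[_]](implicit ev: Database[F]): Database[F] = ev }
object Logger   { def apply[F[_]](implicit ev: Logger[F]): Logger[F] = ev }

object Wiring {
  def read[F[_]: Monad: Database: Logger]: F[List[String]] =
    Database[F].read.flatMap {
      case Nil     => Logger[F].log("no records found") *> List.empty[String].pure[F]
      case records => records.pure[F]
    }

  def main[F[_]: Monad: Console: Database: Logger]: F[Unit] =
    read[F].flatMap(Console[F].print)

  // All the implicit evidence for the entire call tree is provided here, once:
  implicit val consoleIO: Console[IO] = new Console[IO] {
    def print(lines: List[String]): IO[Unit] = IO(lines.foreach(println))
  }
  implicit val databaseIO: Database[IO] = new Database[IO] {
    def read: IO[List[String]] = IO(List("alice", "bob")) // stub data
  }
  implicit val loggerIO: Logger[IO] = new Logger[IO] {
    def log(msg: String): IO[Unit] = IO(println(s"LOG: $msg"))
  }

  // Only this line has to "know" about every effect in the program;
  // Monad[IO] comes from cats, the rest from the instances above.
  val program: IO[Unit] = main[IO]
}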

Related

How to convert `fs2.Stream[IO, T]` to `Iterator[T]` in Scala

Need to fill in the methods next and hasNext and preserve laziness:
new Iterator[T] {
  val stream: fs2.Stream[IO, T] = ...
  def next(): T = ???
  def hasNext: Boolean = ???
}
But I cannot figure out how on earth to do this from an fs2.Stream. All the methods on a Stream (or on the "compiled" thing) are fairly useless.
If this is simply impossible to do in a reasonable amount of code, then that itself is a satisfactory answer and we will just rip out fs2.Stream from the codebase - just want to check first!
fs2.Stream, while similar in concept to Iterator, cannot be converted to one while preserving laziness. I'll try to elaborate on why...
Both represent a pull-based series of items, but the way in which they represent that series and implement the laziness differs too much.
As you already know, Iterator represents its pull in terms of the next() and hasNext methods, both of which are synchronous and blocking. To consume the iterator and return a value, you can directly call those methods e.g. in a loop, or use one of its many convenience methods.
fs2.Stream supports two capabilities that make it incompatible with that interface:
cats.effect.Resource can be included in the construction of a Stream. For example, you could construct an fs2.Stream[IO, Byte] representing the contents of a file. When consuming that stream, even if you abort early or do some strange flatMap, the underlying Resource is honored and your file handle is guaranteed to be closed. If you tried to do the same thing with Iterator, the "abort early" case would pose problems, forcing you to do something like Iterator[Byte] with Closeable and the caller would have to make sure to .close() it, or some other pattern.
Evaluation of "effects". In this context, effects are types like IO or Future, where the process of obtaining the value may perform some possibly-asynchronous action, and may perform side-effects. Asynchrony poses a problem when trying to force the process into a synchronous interface, since it forces you to block your current thread to wait for the asynchronous answer, which can cause deadlocks if you aren't careful. Libraries like cats-effect strongly discourage you from calling methods like unsafeRunSync.
fs2.Stream does allow for some special cases that prevent the inclusion of Resource and Effects, via its Pure type alias which you can use in place of IO. That gets you access to Stream.PureOps, but that only gets you methods that consume the whole stream by building a collection; the laziness you want to preserve would be lost.
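For example, a pure stream can be consumed directly, but only by materializing a whole collection:

import fs2.Stream

// Stream[Pure, Int]: no effects, no resources
val xs: List[Int] = Stream(1, 2, 3).toList // List(1, 2, 3)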
Side note: you can convert an Iterator to a Stream.
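That direction does preserve laziness. For example (this is the fs2 3.x shape of fromIterator; older versions differ slightly in whether a chunk size is required):

import cats.effect.IO
import fs2.Stream

val fromIterator: Stream[IO, Int] = Stream.fromIterator[IO](Iterator(1, 2, 3), 16) // 16 = chunk size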
The only way to "convert" a Stream to an Iterator is to consume it to some collection type via e.g. .compile.toList, which would get you an IO[List[T]], then .map(_.iterator) that to get an IO[Iterator[T]]. But ultimately that doesn't fit what you're asking for since it forces you to consume the stream to a buffer, breaking laziness.
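In code, that buffering conversion looks roughly like this:

import cats.effect.IO
import fs2.Stream

val stream: Stream[IO, Int] = Stream.range(0, 10).covary[IO]

// Forces the whole stream into memory first; laziness is lost.
val asIterator: IO[Iterator[Int]] = stream.compile.toList.map(_.iterator)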
@Dima mentioned the "XY Problem", which was poorly received since they didn't really elaborate (initially) on the incompatibility, but they're right. It would be helpful to know why you're trying to make a Stream-to-Iterator conversion, in case there's some other approach that would serve your overall goal instead.

Why does Future have side effects?

I am reading the book FPiS and on page 107 the author says:
We should note that Future doesn’t have a purely functional interface.
This is part of the reason why we don’t want users of our library to
deal with Future directly. But importantly, even though methods on
Future rely on side effects, our entire Par API remains pure. It’s
only after the user calls run and the implementation receives an
ExecutorService that we expose the Future machinery. Our users
therefore program to a pure interface whose implementation
nevertheless relies on effects at the end of the day. But since our
API remains pure, these effects aren’t side effects.
Why doesn't Future have a purely functional interface?
The problem is that creating a Future that performs a side effect is in itself also a side effect, due to Future's eager nature.
This breaks referential transparency. I.e. if you create a Future that only prints to the console, the Future will run immediately and perform the side effect without you asking it to.
An example:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

for {
  x <- Future { println("Foo") }
  y <- Future { println("Foo") }
} yield ()
This results in "Foo" being printed twice. Now, if Future were referentially transparent, we should be able to get the same result with the non-inlined version below:
val printFuture = Future { println("Foo") }

for {
  x <- printFuture
  y <- printFuture
} yield ()
However, this instead prints "Foo" only once and, even more problematic, it prints it whether or not you include the for-expression at all.
With referentially transparent expressions we should be able to inline any expression without changing the semantics of the program. Future cannot guarantee this, therefore it breaks referential transparency and is inherently effectful.
A basic premise of FP is referential transparency. In other words, avoiding side effects.
What's a side effect? From Wikipedia:
In computer science, a function or expression is said to have a side effect if it modifies some state outside its scope or has an observable interaction with its calling functions or the outside world. (Except, by convention, returning a value: returning a value has an effect on the calling function, but this is usually not considered as a side effect.)
And what is a Scala future? From the documentation page:
A Future is a placeholder object for a value that may not yet exist.
So a future can transition from a not-yet-existing-value to an existing-value without any interaction from or with the rest of the program, and, as you quoted: "methods on Future rely on side effects."
It would appear that Scala futures do not maintain referential transparency.
As far as I know, Future runs its computation automatically when it's created. Even if it lacks side effects in its nested computation, it still breaks the flatMap composition rule, because it changes state over time:
someFuture.flatMap(Future(_)) == someFuture // can be false
Equality implementation questions aside, we can have a race condition here: the new Future immediately runs for a tiny fraction of time, and its isCompleted can differ from someFuture's if the latter is already done.
In order to be pure with respect to the effect it represents, Future should defer its computation and run it only when explicitly asked to, like in the case of Par (or scalaz's Task).
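For comparison, here is a minimal sketch of that deferred style using cats-effect's IO (my choice of library for the illustration; Par and scalaz's Task behave analogously):

import cats.effect.IO
import cats.effect.unsafe.implicits.global

// Nothing is printed when this value is created; IO merely describes the effect.
val printIO: IO[Unit] = IO(println("Foo"))

val program: IO[Unit] = for {
  _ <- printIO
  _ <- printIO
} yield ()

program.unsafeRunSync() // prints "Foo" twice; inlining printIO would behave the same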
To complement the other points and explain the relationship between referential transparency (a requirement) and side effects (mutation that might break this requirement), here is a somewhat simplistic but pragmatic view of what's happening:
a newly created Future immediately submits a Callable task into your pool's queue. Given that the queue is a mutable collection, this is basically a side effect;
any subscription (from onComplete to map) does the same, plus it uses an additional mutable collection of subscribers per Callable.
Btw, subscriptions are not only in violation of the Monad laws as noted by @P.Frolov (for flatMap) - the Functor law f.map(identity) == f is broken too, especially in light of the fact that the newly created Future (by map) isn't equivalent to the original - it has its own subscriptions and Callable.
This "fire and subscribe" allows you to do stuff like:
val f = Future { ... }
val f2 = f.map(...)
val f3 = f.map(...) // twice or more
Every line of this code produces a side effect that might potentially break referential transparency, and, as many have mentioned, actually does.
The reason why many authors prefer the term "referential transparency" is probably that, from a low-level perspective, we always perform some side effects; however, only a subset of those (usually the more high-level ones) actually makes your code "non-functional".
As for Futures, breaking referential transparency is most disruptive because it also leads to non-determinism:
val f1 = Future {
  println("1")
}
val f2 = Future {
  println("2")
}
// "1" and "2" may be printed in either order
It gets worse when this is combined with monads, including the for-comprehension cases mentioned by @Luka Jacobowitz. In practice, monads are used not only to flatten and merge compatible containers, but also to guarantee a [con]sequential relation. This is probably because even in abstract algebra Monads generalize over consequence operators, which are meant as a general characterization of the notion of deduction.
This simply means that it's hard to reason about non-deterministic logic, even harder than about merely non-referentially-transparent code:
analyzing logs produced by Futures, or even worse actors, is hell. No matter how many labels and how much thread-local propagation you have, everything breaks eventually.
non-deterministic (aka "sometimes appearing") bugs are the most annoying and stay in production for years(!) - even extensive high-load testing (including performance tests) doesn't always catch them.
So, even in the absence of other criteria, code that is easier to reason about is essentially more functional, and Futures often lead to code that isn't.
P.S. As a conclusion, if your project is tolerant of scalaz/cats/monix/fs2 and so on, it's better to use Tasks/Streams/Iteratees. Those libraries introduce some risk of overdesign, of course; however, IMO it's better to spend time simplifying incomprehensible scalaz code than debugging an incomprehensible bug.

Parentheses for non-pure functions

I know that I should use () by convention if a method has side effects
def method1(a: String): Unit = {
  //.....
}
// or
def method2(): Unit = {
  //.....
}
Do I have to do the same thing if a method doesn't have side effects but isn't pure, doesn't have any parameters and, of course, returns different results each time it's called?
def method3() = getRemoteSessionId("login", "password")
Edit: After reviewing Luigi Plinge's comment, I came to think that I should rewrite this answer. It is still not a clear yes/no answer, but rather some suggestions.
First: The case regarding var is an interesting one. Declaring a var foo gives you a getter foo without parentheses. Obviously it is an impure call, but it does not have a side effect (it does not change anything unobserved by the caller).
Second, regarding your question: I now would not argue that the problem with getRemoteSessionId is that it is impure, but that it actually makes the server maintain some session login for you, so clearly you interfere destructively with the environment. Then method3() should be written with parentheses because of this side-effect nature.
A third example: Getting the contents of a directory should thus be written file.children and not file.children(), because again it is an impure function but should not have side effects (other than perhaps a read-only access to your file system).
A fourth example: Given the above, you should write System.currentTimeMillis. I do tend to write System.currentTimeMillis() however...
Using this fourth case, my tentative answer would be: parentheses are preferable when the function either has a side effect, or is impure and depends on state not under the control of your program.
With this definition, it would not matter whether getRemoteSessionId has known side effects or not. On the other hand, it implies reverting to writing file.children()...
The Scala style guide recommends:
Methods which act as accessors of any sort (either encapsulating a field or a logical property) should be declared without parentheses except if they have side effects.
It doesn't mention any other use case besides accessors. So the question boils down to whether you regard this method as an accessor, which in turns depends on how the rest of the class is set up and perhaps also on the (intended) call sites.
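To make the convention concrete, here is a small sketch (the class and its methods are hypothetical, not taken from the question):

class Session(private var counter: Int) {
  // Accessor with no side effects: declared (and called) without parentheses.
  def count: Int = counter

  // Impure - depends on state outside the program (the system clock): parentheses.
  def timestamp(): Long = System.currentTimeMillis()

  // Side-effecting - mutates internal state: parentheses.
  def increment(): Unit = counter += 1
}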

Either monadic operations

I'm just starting to get used to dealing with monadic operations.
For the Option type, this Cheat Sheet of Tony Morris helped:
http://blog.tmorris.net/posts/scalaoption-cheat-sheet/
So in the end it seems easy to understand that:
map transforms the value inside an Option
flatten permits transforming an Option[Option[X]] into an Option[X]
flatMap is somehow a map operation producing an Option[Option[X]], which is then flattened to Option[X]
At least that is what I understand so far.
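For instance:

val some: Option[Int] = Some(2)

some.map(_ + 1)                // Some(3)
Option(Option(2)).flatten      // Some(2): Option[Option[Int]] -> Option[Int]
some.flatMap(x => Some(x + 1)) // Some(3): like map followed by flatten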
For Either, it seems a bit more difficult to understand since Either itself is not right-biased and does not have map / flatMap operations... and we use projections.
I can read the Scaladoc but it is not as clear as the Cheat Sheet on Options.
Can someone provide an Either Cheat Sheet describing the basic monadic operations?
It seems to me that Either.joinRight is a bit like RightProjection.flatMap and seems to be the equivalent of Option.flatten for Either.
It seems to me that if Either were right-biased, then Either.flatten would be Either.joinRight, no?
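For example, with the old unbiased Either, where projections are needed:

val e: Either[String, Int] = Right(2)

e.right.map(_ + 1)                 // Right(3)
e.right.flatMap(x => Right(x + 1)) // Right(3)

// joinRight flattens a nested Either on the right, like Option.flatten:
val nested: Either[String, Either[String, Int]] = Right(Right(42))
nested.joinRight                   // Right(42)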
In this question: Either, Options and for comprehensions, I ask about for comprehensions with Either, and one of the answers says that we can't mix monads because of the way it is desugared into map/flatMap/filter.
When using this kind of code:
def updateUserStats(user: User): Either[Error, User] = for {
  stampleCount <- stampleRepository.getStampleCount(user).right
  userUpdated  <- Right(copyUserWithStats(user, stampleCount)).right
  userSaved    <- userService.update(userUpdated).right
} yield userSaved
Does this mean that all my 3 method calls must always return Either[Error,Something]?
I mean, if I have a method returning Either[Throwable, Something], it won't work, right?
Edit:
Is Try[Something] exactly the same as a right-biased Either[Throwable, Something]?
Either was never really meant to be an exception-handling structure. It was meant to represent a situation where a function really could return one of two distinct types, but people started the convention where the left type is supposed to be the failure case and the right the success. If you want to return a biased type for some pass/fail business-check logic, then Validation from scalaz works well. If you have a function that could return a value or a Throwable, then Try would be a good choice. Either should be used for situations where you really might get one of two possible types, and now that I am using Try and Validation (each for different types of situations), I never use Either any more.
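As a rough illustration of the difference (the parse functions here are just made up for the example):

import scala.util.Try

// Try captures thrown exceptions for you:
def parseInt(s: String): Try[Int] = Try(s.toInt)

// An explicit Either[Throwable, _] forces you to catch and wrap failures yourself:
def parseIntE(s: String): Either[Throwable, Int] =
  try Right(s.toInt)
  catch { case e: NumberFormatException => Left(e) }

parseInt("21").map(_ * 2)   // Success(42)
parseInt("oops").map(_ * 2) // Failure(java.lang.NumberFormatException: ...)
parseIntE("21")             // Right(21)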

Why do immutable objects enable functional programming?

I'm trying to learn Scala and I'm unable to grasp this concept. Why does making an object immutable help prevent side effects in functions? Can anyone explain like I'm five?
Interesting question, a bit difficult to answer.
Functional programming is very much about using mathematics to reason about programs. To do so, one needs a formalism that describes programs and how one can make proofs about properties they might have.
There are many models of computation that provide such formalisms, such as lambda calculus and Turing machines. And there's a certain degree of equivalence between them (see this question for a discussion).
In a very real sense, programs with mutability and some other side effects have a direct mapping to functional programs. Consider this example:
a = 0
b = 1
a = a + b
Here are two ways of mapping it to a functional program. In the first, a and b are part of a "state", and each line is a function from a state to a new state:
state1 = (a = 0, b = ?)
state2 = (a = state1.a, b = 1)
state3 = (a = state2.a + state2.b, b = state2.b)
Here's another, where each variable is associated with a time:
(a, t0) = 0
(b, t1) = 1
(a, t2) = (a, t0) + (b, t1)
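A sketch of the first (state-passing) mapping in Scala might look like this; the unknown initial value of b is given a placeholder of 0:

final case class ProgramState(a: Int, b: Int)

val state1 = ProgramState(a = 0, b = 0)           // a = 0, b = ?
val state2 = state1.copy(b = 1)                   // b = 1
val state3 = state2.copy(a = state2.a + state2.b) // a = a + b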
So, given the above, why not use mutability?
Well, here's the interesting thing about math: the less powerful the formalism is, the easier it is to make proofs with it. Or, to put it in other words, it's too hard to reason about programs that have mutability.
As a consequence, there has been very little advance regarding concepts in programming with mutability. The famous Design Patterns were not arrived at through study, nor do they have any mathematical backing. Instead, they are the result of years and years of trial and error, and some of them have since proved to be misguided. Who knows about the dozens of other "design patterns" seen everywhere?
Meanwhile, Haskell programmers came up with Functors, Monads, Co-monads, Zippers, Applicatives, Lenses... dozens of concepts with mathematical backing and, most importantly, actual patterns of how code is composed to make up programs. Things you can use to reason about your program, increase reusability and improve correctness. Take a look at the Typeclassopedia for examples.
It's no wonder people not familiar with functional programming get a bit scared with this stuff... by comparison, the rest of the programming world is still working with a few decades-old concepts. The very idea of new concepts is alien.
Unfortunately, all these patterns, all these concepts, only apply when the code they are working with does not contain mutability (or other side effects). If it does, then their properties cease to be valid, and you can't rely on them. You are back to guessing, testing and debugging.
In short, if a function mutates an object then it has side effects. Mutation is a side effect. This is just true by definition.
In truth, in a purely functional language it should not matter if an object is technically mutable or immutable, because the language will never "try" to mutate an object anyway. A pure functional language doesn't give you any way to perform side effects.
Scala is not a pure functional language, though, and it runs in the Java environment in which side effects are very popular. In this environment, using objects that are incapable of mutation encourages you to use a pure functional style because it makes a side-effect oriented style impossible. You are using data types to enforce purity because the language does not do it for you.
Now I will say a bunch of other stuff in the hope that it helps this make sense to you.
Fundamental to the concept of a variable in functional languages is referential transparency.
Referential transparency means that there is no difference between a value, and a reference to that value. In a language where this is true, it is much simpler to think about how a program works, since you never have to stop and ask, is this a value, or a reference to a value? Anyone who's ever programmed in C recognizes that a great part of the challenge of learning that paradigm is knowing which is which at all times.
In order to have referential transparency, the value that a reference refers to can never change.
(Warning, I'm about to make an analogy.)
Think of it this way: in your cell phone, you have saved some phone numbers of other people's cell phones. You assume that whenever you call that phone number, you will reach the person you intend to talk to. If someone else wants to talk to your friend, you give them the phone number and they reach that same person.
If someone changes their cell phone number, this system breaks down. Suddenly, you need to get their new phone number if you want to reach them. Maybe you call the same number six months later and reach a different person. Calling the same number and reaching a different person is what happens when functions perform side effects: you have what seems to be the same thing, but when you try to use it, it turns out it's different now. Even if you expected this, what about all the people you gave that number to? Are you going to call them all up and tell them that the old number doesn't reach the same person anymore?
You counted on the phone number corresponding to that person, but it didn't really. The phone number system lacks referential transparency: the number isn't really ALWAYS the same as the person.
Functional languages avoid this problem. You can give out your phone number and people will always be able to reach you, for the rest of your life, and will never reach anybody else at that number.
However, in the Java platform, things can change. What you thought was one thing, might turn into another thing a minute later. If this is the case, how can you stop it?
Scala uses the power of types to prevent this, by making classes that have referential transparency. So, even though the language as a whole isn't referentially transparent, your code will be referentially transparent as long as you use immutable types.
Practically speaking, the advantages of coding with immutable types are:
Your code is simpler to read when the reader doesn't have to look out for surprising side effects.
If you use multiple threads, you don't have to worry about locking because shared objects can never change. When you have side effects, you have to really think through the code and figure out all the places where two threads might try to change the same object at the same time, and protect against the problems that this might cause.
Theoretically, at least, the compiler can optimize some code better if it uses only immutable types. I don't know if Java can do this effectively, though, since it allows side effects. This is a toss-up at best, anyway, because there are some problems that can be solved much more efficiently by using side effects.
I'm running with this 5-year-old explanation:
class Account(var myMoney: List[Int] = List(10, 10, 1, 1, 1, 5)) {
  def getBalance = println(myMoney.sum + " dollars available")
  def myMoneyWithInterest = {
    myMoney = myMoney.map(_ * 2)
    println(myMoney.sum + " dollars will accrue in 1 year")
  }
}
Assume we are at an ATM and it is using this code to give us account information.
You do the following:
scala> val myAccount = new Account()
myAccount: Account = Account@7f4a6c40

scala> myAccount.getBalance
28 dollars available

scala> myAccount.myMoneyWithInterest
56 dollars will accrue in 1 year

scala> myAccount.getBalance
56 dollars available
We mutated the account balance when we only wanted to check our current balance plus a year's worth of interest. Now we have an incorrect account balance. Bad news for the bank!
If we were using val instead of var to keep track of myMoney in the class definition, we would not have been able to mutate the dollars and raise our balance.
When defining the class (in the REPL) with val:
error: reassignment to val
myMoney = myMoney.map(_ * 2)
Scala is telling us that we wanted an immutable value but are trying to change it!
Thanks to Scala, we can switch to val, re-write our myMoneyWithInterest method and rest assured that our Account class will never alter the balance.
One important property of functional programming is: If I call the same function twice with the same arguments I'll get the same result. This makes reasoning about code much easier in many cases.
Now imagine a function returning the attribute content of some object. If that content can change, the function might return different results on different calls with the same argument => no more functional programming.
First a few definitions:
A side effect is a change in state -- also called a mutation.
An immutable object is an object which does not support mutation (side effects).
A function which is passed mutable objects (either as parameters or in the global environment) may or may not produce side effects. This is up to the implementation.
However, it is impossible for a function which is passed only immutable objects (either as parameters or in the global environment) to produce side effects. Therefore, exclusive use of immutable objects will preclude the possibility of side effects.
Nate's answer is great, and here is an example.
In functional programming, there is an important property: when you call a function with the same arguments, you always get the same return value.
This is always true for immutable objects, because you can't modify them after creating them:
class MyValue(val value: Int)
def plus(x: MyValue) = x.value + 10
val x = new MyValue(10)
val y = plus(x) // y is 20
val z = plus(x) // z is still 20, plus(x) will always yield 20
But if you have mutable objects, you can't guarantee that plus(x) will always return the same value for the same instance of MyValue.
class MyValue(var value: Int)
def plus(x: MyValue) = x.value + 10
val x = new MyValue(10)
val y = plus(x) // y is 20
x.value = 30
val z = plus(x) // z is 40; you can't know for sure what plus(x) will return because MyValue.value may change at any point
Why do immutable objects enable functional programming?
They don't.
Take one definition of "function," or "procedure," "routine" or "method," which I believe applies to many programming languages: "A section of code, typically named, accepting arguments and/or returning a value."
Take one definition of "functional programming:" "Programming using functions." The ability to program with functions is independent of whether state is modified.
For instance, Scheme is considered a functional programming language. It features tail calls, higher-order functions and aggregate operations using functions. It also has mutable objects. While mutability destroys some nice mathematical qualities, it does not necessarily prevent "functional programming."
I've read all the answers and they don't satisfy me, because they mostly talk about "immutability", and not about its relation to FP.
The main question is:
Why do immutable objects enable functional programming?
So I've searched a bit more and I have another answer. I believe the easy answer to this question is: "Because Functional Programming is basically defined on the basis of functions that are easy to reason about." Here's the definition of Functional Programming:
The process of building software by composing pure functions.
If a function is not pure - meaning that given the same input it is not guaranteed to always produce the same output (e.g., if the function relies on a global object, or date and time, or a random number to compute the output) - then that function is unpredictable, that's it! Now exactly the same story goes for "immutability" as well: if objects are not immutable, a function taking the same object as its input may produce different results (aka side effects) each time it is used, and this will make it hard to reason about the program.
I first tried to put this in a comment, but it got longer than the limit. I'm by no means a pro, so please take this answer with a grain of salt.