Is this function considered a side effect, thus bad FP design? - scala

private def retrieveSongId(songName: String): Option[JsValue] = {
  val geniusStringResponse = Http("https://api.genius.com/search?q=" + formattedSongName)
    .param("access_token", apiKey)
    .asString
    .body
  // Extra processing with geniusStringResponse
}
Will the above function be considered to have a side effect because of the HTTP request? If so, is Scala code like this appropriate?

Yes, calling this function has the side effect of making an HTTP request. Calling this function may affect the result of another function (e.g. getSearchCount), and this function may return different results given the same input values (e.g. the server is not available all the time).
However, this does not make the code inappropriate. Any usable Scala program is going to have side effects; the trick is to keep them as constrained as possible. A well-written Scala program will have a rich set of side-effect-free classes and functions, and a relatively thin layer of non-functional code that calls them.
In this case, for example, you should have a simple function that does the HTTP request and a second function that processes the results. The result-processing function can be pure functional code, and can be effectively tested with mock data.
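As a rough sketch of that split (not the asker's actual code): fetchSearchBody and extractSongId are hypothetical names, the HTTP client is replaced by a plain function parameter so that the example has no dependencies, and the response shape and the Option[Long] return type are simplifying assumptions made for illustration.

```scala
import java.net.URLEncoder

object SongLookup {
  // Impure part: delegates to `fetch`, a stand-in for the real HTTP client (assumption).
  def fetchSearchBody(songName: String, fetch: String => String): String =
    fetch("https://api.genius.com/search?q=" + URLEncoder.encode(songName, "UTF-8"))

  // Naive pattern for the first numeric "id" field; real code should use a JSON parser.
  private val IdPattern = """id"\s*:\s*(\d+)""".r

  // Pure part: the same response body always yields the same result.
  def extractSongId(responseBody: String): Option[Long] =
    IdPattern.findFirstMatchIn(responseBody).map(_.group(1).toLong)
}
```

The pure function can then be exercised with canned response strings, and the impure one can be given a stub in place of the network call.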

Related

How to convert `fs2.Stream[IO, T]` to `Iterator[T]` in Scala

I need to fill in the methods next and hasNext while preserving laziness:
new Iterator[T] {
  val stream: fs2.Stream[IO, T] = ...
  def next(): T = ???
  def hasNext: Boolean = ???
}
But I cannot figure out how on earth to do this from an fs2.Stream. All the methods on a Stream (or on the "compiled" thing) are fairly useless.
If this is simply impossible to do in a reasonable amount of code, then that itself is a satisfactory answer and we will just rip out fs2.Stream from the codebase - just want to check first!
fs2.Stream, while similar in concept to Iterator, cannot be converted to one while preserving laziness. I'll try to elaborate on why...
Both represent a pull-based series of items, but the way in which they represent that series and implement the laziness differs too much.
As you already know, Iterator represents its pull in terms of the next() and hasNext methods, both of which are synchronous and blocking. To consume the iterator and return a value, you can directly call those methods e.g. in a loop, or use one of its many convenience methods.
fs2.Stream supports two capabilities that make it incompatible with that interface:
cats.effect.Resource can be included in the construction of a Stream. For example, you could construct an fs2.Stream[IO, Byte] representing the contents of a file. When consuming that stream, even if you abort early or do some strange flatMap, the underlying Resource is honored and your file handle is guaranteed to be closed. If you were trying to do the same thing with an Iterator, the "abort early" case would pose problems, forcing you to do something like Iterator[Byte] with Closeable, where the caller would have to make sure to .close() it, or some other pattern.
Evaluation of "effects". In this context, effects are types like IO or Future, where the process of obtaining the value may perform some possibly-asynchronous action, and may perform side-effects. Asynchrony poses a problem when trying to force the process into a synchronous interface, since it forces you to block your current thread to wait for the asynchronous answer, which can cause deadlocks if you aren't careful. Libraries like cats-effect strongly discourage you from calling methods like unsafeRunSync.
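To make the "abort early" problem from the first point concrete, here is a minimal plain-Scala sketch with no fs2 involved. TrackedIterator and EarlyAbortDemo are hypothetical names, and the closed flag merely stands in for a real file handle.

```scala
// Hypothetical resource-backed iterator; `closed` stands in for a file handle.
final class TrackedIterator[A](underlying: Iterator[A]) extends Iterator[A] with AutoCloseable {
  private var closed = false
  def isClosed: Boolean = closed
  def hasNext: Boolean = !closed && underlying.hasNext
  def next(): A = underlying.next()
  def close(): Unit = { closed = true }
}

object EarlyAbortDemo {
  // Stops after two elements; nothing releases the resource.
  def firstTwo(): (List[Int], Boolean) = {
    val it = new TrackedIterator(Iterator(1, 2, 3, 4))
    val taken = it.take(2).toList
    (taken, it.isClosed)
  }

  // The caller has to remember to close the iterator explicitly.
  def firstTwoWithClose(): (List[Int], Boolean) = {
    val it = new TrackedIterator(Iterator(1, 2, 3, 4))
    val taken = try it.take(2).toList finally it.close()
    (taken, it.isClosed)
  }
}
```

In the first method the resource is simply left open, which is the situation that the Resource support in Stream is meant to rule out.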
fs2.Stream does allow for some special cases that prevent the inclusion of Resource and Effects, via its Pure type alias which you can use in place of IO. That gets you access to Stream.PureOps, but that only gets you methods that consume the whole stream by building a collection; the laziness you want to preserve would be lost.
Side note: you can convert an Iterator to a Stream.
The only way to "convert" a Stream to an Iterator is to consume it to some collection type via e.g. .compile.toList, which would get you an IO[List[T]], then .map(_.iterator) that to get an IO[Iterator[T]]. But ultimately that doesn't fit what you're asking for since it forces you to consume the stream to a buffer, breaking laziness.
@Dima mentioned the "XY Problem", which was poorly received since they didn't really elaborate (initially) on the incompatibility, but they're right. It would be helpful to know why you're trying to make a Stream-to-Iterator conversion, in case there's some other approach that would serve your overall goal instead.

How to get the ID of the currently executing ZIO fiber from side effecting code

I know that I can get hold of the ID of the currently executing fiber by calling
ZIO.descriptor.map(_.id)
However, what I want is an impure function that I can call from side-effecting code. Let's define it as
def getCurrentFiberId(): Option[FiberId]
so that
for {
  fiberId <- ZIO.descriptor.map(_.id)
  maybeId <- UIO(getCurrentFiberId())
} yield maybeId.contains(fiberId)
yields true. Is it possible to define such a function, and if so, how? Note that this question is strongly related to How to access fiber local data from side-effecting code in ZIO.
Not possible. That information is held in an instance of a class called FiberContext, which is practically the core of the ZIO runtime and is in charge of interpreting effects.
That class is also an internal implementation detail and, understandably, package-private.
Additionally, there is not just one instance of it: one is created each time you unsafeRun an effect, and one more each time a fork is interpreted.
Since the execution of an effect is not bound to a Thread, ThreadLocal is not used, so there is no hope of extracting that information the way you want.
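To illustrate why a ThreadLocal would not help even if one were used, here is a plain-JVM sketch (no ZIO involved; ThreadLocalDemo and currentId are hypothetical names). It stores a value on one thread and reads it from another. Code running inside a fiber that resumes on a different thread would face the same situation.

```scala
import java.util.concurrent.{Callable, Executors}

object ThreadLocalDemo {
  // Hypothetical per-thread slot for a "current id".
  private val currentId = new ThreadLocal[Option[Int]] {
    override def initialValue(): Option[Int] = None
  }

  // Returns what the setting thread sees and what a second thread sees.
  def run(): (Option[Int], Option[Int]) = {
    val pool = Executors.newSingleThreadExecutor()
    try {
      currentId.set(Some(1))
      val sameThread = currentId.get()
      val otherThread = pool.submit(new Callable[Option[Int]] {
        def call(): Option[Int] = currentId.get()
      }).get()
      (sameThread, otherThread)
    } finally {
      currentId.remove()
      pool.shutdown()
    }
  }
}
```

The value set on the first thread is not visible from the second one, so a thread-keyed lookup cannot follow work that moves between threads.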

How to write unit test when you use Future?

I've written a class with some functions that make HTTP calls and return a Future[String]. I use those functions inside a method that I need to write tests for:
def score(rawEvent: Json) = {
  httpService
    .get("name", formatJsonAttribute(rawEvent.name))
    .onComplete { op =>
      op.map { json =>
        // What must be tested
      }
    }
}
onComplete doesn't return anything useful: its return type is Unit. How can I replace that onComplete so that my function returns something that can be tested?
I completely agree with @Michal that you should always prefer map to onComplete with Futures. However, I'd like to point out that, as you said yourself, what you wish to test is not the HTTP call itself (which relies on an HTTP client you probably don't need to test, a response from a server over which you may have no control, ...), but what you do with its answer.
So why not write a test, not on the function score, but on the function you wrote in your onComplete (or map, if you decided to change it)?
That way you will be able to test it with precise values for json, which you can define as the result you expect from the server but which you control completely (for instance, you could test edge cases without forcing your server to give unusual responses).
Testing that the two (HTTP call and callback function) work well together is not a unit-test question but an integration-test question, and should be done only once you know that your function does what is expected of it.
At that point, you will indeed need to check the value of a Future, in which case you can use Await.result as @Michal suggested, or use the relevant constructs that your test framework provides. For instance, ScalaTest has an AsyncTestSuite trait for this kind of issue.
Use map instead of onComplete. It also gives you the resolved value inside the mapping function. The return type of score will then be Future[T], where T is the result type of your processing.
In the tests you can use scala.concurrent.Await.result() to obtain the value.
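Both suggestions can be sketched together using only the standard library. The names Scoring, scoreFromBody and the fetch parameter are hypothetical; fetch stands in for httpService.get, and the scoring logic (body length) is just a placeholder for whatever processing needs testing.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object Scoring {
  // Pure callback: can be unit-tested directly with hand-written inputs.
  def scoreFromBody(body: String): Int = body.length

  // map instead of onComplete, so the caller gets a Future[Int] back.
  def score(fetch: String => Future[String], name: String)(implicit ec: ExecutionContext): Future[Int] =
    fetch(name).map(scoreFromBody)
}

object ScoringCheck {
  import ExecutionContext.Implicits.global

  // Runs score against a stubbed fetch and blocks for the result (test code only).
  def run(): Int = {
    val stubFetch: String => Future[String] = _ => Future.successful("hello")
    Await.result(Scoring.score(stubFetch, "name"), 2.seconds)
  }
}
```

Blocking with Await.result is usually acceptable in tests. Where the test framework has asynchronous support, the Future can be returned from the test instead.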

Is a call to create a new object instance considered pure or not?

In functional programming terminology, if I write:
val a = new Client
val b = new Client
is calling the above constructor twice considered pure or impure?
If you can substitute your two lines by:
val a = new Client
val b = a
without changing the whole program behavior, the object instantiation could be considered as pure (referential transparency).
It will fail if the Client constructor has any "observable" side effect, or if you use identity equality in the program.
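The identity-equality caveat can be seen in a small sketch. Client here is a hypothetical empty class with no constructor side effects, not the asker's real class.

```scala
// Hypothetical Client: no fields and no side effects in the constructor.
final class Client

object ConstructorPurity {
  // Two separate constructor calls: distinct references.
  def twoInstances(): (Boolean, Boolean) = {
    val a = new Client
    val b = new Client
    (a eq b, a == b) // default equals on a plain class compares references
  }

  // The substituted version: b is just another name for a.
  def sharedInstance(): (Boolean, Boolean) = {
    val a = new Client
    val b = a
    (a eq b, a == b)
  }
}
```

A program that compares a and b by reference (or by the default equals) can therefore tell the two versions apart, so the substitution is only safe when nothing in the program depends on identity.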
Generally memory allocation is not considered to be a side-effect and so a constructor call in itself is considered to be pure.
Although it could eventually cause your program to run out of memory, this isn't something you can really control as a programmer, so "purity" is generally considered on the assumption of infinite memory.
If your constructor itself has a side-effect, then calling it would not be pure.

Parentheses for impure functions

I know that I should use () by convention if a method has side effects:
def method1(a: String): Unit = {
  //.....
}
// or
def method2(): Unit = {
  //.....
}
Do I have to do the same thing if a method doesn't have side effects but is not pure, doesn't take any parameters and, of course, returns different results each time it's called?
def method3() = getRemoteSessionId("login", "password")
Edit: After reviewing Luigi Plinge's comment, I came to think that I should rewrite the answer. This is also not a clear yes/no answer, but some suggestions.
First: The case regarding var is an interesting one. Declaring a var foo gives you a getter foo without parentheses. Obviously it is an impure call, but it does not have a side effect (it does not change anything unobserved by the caller).
Second, regarding your question: I now would not argue that the problem with getRemoteSessionId is that it is impure, but that it actually makes the server maintain some session login for you, so clearly you interfere destructively with the environment. Then method3() should be written with parentheses because of this side-effect nature.
A third example: Getting the contents of a directory should thus be written file.children and not file.children(), because again it is an impure function but should not have side effects (other than perhaps a read-only access to your file system).
A fourth example: Given the above, you should write System.currentTimeMillis. I do tend to write System.currentTimeMillis() however...
Using this fourth case, my tentative answer would be: parentheses are preferable when the function either has a side effect or is impure and depends on state not under the control of your program.
With this definition, it would not matter whether getRemoteSessionId has known side effects or not. On the other hand, it implies reverting to writing file.children()...
The Scala style guide recommends:
Methods which act as accessors of any sort (either encapsulating a field or a logical property) should be declared without parentheses except if they have side effects.
It doesn't mention any other use case besides accessors. So the question boils down to whether you regard this method as an accessor, which in turn depends on how the rest of the class is set up and perhaps also on the (intended) call sites.
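For the accessor case that the style guide does cover, the convention might look like the following sketch. Counter is a hypothetical class invented for illustration.

```scala
final class Counter {
  private var count = 0

  // Accessor with no side effect: declared and called without parentheses.
  def current: Int = count

  // Mutates state: declared and called with parentheses.
  def increment(): Unit = { count += 1 }
}

object CounterDemo {
  // Increments twice, then reads the value back through the accessor.
  def run(): Int = {
    val c = new Counter
    c.increment()
    c.increment()
    c.current
  }
}
```

Note that current is impure in the sense discussed above (its result changes between calls), yet it still reads naturally as an accessor, which is why the guideline alone does not settle the method3 case.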