I find myself writing Scala programs more often recently.
I like to program in a style that often uses long method chains, but sometimes the transformation you want to apply is not a method of the object you want to transform. So I find myself defining:
class Better[T](t: T) {
  def xform[U](func: T => U) = func(t)
}
implicit def improve[T](t: T) = new Better(t)
This allows me to write the chains I want, such as
val content = s3.getObject(bucket, key)
  .getObjectContent
  .xform(Source.fromInputStream)
  .mkString
  .toInt
Is there any similar facility already in the standard library? If so, how should I have discovered it without resorting to StackOverflow?
It's not in the standard library, but it might be "standard enough": with Cats, you should be able to write something like
val content =
  s3
    .getObject(bucket, key)
    .getObjectContent
    .pure[Id].map(Source.fromInputStream)
    .mkString
    .toInt
where pure[Id] wraps the input value into the do-nothing Id monad, and map then passes it as an argument to Source.fromInputStream.
EDIT: This does not seem to work reliably. If the object already has a method map, then this method is called instead of Id.map.
Smaller example (just to demonstrate the necessary imports):
import cats.Id
import cats.syntax.applicative._
import cats.syntax.functor._
object Main {
  def square(x: Int) = x * x

  def main(args: Array[String]): Unit = {
    println(42.pure[Id].map(square))
  }
}
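To make the caveat in the edit above concrete, here is a minimal sketch of my own (not from the original answer) of the failure mode: since Id[List[Int]] is just List[Int], the object's own map wins over the Id functor syntax.

import cats.Id
import cats.syntax.applicative._
import cats.syntax.functor._

object Caveat {
  def main(args: Array[String]): Unit = {
    val xs = List(1, 2, 3)
    // Id[List[Int]] is just List[Int], so .map resolves to List's own map:
    // the function is applied to each element, not to the list as a whole.
    println(xs.pure[Id].map(_ * 10)) // List(10, 20, 30)
  }
}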
However, writing either
val content =
  Source
    .fromInputStream(
      s3
        .getObject(bucket, key)
        .getObjectContent
    )
    .mkString
    .toInt
or
val content =
  Source
    .fromInputStream(s3.getObject(bucket, key).getObjectContent)
    .mkString
    .toInt
does not require any extra dependencies, and frees you both from the burden of defining otherwise useless wrapper classes and from the burden of reindenting your code every time you rename either content or s3.
It also shows how the expressions are actually nested, and what depends on what - there is a reason why the vast majority of mainstream programming languages of the past 50 years have a call-stack.
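For what it's worth: since Scala 2.13, the standard library itself provides exactly the facility the question asks for, pipe in scala.util.chaining (there is also tap for side effects). A sketch, reusing the s3, bucket, and key values from the question:

import scala.io.Source
import scala.util.chaining._

// `pipe` applies a function to the value it is called on,
// just like the hand-rolled `xform` above.
val content = s3.getObject(bucket, key)
  .getObjectContent
  .pipe(Source.fromInputStream)
  .mkString
  .toInt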
Related
In general I have problems figuring out how to write tail-recursive functions when working 'inside' monads. Here is a quick example:
This is from a small example application that I am writing to better understand FP in Scala. First of all, the user is prompted to enter a Team consisting of 7 Players. This function recursively reads the input:
import cats.effect.{ExitCode, IO, IOApp}
import cats.implicits._
case class Player (name: String)
case class Team (players: List[Player])
/**
 * Reads a team of 7 players from the command line.
 * @return
 */
def readTeam: IO[Team] = {
  def go(team: Team): IO[Team] = { // here I'd like to add @tailrec
    if (team.players.size >= 7) {
      IO(println("Enough players!!")) >>= (_ => IO(team))
    } else {
      for {
        player <- readPlayer
        team <- go(Team(team.players :+ player))
      } yield team
    }
  }
  go(Team(Nil))
}
private def readPlayer: IO[Player] = ???
Now what I'd like to achieve (mainly for educational purposes) is to be able to put a @tailrec annotation on def go(team: Team). But I don't see a way to make the recursive call my last statement, because as far as I can see the last statement always has to 'lift' my Team into the IO monad.
Any hint would be greatly appreciated.
First of all, this isn't necessary, because IO is specifically designed to support stack-safe monadic recursion. From the docs:
IO is trampolined in its flatMap evaluation. This means that you can safely call flatMap in a recursive function of arbitrary depth, without fear of blowing the stack…
So your implementation will work just fine in terms of stack safety, even if instead of seven players you needed 70,000 players (although at that point you might need to worry about the heap).
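As a minimal sketch of what that trampolining buys you (my own illustration, assuming cats-effect's unsafeRunSync), recursion of arbitrary depth through flatMap will not blow the stack:

import cats.effect.IO

// Each step suspends the next one inside flatMap, so the recursion is
// trampolined on the heap instead of growing the JVM call stack.
def countdown(n: Long): IO[Unit] =
  if (n <= 0) IO.unit
  else IO(n - 1).flatMap(countdown)

// countdown(1000000L).unsafeRunSync() // completes without StackOverflowError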
This doesn't really answer your question, though, and of course even @tailrec is never necessary, since all it does is verify that the compiler is doing what you think it should be doing.
While it's not possible to write this method in such a way that it can be annotated with #tailrec, you can get a similar kind of assurance by using Cats's tailRecM. For example, the following is equivalent to your implementation:
import cats.effect.IO
import cats.syntax.functor._
case class Player (name: String)
case class Team (players: List[Player])
// For the sake of example.
def readPlayer: IO[Player] = IO(Player("foo"))
/**
 * Reads a team of 7 players from the command line.
 * @return
 */
def readTeam: IO[Team] = cats.Monad[IO].tailRecM(Team(Nil)) {
  case team if team.players.size >= 7 =>
    IO(println("Enough players!!")).as(Right(team))
  case team =>
    readPlayer.map(player => Left(Team(team.players :+ player)))
}
This says "start with an empty team and repeatedly add players until we have the necessary number", but without any explicit recursive calls. As long as the monad instance is lawful (according to Cats's definition—there's some question about whether tailRecM even belongs on Monad), you don't have to worry about stack safety.
As a side note, fa.as(b) is equivalent to fa.map(_ => b), which does the same thing as your fa >>= (_ => IO(team)) but is more idiomatic.
Also as a side note (but maybe a more interesting one), you can write this method even more concisely (and to my eye more clearly) as follows:
import cats.effect.IO
import cats.syntax.monad._
case class Player (name: String)
case class Team (players: List[Player])
// For the sake of example.
def readPlayer: IO[Player] = IO(Player("foo"))
/**
 * Reads a team of 7 players from the command line.
 * @return
 */
def readTeam: IO[Team] = Team(Nil).iterateUntilM(team =>
  readPlayer.map(player => Team(team.players :+ player))
)(_.players.size >= 7)
Again there are no explicit recursive calls, and it's even more declarative than the tailRecM version—it's just "perform this action iteratively until the given condition holds".
One postscript: you might wonder why you'd ever use tailRecM when IO#flatMap is stack safe, and one reason is that you may someday decide to make your program generic in the effect context (e.g. via the finally tagless pattern). In that setting you should not assume that flatMap behaves the way you want it to, since lawfulness for cats.Monad doesn't require flatMap to be stack safe, so it's best to avoid explicit recursive calls through flatMap and to choose tailRecM or iterateUntilM, etc. instead, since these are guaranteed to be stack safe for any lawful monadic context.
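To make that concrete, here is a sketch of my own (not from the original answer) of the same logic written against an arbitrary effect F[_], reusing the Player and Team definitions above; only Monad[F] is assumed, so iterateUntilM is the safe choice:

import cats.Monad
import cats.syntax.functor._
import cats.syntax.monad._

// Generic in the effect: nothing here assumes F's flatMap is stack safe,
// so we lean on iterateUntilM, which is stack safe for any lawful Monad.
def readTeamF[F[_]: Monad](readPlayer: F[Player]): F[Team] =
  Team(Nil).iterateUntilM { team =>
    readPlayer.map(player => Team(team.players :+ player))
  }(_.players.size >= 7)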
Say I have a local method/function
def withExclamation(string: String) = string + "!"
Is there a way in Scala to transform an instance by supplying this method? Say I want to append an exclamation mark to a string. Something like:
val greeting = "Hello"
val loudGreeting = greeting.applyFunction(withExclamation) //result: "Hello!"
I would like to be able to invoke (local) functions when writing a chain transformation on an instance.
EDIT: Multiple answers show how to program this possibility, so it seems that this feature is not present on an arbitrary class. To me this feature seems incredibly powerful. Consider a case where, in Java, I want to execute a number of operations on a String:
appendExclamationMark(" Hello ".trim().toUpperCase()); // "HELLO!"
The order of operations is not the order in which they read. The last operation, appendExclamationMark, is the first word that appears. Currently in Java I would sometimes do:
Function.<String>identity()
    .andThen(String::trim)
    .andThen(String::toUpperCase)
    .andThen(this::appendExclamationMark)
    .apply(" Hello "); // "HELLO!"
This reads better in terms of expressing a chain of operations on an instance, but it also contains a lot of noise, and it is not intuitive to have the String instance on the last line. I would want to write:
" Hello "
.applyFunction(String::trim)
.applyFunction(String::toUpperCase)
.applyFunction(this::withExclamation); //"HELLO!"
Obviously the name of the applyFunction function can be anything (shorter please). I thought backwards compatibility was the sole reason Java's Object does not have this.
Is there any technical reason why this was not added on, say, the Any or AnyRef classes?
You can do this with an implicit class which provides a way to extend an existing type with your own methods:
object StringOps {

  implicit class RichString(val s: String) extends AnyVal {
    def withExclamation: String = s"$s!"
  }

  def main(args: Array[String]): Unit = {
    val m = "hello"
    println(m.withExclamation)
  }
}
Yields:
hello!
If you want to apply any functions (anonymous, converted from methods, etc.) in this way, you can use a variation on Yuval Itzchakov's answer:
object Combinators {
  implicit class Combinators[A](val x: A) {
    def applyFunction[B](f: A => B) = f(x)
  }
}
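A usage sketch (assuming the withExclamation method from the question is in scope):

import Combinators._

def withExclamation(string: String) = string + "!"

// Local methods are eta-expanded to function values automatically:
val loudGreeting = "Hello".applyFunction(withExclamation) // "Hello!"

// And the chain from the question becomes:
" Hello "
  .applyFunction((_: String).trim)
  .applyFunction((_: String).toUpperCase)
  .applyFunction(withExclamation) // "HELLO!"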
A while after asking this question, I noticed that Kotlin has this built in:
inline fun <T, R> T.let(block: (T) -> R): R
Calls the specified function block with this value as its argument and returns
its result.
Many more quite useful variations of this function are provided on all types, like with, also, apply, etc.
As a personal project, I am writing yet another Scala library for DynamoDb. It contains many interesting aspects, such as reading and writing from an AST (just as with JSON), handling HTTP requests, streaming data…
In order to be able to communicate with DynamoDb, one needs to be able to read from / write to the DynamoDb format (the "AST"). I extracted this reading / writing from / to the AST into a minimalist library: dynamo-ast. It contains two main type classes: DynamoReads[_] and DynamoWrites[_] (deeply inspired by Play Json).
I successfully coded the reading part of the library, ending up with very simple code such as:
trait DynamoRead[A] { self =>
  def read(dynamoType: DynamoType): DynamoReadResult[A]
}

case class TinyImage(url: String, alt: String)

val dynamoReads: DynamoReads[TinyImage] = {
  for {
    url <- read[String].at("url")
    alt <- read[String].at("alt")
  } yield (url, alt) map (TinyImage.apply _).tupled
}

dynamoReads.reads(dynamoAst) // yields DynamoReadResult[TinyImage]
At that point, I thought I had written the most complicated part of the library and that the DynamoWrites[_] part would be a piece of cake. I am however stuck on writing the DynamoWrites part. I was a fool.
My goal is to provide a very similar "user experience" with DynamoWrites[_] and keep it as simple as possible, such as:
val dynamoWrites: DynamoWrites[TinyImage] = {
  for {
    url <- write[String].at("url")
    alt <- write[String].at("alt")
  } yield (url, alt) map (TinyImage.unapply _) // I am not sure what to yield here nor how to code it
}

dynamoWrites.write(TinyImage("http://fake.url", "The alt desc")) // yields DynamoWriteResult[DynamoType]
Since this library is deeply inspired by the Play Json library (because I like its simplicity), I have looked at its sources several times. I kind of dislike the way the writer part is coded, because to me it adds a lot of overhead (basically, each time a field is written, a new JsObject is created with one field, and the resulting JsObject for a complete class is the merge of all the JsObjects containing one field).
From my understanding, the DynamoReads part can be written with only one trait (DynamoRead[_]). The DynamoWrites part, however, requires at least two, such as:
trait DynamoWrites[A] {
  def write(a: A): DynamoWriteResult[DynamoType]
}

trait DynamoWritesPath[A] {
  def write(path: String, a: A): DynamoWriteResult[(String, DynamoType)]
}
DynamoWrites[_] is for writing plain String, Int… and DynamoWritesPath[_] is for writing a tuple of (String, WhateverTypeHere) (to simulate a "field"). So writing write[String].at("url") would yield a DynamoWritesPath[String]. Now I have several issues:
I have no clue how to write flatMap for my DynamoWritesPath[_]
what a for comprehension should yield to be able to obtain a DynamoWrites[TinyImage]
What I wrote so far is totally fuzzy and does not compile at all (looking for some help on this); it is not committed at the moment (gist): https://gist.github.com/louis-forite/cad97cc0a47847b2e4177192d9dbc3ae
To sum up, I am looking for some guidance on how to write the DynamoWrites[_] part. My goal is to provide the client with the most straightforward way to code a DynamoWrites[_] for a given type. My non-goal is to write the perfect library; I do want to keep it a zero-dependency library.
Link to the library: https://github.com/louis-forite/dynamo-ast
A Reads is a covariant functor. That means it has map. It can also be seen as a Monad which means it has flatMap (although a monad is overkill unless you need the previous field in order to know how to process the next):
trait Reads[A] {
  def map[B](f: A => B): Reads[B]
  def flatMap[B](f: A => Reads[B]): Reads[B] // not necessary, but available
}
The reason for this is that to transform a Reads[Int] into a Reads[String], you need to first read the Int, then apply the Int => String function.
But a Writes is a contravariant functor. It has contramap where the direction of the types is reversed:
trait Writes[A] {
  def contramap[B](f: B => A): Writes[B]
}
The direction of the function is reversed because to transform a Writes[Int] into a Writes[String], you must receive the String from the caller, apply the String => Int transformation, and then write the Int.
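A minimal sketch of contramap in action (a hypothetical Writes that renders values to String, not the library's actual API):

trait Writes[A] { self =>
  def write(a: A): String

  // To write a B, first turn it into an A, then write the A.
  def contramap[B](f: B => A): Writes[B] = new Writes[B] {
    def write(b: B): String = self.write(f(b))
  }
}

val intWrites: Writes[Int] = new Writes[Int] {
  def write(a: Int): String = a.toString
}

// A Writes[String] obtained by going backwards through String => Int:
val lengthWrites: Writes[String] = intWrites.contramap(_.length)
// lengthWrites.write("hello") == "5"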
I don't think it makes sense to provide for-comprehension syntax (flatMap) for the Writes API.
// here it is clear that you're extracting a string value
url <- read[String].at("url")

// but what does this mean for the write method?
url <- write[String].at("url")
// what is `url`?
That's probably why Play doesn't provide one either, and why it focuses on its combinator syntax (using the and function, its version of an applicative functor builder).
For reference: http://blog.tmorris.net/posts/functors-and-things-using-scala/index.html
You can achieve a more consistent API by using something like the and method from Play Json:
(write[String]("url") and write[String]("alt"))(unlift(TinyImage.unapply))
(read[String]("url") and read[String]("alt"))(TinyImage.apply)
// unfortunately, the type ascription is necessary in this case
(write[String]("url") and write[String]("alt")) { (x: TinyImage) =>
  (x.url, x.alt)
}

// transforming
val instantDynamoType: DynamoFormat[Instant] =
  format[String].xmap(Instant.parse _)((_: Instant).toString)
You can still use a for-comprehension for the reads, although it's a bit over-powered (it sort of implies that fields must be processed in sequence, while that's not technically necessary).
Having used a Scala library that liberally exposes its reliance on implicits to the caller, I have experienced friction around this mechanism: Scala makes it quite hard at times to debug implicit arguments, and there are quite a few places from which Scala will fill in values for implicit arguments. (At one point I could almost relate to it as "implicits hell".)
At one time in my coding, Scala "complained" that an implicit value could not be found, whereas in fact there was a "collision" of implicit values, each coming from a different import.
Regardless of that perceived brittleness, implicits may at times feel like borderline abuse of the context design pattern.
Why does it make sense to have implicit parameters in Scala?
In what scenarios would you use them and how would you avoid trouble?
As I'm not sure the experimentation curve and the potential for other team members getting totally confused are worth it, could you possibly suggest other Scala idioms for sharing context between a multitude of Scala functions?
This question is not about a specific implementation at hand; hopefully it's still a good fit for this site.
Generally, using a common type as an implicit parameter is a bad idea.
def badIdea(n: Int)(implicit s: String) = s * n
It doesn't take much to imagine why: you'll get conflicting implicits for the same thing if anyone else adopts this policy. Better to avoid it.
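A sketch of that conflict (hypothetical implicit values):

def badIdea(n: Int)(implicit s: String) = s * n

implicit val greeting: String = "hi"
implicit val separator: String = ", "

// badIdea(3) // won't compile: ambiguous implicit values, both match String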
But who really wants to stuff in a scala.concurrent.ExecutionContext manually every time it's needed (which is practically everywhere)?
So the key is: when you have something with a specialized type, especially if it's bookkeeping that might need to be overridden manually but mostly should just do the right thing, then use implicit parameters. (This usually covers type classes as well.)
Then what do you do if you really need a string? Well, wrap it (at least formally; here it's a value class, so in some contexts it will just pass the string around):
class MyWrappedString(val underlying: String) extends AnyVal {}
implicit val myString = new MyWrappedString("bird")
def decentIdea(n: Int)(implicit mws: MyWrappedString) = mws.underlying * n
scala> decentIdea(2) // In the bush?
res14: String = birdbird
Or if you think some additional logic is helpful, write a wrapper that takes an extra type parameter:
class ImplicitWithValue[K, V](val value: V) {
  // Any extra generic logic goes here
}

object ImplicitWithValue {
  class ValuePart[K] {
    def apply[V](v: V) = new ImplicitWithValue[K, V](v)
  }

  private val genericValuePart = new ValuePart[Any]
  private def typedValuePart[K] = genericValuePart.asInstanceOf[ValuePart[K]]

  def apply[K] = typedValuePart[K]
}
Then you can
trait Marker1
implicit val implicit1 = ImplicitWithValue[Marker1]("fish")
def goodIdea(n: Int)(implicit ms: ImplicitWithValue[Marker1, String]) = ms.value * n
scala> goodIdea(3)
res17: String = fishfishfish
I'm asking a slightly different question than this one. Suppose I have a code snippet:
def foo(i: Int): List[String] = {
  val s = i.toString + "!" // using val
  s :: Nil
}
This is functionally equivalent to the following:
def foo(i: Int): List[String] = {
  def s = i.toString + "!" // using def
  s :: Nil
}
Why would I choose one over the other? Obviously I would assume the second has slight disadvantages in:
creating more bytecode (the inner def is lifted to a method in the class)
a runtime performance overhead of invoking a method over accessing a value
non-strict evaluation means I could easily access s twice (i.e. unnecessarily redo a calculation)
The only advantage I can think of is:
non-strict evaluation of s means it is only called if it is used (but then I could just use a lazy val)
What are people's thoughts here? Is there a significant downside to making all inner vals defs?
1)
One answer I didn't see mentioned is that the stack frame for the method you're describing could actually be smaller. Each val you declare will occupy a slot on the JVM stack; however, a value obtained via a def gets consumed in the first expression that uses it, so it never needs a slot. Even if the def references something from the environment, the compiler passes those references along as arguments to the generated method.
HotSpot should optimize both of these things, or so some people claim. See:
http://www.ibm.com/developerworks/library/j-jtp12214/
Since the inner method gets compiled into a regular private method behind the scenes, and it is usually very small, the JIT compiler might choose to inline it and then optimize it. This could save time by allocating smaller stack frames (?), or, by having fewer elements on the stack, make local variable access quicker.
But, take this with a (big) grain of salt - I haven't actually made extensive benchmarks to backup this claim.
2)
In addition, to expand on Kevin's valid reply, the stability a val provides also means that you can use it with path-dependent types - something you can't do with a def, since the compiler doesn't check its purity.
3)
For another reason you might want to use a def, see a related question asked not so long ago:
Functional processing of Scala streams without OutOfMemory errors
Essentially, using defs to produce Streams ensures that there do not exist additional references to these objects, which is important for the GC. Since Streams are lazy anyway, the overhead of creating them is probably negligible even if you have multiple defs.
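A hedged sketch of the pattern that answer describes (pre-2.13 Stream; the same reasoning applies to LazyList):

// With a val, the named reference keeps the stream's head reachable for the
// whole traversal, so every element forced so far is retained on the heap.
val pinned: Stream[Int] = Stream.from(0)
// pinned.foreach(_ => ()) // risks OutOfMemoryError: the head stays reachable

// With a def, each call builds a fresh stream and no outer reference
// survives, so the already-consumed prefix can be garbage collected.
def fresh: Stream[Int] = Stream.from(0)
// fresh.foreach(_ => ()) // the consumed prefix can be collected as it goes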
The val is strict, it's given a value as soon as you define the thing.
Internally, the compiler will mark it as STABLE, equivalent to final in Java. This should allow the JVM to make all sorts of optimisations - I just don't know what they are :)
I can see an advantage in the fact that you are less bound to a location when using a def than when using a val.
This is not a technical advantage but allows for better structuring in some cases.
So, a stupid example (please edit this answer if you've got a better one): this is not possible with a val:
def foo(i: Int): List[String] = {
  def ret = s :: Nil
  def s = i.toString + "!"
  ret
}
There may be cases where this is important or just convenient.
(So, basically, you can achieve the same thing with a lazy val, but a def, if called at most once, will probably be faster than a lazy val.)
For a local declaration like this (with no arguments, evaluated precisely once and with no code evaluated between the point of declaration and the point of evaluation) there is no semantic difference. I wouldn't be surprised if the "val" version compiled to simpler and more efficient code than the "def" version, but you would have to examine the bytecode and possibly profile to be sure.
In your example I would use a val. I think the val/def choice is more meaningful when declaring class members:
class A { def a0 = "a"; def a1 = "a" }

class B extends A {
  var c = 0
  override def a0 = { c += 1; "a" + c }
  override val a1 = "b"
}
Using def in the base class allows the subclass to override it with a def that need not return a constant, or with a val. So a def gives more flexibility than a val.
Edit: one more use case for def over val is when an abstract class has a member whose value should be provided by a subclass:
abstract class C { def f: SomeObject }
new C { val f = new SomeObject(...) }