Noticing that my code was essentially iterating over a list and updating a value in a Map, I first created a trivial helper method which took a function for the transformation of the map value and returned an updated map. As the program evolved, it gained a few other Map-transformation functions, so it was natural to turn it into an implicit value class that adds methods to scala.collection.immutable.Map[A, B]. That version works fine.
However, there's nothing about the methods that requires a specific map implementation, and they would seem to apply to a scala.collection.Map[A, B] or even a MapLike. So I would like it to be generic in the map type as well as the key and value types. This is where it all goes pear-shaped.
My current iteration looks like this:
implicit class RichMap[A, B, MapType[A, B] <: collection.Map[A, B]](
val self: MapType[A, B]
) extends AnyVal {
def updatedWith(k: A, f: B => B): MapType[A, B] =
self updated (k, f(self(k)))
}
This code does not compile because self updated (k, f(self(k))) is a scala.collection.Map[A, B], which is not a MapType[A, B]. In other words, the return type of self.updated is as if self's type were the upper type bound rather than the actual declared type.
I can "fix" the code with a downcast:
def updatedWith(k: A, f: B => B): MapType[A, B] =
self.updated(k, f(self(k))).asInstanceOf[MapType[A, B]]
This does not feel satisfactory because downcasting is a code smell and indicates misuse of the type system. In this particular case it would seem that the value will always be of the cast-to type, and that the whole program compiles and runs correctly with this downcast supports this view, but it still smells.
So, is there a better way to write this code to have scalac correctly infer types without using a downcast, or is this a compiler limitation and a downcast is necessary?
[Edited to add the following.]
My code which uses this method is somewhat more complex and messy as I'm still exploring a few ideas, but an example minimum case is the computation of a frequency distribution as a side-effect with code roughly like this:
var counts = Map.empty[Int, Int] withDefaultValue 0
for (item <- items) {
// loads of other gnarly item-processing code
counts = counts updatedWith (count, 1 + _)
}
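For comparison, here is a minimal sketch of the same frequency count written as a fold over a plain immutable Map with a default value. The names FrequencySketch, updatedWith (as a free-standing method) and frequencies are hypothetical and exist only for this illustration; the sketch is deliberately not generic in the map type, which is exactly the limitation the question is about.

```scala
object FrequencySketch {
  // Hypothetical helper: same idea as updatedWith, but fixed to immutable.Map.
  def updatedWith[A, B](m: Map[A, B], k: A)(f: B => B): Map[A, B] =
    m.updated(k, f(m(k)))

  // Counts occurrences; the default value 0 makes m(k) safe for unseen keys.
  def frequencies[A](items: Seq[A]): Map[A, Int] =
    items.foldLeft(Map.empty[A, Int].withDefaultValue(0)) { (counts, item) =>
      updatedWith(counts, item)(_ + 1)
    }
}
```

Calling FrequencySketch.frequencies(Seq("a", "b", "a")) yields a map with "a" counted twice and "b" once.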
There are three answers to my question at the time of writing.
One boils down to just letting updatedWith return a scala.collection.Map[A, B] anyway. Essentially, it takes my original version that accepted and returned an immutable.Map[A, B], and makes the type less specific. In other words, it's still insufficiently generic and sets policy on which types the caller uses. I can certainly change the type on the counts declaration, but that is also a code smell to work around a library returning the wrong type, and all it really does is move the downcast into the caller's code. So I don't really like this answer at all.
The other two are variations on CanBuildFrom and builders in that they essentially iterate over the map to produce a modified copy. One inlines a modified updated method, whereas the other calls the original updated and appends it to the builder and thus appears to make an extra temporary copy. Both are good answers which solve the type correctness problem, although the one that avoids an extra copy is the better of the two from a performance standpoint and I prefer it for that reason. The other is however shorter and arguably more clearly shows intent.
In the case of a hypothetical immutable Map that shares large trees in a similar vein to List, this copying would break the sharing and reduce performance, and so it would be preferable to use the existing updated method without performing copies. However, Scala's immutable maps don't appear to do this and so copying (once) seems to be the pragmatic solution that is unlikely to make any difference in practice.
Yes! Use CanBuildFrom. This is how the Scala collections library infers the closest collection type to the one you want. You need implicit evidence of CanBuildFrom[From, Elem, To], where From is the type of collection you're starting with, Elem is the type contained within the collection, and To is the end result you want. The CanBuildFrom will supply a Builder to which you can add elements, and when you're done, you can call Builder#result() to get the completed collection of the appropriate type.
In this case:
From = MapType[A, B]
Elem = (A, B) // The type actually contained in maps
To = MapType[A, B]
Implementation:
import scala.collection.generic.CanBuildFrom
implicit class RichMap[A, B, MapType[A, B] <: collection.Map[A, B]](
val self: MapType[A, B]
) extends AnyVal {
def updatedWith(k: A, f: B => B)(implicit cbf: CanBuildFrom[MapType[A, B], (A, B), MapType[A, B]]): MapType[A, B] = {
val builder = cbf()
builder ++= self.updated(k, f(self(k)))
builder.result()
}
}
scala> val m = collection.concurrent.TrieMap(1 -> 2, 5 -> 3)
m: scala.collection.concurrent.TrieMap[Int,Int] = TrieMap(1 -> 2, 5 -> 3)
scala> m.updatedWith(1, _ + 10)
res1: scala.collection.concurrent.TrieMap[Int,Int] = TrieMap(1 -> 12, 5 -> 3)
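To see the Builder half of this in isolation, here is a small sketch that obtains a builder straight from a concrete companion object instead of from CanBuildFrom evidence. BuilderSketch and rebuildWith are hypothetical names, and fixing the types to TreeMap[Int, String] is an assumption made to keep the example short.

```scala
import scala.collection.immutable.TreeMap

object BuilderSketch {
  // Rebuilds a TreeMap entry by entry, applying f to the value stored under k.
  def rebuildWith(source: TreeMap[Int, String], k: Int)(f: String => String): TreeMap[Int, String] = {
    val builder = TreeMap.newBuilder[Int, String] // Builder[(Int, String), TreeMap[Int, String]]
    for ((key, value) <- source) {
      builder += (key -> (if (key == k) f(value) else value))
    }
    builder.result() // the completed collection, still a TreeMap
  }
}
```

CanBuildFrom plays the role of TreeMap.newBuilder here: it lets the compiler pick the right builder for whatever MapType the caller started with.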
Please note that the updated method returns a Map rather than a generic type, so I would say you should be fine returning Map as well. But if you really want to return the proper type, you could have a look at the implementation of List.updated.
I've written a small example. I'm not sure it covers all the cases, but it passes my tests. I also used mutable Map, because it was harder for me to test immutable, but I guess it can be easily converted.
implicit class RichMap[A, B, MapType[x, y] <: Map[x, y]](val self: MapType[A, B]) extends AnyVal {
import scala.collection.generic.CanBuildFrom
def updatedWith[R >: B](k: A, f: B => R)(implicit bf: CanBuildFrom[MapType[A, B], (A, R), MapType[A, R]]): MapType[A, R] = {
val b = bf(self)
for ((key, value) <- self) {
if (key != k) {
b += (key -> value)
} else {
b += (key -> f(value))
}
}
b.result()
}
}
import scala.collection.immutable.{TreeMap, HashMap}
val map1 = HashMap(1 -> "s", 2 -> "d").updatedWith(2, _.toUpperCase()) // map1 type is HashMap[Int, String]
val map2 = TreeMap(1 -> "s", 2 -> "d").updatedWith(2, _.toUpperCase()) // map2 type is TreeMap[Int, String]
val map3 = HashMap(1 -> "s", 2 -> "d").updatedWith(2, _.asInstanceOf[Any]) // map3 type is HashMap[Int, Any]
Please also note that the CanBuildFrom pattern is much more powerful, and this example doesn't use all of its power. Thanks to CanBuildFrom, some operations can change the type of the collection completely: for example, the type of BitSet(1, 3, 5, 7) map { _.toString } is actually SortedSet[String].
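The BitSet case can be checked with a few lines. This is a sketch only; BitSetMapSketch and the value names are made up for the illustration, and the result type of the second map is left to inference rather than annotated.

```scala
import scala.collection.immutable.BitSet

object BitSetMapSketch {
  val bits = BitSet(1, 3, 5, 7)

  // Int => Int: the result can still be represented as a BitSet.
  val shifted: BitSet = bits.map(_ + 1)

  // Int => String: a BitSet cannot hold strings, so the library
  // falls back to a more general set type for the result.
  val strings = bits.map(_.toString)
}
```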
Since Scala 2.12 (or is it 2.13, can't be sure), the compiler can infer latent type arguments across multiple methods:
def commutative[
A,
B
]: ((A, B) => (B, A)) = {???} // implementation omitted
val a = (1 -> "a")
val b = commutative.apply(a)
The last line successfully infers A = Int, B = String. Unfortunately, this requires an instance a: (Int, String) to be given.
Now I'd like to twist this API for a bit and define the following function:
def findApplicable[T](fn: Any => Any)
Such that findApplicable[(Int, String)](commutative) automatically generates the correct function specialised for A = Int, B = String. Is there a way to do this within the language's capabilities? Or will I have to upgrade to Scala 3 to do this?
UPDATE 1: It should be noted that the output of commutative can be any type, not necessarily a Function2, e.g. I've tried the following definition:
trait SummonedFn[-I, +O] extends (I => O) {
final def summon[II <: I]: this.type = this
}
Then redefine commutative to use it:
def commutative[
A,
B
]: SummonedFn[(A, B), (B, A)] = {???} // implementation omitted
val b = commutative.summon[(Int, String)]
Oops, this doesn't work: type parameters don't get the same treatment as value parameters.
If at some point some call-site knows the types of arguments (they aren't actually Any => Any) it is doable using type classes:
trait Commutative[In, Out] {
def swap(in: In): Out
}
object Commutative {
def swap[In, Out](in: In)(implicit c: Commutative[In, Out]): Out =
c.swap(in)
implicit def tuple2[A, B]: Commutative[(A, B), (B, A)] =
in => in.swap
}
At call site:
def use[In, Out](ins: List[In])(implicit c: Commutative[In, Out]): List[Out] =
ins.map(Commutative.swap(_))
However, this way you have to pass both In as well as Out as type parameters. If there are multiple possible Outs for a single In type, then there is not much you can do.
But if you want to have Input type => Output type implication, you can use dependent types:
trait Commutative[In] {
type Out
def swap(in: In): Out
}
object Commutative {
// help us transform dependent types back into generics
type Aux[In, Out0] = Commutative[In] { type Out = Out0 }
def swap[In](in: In)(implicit c: Commutative[In]): c.Out =
c.swap(in)
implicit def tuple2[A, B]: Commutative.Aux[(A, B), (B, A)] =
in => in.swap
}
Call site:
// This code is similar to the original code, but when the compiler
// will be looking for In it will automatically figure Out.
def use[In, Out](ins: List[In])(implicit c: Commutative.Aux[In, Out]): List[Out] =
ins.map(Commutative.swap(_))
// Alternatively, without Aux pattern:
def use2[In](ins: List[In])(implicit c: Commutative[In]): List[c.Out] =
ins.map(Commutative.swap(_))
def printMapped(list: List[(Int, String)]): Unit =
println(list)
// The call site that knows the input and provides implicit
// will also know the exact Out type.
printMapped(use(List("a" -> 1, "b" -> 2)))
printMapped(use2(List("a" -> 1, "b" -> 2)))
That's how you can solve the issue when you know the exact input type. If you don't know it, then you cannot use the compiler (neither in Scala 2 nor in Scala 3) to generate this behavior; you would have to implement this functionality using some runtime reflection, e.g. checking types using isInstanceOf, casting to some assumed types and then running predefined behavior.
I'm not sure I understand the question 100%, but it seems like you want to do some kind of advanced partial type application. Usually you can achieve such an API by introducing an intermediary class. And to preserve as much type information as possible you can use a method with a dependent return type.
class FindApplicablePartial[A] {
def apply[B](fn: A => B): fn.type = fn
}
def findApplicable[A] = new FindApplicablePartial[A]
scala> def result = findApplicable[(Int, String)](commutative)
def result: SummonedFn[(Int, String),(String, Int)]
And actually in this case since findApplicable itself doesn't care about type B (i.e. B doesn't have a context bound or other uses), you don't even need the intermediary class, but can use a wildcard/existential type instead:
def findApplicable[A](fn: A => _): fn.type = fn
This works just as well.
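Here is a self-contained sketch of the intermediary-class approach. It uses a plain Function1 for swap instead of SummonedFn and returns A => B rather than fn.type, both of which are simplifying assumptions; PartialApplicationSketch, swap and swapped are hypothetical names.

```scala
object PartialApplicationSketch {
  // Fixes A first; B is inferred later from the function that is passed in.
  final class FindApplicablePartial[A] {
    def apply[B](fn: A => B): A => B = fn
  }
  def findApplicable[A]: FindApplicablePartial[A] = new FindApplicablePartial[A]

  // Stand-in for commutative: polymorphic, with no value parameters.
  def swap[A, B]: ((A, B)) => (B, A) = pair => pair.swap

  // Only the input type is written out; the output type is inferred.
  val swapped: ((Int, String)) => (String, Int) =
    findApplicable[(Int, String)](swap)
}
```

The explicit type on swapped is there only to show what the compiler infers; it could be left off.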
I am wondering if there is a typeclass in Cats or Scalaz which offers an operator like this:
def scan[G[_],A,B](zero: B)(g: G[A],f: (A,B) => B):G[B]
Or if there exists some mathematical definition for such an operator (something like Monad for bind/flatMap).
The idea of this typeclass would be to apply a binary function to a type constructor and obtain back the same type constructor but with a different type parameter (the same type that the binary function returns).
I think it would be similar to scanLeft from the Scala standard library collections.
One possible implementation is to traverse with a State:
import cats._, data._, implicits._
def scan[G[_]: Traverse: Applicative: MonoidK, A, B](list: G[A], zero: B, f: (B, A) => B): G[B] = {
def generate(a: A): State[B, B] =
for {
prev <- State.get[B]
next = f(prev, a)
_ <- State.set(next)
} yield next
zero.pure[G] <+> list.traverse(generate).runA(zero).value
}
This works like scanLeft from stdlib for Vectors and Lists (but not options!), but requires quite a lot of typeclasses! Unfortunately, stdlib scanLeft prepends the initial element, so the result collection will always be one element larger than the original one and no single typeclass provides any similar operation.
If you are fine with not prepending zero, all you need on G[_] is Traverse, and this is not half-bad. If you're not, you're probably better off generalizing using subtypes.
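The prepending behaviour of the standard library's scanLeft is easy to see on a small list. This is only an illustration with made-up names (ScanLeftSketch, withZero, withoutZero), using nothing beyond the standard library.

```scala
object ScanLeftSketch {
  val input = List(5, 10)

  // The standard scanLeft keeps the initial value as the first element.
  val withZero: List[Int] = input.scanLeft(0)(_ + _)

  // Dropping the head gives the variant that does not prepend zero.
  val withoutZero: List[Int] = withZero.tail
}
```

Here withZero has one more element than input, which is why a single Traverse constraint is not enough to reproduce it.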
Answering the original question, no, I don't think there is such a typeclass already. However, you can implement similar functionality with Foldable.
Using Cats:
import cats.data.NonEmptyList
import cats.Foldable
implicit class ScanLeftable[F[_], T](val ts: F[T]) extends AnyVal {
def scanLeft2[U](zero: U)(f: (U, T) => U)
(implicit fo: Foldable[F]): NonEmptyList[U] = {
Foldable[F].foldLeft(ts, NonEmptyList.of(zero)) { (lu, t) =>
f(lu.head, t) :: lu
}.reverse
}
}
import cats.instances.list._
val z = List(5, 10).scanLeft2(0)((a, b) => a + b)
println(z == NonEmptyList.of(0, 5, 15)) //true
You might try to make it more generic in terms of the return type, or return a lazy structure like Iterator, though. However, I'm not sure how generic it could be without introducing a new typeclass.
EDIT: the method is scanLeft2 strictly so that I can be sure the standard library one isn't called.
I have read articles about TypeTag, but I am unable to filter a collection by the type of its elements.
Example:
trait A
class B extends A
class C extends A
val v = Vector(new B,new C)
v filter ( _.isInstanceOf[B] )
The code above works fine.
However, I want to extract the filter into a method of its own, separate from v. E.g.:
def filter[T,T2](data:Traversable[T2]) = (data filter ( _.isInstanceOf[T])).asInstanceOf[Traversable[T]]
//Then filter v by
filter[B,A](v)
In this case I get the warning "abstract type T is unchecked since it is eliminated by erasure". I tried to use TypeTag, but it does not seem easy to get the Type at runtime.
Is there an elegant way to implement the filter function?
A solution using Scala macros is also acceptable.
You need to provide a ClassTag, not a TypeTag, and use pattern matching. ClassTags work well with pattern matching. You can even use the collect method to perform the filter and map together:
import scala.reflect.ClassTag
def filter[T, T2](data: Traversable[T2])(implicit ev: ClassTag[T]) = data collect {
case t: T => t
}
For example:
val data = Seq(new B, new B, new C, new B)
filter[B, A](data) //Traversable[B] with length 3
filter[C, A](data) //Traversable[C] with length 1
One problem with this is that it might not work as expected with nested generic types.
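The nested-generics caveat comes from erasure: a ClassTag only knows the outer runtime class. A minimal sketch, assuming a Seq[Any] input instead of Traversable to keep it short; ErasureSketch and filterByType are hypothetical names.

```scala
import scala.reflect.ClassTag

object ErasureSketch {
  def filterByType[T: ClassTag](data: Seq[Any]): Seq[T] =
    data.collect { case t: T => t }

  val mixed: Seq[Any] = Seq(List(1, 2), List("a"), "x")

  // Works as expected: only the String element is kept.
  val strings: Seq[String] = filterByType[String](mixed)

  // Surprising: the tag only checks for List, so List(1, 2) is kept too,
  // even though the static type claims every element is a List[String].
  val lists: Seq[List[String]] = filterByType[List[String]](mixed)
}
```

Reading an element of the first list in lists as a String would fail at runtime with a ClassCastException.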
The collect method takes a parameter of type PartialFunction, representing a function that does not need to be defined on the entire domain. When using collect, elements for which the PartialFunction is undefined are filtered out, and elements that match some case statement are mapped accordingly.
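A short sketch of that behaviour, with made-up names (CollectSketch, doubledInts, viaFilterAndMap, doubleIfInt), comparing collect against the equivalent filter followed by map:

```scala
object CollectSketch {
  val values: List[Any] = List(1, "two", 3, 4.0)

  // One pass: elements the partial function is not defined for are dropped.
  val doubledInts: List[Int] = values.collect { case n: Int => n * 2 }

  // The same result written as a separate filter and map.
  val viaFilterAndMap: List[Int] =
    values.filter(_.isInstanceOf[Int]).map(_.asInstanceOf[Int] * 2)

  // The braces above are just a PartialFunction literal.
  val doubleIfInt: PartialFunction[Any, Int] = { case n: Int => n * 2 }
}
```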
You can also use existential types and let the compiler deduce the type of the data parameter for a more concise syntax. You can also use context bounds:
def filter[T : ClassTag](data: Traversable[_]) = data collect { case t: T => t }
filter[B](data)
One problem with the methods here is that there is a significant difference from the native filter method: these methods always return a Traversable, while the native filter returns the best type it can. For example:
val data = Vector(new B, new B, new C, new B)
data filter { _.isInstanceOf[B] } //Vector[A]
data filter { _.isInstanceOf[B] } map { _.asInstanceOf[B] } //Vector[B]
data collect { case t: B => t } //Vector[B]. Note that if you know the type at the calling point, this is pretty concise and might not need a helper method at all
//As opposed to:
filter[B](data) //Traversable[B], not a Vector!
You can fix this by using the CanBuildFrom pattern using another implicit parameter. You can also use implicit classes to essentially add the method to the class (as opposed to calling the method in the static style shown above). This all adds up to a pretty complicated method, but I'll leave it here if you're interested in these enhancements:
implicit class RichTraversable[T2, Repr <: TraversableLike[T2, Repr], That](val trav: TraversableLike[T2, Repr]) extends AnyVal {
def withType[T : ClassTag](implicit bf: CanBuildFrom[Repr, T, That]) = trav.collect {
case t: T => t
}
}
This would allow you to do:
data.withType[B] //Vector[B], as desired
There must definitely be an easier way. The code below works, but I would hope for an even simpler solution.
trait A
case class B()
case class C()
import scala.reflect.runtime.universe._
import scala.util.Try
val m = runtimeMirror(getClass.getClassLoader)
def filter[T](data:Traversable[_])(implicit t:TypeTag[T]) = data collect {
case i if Try(i.getClass.asSubclass(m.runtimeClass(typeOf[T].typeSymbol.asClass))).isSuccess => i.asInstanceOf[T]
}
val v= Vector(new B,new C)
scala> println(filter[B](v))
Vector(B()) //This gets printed
I'm working on this Functional Programming in Scala exercise:
// But what if our list has an element type that doesn't have a Monoid instance?
// Well, we can always map over the list to turn it into a type that does.
As I understand this exercise, it means that, if we have a Monoid of type B, but our input List is of type A, then we need to convert the List[A] to List[B], and then call foldLeft.
def foldMap[A, B](as: List[A], m: Monoid[B])(f: A => B): B = {
val bs = as.map(f)
bs.foldLeft(m.zero)((s, i) => m.op(s, i))
}
Does this understanding and code look right?
First I'd simplify the syntax of the body a bit:
def foldMap[A, B](as: List[A], m: Monoid[B])(f: A => B): B =
as.map(f).foldLeft(m.zero)(m.op)
Then I'd move the monoid instance into its own implicit parameter list:
def foldMap[A, B](as: List[A])(f: A => B)(implicit m: Monoid[B]): B =
as.map(f).foldLeft(m.zero)(m.op)
See the original "Type Classes as Objects and Implicits" paper for more detail about how Scala implements type classes using implicit parameter resolution, or this answer by Rex Kerr that I've also linked above.
Next I'd switch the order of the other two parameter lists:
def foldMap[A, B](f: A => B)(as: List[A])(implicit m: Monoid[B]): B =
as.map(f).foldLeft(m.zero)(m.op)
In general you want to place the parameter lists containing parameters that change the least often first, in order to make partial application more useful. In this case there may only be one possible useful value of A => B for any A and B, but there are lots of values of List[A].
For example, switching the order allows us to write the following (which assumes a monoid instance for Bar):
val fooSum: List[Foo] => Bar = foldMap(fooToBar)
Finally, as a performance optimization (mentioned by stew above), you could avoid creating an intermediate list by moving the application of f into the fold:
def foldMap[A, B](f: A => B)(as: List[A])(implicit m: Monoid[B]): B =
as.foldLeft(m.zero) {
case (acc, a) => m.op(acc, f(a))
}
This is equivalent and more efficient, but to my eye much less clear, so I'd suggest treating it like any optimization—if you need it, use it, but think twice about whether the gains are really worth the loss of clarity.
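To try the final version end to end, here is a self-contained sketch. The Monoid trait mirrors the shape used in the question (op and zero); the intAddition instance, the FoldMapSketch object and totalLength are assumptions added for the illustration.

```scala
object FoldMapSketch {
  trait Monoid[A] {
    def op(a1: A, a2: A): A
    def zero: A
  }

  // Hypothetical instance: integers under addition.
  implicit val intAddition: Monoid[Int] = new Monoid[Int] {
    def op(a1: Int, a2: Int): Int = a1 + a2
    val zero: Int = 0
  }

  // Single pass: f is applied inside the fold, so no intermediate list is built.
  def foldMap[A, B](f: A => B)(as: List[A])(implicit m: Monoid[B]): B =
    as.foldLeft(m.zero)((acc, a) => m.op(acc, f(a)))

  // Partial application: fix f once, reuse the result for many lists.
  val totalLength: List[String] => Int = foldMap((s: String) => s.length)
}
```

With f first, totalLength can be passed around as an ordinary function, and the monoid instance is resolved once where it is defined.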
My problem is phrased in the code below.
I'm trying to accept some input that has a .map function. I know that if I call .map on it, it will hand me Ints.
// In my case, they are different representations of Ints
// By that I mean that in the end it all boils down to Int
val list: Seq[Int] = Seq(1,2,3,4)
val optInt: Option[Int] = Some(1)
// I can use a .map with a Seq, check!
list.map {
value => println(value)
}
// I can use it with an Option, check!
optInt.map {
value => println(value)
}
// Well, you're asking yourself why do I have to do it,
// Why don't I use foreach to solve my problem. Check!
list.foreach(println)
optInt.foreach(println)
// The problem is that I don't know what I'm going to get as input
// The only thing I know is that it's "mappable" (it has the .map function)
// And that if I were to apply .map it would return Ints to me
// Like this:
def printValues(genericInputThatHasMap: ???) {
genericInputThatHasMap.map {
value => println(value)
}
}
// The point is, what do I have to do to have this functionality?
// I'm researching right now, but I still haven't found anything.
// That's why I'm asking it here =(
// this works:
def printValues(genericInputThatHasMap: Seq[Int]) {
genericInputThatHasMap.map {
value => println(value)
}
}
Thanks in advance! Cheers!
First for a quick note about map and foreach. If you're only interested in performing an operation with a side effect (e.g., printing to standard output or a file, etc.) on each item in your collection, use foreach. If you're interested in creating a new collection by transforming each element in your old one, use map. When you write xs.map(println), you will in fact print all the elements of the collection, but you'll also get back a (completely useless) collection of units, and will also potentially confuse future readers of your code—including yourself—who expect foreach to be used in a situation like this.
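The difference shows up directly in the result types. A tiny sketch with made-up names (MapVsForeachSketch, units, done):

```scala
object MapVsForeachSketch {
  val xs = List(1, 2, 3)

  // map prints each element and also builds a list of unit values.
  val units: List[Unit] = xs.map(x => println(x))

  // foreach prints each element and returns nothing worth keeping.
  val done: Unit = xs.foreach(x => println(x))
}
```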
Now on to your problem. You've run into what is in my opinion one of the ugliest warts of the Scala standard library—the fact that methods named map and foreach (and flatMap) get magical treatment at the language level that has nothing to do with a specific type that defines them. For example, I can write this:
case class Foo(n: Int) {
def foreach(f: Int => Unit) {
(0 until n) foreach f
}
}
And use it in a for loop like this, simply because I've named my method foreach:
for (i <- Foo(10)) println(i)
You can use structural types to do something similar in your own code:
def printValues(xs: { def foreach(f: (Int) => Unit): Unit }) {
xs foreach println
}
Here any xs with an appropriately typed foreach method—for example an Option[Int] or a List[Int]—will compile and work as expected.
Structural types get a lot messier when you're trying to work with map or flatMap though, and are unsatisfying in other ways—they impose some ugly overhead due to their use of runtime reflection, for example. They actually have to be explicitly enabled in Scala 2.10 to avoid warnings for these reasons.
As senia's answer points out, the Scalaz library provides a much more coherent approach to the problem through the use of type classes like Monad. You wouldn't want to use Monad, though, in a case like this: it's a much more powerful abstraction than you need. You'd use Each to provide foreach, and Functor for map. For example, in Scalaz 7:
import scalaz._, Scalaz._
def printValues[F[_]: Each](xs: F[Int]) = xs foreach println
Or:
def incremented[F[_]: Functor](xs: F[Int]) = xs map (_ + 1)
To summarize, you can do what you want in a standard, idiomatic, but arguably ugly way with structural types, or you can use Scalaz to get a cleaner solution, but at the cost of a new dependency.
My thoughts on the two approaches.
Structural Types
You can use a structural type for foreach, but for map it doesn't appear you can construct one to work across multiple types. For example:
import collection.generic.CanBuildFrom
object StructuralMap extends App {
type HasMapAndForeach[A] = {
// def map[B, That](f: (A) ⇒ B)(implicit bf: CanBuildFrom[List[A], B, That]): That
def foreach[B](f: (A) ⇒ B): Unit
}
def printValues(xs: HasMapAndForeach[Any]) {
xs.foreach(println _)
}
// def mapValues(xs: HasMapAndForeach[Any]) {
// xs.map(_.toString).foreach(println _)
// }
def forComp1(xs: HasMapAndForeach[Any]) {
for (i <- Seq(1,2,3)) println(i)
}
printValues(List(1,2,3))
printValues(Some(1))
printValues(Seq(1,2,3))
// mapValues(List(1,2,3))
}
scala> StructuralMap.main(new Array[String](0))
1
2
3
1
1
2
3
See the map method commented out above, it has List hardcoded as a type parameter in the CanBuildFrom implicit. There might be a way to pick up the type generically - I will leave that as a question to the Scala type gurus out there. I tried substituting HasMapAndForeach and this.type for List but neither of those worked.
The usual performance caveats about structural types apply.
Scalaz
Since structural types are a dead end if you want to support map, let's look at the scalaz approach from Travis and see how it works. Here are his methods:
def printValues[F[_]: Each](xs: F[Int]) = xs foreach println
def incremented[F[_]: Functor](xs: F[Int]) = xs map (_ + 1)
(In the below correct me if I am wrong, I am using this as a scalaz learning experience)
The typeclasses Each and Functor are used to restrict the types of F to ones where implicits are available for Each[F] or Functor[F], respectively. For example, in the call
printValues(List(1,2,3))
the compiler will look for an implicit that satisfies Each[List]. The Each trait is
trait Each[-E[_]] {
def each[A](e: E[A], f: A => Unit): Unit
}
In the Each object there is an implicit for Each[TraversableOnce] (List is a subtype of TraversableOnce and the trait is contravariant):
object Each {
implicit def TraversableOnceEach[A]: Each[TraversableOnce] = new Each[TraversableOnce] {
def each[A](e: TraversableOnce[A], f: A => Unit) = e foreach f
}
}
Note that the "context bound" syntax
def printValues[F[_]: Each](xs: F[Int])
is shorthand for
def printValues[F[_]](xs: F[Int])(implicit ev: Each[F])
Both of these denote that F is a member of the Each typeclass. The implicit that satisfies the typeclass is passed as the ev parameter to the printValues method.
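The equivalence of the two spellings can be tried without scalaz. The sketch below defines its own tiny Each typeclass, which is an assumption for the sake of a dependency-free example and not scalaz's actual definition; ContextBoundSketch, listEach, optionEach, sumAll and sumAllExplicit are hypothetical names.

```scala
object ContextBoundSketch {
  // Hand-rolled stand-in for a foreach typeclass.
  trait Each[F[_]] {
    def each[A](fa: F[A])(f: A => Unit): Unit
  }

  implicit val listEach: Each[List] = new Each[List] {
    def each[A](fa: List[A])(f: A => Unit): Unit = fa.foreach(f)
  }
  implicit val optionEach: Each[Option] = new Each[Option] {
    def each[A](fa: Option[A])(f: A => Unit): Unit = fa.foreach(f)
  }

  // Context bound: the instance is unnamed and fetched with implicitly.
  def sumAll[F[_]: Each](xs: F[Int]): Int = {
    var total = 0
    implicitly[Each[F]].each(xs)(x => total += x)
    total
  }

  // Desugared form: the same instance arrives as a named implicit parameter.
  def sumAllExplicit[F[_]](xs: F[Int])(implicit ev: Each[F]): Int = {
    var total = 0
    ev.each(xs)(x => total += x)
    total
  }
}
```

Both methods accept a List[Int] or an Option[Int], and neither compiles for a type constructor without an Each instance in scope.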
Inside the printValues or incremented methods the compiler doesn't know that xs has a map or foreach method because the type parameter F doesn't have any upper or lower bounds. As far as it can tell F is AnyRef and satisfies the context bound (is part of the typeclass). What is in scope that does have foreach or map? MA from scalaz has both foreach and map methods:
trait MA[M[_], A] {
def foreach(f: A => Unit)(implicit e: Each[M]): Unit = e.each(value, f)
def map[B](f: A => B)(implicit t: Functor[M]): M[B] = t.fmap(value, f)
}
Note that the foreach and map methods on MA are constrained by the Each or Functor typeclass. These are the same constraints from the original methods so the constraints are satisfied and an implicit conversion to MA[F, Int] takes place via the maImplicit method:
trait MAsLow extends MABLow {
implicit def maImplicit[M[_], A](a: M[A]): MA[M, A] = new MA[M, A] {
val value = a
}
}
The type F in the original method becomes type M in MA.
The implicit parameter that was passed into the original call is then passed as the implicit parameter into foreach or map. In the case of foreach, each is called on its implicit parameter e. In the example from above the implicit ev was type Each[TraversableOnce] because the original parameter was a List, so e is the same type. foreach calls each on e which delegates to foreach on TraversableOnce.
So the order of calls for printValues(List(1,2,3)) is:
new Each[TraversableOnce] -> printValues -> new MA -> MA.foreach -> Each.each -> TraversableOnce.foreach
As they say, there is no problem that can't be solved with an extra level of indirection :)
You can use MA from scalaz:
import scalaz._
import Scalaz._
def printValues[A, M[_]](ma: MA[M, A])(implicit e: Each[M]) {
ma |>| { println _ }
}
scala> printValues(List(1, 2, 3))
1
2
3
scala> printValues(Some(1))
1