Scala: flatten mixed set of sets (or lists or arrays) - scala

I have a Set which incorporates a combination of strings, and subSets of strings, like so:
val s = Set(brand1-_test, Set(brand-one, brand_one, brandone), brands-two, brandthree1, Set(brand-three2, brand_three2, brandthree2))
How do I flatten this so that I have one flat set of strings? s.flatten doesn't work with the following error:
error: No implicit view available from Object => scala.collection.GenTraversableOnce[B]
Neither does flatMap. What am I missing here? The Set could just as easily incorporate a subLists or subArrays (they are the result of a previous function), if that makes a difference.

s.flatMap { case x:Iterable[_] => x; case y => Seq(y) }

Try putting it in a REPL:
scala> val s = Set("s1", Set("s2", "s3"))
s: scala.collection.immutable.Set[Object] = Set(s1, Set(s2, s3))
since you are providing two types (Set and String) then scala infers a type which covers both (Object in this case, but probably Any or AnyRef in most cases) which is not a collection and therefore cannot be flattened.

Related

Why is a scala Set casted to a Vector instead of a List?

I wonder why a Set[A] is converted to a Vector[A] if I ask for a Seq[A] subclass? To illustrate this take the following example:
val A = Set("one", "two")
val B = Set("one", "two", "three")
def f(one: Seq[String], other : Seq[String]) = {
one.intersect(other) match {
case head :: tail => head
case _ => "unknown"
}
}
f(A.to, B.to)
This function will return "unknown" instead of one. The reason is that A.to will be casted to a Vector[String]. The cons operator (::) is not defined for Vectors but for Lists so the second case is applied and "unknown" is returned. To fix this problem I could use the +: operator which is defined for all Seqs or convert the Set to List (A.to[List]). So my (academic) question is:
Why does A.to returns a Vector. At least according to the scala docs the default implementation of Seq is LinearSeq and the default of this is List. What did I got wrong?
Because it can, you are depending on runtime class implementation details, instead of compile-time type information guarantees. The to or toSeq method is free to return anything that typechecks, it could even generate a random number and chose a concrete class in base of that number, so you may get a List something other times a Vector or whatever. It may even decide in base of the operating system. Of course, I am being pedantic here and hopefully, they do not do that, but my point is, we can't really explain, that is what the implementation does and it may change in the future.
Also, the "default implementation of Seq is a List", applies only in the constructor. And again, they may change that in any moment.
So, if you want a List ask for a List, not for a Seq.

How map work on Options in Scala?

I have this two functions
def pattern(s: String): Option[Pattern] =
try {
Some(Pattern.compile(s))
} catch {
case e: PatternSyntaxException => None
}
and
def mkMatcher(pat: String): Option[String => Boolean] =
pattern(pat) map (p => (s: String) => p.matcher(s).matches)
Map is the higher-order function that applies a given function to each element of a list.
Now I am not getting that how map is working here as per above statement.
Map is the higher-order function that applies a given function to each element of a list.
This is an uncommonly restrictive definition of map.
At any rate, it works because it was defined by someone who did not hold to that.
For example, that someone wrote something akin to
sealed trait Option[+A] {
def map[B](f: A => B): Option[B] = this match {
case Some(value) => Some(f(value))
case None => None
}
}
as part of the standard library. This makes map applicable to Option[A]
It was defined because it makes sense to map many kinds of data structures not just lists.
Mapping is a transformation applied to the elements held by the data structure.
It applies a function to each element.
Option[A] can be thought of as a trivial sequence. It either has zero or one elements. To map it means to apply the function on its element if it has one.
Now it may not make much sense to use this facility all of the time, but there are cases where it is useful.
For example, it is one of a few distinct methods that, when present enable enable For Expressions to operate on a type. Option[A] can be used in for expressions which can be convenient.
For example
val option: Option[Int] = Some(2)
val squared: Option[Int] = for {
n <- option
if n % 2 == 0
} yield n * n
Interestingly, this implies that filter is also defined on Option[A].
If you just have a simple value it may well be clearer to use a less general construct.
Map is working the same way that it does with other collections types like List and Vector. It applies your function to the contents of the collection, potentially changing the type but keeping the collection type the same.
In many cases you can treat an Option just like a collection with either 0 or 1 elements. You can do a lot of the same operations on Option that you can on other collections.
You can modify the value
var opt = Option(1)
opt.map(_ + 3)
opt.map(_ * math.Pi)
opt.filter(_ == 1)
opt.collect({case i if i > 0 => i.toString })
opt.foreach(println)
and you can test the value
opt.contains(3)
opt.forall(_ > 0)
opt.exists(_ > 0)
opt.isEmpty
Using these methods you rarely need to use a match statement to unpick an Option.

Using zipped collections to initialize a case class in scala

I have to generate a collection of object from some collections of primitive types. So I tried the following two methods and both work:
case class Gr (x:Int,y:Int, z:Int)
val x = List(1,2,4,2,5)
val y = Array(1,2,7,4,5)
val z = Seq(1,2,4,8,5)
(x,y,z).zipped.toList.map(a => Gr(a._1,a._2,a._3))
(x,y,z).zipped.map(Gr:(Int,Int,Int) => Gr)
So, which one is better and how does the second one actually work ? And is there a better way ?
The 1st one can be reduced to (x,y,z).zipped.toList.map(Gr.tupled) and 2nd can be reduced to (x,y,z).zipped.map(Gr), which seems to be shorter/clearer to me.
Recall that the argument to map() is, essentially, A => B, so instead of writing ds.map(d => Math.sqrt(d)), which is type Double => Double, we can simply write ds.map(Math.sqrt) because sqrt() is the correct type.
In this case, the Gr constructor is type (A,A,A) => B. The Scala compiler is able to take the output of zipped and match the constructor type, so the constructor can be used as the argument to map().

sequence List of disjunction with tuples

I use sequenceU to turn inside type out while working with disjunctions in scalaz.
for e.g.
val res = List[\/[Errs,MyType]]
doing
res.sequenceU will give \/[Errs,List[MyType]]
Now if I have a val res2 = List[(\/[Errs,MyType], DefModel)] - List containing tuples of disjunctions; what's the right way to convert
res2 to \/[Errs,List[ (Mype,DefModel)]
As noted in the comments, the most straightforward way to write this is probably just with a traverse and map:
def sequence(xs: List[(\/[Errs, MyType], DefModel)]): \/[Errs, List[(MyType, DefModel)]] =
xs.traverseU { case (m, d) => m.map((_, d)) }
It's worth noting, though, that tuples are themselves traversable, so the following is equivalent:
def sequence(xs: List[(\/[Errs, MyType], DefModel)]): \/[Errs, List[(MyType, DefModel)]] =
xs.traverseU(_.swap.sequenceU.map(_.swap))
Note that this would be even simpler if the disjunction were on the right side of the tuple. If you're willing to make that change, you can also more conveniently take advantage of the fact that Traverse instances compose:
def sequence(xs: List[(DefModel, \/[Errs, MyType])]): \/[Errs, List[(DefModel, MyType)]] =
Traverse[List].compose[(DefModel, ?)].sequenceU(xs)
I'm using kind-projector here but you could also write out the type lambda.

Converting a Scala Map to a List

I have a map that I need to map to a different type, and the result needs to be a List. I have two ways (seemingly) to accomplish what I want, since calling map on a map seems to always result in a map. Assuming I have some map that looks like:
val input = Map[String, List[Int]]("rk1" -> List(1,2,3), "rk2" -> List(4,5,6))
I can either do:
val output = input.map{ case(k,v) => (k.getBytes, v) } toList
Or:
val output = input.foldRight(List[Pair[Array[Byte], List[Int]]]()){ (el, res) =>
(el._1.getBytes, el._2) :: res
}
In the first example I convert the type, and then call toList. I assume the runtime is something like O(n*2) and the space required is n*2. In the second example, I convert the type and generate the list in one go. I assume the runtime is O(n) and the space required is n.
My question is, are these essentially identical or does the second conversion cut down on memory/time/etc? Additionally, where can I find information on storage and runtime costs of various scala conversions?
Thanks in advance.
My favorite way to do this kind of things is like this:
input.map { case (k,v) => (k.getBytes, v) }(collection.breakOut): List[(Array[Byte], List[Int])]
With this syntax, you are passing to map the builder it needs to reconstruct the resulting collection. (Actually, not a builder, but a builder factory. Read more about Scala's CanBuildFroms if you are interested.) collection.breakOut can exactly be used when you want to change from one collection type to another while doing a map, flatMap, etc. — the only bad part is that you have to use the full type annotation for it to be effective (here, I used a type ascription after the expression). Then, there's no intermediary collection being built, and the list is constructed while mapping.
Mapping over a view in the first example could cut down on the space requirement for a large map:
val output = input.view.map{ case(k,v) => (k.getBytes, v) } toList