Failure parsing views - Scala

I define the following diff function on Seq[Int] which uses view to avoid copying data:
object viewDiff {
  def main(args: Array[String]) {
    val values = 1 to 10
    println("diff=" + diffInt(values).toList)
  }

  def diffInt(seq: Seq[Int]): Seq[Int] = {
    val v1 = seq.view(0, seq.size - 1)
    val v2 = seq.view(1, seq.size)
    (v2, v1).zipped.map(_ - _)
  }
}
This code fails with an UnsupportedOperationException. If I use slice instead of view it works.
Can anyone explain this?
[tested with Scala 2.10.5 and 2.11.6]
Edit
I selected Carlos's answer because it was the (first) correct explanation of the problem. However, som-snytt's answer is more detailed, and provides a simple solution using view on the zipped object.
I also posted a very simple solution that works for this specific case.
Note
In the code above, I also made a mistake in the algorithm used to compute a seq derivative. The last line should be: seq.head +: (v2, v1).zipped.map(_ - _)
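For completeness, the fixed function written with slice (which, as noted above, works where view fails) would look like this sketch:
def diffInt(seq: Seq[Int]): Seq[Int] = {
  val v1 = seq.slice(0, seq.size - 1)
  val v2 = seq.slice(1, seq.size)
  seq.head +: (v2, v1).zipped.map(_ - _)
}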

When you use seq.view in your code, you are creating SeqView[Int, Seq[Int]] objects, and these cannot be mapped through zipped because their builder does not support TraversableView.Builder.result. But you can use something like this:
def diffInt(seq: Seq[Int]) = {
  val v1 = seq.view(0, seq.size - 1)
  val v2 = seq.view(1, seq.size)
  (v2.toList, v1.toList).zipped.map {
    case (x1: Int, y1: Int) => x1 - y1
    case _ => 0
  }
}

That looks strange indeed, and zipped seems to be the culprit. What you can do instead, as a minimal change, is to use zip:
def diffInt(seq: Seq[Int]): Seq[Int] = {
  val v1 = seq.view(0, seq.size - 1)
  val v2 = seq.view(1, seq.size)
  v2.zip(v1).map { case (x1, x2) => x1 - x2 }
}

Normally, you don't build views when mapping them, since you want to defer building the result collection until you force the view.
Since Tuple2Zipped is not a view, its map tries to build a result of the same type as its first tupled collection, which here is a view.
SeqView's CanBuildFrom yields the NoBuilder that refuses to be forced.
Since the point of using Tuple2Zipped is to avoid intermediate collections, you also want to avoid forcing prematurely, so take a view before mapping:
scala> Seq(1,2,3).view(1,3)
res0: scala.collection.SeqView[Int,Seq[Int]] = SeqViewS(...)
scala> Seq(1,2,3).view(0,2)
res1: scala.collection.SeqView[Int,Seq[Int]] = SeqViewS(...)
scala> (res0, res1).zipped
res2: scala.runtime.Tuple2Zipped[Int,scala.collection.SeqView[Int,Seq[Int]],Int,scala.collection.SeqView[Int,Seq[Int]]] = (SeqViewS(...), SeqViewS(...)).zipped
scala> res2.view map { case (i: Int, j: Int) => i - j }
res3: scala.collection.TraversableView[Int,Traversable[_]] = TraversableViewM(...)
scala> .force
res4: Traversable[Int] = List(1, 1)
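Applied back to the view-based diffInt from the question, that approach looks like this (a sketch following the REPL session above; the final toSeq is only there to match the declared Seq[Int] return type):
def diffInt(seq: Seq[Int]): Seq[Int] = {
  val v1 = seq.view(0, seq.size - 1)
  val v2 = seq.view(1, seq.size)
  (v2, v1).zipped.view.map { case (a, b) => a - b }.force.toSeq
}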
Here's a look at the mechanism:
import collection.generic.CanBuildFrom
import collection.SeqView
import collection.mutable.ListBuffer
import language._
object Test extends App {
  implicit val cbf = new CanBuildFrom[SeqView[Int, Seq[Int]], Int, Seq[Int]] {
    def apply(): scala.collection.mutable.Builder[Int, Seq[Int]] = ListBuffer.empty[Int]
    def apply(from: scala.collection.SeqView[Int, Seq[Int]]): scala.collection.mutable.Builder[Int, Seq[Int]] = apply()
  }

  //val res = (6 to 10 view, 1 to 5 view).zipped.map[Int, List[Int]](_ - _)
  val res = (6 to 10 view, 1 to 5 view).zipped.map(_ - _)
  Console println res
}

Ah, those good old times of imperative programming:
val seq = 1 to 10
val i1 = seq.iterator
val i2 = seq.iterator.drop(1)
val i = scala.collection.mutable.ArrayBuffer.empty[Int]
while (i1.hasNext && i2.hasNext) i += i2.next - i1.next
println(i)
I'd say it's as efficient as it gets (no copying and no excessive allocations), and pretty readable.

As Carlos Vilchez wrote, zipped cannot work with views here. Looks like a bug to me...
But this only happens if the first zipped seq is a view. Since zipped stops as soon as either of its seqs is exhausted, it is possible to use the whole input seq as the first zipped item and invert the - operation:
def diffInt2(seq: Seq[Int]): Seq[Int] = {
  val v1 = seq // .view(0, seq.size - 1)
  val v2 = seq.view(1, seq.size)
  seq.head +: (v1, v2).zipped.map((a, b) => b - a) // v1 and v2 swapped, so the subtraction is reversed
}
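For example, diffInt2(1 to 10) yields the head 1 followed by nine pairwise differences of 1, i.e. ten 1s in total.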

Related

Scala yield returning Try[Either[]] rather than Either

I am trying to get some hands-on practice with basic Scala operations and got stuck on the following sample code
def insuranceRateQuote(a: Int, tickets: Int): Either[Exception, Double] = {
  // ... something
  Right(Double)
}

def parseInsuranceQuoteFromWebForm(age: String, numOfTickets: String): Either[Exception, Double] = {
  try {
    val a = Try(age.toInt)
    val tickets = Try(numOfTickets.toInt)
    for {
      aa <- a
      t <- tickets
    } yield insuranceRateQuote(aa, t) // ERROR HERE
  } catch {
    case _ => Left(new Exception)
  }
}
The error I am getting says: found Try[Either[Exception,Double]].
I am not getting why the result is wrapped in a Try of Either.
PS - This is probably not the perfect way to do this in Scala, so feel free to post your own sample code :)
The key thing to understand is that a for-comprehension may transform what is inside the wrapper, but it will not change the wrapper itself. The reason is that for-comprehensions desugar to map/flatMap calls on the wrapper determined by the first step of the chain. For example, consider the following snippet
val result: Try[Int] = Try(41).map(v => v + 1)
// result: scala.util.Try[Int] = Success(42)
Note how we transformed the value inside the Try wrapper from 41 to 42, yet the wrapper remained unchanged. Alternatively, we could express the same thing using a for-comprehension
val result: Try[Int] = for { v <- Try(41) } yield v + 1
// result: scala.util.Try[Int] = Success(42)
Note how the effect is exactly the same. Now consider the following for-comprehension, which chains multiple steps
val result: Try[Int] =
  for {
    a <- Try(41) // first step determines the wrapper for all the other steps
    b <- Try(1)
  } yield a + b
// result: scala.util.Try[Int] = Success(42)
This expands to
val result: Try[Int] =
  Try(41).flatMap { (a: Int) =>
    Try(1).map { (b: Int) => a + b }
  }
// result: scala.util.Try[Int] = Success(42)
where again we see the result is the same: the value inside the wrapper is transformed, but the wrapper itself remains untouched.
Finally consider
val result: Try[Either[Exception, Int]] =
  for {
    a <- Try(41) // first step still determines the top-level wrapper
    b <- Try(1)
  } yield Right(a + b) // here we wrap inside `Either`
// result: scala.util.Try[Either[Exception,Int]] = Success(Right(42))
The principle remains the same: we wrapped a + b inside Either, but this does not affect the top-level outer wrapper, which is still Try.
Mario Galic's answer already explains the problem with your code, but I'd fix it differently.
Two points:
Either[Exception, A] (or rather, Either[Throwable, A]) is kind of equivalent to Try[A], with Left taking the role of Failure and Right the role of Success.
The outer try/catch is not useful because the exceptions should be captured by working in Try.
So you probably want something like
def insuranceRateQuote(a: Int, tickets: Int): Try[Double] = {
  // ... something
  Success(someDouble)
}

def parseInsuranceQuoteFromWebForm(age: String, numOfTickets: String): Try[Double] = {
  val a = Try(age.toInt)
  val tickets = Try(numOfTickets.toInt)
  for {
    aa <- a
    t <- tickets
    q <- insuranceRateQuote(aa, t)
  } yield q
}
Somewhat unfortunately, this does a useless map(q => q) if you trace what the comprehension desugars to, so you can write it more directly as
a.flatMap(aa => tickets.flatMap(t => insuranceRateQuote(aa,t)))
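If you would rather keep the Either shape from the original question, one option (a sketch assuming Scala 2.12+, where Try has a toEither method; note the Left side widens to Throwable) is to do the work in Try and convert at the end:
import scala.util.Try

def parseInsuranceQuoteFromWebForm(age: String, numOfTickets: String): Either[Throwable, Double] =
  (for {
    aa <- Try(age.toInt)
    t <- Try(numOfTickets.toInt)
    q <- insuranceRateQuote(aa, t) // the Try[Double] version defined above
  } yield q).toEither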

Scala method to side effect on map and return it

What is the best way to apply a function to each element of a Map and at the end return the same Map, unchanged, so that it can be used in further operations?
I'd like to avoid:
myMap.map(el => {
  effectfullFn(el)
  el
})
to achieve syntax like this:
myMap
  .mapEffectOnKV(effectfullFn)
  .foreach(println)
map is not what I'm looking for, because I have to specify what comes out of the map (as in the first code snippet), and I don't want to do that.
I want a special operation that knows/assumes that the map elements should be returned without change after the side-effect function has been executed.
In fact, this would be so useful to me, I'd like to have it for Map, Array, List, Seq, Iterable... The general idea is to peek at the elements to do something, then automatically return these elements.
The real case I'm working on looks like this:
calculateStatistics(trainingData, indexMapLoaders)
  .superMap { (featureShardId, shardStats) =>
    val outputDir = summarizationOutputDir + "/" + featureShardId
    val indexMap = indexMapLoaders(featureShardId).indexMapForDriver()
    IOUtils.writeBasicStatistics(sc, shardStats, outputDir, indexMap)
  }
Once I have calculated the statistics for each shard, I want to append the side effect of saving them to disk and then just return those statistics, without having to create a val and make that val the last statement in the function, e.g.:
val stats = calculateStatistics(trainingData, indexMapLoaders)
stats.foreach { (featureShardId, shardStats) =>
  val outputDir = summarizationOutputDir + "/" + featureShardId
  val indexMap = indexMapLoaders(featureShardId).indexMapForDriver()
  IOUtils.writeBasicStatistics(sc, shardStats, outputDir, indexMap)
}
stats
It's probably not very hard to implement, but I was wondering if there was something in Scala already for that.
A function cannot be effectful by definition, so I wouldn't expect anything convenient in the standard library. However, you can write a wrapper:
def tap[T](effect: T => Unit)(x: T) = {
  effect(x)
  x
}
Example:
scala> Map(1 -> 1, 2 -> 2)
.map(tap(el => el._1 + 5 -> el._2))
.foreach(println)
(1,1)
(2,2)
You can also define an implicit:
implicit class TapMap[K, V](m: Map[K, V]) {
  def tap(effect: ((K, V)) => Unit): Map[K, V] = m.map { x =>
    effect(x)
    x
  }
}
Examples:
scala> Map(1 -> 1, 2 -> 2).tap(el => el._1 + 5 -> el._2).foreach(println)
(1,1)
(2,2)
To abstract more, you can define this implicit on TraversableOnce, so it would be applicable to List, Set and so on if you need it:
implicit class TapTraversable[Coll[_], T](m: Coll[T])(implicit ev: Coll[T] <:< TraversableOnce[T]) {
  def tap(effect: T => Unit): Coll[T] = {
    ev(m).foreach(effect)
    m
  }
}
scala> List(1,2,3).tap(println).map(_ + 1)
1
2
3
res24: List[Int] = List(2, 3, 4)
scala> Map(1 -> 1).tap(println).toMap //`toMap` is needed here for same reasons as it needed when you do `.map(f).toMap`
(1,1)
res5: scala.collection.immutable.Map[Int,Int] = Map(1 -> 1)
scala> Set(1).tap(println)
1
res6: scala.collection.immutable.Set[Int] = Set(1)
It's more useful, but requires some "mumbo-jumbo" with types, as Coll[_] <: TraversableOnce[_] doesn't work (Scala 2.12.1), so I had to use evidence for that.
You can also try CanBuildFrom approach: How to enrich a TraversableOnce with my own generic map?
The overall recommendation for dealing with pass-through side effects on iterators is to use streams (scalaz/fs2/monix) and Task; they've got an observe function (or some analogue of it) that does what you want in an async (if needed) way.
My answer from before you provided an example of what you want
You can represent an effectful computation without side effects and have distinct values that represent the state before and after:
scala> val withoutSideEffect = Map(1 -> 1, 2 -> 2)
withoutSideEffect: scala.collection.immutable.Map[Int,Int] = Map(1 -> 1, 2 -> 2)
scala> val withSideEffect = withoutSideEffect.map(el => el._1 + 5 -> (el._2 + 5))
withSideEffect: scala.collection.immutable.Map[Int,Int] = Map(6 -> 6, 7 -> 7)
scala> withoutSideEffect //unchanged
res0: scala.collection.immutable.Map[Int,Int] = Map(1 -> 1, 2 -> 2)
scala> withSideEffect //changed
res1: scala.collection.immutable.Map[Int,Int] = Map(6 -> 6, 7 -> 7)
Looks like the concept you're after is similar to the Unix tee
utility--take an input and direct it to two different outputs. (tee
gets its name from the shape of the letter 'T', which looks like a
pipeline from left to right with another line branching off downwards.)
Here's the Scala version:
package object mypackage {
  implicit class Tee[A](a: A) extends AnyVal {
    def tee(f: A => Unit): A = { f(a); a }
  }
}
With that, we can do:
calculateStatistics(trainingData, indexMapLoaders) tee { stats =>
  stats foreach { case (featureShardId, shardStats) =>
    val outputDir = summarizationOutputDir + "/" + featureShardId
    val indexMap = indexMapLoaders(featureShardId).indexMapForDriver()
    IOUtils.writeBasicStatistics(sc, shardStats, outputDir, indexMap)
  }
}
Note that as defined, Tee is very generic--it can do an effectful
operation on any value and then return the original passed-in value.
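For instance, a trivial usage sketch (assuming the implicit class above is in scope, e.g. via import mypackage._):
val x = 42 tee println // prints 42, and x is still 42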
Call foreach on your Map with your effectful function. Your original Map will not be changed, as Maps in Scala are immutable.
val myMap = Map(1 -> 1)
myMap.foreach(effectfullFn)
If you are trying to chain this operation, you can use map
myMap.map(el => {
  effectfullFn(el)
  el
})

Getting a HashMap from Scala's HashMap.mapValues?

The example below is a self-contained example I've extracted from my larger app.
Is there a better way to get a HashMap after calling mapValues below? I'm new to Scala, so it's very likely that I'm going about this all wrong, in which case feel free to suggest a completely different approach. (An apparently obvious solution would be to move the logic in the mapValues to inside the accum but that would be tricky in the larger app.)
#!/bin/sh
exec scala "$0" "$@"
!#
import scala.collection.immutable.HashMap

case class Quantity(name: String, amount: Double)

class PercentsUsage {
  type PercentsOfTotal = HashMap[String, Double]

  var quantities = List[Quantity]()

  def total: Double = (quantities map { t => t.amount }).sum

  def addQuantity(qty: Quantity) = {
    quantities = qty :: quantities
  }

  def percentages: PercentsOfTotal = {
    def accum(m: PercentsOfTotal, qty: Quantity) = {
      m + (qty.name -> (qty.amount + (m getOrElse (qty.name, 0.0))))
    }
    val emptyMap = new PercentsOfTotal()
    // The `emptyMap ++` at the beginning feels clumsy, but it does the
    // job of giving me a PercentsOfTotal as the result of the method.
    emptyMap ++ (quantities.foldLeft(emptyMap)(accum(_, _)) mapValues (dollars => dollars / total))
  }
}
val pu = new PercentsUsage()
pu.addQuantity(new Quantity("A", 100))
pu.addQuantity(new Quantity("B", 400))
val pot = pu.percentages
println(pot("A")) // prints 0.2
println(pot("B")) // prints 0.8
Rather than building up your HashMap by hand with a fold, you can just use the Scala collections' built-in groupBy function. This creates a map from the grouping property to a list of the values in that group, which can then be aggregated, e.g. by taking a sum:
def percentages: Map[String, Double] = {
  val t = total
  quantities.groupBy(_.name).mapValues(_.map(_.amount).sum / t)
}
This pipeline transforms your List[Quantity] => Map[String, List[Quantity]] => Map[String, Double], giving you the desired result.
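To make the intermediate step concrete, here is a small sketch using the Quantity case class and the two quantities from the question, with the total of 500 written inline (Map printing order may vary):
val quantities = List(Quantity("A", 100), Quantity("B", 400))
quantities.groupBy(_.name)
// Map(A -> List(Quantity(A,100.0)), B -> List(Quantity(B,400.0)))
quantities.groupBy(_.name).mapValues(_.map(_.amount).sum / 500.0)
// Map(A -> 0.2, B -> 0.8)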

What is the correct way to get a subarray in Scala?

I am trying to get a subarray in Scala, and I am a little confused about what the proper way of doing it is. What I would like most is something like how you can do it in Python:
x = [3, 2, 1]
x[0:2]
but I am fairly certain you cannot do this.
The most obvious way to do it would be to use the Java Arrays util library.
import java.util.Arrays
val start = Array(1, 2, 3)
Arrays.copyOfRange(start, 0, 2)
But it always makes me feel a little dirty to use Java libraries in Scala. The most "scalaic" way I found to do it would be
def main(args: Array[String]) {
  val start = Array(1, 2, 3)
  arrayCopy(start, 0, 2)
}

def arrayCopy[A](arr: Array[A], start: Int, end: Int)(implicit manifest: Manifest[A]): Array[A] = {
  val ret = new Array[A](end - start)
  Array.copy(arr, start, ret, 0, end - start)
  ret
}
but is there a better way?
You can call the slice method:
scala> Array("foo", "hoo", "goo", "ioo", "joo").slice(1, 4)
res6: Array[java.lang.String] = Array(hoo, goo, ioo)
It works like in Python.
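Applied to the example from the question:
val x = Array(3, 2, 1)
x.slice(0, 2) // Array(3, 2)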
Imagine you have an array with elements from a to f
scala> val array = ('a' to 'f').toArray // Array('a','b','c','d','e','f')
Then you can extract a sub-array from it in different ways:
Drop the first n elements with drop(n: Int)
array.drop(2) // Array('c','d','e','f')
Take the first n elements with take(n: Int)
array.take(4) // Array('a','b','c','d')
Select any interval of elements with slice(from: Int, until: Int). Note that until is excluded.
array.slice(2,4) // Array('c','d')
The slice method is strictly equivalent to:
array.take(4).drop(2) // Array('c','d')
Exclude the last n elements with dropRight(n: Int):
array.dropRight(4) // Array('a','b')
Select the last n elements with takeRight(n: Int):
array.takeRight(4) // Array('c','d','e','f')
Reference: Official documentation
An example of extracting specific columns from a 2D Scala Array (original_array):
import scala.collection.mutable.ArrayBuffer

val sub_array = ArrayBuffer[Array[String]]()
val columns_subset: Seq[String] = Seq("ColumnA", "ColumnB", "ColumnC")
val columns_original = original_array(0)

for (column_now <- columns_subset) {
  sub_array += original_array.map { _(columns_original.indexOf(column_now)) }
}
sub_array

Creating a repeating true/false List in scala

I want to generate a Seq/List of true/false values which I can zip with some input in order to do the equivalent of checking whether a for loop index is odd/even.
Is there a better way than
input.zip((1 to n).map(_ % 2 == 0))
or
input.zip(List.tabulate(n)(_ % 2 != 0))
I would have thought something like (true, false).repeat(n/2) would be more obvious.
Using #DaveGriffith's idea:
input.zip(Stream.iterate(false)(!_))
Or, if you use this pattern in several places:
def falseTrueStream = Stream.iterate(false)(!_)
input.zip(falseTrueStream)
This has the distinct advantage of not needing to specify the size of the false-true list.
Edit:
Of course, def falseTrueStream creates the stream of true/false objects every time you use it, and as #DanielCSobral mentioned, making it a val will cause the objects to be held in memory (until the program ends if the val is on an object).
If you're slightly evil and want to prematurely optimize it, you can build the Stream objects yourself.
object TrueFalseStream extends Stream[Boolean] {
  val tailDefined = true
  override val isEmpty = false
  override val head = true
  override val tail = FalseTrueStream
}

object FalseTrueStream extends Stream[Boolean] {
  val tailDefined = true
  override val isEmpty = false
  override val head = false
  override val tail = TrueFalseStream
}
If you want a list of alternating true/false of size n:
List.iterate(false, n)(!_)
So then you could do:
val input = List("a", "b", "c", "d")
input.zip(List.iterate(false, input.length)(!_))
//List[(java.lang.String, Boolean)] = List((a,false), (b,true), (c,false), (d,true))
There's a very useful function in Haskell - cycle - which is useful for such purposes:
haskell> zip [1..7] $ cycle [True, False]
[(1,True),(2,False),(3,True),(4,False),(5,True),(6,False),(7,True)]
For some reason, the Scala standard library doesn't have it. You can define it on your own and then use it.
scala> def cycle[A](s: Stream[A]): Stream[A] = Stream.continually(s).flatten
cycle: [A](s: Stream[A])Stream[A]
scala> (1 to 7) zip cycle(Stream(true, false))
res13: scala.collection.immutable.IndexedSeq[(Int, Boolean)] = Vector((1,true), (2,false), (3,true), (4,false), (5,true), (6,false), (7,true))
You want
input.indices.map(_%2==0)
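That is, combined with the zip from the question (a small sketch):
val input = List("a", "b", "c", "d")
input.zip(input.indices.map(_ % 2 == 0))
// List((a,true), (b,false), (c,true), (d,false))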
I couldn't come up with anything simpler (and this is far from simple):
(for(_ <- 1 to n/2) yield List(true, false)).flatten
and:
(1 to n/2).foldLeft(List[Boolean]()) {(cur,_) => List(true, false) ++ cur}
Watch for odd n!
However, based on your requirements, it looks like you might want something lazy:
def oddEven(init: Boolean): Stream[Boolean] = Stream.cons(init, oddEven(!init))
...and it never ends (try: oddEven(true) foreach println). Now you can take as much as you want:
oddEven(true).take(10).toList
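For reference, oddEven(true).take(10).toList gives List(true, false, true, false, true, false, true, false, true, false).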
...in order to do the equivalent of checking whether a for loop index is odd/even.
I'm ignoring your specific request, and addressing your main concern in a different way.
You can make your own control function, like so:
def for2[A, B](xs: List[A])(f: A => Unit, g: A => Unit): Unit = xs match {
  case (y :: ys) => {
    f(y)
    for2(ys)(g, f)
  }
  case _ => ()
}
Testing
> for2(List(0,1,2,3,4,5))((x) => println("E: " + x), (x) => println("O: " + x))
E: 0
O: 1
E: 2
O: 3
E: 4
O: 5