Scala Seq.sliding() violating the docs rationale?

When writing tests for some part of my system I found some weird behavior, which upon closer inspection boils down to the following:
scala> List(0, 1, 2, 3).sliding(2).toList
res36: List[List[Int]] = List(List(0, 1), List(1, 2), List(2, 3))
scala> List(0, 1, 2).sliding(2).toList
res37: List[List[Int]] = List(List(0, 1), List(1, 2))
scala> List(0, 1).sliding(2).toList
res38: List[List[Int]] = List(List(0, 1))
scala> List(0).sliding(2).toList //I mean the result of this line
res39: List[List[Int]] = List(List(0))
To me it seems like List.sliding(), and the sliding() implementations for a number of other types, are violating the guarantees given in the docs:
def sliding(size: Int): Iterator[List[A]]
Groups elements in fixed size blocks by passing a "sliding window"
over them (as opposed to partitioning them, as is done in grouped.)
size: the number of elements per group
returns: An iterator producing lists of size size, except the last and the only element will be truncated if there are fewer
elements than size.
From what I understand, there is a guarantee that all the lists produced by the iterator returned by sliding(2) will be of length 2. I find it hard to believe that this is a bug that got all the way to the current version of Scala, so perhaps there's an explanation for this, or I'm misunderstanding the docs?
I'm using "Scala version 2.10.3 (OpenJDK 64-Bit Server VM, Java 1.7.0_25)."

No, there is no such guarantee, and you pretty much emphasized the doc line that explicitly says so. Here it is again, with a different emphasis:
returns: An iterator producing lists of size size, except the last and
the only element will be truncated if there are fewer elements than
size.
So if you have a list of length n and call .sliding(m), where m > n, the last and only element of the result will have length n.
In the case of:
List(0).sliding(2)
there is only one element (n = 1) and you call sliding(2), i.e. m = 2. Since 2 > 1, the last and only element of the result is truncated to length 1.
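The rule can be checked directly by looking at the sizes of the windows. This is only a minimal sketch; the helper name windowSizes is made up for illustration and is not part of the collections API:

```scala
// Hypothetical helper: the sizes of the windows produced by sliding(m).
def windowSizes[A](xs: List[A], m: Int): List[Int] =
  xs.sliding(m).map(_.size).toList

// n >= m: every window is full.
val fullSizes = windowSizes(List(0, 1, 2, 3), 2) // List(2, 2, 2)

// n < m: the single window is truncated to n elements.
val truncatedSizes = windowSizes(List(0), 2) // List(1)
```

So the "size m" promise only holds when the list has at least m elements.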

Related

Scala: How to find the minimum of more than 2 elements?

Since the Math.min() function only allows for the use of 2 elements, I was wondering if there is maybe another function which can calculate the minimum of more than 2 elements.
Thanks in advance!
If you have multiple elements you can just chain calls to the min method:
Math.min(x, Math.min(y, z))
Since Scala adds a min method to numbers via implicits, you could write the following, which looks much fancier:
x min y min z
If you have a list of values and want to find their minimum:
val someNumbers: List[Int] = ???
val minimum = someNumbers.min
Note that this throws an exception if the list is empty. From Scala 2.13 onwards, there is a minOption method to handle such cases gracefully. For older versions you could use the reduceOption method as a workaround:
someNumbers.reduceOption(_ min _)
someNumbers.reduceOption(Math.min)
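As a rough sketch of how the empty case behaves with reduceOption (the values and names below are illustrative, not from the question):

```scala
// Illustrative inputs.
val noNumbers: List[Int] = Nil
val numbers: List[Int] = List(3, 1, 2)

// reduceOption returns None for an empty list instead of throwing.
val minOfEmpty: Option[Int] = noNumbers.reduceOption(_ min _) // None
val minOfSome: Option[Int] = numbers.reduceOption(_ min _)    // Some(1)

// Fall back to a default of your choosing when there is nothing to compare.
val minOrDefault: Int = minOfEmpty.getOrElse(Int.MaxValue)
```

Whether Int.MaxValue is a sensible default depends on the caller; keeping the Option is often the safer choice.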
Put all the numbers in a collection such as a List and call min on it.
scala> val list = List(2,3,7,1,9,4,5)
list: List[Int] = List(2, 3, 7, 1, 9, 4, 5)
scala> list.min
res0: Int = 1

Split sorted Scala Sequence/Array according to gaps between elements [duplicate]

What is the most elegant way of grouping a list of values into groups based on their neighbor values?
The wider context is that I have a list of lines that need to be grouped into paragraphs. I want to be able to say that if the vertical difference between two lines is lower than a threshold, they are in the same paragraph.
I ended up solving this problem differently, but I'm wondering about the correct solution here.
case class Box(y: Int)
val list = List(Box(y=1), Box(y=2), Box(y=5))
def group(list: List[Box], threshold: Int): List[List[Box]] = ???
val grouped = group(list, 2)
> List(List(Box(y=1), Box(y=2)), List(Box(y=5)))
I have looked at groupBy(), but that can only work with one element at a time. I have also tried an approach that involved pre-computing differences using sliding(), but then it becomes awkward to retrieve the elements from the original collection.
It's a one liner. Generalising types left as an exercise for the reader.
Using ints and absolute difference rather than lines and spacing to avoid clutter.
val zs = List(1,2,4,8,9,10,15,16)
def closeEnough(a:Int, b:Int) = (Math.abs(b -a) <= 2)
zs.drop(1).foldLeft(List(List(zs.head)))((acc, e) =>
  if (closeEnough(e, acc.head.head))
    (e :: acc.head) :: acc.tail // e joins the most recent group
  else
    List(e) :: acc // gap too large: start a new group
).map(_.reverse).reverse
// List(List(1, 2, 4), List(8, 9, 10), List(15, 16))
Or a two liner for a slight efficiency gain
val ys = zs.reverse
ys.drop(1).foldLeft(List(List(ys.head)))((acc, e) =>
  if (closeEnough(e, acc.head.head))
    (e :: acc.head) :: acc.tail
  else
    List(e) :: acc
)
// List(List(1, 2, 4), List(8, 9, 10), List(15, 16))
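One possible way to generalise the types, sketched against the Box example from the question. The helper name groupAdjacent and its signature are assumptions made for illustration; it folds from the right so that no reversing is needed, and it uses a strict "lower than threshold" comparison as the question describes:

```scala
case class Box(y: Int)

// Hypothetical helper: start a new group whenever two neighbours are not "close".
def groupAdjacent[A](xs: List[A])(close: (A, A) => Boolean): List[List[A]] =
  xs.foldRight(List.empty[List[A]]) {
    // e is close to its right neighbour: it joins that neighbour's group.
    case (e, (next :: rest) :: groups) if close(e, next) =>
      (e :: next :: rest) :: groups
    // No group yet, or the gap is too large: open a new group.
    case (e, groups) =>
      List(e) :: groups
  }

def group(list: List[Box], threshold: Int): List[List[Box]] =
  groupAdjacent(list)((a, b) => math.abs(b.y - a.y) < threshold)

val grouped = group(List(Box(1), Box(2), Box(5)), 2)
// List(List(Box(1), Box(2)), List(Box(5)))
```

Passing the closeness test as a function keeps the grouping logic independent of the element type.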

Lists in Scala - plus colon vs double colon (+: vs ::)

I am a little bit confused about the +: and :: operators that are available.
It looks like both of them give the same results.
scala> List(1,2,3)
res0: List[Int] = List(1, 2, 3)
scala> 0 +: res0
res1: List[Int] = List(0, 1, 2, 3)
scala> 0 :: res0
res2: List[Int] = List(0, 1, 2, 3)
To my novice eye, the source code for both methods looks similar (the plus-colon method has an additional condition on generics, using builder factories).
Which one of these methods should be used and when?
+: works with any kind of collection, while :: is a specific implementation for List.
If you look at the source for +: closely, you will notice that it actually calls :: when the expected return type is List. That is because :: is implemented more efficiently for the List case: it simply connects the new head to the existing list and returns the result, which is a constant-time operation, as opposed to copying the entire collection in linear time, as the generic +: does.
+:, on the other hand, takes a CanBuildFrom, so you can do fancy (albeit not very nice-looking in this case) things like:
import scala.collection.breakOut // Scala 2.12 and earlier; breakOut was removed in 2.13
val foo: Array[String] = List("foo").+:("bar")(breakOut)
(It's pretty useless in this particular case, as you could start with the needed type to begin with, but the idea is that you can prepend an element to a collection and change its type in one "go", avoiding an additional copy).
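A small sketch of the practical difference, using made-up values: both operators agree on a List, but only +: is available on other sequences such as Vector.

```scala
val numbers = List(1, 2, 3)
val vector = Vector(1, 2, 3)

val viaCons = 0 :: numbers  // List(0, 1, 2, 3), List only
val viaPlus = 0 +: numbers  // List(0, 1, 2, 3), same result on a List
val onVector = 0 +: vector  // Vector(0, 1, 2, 3)
// 0 :: vector              // does not compile: :: is not a member of Vector

// The :: prepend reuses the original list as its tail instead of copying it.
val sharesTail = viaCons.tail eq numbers
```

For code that is known to work on List, :: is the conventional choice; +: is useful when the code should accept any Seq.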

Seq with maximal elements

I have a Seq and a function Int => Int. What I need to achieve is to take from the original Seq only those elements that would be equal to the maximum of the resulting sequence (the one I'll have after applying the given function):
def mapper:Int=>Int= x=>x*x
val s= Seq( -2,-2,2,2 )
val themax= s.map(mapper).max
s.filter( mapper(_)==themax)
But this seems wasteful, since it has to map twice (once for the filter, once for the maximum).
Is there a better way to do this? (without using a cycle, hopefully)
EDIT
The code has since been edited; in the original this was the filter line: s.filter( mapper(_)==s.map(mapper).max). As om-nom-nom has pointed out, this evaluates s.map(mapper).max on each (filter) iteration, leading to quadratic complexity.
Here is a solution that does the mapping only once, using the foldLeft function:
The principle is to go through the Seq and, for each mapped element: if it is greater than every mapped value seen so far, begin a new sequence with it; if it is equal to the current maximum, append it to the sequence of maximums; if it is less, keep the previously computed sequence of maximums unchanged.
def getMaxElems1(s:Seq[Int])(mapper:Int=>Int):Seq[Int] = s.foldLeft(Seq[(Int,Int)]())((res, elem) => {
val e2 = mapper(elem)
if(res.isEmpty || e2>res.head._2)
Seq((elem,e2))
else if (e2==res.head._2)
res++Seq((elem,e2))
else res
}).map(_._1) // keep only original elements
// test with your list
scala> getMaxElems1(s)(mapper)
res14: Seq[Int] = List(-2, -2, 2, 2)
//test with a list containing also non maximal elements
scala> getMaxElems1(Seq(-1, 2,0, -2, 1,-2))(mapper)
res15: Seq[Int] = List(2, -2, -2)
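One detail of the fold above: assuming the accumulator is a List (which is what Seq() builds by default), res ++ Seq(...) copies the accumulated list on every tie. A hedged variant that prepends in constant time and reverses once at the end could look like this; the name getMaxElemsPrepend and the generic signature are illustrative only:

```scala
// Hypothetical variant of getMaxElems1 that prepends instead of appending.
def getMaxElemsPrepend[A](s: Seq[A])(mapper: A => Int): Seq[A] = {
  // Accumulator: (current maximum of the mapped values, elements reaching it, newest first).
  val (_, maxima) = s.foldLeft((Int.MinValue, List.empty[A])) {
    case ((best, acc), elem) =>
      val mapped = mapper(elem)
      if (acc.isEmpty || mapped > best) (mapped, List(elem)) // new maximum: restart
      else if (mapped == best) (best, elem :: acc)           // tie: constant-time prepend
      else (best, acc)                                       // smaller: ignore
  }
  maxima.reverse // restore the original order
}

val maximal = getMaxElemsPrepend(Seq(-1, 2, 0, -2, 1, -2))(x => x * x)
// List(2, -2, -2)
```

The mapper is still applied exactly once per element.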
Remark: About complexity
The algorithm I present above has a complexity of O(N) for a list with N elements. However:
the operation of mapping all elements is of complexity O(N)
the operation of computing the max is of complexity O(N)
the operation of zipping is of complexity O(N)
the operation of filtering the list according to the max is also of complexity O(N)
the operation of mapping all elements is of complexity O(M), with M the number of final elements
So, in the end, the algorithm you presented in your question has the same complexity as the one in my answer, and your solution is clearer than mine. So, even though foldLeft is more powerful, for this operation I would recommend your idea, but zipping the original list with the mapped values so that the map is computed only once (especially if your mapper is more complicated than a simple square). Here is the solution, worked out with the help of scala_newbie in the question's chat and comments.
def getMaxElems2(s:Seq[Int])(mapper:Int=>Int):Seq[Int] = {
val mappedS = s.map(mapper) //map done only once
val m = mappedS.max // find the max
s.zip(mappedS).filter(_._2==m).unzip._1 // compare against the local max m
}
// test with your list
scala> getMaxElems2(s)(mapper)
res16: Seq[Int] = List(-2, -2, 2, 2)
//test with a list containing also non maximal elements
scala> getMaxElems2(Seq(-1, 2,0, -2, 1,-2))(mapper)
res17: Seq[Int] = List(2, -2, -2)

Inconsistent behaviour for xs.sliding(n) if n is less than size?

According to scaladoc, sliding() returns...
"An iterator producing iterable collections of size size, except the last and the only element will be truncated if there are fewer elements than size."
To me, intuitively, sliding(n) would return a sliding window of n elements if available. With the current implementation, I need to perform an extra check to make sure I don't get a list of 1 or 2 elements.
scala> val xs = List(1, 2)
xs: List[Int] = List(1, 2)
scala> xs.sliding(3).toList
res2: List[List[Int]] = List(List(1, 2))
I expected here an empty list instead. Why is sliding() implemented this way instead?
It was a mistake, but wasn't fixed as of 2.9. Everyone occasionally makes design errors, and once one gets into the library it's a nontrivial task to remove it.
Workaround: add a filter.
xs.sliding(3).filter(_.size==3).toList
You can "work around" this by using the GroupedIterator#withPartial modifier.
scala> val xs = List(1, 2)
xs: List[Int] = List(1, 2)
scala> xs.iterator.sliding(3).withPartial(false).toList
res7: List[Seq[Int]] = List()
(I don't know why you need to say xs.iterator, but xs.sliding(3).withPartial(false) does not work because you get an Iterator instead of a GroupedIterator.)
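If this is needed in several places, the withPartial call can be wrapped once. A minimal sketch, where the helper name fullWindows is an assumption for illustration:

```scala
// Hypothetical helper: only the windows that contain exactly n elements.
def fullWindows[A](xs: Seq[A], n: Int): List[Seq[A]] =
  xs.iterator.sliding(n).withPartial(false).toList

val tooShort = fullWindows(List(1, 2), 3)     // List()
val enough = fullWindows(List(1, 2, 3, 4), 3) // two windows: (1, 2, 3) and (2, 3, 4)
```

Going through xs.iterator is what gives access to the GroupedIterator and therefore to withPartial.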
EDIT:
Check Rex's answer (which is the correct one). I'm leaving this just because (as Rex said in the comments) it was the original (wrong) idea behind that design decision.
I don't know why you would expect an empty list there; returning the full list seems like the best result. Consider this example:
def slidingWindowsThing(windows: Iterator[Seq[Int]]): Unit = { /* do your thing */ } // sliding returns an Iterator
For this method you probably want all these calls to work:
slidingWindowsThing((1 to 10).sliding(3))
slidingWindowsThing((1 to 3).sliding(3))
slidingWindowsThing((1 to 1).sliding(3))
This is why the method defaults to a list of size list.length instead of Nil (empty list).