Efficient way to check if a traversable has more than 1 element in Scala - scala

I need to check if a Traversable (which I already know to be nonEmpty) has a single element or more.
I could use size, but (tell me if I'm wrong) I suspect that this could be O(n), and traverse the collection to compute it.
I could check if tail.nonEmpty, or if .head != .last
Which are the pros and cons of the two approaches? Is there a better way? (for example, will .last do a full iteration as well?)

All approaches that cut elements from beginning of the collection and return tail are inefficient. For example tail for List is O(1), while tail for Array is O(N). Same with drop.
I propose using take:
list.take(2).size == 1 // list is singleton
take is declared to return whole collection if collection length is less that take's argument. Thus there will be no error if collection is empty or has only one element. On the other hand if collection is huge take will run in O(1) time nevertheless. Internally take will start iterating your collection, take two steps and break, putting elements in new collection to return.
UPD: I changed condition to exactly match the question

Not all will be the same, but let's take a worst case scenario where it's a List. last will consume the entire List just to access that element, as will size.
tail.nonEmpty is obtained from a head :: tail pattern match, which doesn't need to consume the entire List. If you already know the list to be non-empty, this should be the obvious choice.
But not all tail operations take constant time like a List: Scala Collections Performance

You can take a view of a traversable. You can slice the TraversableView lazily.
The initial star is because the REPL prints some output.
scala> val t: Traversable[Int] = Stream continually { println("*"); 42 }
*
t: Traversable[Int] = Stream(42, ?)
scala> t.view.slice(0,2).size
*
res1: Int = 2
scala> val t: Traversable[Int] = Stream.fill(1) { println("*"); 42 }
*
t: Traversable[Int] = Stream(42, ?)
scala> t.view.slice(0,2).size
res2: Int = 1
The advantage is that there is no intermediate collection.
scala> val t: Traversable[_] = Map((1 to 10) map ((_, "x")): _*)
t: Traversable[_] = Map(5 -> x, 10 -> x, 1 -> x, 6 -> x, 9 -> x, 2 -> x, 7 -> x, 3 -> x, 8 -> x, 4 -> x)
scala> t.take(2)
res3: Traversable[Any] = Map(5 -> x, 10 -> x)
That returns an unoptimized Map, for instance:
scala> res3.getClass
res4: Class[_ <: Traversable[Any]] = class scala.collection.immutable.HashMap$HashTrieMap
scala> Map(1->"x",2->"x").getClass
res5: Class[_ <: scala.collection.immutable.Map[Int,String]] = class scala.collection.immutable.Map$Map2

What about pattern matching?
itrbl match { case _::Nil => "one"; case _=>"more" }

Related

andThen in List scala

Has anyone got an example of how to use andThen with Lists? I notice that andThen is defined for List but the documentations hasn't got an example to show how to use it.
My understanding is that f andThen g means that execute function f and then execute function g. The input of function g is output of function f. Is this correct?
Question 1 - I have written the following code but I do not see why I should use andThen because I can achieve the same result with map.
scala> val l = List(1,2,3,4,5)
l: List[Int] = List(1, 2, 3, 4, 5)
//simple function that increments value of element of list
scala> def f(l:List[Int]):List[Int] = {l.map(x=>x-1)}
f: (l: List[Int])List[Int]
//function which decrements value of elements of list
scala> def g(l:List[Int]):List[Int] = {l.map(x=>x+1)}
g: (l: List[Int])List[Int]
scala> val p = f _ andThen g _
p: List[Int] => List[Int] = <function1>
//printing original list
scala> l
res75: List[Int] = List(1, 2, 3, 4, 5)
//p works as expected.
scala> p(l)
res74: List[Int] = List(1, 2, 3, 4, 5)
//but I can achieve the same with two maps. What is the point of andThen?
scala> l.map(x=>x+1).map(x=>x-1)
res76: List[Int] = List(1, 2, 3, 4, 5)
Could someone share practical examples where andThen is more useful than methods like filter, map etc. One use I could see above is that with andThen, I could create a new function,p, which is a combination of other functions. But this use brings out usefulness of andThen, not List and andThen
andThen is inherited from PartialFunction a few parents up the inheritance tree for List. You use List as a PartialFunction when you access its elements by index. That is, you can think of a List as a function from an index (from zero) to the element that occupies that index within the list itself.
If we have a list:
val list = List(1, 2, 3, 4)
We can call list like a function (because it is one):
scala> list(0)
res5: Int = 1
andThen allows us to compose one PartialFunction with another. For example, perhaps I want to create a List where I can access its elements by index, and then multiply the element by 2.
val list2 = list.andThen(_ * 2)
scala> list2(0)
res7: Int = 2
scala> list2(1)
res8: Int = 4
This is essentially the same as using map on the list, except the computation is lazy. Of course, you could accomplish the same thing with a view, but there might be some generic case where you'd want to treat the List as just a PartialFunction, instead (I can't think of any off the top of my head).
In your code, you aren't actually using andThen on the List itself. Rather, you're using it for functions that you're passing to map, etc. There is no difference in the results between mapping a List twice over f and g and mapping once over f andThen g. However, using the composition is preferred when mapping multiple times becomes expensive. In the case of Lists, traversing multiple times can become a tad computationally expensive when the list is large.
With the solution l.map(x=>x+1).map(x=>x-1) you are traversing the list twice.
When composing 2 functions using the andThen combinator and then applying it to the list, you only traverse the list once.
val h = ((x:Int) => x+1).andThen((x:Int) => x-1)
l.map(h) //traverses it only once

Checking sameness/equality in Scala

As I asked in other post (Unique id for Scala object), it doesn't seem like that I can have id just like Python.
I still need to check the sameness in Scala for unittest. I run a test and compare the returned value of some nested collection object (i.e., List[Map[Int, ...]]) with the one that I create.
However, the hashCode for mutable map is the same as that of immutable map. As a result (x == y) returns True.
scala> val x = Map("a" -> 10, "b" -> 20)
x: scala.collection.immutable.Map[String,Int] = Map(a -> 10, b -> 20)
scala> x.hashCode
res0: Int = -1001662700
scala> val y = collection.mutable.Map("b" -> 20, "a" -> 10)
y: scala.collection.mutable.Map[String,Int] = Map(b -> 20, a -> 10)
scala> y.hashCode
res2: Int = -1001662700
In some cases, it's OK, but in other cases, I may need to make it failed test. So, here comes my question.
Q1: What is the normally used method for comparing two values (including very complicated data types) are the same? I may compare the toString() results, but I don't think this is a good idea.
Q2: Is it a general rule that mutable data structure has the same hashCode with immutable counterpart?
You are looking for AnyRef.eq which does reference equality (which is as close as you can get to Python's id function and is identical if you just want to compare references and you don't care about the actual ID):
scala> x == y
true
scala> x eq y
false

How can I functionally iterate over a collection combining elements?

I have a sequence of values of type A that I want to transform to a sequence of type B.
Some of the elements with type A can be converted to a B, however some other elements need to be combined with the immediately previous element to produce a B.
I see it as a small state machine with two states, the first one handling the transformation from A to B when just the current A is needed, or saving A if the next row is needed and going to the second state; the second state combining the saved A with the new A to produce a B and then go back to state 1.
I'm trying to use scalaz's Iteratees but I fear I'm overcomplicating it, and I'm forced to return a dummy B when the input has reached EOF.
What's the most elegant solution to do it?
What about invoking the sliding() method on your sequence?
You might have to put a dummy element at the head of the sequence so that the first element (the real head) is evaluated/converted correctly.
If you map() over the result from sliding(2) then map will "see" every element with its predecessor.
val input: Seq[A] = ??? // real data here (no dummy values)
val output: Seq[B] = (dummy +: input).sliding(2).flatMap(a2b).toSeq
def a2b( arg: Seq[A] ): Seq[B] = {
// arg holds 2 elements
// return a Seq() of zero or more elements
}
Taking a stab at it:
Partition your list into two lists. The first is the one you can directly convert and the second is the one that you need to merge.
scala> val l = List("String", 1, 4, "Hello")
l: List[Any] = List(String, 1, 4, Hello)
scala> val (string, int) = l partition { case s:String => true case _ => false}
string: List[Any] = List(String, Hello)
int: List[Any] = List(1, 4)
Replace the logic in the partition block with whatever you need.
After you have the two lists, you can do whatever you need to with your second using something like this
scala> string ::: int.collect{case i:Integer => i}.sliding(2).collect{
| case List(a, b) => a+b.toString}.toList
res4: List[Any] = List(String, Hello, 14)
You would replace the addition with whatever your aggregate function is.
Hopefully this is helpful.

Scala - increasing prefix of a sequence

I was wondering what is the most elegant way of getting the increasing prefix of a given sequence. My idea is as follows, but it is not purely functional or any elegant:
val sequence = Seq(1,2,3,1,2,3,4,5,6)
var currentElement = sequence.head - 1
val increasingPrefix = sequence.takeWhile(e =>
if (e > currentElement) {
currentElement = e
true
} else
false)
The result of the above is:
List(1,2,3)
You can take your solution, #Samlik, and effectively zip in the currentElement variable, but then map it out when you're done with it.
sequence.take(1) ++ sequence.zip(sequence.drop(1)).
takeWhile({case (a, b) => a < b}).map({case (a, b) => b})
Also works with infinite sequences:
val sequence = Seq(1, 2, 3).toStream ++ Stream.from(1)
sequence is now an infinite Stream, but we can peek at the first 10 items:
scala> sequence.take(10).toList
res: List[Int] = List(1, 2, 3, 1, 2, 3, 4, 5, 6, 7)
Now, using the above snippet:
val prefix = sequence.take(1) ++ sequence.zip(sequence.drop(1)).
takeWhile({case (a, b) => a < b}).map({case (a, b) => b})
Again, prefix is a Stream, but not infinite.
scala> prefix.toList
res: List[Int] = List(1, 2, 3)
N.b.: This does not handle the cases when sequence is empty, or when the prefix is also infinite.
If by elegant you mean concise and self-explanatory, it's probably something like the following:
sequence.inits.dropWhile(xs => xs != xs.sorted).next
inits gives us an iterator that returns the prefixes longest-first. We drop all the ones that aren't sorted and take the next one.
If you don't want to do all that sorting, you can write something like this:
sequence.scanLeft(Some(Int.MinValue): Option[Int]) {
case (Some(last), i) if i > last => Some(i)
case _ => None
}.tail.flatten
If the performance of this operation is really important, though (it probably isn't), you'll want to use something more imperative, since this solution still traverses the entire collection (twice).
And, another way to skin the cat:
val sequence = Seq(1,2,3,1,2,3,4,5,6)
sequence.head :: sequence
.sliding(2)
.takeWhile{case List(a,b) => a <= b}
.map(_(1)).toList
// List[Int] = List(1, 2, 3)
I will interpret elegance as the solution that most closely resembles the way we humans think about the problem although an extremely efficient algorithm could also be a form of elegance.
val sequence = List(1,2,3,2,3,45,5)
val increasingPrefix = takeWhile(sequence, _ < _)
I believe this code snippet captures the way most of us probably think about the solution to this problem.
This of course requires defining takeWhile:
/**
* Takes elements from a sequence by applying a predicate over two elements at a time.
* #param xs The list to take elements from
* #param f The predicate that operates over two elements at a time
* #return This function is guaranteed to return a sequence with at least one element as
* the first element is assumed to satisfy the predicate as there is no previous
* element to provide the predicate with.
*/
def takeWhile[A](xs: Traversable[A], f: (Int, Int) => Boolean): Traversable[A] = {
// function that operates over tuples and returns true when the predicate does not hold
val not = f.tupled.andThen(!_)
// Maybe one day our languages will be better than this... (dependant types anyone?)
val twos = sequence.sliding(2).map{case List(one, two) => (one, two)}
val indexOfBreak = twos.indexWhere(not)
// Twos has one less element than xs, we need to compensate for that
// An intuition is the fact that this function should always return the first element of
// a non-empty list
xs.take(i + 1)
}

Scala: How do I use fold* with Map?

I have a Map[String, String] and want to concatenate the values to a single string.
I can see how to do this using a List...
scala> val l = List("te", "st", "ing", "123")
l: List[java.lang.String] = List(te, st, ing, 123)
scala> l.reduceLeft[String](_+_)
res8: String = testing123
fold* or reduce* seem to be the right approach I just can't get the syntax right for a Map.
Folds on a map work the same way they would on a list of pairs. You can't use reduce because then the result type would have to be the same as the element type (i.e. a pair), but you want a string. So you use foldLeft with the empty string as the neutral element. You also can't just use _+_ because then you'd try to add a pair to a string. You have to instead use a function that adds the accumulated string, the first value of the pair and the second value of the pair. So you get this:
scala> val m = Map("la" -> "la", "foo" -> "bar")
m: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(la -> la, foo -> bar)
scala> m.foldLeft("")( (acc, kv) => acc + kv._1 + kv._2)
res14: java.lang.String = lalafoobar
Explanation of the first argument to fold:
As you know the function (acc, kv) => acc + kv._1 + kv._2 gets two arguments: the second is the key-value pair currently being processed. The first is the result accumulated so far. However what is the value of acc when the first pair is processed (and no result has been accumulated yet)? When you use reduce the first value of acc will be the first pair in the list (and the first value of kv will be the second pair in the list). However this does not work if you want the type of the result to be different than the element types. So instead of reduce we use fold where we pass the first value of acc as the first argument to foldLeft.
In short: the first argument to foldLeft says what the starting value of acc should be.
As Tom pointed out, you should keep in mind that maps don't necessarily maintain insertion order (Map2 and co. do, but hashmaps do not), so the string may list the elements in a different order than the one in which you inserted them.
The question has been answered already, but I'd like to point out that there are easier ways to produce those strings, if that's all you want. Like this:
scala> val l = List("te", "st", "ing", "123")
l: List[java.lang.String] = List(te, st, ing, 123)
scala> l.mkString
res0: String = testing123
scala> val m = Map(1 -> "abc", 2 -> "def", 3 -> "ghi")
m: scala.collection.immutable.Map[Int,java.lang.String] = Map((1,abc), (2,def), (3,ghi))
scala> m.values.mkString
res1: String = abcdefghi