Why is headOption faster - scala

I made a change to some code and it got 4.5x faster. I'm wondering why. It used to be essentially:
def doThing(queue: Queue[(String, String)]): Queue[(String, String)] = queue match {
case Queue((thing, stuff), _*) => doThing(queue.tail)
case _ => queue
}
and I changed it to this to get a huge speed boost:
def doThing(queue: Queue[(String, String)]): Queue[(String, String)] = queue.headOption match {
case Some((thing, stuff)) => doThing(queue.tail)
case _ => queue
}
What does _* do and why is it so expensive compared to headOption?

Here is my guess after running scalac with -Xprint:all. At the end of the patmat phase, for the queue match { case Queue((thing, stuff), _*) => doThing(queue.tail) } example, I see the following methods being called (edited for brevity):
val o9 = scala.collection.immutable.Queue.unapplySeq[(String, String)](x1);
if (o9.isEmpty.unary_!)
if (o9.get.!=(null).&&(o9.get.lengthCompare(1).>=(0)))
{
val p2: (String, String) = o9.get.apply(0);
val p3: Seq[(String, String)] = o9.get.drop(1);
So lengthCompare compares the length of the collection in a possibly optimized way. For Queue, it creates an iterator and iterates once, so that should be somewhat fast. drop(1), on the other hand, also constructs an iterator, skips one element, and adds the rest of the elements to the result queue, so it would be linear in the size of the collection.
The headOption example is more straightforward, it checks if the list is empty (two comparisons), and if not returns a Some(head), which then just has its _1 and _2 assigned to thing and stuff. So no iterators are created and nothing linear in the length of the collection.

There should be no significant difference between your code samples.
case Queue((thing, stuff), _*) is actually translated by the compiler into a call to apply(0), which is equivalent to head. You can use scalac -Xprint:patmat to investigate this:
<synthetic> val p2: (String, String) = o9.get.apply(0);
if (p2.ne(null))
matchEnd6(doThing(queue.tail))
The cost of head and cost of headOption are almost the same.
The methods head, tail and dequeue can trigger a reverse of the Queue's internal List (at O(n) cost). In both of your code samples there will be at most 2 reverse calls.
You should use dequeue like this to get at most a single reverse call:
def doThing(queue: Queue[(String, String)]): Queue[(String, String)] =
if (queue.isEmpty) queue
else queue.dequeue match {
case (e, q) => doThing(q)
}
You could also replace (thing, stuff) with _. In that case the compiler generates only a call to lengthCompare, without head or tail:
if (o9.get != null && o9.get.lengthCompare(1) >= 0)
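For illustration, here is a minimal sketch of the same loop written with dequeueOption, which hands back the head and the remaining queue in a single call. The names DrainQueue and drain are made up for this example, and the counter is there only so the result can be checked. I have not benchmarked it against the versions above.

```scala
import scala.annotation.tailrec
import scala.collection.immutable.Queue

object DrainQueue {
  // Walks the queue one element at a time and returns how many elements were seen.
  // dequeueOption yields Some((head, rest)) for a non-empty queue and None otherwise.
  @tailrec
  def drain[A](queue: Queue[A], seen: Int = 0): Int =
    queue.dequeueOption match {
      case Some((_, rest)) => drain(rest, seen + 1)
      case None            => seen
    }
}
```

Calling DrainQueue.drain(Queue("a" -> "1", "b" -> "2")) should visit both pairs and stop on the empty queue.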

_* is used to specify varargs arguments, so what you are doing in the first version is deconstructing the Queue into a pair of Strings, and an appropriate number of further pairs of Strings - ie you are deconstructing the whole Queue even though you only care about the first element.
If you just remove the asterisk, giving
def doThing(queue: Queue[(String, String)]): Queue[(String, String)] = queue match {
case Queue((thing, stuff), _) => doThing(queue.tail)
case _ => queue
}
then you are only deconstructing the Queue into a pair of Strings, and a remainder (which thus does not need to be fully deconstructed). This should run in comparable time to your second version (haven't timed it myself, though).
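One thing worth checking before swapping the patterns: as far as I know, a sequence pattern without _* only matches a sequence of exactly that many elements, so Queue(x, _) matches two-element queues only, while Queue(x, _*) matches any non-empty queue. A small sketch to try it out (SeqPatterns and describe are hypothetical names, not part of the question):

```scala
import scala.collection.immutable.Queue

object SeqPatterns {
  // Reports which sequence pattern a queue falls into.
  def describe(q: Queue[Int]): String = q match {
    case Queue(_, _)  => "exactly two elements" // fixed-arity pattern
    case Queue(_, _*) => "one or more elements" // head plus any remainder
    case _            => "empty"
  }
}
```

If that holds, the version without the asterisk would stop recursing on any queue whose length is not two.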


Finding first and last occurrence of an element using binary search and effects in scala

I've got functions with following signatures:
case class Sth(index: Long, str: String)
def fetch(n: Long): F[Sth] = ??? //implemented
def findFirstAndLast(min: Long, max: Long, str: String): F[(Long, Long)] = ???
I know for certain that equal str values are grouped together and that a group never occurs twice.
For example this will be correct:
Sth(1, "a")
Sth(2, "a")
Sth(3, "b")
Sth(4, "b")
Sth(5, "b")
Sth(6, "c")
Sth(7, "d")
Sth(8, "d")
and that should be result of my function: findFirstAndLast(1, 8, "b") = F((3, 5))
But following case will never happen:
Sth(1, "a")
Sth(2, "b")
Sth(3, "b")
Sth(4, "a")
I tried to implement this but my brain stopped functioning when I've added the effect.
Start by envisioning your algorithm as a recursive one, where there is some initial "state", and the algorithm can choose at each step to either update the state and recurse, or finish with a result.
You could represent your state as
case class State(index: Long, firstMatch: Option[Long])
where the index is the current index you are about to inspect, and the firstMatch is the index where you found your first matching string.
At each step, the algorithm will look at the index and check whether it is in-bounds compared to the max. If it's out of bounds, it must exit with whatever values it found. If it's in-bounds, grab the string associated with that index, then based on whether that string matches and whether you already found your firstMatch, decide whether to continue checking at the next index or return a result.
In order to perform recursion with effects, you're going to need a Monad[F]. You can find a lot of articles that explain what a Monad is, but to simplify for the purposes of this answer, it's a thing that knows how to flatMap your F, and provides the handy tailRecM function for doing that flatMapping recursively.
tailRecM is a way to represent a recursive algorithm with an F[_] effect.
def tailRecM[A, B](a: A)(f: (A) ⇒ F[Either[A, B]]): F[B]
The a: A is your initial state, e.g. State(min, None)
The f: A => F[Either[A, B]] is a function that inspects your current state, then decides whether to recurse, where that decision is done inside an F effect. Basically Left means recurse, Right means exit. It's perfect for your situation, where you have a fetch method that forces you into an F effect.
It returns an F[B] when your f returns an F[Right[B]], i.e. the end of recursion.
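To make the Left/Right convention concrete without pulling in a library, here is a rough sketch of what tailRecM could look like when F is fixed to Option. tailRecMOption and sumTo are hypothetical names for illustration only; this is not how cats implements it, just the same shape.

```scala
import scala.annotation.tailrec

object TailRecMSketch {
  // Hand-rolled stand-in for tailRecM with F = Option (illustrative only).
  // Left(next) means "recurse with next", Right(done) means "finish with done",
  // and None aborts the whole computation.
  @tailrec
  def tailRecMOption[A, B](a: A)(f: A => Option[Either[A, B]]): Option[B] =
    f(a) match {
      case Some(Left(next))  => tailRecMOption(next)(f)
      case Some(Right(done)) => Some(done)
      case None              => None
    }

  // Sums 1..n; the state is (remaining counter, accumulator).
  def sumTo(n: Int): Option[Long] =
    tailRecMOption[(Int, Long), Long]((n, 0L)) {
      case (i, _) if i < 0 => None                          // invalid input: give up
      case (0, acc)        => Some(Right(acc))              // done
      case (i, acc)        => Some(Left((i - 1, acc + i)))  // keep going
    }
}
```

The loop is stack-safe because the recursion is a plain tail call, which is the property tailRecM is meant to guarantee for a given F.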
When you write your method, you just have to make sure there is a Monad available for the F type you're using, so you can use it to call tailRecM. In my demo, I put F as a type parameter on the method and made fetch into an argument, but I suspect in your code, both findFirstAndLast and fetch are defined inside some class that has the F type parameter. Adjust as necessary.
def findFirstAndLast[F[_]: Monad](
min: Long,
max: Long,
str: String,
fetch: Long => F[String],
)(implicit F: Monad[F]): F[Option[(Long, Long)]] =
F.tailRecM(State(min, None)) {
// if out of bounds, result is None if firstIndex was never found,
// or `firstIndex -> max` if it was found
case State(index, firstIndex) if index > max =>
// note: `f` wants an `F[Either[..]]` so we wrap the either with `F.pure`
F.pure(Right(firstIndex.map(_ -> max)))
// if in-bounds, fetch the index and decide from there
case State(index, firstMatch) =>
F.map(fetch(index)) {
case `str` if firstMatch.isEmpty =>
// found the first match!
Left(State(index + 1, Some(index)))
case `str` if firstMatch.isDefined =>
// still matching after first
Left(State(index + 1, firstMatch))
case other if firstMatch.isDefined =>
// no longer matching, last match must be previous step
Right(Some((firstMatch.get, index - 1)))
case other =>
// still looking for first match
Left(State(index + 1, None))
}
}
Example usage with F=SyncIO, but this will work generally with any F type that has a Monad, e.g. IO, monix.eval.Task, Future, Option, etc.
def exampleFetch(n: Long) = SyncIO { "aabbbcdd".charAt(n.toInt - 1).toString }
val result = findFirstAndLast(1, 8, "b", exampleFetch).unsafeRunSync()
println(result) // Some((3, 5))
https://scastie.scala-lang.org/oN0P6e3JQ7WEKkgbxuPLuQ

Work around type-erasure when doing pattern-match of a list / sequence in Scala

I'm having a situation like this:
I have a sequence that I need to match. Actually, in the "case" I only need to match against a sequence whose elements are of tuple (String, Seq[String]) but I couldn't find a way to do that, so I resorted to the technique I read on web: decapitate the seq, match against the first element, and re-attach inside the block to get the original seq.
The problem with that approach is: type erasure.
The resulting seq from the expression "head +: rest" is a Seq[Any] instead of Seq[(String, Seq[String])]
That's why tuple._1 gives a compile error (line 153 in the attached image).
How to work around this situation?
It was my bad, there are a couple of coding errors in the above screenshot; from the case block of sequence matching I should have added another line to the end: complexSeqReconstructed.
Apart from that, there's a little detail: from the "case Nil", I also have to return an empty Seq of type Seq[(String, Seq[String])] ... that matches the return from the other case. That way Scala will correctly perform type inference. Otherwise the line hhh.map wouldn't compile (it will say, Object hhh doesn't have map method).
So here's the updated (working) code:
val theComplexSeq: Seq[(String, Seq[String])] = Seq(
("the_key_a", Seq("value_a_1", "value_a_2"))
)
val hhh = theComplexSeq match {
case Nil => {
Seq[(String, Seq[String])]()
}
case (head: (String, Seq[String])) +: rest => {
println(head)
println(rest)
val complexSeqReconstructed = head +: rest
println(complexSeqReconstructed)
complexSeqReconstructed
}
}
hhh.map {tuple =>
println(tuple._1 + "->" + tuple._2)
tuple
}
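Since theComplexSeq is already statically typed as Seq[(String, Seq[String])], the type test in the pattern may not be needed at all. A sketch under that assumption, destructuring the head tuple directly (Entry and render are made-up names for this example):

```scala
object TypedSeqMatch {
  type Entry = (String, Seq[String])

  // No runtime type test: the compiler already knows the element type,
  // so both head and rest keep their precise static types.
  def render(entries: Seq[Entry]): Seq[String] = entries match {
    case Seq()                 => Seq.empty
    case (key, values) +: rest => s"${key}->${values.mkString(",")}" +: render(rest)
  }
}
```

Both branches return Seq[String] here, so type inference has nothing to widen to Any or Object.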

How do you stop building an Option[Collection] upon reaching the first None?

When building up a collection inside an Option, each attempt to make the next member of the collection might fail, making the collection as a whole a failure, too. Upon the first failure to make a member, I'd like to give up immediately and return None for the whole collection. What is an idiomatic way to do this in Scala?
Here's one approach I've come up with:
def findPartByName(name: String): Option[Part] = . . .
def allParts(names: Seq[String]): Option[Seq[Part]] =
names.foldLeft(Some(Seq.empty): Option[Seq[Part]]) {
(result, name) => result match {
case Some(parts) =>
findPartByName(name) flatMap { part => Some(parts :+ part) }
case None => None
}
}
In other words, if any call to findPartByName returns None, allParts returns None. Otherwise, allParts returns a Some containing a collection of Parts, all of which are guaranteed to be valid. An empty collection is OK.
The above has the advantage that it stops calling findPartByName after the first failure. But the foldLeft still iterates once for each name, regardless.
Here's a version that bails out as soon as findPartByName returns a None:
def allParts2(names: Seq[String]): Option[Seq[Part]] = Some(
for (name <- names) yield findPartByName(name) match {
case Some(part) => part
case None => return None
}
)
I currently find the second version more readable, but (a) what seems most readable is likely to change as I get more experience with Scala, (b) I get the impression that early return is frowned upon in Scala, and (c) neither one seems to make what's going on especially obvious to me.
The combination of "all-or-nothing" and "give up on the first failure" seems like such a basic programming concept, I figure there must be a common Scala or functional idiom to express it.
The return in your code is actually a couple levels deep in anonymous functions. As a result, it must be implemented by throwing an exception which is caught in the outer function. This isn't efficient or pretty, hence the frowning.
It is easiest and most efficient to write this with a while loop and an Iterator.
def allParts3(names: Seq[String]): Option[Seq[Part]] = {
val iterator = names.iterator
var accum = List.empty[Part]
while (iterator.hasNext) {
findPartByName(iterator.next) match {
case Some(part) => accum +:= part
case None => return None
}
}
Some(accum.reverse)
}
Because we don't know what kind of Seq names is, we must create an iterator to loop over it efficiently rather than using tail or indexes. The while loop can be replaced with a tail-recursive inner function, but with the iterator a while loop is clearer.
Scala collections have some options to use laziness to achieve that.
You can use view and takeWhile:
def allPartsWithView(names: Seq[String]): Option[Seq[Part]] = {
val successes = names.view.map(findPartByName)
.takeWhile(!_.isEmpty)
.map(_.get)
.force
if (!names.isDefinedAt(successes.size)) Some(successes)
else None
}
Using isDefinedAt avoids potentially traversing a long input names in the case of an early failure.
You could also use toStream and span to achieve the same thing:
def allPartsWithStream(names: Seq[String]): Option[Seq[Part]] = {
val (good, bad) = names.toStream.map(findPartByName)
.span(!_.isEmpty)
if (bad.isEmpty) Some(good.map(_.get).toList)
else None
}
I've found trying to mix view and span causes findPartByName to be evaluated twice per item in case of success.
The whole idea of returning an error condition if any error occurs does, however, sound more like a job ("the" job?) for throwing and catching exceptions. I suppose it depends on the context in your program.
Combining the other answers, i.e., a mutable flag with the map and takeWhile we love.
Given an infinite stream:
scala> var count = 0
count: Int = 0
scala> val vs = Stream continually { println(s"Compute $count") ; count += 1 ; count }
Compute 0
vs: scala.collection.immutable.Stream[Int] = Stream(1, ?)
Take until a predicate fails:
scala> var failed = false
failed: Boolean = false
scala> vs map { case x if x < 5 => println(s"Yup $x"); Some(x) case x => println(s"Nope $x"); failed = true; None } takeWhile (_.nonEmpty) map (_.get)
Yup 1
res0: scala.collection.immutable.Stream[Int] = Stream(1, ?)
scala> .toList
Compute 1
Yup 2
Compute 2
Yup 3
Compute 3
Yup 4
Compute 4
Nope 5
res1: List[Int] = List(1, 2, 3, 4)
or more simply:
scala> var count = 0
count: Int = 0
scala> val vs = Stream continually { println(s"Compute $count") ; count += 1 ; count }
Compute 0
vs: scala.collection.immutable.Stream[Int] = Stream(1, ?)
scala> var failed = false
failed: Boolean = false
scala> vs map { case x if x < 5 => println(s"Yup $x"); x case x => println(s"Nope $x"); failed = true; -1 } takeWhile (_ => !failed)
Yup 1
res3: scala.collection.immutable.Stream[Int] = Stream(1, ?)
scala> .toList
Compute 1
Yup 2
Compute 2
Yup 3
Compute 3
Yup 4
Compute 4
Nope 5
res4: List[Int] = List(1, 2, 3, 4)
I think your allParts2 function has a problem as one of the two branches of your match statement will perform a side effect. The return statement is the not-idiomatic bit, behaving as if you are doing an imperative jump.
The first function looks better, but if you are concerned with the sub-optimal iteration that foldLeft could produce you should probably go for a recursive solution as the following:
import scala.annotation.tailrec

def allParts(names: Seq[String]): Option[Seq[Part]] = {
  @tailrec
  def allPartsRec(names: Seq[String], acc: Seq[Part]): Option[Seq[Part]] = names match {
    case Seq(x, xs @ _*) => findPartByName(x) match {
      // append the found part and continue with the remaining names
      case Some(part) => allPartsRec(xs, acc :+ part)
      // first failure: give up immediately
      case None => None
    }
    case _ => Some(acc)
  }
  allPartsRec(names, Seq.empty)
}
I didn't compile/run it but the idea should be there and I believe it is more idiomatic than using the return trick!
I keep thinking that this has to be a one- or two-liner. I came up with one:
def allParts4(names: Seq[String]): Option[Seq[Part]] = Some(
names.map(findPartByName(_) getOrElse { return None })
)
Advantage:
The intent is extremely clear. There's no clutter and there's no exotic or nonstandard Scala.
Disadvantages:
The early return violates referential transparency, as Aldo Stracquadanio pointed out. You can't put the body of allParts4 into its calling code without changing its meaning.
Possibly inefficient due to the internal throwing and catching of an exception, as wingedsubmariner pointed out.
Sure enough, I put this into some real code, and within ten minutes, I'd enclosed the expression inside something else, and predictably got surprising behavior. So now I understand a little better why early return is frowned upon.
This is such a common operation, so important in code that makes heavy use of Option, and Scala is normally so good at combining things, I can't believe there isn't a pretty natural idiom to do it correctly.
Aren't monads good for specifying how to combine actions? Is there a GiveUpAtTheFirstSignOfResistance monad?
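For what it's worth, the operation being asked for is usually called traverse (libraries such as cats and Scalaz provide it for Option). Below is a stdlib-only sketch of the idea, assuming a made-up helper name traverseOption; the parse function stands in for findPartByName just so the example is self-contained.

```scala
import scala.annotation.tailrec

object TraverseSketch {
  // Applies f to each item in order and stops at the first None.
  // Returns Some(all results) only if every call succeeded.
  def traverseOption[A, B](items: Seq[A])(f: A => Option[B]): Option[List[B]] = {
    @tailrec
    def loop(it: Iterator[A], acc: List[B]): Option[List[B]] =
      if (!it.hasNext) Some(acc.reverse)
      else f(it.next()) match {
        case Some(b) => loop(it, b :: acc)
        case None    => None // early exit, remaining items are never visited
      }
    loop(items.iterator, Nil)
  }

  // Hypothetical stand-in for findPartByName.
  def parse(s: String): Option[Int] =
    if (s.nonEmpty && s.forall(_.isDigit)) Some(s.toInt) else None
}
```

With this in scope, allParts could be written as traverseOption(names)(findPartByName), with no return and no mutable flag.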

Scala: Generalised method to find match and return match dependant values in collection

I wish to find a match within a List and return values dependant on the match. The CollectFirst works well for matching on the elements of the collection but in this case I want to match on the member swEl of the element rather than on the element itself.
abstract class CanvNode (var swElI: Either[CSplit, VistaT])
{
private[this] var _swEl: Either[CSplit, VistaT] = swElI
def member = _swEl
def member_= (value: Either[CSplit, VistaT] ){ _swEl = value; attach}
def attach: Unit
attach
def findVista(origV: VistaIn): Option[Tuple2[CanvNode,VistaT]] = member match
{
case Right(v) if (v == origV) => Option(this, v)
case _ => None
}
}
def nodes(): List[CanvNode] = topNode :: splits.map(i => List(i.n1, i.n2)).flatten
//Is there a better way of implementing this?
val temp: Option[Tuple2[CanvNode, VistaT]] =
nodes.map(i => i.findVista(origV)).collectFirst{case Some (r) => r}
Do I need a View on that, or will the collectFirst method ensure the collection is only created as needed?
It strikes me that this must be a fairly general pattern. Another example could be if one had a List member of the main List's elements and wanted to return the fourth element if it had one. Is there a standard method I can call? Failing that I can create the following:
implicit class TraversableOnceRichClass[A](n: TraversableOnce[A])
{
def findSome[T](f: (A) => Option[T]) = n.map(f(_)).collectFirst{case Some (r) => r}
}
And then I can replace the above with:
val temp: Option[Tuple2[CanvNode, VistaT]] =
nodes.findSome(i => i.findVista(origV))
This uses implicit classes from 2.10, for pre 2.10 use:
class TraversableOnceRichClass[A](n: TraversableOnce[A])
{
def findSome[T](f: (A) => Option[T]) = n.map(f(_)).collectFirst{case Some (r) => r}
}
implicit final def TraversableOnceRichClass[A](n: List[A]):
TraversableOnceRichClass[A] = new TraversableOnceRichClass(n)
As an introductory side note: The operation you're describing (return the first Some if one exists, and None otherwise) is the sum of a collection of Options under the "first" monoid instance for Option. So for example, with Scalaz 6:
scala> Stream(None, None, Some("a"), None, Some("b")).map(_.fst).asMA.sum
res0: scalaz.FirstOption[java.lang.String] = Some(a)
Alternatively you could put something like this in scope:
implicit def optionFirstMonoid[A] = new Monoid[Option[A]] {
val zero = None
def append(a: Option[A], b: => Option[A]) = a orElse b
}
And skip the .map(_.fst) part. Unfortunately neither of these approaches is appropriately lazy in Scalaz, so the entire stream will be evaluated (unlike Haskell, where mconcat . map (First . Just) $ [1..] is just fine, for example).
Edit: As a side note to this side note: apparently Scalaz does provide a sumr that's appropriately lazy (for streams—none of these approaches will work on a view). So for example you can write this:
Stream.from(1).map(Some(_).fst).sumr
And not wait forever for your answer, just like in the Haskell version.
But assuming that we're sticking with the standard library, instead of this:
n.map(f(_)).collectFirst{ case Some(r) => r }
I'd write the following, which is more or less equivalent, and arguably more idiomatic:
n.flatMap(f(_)).headOption
For example, suppose we have a list of integers.
val xs = List(1, 2, 3, 4, 5)
We can make this lazy and map a function with a side effect over it to show us when its elements are accessed:
val ys = xs.view.map { i => println(i); i }
Now we can flatMap an Option-returning function over the resulting collection and use headOption to (safely) return the first element, if it exists:
scala> ys.flatMap(i => if (i > 2) Some(i.toString) else None).headOption
1
2
3
res0: Option[java.lang.String] = Some(3)
So clearly this stops when we hit a non-empty value, as desired. And yes, you'll definitely need a view if your original collection is strict, since otherwise headOption (or collectFirst) can't reach back and stop the flatMap (or map) that precedes it.
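The difference between the strict and the lazy version can be made visible by counting how often the function is called. This is only a sketch; LazyFirstMatch and firstBig are hypothetical names, and the counter exists purely for demonstration.

```scala
object LazyFirstMatch {
  // Returns the first match together with the number of times f was evaluated.
  def firstBig(xs: List[Int], lazily: Boolean): (Option[String], Int) = {
    var calls = 0
    def f(i: Int): Option[String] = {
      calls += 1
      if (i > 2) Some(i.toString) else None
    }
    val result =
      if (lazily) xs.view.flatMap(f(_)).headOption // stops at the first Some
      else xs.flatMap(f(_)).headOption             // maps the whole list first
    (result, calls)
  }
}
```

On List(1, 2, 3, 4, 5) both variants should find the same element, but only the view-based one avoids evaluating f for 4 and 5.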
In your case you can skip findVista and get even more concise with something like this:
val temp = nodes.view.flatMap(
node => node.right.toOption.filter(_ == origV).map(node -> _)
).headOption
Whether you find this clearer or just a mess is a matter of taste, of course.

How to extract remainder of sequence in pattern matching

I've obviously done a very poor job of explaining what I'm looking for in my original post so let's try this one more time. What I'm trying to accomplish is the ability to pass a sequence of items, extract one or more of the items, and then pass the REMAINDER of the sequence on to another extractor. Note that by sequence I mean sequence (not necessarily a List). My previous examples used list as the sequence and I gave some examples of extraction using cons (::), but I could just as well pass an Array as my sequence.
I thought I knew how pattern matching and extraction worked but I could be wrong so to avoid any more basic comments and links to how to do pattern matching sites here's my understanding:
If I want to return a single item from my extractor I would define an unapply method. This method takes whatever type I chose as input (the type could be a sequence...) and returns a single optional item (the return type could itself be a sequence). The return must be wrapped in Some if I want a match or None if I don't. Here is an example that takes a sequence as input and returns the same sequence wrapped in Some but only if it contains all Strings. I could very well just return the sequence wrapped in Some and not do anything else, but this seems to cause confusion for people. The key is if it is wrapped in Some then it will match and if it is None it will not. Just to be more clear, the match will also not happen unless the input also matches my unapply methods input type. Here is my example:
object Test {
// In my original post I just returned the Seq itself just to verify I
// had matched but many people commented they didn't understand what I
// was trying to do so I've made it a bit more complicated (e.g. match
// only if the sequence is a sequence of Strings). Hopefully I don't
// screw this up and introduce a bug :)
def unapply[A](xs: Seq[A]): Option[Seq[String]] =
if (xs forall { _.isInstanceOf[String] })
Some(xs.asInstanceOf[Seq[String]])
else
None
}
Using List as an example, I can now perform the following:
// This works
def test1(xs: List[_]) = xs match {
case (s: String) :: Test(rest) =>
println("s = " + s + ", rest = " + rest)
case _ =>
println("no match")
}
test1(List("foo", "bar", "baz")) // "s = foo, rest = List(bar, baz)"
My test1 function takes List as input and extracts the head and tail using cons via the constructor pattern (e.g. ::(s, rest)). It then uses type ascription (: String) to make sure the head (s) is a String. The tail contains List("bar", "baz"). This is a List which means it is also a Seq (sequence). It is then passed as input to my Test extractor which verifies that both "bar" and "baz" are strings and returns the List wrapped in Some. Since Some is returned it is considered a match (although in my original post where I inadvertently mixed up unapplySeq with unapply this didn't work as expected, but that aside...). This is NOT what I'm looking for. This was only an example to show that Test does in fact extract a Seq as input as expected.
Now, here's where I caused mass confusion last time when I inadvertently used unapplySeq instead of unapply in my write up. After much confusion trying to understand the comments that were posted I finally picked up on the mistake. Many thanks to Dan for pointing me in the right direction...
But just to avoid any more confusion, let me clarify my understanding of unapplySeq. Like unapply, unapplySeq takes in whatever argument I choose as input, but instead of returning a single element it returns a sequence of elements. Each item in this sequence can then be used for additional pattern matching. Again, to make a match happen the input type must match and my returned sequence must be wrapped in Some and not be None. When extracting over the sequence of items returned from unapplySeq, you can use _* to match any remaining items not yet matched.
Ok, so my extractor takes a sequence as input and returns a sequence (as a single item) in return. Since I only want to return a single item as a match I need to use unapply NOT unapplySeq. Even though in my case I'm returning a Seq, I don't want unapplySeq because I don't want to do more pattern matching on the items in the Seq. I just want to return the items as a Seq on its own to then be passed to the body of my case match. This sounds confusing, but to those that understand unapply vs unapplySeq I hope it isn't.
So here is what I WANT to do. I want to take something that returns a sequence (e.g. List or Array) and I want to extract a few items from this sequence and then extract the REMAINDER of the items (e.g. _*) as a sequence. Let's call it the remainder sequence. I want to then pass the remainder sequence as input to my extractor. My extractor will then return the remaining items as a single Seq if it matches my criteria. Just to be 100% clear: the List (or Array, etc) will have its unapplySeq extractor called to create the sequence of items. I will extract one or more of these items and then pass what is left as a sequence to my Test extractor, which will use unapply (NOT unapplySeq) to return the remainder. If you are confused by this, then please don't comment...
Here are my tests:
// Doesn't compile. Is there a syntax for this?
def test2(xs: Seq[_]) = xs match {
  // Variations tried:
  //   Test(rest) @ _* - doesn't compile (this one seems reasonable to me)
  //   Test(rest @ _*) - doesn't compile (would compile if Test had
  //                     unapplySeq, but in that case would bind List's
  //                     second element to Test as a Seq and then bind
  //                     rest to that Seq (if all strings) - not what I'm
  //                     looking for...). I thought that this might work
  //                     since Scala knows Test has no unapplySeq only
  //                     unapply so @ _* can be tied to the List not Test
  //   rest @ Test(_*) - doesn't compile (didn't expect to)
  case List(s: String, Test(rest) @ _*) =>
    println("s = " + s + " rest = " + rest)
  case _ =>
    println("no match")
}
// This works, but messy
def test3(xs: List[_]) = xs match {
  case List(s: String, rest @ _*) if (
    rest match { case Test(rest) => true; case _ => false }
  ) =>
    println("s = " + s + " rest = " + rest)
  case _ =>
    println("no match")
}
I created test3 based on comments from Julian (thanks Julian..). Some have commented that test3 does what I want, so they are confused about what I'm looking for. Yes, it accomplishes what I want to accomplish, but I'm not satisfied with it. Daniel's example also works (thanks Daniel), but I'm also not satisfied with having to create another extractor to split things and then do embedded extractions. These solutions seem like too much work to accomplish something that seems fairly straightforward to me. What I WANT is to make test2 work or know that it can't be done this way. Is the error given because the syntax is wrong? I know that rest @ _* will return a Seq; that can be verified here:
def test4(xs: List[_]) = xs match {
  case List(s: String, rest @ _*) =>
    println(rest.getClass) // scala.collection.immutable.$colon$colon
  case _ =>
    println("no match")
}
It returns cons (::), which is a List, which is a Seq. So how can I pass the _* Seq on to my extractor and have its return value bound to the variable rest?
Note that I've also tried passing varargs to my unapply constructor (e.g. unapply(xs: A*)...) but that won't match either.
So, I hope it is clear now when I say I want to extract the remainder of a sequence in pattern matching. I'm not sure how else I can word it.
Based on the great feedback from Daniel I'm hoping he is going to have an answer for me :)
I'd like to extract the first item and pass the remainder on to another extractor.
OK. Your test1 does that, exactly. first_item :: Extractor(the_rest). The weird behavior you're seeing comes from your Test extractor. As you already had the answer to your stated question, and as expected behavior from your Test strikes you as a problem with test1, it seems that what you really want is some help with extractors.
So, please read Extractor Objects, from docs.scala-lang.org, and Pattern Matching in Scala (pdf). Although that PDF has an example of unapplySeq, and suggests where you'd want to use it, here are some extra examples:
object Sorted {
def unapply(xs: Seq[Int]) =
if (xs == xs.sortWith(_ < _)) Some(xs) else None
}
object SortedSeq {
def unapplySeq(xs: Seq[Int]) =
if (xs == xs.sortWith(_ < _)) Some(xs) else None
}
Interactively:
scala> List(1,2,3,4) match { case Sorted(xs) => Some(xs); case _ => None }
res0: Option[Seq[Int]] = Some(List(1, 2, 3, 4))
scala> List(4,1,2,3) match { case Sorted(xs) => Some(xs); case _ => None }
res1: Option[Seq[Int]] = None
scala> List(4,1,2,3) match { case first :: Sorted(rest) => Some(first, rest); case _ => None }
res2: Option[(Int, Seq[Int])] = Some((4,List(1, 2, 3)))
scala> List(1,2,3,4) match { case SortedSeq(a,b,c,d) => (a,b,c,d) }
res3: (Int, Int, Int, Int) = (1,2,3,4)
scala> List(4,1,2,3) match { case _ :: SortedSeq(a, b, _*) => (a,b) }
res4: (Int, Int) = (1,2)
scala> List(1,2,3,4) match { case SortedSeq(a, rest @ _*) => (a, rest) }
res5: (Int, Seq[Int]) = (1,List(2, 3, 4))
Or maybe -- I only have the faint suspicion of this, you haven't said as much -- you don't want extractor help, but actually you want a terse way to express something like
scala> List(1,2,3,4) match { case 1 :: xs if (xs match { case Sorted(_) => true; case _ => false }) => xs }
res6: List[Int] = List(2, 3, 4)
Erlang has a feature like this (although, without these crazy extractors):
example(L=[1|_]) -> examine(L).
, which pattern-matches the same argument twice - to L and also to [1|_]. In Erlang both sides of the = are full-fledged patterns and could be anything, and you can add a third or more patterns with more =. Scala seems to only support the L=[1|_] form, having a variable and then a full pattern.
scala> List(4,1,2,3) match { case xs @ _ :: Sorted(_) => xs }
collection.immutable.::[Int] = List(4, 1, 2, 3)
Well, the easiest way is this:
case (s: String) :: Test(rest @ _*) =>
If you need this to work on general Seq, you can just define an extractor to split head from tail:
object Split {
def unapply[T](xs: Seq[T]): Option[(T, Seq[T])] = if (xs.nonEmpty) Some(xs.head -> xs.tail) else None
}
And then use it like
case Split(s: String, Test(rest @ _*)) =>
Also note that if you had defined unapply instead of unapplySeq, then @ _* would not be required on the pattern matched by Test.
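As a possible alternative to a hand-written Split, the standard library's +: extractor (available since Scala 2.10, as far as I know) splits head from tail on any Seq. The sketch below assumes a Test-like extractor with a plain unapply; AllStrings and describe are made-up names for this example.

```scala
object RemainderSketch {
  // Matches only when every element is a String; returns the whole Seq as one value.
  object AllStrings {
    def unapply(xs: Seq[Any]): Option[Seq[String]] =
      if (xs.forall(_.isInstanceOf[String])) Some(xs.collect { case s: String => s })
      else None
  }

  // Takes the first element with +: and hands the remainder to AllStrings.
  def describe(xs: Seq[Any]): String = xs match {
    case (s: String) +: AllStrings(rest) =>
      s"s = $s, rest = ${rest.mkString("[", ", ", "]")}"
    case _ => "no match"
  }
}
```

This works the same way for List and Vector arguments, since +: is defined for sequences in general rather than for List only.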
:: is an extractor. For how it works (from a random googling), see, for example, here.
def test1(xs: List[_]) = xs match {
case s :: rest =>
println("s = " + s + " rest = " + rest)
case _ =>
println("no match")
}
scala> test1(List("a", "b", "c"))
s = a rest = List(b, c)
I think this is what you wanted?
Messing around with this, it seems that the issue has something to do with unapplySeq.
object Test {
def unapply[A](xs: List[A]): Option[List[A]] = Some(xs)
}
def test1(xs: List[_]) = xs match {
case (s: String) :: Test(s2 :: rest) =>
println("s = " + s + " rest = " + rest)
case _ =>
println("no match")
}
test1(List("foo", "bar", "baz"))
produces the output:
s = foo rest = List(baz)
I'm having trouble googling up docs on the difference between unapply and unapplySeq.
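In case it helps, here is a small side-by-side sketch of the two, as I understand them: unapply yields one value that the pattern binds as a whole, whereas unapplySeq yields a sequence whose elements the pattern can pick apart, optionally with @ _* for the remainder. Whole, Elems and the two helper methods are hypothetical names for illustration.

```scala
object ExtractorSketch {
  // unapply: the pattern receives the Seq as a single value.
  object Whole {
    def unapply(xs: Seq[Int]): Option[Seq[Int]] =
      if (xs.nonEmpty) Some(xs) else None
  }

  // unapplySeq: the pattern receives the individual elements.
  object Elems {
    def unapplySeq(xs: Seq[Int]): Option[Seq[Int]] =
      if (xs.nonEmpty) Some(xs) else None
  }

  def viaUnapply(xs: Seq[Int]): String = xs match {
    case Whole(all) => s"whole=${all.mkString(",")}"
    case _          => "no match"
  }

  def viaUnapplySeq(xs: Seq[Int]): String = xs match {
    case Elems(first, rest @ _*) => s"first=$first rest=${rest.mkString(",")}"
    case _                       => "no match"
  }
}
```

The extractor bodies are identical on purpose; only the method name changes how the pattern on the left-hand side of the case is interpreted.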