Matching a sequence within a sequence in scala - scala

I'm looking to match a sequence within a sequence like either in ex 1 or ex 2;
List(1, 2, 3, 4) match {
case 1 :: List(_*) :: 4 :: tail => // Ex 1
case 1 :: (seq : List[Int]) :: 4 :: tail => // Ex 2
case _ =>
}
This is a variant to the fixed length sequence pattern _*. Like the _*, I don't care about the content of the inner pattern, but it is important that the length can vary and that the pattern is surrounded by a prefix (such as the 1 above) and a suffix (like the 4).
My question is if any of you have a trick to do this by some crafty unapply magic or if you'd just iterate the list to search for the sequence manually.
Thanks in advance! :-)

This is really outside the scope of what pattern matching is supposed to be used for. Even if you could finagle a set of custom unapply methods to do what you wanted, it wouldn't be apparent whether matching was greedy or not, etc..
However, if you really want to, you can proceed as follows (for example):
import scala.collection.SeqLike
class Decon[A](a0: A, a1: A) {
def unapply[C <: SeqLike[A, C]](xs: C with SeqLike[A, C]): Option[(C, C)] = {
xs.span(_ != a1) match {
case (a0 +: pre, a1 +: post) => Some((pre,post))
case _ => None
}
}
}
val Dc = new Decon(1,4)
scala> List(1,2,3,4,5) match { case Dc(pre, post) => (pre, post); case _ => (Nil, Nil) }
res1: (List[Int], List[Int]) = (List(2, 3),List(5))
Separating the specification of the fixed elements 1 and 4 from the match is necessary; otherwise the normal algorithm would ask the unapply to return values without any knowledge of what was being sought, and then would test to make sure they were correct.

You could do something like this:
List(1,2,3,4) match {
case 1 :: tail if tail.last == 4 => println("Found it!")
}
But if you check the implementation of last in LinearSeqOptimized:
def last: A = {
if (isEmpty) throw new NoSuchElementException
var these = this
var nx = these.tail
while (!nx.isEmpty) {
these = nx
nx = nx.tail
}
these.head
}
It is iterating indeed. This is because List inherits from LinearSeq which are optimized to provide efficient head and tail operations. For what you want to do, it is better to use an IndexedSeq implementation like Vector which has an optimized length operation, used in last:
override /*TraversableLike*/ def last: A = {
if (isEmpty) throw new UnsupportedOperationException("empty.last")
apply(length-1)
}
So you could do something like this:
Vector(1,2,3,4) match {
case v if v.head == 1 && v.last == 4 => println("Found it!")
}

Related

scalaz, Disjunction.sequence returning a list of lefts

In scalaz 7.2.6, I want to implement sequence on Disjunction, such that if there is one or more lefts, it returns a list of those, instead of taking only the first one (as in Disjunction.sequenceU):
import scalaz._, Scalaz._
List(1.right, 2.right, 3.right).sequence
res1: \/-(List(1, 2, 3))
List(1.right, "error2".left, "error3".left).sequence
res2: -\/(List(error2, error3))
I've implemented it as follows and it works, but it looks ugly. Is there a getRight method (such as in scala Either class, Right[String, Int](3).right.get)? And how to improve this code?
implicit class RichSequence[L, R](val l: List[\/[L, R]]) {
def getLeft(v: \/[L, R]):L = v match { case -\/(x) => x }
def getRight(v: \/[L, R]):R = v match { case \/-(x) => x }
def sequence: \/[List[L], List[R]] =
if (l.forall(_.isRight)) {
l.map(e => getRight(e)).right
} else {
l.filter(_.isLeft).map(e => getLeft(e)).left
}
}
Playing around I've implemented a recursive function for that, but the best option would be to use separate:
implicit class RichSequence[L, R](val l: List[\/[L, R]]) {
def sequence: \/[List[L], List[R]] = {
def seqLoop(left: List[L], right: List[R], list: List[\/[L, R]]): \/[List[L], List[R]] =
list match {
case (h :: t) =>
h match {
case -\/(e) => seqLoop(left :+ e, right, t)
case \/-(s) => seqLoop(left, right :+ s, t)
}
case Nil =>
if(left.isEmpty) \/-(right)
else -\/(left)
}
seqLoop(List(), List(), l)
}
def sequenceSeparate: \/[List[L], List[R]] = {
val (left, right) = l.separate[\/[L, R], L, R]
if(left.isEmpty) \/-(right)
else -\/(left)
}
}
The first one just collects results and at the end decide what to do with those, the second its basically the same with the exception that the recursive function is much simpler, I didn't think about performance here, I've used :+, if you care use prepend or some other collection.
You may also want to take a look at Validation and ValidationNEL which unlike Disjunction accumulate failures.

Difference between any and parametric polymorphism scala?

I know that parametric polymorphism is what actually works, but I'm curious why using Any in it's place does not. For example how is the first function
def len[T] (l:List[T]):Int =
l match {
case Nil => 0
case _ :: t => 1 + len(t)
}
different from this one?
def len (l:List[Any]):Int =
l match {
case Nil => 0
case _ :: t => 1 + len(t)
}
What do you mean it doesn't work? This seems fine:
len(List('a,'b,'c))
// res0: Int = 3
Your in your example, there really isn't a difference, since you're not actually using the contents of the list for anything, but imagine a slightly different function:
def second[T](l: List[T]): Option[T] =
l match {
case Nil => None
case _ :: Nil => None
case _ :: x :: _ => Some(x)
}
println(second(List(1,2,3)).map(_ + 5)) // Some(7)
println(second(List(List('a,'b,'c), List('d,'e))).map(_.head)) // Some('d)
If you tried this with Any, you wouldn't be able to get anything except Option[Any] in return, so the compiler wouldn't let you do anything useful with the result (like add it to an Int or call .head, as in the examples, respectively).
In this case there really isn't a difference, because you aren't relying on the contained type at all, just the structure of List itself. It doesn't matter what T is, the length will be the same either way.
The type parameter would be important if you wanted to return another List[T]. For example:
def takeEveryOther[T](l: List[T]): List[T] =
l.zipWithIndex.collect { case (a, i) if(i % 2 == 0) => a }

Whats wrong with this scala tuple?

Trying to create a method that determines if a set is a subset of another set, both given as parameters. When I tried to test it the console printed out
scala.MatchError : (List(1, 2, 3, 4, 5, 6, 7),List(1, 2, 3, 4)) (of class scala.Tuple2),
the two lists given are what I was using as parameters to test it. Also, scala was making me type out return in front of true and false, any ideas what led to this either?
def subset(a: List[Int], b: List[Int]): Boolean ={
(a,b) match {
case (_,Nil)=> return true
}
b match {
case h::t if (a.contains(h)) => subset(a,t)
case h::t => return false
}}
The other answers don't really answer exactly why your code is incorrect. It appears that you're handling the case when list b is empty and non-empty and that everything should be okay, but in fact you're actually not. Let's look at your code again, with some formatting fixes.
def subset(a: List[Int], b: List[Int]): Boolean = {
(a, b) match {
case (_, Nil) => return true
} // we can never make it past here, because either we return true,
// or a MatchError is raised.
b match {
case h :: t if (a.contains(h)) => subset(a,t)
case h :: t => return false
}
}
The real problem here is that you have two completely disconnected match statements. So when b is non-empty, the first match will fail, because it only handles the case when b is Nil.
As pointed out in the other solutions, the proper way to do this is to merge the two match statements together into one.
def subset(a: List[Int], b: List[Int]): Boolean = {
(a, b) match {
case (_, Nil) => true
case (xs, head :: tail) if(xs contains head) => subset(xs, tail)
case _ => false
}
}
Notice how the return statements are no longer needed. In scala you should avoid using return as much as possible, as it's likely that your way of thinking around return actually lead you into this trap. Methods that return early are likely to lead to bugs, and are difficult to read.
A cleaner way to implement this could use diff. b can be considered a subset of a if the set of elements of b minus the elements of a is empty.
def subset(a: List[Int], b: List[Int]): Boolean = (b.distinct diff a.distinct).nonEmpty
distinct is only required if it's possible for a and b to contain duplicates, because we're trying a List like a Set when it's actually not.
Better yet, if we convert the Lists to Sets, then we can use subsetOf.
def subset(a: List[Int], b: List[Int]): Boolean = b.toSet.subsetOf(a.toSet)
Scala match expression should match to at least one case expression. Otherwise the MatchError is raised.
You should have used the following cases:
(a, b) match {
case (_, Nil) => true
case (aa, h :: t) if aa contains h => subset(aa, t)
case _ => false
}
An alternative way could be to call methods in standard library.
For each element in 'b', check if 'a' contains that element.
Here is the simple code:
def subset(a: List[Int], b: List[Int]): Boolean = {
(b.forall(a.contains(_)))
}
MatchError - This class implements errors which are thrown whenever an object doesn't match any pattern of a pattern matching expression.
Obviously the second list having elements in it will cause this error, as no pattern will match. You should just add another branch to the first match like so:
def subset(a: List[Int], b: List[Int]): Boolean = {
(a, b) match {
case (_, List()) => return true
case _ => b match {
case h :: t if (a.contains(h)) => subset(a, t)
case h :: t => return false
}
}
}
MatchError occurs whenever an object doesn't match any pattern of a pattern matching expression.
An alternative way is by using dropWhile or takeWhile
def subsets(a:List[Int], b:List[Int]):Boolean = {
return b.dropWhile { ele => a.contains(ele)}.size==0
}
OR
def subsets(a:List[Int], b:List[Int]):Boolean = {
return b.takeWhile { ele => a.contains(ele)}.size==b.size
}

Scala pattern matching on sequences other than Lists

I have the following code which recursively operates on each element within a List
def doMatch(list: List[Int]): Unit = list match {
case last :: Nil => println("Final element.")
case head :: tail => println("Recursing..."); doMatch(tail)
}
Now, ignoring that this functionality is available through filter() and foreach(), this works just fine. However, if I try to change it to accept any Seq[Int], I run into problems:
Seq doesn't have ::, but it does have +:, which as I understand is basically the same thing. If I try to match on head +: tail however, the compiler complains 'error: not found: value +:'
Nil is specific to List, and I'm not sure what to replace it with. I'm going to try Seq() if I ever get past the previous problem
Here is how I think the code should look, except it doesn't work:
def doMatch(seq: Seq[Int]): Unit = seq match {
case last +: Seq() => println("Final element.")
case head +: tail => println("Recursing..."); doMatch(tail)
}
Edit: So many good answers! I'm accepting agilesteel's answer as his was the first that noted that :: isn't an operator in my example, but a case class and hence the difference.
As of the ides of March 2012, this works in 2.10+:
def doMatch(seq: Seq[Int]): Unit = seq match {
case last +: Seq() => println("Final element.")
case head +: tail => println("Recursing..."); doMatch(tail)
} //> doMatch: (seq: Seq[Int])Unit
doMatch(List(1, 2)) //> Recursing...
//| Final element.
More generally, two different head/tail and init/last decomposition objects mirroring append/prepend were added for Seq in SeqExtractors:
List(1, 2) match { case init :+ last => last } //> res0: Int = 2
List(1, 2) match { case head +: tail => tail } //> res1: List[Int] = List(2)
Vector(1, 2) match { case init :+ last => last } //> res2: Int = 2
Vector(1, 2) match { case head +: tail => tail } //> res3: scala.collection.immutable.Vector[Int] = Vector(2)
Kind of cheating, but here it goes:
def doMatch(seq: Seq[Int]): Unit = seq match {
case Seq(x) => println("Final element " + x)
case Seq(x, xs#_*) => println("Recursing..." + x); doMatch(xs)
}
Don't ask me why xs* doesn't work...
There are two :: (pronounced cons) in Scala. One is an operator defined in class List and one is a class (subclass of List), which represents a non empty list characterized by a head and a tail.
head :: tail is a constructor pattern, which is syntactically modified from ::(head, tail).
:: is a case class, which means there is an extractor object defined for it.
You can actually define an object for +: to do exactly what you are looking for:
object +: {
def unapply[T](s: Seq[T]) =
if(s.nonEmpty)
Some(s.head, s.tail)
else
None
}
scala> val h +: t = Seq(1,2,3)
h: Int = 1
t: Seq[Int] = List(2, 3)
Then your code works exactly as expected.
This works because h +: t is equivalent to +:(h,t) when used for patten matching.
I don't think there is pattern matching support for arbitrary sequences in the standard library. You could do it with out pattern matching though:
def doMatch(seq: Seq[Int]) {
if (seq.size == 1) println("final element " + seq(0)) else {
println("recursing")
doMatch(seq.tail)
}
}
doMatch(1 to 10)
You can define your own extractor objects though. See http://www.scala-lang.org/node/112
object SEQ {
def unapply[A](s:Seq[A]):Option[(A, Seq[A])] = {
if (s.size == 0) None else {
Some((s.head, s.tail))
}
}
}
def doMatch(seq: Seq[Int]) {
seq match {
case SEQ(head, Seq()) => println("final")
case SEQ(head, tail) => {
println("recursing")
doMatch(tail)
}
}
}
A simple tranformation from Seq to List would do the job:
def doMatch (list: List[Int]): Unit = list match {
case last :: Nil => println ("Final element.")
case head :: tail => println ("Recursing..."); doMatch (tail)
case Nil => println ("only seen for empty lists")
}
def doMatchSeq (seq: Seq[Int]) : Unit = doMatch (seq.toList)
doMatch (List(3, 4, 5))
doMatchSeq (3 to 5)

Step-by-step explanation of Scala syntax used in Wikipedia quicksort example

I am trying to understand the Scala quicksort example from Wikipedia. How could the sample be disassembled step by step and what does all the syntactic sugar involved mean?
def qsort: List[Int] => List[Int] = {
case Nil => Nil
case pivot :: tail =>
val (smaller, rest) = tail.partition(_ < pivot)
qsort(smaller) ::: pivot :: qsort(rest)
}
As much as I can gather at this stage qsort is a function that takes no parameters and returns a new Function1[List[Int],List[Int]] that implements quicksort through usage of pattern matching, list manipulation and recursive calls. But I can't quite figure out where the pivot comes from, and how exactly the pattern matching syntax works in this case.
UPDATE:
Thanks everyone for the great explanations!
I just wanted to share another example of quicksort implementation which I have discovered in the Scala by Example by Martin Odersky. Although based around arrays instead of lists and less of a show-off in terms of varios Scala features I personally find it much less convoluted than its Wikipedia counterpart, and just so much more clear and to the point expression of the underlying algorithm:
def sort(xs: Array[Int]): Array[Int] = {
if (xs.length <= 1) xs
else {
val pivot = xs(xs.length / 2)
Array.concat(
sort(xs filter (pivot >)),
xs filter (pivot ==),
sort(xs filter (pivot <)))
}
}
def qsort: List[Int] => List[Int] = {
case Nil => Nil
case pivot :: tail =>
val (smaller, rest) = tail.partition(_ < pivot)
qsort(smaller) ::: pivot :: qsort(rest)
}
let's pick apart a few bits.
Naming
Operators (such as * or +) are valid candidates for method and class names in Scala (hence you can have a class called :: (or a method called :: for that matter - and indeed both exist). Scala appears to have operator-overloading but in fact it does not: it's merely that you can declare a method with the same name.
Pattern Matching
target match {
case p1 =>
case p2 =>
}
Where p1 and p2 are patterns. There are many valid patterns (you can match against Strings, types, particular instances etc). You can also match against something called an extractor. An extractor basically extracts arguments for you in the case of a match, so:
target match {
case MyExtractor(arg1, arg2, arg3) => //I can now use arg1, arg2 etc
}
In scala, if an extractor (of which a case class is an example) exists called X, then the pattern X(a, b) is equivalent to a X b. The case class :: has a constructor taking 2 arguments and putting this together we get that:
case x :: xs =>
case ::(x, xs) =>
Are equivalent. This match says "if my List is an instance of :: extract the value head into x and tail into xs". pattern-matching is also used in variable declaration. For example, if p is a pattern, this is valid:
val p = expression
This why we can declare variables like:
val x :: xs = List(1, 2, 3)
val (a, b) = xs.partition(_ % 2 == 0 ) //returns a Tuple2 which is a pattern (t1, t2)
Anonymous Functions
Secondly we have a function "literal". tail is an instance of List which has a method called partition which takes a predicate and returns two lists; one of those entries satisfying the predicate and one of those entries which did not.
val pred = (el: Int) => e < 2
Declares a function predicate which takes an Int and returns true iff the int value is less than 2. There is a shorthand for writing functions inline
tail.partition(_ < pivot) // _ is a placeholder for the parameter
tail.partition( (e: Int) => e < pivot )
These two expressions mean the same thing.
Lists
A List is a sealed abstract class with only two implementations, Nil (the empty list) and :: (also called cons), which is a non-empty list consisting of a head and a tail (which is also a list). You can now see that the pattern match is a match on whether the list is empty or not. a List can be created by cons-ing it to other lists:
val l = 1 :: 2 :: Nil
val m = List(1, 2, 3) ::: List(4, 5, 6)
The above lines are simply method calls (:: is a valid method name in scala). The only difference between these and normal method calls is that, if a method end in a colon : and is called with spaces, the order of target and parameter is reversed:
a :: b === b.::(a)
Function Types
val f: A => B
the previous line types the reference f as a function which takes an A and returns a B, so I could then do:
val a = new A
val b: B = f(a)
Hence you can see that def qsort: List[Int] => List[Int] declares a method called qsort which returns a function taking a List[Int] and returning a List[Int]. So I could obviously do:
val l = List(2, 4, 1)
val m = qsort.apply(l) //apply is to Function what run is to Runnable
val n = qsort(l) //syntactic sugar - you don't have to define apply explicitly!
Recursion
When a method call is tail recursive, Scala will optimize this into the iterator pattern. There was a msitake in my original answer because the qsort above is not tail-recursive (the tail-call is the cons operator)
def qsort: List[Int] => List[Int] = {
case Nil => Nil
case pivot :: tail =>
val (smaller, rest) = tail.partition(_ < pivot)
qsort(smaller) ::: pivot :: qsort(rest)
}
Let's rewrite that. First, replace the function literal with an instance of Function1:
def qsort: List[Int] => List[Int] = new Function1[List[Int], List[Int]] {
def apply(input: List[Int]): List[Int] = input match {
case Nil => Nil
case pivot :: tail =>
val (smaller, rest) = tail.partition(_ < pivot)
qsort(smaller) ::: pivot :: qsort(rest)
}
}
Next, I'm going to replace the pattern match with equivalent if/else statements. Note that they are equivalent, not the same. The bytecode for pattern matches are more optimized. For instance, the second if and the exception throwing below do not exist, because the compile knows the second match will always happen if the first fails.
def qsort: List[Int] => List[Int] = new Function1[List[Int], List[Int]] {
def apply(input: List[Int]): List[Int] = if (input == Nil) {
Nil
} else if (input.isInstanceOf[::[_]] &&
scala.collection.immutable.::.unapply(input.asInstanceOf[::[Int]]) != None) {
val unapplyResult = scala.collection.immutable.::.unapply(input.asInstanceOf[::[Int]]).get
val pivot = unapplyResult._1
val tail = unapplyResult._2
val (smaller, rest) = tail.partition(_ < pivot)
qsort(smaller) ::: pivot :: qsort(rest)
} else {
throw new scala.MatchError(input)
}
}
Actually, val (smaller, rest) is pattern match as well, so Let's decompose it as well:
def qsort: List[Int] => List[Int] = new Function1[List[Int], List[Int]] {
def apply(input: List[Int]): List[Int] = if (input == Nil) {
Nil
} else if (input.isInstanceOf[::[_]] &&
scala.collection.immutable.::.unapply(input.asInstanceOf[::[Int]]) != None) {
val unapplyResult0 = scala.collection.immutable.::.unapply(input.asInstanceOf[::[Int]]).get
val pivot = unapplyResult0._1
val tail = unapplyResult0._2
val tmp0 = tail.partition(_ < pivot)
if (Tuple2.unapply(tmp0) == None)
throw new scala.MatchError(tmp0)
val unapplyResult1 = Tuple2.unapply(tmp0).get
val smaller = unapplyResult1._1
val rest = unapplyResult1._2
qsort(smaller) ::: pivot :: qsort(rest)
} else {
throw new scala.MatchError(input)
}
}
Obviously, this is highly unoptmized. Even worse, there are some function calls being done more than once, which doesn't happen in the original. Unfortunately, to fix that would require some structural changes to the code.
There's still some syntactic sugar here. There is an anonymous function being passed to partition, and there is the syntactic sugar for calling functions. Rewriting those yields the following:
def qsort: List[Int] => List[Int] = new Function1[List[Int], List[Int]] {
def apply(input: List[Int]): List[Int] = if (input == Nil) {
Nil
} else if (input.isInstanceOf[::[_]] &&
scala.collection.immutable.::.unapply(input.asInstanceOf[::[Int]]) != None) {
val unapplyResult0 = scala.collection.immutable.::.unapply(input.asInstanceOf[::[Int]]).get
val pivot = unapplyResult0._1
val tail = unapplyResult0._2
val func0 = new Function1[Int, Boolean] {
def apply(input: Int): Boolean = input < pivot
}
val tmp0 = tail.partition(func0)
if (Tuple2.unapply(tmp0) == None)
throw new scala.MatchError(tmp0)
val unapplyResult1 = Tuple2.unapply(tmp0).get
val smaller = unapplyResult1._1
val rest = unapplyResult1._2
qsort.apply(smaller) ::: pivot :: qsort.apply(rest)
} else {
throw new scala.MatchError(input)
}
}
For once, the extensive explanations about each syntactic sugar and how it works are being done by others. :-) I hope this complements their answers. Just as a final note, the following two lines are equivalent:
qsort(smaller) ::: pivot :: qsort(rest)
qsort(rest).::(pivot).:::(qsort(smaller))
The pivot in this pattern matching example is the first element of the list:
scala> List(1,2,3) match {
| case x :: xs => println(x)
| case _ => println("empty")
| }
1
The pattern matching is based on extractors and the cons is not part of the language. It uses the infix syntax. You can also write
scala> List(1,2,3) match {
| case ::(x,xs) => println(x)
| case _ => println("empty")
| }
1
as well. So there is a type :: that looks like the cons operator. This type defines how it is extracted:
final case class ::[B](private var hd: B, private[scala] var tl: List[B]){ ... }
It's a case class so the extractor will be generated by the Scala compiler. Like in this example class A.
case class A(x : Int, y : Int)
A(1,2) match { case x A y => printf("%s %s", x, y)}
-> 1 2
Based on this machinary patterns matching is supported for Lists, Regexp and XML.