Find all indices of pattern in string? - scala

Here is the code I have which looks ugly because it uses two vars.
def patternMatching(pattern: String, genome: String): List[Int] = {
  assert(pattern.length < genome.length)
  var curr = 0
  var r = List[Int]()
  while (curr != -1) {
    curr = genome.indexOf(pattern, curr)
    if (curr != -1) {
      r ::= curr
      curr += 1
    }
  }
  r.reverse
}
How do you write this in a functional way?

It's quite straightforward:
0.until(genome.length).filter(genome.startsWith(pattern, _))
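The filter above tests every position with startsWith. If you would rather keep the indexOf jumps of the original loop while still dropping both vars, a minimal sketch could look like this. The name patternIndices is only for illustration, and the sketch assumes a non-empty pattern:

```scala
// Sketch: walk the genome with indexOf, restarting one character past each hit.
// Assumes pattern is non-empty (an empty pattern would never produce -1).
def patternIndices(pattern: String, genome: String): List[Int] =
  Iterator
    .iterate(genome.indexOf(pattern))(i => genome.indexOf(pattern, i + 1))
    .takeWhile(_ != -1)
    .toList
```

Because the search restarts at i + 1 rather than past the end of the match, overlapping occurrences are reported too, just as in the original while loop.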

You could use the unfold method from Scalaz like this:
import scalaz._, Scalaz._
def patternIndexes(pattern: String, genome: String) = unfold(0) {
  genome.indexOf(pattern, _) match {
    case -1 => None
    case n => (n, n + 1).some
  }
}
Usage:
scala> patternIndexes("a", "aba").toList
res0: List[Int] = List(0, 2)
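If you are on Scala 2.13 or later, the standard library has its own unfold, so the same idea works without Scalaz. This is a sketch under that version assumption; it also assumes a non-empty pattern:

```scala
// Requires Scala 2.13+ (List.unfold is not available in earlier versions).
// The state is the index to search from; each hit emits its position
// and continues the search one character later.
def patternIndexes(pattern: String, genome: String): List[Int] =
  List.unfold(0) { from =>
    genome.indexOf(pattern, from) match {
      case -1 => None
      case n  => Some((n, n + 1))
    }
  }
```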

There is a simpler, idiomatic Scala solution that neither tests the pattern at every position explicitly nor needs a third-party library. Note that it interprets pattern as a regular expression and reports only non-overlapping matches:
def patternMatching(pattern: String, genome: String): List[Int] =
  pattern.r.findAllMatchIn(genome).map(_.start).toList
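Two things are worth checking before relying on the regex version: the pattern is compiled as a regular expression, and matches do not overlap, which differs from the indexOf loop in the question. A hedged sketch of both variants follows; the function names are made up for this example:

```scala
import scala.util.matching.Regex

// Regex.quote makes the pattern match literally, so "." or "*" are not special.
def nonOverlapping(pattern: String, genome: String): List[Int] =
  Regex.quote(pattern).r.findAllMatchIn(genome).map(_.start).toList

// A zero-width lookahead matches at every position where the pattern starts,
// so overlapping occurrences are reported as well.
def overlapping(pattern: String, genome: String): List[Int] =
  ("(?=" + Regex.quote(pattern) + ")").r.findAllMatchIn(genome).map(_.start).toList
```

For example, nonOverlapping("aa", "aaaa") gives List(0, 2), while overlapping("aa", "aaaa") gives List(0, 1, 2).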

If you also need the end position of each match, you could use:
import scala.util.matching.Regex
def patternMatchingIndex(pattern: Regex, text: String): List[(Int, Int)] =
  pattern.findAllMatchIn(text).map(index => (index.start, index.end)).toList


Scala apply function on list elements to return a value

I found this answer on the forum earlier, to a question I had been looking for: How to find a matching element in a list and map it in as a Scala API method?
// Returns Some(66)
List(1, 2, 3) collectFirst { case i if (i * 33 % 2 == 0) => i * 33 }
Now if I replace the if clause with a function:
List(1, 2, 3) collectFirst { case i if (test(i) > 0) => test(i) }
This works, but test() is evaluated twice. Is there a better way to apply a function to a list and return a result once a condition is met, without going through all the elements and without calling the function twice (once for the check and once to return the value)?
Something like this, perhaps?
Breaking it into two separate operations lets you save/reuse intermediate results.
List(1,2,3).iterator.map(test).find(_ > 0)
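To see that the iterator version stops early and calls the function once per visited element, here is a small sketch with a call counter. The helper firstPositive and the body of test are invented for the demonstration:

```scala
// Returns the first positive result of test together with the number of
// times test was called, to make the laziness visible.
def firstPositive(xs: List[Int]): (Option[Int], Int) = {
  var calls = 0
  def test(i: Int): Int = { calls += 1; i - 1 }
  // iterator.map is lazy, so find pulls elements one at a time and stops
  // at the first match
  val result = xs.iterator.map(test).find(_ > 0)
  (result, calls)
}
```

With List(1, 10, 20) the result is Some(9) after two calls: one for 1 (which yields 0) and one for 10. The element 20 is never touched.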
You can wrap your function in a custom extractor:
def test(i: Int): Int = i - 1
object Test {
  def unapply(i: Int): Option[Int] = Some(test(i))
}
scala> List(1, 10, 20) collectFirst { case Test(i) if i > 0 => i }
res0: Option[Int] = Some(9)
You can generalize this solution and make a class for that kind of extractors:
case class Extract[T, U](f: T => U) {
  def unapply(t: T): Option[U] = Some(f(t))
}
scala> val Test2 = Extract(test)
Test2: Extract[Int,Int] = Extract($$Lambda$1326/1843609566@69c33ea2)
scala> List(1, 10, 20) collectFirst { case Test2(i) if i > 0 => i }
res1: Option[Int] = Some(9)
You can also fold the guard into the extractor:
case class ExtractWithGuard[T, U](f: T => U)(pred: U => Boolean) {
  def unapply(t: T): Option[U] = {
    val u = f(t)
    if (pred(u)) Some(u)
    else None
  }
}
scala> val Test3 = ExtractWithGuard(test)(_ > 0)
Test3: ExtractWithGuard[Int,Int] = ExtractWithGuard($$Lambda$1327/391731126@591a4d25)
scala> List(1, 10, 20) collectFirst { case Test3(i) => i }
res2: Option[Int] = Some(9)

Structure to allow @tailrec when multiple recursive calls are invoked

The following logic identifies the combination of integers summing to n that produces the maximum product:
def bestProd(n: Int) = {
  type AType = (Vector[Int], Long)
  import annotation._
  // @tailrec (umm .. nope ..)
  def bestProd0(n: Int, accum: AType): AType = {
    if (n <= 1) accum
    else {
      var cmax = accum
      for (k <- 2 to n) {
        val tmpacc = bestProd0(n - k, (accum._1 :+ k, accum._2 * k))
        if (tmpacc._2 > cmax._2) {
          cmax = tmpacc
        }
      }
      cmax
    }
  }
  bestProd0(n, (Vector(), 1))
}
This code does work:
scala> bestProd(11)
res22: (Vector[Int], Long) = (Vector(2, 3, 3, 3),54)
It was no surprise to me that @tailrec did not work. After all, the recursive invocation is not in tail position. Is it possible to reformulate the for loop as a proper single call to achieve tail recursion?
I don't think it's possible if you're trying to stick close to the stated algorithm. Rethinking the approach you could do something like this.
import scala.annotation.tailrec
def bestProd1(n: Int) = {
  @tailrec
  def nums(acc: Vector[Int]): Vector[Int] = {
    if (acc.head > 4)
      nums((acc.head - 3) +: 3 +: acc.tail)
    else
      acc
  }
  val result = nums(Vector(n))
  (result, result.product)
}
It comes up with the same results (as far as I can tell), except that it doesn't split 4 into 2, 2.
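If you do want to keep the exhaustive search from the question, one general technique is to replace the call stack with an explicit work list, so that the only recursive call is in tail position. The following is a sketch of that idea, not a claim about how it should be done; bestProdStack, loop and pending are names chosen for this example, and the search is still exponential, just like the original:

```scala
import scala.annotation.tailrec

def bestProdStack(n: Int): (Vector[Int], Long) = {
  type AType = (Vector[Int], Long)

  // pending holds (remaining sum, accumulated parts and product) pairs that
  // still have to be explored; best is the largest product seen so far.
  @tailrec
  def loop(pending: List[(Int, AType)], best: AType): AType = pending match {
    case Nil => best
    case (rest, acc) :: tail =>
      if (rest <= 1) loop(tail, if (acc._2 > best._2) acc else best)
      else {
        // Push one work item per choice of k, in the same order as the for loop
        val children = (2 to rest).toList.map(k => (rest - k, (acc._1 :+ k, acc._2 * k)))
        loop(children ::: tail, best)
      }
  }

  val start: AType = (Vector.empty[Int], 1L)
  loop(List((n, start)), start)
}
```

Since the children are pushed in ascending k and the comparison is strict, candidates are visited in the same depth-first order as in the question's code, so bestProdStack(11) also yields (Vector(2, 3, 3, 3), 54).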

Min/max with Option[T] for possibly empty Seq?

I'm doing a bit of Scala gymnastics where I have Seq[T] in which I try to find the "smallest" element. This is what I do right now:
val leastOrNone = seq.reduceOption { (best, current) =>
  if (current.something < best.something) current
  else best
}
It works fine, but I'm not quite satisfied - it's a bit long for such a simple thing, and I don't care much for "if"s. Using minBy would be much more elegant:
val least = seq.minBy(_.something)
... but min and minBy throw exceptions when the sequence is empty. Is there an idiomatic, more elegant way of finding the smallest element of a possibly empty list as an Option?
seq.reduceOption(_ min _)
does what you want?
Edit: Here's an example incorporating your _.something:
case class Foo(a: Int, b: Int)
val seq = Seq(Foo(1,1),Foo(2,0),Foo(0,3))
val ord = Ordering.by((_: Foo).b)
seq.reduceOption(ord.min) //Option[Foo] = Some(Foo(2,0))
or, as generic method:
def minOptionBy[A, B: Ordering](seq: Seq[A])(f: A => B) =
seq reduceOption Ordering.by(f).min
which you could invoke with minOptionBy(seq)(_.something)
Starting with Scala 2.13, minByOption and maxByOption are part of the standard library and return None if the sequence is empty:
seq.minByOption(_.something)
List((3, 'a'), (1, 'b'), (5, 'c')).minByOption(_._1) // Option[(Int, Char)] = Some((1,b))
List[(Int, Char)]().minByOption(_._1) // Option[(Int, Char)] = None
A safe, compact and O(n) version with Scalaz:
xs.nonEmpty option xs.minBy(_.foo)
Hardly an option for any larger list due to O(nlogn) complexity:
seq.sortBy(_.something).headOption
You can also do it like this:
Some(seq).filter(_.nonEmpty).map(_.minBy(_.something))
How about this?
import util.control.Exception._
allCatch opt seq.minBy(_.something)
Or, more verbose, if you don't want to swallow other exceptions:
catching(classOf[UnsupportedOperationException]) opt seq.minBy(_.something)
Alternatively, you can pimp all collections with something like this:
import collection._
class TraversableOnceExt[CC, A](coll: CC, asTraversable: CC => TraversableOnce[A]) {
  def minOption(implicit cmp: Ordering[A]): Option[A] = {
    val trav = asTraversable(coll)
    if (trav.isEmpty) None
    else Some(trav.min)
  }
  def minOptionBy[B](f: A => B)(implicit cmp: Ordering[B]): Option[A] = {
    val trav = asTraversable(coll)
    if (trav.isEmpty) None
    else Some(trav.minBy(f))
  }
}
implicit def extendTraversable[A, C[A] <: TraversableOnce[A]](coll: C[A]): TraversableOnceExt[C[A], A] =
  new TraversableOnceExt[C[A], A](coll, identity)
implicit def extendStringTraversable(string: String): TraversableOnceExt[String, Char] =
  new TraversableOnceExt[String, Char](string, implicitly)
implicit def extendArrayTraversable[A](array: Array[A]): TraversableOnceExt[Array[A], A] =
  new TraversableOnceExt[Array[A], A](array, implicitly)
And then just write seq.minOptionBy(_.something).
I had the same problem before, so I extended Ordered and implemented the compare function.
Here is an example:
import scala.math.{pow, sqrt}
import scala.util.Try
case class Point(longitude0: String, latitude0: String) extends Ordered[Point] {
  def this(point: Point) = this(point.original_longitude, point.original_latitude)
  val original_longitude = longitude0
  val original_latitude = latitude0
  val longitude = parseDouble(longitude0).get
  val latitude = parseDouble(latitude0).get
  override def toString: String = "longitude: " + original_longitude + ", latitude: " + original_latitude
  def parseDouble(s: String): Option[Double] = Try(s.toDouble).toOption
  def distance(other: Point): Double =
    sqrt(pow(longitude - other.longitude, 2) + pow(latitude - other.latitude, 2))
  // Order by longitude first, then by latitude; equal coordinates compare as 0
  override def compare(that: Point): Int = {
    val byLongitude = java.lang.Double.compare(longitude, that.longitude)
    if (byLongitude != 0) byLongitude
    else java.lang.Double.compare(latitude, that.latitude)
  }
}
So if I have a Seq of Point, I can call its max or min method:
var points = Seq[Point]()
val maxPoint = points.max
val minPoint = points.min
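Note that min and max still throw on an empty Seq, so an ordering on its own does not answer the Option part of the question. A minimal sketch that combines a custom ordering with reduceOption might look like this; Coord, coordOrdering and minOpt are hypothetical names used only here, with plain Double fields instead of parsed strings:

```scala
case class Coord(longitude: Double, latitude: Double)

// Order by longitude first, then by latitude
implicit val coordOrdering: Ordering[Coord] = new Ordering[Coord] {
  def compare(a: Coord, b: Coord): Int = {
    val byLongitude = java.lang.Double.compare(a.longitude, b.longitude)
    if (byLongitude != 0) byLongitude
    else java.lang.Double.compare(a.latitude, b.latitude)
  }
}

// reduceOption returns None for an empty sequence instead of throwing
def minOpt(cs: Seq[Coord]): Option[Coord] =
  cs.reduceOption((a, b) => coordOrdering.min(a, b))

val coords = Seq(Coord(2.0, 1.0), Coord(1.0, 5.0), Coord(1.0, 3.0))
```

Here minOpt(coords) is Some(Coord(1.0, 3.0)) and minOpt(Seq.empty) is None, while coords.min and coords.max keep working through the implicit ordering.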
You could always do something like:
case class Foo(num: Int)
val foos: Seq[Foo] = Seq(Foo(1), Foo(2), Foo(3))
val noFoos: Seq[Foo] = Seq.empty
def minByOpt(foos: Seq[Foo]): Option[Foo] =
  foos.foldLeft(None: Option[Foo]) { (acc, elem) =>
    Option((elem +: acc.toSeq).minBy(_.num))
  }
Then use like:
scala> minByOpt(foos)
res0: Option[Foo] = Some(Foo(1))
scala> minByOpt(noFoos)
res1: Option[Foo] = None
For Scala < 2.13:
Try(seq.minBy(_.something)).toOption
For Scala 2.13:
seq.minByOption(_.something)
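One caveat with the Try variant: it also turns exceptions thrown by the key function into None, not only the empty-sequence case. The sketch below illustrates the difference; Item, items and the generic minByOpt helper are made up for this example:

```scala
import scala.util.Try

case class Item(weight: String)
val items = Seq(Item("3"), Item("oops"))

// The sequence is not empty, yet the result is None because "oops".toInt threw
val hidden: Option[Item] = Try(items.minBy(_.weight.toInt)).toOption

// An explicit emptiness check only covers the empty case and lets
// other failures propagate to the caller
def minByOpt[A, B: Ordering](xs: Seq[A])(f: A => B): Option[A] =
  if (xs.isEmpty) None else Some(xs.minBy(f))
```

With the explicit check, minByOpt(items)(_.weight.toInt) throws a NumberFormatException instead of silently returning None, which is usually easier to debug.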
In Haskell you'd wrap the minimumBy call as
least f x | Seq.null x = Nothing
          | otherwise = Just (Seq.minimumBy f x)

Scala reverse string

I'm new to Scala and am writing a simple function to reverse a given string:
def reverse(s: String) : String
for(i <- s.length - 1 to 0) yield s(i)
The yield gives back a scala.collection.immutable.IndexedSeq[Char], which can't be converted to a String (or is the problem something else?).
How do I write this function?
Note that there is already a built-in method for this:
scala> val x = "scala is awesome"
x: java.lang.String = scala is awesome
scala> x.reverse
res1: String = emosewa si alacs
But if you want to do that by yourself:
def reverse(s: String) : String =
(for(i <- s.length - 1 to 0 by -1) yield s(i)).mkString
or (sometimes it is better to use until, but probably not in that case)
def reverse(s: String) : String =
(for(i <- s.length until 0 by -1) yield s(i-1)).mkString
Also, note that if you count downwards (from a larger value to a smaller one), you must specify a negative step or you will get an empty result:
scala> for(i <- x.length until 0) yield i
res2: scala.collection.immutable.IndexedSeq[Int] = Vector()
scala> for(i <- x.length until 0 by -1) yield i
res3: scala.collection.immutable.IndexedSeq[Int] = Vector(16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
Here's a short version
def reverse(s: String) = ("" /: s)((a, x) => x + a)
Edit: or even shorter, we have the fantastically cryptic
def reverse(s: String) = ("" /: s)(_.+:(_))
but I wouldn't really recommend this...
You could also write this using a recursive approach (throwing this one in just for fun)
def reverse(s: String): String = {
  if (s.isEmpty) ""
  else reverse(s.tail) + s.head
}
As indicated by om-nom-nom, pay attention to the by -1 (otherwise you are not really iterating and your result will be empty). The other trick you can use is collection.breakOut.
It can also be provided to the for comprehension like this:
def reverse(s: String): String =
(for(i <- s.length - 1 to 0 by -1) yield s(i))(collection.breakOut)
reverse("foo")
// String = oof
The benefit of using breakOut is that it avoids creating an intermediate structure, as happens in the mkString solution.
Note: breakOut relies on CanBuildFrom and builders, which are part of the foundation of the redesigned collection library introduced in Scala 2.8.0. Both breakOut and CanBuildFrom were removed in Scala 2.13, so this trick only works up to Scala 2.12.
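On versions where breakOut is not available, a similar effect can be had by mapping over an iterator, so that mkString consumes the characters directly without an intermediate collection of them. A sketch, with reverseViaIterator as an illustrative name:

```scala
// The iterator yields characters lazily from the last index down to 0,
// and mkString appends them straight into the resulting String.
def reverseViaIterator(s: String): String =
  (s.length - 1 to 0 by -1).iterator.map(i => s.charAt(i)).mkString
```

For the empty string the range is empty, so the result is "" as well.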
All the above answers are correct and here's my take:
scala> val reverseString = (str: String) => str.foldLeft("")((accumulator, nextChar) => nextChar + accumulator)
reverseString: String => java.lang.String = <function1>
scala> reverseString.apply("qwerty")
res0: java.lang.String = ytrewq
def rev(s: String): String = {
  val str = s.toList
  def f(s: List[Char], acc: List[Char]): List[Char] = s match {
    case Nil => acc
    case x :: xs => f(xs, x :: acc)
  }
  f(str, Nil).mkString
}
Here is my version of reversing a string.
scala> val sentence = "apple"
sentence: String = apple
scala> sentence.map(x => x.toString).reduce((x, y) => (y + x))
res9: String = elppa
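One thing to watch with the reduce version: reduce throws an UnsupportedOperationException on an empty collection, so it fails for the empty string. A small variation using reduceOption handles that case; reverseSafe is just a name for this sketch:

```scala
// reduceOption returns None for an empty input instead of throwing,
// and getOrElse maps that back to the empty string.
def reverseSafe(s: String): String =
  s.map(_.toString).reduceOption((x, y) => y + x).getOrElse("")
```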

In Scala, is there a way to get the currently evaluated items in a Stream?

In Scala, is there a way to get the currently evaluated items in a Stream? For example in the Stream
val s: Stream[Int] = Stream.cons(1, Stream.cons(2, Stream.cons(3, s.map(_+1))))
the method should return only List(1,2,3).
In 2.8, there is a protected method called tailDefined that will return false when you get to the point in the stream that has not yet been evaluated.
This isn't too useful (unless you want to write your own Stream class) except that Cons itself makes the method public. I'm not sure why it's protected in Stream and not in Cons--I would think one or the other might be a bug. But for now, at least, you can write a method like so (writing a functional equivalent is left as an exercise to the reader):
def streamEvalLen[T](s: Stream[T]) = {
  if (s.isEmpty) 0
  else {
    var i = 1
    var t = s
    while (t match {
      case c: Stream.Cons[_] => c.tailDefined
      case _ => false
    }) {
      i += 1
      t = t.tail
    }
    i
  }
}
Here you can see it in action:
scala> val s = Stream.iterate(0)(_+1)
s: scala.collection.immutable.Stream[Int] = Stream(0, ?)
scala> streamEvalLen(s)
res0: Int = 1
scala> s.take(3).toList
res1: List[Int] = List(0, 1, 2)
scala> s
res2: scala.collection.immutable.Stream[Int] = Stream(0, 1, 2, ?)
scala> streamEvalLen(s)
res3: Int = 3
The solution based on Rex's answer:
import scala.annotation.tailrec
import scala.collection.immutable.Stream.{Cons, Empty}
def evaluatedItems[T](stream: => Stream[T]): List[T] = {
  @tailrec
  def inner(s: => Stream[T], acc: List[T]): List[T] = s match {
    case Empty => acc
    case c: Cons[T] => if (c.tailDefined) {
      inner(c.tail, acc ++ List(c.head))
    } else { acc ++ List(c.head) }
  }
  inner(stream, List())
}
Type that statement into the interactive shell and you will see that it evaluates to s: Stream[Int] = Stream(1, ?). So, in fact, the other two elements of 2 and 3 are not yet known.
As you access further elements, more of the stream is calculated. So, now put s(3) into the shell, which will return res0: Int = 2. Now put s into the shell and you will see the new value res1: Stream[Int] = Stream(1, 2, 3, 2, ?).
The only method I could find that contained the information that you wanted was, unfortunately, s.toString. With some parsing you will be able to get the elements back out of the string. This is a barely acceptable solution with just ints and I couldn't imagine any generic solution using the string parsing idea.
Using scanLeft
lazy val s: Stream[Int] = 1 #:: s.scanLeft(2) { case (a, _) => 1 + a }