Generic column access for scala matrix - scala

I have code that stores matrices of different types, e.g. m1: Array[Array[Double]], m2: List[List[Int]]. As seen, these matrices are all stored as as a sequence of rows. Any row is easy to retrieve but columns seem to me to require traversal of the matrix. I'd like to write a very generic function that returns a column from a matrix of either of these types. I've written this in many ways, the latest of which is:
/* Get a column of any matrix stored in rows */
private def column(M: Seq[Seq[Any]], n: Int, c: Seq[Any] = List(),
i: Int = 0): List[Any] = {
if (i != M.size) column(M, n, c :+ M(i)(n), i+1) else c.toList
This compiles however it doesn't work: I get a type mismatch when I try to pass in an Array[Array[Double]]. I've tried to write this with some view bounds as well i.e.
private def column[T1 <% Seq[Any], T2 <% Seq[T1]] ...
But this wasn't fruitful either. How come the first code segment I wrote doesn't work? What is the best way to do this?

import collection.generic.CanBuildFrom
def column[T, M[_]](xss: M[M[T]], c: Int)(
implicit cbf: CanBuildFrom[Nothing, T, M[T]],
mm2s: M[M[T]] => Seq[M[T]],
m2s: M[T] => Seq[T]
): M[T] = {
val bf = cbf()
for (xs <- mm2s(xss)) { bf += m2s(xs).apply(c) }
bf.result
}

If you don't care about the return type, this is a really simple way to do it:
def column[A, M[_]](matrix: M[M[A]], colIdx: Int)
(implicit v1: M[M[A]] => Seq[M[A]], v2: M[A] => Seq[A]): Seq[A] =
matrix.map(_(colIdx))

I suggest you represent a Matrix as an underlying single-dimensional Array (the only kind of Array there is!) and separately represent its structure in terms of rows and columns.
This gives you more flexibility both in representation and access. E.g., you can provide both row-major and column-major organizations. Producing row iterators is just as easy as producing column iterators, regardless of whether it's a row-major or column-major organization.

Try this one:
private def column[T](
M: Seq[Seq[T]], n: Int, c: Seq[T] = List(), i: Int = 0): List[T] =
if (i != M.size) column(M, n, c :+ M(i)(n), i+1) else c.toList

Related

How to search efficiently in a nested collection in a functional way

I'd like to find the indices (coordinates) of the first element whose value is 4, in a nested Vector of Int, in a functional way.
val a = Vector(Vector(1,2,3), Vector(4,5), Vector(3,8,4))
a.map(_.zipWithIndex).zipWithIndex.collect{
case (col, i) =>
col.collectFirst {
case (num, index) if num == 4 =>
(i, index)
}
}.collectFirst {
case Some(x) ⇒ x
}
It returns:
Some((0, 1))
the coordinate of the first 4 occurrence.
This solution is quite simple, but it has a performance penalty, because the nested col.collect is performed for all the elements of the top Vector, when we are only interested in the 1st match.
One possible solution is to write a guard in the pattern matching. But I don't know how to write a guard based in a slow condition, and return something that has already been calculated in the guard.
Can it be done better?
Recursive maybe?
If you insist on using Vectors, something like this will work (for a non-indexed seq, you'd need a different approach):
#tailrec
findit(
what: Int,
lists: IndexedSeq[IndexedSeq[Int]],
i: Int = 0,
j: Int = 0
): Option[(Int, Int)] =
if(i >= lists.length) None
else if(j >= lists(i).length) findit(what, lists, i+1, 0)
else if(lists(i)(j) == what) Some((i,j))
else findit(what, lists, i, j+1)
A simple thing you can to without changing the algorithm is to use Scala streams to be able to exit as soon as you find the match. Streams are lazily evaluated as opposed to sequences.
Just make a change similar to this
a.map(_.zipWithIndex.toStream).zipWithIndex.toStream.collect{ ...
In terms of algorithmic changes, if you can somehow have your data sorted (even before you start to search) then you can use Binary search instead of looking at each element.
import scala.collection.Searching._
val dummy = 123
implicit val anOrdering = new Ordering[(Int, Int, Int)]{
override def compare(x: (Int, Int, Int), y: (Int, Int, Int)): Int = Integer.compare(x._1, y._1)
}
val seqOfIntsWithPosition = a.zipWithIndex.flatMap(vectorWithIndex => vectorWithIndex._1.zipWithIndex.map(intWithIndex => (intWithIndex._1, vectorWithIndex._2, intWithIndex._2)))
val sorted: IndexedSeq[(Int, Int, Int)] = seqOfIntsWithPosition.sortBy(_._1)
val element = sorted.search((4, dummy, dummy))
This code is not very pretty or readable, I just quickly wanted to show an example of how it could be done.

Converting vector of vectors to Matrix in scala

What is the most efficient way to convert breeze.linalg.Vector[breeze.linalg.Vector[Double]] to a DenseMatrix?
I tried using asDenseMatrix, toBreezeMatrix, creating a new DenseMatrix etc but it seems like I am missing the most simple and obvious way to do this.
Not real pretty, but this will work and is probably fairly efficient:
val v: Vector[Vector[Double]] = ???
val matrix = DenseMatrix(v.valuesIterator.map(_.valuesIterator.toArray).toSeq: _*)
You could make this a bit nicer by defining an implicit LiteralRow for subclasses of Vector like so:
implicit def vectorLiteralRow[E, V](implicit ev: V <:< Vector[E]) = new LiteralRow[V, E] {
def foreach[X](row: V, fn: (Int, E) => X): Unit = row.foreachPair(fn)
def length(row: V) = row.length
}
Now with this implicit in scope you could use
val matrix = DenseVector(v.toArray: _*)
It seems pretty natural to construct a matrix from its row vectors, so I'm not sure why the breeze library doesn't define implcit LiteralRows for subclasses of Vector. Maybe someone with more knowledge of the breeze library could comment on this.

Correct way to work with two instances of Option together

When I have one Option[T] instance it is quite easy to perform any operation on T using monadic operations such as map() and flatMap(). This way I don't have to do checks to see whether it is defined or empty, and chain operations together to ultimately get an Option[R] for the result R.
My difficulty is whether there is a similar elegant way to perform functions on two Option[T] instances.
Lets take a simple example where I have two vals, x and y of type Option[Int]. And I want to get the maximum of them if they are both defined, or the one that is defined if only one is defined, and None if none are defined.
How would one write this elegantly without involving lots of isDefined checks inside the map() of the first Option?
You can use something like this:
def optMax(op1:Option[Int], op2: Option[Int]) = op1 ++ op2 match {
case Nil => None
case list => list.max
}
Or one much better:
def f(vars: Option[Int]*) = (for( vs <- vars) yield vs).max
#jwvh,thanks for a good improvement:
def f(vars: Option[Int]*) = vars.max
Usually, you'll want to do something if both values are defined.
In that case, you could use a for-comprehension:
val aOpt: Option[Int] = getIntOpt
val bOpt: Option[Int] = getIntOpt
val maxOpt: Option[Int] =
for {
a <- aOpt
b <- bOpt
} yield max(a, b)
Now, the problem you described is not as common. You want to do something if both values are defined, but you also want to retrieve the value of an option if only one of them is defined.
I would just use the for-comprehension above, and then chain two calls to orElse to provide alternative values if maxOpt turns out to be None.
maxOpt orElse aOpt orElse bOpt
orElse's signature:
def orElse[B >: A](alternative: ⇒ Option[B]): Option[B]
Here's another fwiw:
import scala.util.Try
def maxOpt (a:Option[Int]*)= Try(a.flatten.max).toOption
It works with n arguments (including zero arguments).
Pattern matching would allow something easy to grasp, but that might not be the most elegant way:
def maxOpt[T](optA: Option[T], optB: Option[T])(implicit f: (T, T) => T): Option[T] = (optA, optB) match {
case (Some(a), Some(b)) => Some(f(a, b))
case (None, Some(b)) => Some(b)
case (Some(a), None) => Some(a)
case (None, None) => None
}
You end up with something like:
scala> maxOpt(Some(1), None)(Math.max)
res2: Option[Int] = Some(1)
Once you have that building, block, you can use it inside for-comp or monadic operations.
To get maxOpt, you can also use an applicative, which using Scalaz would look like (aOpt |#| bOpt) { max(_, _) } & then chain orElses as #dcastro suggested.
I assume you expect Some[Int]|None as a result, not Int|None (otherwise return type has to be Any):
def maxOption(opts: Option[Int]*) = {
val flattened = opts.flatten
flattened.headOption.map { _ => flattened.max }
}
Actually, Scala already gives you this ability more or less directly.
scala> import Ordering.Implicits._
import Ordering.Implicits._
scala> val (a,b,n:Option[Int]) = (Option(4), Option(9), None)
a: Option[Int] = Some(4)
b: Option[Int] = Some(9)
n: Option[Int] = None
scala> a max b
res60: Option[Int] = Some(9)
scala> a max n
res61: Option[Int] = Some(4)
scala> n max b
res62: Option[Int] = Some(9)
scala> n max n
res63: Option[Int] = None
A Haskell-ish take on this question is to observe that the following operations:
max, min :: Ord a => a -> a -> a
max a b = if a < b then b else a
min a b = if a < b then a else b
...are associative:
max a (max b c) == max (max a b) c
min a (min b c) == min (min a b) c
As such, any type Ord a => a together with either of these operations is a semigroup, a concept for which reusable abstractions can be built.
And you're dealing with Maybe (Haskell for "option"), which adds a generic "neutral" element to the base a type (you want max Nothing x == x to hold as a law). This takes you into monoids, which are a subtype of semigroups.
The Haskell semigroups library provides a Semigroup type class and two wrapper types, Max and Min, that generically implement the corresponding behaviors.
Since we're dealing with Maybe, in terms of that library the type that captures the semantics you want is Option (Max a)—a monoid that has the same binary operation as the Max semigroup, and uses Nothing as the identity element. So then the function simply becomes:
maxOpt :: Ord a => Option (Max a) -> Option (Max a) -> Option (Max a)
maxOpt a b = a <> b
...which since it's just the <> operator for Option (Max a) is not worth writing. You also gain all the other utility functions and classes that work on Semigroup and Monoid, so for example to find the maximum element of a [Option (Max a)] you'd just use the mconcat function.
The scalaz library comes with a Semigroup and a Monoid trait, as well as Max, Min, MaxVal and MinVal tags that implement those traits, so in fact the stuff that I've demonstrated here in Haskell exists in scalaz as well.

Function to compute partial derivatives of function with arbitrary many variables

I am trying to write a function in Scala that will compute the partial derivative of a function with arbitrary many variables. For example
One Variable(regular derivative):
def partialDerivative(f: Double => Double)(x: Double) = { (f(x+0.001)-f(x))/0.001 }
Two Variables:
def partialDerivative(c: Char, f: (Double, Double) => Double)(x: Double)(y: Double) = {
if (c == 'x') (f(x+0.0001, y)-f(x, y))/0.0001
else if (c == 'y') (f(x, y+0.0001)-f(x, y))/0.0001
}
I am wondering if there is a way to write partialDerivative where the number of variables in f do not need to be known in advance.
I read some blog posts about varargs but can't seem to come up with the correct signature.
Here is what I tried.
def func(f: (Double*) => Double)(n: Double*)
but this doesn't seem to be correct. Thanks for any help on this.
Double* means f accepts an arbitrary Seq of Doubles, which is not correct.
The only way I can think of to write something like this is using shapeless Sized. You will need more implicits than this, and possibly some type-level equality implicits as well; type-level programming in scala is quite complex and I don't have the time to debug this properly, but it should give you some idea:
def partialDerivative[N <: Nat, I <: Nat](f: Sized[Seq[Double], N] => Double)(i: I, xs: Sized[Seq[Double], N])(implicit diff: Diff[I, N]) = {
val (before, atAndAfter) = xs.splitAt(i)
val incrementedAtAndAfter = (atAndAfter.head + 0.0001) +: atAndAfter.tail
val incremented = before ++ incrementedAtAndAfter
(f(incremeted) - f(xs)) / 0.0001
}

generating permutations with scalacheck

I have some generators like this:
val fooRepr = oneOf(a, b, c, d, e)
val foo = for (s <- choose(1, 5); c <- listOfN(s, fooRepr)) yield c.mkString("$")
This leads to duplicates ... I might get two a's, etc. What I really want is to generate random permutation with exactly 0 or 1 or each of a, b, c, d, or e (with at least one of something), in any order.
I was thinking there must be an easy way, but I'm struggling to even find a hard way. :)
Edited: Ok, this seems to work:
val foo = for (s <- choose(1, 5);
c <- permute(s, a, b, c, d, e)) yield c.mkString("$")
def permute[T](n: Int, gs: Gen[T]*): Gen[Seq[T]] = {
val perm = Random.shuffle(gs.toList)
for {
is <- pick(n, 1 until gs.size)
xs <- sequence[List,T](is.toList.map(perm(_)))
} yield xs
}
...borrowing heavily from Gen.pick.
Thanks for your help, -Eric
Rex, thanks for clarifying exactly what I'm trying to do, and that's useful code, but perhaps not so nice with scalacheck, particularly if the generators in question are quite complex. In my particular case the generators a, b, c, etc. are generating huge strings.
Anyhow, there was a bug in my solution above; what worked for me is below. I put a tiny project demonstrating how to do this at github
The guts of it is below. If there's a better way, I'd love to know it...
package powerset
import org.scalacheck._
import org.scalacheck.Gen._
import org.scalacheck.Gen
import scala.util.Random
object PowersetPermutations extends Properties("PowersetPermutations") {
def a: Gen[String] = value("a")
def b: Gen[String] = value("b")
def c: Gen[String] = value("c")
def d: Gen[String] = value("d")
def e: Gen[String] = value("e")
val foo = for (s <- choose(1, 5);
c <- permute(s, a, b, c, d, e)) yield c.mkString
def permute[T](n: Int, gs: Gen[T]*): Gen[Seq[T]] = {
val perm = Random.shuffle(gs.toList)
for {
is <- pick(n, 0 until gs.size)
xs <- sequence[List, T](is.toList.map(perm(_)))
} yield xs
}
implicit def arbString: Arbitrary[String] = Arbitrary(foo)
property("powerset") = Prop.forAll {
a: String => println(a); true
}
}
Thanks,
Eric
You're not describing a permutation, but the power set (minus the empty set)Edit: you're describing a combination of a power set and a permutation. The power set of an indexed set N is isomorphic to 2^N, so we simply (in Scala alone; maybe you want to alter this for use with ScalaCheck):
def powerSet[X](xs: List[X]) = {
val xis = xs.zipWithIndex
(for (j <- 1 until (1<<xs.length)) yield {
for ((x,i) <- xis if ((j & (1<<i)) != 0)) yield x
}).toList
}
to generate all possible subsets given a set. Of course, explicit generation of power sets is unwise if they original set contains more than a handful of elements. If you don't want to generate all of them, just pass in a random number from 1 until (1<<(xs.length-1)) and run the inner loop. (Switch to Long if there are 33-64 elements, and to BitSet if there are more yet.) You can then permute the result to switch the order around if you wish.
Edit: there's another way to do this if you can generate permutations easily and you can add a dummy argument: make your list one longer, with a Stop token. Then permute and .takeWhile(_ != Stop). Ta-da! Permutations of arbitrary length. (Filter out the zero-length answer if need be.)