Scala Breeze: adding row and column headers to a DenseMatrix

Below is code that generates a correlation matrix, but I need to add a column header on top of the matrix and a row header in front of it.
For example, in the matrix above, the amber-coloured cells are the labels I need to add to the blue-coloured data produced by the correlation-matrix code attached below.
Is there a way in Scala Breeze to add labels to a matrix? The problem is that the DenseMatrix holds Doubles and the labels are strings, so I cannot add a label to the matrix object.
def getCorMatrix(c :String,d :String,n :Int) :breeze.linalg.DenseMatrix[Double] = {
CorMatrixlogger.info("Inside generating Correlation Matrix")
val query = MongoDBObject("RunDate" -> d) ++ ("Country" -> c)
CorMatrixlogger.info("Query Object created for {}", c)
val dbObject = for (d <- price.find(query)) yield(d)
val objectReader = (dbObject map {x => objectRead(x)}).toList
val fetchAssetData = objectReader map {x => x.Symbol} map { x=> assetStats(x,n) } filterNot {x => x.length < n-1}
CorMatrixlogger.info("Asset Data fetched")
val excessReturnMatrix = DenseMatrix((for(i <- fetchAssetData) yield i.excessReturn).map(_.toArray):_*)
CorMatrixlogger.info("Excess Return matrix generated")
val transposeExcessreturnMatrix = excessReturnMatrix.t
val vcvMatrix = breeze.numerics.rint(((excessReturnMatrix * transposeExcessreturnMatrix):/ (n-1).toDouble ) :* 1000000.0) :/ 1000000.0
CorMatrixlogger.info("VcV Matrix Generated")
val transposeStDevVector = DenseMatrix(for (i <- fetchAssetData ) yield i.sigma)
val stDevVector = transposeStDevVector.t
val stDevMatrix = breeze.numerics.rint(( stDevVector * transposeStDevVector) :* 1000000.0) :/ 1000000.0
CorMatrixlogger.info("Correlation Matrix Generated")
lowerTriangular(breeze.numerics.rint((vcvMatrix :/ stDevMatrix) :* 10000.0) :/ 10000.0)
}
Edit
Thanks David. Your solution really worked well for me.
val ma = DenseMatrix((1.0,2.0,3.0), (3.0,4.0,5.0),(6.0,7.0,8.0))
val im = DenseMatrix.tabulate(ma.rows,ma.cols)(ma(_,_).toString)
val head = DenseVector("a","b","c")
val thead = head.t
val withHeader:DenseMatrix[String] = DenseMatrix.tabulate(im.rows+1, im.cols+1) { (i, j) =>
if (i == 0 && j == 0) " "
else if (i == 0) head(j -1)
else if (j == 0 ) thead (i -1)
else im(i-1,j-1)
} //> withHeader : breeze.linalg.DenseMatrix[String] = a b c
//| a 1.0 2.0 3.0
//| b 3.0 4.0 5.0
//| c 6.0 7.0 8.0

There's nothing built in, sadly. You could do something like
val withHeader:DenseMatrix[Any] = DenseMatrix.tabulate(n+1, m+1){ (i, j) =>
if (i == 0 && j == 0) ""
else if (i == 0) colHeaders(j - 1)
else if (j == 0) rowHeaders(i - 1)
else orig(i - 1, j - 1)
}
You lose all typing information that way, of course, but if you just need to toString something, it's probably the quickest way in current Breeze.
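If the labels are only needed for display, one option is to leave the numeric matrix untouched and attach the labels when printing. The sketch below uses plain Scala collections rather than Breeze, so nothing here is Breeze API. `renderLabelled` is a hypothetical helper, and it assumes the same labels apply to both rows and columns, as they would for a correlation matrix.

```scala
// Hypothetical helper, not part of Breeze: renders a numeric grid as
// tab-separated text with one label per row and per column.
def renderLabelled(labels: Seq[String], data: Seq[Seq[Double]]): String = {
  // Top-left corner is left empty, then the column labels follow.
  val header = ("" +: labels).mkString("\t")
  // Each data row is prefixed with its own label.
  val rows = labels.zip(data).map { case (label, row) =>
    (label +: row.map(_.toString)).mkString("\t")
  }
  (header +: rows).mkString("\n")
}

val table = renderLabelled(
  Seq("a", "b", "c"),
  Seq(Seq(1.0, 2.0, 3.0), Seq(3.0, 4.0, 5.0), Seq(6.0, 7.0, 8.0))
)
println(table)
```

The numeric data stays typed as Double this way, and only the printed form contains strings.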

Related

Combine multiple sequential entries in Scala/Spark

I have an array of numbers separated by comma as shown:
a:{108,109,110,112,114,115,116,118}
I need the output something like this:
a:{108-110, 112, 114-116, 118}
I am trying to group the continuous numbers with "-" in between.
For example, 108,109,110 are continuous numbers, so I get 108-110. 112 is separate entry; 114,115,116 again represents a sequence, so I get 114-116. 118 is separate and treated as such.
I am doing this in Spark. I wrote the following code:
import scala.collection.mutable.ArrayBuffer
def Sample(x:String):ArrayBuffer[String]={
val x1 = x.split(",")
var a:Int = 0
var present=""
var next:Int = 0
var yrTemp = ""
var yrAr= ArrayBuffer[String]()
var che:Int = 0
var storeV = ""
var p:Int = 0
var q:Int = 0
var count:Int = 1
while(a < x1.length)
{
yrTemp = x1(a)
if(x1.length == 1)
{
yrAr+=x1(a)
}
else
if(a < x1.length - 1)
{
present = x1(a)
if(che == 0)
{
storeV = present
}
p = x1(a).toInt
q = x1(a+1).toInt
if(p == q)
{
yrTemp = yrTemp
che = 1
}
else
if(p != q)
{
yrTemp = storeV + "-" + present
che = 0
yrAr+=yrTemp
}
}
else
if(a == x1.length-1)
{
present = x1(a)
yrTemp = present
che = 0
yrAr+=yrTemp
}
a = a+1
}
yrAr
}
val SampleUDF = udf(Sample(_:String))
I am getting the output as follows:
a:{108-108, 109-109, 110-110, 112, 114-114, 115-115, 116-116, 118}
I cannot figure out where I am going wrong. Can you please help me correct this? TIA.
Here's another way:
def rangeToString(a: Int, b: Int) = if (a == b) s"$a" else s"$a-$b"
def reduce(xs: Seq[Int], min: Int, max: Int, ranges: Seq[String]): Seq[String] = xs match {
case y +: ys if (y - max <= 1) => reduce(ys, min, y, ranges)
case y +: ys => reduce(ys, y, y, ranges :+ rangeToString(min, max))
case Seq() => ranges :+ rangeToString(min, max)
}
def output(xs: Array[Int]) = reduce(xs, xs.head, xs.head, Vector())//.toArray
Which you can test:
println(output(Array(108,109,110,112,114,115,116,118)))
// Vector(108-110, 112, 114-116, 118)
Basically this is a tail recursive function - i.e. you take your "variables" as the input, then it calls itself with updated "variables" on each loop. So here xs is your array, min and max are integers used to keep track of the lowest and highest numbers so far, and ranges is the output sequence of Strings that gets added to when required.
The first pattern (y being the first element, and ys being the rest of the sequence - because that's how the +: extractor works) is matched if there's at least one element (ys can be an empty list) and it follows on from the previous maximum.
The second is if it doesn't follow on, and needs to reset the minimum and add the completed range to the output.
The third case is where we've got to the end of the input and just output the result, rather than calling the loop again.
Internet karma points to anyone who can work out how to eliminate the duplication of ranges :+ rangeToString(min, max)!
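One possible way to avoid that duplication, offered as a sketch rather than a definitive answer: collect (min, max) pairs in a fold and format them in a single pass at the end. `collapse` is a name chosen here for illustration; `rangeToString` is repeated so the snippet stands alone.

```scala
def rangeToString(a: Int, b: Int): String = if (a == b) s"$a" else s"$a-$b"

// Fold the input into (min, max) spans, extending the last span while the
// next number follows on from it, then format every span exactly once.
def collapse(xs: Seq[Int]): Vector[String] = {
  val spans = xs.foldLeft(Vector.empty[(Int, Int)]) { (acc, x) =>
    acc.lastOption match {
      case Some((lo, hi)) if x - hi <= 1 => acc.init :+ (lo -> x)
      case _                             => acc :+ (x -> x)
    }
  }
  spans.map { case (lo, hi) => rangeToString(lo, hi) }
}

println(collapse(Seq(108, 109, 110, 112, 114, 115, 116, 118)))
```

Unlike the recursive version, this one also accepts an empty input, because it never calls `head`.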
Here is a solution:
def combineConsecutive(s: String): Seq[String] = {
val ints: List[Int] = s.split(',').map(_.toInt).toList.reverse
ints
.drop(1)
.foldLeft(List(List(ints.head)))((acc, e) => if ((acc.head.head - e) <= 1)
(e :: acc.head) :: acc.tail
else
List(e) :: acc)
.map(group => if (group.size > 1) group.min + "-" + group.max else group.head.toString)
}
val in = "108,109,110,112,114,115,116,118"
val result = combineConsecutive(in)
println(result) // List(108-110, 112, 114-116, 118)
This solution partly uses code from this question: Grouping list items by comparing them with their neighbors

Scala: transform a collection, yielding 0..many elements on each iteration

Given a collection in Scala, I'd like to traverse this collection and for each object I'd like to emit (yield) from 0 to multiple elements that should be joined together into a new collection.
For example, I expect something like this:
val input = Range(0, 15)
val output = input.somefancymapfunction((x) => {
if (x % 3 == 0)
yield(s"${x}/3")
if (x % 5 == 0)
yield(s"${x}/5")
})
to build an output collection that will contain
(0/3, 0/5, 3/3, 5/5, 6/3, 9/3, 10/5, 12/3)
Basically, I want a superset of what filter (1 → 0..1) and map (1 → 1) allows to do: mapping (1 → 0..n).
Solutions I've tried
Imperative solutions
Obviously, it's possible to do so in a non-functional manner, like:
val output = mutable.ListBuffer[String]()
input.foreach((x) => {
if (x % 3 == 0)
output += s"${x}/3"
if (x % 5 == 0)
output += s"${x}/5"
})
Flatmap solutions
I know of flatMap, but it either:
1) becomes really ugly if we're talking about arbitrary number of output elements:
val output = input.flatMap((x) => {
val v1 = if (x % 3 == 0) {
Some(s"${x}/3")
} else {
None
}
val v2 = if (x % 5 == 0) {
Some(s"${x}/5")
} else {
None
}
List(v1, v2).flatten
})
2) requires usage of mutable collections inside it:
val output = input.flatMap((x) => {
val r = ListBuffer[String]()
if (x % 3 == 0)
r += s"${x}/3"
if (x % 5 == 0)
r += s"${x}/5"
r
})
which is actually even worse than using a mutable collection from the very beginning, or
3) requires major logic overhaul:
val output = input.flatMap((x) => {
if (x % 3 == 0) {
if (x % 5 == 0) {
List(s"${x}/3", s"${x}/5")
} else {
List(s"${x}/3")
}
} else if (x % 5 == 0) {
List(s"${x}/5")
} else {
List()
}
})
which, IMHO, also looks ugly and requires duplicating the generating code.
Roll-your-own-map-function
Last, but not least, I can roll my own function of that kind:
def myMultiOutputMap[T, R](coll: TraversableOnce[T], func: (T, ListBuffer[R]) => Unit): List[R] = {
val out = ListBuffer[R]()
coll.foreach((x) => func.apply(x, out))
out.toList
}
which can be used almost like I want:
val output = myMultiOutputMap[Int, String](input, (x, out) => {
if (x % 3 == 0)
out += s"${x}/3"
if (x % 5 == 0)
out += s"${x}/5"
})
Am I really overlooking something and there's no such functionality in standard Scala collection libraries?
Similar questions
This question bears some similarity to Can I yield or map one element into many in Scala? — but that question discusses 1 element → 3 elements mapping, and I want 1 element → arbitrary number of elements mapping.
Final note
Please note that this is not the question about division / divisors, such conditions are included purely for illustrative purposes.
Rather than having a separate case for each divisor, put them in a container and iterate over them in a for comprehension:
val output = for {
n <- input
d <- Seq(3, 5)
if n % d == 0
} yield s"$n/$d"
Or equivalently in a collect nested in a flatMap:
val output = input.flatMap { n =>
Seq(3, 5).collect {
case d if n % d == 0 => s"$n/$d"
}
}
In the more general case where the different cases may have different logic, you can put each case in a separate partial function and iterate over the partial functions:
val output = for {
n <- input
f <- Seq[PartialFunction[Int, String]](
{case x if x % 3 == 0 => s"$x/3"},
{case x if x % 5 == 0 => s"$x/5"})
if f.isDefinedAt(n)
} yield f(n)
You can also use some functional library (e.g. scalaz) to express this:
import scalaz._, Scalaz._
def divisibleBy(byWhat: Int)(what: Int): List[String] =
(what % byWhat == 0).option(s"$what/$byWhat").toList
(0 to 15) flatMap (divisibleBy(3) _ |+| divisibleBy(5))
This uses the semigroup append operation |+|. For Lists this operation means a simple list concatenation. So for functions Int => List[String], this append operation will produce a function that runs both functions and appends their results.
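For readers without scalaz, the same idea can be sketched in plain Scala. `both` is a hypothetical helper standing in for |+| on functions that return lists; it is not part of any library, and `divisibleBy` is reshaped here to return a function directly.

```scala
// Returns a function that yields one string when `what` is divisible by `byWhat`.
def divisibleBy(byWhat: Int): Int => List[String] =
  what => if (what % byWhat == 0) List(s"$what/$byWhat") else Nil

// Hypothetical stand-in for |+|: run both functions and concatenate their results.
def both[A, B](f: A => List[B], g: A => List[B]): A => List[B] =
  a => f(a) ++ g(a)

val output = (0 until 15).flatMap(both(divisibleBy(3), divisibleBy(5)))
println(output)
```

This uses `0 until 15` to match the range in the question; `0 to 15` would also include 15.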
If you have a complex computation during which you sometimes need to add elements to a global accumulator, you can use a popular approach called the Writer monad.
The setup in Scala is somewhat bulky, but the results are extremely composable thanks to the Monad interface:
import scalaz.Writer
import scalaz.syntax.writer._
import scalaz.syntax.monad._
import scalaz.std.vector._
import scalaz.syntax.traverse._
type Messages[T] = Writer[Vector[String], T]
def yieldW(a: String): Messages[Unit] = Vector(a).tell
val output = Vector.range(0, 15).traverse { n =>
yieldW(s"$n / 3").whenM(n % 3 == 0) >>
yieldW(s"$n / 5").whenM(n % 5 == 0)
}.run._1
Here is my proposal for a custom function; it might be nicer with the pimp-my-library pattern:
def fancyMap[A, B](list: TraversableOnce[A])(fs: (A => Boolean, A => B)*) = {
def valuesForElement(elem: A) = fs collect { case (predicate, mapper) if predicate(elem) => mapper(elem) }
list flatMap valuesForElement
}
fancyMap[Int, String](0 to 15)((_ % 3 == 0, _ + "/3"), (_ % 5 == 0, _ + "/5"))
You can try collect:
val input = Range(0,15)
val output = input.flatMap { x =>
List(3,5) collect { case n if (x%n == 0) => s"${x}/${n}" }
}
System.out.println(output)
I would use a fold:
val input = Range(0, 15)
val output = input.foldLeft(List[String]()) {
case (acc, value) =>
val acc1 = if (value % 3 == 0) s"$value/3" :: acc else acc
val acc2 = if (value % 5 == 0) s"$value/5" :: acc1 else acc1
acc2
}.reverse
output contains
List(0/3, 0/5, 3/3, 5/5, 6/3, 9/3, 10/5, 12/3)
A fold takes an accumulator (acc), a collection, and a function. The function is called with the current value of the accumulator (initially an empty List[String] here) and each value of the collection in turn. The function should return the updated accumulator.
On each iteration, we take the growing accumulator and, if the inside if statements are true, prepend the calculation to the new accumulator. The function finally returns the updated accumulator.
When the fold is done, it returns the final accumulator, but unfortunately, it is in reverse order. We simply reverse the accumulator with .reverse.
There is a nice paper on folds: A tutorial on the universality and expressiveness of fold, by Graham Hutton.
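If the trailing .reverse is a concern, a right fold can build the list in order directly. This is a minimal sketch of the same computation using foldRight. Because elements are prepended, the /5 check runs first so that /3 still ends up ahead of /5 for the same number.

```scala
val input = Range(0, 15)

// foldRight visits the values from last to first, so prepending keeps
// the final list in ascending order without a reverse step.
val output = input.foldRight(List.empty[String]) { (value, acc) =>
  val withFive = if (value % 5 == 0) s"$value/5" :: acc else acc
  if (value % 3 == 0) s"$value/3" :: withFive else withFive
}
println(output)
```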

Scala - Remove while loop in quick sort

def QuickSort(arr:Array[Int],first:Int,last:Int): List[Int] = {
var pivot:Int = 0
var temp:Int = 0
if (first < last) {
pivot = first
var i:Int = first
var j:Int = last;
while(i<j){
while(arr(i) <= arr(pivot) && i < last)
i=i+1
while(arr(j) > arr(pivot))
j=j-1
if(i<j)
{
temp = arr(i)
arr(i) = arr(j)
arr(j) = temp
}
}
temp = arr(pivot)
arr(pivot) = arr(j)
arr(j) = temp
QuickSort(arr, first, j-1)
QuickSort(arr, j+1, last)
}
arr.toList
}
Hello, I'm new to Scala and trying to implement quicksort. The program works correctly, but I want to remove the while loops, since I read that while and do-while are not recommended in Scala because they do not return a value.
Is there any way to remove the while loops in the code above?
The classic quicksort algorithm, as you've coded here, requires a mutable collection (like Array) and the swapping of element values, which requires mutable variables (i.e. var). These things are discouraged in functional programming and aren't held in high esteem in the Scala community.
Here's a similar approach that is a little more in keeping with the spirit of the FP ethic.
// pseudo-quicksort -- from Array[Int] to List[Int]
def pqs(arr:Array[Int]): List[Int] = arr match {
case Array() => List()
case Array(x) => List(x)
case Array(x,y) => if (x < y) List(x,y) else List(y,x)
case _ => val (below, above) = arr.partition(_ < arr(0))
pqs(below) ++ List(arr(0)) ++ pqs(above.tail)
}
Better yet is to use one of the sort methods (sortBy, sortWith, sorted) as offered in the standard library.
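As a quick illustration of those standard-library methods (the sample values are made up for the example):

```scala
val numbers = List(5, 3, 8, 1)

// sorted uses the implicit Ordering for the element type.
val ascending = numbers.sorted

// sortWith takes an explicit "comes before" comparison.
val descending = numbers.sortWith(_ > _)

// sortBy orders elements by a derived key, here the string length.
val byLength = List("pear", "fig", "banana").sortBy(_.length)

println(ascending)
println(descending)
println(byLength)
```

All three return a new collection and leave the original untouched.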
Not so elegant, but without while:
def QuickSort(l: List[Int]) : List[Int] = {
if( l.length == 0) return Nil
if( l.length == 1 ) return l
val pivot = l(l.length / 2)
val lesserThanPivot = l.filter( _ < pivot)
val equalToPivot = l.filter( _ == pivot)
val biggerThanPivot = l.filter( _ > pivot)
QuickSort( lesserThanPivot ) ++ equalToPivot.tail ++ List(pivot) ++ QuickSort(biggerThanPivot)
}

Scala best practices: mapping 2D data

What would be the best way in Scala to write the following Java code in a proper functional way?
LinkedList<Item> itemsInRange = new LinkedList<Item>();
for (int y = 0; y < height(); y++) {
for (int x = 0; x < width(); x++) {
Item it = myMap.getItemAt(cy + y, cx + x);
if (it != null)
itemsInRange.add(it);
}
}
// iterate over itemsInRange later
Of course, it can be translated directly into Scala in an imperative way:
val itemsInRange = new ListBuffer[Item]
for (y <- 0 until height) {
for (x <- 0 until width) {
val it = tileMap.getItemAt(cy + y, cx + x)
if (!it.isEmpty)
itemsInRange.append(it.get)
}
}
But I'd like to do it in a proper, functional way.
I presume that there should be map operation over some sort of 2D range. Ideally, map would execute a function that would get x and y as input parameters and output Option[Item]. After that I'll get something like Iterable[Option[Item]] and flatten over it will yield Iterable[Item]. If I'm right, the only missing piece of a puzzle is doing that map operation on 2D ranges in some way.
You can do this all in one nice step:
def items[A](w: Int, h: Int)(at: ((Int, Int)) => Option[A]): IndexedSeq[A] =
for {
x <- 0 until w
y <- 0 until h
i <- at(x, y)
} yield i
Now say for example we have this representation of symbols on a four-by-four board:
val tileMap = Map(
(0, 0) -> 'a,
(1, 0) -> 'b,
(3, 2) -> 'c
)
We just write:
scala> items(4, 4)(tileMap.get)
res0: IndexedSeq[Symbol] = Vector('a, 'b, 'c)
Which I think is what you want.

Swapping array values with for and yield scala

I am trying to swap every pair of values in my array using for and yield and so far I am very unsuccessful. What I have tried is as follows:
val a = Array(1,2,3,4,5) //What I want is Array(2,1,4,3,5)
for(i <- 0 until (a.length-1,2); r <- Array(i+1,i)) yield a(r)
The snippet above returns the vector 2,1,4,3 (the 5 is omitted).
Can somebody point out what I am doing wrong here and how to get the correct reversal using for and yields?
Thanks
a.grouped(2).flatMap(_.reverse).toArray
or if you need for/yield (much less concise in this case, and in fact expands to the same code):
(for {b <- a.grouped(2); c <- b.reverse} yield c).toArray
It would be easier if you didn't use for/yield:
a.grouped(2)
.flatMap{
case Array(x,y) => Array(y,x)
case Array(x) => Array(x)
}.toArray // Array(2, 1, 4, 3, 5)
I don't know if the OP is reading Scala for the Impatient, but this was exercise 3.3 .
I like the map solution, but we're not on that chapter yet, so this is my ugly implementation using the required for/yield. You can probably move some yield logic into a guard/definition.
for( i <- 0 until(a.length,2); j <- (i+1).to(i,-1) if(j<a.length) ) yield a(j)
I'm a Java guy, so I've no confirmation of this assertion, but I'm curious what the overhead of the maps/grouping and iterators are. I suspect it all compiles down to the same Java byte code.
Another simple, for-yield solution:
def swapAdjacent(array: ArrayBuffer[Int]) = {
for (i <- 0 until array.length) yield (
if (i % 2 == 0)
if (i == array.length - 1) array(i) else array(i + 1)
else array(i - 1)
)
}
Here is my solution
def swapAdjacent(a: Array[Int]):Array[Int] =
(for(i <- 0 until a.length) yield
if (i%2==0 && (i+1)==a.length) a(i) //last element for odd length
else if (i%2==0) a(i+1)
else a(i-1)
).toArray
https://github.com/BasileDuPlessis/scala-for-the-impatient/blob/master/src/main/scala/com/basile/scala/ch03/Ex03.scala
If you are doing exercises 3.2 and 3.3 in Scala for the Impatient here are both my answers. They are the same with the logic moved around.
/** Exercise 3.2 */
for (i <- 0 until a.length if i % 2 == 1) {val t = a(i); a(i) = a(i-1); a(i-1) = t }
/** Exercise 3.3 */
for (i <- 0 until a.length) yield { if (i % 2 == 1) a(i-1) else if (i+1 <= a.length-1) a(i+1) else a(i) }
for (i <- 0 until arr.length-1 by 2) { val tmp = arr(i); arr(i) = arr(i+1); arr(i+1) = tmp }
I have started to learn Scala recently and all solutions from the book Scala for the Impatient (1st edition) are available at my github:
Chapter 2
https://gist.github.com/carloscaldas/51c01ccad9d86da8d96f1f40f7fecba7
Chapter 3
https://gist.github.com/carloscaldas/3361321306faf82e76c967559b5cea33
I have my solution, but without yield. Maybe someone will find it useful.
def swap(x: Array[Int]): Array[Int] = {
for (i <- 0 until x.length-1 by 2){
var left = x(i)
x(i) = x(i+1)
x(i+1) = left
}
x
}
Assuming array is not empty, here you go:
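val swapResult = for (ind <- arr1.indices) yield {
// odd index: take the left neighbour
if (ind % 2 != 0) arr1(ind - 1)
// last index of an odd-length array: nothing to swap with
else if (ind == arr1.length - 1) arr1(ind)
// even index: take the right neighbour
else arr1(ind + 1)
}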