Scala - Calculate maximum average between two lists - scala

I am at the beginning of my Scala journey. I am trying to find and compare the average value of a given dataset - type Map(String, List[Int]), for two random rows selected by the user, in order to return the greater average value between the two. I can calculate the average for each row but I can't find a way to compare the average between the two rows.
I have tried several approaches, but I only get error messages. The program does calculate the average of each row, however.
DATASET
SK1, 9, 7, 2, 0, 7, 3, 7, 9, 1, 2, 8, 1, 9, 6, 5, 3, 2, 2, 7, 2, 8, 5, 4, 5, 1, 6, 5, 2, 4, 1
SK2, 0, 7, 6, 3, 3, 3, 1, 6, 9, 2, 9, 7, 8, 7, 3, 6, 3, 5, 5, 2, 9, 7, 3, 4, 6, 3, 4, 3, 4, 1
SK3, 8, 7, 1, 8, 0, 5, 8, 3, 5, 9, 7, 5, 4, 7, 9, 8, 1, 4, 6, 5, 6, 6, 3, 6, 8, 8, 7, 4, 0, 6
This is how the program calculates the average of a row:
//Function to find the average
def average(list: List[Int]): Double = list.sum.toDouble / list.size
def averageStockLevel1(stock1: String, stock2: String): (String, Int) = {
val ave1 = mapdata.get(stock1).map(average(_).toInt).getOrElse(0)
val ave2 = mapdata.get(stock2).map(average(_).toInt).getOrElse(0)
if (ave1>ave2){
(stock1,ave1)
}else{
(stock2,ave2)
}
}
This is how I have called the function in the menu
def handleFour(): Boolean = {
menuDoubleDataStock(averageStockLevel1)
true
}
//Pull two rows from the dataset
def menuShowDoubleDataStock(f: (String) => (String, Int), g:(String) => (String, Int)) = {
print("Please insert the Stock > ")
val data = f(readLine)
println(s"${data._1}: ${data._2}")
print("Please insert the Stock > ")
val data1 = g(readLine)
println(s"${data1._1}: ${data1._2}")
}
Error message:
Unspecified value parameters: g: String => (String, Int)

The error message "Unspecified value parameters: g: String => (String, Int)" tells you the following:
Your menuShowDoubleDataStock expects two parameters (f and g), but where you call it (from handleFour()), you only pass one value (averageStockLevel1) - that value is accepted as f, so the compiler complains that no value was passed for g.
Besides that specific error that the compiler currently complains about, there is also a second problem (which currently seems to be overshadowed by the one above): the type of f is defined as String => (String, Int) (a function that takes one String parameter), but the value that you are passing (averageStockLevel1) has the type (String, String) => (String, Int) (a function that takes two String parameters).
I'm not 100% sure if I understood what you are aiming to do, but I think the solution could be to change the signature of menuShowDoubleDataStock so that it only takes one parameter of type (String, String) => (String, Int):
// make the user enter two stock-names and pass them into resultCalculator to
// get the result (and then print it)
def menuShowDoubleDataStock(resultCalculator: (String, String) => (String, Int)) = {
print("Please insert the Stock > ")
val stockName1 = readLine
print("Please insert the Stock > ")
val stockName2 = readLine
val result = resultCalculator(stockName1, stockName2)
println(s"${result._1}: ${result._2}")
}
Then calling menuShowDoubleDataStock(averageStockLevel1) should work.
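As a minimal sketch of the comparison itself, assuming a small hard-coded `stockData` map standing in for `mapdata` (the object name, `higherAverage`, and the sample values are made up for illustration), the averages can be kept as `Double` so that two close averages are not truncated to the same `Int` before they are compared:

```scala
object AverageComparison {
  // Hypothetical stand-in for the question's `mapdata`; values are illustrative only.
  val stockData: Map[String, List[Int]] = Map(
    "SK1" -> List(9, 7, 2),
    "SK2" -> List(0, 7, 6),
    "SK3" -> List(8, 7, 1)
  )

  // Average of a non-empty list (an empty list would yield NaN).
  def average(list: List[Int]): Double = list.sum.toDouble / list.size

  // Returns the name and average of whichever stock has the higher average.
  // Unknown stock names are treated as having an average of 0.0, an assumption
  // carried over from the getOrElse(0) in the question.
  def higherAverage(stock1: String, stock2: String): (String, Double) = {
    val ave1 = stockData.get(stock1).map(average).getOrElse(0.0)
    val ave2 = stockData.get(stock2).map(average).getOrElse(0.0)
    if (ave1 > ave2) (stock1, ave1) else (stock2, ave2)
  }
}
```

With this sample data, `AverageComparison.higherAverage("SK1", "SK2")` picks SK1. When both averages are equal, the second stock is returned, which mirrors the if/else in the question.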

Related

Expression of type Seq[unit] does not conform to expected type Seq[DataFrame] in scala

In my function, I am returning a finalDF, a sequence of data frames. In the loop shown below, map returns Seq[DataFrame] and it is being stored in finalDF to be able to return to the caller, but in some cases where there is further processing, I would like to store the filtered dataframe for each iteration and pass it to next loop.
How do I do it? If I try to assign it to some temp val, it throws an error that an expression of type Seq[Unit] does not conform to the expected type Seq[DataFrame].
var finalDF: Seq[DataFrame] =null
for (i <- 0 until stop){
finalDF=strataCount(i).map(x=> {
df.filter(df(cols(i)) === x)
//how to get the above data frame to pass on to the next computation?
}
)
}
Regards
Maybe this is helpful:
val finalDF: Seq[DataFrame] = (0 until stop).flatMap(i => strataCount(i).map(x => df.filter(df(cols(i)) === x))).toSeq
flatMap flattens the nested Seq[Seq[DataFrame]] into a single Seq[DataFrame].
(0 until stop) loops from 0 to stop - 1, matching the original for loop, and flatMap flattens the per-index lists. For example:
scala> (0 to 20).flatMap(i => List(i))
res0: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
scala> (0 to 20).map(i => List(i)).flatten
res1: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
For two counters, you can nest the calls like this:
(0 until stop).flatMap(j => {
(0 until stop).flatMap(i => strataCount(i).map(x => df.filter(df(cols(i)) === x)))
}).toSeq
Alternatively, try a for/yield comprehension; see: Scala for/yield syntax
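To make the for/yield suggestion concrete, here is a small sketch using plain collections instead of Spark. The `strata` value is a hypothetical stand-in for `strataCount`, and a tuple stands in for each filtered DataFrame:

```scala
object ForYieldSketch {
  // Hypothetical stand-in for strataCount: one sequence of values per column index.
  val strata: Seq[Seq[String]] = Seq(Seq("a", "b"), Seq("x"))

  // flatMap/map version: one result per (column index, value) pair.
  val viaFlatMap: Seq[(Int, String)] =
    strata.indices.flatMap(i => strata(i).map(x => (i, x)))

  // Equivalent for/yield version; the compiler rewrites it into
  // the flatMap/map form above.
  val viaFor: Seq[(Int, String)] =
    for {
      i <- strata.indices
      x <- strata(i)
    } yield (i, x)
}
```

Both values hold the same flat sequence, so the choice between the two forms is mostly a matter of readability.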

Finding the average from a mapped list of int

I am trying to get the average from a mapped list of ints, then return that value to the user when requested.
Here is my current code with problems, what am I doing wrong? I have included my functionality to find the last element of a tail, that works.
// *******************************************************************************************************************
// application logic
// read data from file
val mapdata = readFile("data.txt")
// *******************************************************************************************************************
// UTILITY FUNCTIONS
//GETS THE DATA FROM THE DATA.TXT
def readFile(filename: String): Map[String, List[Int]] = {
processInput(Source.fromFile(filename).getLines)
}
def processInput(lines: Iterator[String]): Map[String, List[Int]] = {
Try {
lines.foldLeft(Map[String, List[Int]]()) { (acc, line) =>
val splitline = line.split(",").map(_.trim).toList
acc.updated(splitline.head, splitline.tail.map(_.toInt))
}
}.getOrElse {
println("Sorry, an exception happened.")
Map()
}
}
//functionality to find the last tail element
def findLast(list:List[Int]):Int = {
if(list.tail == Nil)
list.head
else
findLast(list.tail)
}
//Function to find the average
def average(list:List[Int]):Double =
list.foldLeft(0.0)(_+_) / list.foldLeft(0)((r,c)=>r+1)
//Show last element in the list, most current WORKS
def currentStockLevel (stock: String): (String, Int) = {
(stock, mapdata.get (stock).map(findLast(_)).getOrElse(0))
}
//Show the average of the list DOES NOT WORK
def averageStockLevel (stock: String): (String, Int) = {
(stock, mapdata.get (stock).map(average(_)).getOrElse(0))
}
my txt file
SK1, 9, 7, 2, 0, 7, 3, 7, 9, 1, 2, 8, 1, 9, 6, 5, 3, 2, 2, 7, 2, 8, 5, 4, 5, 1, 6, 5, 2, 4, 1
SK2, 0, 7, 6, 3, 3, 3, 1, 6, 9, 2, 9, 7, 8, 7, 3, 6, 3, 5, 5, 2, 9, 7, 3, 4, 6, 3, 4, 3, 4, 1
SK4, 2, 9, 5, 7, 0, 8, 6, 6, 7, 9, 0, 1, 3, 1, 6, 0, 0, 1, 3, 8, 5, 4, 0, 9, 7, 1, 4, 5, 2, 8
SK5, 2, 6, 8, 0, 3, 5, 5, 2, 5, 9, 4, 5, 3, 5, 7, 8, 8, 2, 5, 9, 3, 8, 6, 7, 8, 7, 4, 1, 2, 3
The error that I am getting is that expression of type AnyVal does not conform to type Int
Your averageStockLevel function returns the average value as an Int (the return type is (String, Int)) whereas the calculation that is done in average returns a Double.
So you either need to convert the calculated Double to an Int within averageStockLevel (e.g. by doing average(_).toInt), or you can change the return type of averageStockLevel to (String, Double). The latter variant is the better one, since you don't lose the precision of your average value.
def averageStockLevel (stock: String): (String, Double) = {
(stock, mapdata.get(stock).map(average).getOrElse(0.0))
}
This works, but whether or not it's a good idea to return 0.0 in case of a missing key is for you to decide. Another possibility is to omit the getOrElse part and return an Option[(String,Double)].
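The Option-returning variant could look roughly like the sketch below. The `stockData` map is a hypothetical stand-in for `mapdata`, and having `average` return an Option for empty lists is an extra assumption, not something the question requires:

```scala
object OptionalAverage {
  // Hypothetical sample data standing in for the question's `mapdata`.
  val stockData: Map[String, List[Int]] =
    Map("SK1" -> List(9, 7, 2), "SK9" -> Nil)

  // None for an empty list instead of dividing by zero.
  def average(list: List[Int]): Option[Double] =
    if (list.isEmpty) None else Some(list.sum.toDouble / list.size)

  // None when the key is missing or the row has no values.
  def averageStockLevel(stock: String): Option[(String, Double)] =
    stockData.get(stock).flatMap(average).map(avg => (stock, avg))
}
```

The caller then decides what a missing stock means, for example by pattern matching on Some and None before printing.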
Apart from that, your code is quite complex. findLast and average can be defined much more simply (it's not really worth creating your own function for finding the last element, but for the sake of completeness...):
// will throw an exception for empty lists, but so does your current code
def findLast(list:List[Int]) = list.last
def average(list:List[Int]): Double = list.sum.toDouble / list.size
Another idea is to replace List by Vector. For operations such as .size and .last, List needs linear time whereas Vector basically takes constant time.

Scala trying to count instances of a digit in a number

This is my first day using scala. I am trying to make a string of the number of times each digit is represented in a string. For instance, the number 4310227 would return "1121100100" because 0 appears once, 1 appears once, 2 appears twice and so on...
def pow(n:Int) : String = {
val cubed = (n * n * n).toString
val digits = 0 to 9
val str = ""
for (a <- digits) {
println(a)
val b = cubed.count(_==a.toString)
println(b)
}
return cubed
}
It doesn't seem to work. I would like some Scala-specific reasons why, and to know whether I should even be going about it in this manner. Thanks!
When you iterate over strings, which is what you are doing when you call String#count(), you are working with Chars, not Strings. You don't want to compare these two with ==, since they aren't the same type of object.
One way to solve this problem is to call Char#toString() before performing the comparison, e.g., amend your code to read cubed.count(_.toString==a.toString).
As Rado and cheeken said, you're comparing a Char with a String, which will never be equal. An alternative to cheeken's answer of converting each character to a string is to create a range from chars, i.e. '0' to '9':
val digits = '0' to '9'
...
val b = cubed.count(_ == a)
Note that if you want the Int that a Char represents, you can call char.asDigit.
Aleksey's, Ren's and Randall's answers are something you will want to strive towards as they separate out the pure solution to the problem. However, given that it's your first day with Scala, depending on what background you have, you might need a bit more context before understanding them.
Fairly simple:
scala> ("122333abc456xyz" filter (_.isDigit)).foldLeft(Map.empty[Char, Int]) ((histo, c) => histo + (c -> (histo.getOrElse(c, 0) + 1)))
res1: scala.collection.immutable.Map[Char,Int] = Map(4 -> 1, 5 -> 1, 6 -> 1, 1 -> 1, 2 -> 2, 3 -> 3)
This is perhaps not the fastest approach, because intermediate data types like String and Char are used, but it is one of the simplest:
def countDigits(n: Int): Map[Int, Int] =
n.toString.groupBy(x => x) map { case (n, c) => (n.asDigit, c.size) }
Example:
scala> def countDigits(n: Int): Map[Int, Int] = n.toString.groupBy(x => x) map { case (n, c) => (n.asDigit, c.size) }
countDigits: (n: Int)Map[Int,Int]
scala> countDigits(12345135)
res0: Map[Int,Int] = Map(5 -> 2, 1 -> 2, 2 -> 1, 3 -> 2, 4 -> 1)
Where myNumAsString is a String, eg "15625"
myNumAsString.groupBy(x => x).map(x => (x._1, x._2.length))
Result = Map(2 -> 1, 5 -> 2, 1 -> 1, 6 -> 1)
ie. A map containing the digit with its corresponding count.
What this does is take your string and group its characters by value (so for the initial string "15625", it produces a map of 1 -> "1", 2 -> "2", 6 -> "6", and 5 -> "55"). The second part then maps each character to the count of how many times it occurs.
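The two steps described above can be written out separately, as in this small sketch (the object and value names are only illustrative):

```scala
object GroupBySketch {
  val digits = "15625"

  // Step 1: group equal characters together; each value is a String
  // made of the repeated character.
  val grouped: Map[Char, String] = digits.groupBy(identity)

  // Step 2: replace each group by its length to get the per-digit count.
  val counts: Map[Char, Int] =
    grouped.map { case (digit, group) => (digit, group.length) }
}
```

Note that the keys are Chars, not Ints; call asDigit on a key if the numeric value is needed.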
The counts for these hundred digits happen to fit into a hex digit.
scala> val is = for (_ <- (1 to 100).toList) yield r.nextInt(10)
is: List[Int] = List(8, 3, 9, 8, 0, 2, 0, 7, 8, 1, 6, 9, 9, 0, 3, 6, 8, 6, 3, 1, 8, 7, 0, 4, 4, 8, 4, 6, 9, 7, 4, 6, 6, 0, 3, 0, 4, 1, 5, 8, 9, 1, 2, 0, 8, 8, 2, 3, 8, 6, 4, 7, 1, 0, 2, 2, 6, 9, 3, 8, 6, 7, 9, 5, 0, 7, 6, 8, 7, 5, 8, 2, 2, 2, 4, 1, 2, 2, 6, 8, 1, 7, 0, 7, 6, 9, 5, 5, 5, 3, 5, 8, 2, 5, 1, 9, 5, 7, 2, 3)
scala> (new Array[Int](10) /: is) { case (a, i) => a(i) += 1 ; a } map ("%x" format _) mkString
warning: there were 1 feature warning(s); re-run with -feature for details
res7: String = a8c879caf9
scala> (new Array[Int](10) /: is) { case (a, i) => a(i) += 1 ; a } sum
warning: there were 1 feature warning(s); re-run with -feature for details
res8: Int = 100
I was going to point out that no one used a char range, but now I see Kristian did.
def pow(n:Int) : String = {
val cubed = (n * n * n).toString
val cnts = for (a <- '0' to '9') yield cubed.count(_ == a)
(cnts map (c => ('0' + c).toChar)).mkString
}

Scala: How to sort an array within a specified range of indices?

And I have a comparison function "compr" already in the code to compare two values.
I want something like this:
Sorting.stableSort(arr[i,j] , compr)
where arr[i,j] is a range of elements in the array.
Take the slice as a view, sort and copy it back (or take a slice as a working buffer).
scala> val vs = Array(3,2,8,5,4,9,1,10,6,7)
vs: Array[Int] = Array(3, 2, 8, 5, 4, 9, 1, 10, 6, 7)
scala> vs.view(2,5).toSeq.sorted.copyToArray(vs,2)
scala> vs
res31: Array[Int] = Array(3, 2, 4, 5, 8, 9, 1, 10, 6, 7)
Outside the REPL, the extra .toSeq isn't needed:
vs.view(2,5).sorted.copyToArray(vs,2)
Updated:
scala 2.13.8> val vs = Array(3, 2, 8, 5, 4, 9, 1, 10, 6, 7)
val vs: Array[Int] = Array(3, 2, 8, 5, 4, 9, 1, 10, 6, 7)
scala 2.13.8> vs.view.slice(2,5).sorted.copyToArray(vs,2)
val res0: Int = 3
scala 2.13.8> vs
val res1: Array[Int] = Array(3, 2, 4, 5, 8, 9, 1, 10, 6, 7)
Split the array into three parts, sort the middle part and then concatenate them. Not the most efficient way, but this is FP, who cares about performance =)
// l is the input sequence; FROM (inclusive) and UNTIL (exclusive) bound the range to sort
val sorted =
l.take(FROM) ++ l.slice(FROM, UNTIL).sorted ++ l.drop(UNTIL)
Something like that:
def stableSort[T](x: Seq[T], i: Int, j: Int, comp: (T,T) => Boolean ):Seq[T] = {
x.take(i) ++ x.slice(i,j).sortWith(comp) ++ x.drop(j)
}
def comp: (Int,Int) => Boolean = { case (x1,x2) => x1 < x2 }
val x = Array(1,9,5,6,3)
stableSort(x,1,4, comp)
// > res0: Seq[Int] = ArrayBuffer(1, 5, 6, 9, 3)
If your class implements Ordering it would be less cumbersome.
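As a sketch of what that could look like, the comparison function can be replaced by a context bound on Ordering. The name `sortRange` is made up for this example:

```scala
object RangeSort {
  // Sorts only the elements at indices [from, until) and leaves the rest in place.
  // The implicit Ordering[T] takes the place of the explicit comparison function.
  def sortRange[T: Ordering](xs: Seq[T], from: Int, until: Int): Seq[T] =
    xs.take(from) ++ xs.slice(from, until).sorted ++ xs.drop(until)
}
```

Any element type with an implicit Ordering in scope, such as Int or String, can then be passed without supplying a comparator.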
This should be as good as you can get without reimplementing the sort. Creates just one extra array with the size of the slice to be sorted.
def stableSort[K:reflect.ClassTag](xs:Array[K], from:Int, to:Int, comp:(K,K) => Boolean) : Unit = {
val tmp = xs.slice(from,to)
scala.util.Sorting.stableSort(tmp, comp)
tmp.copyToArray(xs, from)
}

Sized generators in scalacheck

The ScalaCheck User Guide mentions sized generators. The example code
def matrix[T](g:Gen[T]):Gen[Seq[Seq[T]]] = Gen.sized {size =>
val side = math.sqrt(size).toInt //little change to prevent compile-time exception
Gen.vectorOf(side, Gen.vectorOf(side, g))
}
did not explain much to me. After some exploration I understood that the length of the generated sequence does not depend on the actual size of the generator (there is a resize method in the Gen object that "Creates a resized version of a generator" according to the Javadoc; maybe that means something different?).
val g = Gen.choose(1,5)
val g2 = Gen.resize(15, g)
println(matrix(g).sample) // (1)
println(matrix(g2).sample) // (2)
//1,2 produce Seq with same length
Could you explain what I have missed and give me some examples of how you use them in testing code?
The vectorOf (which now is replaced with listOf) generates lists with a size that depends (linearly) on the size parameter that ScalaCheck sets when it evaluates a generator. When ScalaCheck tests a property it will increase this size parameter for each test, resulting in properties that are tested with larger and larger lists (if listOf is used).
If you create a matrix generator by just using the listOf generator in a nested fashion, you will get matrices with a size that depends on the square of the size parameter. Hence when using such a generator in a property you might end up with very large matrices, since ScalaCheck increases the size parameter for each test run. However, if you use the resize generator combinator in the way it is done in the ScalaCheck User Guide, your final matrix size depends linearly on the size parameter, resulting in nicer performance when testing your properties.
You should really not have to use the resize generator combinator very often. If you need to generate lists that are bounded by some specific size, it's much better to do something like the example below instead, since there is no guarantee that the listOf/containerOf generators really use the size parameter the way you expect.
def genBoundedList[T](maxSize: Int, g: Gen[T]): Gen[List[T]] = {
Gen.choose(0, maxSize) flatMap { sz => Gen.listOfN(sz, g) }
}
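To relate this back to the matrix example in the question, here is a hedged sketch that assumes the ScalaCheck library is on the classpath and uses listOfN in place of the older vectorOf. The object and value names are illustrative. It shows why resizing the element generator does not change the matrix dimensions, while resizing the matrix generator does:

```scala
import org.scalacheck.Gen

object SizedMatrix {
  // Square matrix whose side is the integer square root of the size parameter.
  def matrix[T](g: Gen[T]): Gen[List[List[T]]] = Gen.sized { size =>
    val side = math.sqrt(size.toDouble).toInt
    Gen.listOfN(side, Gen.listOfN(side, g))
  }

  val element: Gen[Int] = Gen.choose(1, 5)

  // Resizing only the element generator leaves the matrix dimensions alone:
  // Gen.choose ignores the size parameter, and Gen.sized in matrix still
  // sees the outer size.
  val resizedElement: Gen[List[List[Int]]] = matrix(Gen.resize(16, element))

  // Resizing the matrix generator itself fixes the size seen by Gen.sized:
  // size 16 gives a 4 x 4 matrix.
  val resizedMatrix: Gen[List[List[Int]]] = Gen.resize(16, matrix(element))
}
```

Sampling `resizedMatrix` should therefore always give four rows of four elements, whereas `resizedElement` follows whatever size the caller or the test runner supplies.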
The vectorOf method that you use is deprecated, and you should use the listOf method. This generates a list of random length where the maximum length is limited by the size of the generator. You should therefore resize the generator that actually generates the list if you want control over the maximum number of elements that are generated:
scala> val g1 = Gen.choose(1,5)
g1: org.scalacheck.Gen[Int] = Gen()
scala> val g2 = Gen.listOf(g1)
g2: org.scalacheck.Gen[List[Int]] = Gen()
scala> g2.sample
res19: Option[List[Int]] = Some(List(4, 4, 4, 4, 2, 4, 2, 3, 5, 1, 1, 1, 4, 4, 1, 1, 4, 5, 5, 4, 3, 3, 4, 1, 3, 2, 2, 4, 3, 4, 3, 3, 4, 3, 2, 3, 1, 1, 3, 2, 5, 1, 5, 5, 1, 5, 5, 5, 5, 3, 2, 3, 1, 4, 3, 1, 4, 2, 1, 3, 4, 4, 1, 4, 1, 1, 4, 2, 1, 2, 4, 4, 2, 1, 5, 3, 5, 3, 4, 2, 1, 4, 3, 2, 1, 1, 1, 4, 3, 2, 2))
scala> val g3 = Gen.resize(10, g2)
g3: java.lang.Object with org.scalacheck.Gen[List[Int]] = Gen()
scala> g3.sample
res0: Option[List[Int]] = Some(List(1))
scala> g3.sample
res1: Option[List[Int]] = Some(List(4, 2))
scala> g3.sample
res2: Option[List[Int]] = Some(List(2, 1, 2, 4, 5, 4, 2, 5, 3))