I am at the beginning of my Scala journey. I am trying to find the row with the highest increase in a given dataset of type Map[String, List[Int]]. The program should calculate the increase (or decrease) between the 7th-last value and the last value of each row's List, and then print the row with the highest increase in the entire Map. For example, given the following dataset:
DATASET
SK1, 9, 7, 2, 0, 7, 3, 7, 9, 1, 2, 8, 1, 9, 6, 5, 3, 2, 2, 7, 2, 8, 5, 4, 5, 1, 6, 5, 2, 4, 1
SK2, 0, 7, 6, 3, 3, 3, 1, 6, 9, 2, 9, 7, 8, 7, 3, 6, 3, 5, 5, 2, 9, 7, 3, 4, 6, 3, 4, 3, 4, 1
SK3, 8, 7, 1, 8, 0, 5, 8, 3, 5, 9, 7, 5, 4, 7, 9, 8, 1, 4, 6, 5, 6, 6, 3, 6, 8, 8, 7, 4, 0, 7
The program should calculate the increase of each row:
SK1 = 1 "last value" - 5 "7th last value" = - 4
SK2 = 1 "last value" - 4 "7th last value" = - 3
SK3 = 7 "last value" - 6 "7th last value" = 1
The program should then print SK3 - 1 because it is the highest increase.
The program can calculate the increase of each row, but it currently needs an SK input, using the following two methods:
def rise(stock: String): (Int) = {
mapdata.get(stock).map(findLast(_)).getOrElse(0) -
(mapdata.get(stock).map(_.takeRight(7).head.toInt).getOrElse(0))
}
def stockRise(stock: String): (String, Int) = {
(stock, rise(stock))
}
The two methods are then called within the program menu using:
def handleFive(): Boolean = {
menuShowSingleDataStock(stockRise)
true
}
//Pull two rows from the dataset
def menuShowDoubleDataStock(resultCalculator: (String, String) => (String, Int)) = {
print("Please insert the Stock > ")
val stockName1 = readLine
print("Please insert the Stock > ")
val stockName2 = readLine
val result = resultCalculator(stockName1, stockName2)
println(s"${result._1}: ${result._2}")
}
I have tried to write the following method to calculate the rise of every row, but it doesn't seem to be working:
def menuShowStocks(f: () => Map[String, List[Int]]) = {
val highestIncrese = 0
f() foreach { case (x, y) => println(s"$x: $y") }
}
A common approach is:
first map each row, calculate the score
use an aggregation function to select the desired row
Here we go:
scala> val dataSet = Map(
| "SK1" -> List(9, 7, 2, 0, 7, 3, 7, 9, 1, 2, 8, 1, 9, 6, 5, 3, 2, 2, 7, 2, 8, 5, 4, 5, 1, 6, 5, 2, 4, 1),
| "SK2" -> List(0, 7, 6, 3, 3, 3, 1, 6, 9, 2, 9, 7, 8, 7, 3, 6, 3, 5, 5, 2, 9, 7, 3, 4, 6, 3, 4, 3, 4, 1),
| "SK3" -> List(8, 7, 1, 8, 0, 5, 8, 3, 5, 9, 7, 5, 4, 7, 9, 8, 1, 4, 6, 5, 6, 6, 3, 6, 8, 8, 7, 4, 0, 7)
| )
val dataSet: Map[String, List[Int]] = Map(SK1 -> List(9, 7, 2, 0, 7, 3, 7, 9, 1, 2, 8, 1, 9, 6, 5, 3, 2, 2, 7, 2, 8, 5, 4, 5, 1, 6, 5, 2, 4, 1), SK2 -> List(0, 7, 6, 3, 3, 3, 1, 6, 9, 2, 9, 7, 8, 7, 3, 6, 3, 5, 5, 2, 9, 7, 3, 4, 6, 3, 4, 3, 4, 1), SK3 -> List(8, 7, 1, 8, 0, 5, 8, 3, 5, 9, 7, 5, 4, 7, 9, 8, 1, 4, 6, 5, 6, 6, 3, 6, 8, 8, 7, 4, 0, 7))
scala> val highestIncrease = dataSet
| .toSeq
| .map { case (name, ints) =>
| name -> (ints.last - ints(ints.length - 7))
| }
| .maxBy(_._2)
val highestIncrease: (String, Int) = (SK3,1)
Some notes:
The map is converted to a Seq first with toSeq. Mapping over Maps directly is entirely possible but a bit more complicated, so it is better left for a later learning moment. This produces a Seq[(String, List[Int])].
Using map we iterate over the tuples in the Seq. This uses pattern matching to extract the variables name and ints.
The score is calculated. Also, we use the -> operator to construct a new tuple of 2 items so we hang on to the name of the row.
Method maxBy accepts a function to get a value. The expression _._2, equivalent to x => x._2 is a function that gives the second value in each tuple.
The following could print the name of what we found:
println(s"The highest increase is in dataset ${highestIncrease._1} and is ${highestIncrease._2}.")
I was given a list of apps along with their ratings:
let appRatings = [
"Calendar Pro": [1, 5, 5, 4, 2, 1, 5, 4],
"The Messenger": [5, 4, 2, 5, 4, 1, 1, 2],
"Socialise": [2, 1, 2, 2, 1, 2, 4, 2]
]
I want to write a func that takes appRatings as input and returns each app's name and average rating, like this:
["Calendar Pro": 3,
"The Messenger": 3,
"Socialise": 2]
Does anyone know how to implement a method that takes (name, [rating]) as input and outputs (name, avgRating), using a closure inside the func?
This is what I have so far.
func calculate( appName: String, ratings : [Int]) -> (String, Double ) {
let avg = ratings.reduce(0,+)/ratings.count
return (appName, Double(avg))
}
Fundamentally, what you're trying to achieve is a mapping between one set of values into another. Dictionary has a function for this, Dictionary.mapValues(_:), specifically for mapping values only (keeping them under the same keys).
let appRatings = [
"Calendar Pro": [1, 5, 5, 4, 2, 1, 5, 4],
"The Messenger": [5, 4, 2, 5, 4, 1, 1, 2],
"Socialise": [2, 1, 2, 2, 1, 2, 4, 2]
]
let avgAppRatings = appRatings.mapValues { allRatings in
return computeAverage(of: allRatings) // Dummy function we'll implement later
}
So now, it's a matter of figuring out how to average all the numbers in an Array. Luckily, this is very easy:
We need to sum all the ratings
We can easily achieve this with a reduce expression. We'll reduce all numbers by simply adding them into the accumulator, which will start at 0
allRatings.reduce(0, { accumulator, rating in accumulator + rating })
From here, we can notice that the closure, { accumulator, rating in accumulator + rating }, has type (Int, Int) -> Int and just adds the numbers together. Well hey, that's exactly what + does! We can just use it directly:
allRatings.reduce(0, +)
We need to divide the ratings by the number of ratings
There's a catch here. In order for the average to be of any use, it can't be truncated to a mere Int. So we need both the sum and the count to be converted to Double first.
You need to guard against empty arrays, whose sum and count will both be 0, so the division 0.0 / 0.0 results in Double.nan.
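To make the truncation and the empty-array case concrete, here is a minimal sketch. The helper name average(of:) is made up for illustration, and returning nil for an empty array is just one possible way to handle that case.

```swift
// Hypothetical helper: returns nil for an empty array instead of dividing by zero.
func average(of ratings: [Int]) -> Double? {
    guard !ratings.isEmpty else { return nil }
    // Convert both operands to Double before dividing so the fraction is kept.
    return Double(ratings.reduce(0, +)) / Double(ratings.count)
}

let ratings = [1, 5, 5, 4, 2, 1, 5, 4]                 // sums to 27
let truncated = ratings.reduce(0, +) / ratings.count   // Int division: 27 / 8 == 3
let exact = average(of: ratings)                       // Optional(3.375)

// What an unguarded division would compute for an empty array.
let noRatings: [Int] = []
let unguarded = Double(noRatings.reduce(0, +)) / Double(noRatings.count)

print(truncated, exact as Any, unguarded.isNaN)
```

The Int division silently drops the .375, while the unguarded empty case produces a NaN that would otherwise flow into later calculations unnoticed.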
Putting it all together, we get:
let appRatings = [
"Calendar Pro": [1, 5, 5, 4, 2, 1, 5, 4],
"The Messenger": [5, 4, 2, 5, 4, 1, 1, 2],
"Socialise": [2, 1, 2, 2, 1, 2, 4, 2]
]
let avgAppRatings = appRatings.mapValues { allRatings -> Double? in // explicit return type so that `return nil` type-checks
if allRatings.isEmpty { return nil }
return Double(allRatings.reduce(0, +)) / Double(allRatings.count)
}
Add in some nice printing logic:
extension Dictionary {
var toDictionaryLiteralString: String {
return """
[
\t\(self.map { k, v in "\(k): \(v)" }.joined(separator: "\n\t"))
]
"""
}
}
... and boom:
print(avgAppRatings.toDictionaryLiteralString)
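/* prints (key order may vary; the values are optionals because empty arrays map to nil):
[
Socialise: Optional(2.0)
The Messenger: Optional(3.0)
Calendar Pro: Optional(3.375)
]
*/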
Comments on your attempt
You had some questions as to why your attempt didn't work:
func calculate( appName: String, ratings : [Int]) -> (String: Int ) {
var avg = ratings.reduce(0,$0+$1)/ratings.count
return appName: sum/avg
}
$0+$1 isn't within a closure ({ }), as it needs to be.
appName: sum/avg isn't valid Swift.
The variable sum doesn't exist.
avg is a var variable, even though it's never mutated. It should be a let constant.
You're doing integer division, which doesn't support decimals. You'll need to convert your sum and count into a floating point type, like Double, first.
A fixed version might look like:
func calculateAverage(of numbers: [Int]) -> Double {
let sum = Double(numbers.reduce(0, +))
let count = Double(numbers.count)
return sum / count
}
To make a function that processes your whole dictionary, incorporating my solution above, you might write a function like:
func calculateAveragesRatings(of appRatings: [String: [Int]]) -> [String: Double?] {
return appRatings.mapValues { allRatings in
if allRatings.isEmpty { return nil }
return Double(allRatings.reduce(0, +)) / Double(allRatings.count)
}
}
This is a simple solution that takes into account that a rating is an integer:
let appRatings = [
"Calendar Pro": [1, 5, 5, 4, 2, 1, 5, 4],
"The Messenger": [5, 4, 2, 5, 4, 1, 1, 2],
"Socialise": [2, 1, 2, 2, 1, 2, 4, 2]
]
let appWithAverageRating: [String: Int] = appRatings.mapValues { $0.reduce(0, +) / $0.count}
print("appWithAverageRating =", appWithAverageRating)
prints appWithAverageRating = ["The Messenger": 3, "Calendar Pro": 3, "Socialise": 2]
If you'd like to check whether an app has enough ratings before returning an average rating, then the rating would be an optional Int:
let minimumNumberOfRatings = 0 // You can change this
var appWithAverageRating: [String: Int?] = appRatings.mapValues { ratingsArray in
guard ratingsArray.count > minimumNumberOfRatings else {
return nil
}
return ratingsArray.reduce(0, +) / ratingsArray.count
}
If you'd like the ratings to go by half stars (0, 0.5, 1, ..., 4.5, 5) then we could use this extension:
extension Double {
func roundToHalf() -> Double {
let n = 1/0.5
let numberToRound = self * n
return numberToRound.rounded() / n
}
}
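As a quick check of the rounding behaviour, here is a small sketch using a free function with the same logic. The name roundedToHalf(_:) is hypothetical and chosen so that it does not clash with the extension above.

```swift
// Hypothetical free-function version of the same half-star rounding logic.
func roundedToHalf(_ value: Double) -> Double {
    // Scale by 2, round to the nearest integer, then scale back down.
    return (value * 2).rounded() / 2
}

let samples = [3.375, 3.2, 2.75, 4.0]
let rounded = samples.map(roundedToHalf)
print(rounded) // [3.5, 3.0, 3.0, 4.0]
```

Note that rounded() rounds ties away from zero, so 2.75 (which scales to exactly 5.5) becomes 3.0 rather than 2.5.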
Then the rating will be an optional Double. Let's add an AppWithoutRatings and test our code:
let appRatings = [
"Calendar Pro": [1, 5, 5, 4, 2, 1, 5, 4],
"The Messenger": [5, 4, 2, 5, 4, 1, 1, 2],
"Socialise": [2, 1, 2, 2, 1, 2, 4, 2],
"AppWithoutRatings": []
]
let minimumNumberOfRatings = 0
var appWithAverageRating: [String: Double?] = appRatings.mapValues { ratingsArray in
guard ratingsArray.count > minimumNumberOfRatings else {
return nil
}
let rating = Double(ratingsArray.reduce(0, +)) / Double(ratingsArray.count) // divide as Double so the fraction survives until rounding
return rating.roundToHalf()
}
And this prints:
appWithAverageRating = ["Calendar Pro": Optional(3.5), "Socialise": Optional(2.0), "The Messenger": Optional(3.0), "AppWithoutRatings": nil]
I decided to make a Dictionary extension for this, so it is easy to reuse in the future.
Here is the code I created:
extension Dictionary where Key == String, Value == [Float] {
func averageRatings() -> [String : Float] {
// Calculate average
func average(ratings: [Float]) -> Float {
return ratings.reduce(0, +) / Float(ratings.count)
}
// Go through every item in the ratings dictionary
return self.mapValues { $0.isEmpty ? 0 : average(ratings: $0) }
}
}
let appRatings: [String : [Float]] = ["Calendar Pro": [1, 5, 5, 4, 2, 1, 5, 4],
"The Messenger": [5, 4, 2, 5, 4, 1, 1, 2],
"Socialise": [2, 1, 2, 2, 1, 2, 4, 2]]
print(appRatings.averageRatings())
which will print the result of ["Calendar Pro": 3.375, "Socialise": 2.0, "The Messenger": 3.0].
Just to make the post complete, here is another approach that uses reduce(into:) to avoid a dictionary with an optional value type:
extension Dictionary where Key == String, Value: Collection, Value.Element: BinaryInteger {
var averageRatings: [String : Value.Element] {
return reduce(into: [:]) {
if !$1.value.isEmpty {
$0[$1.key] = $1.value.reduce(0,+) / Value.Element($1.value.count)
}
}
}
}
let appRatings2 = ["Calendar Pro" : [1, 5, 5, 4, 2, 1, 5, 4],
"The Messenger": [5, 4, 2, 5, 4, 1, 1, 2],
"Socialise" : [2, 1, 2, 2, 1, 2, 4, 2] ]
let keySorted = appRatings2.averageRatings.sorted(by: {$0.key<$1.key})
keySorted.forEach { print($0, $1) } // forEach rather than map, since only the side effect is wanted
Calendar Pro 3
Socialise 2
The Messenger 3
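The same reduce(into:) idea can also be sketched for fractional averages. This is only an illustration under assumptions: the sampleRatings dictionary and its "Unrated" entry are made up, and apps without ratings are simply left out of the result.

```swift
// Hypothetical sample data, including one app with no ratings.
let sampleRatings: [String: [Int]] = [
    "Calendar Pro": [1, 5, 5, 4, 2, 1, 5, 4],
    "Socialise": [2, 1, 2, 2, 1, 2, 4, 2],
    "Unrated": [],
]

// Build a non-optional [String: Double]; empty rating lists are skipped.
let averages: [String: Double] = sampleRatings.reduce(into: [:]) { result, entry in
    guard !entry.value.isEmpty else { return }
    result[entry.key] = Double(entry.value.reduce(0, +)) / Double(entry.value.count)
}

for (name, average) in averages.sorted(by: { $0.key < $1.key }) {
    print(name, average)
}
```

Because skipped apps never get a key, a later lookup such as averages["Unrated"] returns nil without the dictionary itself needing an optional value type.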
I would like to use Array.partition(by:) to move some predefined elements of an array to the end of it.
Example:
var my_array = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
let elementsToMove = [1, 3, 4, 5, 8]
// desired result: [0, 2, 6, 7, 9, ...remaining items in any order...]
Is there an elegant way to do that? Observe that elementsToMove does not follow a pattern.
partition(by:) does not preserve the order of the elements:
var my_array = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
let elementsToMove = [1, 3, 4, 5, 8]
_ = my_array.partition(by: { elementsToMove.contains($0) } )
print(my_array) // [0, 9, 2, 7, 6, 5, 4, 3, 8, 1]
A simple solution would be to filter out the elements to move and then
append the second array:
let my_array = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
let elementsToMove = [1, 3, 4, 5, 8]
let newArray = my_array.filter({ !elementsToMove.contains($0) }) + elementsToMove
print(newArray) // [0, 2, 6, 7, 9, 1, 3, 4, 5, 8]
For larger arrays it can be advantageous to create a set of the
to-be-moved elements first:
let my_array = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
let elementsToMove = [1, 3, 4, 5, 8]
let setToMove = Set(elementsToMove)
let newArray = my_array.filter({ !setToMove.contains($0) }) + elementsToMove
print(newArray) // [0, 2, 6, 7, 9, 1, 3, 4, 5, 8]
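The Set-based variant could be wrapped in a small reusable function, sketched below. The name movingToEnd(_:in:) is hypothetical. Unlike the snippets above, this version appends the matching elements taken from the array itself, so values listed in toMove that are not present in the array are not introduced.

```swift
// Hypothetical helper: moves every element found in `toMove` to the end,
// keeping the relative order of both the kept and the moved elements.
func movingToEnd<T: Hashable>(_ toMove: [T], in array: [T]) -> [T] {
    let moveSet = Set(toMove)                          // O(1) average lookups
    let kept = array.filter { !moveSet.contains($0) }  // elements that stay in front
    let moved = array.filter { moveSet.contains($0) }  // only elements actually present
    return kept + moved
}

let reordered = movingToEnd([1, 3, 4, 5, 8], in: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
print(reordered) // [0, 2, 6, 7, 9, 1, 3, 4, 5, 8]
```

This makes two passes over the array, which keeps the code simple at the cost of a second filter.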
If the objects in my_array are unique, then you can try something like this.
var my_array = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
var tempArray = my_array //Preserve original array
let elementsToMove = [1, 3, 4, 5, 8]
let p = tempArray.partition(by: elementsToMove.contains)
//Now sort first part of tempArray on basis of your my_array to get order you want
let newArray = tempArray[0..<p].sorted(by: { my_array.firstIndex(of: $0)! < my_array.firstIndex(of: $1)! }) + tempArray[p...] // firstIndex(of:) replaces the deprecated index(of:)
print(newArray)
Output
[0, 2, 6, 7, 9, 5, 4, 3, 8, 1]
Why does the ArrayBuffer in the MapPartition seem to have elements that it has not traversed yet?
For instance, the way I read this code, the first item should have 1 element, the second 2, the third 3, and so on. How is it possible that the first ArrayBuffer in the output has 9 items? That would seem to imply that there were 9 iterations prior to the first output, but the yields count makes it clear that this was the first iteration.
val a = ArrayBuffer[Int]()
for(i <- 1 to 9) a += i
for(i <- 1 to 9) a += 9-i
val rdd1 = sc.parallelize(a.toArray())
def timePivotWithLoss(iter: Iterator[Int]) : Iterator[Row] = {
val currentArray = ArrayBuffer[Int]()
var loss = 0
var yields = 0
for (item <- iter) yield {
currentArray += item
//var left : Int = -1
yields += 1
Row(yields, item.toString(), currentArray)
}
}
rdd1.mapPartitions(it => timePivotWithLoss(it)).collect()
Output -
[1,1,ArrayBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9)]
[2,2,ArrayBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9)]
[3,3,ArrayBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9)]
[4,4,ArrayBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9)]
[5,5,ArrayBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9)]
[6,6,ArrayBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9)]
[7,7,ArrayBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9)]
[8,8,ArrayBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9)]
[9,9,ArrayBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9)]
[1,8,ArrayBuffer(8, 7, 6, 5, 4, 3, 2, 1, 0)]
[2,7,ArrayBuffer(8, 7, 6, 5, 4, 3, 2, 1, 0)]
[3,6,ArrayBuffer(8, 7, 6, 5, 4, 3, 2, 1, 0)]
[4,5,ArrayBuffer(8, 7, 6, 5, 4, 3, 2, 1, 0)]
[5,4,ArrayBuffer(8, 7, 6, 5, 4, 3, 2, 1, 0)]
[6,3,ArrayBuffer(8, 7, 6, 5, 4, 3, 2, 1, 0)]
[7,2,ArrayBuffer(8, 7, 6, 5, 4, 3, 2, 1, 0)]
[8,1,ArrayBuffer(8, 7, 6, 5, 4, 3, 2, 1, 0)]
[9,0,ArrayBuffer(8, 7, 6, 5, 4, 3, 2, 1, 0)]
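This happens because all rows in the partition hold a reference to the same mutable object. Every Row points at the one ArrayBuffer, so by the time the rows are collected and printed, that buffer already contains every item of the partition. Spilling to disk could further make it non-deterministic, with some objects being serialized early and not reflecting later changes.
You can use a mutable reference to an immutable object instead: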
def timePivotWithLoss(iter: Iterator[Int]) : Iterator[Row] = {
var currentArray = Vector[Int]()
var loss = 0
var yields = 0
for (item <- iter) yield {
currentArray = currentArray :+ item
yields += 1
Row(yields, item.toString(), currentArray)
}
}
but in general, mutable state and Spark are not a good match.
How can I print the type of the largest number in this dictionary?
let interestingNumbers = [
"Prime": [2, 3, 5, 7, 11, 13],
"Fibonacci": [1, 1, 2, 3, 5, 8],
"Square": [1, 4, 9, 16, 25],
]
var largest = 0
var typeoflargest:String = " "
for (kind, numbers) in interestingNumbers {
for type in kind.characters {
for number in numbers {
if number > largest {
largest = number
typeoflargest = String(type)
}
}
}
}
print(largest)
print(typeoflargest)
output:
25
S
Why do I get only the first character "S" instead of "Square"?
There is no reason to be iterating the characters of the kind string. Just do the following:
let interestingNumbers = [
"Prime": [2, 3, 5, 7, 11, 13],
"Fibonacci": [1, 1, 2, 3, 5, 8],
"Square": [1, 4, 9, 16, 25],
]
var largest = 0
var typeoflargest:String = ""
for (kind, numbers) in interestingNumbers {
for number in numbers {
if number > largest {
largest = number
typeoflargest = kind
}
}
}
print(largest)
print(typeoflargest)
Output:
25
Square
Alternative approach:
let interestingNumbers = [
"Prime": [2, 3, 5, 7, 11, 13],
"Fibonacci": [1, 1, 2, 3, 5, 8],
"Square": [1, 4, 9, 16, 25],
]
let maximum = interestingNumbers
.map{ type, numbers in return (type: type, number: numbers.max()!) }
.max(by: { $0.number < $1.number })!
print(maximum.type, maximum.number)
Explanation:
First, get the maximal element of each category. Do this by iterating the dictionary, mapping the values from arrays of numbers to maximum numbers (within their respective arrays), yielding:
[
(type: "Square", number: 25), // 25 is the max of [1, 4, 9, 16, 25]
(type: "Prime", number: 13), // 13 is the max of [2, 3, 5, 7, 11, 13]
(type: "Fibonacci", number: 8) // 8 is the max of [1, 1, 2, 3, 5, 8]
]
Then, get the maximal type/number pair, by comparing their numbers, yielding the result:
(type: "Square", number: 25) // 25 is the max of 25, 13, 8
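The force unwraps above would crash if a category had no numbers or the dictionary were empty. A hedged variant that avoids them could look like the sketch below. The function name largestEntry(in:) and the "Empty" entry are assumptions added for illustration.

```swift
// Hypothetical sample data, with one empty category added for illustration.
let interestingNumbers: [String: [Int]] = [
    "Prime": [2, 3, 5, 7, 11, 13],
    "Fibonacci": [1, 1, 2, 3, 5, 8],
    "Square": [1, 4, 9, 16, 25],
    "Empty": [],
]

// Hypothetical helper: returns nil when no category contains any number.
func largestEntry(in groups: [String: [Int]]) -> (kind: String, number: Int)? {
    return groups
        .compactMap { entry -> (kind: String, number: Int)? in
            // Skip categories with no numbers instead of force-unwrapping max().
            guard let localMax = entry.value.max() else { return nil }
            return (kind: entry.key, number: localMax)
        }
        .max(by: { $0.number < $1.number })
}

let largest = largestEntry(in: interestingNumbers)
if let largest = largest {
    print(largest.kind, largest.number) // Square 25
}
```

The two steps are the same as in the explanation: compactMap computes the per-category maximum (dropping empty categories), and max(by:) then picks the overall winner.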