Summing Set values in Scala - scala

I have a Set of tuples and I'd like to sum the integer part of its values. But when I run this code it returns 1, when 3 is expected. I suppose that's because map on a Set also returns a Set, so duplicate results are eliminated.
object Main:
  def main(args: Array[String]): Unit = {
    val pairs = Set(("one", 1), ("two", 1), ("three", 1))
    val sum = pairs.map(pair => pair._2).sum
    println(sum) // prints 1
  }
My expectations are based on how such things work in Java. A Set has distinct elements, but a Stream doesn't until distinct() or .collect(toSet()) is used. Accordingly, the result there is 3, as expected.
import org.apache.commons.lang3.tuple.Pair;
import java.util.Set;
public class Main {
    public static void main(String... args) {
        var pairs = Set.of(Pair.of("one", 1), Pair.of("two", 1), Pair.of("three", 1));
        var sum = pairs.stream()
                .map(Pair::getRight)
                .reduce(0, Integer::sum);
        System.out.println(sum); // prints 3
    }
}
My current ideas for getting the expected result (3) are:
Convert the Set to a List, but that doesn't seem like a good solution:
val sum = List.from(pairs).map(pair => pair._2).sum
Use foldLeft:
val sum = pairs.foldLeft(0)((a, b) => a + b._2)
But maybe there are more convenient methods?

You can use iterator or view to get a collection representation that does not eliminate duplicates when mapped:
val sum = pairs.iterator.map(_._2).sum
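As a small self-contained sketch of why this works (the value names below are only illustrative): map on a Set builds another Set, so equal results collapse, while an iterator, a view, or a fold keeps every mapped element.

```scala
val pairs = Set(("one", 1), ("two", 1), ("three", 1))

// map on a Set yields a Set, so the three 1s collapse into a single element
val collapsed = pairs.map(_._2).sum             // 1

// an Iterator is not a Set, so every mapped value is kept
val viaIterator = pairs.iterator.map(_._2).sum  // 3

// a view maps lazily and does not build an intermediate Set
val viaView = pairs.view.map(_._2).sum          // 3

// foldLeft accumulates directly, without any intermediate collection
val viaFold = pairs.foldLeft(0)(_ + _._2)       // 3
```

The iterator and view variants avoid copying the elements into a new List first, which is the main difference from the List.from approach in the question.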


TapeEquilibrium ScalaCheck

I have been trying to code some scalacheck property to verify the Codility TapeEquilibrium problem. For those who do not know the problem, see the following link: https://app.codility.com/programmers/lessons/3-time_complexity/tape_equilibrium/.
I wrote the following code, which is still incomplete.
test("Lesson 3 property"){
  val left = Gen.choose(-1000, 1000).sample.get
  val right = Gen.choose(-1000, 1000).sample.get
  val expectedSum = Math.abs(left - right)
  val leftArray = Gen.listOfN(???, left) retryUntil (_.sum == left)
  val rightArray = Gen.listOfN(???, right) retryUntil (_.sum == right)
  val property = forAll(leftArray, rightArray){ (r: List[Int], l: List[Int]) =>
    val array = (r ++ l).toArray
    Lesson3.solution3(array) == expectedSum
  }
  property.check()
}
The idea is as follows. I choose two random numbers (the values left and right) and calculate their absolute difference. Then I generate two arrays of random numbers whose sums are "left" and "right" respectively. By concatenating these arrays, I should be able to verify the property.
My issue is generating leftArray and rightArray. This is itself a complex problem that I would have to code a solution for, so writing this property seems over-complicated.
Is there any way to code this? Is coding this property overkill?
Best.
"My issue is then generating the leftArray and rightArray"
One way to generate these arrays (or lists) is to provide a generator of non-empty lists whose elements sum to a given number, in other words, something defined by a method like this:
import org.scalacheck.{Gen, Properties}
import org.scalacheck.Prop.forAll
def listOfSumGen(expectedSum: Int): Gen[List[Int]] = ???
which satisfies the property:
forAll(Gen.choose(-1000, 1000)){ sum: Int =>
forAll(listOfSumGen(sum)){ listOfSum: List[Int] =>
(listOfSum.sum == sum) && listOfSum.nonEmpty
}
}
Building such a list constrains only one of its elements, so here is a way to generate it:
1. Generate an arbitrary list.
2. Compute the extra, constrained element as expectedSum minus the sum of that list.
3. Insert the constrained element at a random index of the list (any permutation of the list would work).
So we get:
def listOfSumGen(expectedSum: Int): Gen[List[Int]] =
  for {
    list <- Gen.listOf(Gen.choose(-1000,1000))
    constrainedElement = expectedSum - list.sum
    index <- Gen.oneOf(0 to list.length)
  } yield list.patch(index, List(constrainedElement), 0)
With the above generator, leftArray and rightArray can be defined as follows:
val leftArray = listOfSumGen(left)
val rightArray = listOfSumGen(right)
However, I think the overall approach of the described property is incorrect: it builds an array in which one specific partition yields expectedSum, but this doesn't ensure that no other partition of the array produces a smaller difference.
Here is a counter-example run-through:
val left = Gen.choose(-1000, 1000).sample.get // --> 4
val right = Gen.choose(-1000, 1000).sample.get // --> 9
val expectedSum = Math.abs(left - right) // --> |4 - 9| = 5
val leftArray = listOfSumGen(left) // Let's assume one of its sample would provide List(3,1) (whose sum equals 4)
val rightArray = listOfSumGen(right)// Let's assume one of its sample would provide List(2,4,3) (whose sum equals 9)
val property = forAll(leftArray, rightArray){ (l: List[Int], r: List[Int]) =>
  // l = List(3,1)
  // r = List(2,4,3)
  val array = (l ++ r).toArray // --> Array(3,1,2,4,3) which is the array from the given example in the exercise
  Lesson3.solution3(array) == expectedSum
  // According to the example Lesson3.solution3(array) equals 1 which is different from 5
}
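Here is an example of a correct property that essentially applies the definition. The problem only allows split points P from 1 to N-1, so the array needs at least two elements and index 0 must be excluded (otherwise the property can fail, e.g. for Array(5, -5)):
def tapeDifference(index: Int, array: Array[Int]): Int = {
  val (left, right) = array.splitAt(index)
  Math.abs(left.sum - right.sum)
}
// At least two elements, as the problem requires (the upper bound of 100 is arbitrary)
val arrayGen = Gen.choose(2, 100).flatMap(n => Gen.listOfN(n, Gen.choose(-1000, 1000)))
forAll(arrayGen) { list: List[Int] =>
  val array = list.toArray
  // Valid split points are 1 to N-1
  forAll(Gen.oneOf(1 until array.length)) { index =>
    Lesson3.solution3(array) <= tapeDifference(index, array)
  }
}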
This property definition might collide with the way the actual solution has been implemented (which is one of the potential pitfalls of ScalaCheck). However, an implementation derived directly from the definition would be slow and inefficient, so this is more a way to check an optimized, fast implementation against a slow but correct one (see this presentation).
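The idea of checking a fast implementation against a slow but obviously correct one can be sketched without ScalaCheck. The names slowTapeEquilibrium and fastTapeEquilibrium are hypothetical and are not the asker's Lesson3.solution3; the random inputs are an assumption that mirrors the value range used above, with arrays of at least two elements.

```scala
import scala.util.Random

// Reference implementation: try every split point P in 1 until N.
def slowTapeEquilibrium(a: Array[Int]): Int =
  (1 until a.length).map { p =>
    val (left, right) = a.splitAt(p)
    math.abs(left.sum - right.sum)
  }.min

// Linear-time candidate: keep a running prefix sum instead of re-summing.
def fastTapeEquilibrium(a: Array[Int]): Int = {
  val total = a.sum
  var prefix = 0
  var best = Int.MaxValue
  for (p <- 1 until a.length) {
    prefix += a(p - 1)
    // |left - right| where left = prefix and right = total - prefix
    best = math.min(best, math.abs(total - 2 * prefix))
  }
  best
}

// Compare both on random arrays of length >= 2 with elements in [-1000, 1000].
val rng = new Random(42)
val allAgree = (1 to 200).forall { _ =>
  val a = Array.fill(2 + rng.nextInt(20))(rng.nextInt(2001) - 1000)
  slowTapeEquilibrium(a) == fastTapeEquilibrium(a)
}
```

For the array from the exercise, Array(3, 1, 2, 4, 3), both functions return 1, which matches the counter-example discussed above.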
Try this with C#:
using System;
using System.Collections.Generic;
using System.Linq;
private static int TapeEquilibrium(int[] A)
{
var sumA = A.Sum();
var size = A.Length;
var take = 0;
var res = new List<int>();
for (int i = 1; i < size; i++)
{
take = take + A[i-1];
var resp = Math.Abs((sumA - take) - take);
res.Add(resp);
if (resp == 0) return resp;
}
return res.Min();
}

Scala - how to make the SortedSet with custom ordering hold multiple different objects that have the same value by which we sort?

As mentioned in the title, I have a SortedSet with a custom ordering. The set holds objects of class Edge (representing an edge in a graph). Each Edge has a cost associated with it, as well as its start and end points.
case class Edge(firstId : Int, secondId : Int, cost : Int) {}
My ordering for SortedSet of edges looks like this (it's for the A* algorithm) :
object Ord {
  val edgeCostOrdering: Ordering[Edge] = Ordering.by { edge : Edge =>
    if (edge.secondId == goalId) graphRepresentation.calculateStraightLineCost(edge.firstId, goalId)
    else edge.cost + graphRepresentation.calculateStraightLineCost(edge.secondId, goalId)
  }
}
However, after I apply this ordering to the set and add edges that have different start/end points but the same cost, only the last encountered edge is retained in the set.
For example :
val testSet : SortedSet[Edge] = SortedSet[Edge]()(Ord.edgeCostOrdering)
val testSet2 = testSet + Edge(1,4,2)
val testSet3 = testSet2 + Edge(3,2,2)
println(testSet3)
Only prints (3,2,2)
Aren't these distinct objects? They only share the same value for one field so shouldn't the Set be able to handle this?
Consider using a mutable.PriorityQueue instead; it can keep multiple elements that have the same priority. Here is a simpler example where we order pairs by the second component:
import collection.mutable.PriorityQueue
implicit val twoOrd = math.Ordering.by{ (t: (Int, Int)) => t._2 }
val p = new PriorityQueue[(Int, Int)]()(twoOrd)
p += ((1, 2))
p += ((42, 2))
Even though both pairs are mapped to 2, and therefore have the same priority, the queue does not lose any elements:
p foreach println
(1,2)
(42,2)
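For the A* use case, it may help to see how elements come out of such a queue. The following is a sketch under assumptions: the Edge class is taken from the question, byCost is a stand-in ordering on the raw cost (the real one would add the straight-line heuristic), and the ordering is reversed because PriorityQueue dequeues its largest element first. Iterating with foreach is not guaranteed to follow priority order, so dequeue is the method to use when taking the cheapest edge.

```scala
import scala.collection.mutable.PriorityQueue

final case class Edge(firstId: Int, secondId: Int, cost: Int)

// Stand-in ordering: lowest cost first, hence the reverse.
val byCost: Ordering[Edge] = Ordering.by[Edge, Int](_.cost).reverse

val open = new PriorityQueue[Edge]()(byCost)
open += Edge(1, 4, 2)
open += Edge(3, 2, 2)
open += Edge(5, 6, 1)

// The cheapest edge comes out first.
val cheapest = open.dequeue()   // Edge(5,6,1)

// Both edges with cost 2 are still in the queue; neither was dropped.
val second = open.dequeue()
val third = open.dequeue()
```

The relative order in which the two equal-cost edges are dequeued is not specified, which is usually acceptable for an open list.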
To retain all the distinct Edges with the same ordering cost value in the SortedSet, you can modify your Ordering.by's function to return a Tuple that includes the edge Ids as well:
val edgeCostOrdering: Ordering[Edge] = Ordering.by { edge: Edge =>
val cost = if (edge.secondId == goalId) ... else ...
(cost, edge.firstId, edge.secondId)
}
A quick proof of concept below:
import scala.collection.immutable.SortedSet
case class Foo(a: Int, b: Int)
val fooOrdering: Ordering[Foo] = Ordering.by(_.b)
val ss = SortedSet(Foo(2, 2), Foo(2, 1), Foo(1, 2))(fooOrdering)
// ss: scala.collection.immutable.SortedSet[Foo] = TreeSet(Foo(2,1), Foo(1,2))
val fooOrdering: Ordering[Foo] = Ordering.by(foo => (foo.b, foo.a))
val ss = SortedSet(Foo(2, 2), Foo(2, 1), Foo(1, 2))(fooOrdering)
// ss: scala.collection.immutable.SortedSet[Foo] = TreeSet(Foo(1,2), Foo(2,1), Foo(2,2))

Scala: Safe access to index in a List[DataFrame]

I receive a List[DataFrame] and I want to store each df in a variable. Some values always exist in the list:
val routes = dataframes(0)
val stops = dataframes(1)
But others may also be present, so the size of the list varies.
How can I safely access a list index that may be out of bounds? I thought that wrapping the access in Some() and handling the result would work:
val fare_attributes : Option[DataFrame] = Some(dataframes(10))
fare_attributes match {
  case Some(fare) =>
    upload()
    println("fare_attributes uploaded")
  case None => println("No fare_attributes found")
}
But I receive: java.lang.IndexOutOfBoundsException: 2
You can use .lift on your list:
val fare_attributes : Option[DataFrame] = dataframes.lift(10)
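A short sketch of the difference, using a List[String] as a stand-in for the List[DataFrame] (the names tables and describe are only illustrative): Some(dataframes(10)) evaluates dataframes(10) first, so the exception is thrown before Some is ever constructed, whereas lift returns None for any index that is out of bounds.

```scala
// Stand-in for the List[DataFrame] from the question.
val tables: List[String] = List("routes", "stops")

val routes: Option[String] = tables.lift(0)           // Some("routes")
val fareAttributes: Option[String] = tables.lift(10)  // None, no exception
val negative: Option[String] = tables.lift(-1)        // None as well

// Handling the Option the same way as in the question.
def describe(table: Option[String]): String = table match {
  case Some(name) => s"$name uploaded"
  case None       => "not found"
}
```

With this, describe(fareAttributes) yields "not found" instead of failing at the point of access.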
I think you will have to check the length of the list before accessing the indexed value. You may want to implement a wrapper function so that the check isn't repeated.
You can make this a bit more concise by currying with two parameter lists. Here's a sample, which you may improve:
def safeList(list: List[Int])(index: Int): Int = {
  // guard against negative indices as well as indices past the end
  if (index >= 0 && index < list.length) list(index)
  else 0
}
val x = List(1, 2, 3)
val y = safeList(x)(_)
val a = y(0) // returns 1
val b = y(1) // returns 2
val c = y(4) // returns 0

variable parameters in Scala constructor

I would like to write a Matrix class in Scala from that I can instantiate objects like this:
val m1 = new Matrix( (1.,2.,3.),(4.,5.,6.),(7.,8.,9.) )
val m2 = new Matrix( (1.,2.,3.),(4.,5.,6.) )
val m3 = new Matrix( (1.,2.),(3.,4.) )
val m4 = new Matrix( (1.,2.),(3.,4.),(5.,6.) )
I have tried this:
class Matrix(a: (List[Double])* ) { }
but then I get a type mismatch because the matrix rows are not of type List[Double].
Further it would be nice to just have to type Integers (1,2,3) instead of (1.,2.,3.) but still get a double matrix.
How to solve this?
Thanks!
Malte
(1.0, 2.0) is a Tuple2[Double, Double] not a List[Double]. Similarly (1.0, 2.0, 3.0) is a Tuple3[Double, Double, Double].
If you need to handle a fixed row cardinality, the simplest solution in plain vanilla Scala would be to have
class Matrix2(rows: Tuple2[Double, Double]*)
class Matrix3(rows: Tuple3[Double, Double, Double]*)
and so on.
Since there exists an implicit conversion from Int to Double, you can pass a tuple of Ints and it will be converted automatically.
new Matrix2((1, 2), (3, 4))
If you instead need to abstract over the row cardinality, enforcing an NxM using types, you would have to resort to some more complex solutions, perhaps using the shapeless library.
Or you can use an actual list, but you cannot restrict the cardinality, i.e. you cannot ensure that all rows have the same length (again, in vanilla scala, shapeless can help)
class Matrix(rows: List[Double]*)
new Matrix(List(1, 2), List(3, 4))
Finally, the 1. literal syntax is deprecated since scala 2.10 and removed in scala 2.11. Use 1.0 instead.
If you need support for very large matrices, consider using an existing implementation like Breeze. Breeze has a DenseMatrix which probably meets your requirements. For performance reasons, Breeze offloads more complex operations into native code.
Getting matrix algebra right is a difficult exercise, and unless you are implementing a matrix specifically for learning or as an assignment, it is better to go with proven libraries.
Edited based on comment below:
You can consider the following design.
case class Row(n : Int*)
class Matrix(rows: Row*) {...}
Usage:
val m = new Matrix(Row(1, 2, 3), Row(2,3,4))
You need to validate that all Rows are of the same length and reject the input if they are not.
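The validation could be sketched as follows. This is only an illustration under assumptions: Row holds Double values (since the question asks for a double matrix), require is used to reject bad input, and the members rowCount, colCount and apply are hypothetical additions.

```scala
// A row of values; integer literals such as Row(1, 2, 3) are accepted as Doubles.
final case class Row(values: Double*)

final class Matrix(rows: Row*) {
  require(rows.nonEmpty, "matrix needs at least one row")

  private val width = rows.head.values.length
  require(width > 0, "rows must not be empty")
  require(rows.forall(_.values.length == width), "all rows must have the same length")

  val rowCount: Int = rows.length
  val colCount: Int = width

  // Element access by row and column index (0-based).
  def apply(r: Int, c: Int): Double = rows(r).values(c)
}

val m = new Matrix(Row(1, 2, 3), Row(4, 5, 6))
```

Constructing new Matrix(Row(1, 2), Row(3)) throws an IllegalArgumentException, because require fails when the row lengths differ.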
I have hacked together a solution in what I think is a somewhat un-Scala-like way:
class Matrix(values: Object*) { // values is of type WrappedArray[Object]
var arr : Array[Double] = null
val rows : Integer = values.size
var cols : Integer = 0
var _arrIndex = 0
for(row <- values) {
// if Tuple (Tuple extends Product)
if(row.isInstanceOf[Product]) {
var colcount = row.asInstanceOf[Product].productIterator.length
assert(colcount > 0, "empty row")
assert(cols == 0 || colcount == cols, "varying number of columns")
cols = colcount
if(arr == null) {
arr = new Array[Double](rows*cols)
}
for(el <- row.asInstanceOf[Product].productIterator) {
var x : Double = 0.0
if(el.isInstanceOf[Integer]) {
x = el.asInstanceOf[Integer].toDouble
}
else {
assert(el.isInstanceOf[Double], "unknown element type")
x = el.asInstanceOf[Double]
}
arr(_arrIndex) = x
_arrIndex = _arrIndex + 1
}
}
}
}
It works like this:
object ScalaMatrix extends App {
val m1 = new Matrix((1.0,2.0,3.0),(5,4,5))
val m2 = new Matrix((9,8),(7,6))
println(m1.toString())
println(m2.toString())
}
I don't really like it. What do you think about it?

Getting a HashMap from Scala's HashMap.mapValues?

The example below is a self-contained example I've extracted from my larger app.
Is there a better way to get a HashMap after calling mapValues below? I'm new to Scala, so it's very likely that I'm going about this all wrong, in which case feel free to suggest a completely different approach. (An apparently obvious solution would be to move the logic in the mapValues to inside the accum but that would be tricky in the larger app.)
#!/bin/sh
exec scala "$0" "$@"
!#
import scala.collection.immutable.HashMap
case class Quantity(val name: String, val amount: Double)
class PercentsUsage {
type PercentsOfTotal = HashMap[String, Double]
var quantities = List[Quantity]()
def total: Double = (quantities map { t => t.amount }).sum
def addQuantity(qty: Quantity) = {
quantities = qty :: quantities
}
def percentages: PercentsOfTotal = {
def accum(m: PercentsOfTotal, qty: Quantity) = {
m + (qty.name -> (qty.amount + (m getOrElse (qty.name, 0.0))))
}
val emptyMap = new PercentsOfTotal()
// The `emptyMap ++` at the beginning feels clumsy, but it does the
// job of giving me a PercentsOfTotal as the result of the method.
emptyMap ++ (quantities.foldLeft(emptyMap)(accum(_, _)) mapValues (dollars => dollars / total))
}
}
val pu = new PercentsUsage()
pu.addQuantity(new Quantity("A", 100))
pu.addQuantity(new Quantity("B", 400))
val pot = pu.percentages
println(pot("A")) // prints 0.2
println(pot("B")) // prints 0.8
Rather than building up the Map by hand with a fold, you can use the Scala collections' built-in groupBy function. This creates a map from the grouping property to a list of the values in that group, which can then be aggregated, e.g. by taking a sum:
def percentages: Map[String, Double] = {
  val t = total
  quantities.groupBy(_.name).mapValues(_.map(_.amount).sum / t)
}
This pipeline transforms your List[Quantity] => Map[String, List[Quantity]] => Map[String, Double] giving you the desired result.
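As a self-contained variant of the same idea: on Scala 2.13 and later, mapValues on a Map is deprecated and returns a lazy view rather than a Map, so the sketch below uses a plain map over the grouped entries to get a strict Map[String, Double]. The standalone percentages function that takes the list as a parameter is an adaptation for illustration, not the asker's class.

```scala
final case class Quantity(name: String, amount: Double)

def percentages(quantities: List[Quantity]): Map[String, Double] = {
  val total = quantities.map(_.amount).sum
  quantities
    .groupBy(_.name)                        // Map[String, List[Quantity]]
    .map { case (name, group) =>
      name -> (group.map(_.amount).sum / total)
    }                                       // strict Map[String, Double]
}

val pot = percentages(List(Quantity("A", 100), Quantity("B", 400)))
// pot("A") is 0.2 and pot("B") is 0.8
```

If the result must specifically be an immutable HashMap, it can still be converted afterwards, but for most callers the general Map type is enough.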