Scala count number of times function returns each value, functionally - scala

I want to count the number of times that a function f returns each value in its range (0 to f_max, inclusive) when applied to a given list l, and return the result as an array, in Scala.
Currently, I accomplish this as follows:
def count (l: List): Array[Int] = {
val arr = new Array[Int](f_max + 1)
l.foreach {
el => arr(f(el)) += 1
}
return arr
}
So arr(n) is the number of times that f returns n when applied to each element of l. This works; however, it is imperative in style, and I am wondering if there is a clean way to do this purely functionally.
Thank you

How about a more general approach:
def count[InType, ResultType](l: Seq[InType], f: InType => ResultType): Map[ResultType, Int] = {
l.view // create a view so we don't create new collections after each step
.map(f) // apply your function to every item in the original sequence
.groupBy(x => x) // group the returned values
.map(x => x._1 -> x._2.size) // count returned values
}
val f = (i:Int) => i
count(Seq(1,2,3,4,5,6,6,6,4,2), f)

l.foldLeft(Vector.fill(f_max + 1)(0)) { (acc, el) =>
val result = f(el)
acc.updated(result, acc(result) + 1)
}
Alternatively, a good balance of performance and external purity would be:
def count(l: List[???]): Vector[Int] = {
  val arr = l.foldLeft(Array.fill(f_max + 1)(0)) { (acc, el) =>
    val result = f(el)
    acc(result) += 1
    acc // return the accumulator; the update above evaluates to Unit
  }
  arr.toVector
}
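A self-contained sketch of two purely functional routes discussed above. The question does not show f or f_max, so `fMax = 3` and `f(x) = x % 4` are hypothetical stand-ins, and the element type is assumed to be Int:

```scala
object CountDemo {
  // Hypothetical stand-ins for the question's f and f_max (not from the original post).
  val fMax = 3
  def f(x: Int): Int = x % (fMax + 1) // assumes non-negative inputs

  // Purely functional fold: every step produces a new Vector.
  def countPure(l: List[Int]): Vector[Int] =
    l.foldLeft(Vector.fill(fMax + 1)(0)) { (acc, el) =>
      val r = f(el)
      acc.updated(r, acc(r) + 1)
    }

  // Map-based route: group the results, count each group,
  // then fill in zeros for values that never occurred.
  def countViaMap(l: List[Int]): Vector[Int] = {
    val counts = l.map(f).groupBy(identity).map { case (k, v) => k -> v.size }
    Vector.tabulate(fMax + 1)(i => counts.getOrElse(i, 0))
  }
}
```

Both functions should agree on any input; the Map-based one is convenient when the range of f is sparse or not known in advance.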

Related

Need to merge the Map and List with the same dates

I have a Map where key = LocalDateTime and value = Group
def someGroup(/.../): List[Group] = {
someCode.map {
/.../
}.map(group => (group.completedDt, group)).toMap
/.../
}
And there is also a List[Group], where Group(completedDt: LocalDateTime, cost: Int), in which cost is always 0.
An example of what I have:
map: [(2021-04-01T00:00:00.000, 500), (2021-04-03T00:00:00.000, 1000), (2021-04-05T00:00:00.000, 750)]
list: ((2021-04-01T00:00:00.000, 0),(2021-04-02T00:00:00.000, 0),(2021-04-03T00:00:00.000, 0),(2021-04-04T00:00:00.000, 0),(2021-04-05T00:00:00.000, 0))
The expected result is:
list ((2021-04-01T00:00:00.000, 500),(2021-04-02T00:00:00.000, 0),(2021-04-03T00:00:00.000, 1000),(2021-04-04T00:00:00.000, 0),(2021-04-05T00:00:00.000, 750))
Thanks in advance!
Assuming that, if a time appears in both, you want to combine the costs:
type Group = (LocalDateTime, Int) // completedDt, cost
val groupMap: Map[LocalDateTime, Group] = ???
val groupList: List[Group] = ???
val combined =
groupList.foldLeft(groupMap) { (acc, group) =>
val completedDt = group._1
if (acc.contains(completedDt)) {
val nv = completedDt -> (acc(completedDt)._2 + group._2)
acc.updated(completedDt, nv)
} else acc + (completedDt -> group)
}.values.toList.sortBy(_._1) // You might need to define an Ordering[LocalDateTime]
The notation in your question leads me to think Group is just a pair, not a case class. It's also worth noting that I'm not sure what having the map be Map[LocalDateTime, Group] vs. Map[LocalDateTime, Int] (and thus by definition a collection of Group) buys you.
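Regarding the comment about Ordering[LocalDateTime]: if the compiler cannot find one, a minimal sketch of an explicit ordering could look like the following. The names byDateTime and sortGroups are made up for illustration, and the ordering is passed by hand so it cannot clash with an implicit that may already be in scope:

```scala
import java.time.LocalDateTime

object DateOrderingDemo {
  type Group = (LocalDateTime, Int) // completedDt, cost (as in the answer above)

  // Explicit ordering built from LocalDateTime.isBefore.
  val byDateTime: Ordering[LocalDateTime] =
    Ordering.fromLessThan[LocalDateTime]((a, b) => a.isBefore(b))

  // Sort groups chronologically, supplying the ordering explicitly.
  def sortGroups(groups: List[Group]): List[Group] =
    groups.sortBy(_._1)(byDateTime)
}
```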
EDIT: if you have a general collection of collections of Group, you can fold over all of them:
val groupLists: List[List[Group]] = ???
groupLists.foldLeft(Map.empty[LocalDateTime, Group]) { (acc, lst) =>
  lst.foldLeft(acc) { (m, group) =>
    val completedDt = group._1
    if (m.contains(completedDt)) {
      val nv = completedDt -> (m(completedDt)._2 + group._2)
      m.updated(completedDt, nv)
    } else m + (completedDt -> group)
  }
}.values.toList.sortBy(_._1) // sort by date, as in the first snippet
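If the list already contains every date that should appear in the result, as in the question's example, a per-element lookup may be enough. A minimal sketch, assuming Group is a case class with the two fields named in the question (the answer above treats it as a pair instead); the names merge, byDate and zeros are made up for illustration:

```scala
import java.time.LocalDateTime

object MergeDemo {
  // Assumption: Group is a case class shaped as described in the question.
  final case class Group(completedDt: LocalDateTime, cost: Int)

  // For each placeholder in the list, take the map's entry for that date if there is one;
  // otherwise keep the zero-cost placeholder. The list's order is preserved.
  def merge(byDate: Map[LocalDateTime, Group], zeros: List[Group]): List[Group] =
    zeros.map(g => byDate.getOrElse(g.completedDt, g))
}
```

Unlike the fold above, this does not add map entries whose dates are missing from the list, so it only fits when the list is the complete calendar.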

scala using calculations from pattern matching's guard (if) in body

I use pattern matching in Scala a lot. I often need to do some calculations in the guard, and sometimes they are pretty expensive. Is there any way to bind the calculated value to a separate name so it can be reused in the body?
// I want to use the result of prettyExpensiveFunc in the body safely
people.collect {
case ...
case Some(Right((x, y))) if prettyExpensiveFunc(x, y) > 0 => prettyExpensiveFunc(x, y)
}
// ideally something like this would be helpful, but it doesn't compile:
people.collect {
case ...
case Some(Right((x, y))) if {val z = prettyExpensiveFunc(x, y); z > 0} => z
}
// this solution works, but it isn't safe for some `Seq` types and is risky when more cases are used.
var cache:Int = 0
people.collect {
case ...
case Some(Right((x, y))) if {cache = prettyExpensiveFunc(x, y); cache > 0} => cache
}
Is there any better solution?
PS: The example is simplified, and I don't expect answers showing that I don't need pattern matching here.
You can use cats.Eval to make expensive calculations lazy and memoized: create the Evals in .map, then extract .value (computed at most once, and only if needed) in .collect:
values.map { value =>
val expensiveCheck1 = Eval.later { prettyExpensiveFunc(value) }
val expensiveCheck2 = Eval.later { anotherExpensiveFunc(value) }
(value, expensiveCheck1, expensiveCheck2)
}.collect {
case (value, lazyResult1, _) if lazyResult1.value > 0 => ...
case (value, _, lazyResult2) if lazyResult2.value > 0 => ...
case (value, lazyResult1, lazyResult2) if lazyResult1.value > lazyResult2.value => ...
...
}
I don't see a way of doing what you want without some implementation of lazy evaluation, and if you have to use one, you might as well use an existing one instead of rolling your own.
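For cases where adding cats is not an option, the same map-then-collect shape can be sketched with the standard library only. The Later class below is a hypothetical, minimal stand-in for Eval.later (it is not part of cats or the standard library), and prettyExpensiveFunc here is a made-up placeholder:

```scala
object LazyValDemo {
  // Placeholder for the expensive computation from the question.
  def prettyExpensiveFunc(x: Int): Int = x * x - 10

  // Hypothetical helper: `value` is computed on first access and then cached.
  final class Later[A](compute: => A) {
    lazy val value: A = compute
  }

  val results: Seq[(Int, Int)] =
    Seq(1, 5, 7)
      .map(v => (v, new Later(prettyExpensiveFunc(v))))
      .collect {
        // The guard forces the computation; the body reuses the cached result.
        case (v, r) if r.value > 0 => v -> r.value
      }
}
```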
EDIT: Just in case you haven't noticed, you aren't losing the ability to pattern match by using a tuple here:
values.map {
// original value -> lazily evaluated, memoized expensive calculation
case a @ Some(Right((x, y))) => a -> Some(Eval.later(prettyExpensiveFunc(x, y)))
case a => a -> None
}.collect {
// match type and calculation
...
case (Some(Right((x, y))), Some(lazyResult)) if lazyResult.value > 0 => ...
...
}
Why not run the function first for every element and then work with a tuple?
Seq(1,2,3,4,5).map(e => (e, prettyExpensiveFunc(e))).collect {
case ...
case (x, y) if y > 0 => y
}
I tried writing my own matchers and the result is OK, but not perfect. My matcher is untyped, and it is a bit ugly to make it fully typed.
class Matcher[T,E](f:PartialFunction[T, E]) {
def unapply(z: T): Option[E] = if (f.isDefinedAt(z)) Some(f(z)) else None
}
def newMatcherAny[E](f:PartialFunction[Any, E]) = new Matcher(f)
def newMatcher[T,E](f:PartialFunction[T, E]) = new Matcher(f)
def prettyExpensiveFunc(x:Int) = {println(s"-- prettyExpensiveFunc($x)"); x%2+x*x}
val x = Seq(
Some(Right(22)),
Some(Right(10)),
Some(Left("Oh now")),
None
)
val PersonAgeRank = newMatcherAny { case Some(Right(x:Int)) => (x, prettyExpensiveFunc(x)) }
x.collect {
case PersonAgeRank(age, rank) if rank > 100 => println("age:"+age + " rank:" + rank)
}
https://scalafiddle.io/sf/hFbcAqH/3
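A typed variant of the same extractor idea can be sketched with a plain object that defines unapply for the exact input type. The element type Option[Either[String, (Int, Int)]], the name Scored, and the body of prettyExpensiveFunc are assumptions made for this example:

```scala
object TypedExtractorDemo {
  type Person = Option[Either[String, (Int, Int)]]

  // Placeholder for the expensive computation from the question.
  def prettyExpensiveFunc(x: Int, y: Int): Int = x * y

  // Extractor: matches the shape and exposes the computed value,
  // so both the guard and the body can use it under one name.
  object Scored {
    def unapply(p: Person): Option[Int] = p match {
      case Some(Right((x, y))) => Some(prettyExpensiveFunc(x, y))
      case _                   => None
    }
  }

  val people: Seq[Person] =
    Seq(Some(Right((2, 3))), Some(Right((-1, 4))), Some(Left("n/a")), None)

  val positive: Seq[Int] = people.collect { case Scored(z) if z > 0 => z }
}
```

The computation still runs for elements that later fail the guard, and how often collect invokes the partial function per element can depend on the collection type, so this avoids repeating the call in the source rather than guaranteeing a single evaluation everywhere.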

How to count number of total items where a class references itself

I am new to Scala. I need to count the number of categories in the list, and I am trying to build a tail-recursive function, without any success.
case class Category(name:String, children: List[Category])
val lists = List(
Category("1",
List(Category("1.1",
List(Category("1.2", Nil))
))
)
,Category("2", Nil),
Category("3",
List(Category("3.1", Nil))
)
)
Nyavro's solution can be made much faster (by several orders of magnitude) if you use Lists instead of Streams and also append elements at the front.
That's because x.children is usually a lot shorter than xs and Scala's List is an immutable singly linked list making prepend operations a lot faster than append operations.
Here is an example
import scala.annotation.tailrec
case class Category(name:String, children: List[Category])
@tailrec
def childCount(cats:Stream[Category], acc:Int):Int =
cats match {
case Stream.Empty => acc
case x #:: xs => childCount(xs ++ x.children, acc+1)
}
@tailrec
def childCount2(cats: List[Category], acc:Int): Int =
cats match {
case Nil => acc
case x :: xs => childCount2(x.children ++ xs, acc + 1)
}
def generate(depth: Int, children: Int): List[Category] = {
if(depth == 0) Nil
else (0 until children).map(i => Category("abc", generate(depth - 1, children))).toList
}
val list = generate(8, 3)
var start = System.nanoTime
var count = childCount(list.toStream, 0)
var end = System.nanoTime
println("count: " + count)
println("time: " + ((end - start)/1e6) + "ms")
start = System.nanoTime
count = childCount2(list, 0)
end = System.nanoTime
println("count: " + count)
println("time: " + ((end - start)/1e6) + "ms")
output:
count: 9840
time: 2226.761485ms
count: 9840
time: 3.90171ms
Consider the following idea.
Let's define a function childCount that takes a collection of categories (cats) and the number of categories counted so far (acc). To make the processing tail-recursive, we take the first category from the collection and increment acc. We have now processed the first item, but there are more items to process: the children of that first element. The idea is to append these unprocessed children to the end of the collection and call childCount again.
You can implement it this way:
@tailrec
def childCount(cats:Stream[Category], acc:Int):Int =
cats match {
case Stream.Empty => acc
case x #:: xs => childCount(xs ++ x.children, acc+1)
}
call it:
val count = childCount(lists.toStream, 0)
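For reference, here is the same work-list idea as a self-contained sketch applied to the data from the question, using List with the children prepended. The name countAll and the default value for acc are choices made for this example:

```scala
import scala.annotation.tailrec

object CategoryCountDemo {
  final case class Category(name: String, children: List[Category])

  // Work-list traversal: pop one category, push its children, bump the counter.
  @tailrec
  def countAll(pending: List[Category], acc: Int = 0): Int = pending match {
    case Nil     => acc
    case x :: xs => countAll(x.children ++ xs, acc + 1)
  }

  // Sample data from the question.
  val lists: List[Category] = List(
    Category("1", List(Category("1.1", List(Category("1.2", Nil))))),
    Category("2", Nil),
    Category("3", List(Category("3.1", Nil)))
  )

  val total: Int = countAll(lists) // counts 1, 1.1, 1.2, 2, 3 and 3.1
}
```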

Use an array as a Scala foldLeft accumulator

I am trying to use foldLeft on an array. E.g.:
var x = some array
x.foldLeft(new Array[Int](10))((a, c) => a(c) = a(c)+1)
This refuses to compile, with the error: found Int(0), required Array[Int].
In order to use foldLeft in what you want to do, and following your style, you can just return the same accumulator array in the computation like this:
val ret = a.foldLeft(new Array[Int](10)) {
(acc, c) => acc(c) += 1; acc
}
Alternatively, since your numbers are from 0 to 9, you can also do this to achieve the same result:
val ret = (0 to 9).map(x => a.count(_ == x))
Assignment in Scala does not return a value (but instead Unit) so your expression that is supposed to return the Array[Int] for the next step returns Unit which does not work.
You would have to use a block and return the array in the end like this:
x.foldLeft(new Array[Int](10)) { (a, c) =>
a(c) = a(c)+1
a
}
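Both answers can be tried with a small runnable sketch. The sample input is made up, and the Vector variant is an extra alternative that avoids mutating the accumulator at all:

```scala
object FoldArrayDemo {
  val x: Array[Int] = Array(1, 3, 3, 7, 9, 9, 9) // sample digits in 0..9

  // Mutable accumulator: the lambda must end with the array itself,
  // because `acc(c) += 1` evaluates to Unit.
  val counts: Array[Int] = x.foldLeft(new Array[Int](10)) { (acc, c) =>
    acc(c) += 1
    acc
  }

  // Immutable accumulator: `updated` returns a new Vector, so it can be the result directly.
  val immutableCounts: Vector[Int] =
    x.foldLeft(Vector.fill(10)(0))((acc, c) => acc.updated(c, acc(c) + 1))
}
```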

Tune Nested Loop in Scala

I was wondering if I can tune the following Scala code :
def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] = {
var listNoDuplicates: List[(Class1, Class2)] = Nil
for (outerIndex <- 0 until listOfTuple.size) {
if (outerIndex != listOfTuple.size - 1)
for (innerIndex <- outerIndex + 1 until listOfTuple.size) {
if (listOfTuple(outerIndex)._1.flag.equals(listOfTuple(innerIndex)._1.flag))
listNoDuplicates = listOfTuple(outerIndex) :: listNoDuplicates
}
}
listNoDuplicates
}
Usually, if you have something that looks like:
var accumulator: A = new A
for( b <- collection ) {
accumulator = update(accumulator, b)
}
val result = accumulator
it can be converted into something like:
val result = collection.foldLeft( new A ){ (acc,b) => update( acc, b ) }
So here we can first use a map to enforce the uniqueness of flags. Supposing the flag has type F:
val result = listOfTuples.foldLeft( Map[F,(ClassA,ClassB)]() ){
( map, tuple ) => map + ( tuple._1.flag -> tuple )
}
Then the remaining tuples can be extracted from the map and converted to a list:
val uniqList = result.values.toList
It will keep the last tuple encountered; if you want to keep the first one, replace foldLeft with foldRight and swap the arguments of the lambda.
Example:
case class ClassA( flag: Int )
case class ClassB( value: Int )
val listOfTuples =
List( (ClassA(1),ClassB(2)), (ClassA(3),ClassB(4)), (ClassA(1),ClassB(-1)) )
val result = listOfTuples.foldRight( Map[Int,(ClassA,ClassB)]() ) {
( tuple, map ) => map + ( tuple._1.flag -> tuple )
}
val uniqList = result.values.toList
//uniqList: List((ClassA(1),ClassB(2)), (ClassA(3),ClassB(4)))
Edit: If you need to retain the order of the initial list, use instead:
val uniqList = listOfTuples.filter( result.values.toSet )
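On Scala 2.13 or later, the standard library's distinctBy does the same job in one call, keeping the first occurrence of each key and the original order. A short sketch using the example data above (this method does not exist on 2.12 and earlier):

```scala
object DistinctByDemo {
  final case class ClassA(flag: Int)
  final case class ClassB(value: Int)

  val listOfTuples =
    List((ClassA(1), ClassB(2)), (ClassA(3), ClassB(4)), (ClassA(1), ClassB(-1)))

  // Keeps the first tuple seen for each flag, in list order (Scala 2.13+).
  val uniqList = listOfTuples.distinctBy(_._1.flag)
}
```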
This compiles, but as I can't test it it's hard to say if it does "The Right Thing" (tm):
def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] =
(for {outerIndex <- 0 until listOfTuple.size
if outerIndex != listOfTuple.size - 1
innerIndex <- outerIndex + 1 until listOfTuple.size
if listOfTuple(outerIndex)._1.flag == listOfTuple(innerIndex)._1.flag
} yield listOfTuple(outerIndex)).reverse.toList
Note that you can use == instead of equals (use eq if you need reference equality).
BTW: https://codereview.stackexchange.com/ is better suited for this type of question.
Do not use indexing with lists (like listOfTuple(i)). Indexing into a List performs very poorly, since each access is O(n). So, some alternatives...
The easiest:
def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] =
SortedSet(listOfTuple: _*)(Ordering by (_._1.flag)).toList
This will preserve the last element of the list. If you want it to preserve the first element, pass listOfTuple.reverse instead. Because of the sorting, performance is, at best, O(n log n). So, here's a faster way, using a mutable HashSet:
def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] = {
// Use a hash set to track the flags already seen
import scala.collection.mutable.HashSet
val seen = HashSet[Flag]()
// now fold
listOfTuple.foldLeft(Nil: List[(Class1,Class2)]) {
case (acc, el) =>
val result = if (seen(el._1.flag)) acc else el :: acc
seen += el._1.flag
result
}.reverse
}
One can avoid using a mutable HashSet in two ways:
Make seen a var, so that it can be updated.
Pass the set along with the list being created in the fold. The case then becomes:
case ((seen, acc), el) =>
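One possible way to flesh out the second option is sketched below. The rest of the original snippet is not shown, so this is a guess at its shape, not the author's code; Class1, Class2 and their fields are hypothetical stand-ins, with the flag assumed to be an Int:

```scala
object DedupDemo {
  // Hypothetical stand-ins for the question's Class1 and Class2.
  final case class Class1(flag: Int)
  final case class Class2(value: String)

  // The accumulator is a pair: the immutable set of flags seen so far,
  // and the (reversed) list of tuples kept so far.
  def removeDuplicates(xs: List[(Class1, Class2)]): List[(Class1, Class2)] =
    xs.foldLeft((Set.empty[Int], List.empty[(Class1, Class2)])) {
      case ((seen, acc), el) =>
        if (seen(el._1.flag)) (seen, acc)
        else (seen + el._1.flag, el :: acc)
    }._2.reverse
}
```

This keeps the first tuple for each flag and preserves the input order, with no mutable state or var.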