Scala: Add a sequence number to duplicate elements in a list

I have a list and want to add a sequential number to duplicate elements.
val lst=List("a", "b", "c", "b", "c", "d", "b","a")
The result should be
List("a____0", "b____0", "c____0", "b____1", "c____1", "d____0", "b____2", "a____1")
preserving the original order.
What I have so far:
val lb=new ListBuffer[String]()
for(i<-0 to lst.length-2) {
val lbSplit=lb.map(a=>a.split("____")(0)).distinct.toList
if(!lbSplit.contains(lst(i))){
var count=0
lb+=lst(i)+"____"+count
for(j<-i+1 to lst.length-1){
if(lst(i).equalsIgnoreCase(lst(j))) {
count+=1
lb+= lst(i)+"____"+count
}
}
}
}
which results in :
res120: scala.collection.mutable.ListBuffer[String]
= ListBuffer(a____0, a____1, b____0, b____1, b____2, c____0, c____1, d____0)
messing up the order. Also if there is a more concise way that would be great.

This should work without any mutable variables.
val lst=List("a", "b", "c", "b", "c", "d", "b","a")
lst.foldLeft((Map[String,Int]().withDefaultValue(0),List[String]())){
case ((m, l), x) => (m + (x->(m(x)+1)), x + "__" + m(x) :: l)
}._2.reverse
// res0: List[String] = List(a__0, b__0, c__0, b__1, c__1, d__0, b__2, a__1)
Explanation:
lst.foldLeft - Take the List of items (in this case a List[String]) and fold them (starting on the left) into a single item.
(Map[String,Int]().withDefaultValue(0),List[String]()) - In this case the new item will be a tuple of type (Map[String,Int], List[String]). We'll start the tuple with an empty Map and an empty List.
case ((m, l), x) => - Every time an element from lst is passed in to the tuple calculation we'll call that element x. We'll also receive the tuple from the previous calculation. We'll call the Map part m and we'll call the List part l.
m + (x->(m(x)+1)) - The Map part of the new tuple is created by creating/updating the count for this String (the x) and adding it to the received Map.
x + "__" + m(x) :: l - The List part of the new tuple is created by pre-pending a new String at the head.
}._2.reverse - The fold is finished. Extract the List from the tuple (the 2nd element) and reverse it to restore the original order of elements.
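To make the walkthrough above concrete, here is a minimal sketch that records the List part of the tuple after every step, using scanLeft in place of foldLeft. The names FoldTrace, State, start, step and trace are made up for this illustration and are not part of the answer.

```scala
object FoldTrace {
  // The accumulator: (count seen so far per string, labelled strings in reverse order).
  type State = (Map[String, Int], List[String])

  val start: State = (Map.empty[String, Int].withDefaultValue(0), Nil)

  // One step of the fold: bump the count for x and prepend its labelled form.
  def step(state: State, x: String): State = {
    val (m, l) = state
    (m + (x -> (m(x) + 1)), (x + "__" + m(x)) :: l)
  }

  // scanLeft keeps every intermediate accumulator instead of only the last one.
  def trace(xs: List[String]): List[List[String]] =
    xs.scanLeft(start)(step).map(_._2.reverse)
}
```

For List("a", "b", "a") the trace starts with the empty list and grows by one labelled element per step, which is the same progression foldLeft goes through internally.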

I think a more concise way that preserves the order is to use a Map[String, Int] to keep a running count of how many times you've seen each string. Then you can map over lst directly and update the map each time you see a string:
var map = Map[String, Int]()
lst.map { str =>
val count = map.getOrElse(str, 0) //get current count if in the map, otherwise zero
map += (str -> (count + 1)) //update the count
str + "__" + count
}
which will give you the following for your example:
List(a__0, b__0, c__0, b__1, c__1, d__0, b__2, a__1)
I consider that easiest to read, but if you want to avoid var then you can use foldLeft with a tuple to hold the intermediate state of the map:
lst.foldLeft((List[String](), Map[String, Int]())) { case ((list, map), str) =>
val count = map.getOrElse(str, 0)
(list :+ (str + "__" + count), map + (str -> (count + 1)))
}._1

Related

Scala - conditional product/join of two arrays with default values using for comprehensions

I have two Sequences, say:
val first = Array("B", "L", "T")
val second = Array("T70", "B25", "B80", "A50", "M100", "B50")
How do I get a product in which each element of the first array is joined with every element of the second array that starts with it, and which yields a default result when no element of the second array matches?
Effectively to get an Output:
expectedProductArray = Array("B-B25", "B-B80", "B-B50", "L-Default", "T-T70")
I tried doing,
val myProductArray: Array[String] = for {
f <- first
s <- second if s.startsWith(f)
} yield s"""$f-$s"""
and I get:
myProductArray = Array("B-B25", "B-B80", "B-B50", "T-T70")
Is there an idiomatic way to add a default value for elements of the first sequence that have no matching element in the second sequence under the given criterion? I'd appreciate your thoughts.
Here's one approach: group array second into a Map keyed by first character, then look up each element of array first in that Map with getOrElse:
val first = Array("B", "L", "T")
val second = Array("T70", "B25", "B80", "A50", "M100", "B50")
val m = second.groupBy(_(0).toString)
// m: scala.collection.immutable.Map[String,Array[String]] =
// Map(M -> Array(M100), A -> Array(A50), B -> Array(B25, B80, B50), T -> Array(T70))
first.flatMap(x => m.getOrElse(x, Array("Default")).map(x + "-" + _))
// res1: Array[String] = Array(B-B25, B-B80, B-B50, L-Default, T-T70)
In case you prefer using for-comprehension:
for {
x <- first
y <- m.getOrElse(x, Array("Default"))
} yield s"$x-$y"
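The groupBy(_(0).toString) lookup above keys on the first character only, so it fits the single-letter prefixes in the question. If the prefixes could be longer, one possible variant is to filter with startsWith directly, as in this sketch. PrefixJoin, joinWithDefault and the default parameter are names invented for this example.

```scala
object PrefixJoin {
  // For each prefix, join it with every matching element of `second`;
  // fall back to a single default entry when nothing matches.
  def joinWithDefault(
      first: Seq[String],
      second: Seq[String],
      default: String = "Default"
  ): Seq[String] =
    first.flatMap { f =>
      val matches = second.filter(_.startsWith(f))
      (if (matches.isEmpty) Seq(default) else matches).map(m => s"$f-$m")
    }
}
```

This scans second once per prefix, so it trades the single grouping pass for generality; for short prefix lists that is unlikely to matter.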

Failed to print count result in for loop

I have been trying to count inside a for loop, but the count just comes out as a pair of parentheses. I am only printing the keys of the map here.
var count = 0
xs.foreach(x => (myMap += ((count+=1).toString+","+java.util.UUID.randomUUID.toString -> x)))
Output:
(),901e9926-be1e-4dc4-b3e3-6c3b2feea2c4
Expected output:
1,901e9926-be1e-4dc4-b3e3-6c3b2feea2c4
Within your foreach, count += 1 would be of type Unit. If I understand your question correctly, the example below (using an arbitrary xs collection) might be what you're looking for:
val xs = List("a", "b", "c", "d")
var count = 0
var myMap = Map[String, String]()
xs.foreach{ x =>
count += 1
myMap += ((count.toString + "," + java.util.UUID.randomUUID.toString) -> x)
}
myMap.keys
// res1: Iterable[String] = Set(
// 1,bd971c44-b9d0-41a0-b59f-3acbf2e0dee0, 2,5459eed9-309d-4f9c-afd7-10aced9df2a0,
// 3,5816ea42-d8ed-4beb-8b30-0376d0674700, 4,30f6f22f-1e6d-4eec-86af-5bc6734d5196
// )
In case you want a more idiomatic approach, using zip for the count and foldLeft for the Map aggregation would produce a similar result:
val myMap = Map[String, String]()
val resultMap = xs.zip(Stream from 1).foldLeft( myMap )(
(m, x) => m + ((x._2.toString + "," + java.util.UUID.randomUUID.toString) -> x._1)
)
What you are printing here is actually (count+=1).toString. In Scala, an assignment like this will be evaluated to Unit, which is expressed by parentheses. That's why you print () and not the value of count. If you check the count variable value afterwards you will see that it is 1 as expected.
Additionally, what you are trying to do could be expressed more directly, e.g. you could do:
val myMap = xs.zipWithIndex.map(x => (x._2 + 1) + "," + java.util.UUID.randomUUID -> x._1).toMap
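The point about Unit can be checked in isolation. The sketch below uses made-up names (UnitDemo, assigned, wrongKey, rightKey) and a fixed ",id" suffix in place of the random UUID so that the values are predictable.

```scala
object UnitDemo {
  var count = 0

  // An assignment expression has type Unit; its only value is ().
  val assigned: Unit = (count += 1)

  // Calling toString on the Unit value gives "()", not the counter.
  val wrongKey: String = assigned.toString + ",id"

  // Incrementing first and reading the variable afterwards gives the number.
  val rightKey: String = count.toString + ",id"
}
```

Here wrongKey reproduces the "()" prefix from the question, while rightKey carries the incremented counter.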

Update values of Map

I have a Map like:
Map("product1" -> List(Product1ObjectTypes), "product2" -> List(Product2ObjectTypes))
where ProductObjectType has a field usage. Based on the other field (counter) I have to update all ProductXObjectTypes.
The issue is that this update depends on previous ProductObjectType, and I can't find a way to get previous item when iterating over mapValues of this map. So basically, to update current usage I need: CurrentProduct1ObjectType.counter - PreviousProduct1ObjectType.counter.
Is there any way to do this?
I started it like:
val reportsWithCalculatedUsage =
reportsRefined.flatten.flatten.toList.groupBy(_._2.product).mapValues(f)
but I don't know in mapValues how to access previous list item.
I'm not sure if I understand completely, but if you want to update the values inside the lists based on their predecessors, this can generally be done with a fold:
case class Thing(product: String, usage: Int, counter: Int)
val m = Map(
"product1" -> List(Thing("Fnord", 10, 3), Thing("Meep", 0, 5))
//... more mappings
)
//> Map(product1 -> List(Thing(Fnord,10,3), Thing(Meep,0,5)))
m mapValues { list => list.foldLeft(List[Thing]()){
case (Nil, head) =>
List(head)
case (tail, head) =>
val previous = tail.head
val current = head copy (usage = head.usage + head.counter - previous.counter)
current :: tail
} reverse }
//> Map(product1 -> List(Thing(Fnord,10,3), Thing(Meep,2,5)))
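If the new usage should be exactly the current counter minus the previous one, as the question describes, a possible alternative to the fold is to zip the list with its own tail. This is only a sketch: Reading, UsageFromCounters and withUsage are hypothetical names, and the first element is left unchanged because it has no predecessor.

```scala
object UsageFromCounters {
  final case class Reading(product: String, usage: Int, counter: Int)

  // Pair every element with its predecessor and derive usage from the two counters.
  def withUsage(readings: List[Reading]): List[Reading] = readings match {
    case Nil => Nil
    case first :: rest =>
      first :: readings.zip(rest).map { case (prev, cur) =>
        cur.copy(usage = cur.counter - prev.counter)
      }
  }
}
```

Applied per product, this could be used as the function passed to mapValues in the question's own pipeline.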
Note that a regular Map is an unordered collection; you need something like TreeMap to get a predictable iteration order.
Anyway, from what I understand, you want to get pairs of consecutive entries in a map. Try something like this:
scala> val map = Map(1 -> 2, 2 -> 3, 3 -> 4)
scala> (map, map.tail).zipped.foreach((t1, t2) => println(t1 + " " + t2))
(1,2) (2,3)
(2,3) (3,4)

Count occurrences of each element in a List[List[T]] in Scala

Suppose you have
val docs = List(List("one", "two"), List("two", "three"))
where e.g. List("one", "two") represents a document containing terms "one" and "two", and you want to build a map with the document frequency for every term, i.e. in this case
Map("one" -> 1, "two" -> 2, "three" -> 1)
How would you do that in Scala? (And in an efficient way, assuming a much larger dataset.)
My first Java-like thought is to use a mutable map:
val freqs = mutable.Map.empty[String,Int]
for (doc <- docs)
for (term <- doc)
freqs(term) = freqs.getOrElse(term, 0) + 1
which works well enough but I'm wondering how you could do that in a more "functional" way, without resorting to a mutable map?
Try this:
scala> docs.flatten.groupBy(identity).mapValues(_.size)
res0: Map[String,Int] = Map(one -> 1, two -> 2, three -> 1)
If you are going to be accessing the counts many times, then you should avoid mapValues since it is "lazy" and, thus, would recompute the size on every access. This version gives you the same result but won't require the recomputations:
docs.flatten.groupBy(identity).map(x => (x._1, x._2.size))
The identity function just means x => x.
docs.flatten.foldLeft(new Map.WithDefault(Map[String,Int](),Function.const(0))){
(m,x) => m + (x -> (1 + m(x)))}
What a train wreck!
[Edit]
Ah, that's better!
docs.flatten.foldLeft(Map[String,Int]() withDefaultValue 0){
(m,x) => m + (x -> (1 + m(x)))}
Starting with Scala 2.13, after flattening the list of lists, we can use groupMapReduce, which is a one-pass alternative to groupBy/mapValues:
// val docs = List(List("one", "two"), List("two", "three"))
docs.flatten.groupMapReduce(identity)(_ => 1)(_ + _)
// Map[String,Int] = Map("one" -> 1, "three" -> 1, "two" -> 2)
This:
flattens the List of Lists as a List
groups list elements (identity) (group part of groupMapReduce)
maps each grouped value occurrence to 1 (_ => 1) (map part of groupMapReduce)
reduces values within a group of values (_ + _) by summing them (reduce part of groupMapReduce).
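One caveat worth spelling out: all of the answers above count every occurrence of a term. That equals the document frequency only when no document repeats a term. If documents may contain repeats, a possible adjustment is to de-duplicate each document before flattening, as in this sketch. DocFreq and documentFrequency are names chosen for the example.

```scala
object DocFreq {
  // Count, for every term, the number of documents that contain it at least once.
  def documentFrequency(docs: List[List[String]]): Map[String, Int] =
    docs
      .flatMap(_.distinct) // each document contributes a term at most once
      .groupBy(identity)
      .map { case (term, occurrences) => term -> occurrences.size }
}
```

The strict map at the end avoids the lazy mapValues issue mentioned earlier and does not depend on Scala 2.13.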

How do I populate a list of objects with new values

Apologies: I'm well noob
I have an items class
class item(ind:Int,freq:Int,gap:Int){}
I have an ordered list of ints
val listVar = a.toList
where a is an array
I want a list of items called metrics where
ind is the (unique) integer
freq is the number of times that ind appears in list
gap is the minimum gap between ind and the number in the list before it
so far I have:
def metrics = for {
n <- 0 until 255
listVar filter (x == n) count > 0
}
yield new item(n, (listVar filter == n).count,0)
It's crap and I know it - any clues?
Well, some of it is easy:
val freqMap = listVar groupBy identity mapValues (_.size)
This gives you ind and freq. To get gap I'd use a fold:
val gapMap = listVar.sliding(2).foldLeft(Map[Int, Int]()) {
case (map, List(prev, ind)) =>
map + (ind -> (map.getOrElse(ind, Int.MaxValue) min ind - prev))
}
Now you just need to unify them:
freqMap.keys.map( k => new item(k, freqMap(k), gapMap.getOrElse(k, 0)) )
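Put together, the three pieces above might look like the following self-contained sketch. It assumes a case class Item in place of the question's item class, adds a fallback case so that lists with fewer than two elements do not hit a MatchError, and sorts the keys to make the output order predictable. ItemMetrics and metrics are illustrative names.

```scala
object ItemMetrics {
  final case class Item(ind: Int, freq: Int, gap: Int)

  def metrics(xs: List[Int]): List[Item] = {
    // Frequency of every distinct value.
    val freq = xs.groupBy(identity).map { case (k, v) => k -> v.size }

    // Smallest difference between a value and the element just before it.
    val gaps = xs.sliding(2).foldLeft(Map.empty[Int, Int]) {
      case (acc, List(prev, cur)) =>
        acc + (cur -> math.min(acc.getOrElse(cur, Int.MaxValue), cur - prev))
      case (acc, _) => acc // window shorter than two elements
    }

    // Values that never had a predecessor get a gap of 0, as in the answer.
    freq.keys.toList.sorted.map(k => Item(k, freq(k), gaps.getOrElse(k, 0)))
  }
}
```

Like the answer, this relies on the list being ordered, so cur - prev is never negative.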
Ideally you want to traverse the list only once and in the course for each different Int, you want to increment a counter (the frequency) as well as keep track of the minimum gap.
You can use a case class to store the frequency and the minimum gap, the value stored will be immutable. Note that minGap may not be defined.
case class Metric(frequency: Int, minGap: Option[Int])
In the general case you can use a Map[Int, Metric] to look up the immutable Metric object. Finding the minimum gap is the harder part. To find it, you can use the sliding(2) method, which traverses the list with a sliding window of size two so that you can compare each Int to its previous value and compute the gap.
Finally you need to accumulate and update the information as you traverse the list. This can be done by folding each element of the list into your temporary result until you traverse the whole list and get the complete result.
Putting things together:
// Seed the map with the first element: it never appears as the second half
// of a window, so it would otherwise go uncounted.
val seed = listVar.take(1).map(_ -> Metric(1, None)).toMap
listVar.sliding(2).foldLeft(
  seed.withDefaultValue(Metric(0, None))
) {
  case (map, List(a, b)) =>
    val metric = map(b)
    val newGap = metric.minGap match {
      case None => math.abs(b - a)
      case Some(gap) => math.min(gap, math.abs(b - a))
    }
    val newMetric = Metric(metric.frequency + 1, Some(newGap))
    map + (b -> newMetric)
  case (map, _) => // fewer than two elements: nothing more to count
    map
}
Result for listVar: List[Int] = List(2, 2, 4, 4, 0, 2, 2, 2, 4, 4)
scala.collection.immutable.Map[Int,Metric] = Map(2 -> Metric(5,Some(0)),
4 -> Metric(4,Some(0)), 0 -> Metric(1,Some(4)))
You can then turn the result into your desired item class using map.toSeq.map { case (i, m) => new item(i, m.frequency, m.minGap.getOrElse(-1)) }.
You can also create your item objects directly in the process, but I thought the code would be harder to read.