Scala - List to List of tuples - scala

Beginner here.
Sorry but I did'nt found an answer so I ask the question here.
I want to know how to do this by using the Scala API :
(blabla))( -> List(('(',2),(')',2))
Currently I have this :
"(blabla))(".toCharArray.toList.filter(p => (p == '(' || p == ')')).sortBy(x => x)
Output :
List((, (, ), ))
Now how can I map each character to the tuples I describe ?
Example for a general case :
"t:e:s:t" -> List(('t',2),('e',1),('s',1),(':',3))
Thanks

val source = "ok:ok:k::"
val chars = source.toList
val shorter = chars.distinct.map( c => (c, chars.count(_ == c)))
//> shorter : List[(Char, Int)] = List((o,2), (k,3), (:,4))

Classic groupBy . mapValues use case:
scala> val str = "ok:ok:k::"
str: String = ok:ok:k::
scala> str.groupBy(identity).mapValues(_.size) // identity <=> (x => x)
res0: scala.collection.immutable.Map[Char,Int] = Map(k -> 3, : -> 4, o -> 2)

I like sschaef's solution very much, but I was wondering if anyone could weigh in on how efficient that solution is compared to this one:
scala> val str = "ok:ok:k::"
str: String = ok:ok:k::
scala> str.foldLeft(Map[Char,Int]().withDefaultValue(0))((current, c) => current.updated(c, current(c) + 1))
res29: scala.collection.immutable.Map[Char,Int] = Map(o -> 2, k -> 3, : -> 4)
I think my solution is slower. If we have n total occurrences and m unique values:
My solution: we have the fold left over all occurrences or n. For each of these occurrences we look up once to find the current count and then again to create the updated the map. I'm assuming that the creating of the updated map is constant time.
Total complexity: n * 2m or O(n*m)
sschaef's solution: we have the groupBy which I'm assuming just adds entries onto a list without checking the map (so for all values this would be a constant time look up plus appending to the list) so n. Then for the mapValues it probably iterates over the unique values and grabs the size for each key's list. I'm assuming that getting the size of each entry's list is constant time.
Total complexity: O(n + m)
Does this seem correct or am I mistaken in my assumptions?

Related

How to pair each element of a Seq with the rest?

I'm looking for an elegant way to combine every element of a Seq with the rest for a large collection.
Example: Seq(1,2,3).someMethod should produce something like
Iterator(
(1,Seq(2,3)),
(2,Seq(1,3)),
(3,Seq(1,2))
)
Order of elements doesn't matter. It doesn't have to be a tuple, a Seq(Seq(1),Seq(2,3)) is also acceptable (although kinda ugly).
Note the emphasis on large collection (which is why my example shows an Iterator).
Also note that this is not combinations.
Ideas?
Edit:
In my use case, the numbers are expected to be unique. If a solution can eliminate the dupes, that's fine, but not at additional cost. Otherwise, dupes are acceptable.
Edit 2: In the end, I went with a nested for-loop, and skipped the case when i == j. No new collections were created. I upvoted the solutions that were correct and simple ("simplicity is the ultimate sophistication" - Leonardo da Vinci), but even the best ones are quadratic just by the nature of the problem, and some create intermediate collections by usage of ++ that I wanted to avoid because the collection I'm dealing with has close to 50000 elements, 2.5 billion when quadratic.
The following code has constant runtime (it does everything lazily), but accessing every element of the resulting collections has constant overhead (when accessing each element, an index shift must be computed every time):
def faceMap(i: Int)(j: Int) = if (j < i) j else j + 1
def facets[A](simplex: Vector[A]): Seq[(A, Seq[A])] = {
val n = simplex.size
(0 until n).view.map { i => (
simplex(i),
(0 until n - 1).view.map(j => simplex(faceMap(i)(j)))
)}
}
Example:
println("Example: facets of a 3-dimensional simplex")
for ((i, v) <- facets((0 to 3).toVector)) {
println(i + " -> " + v.mkString("[", ",", "]"))
}
Output:
Example: facets of a 3-dimensional simplex
0 -> [1,2,3]
1 -> [0,2,3]
2 -> [0,1,3]
3 -> [0,1,2]
This code expresses everything in terms of simplices, because "omitting one index" corresponds exactly to the face maps for a combinatorially described simplex. To further illustrate the idea, here is what the faceMap does:
println("Example: how `faceMap(3)` shifts indices")
for (i <- 0 to 5) {
println(i + " -> " + faceMap(3)(i))
}
gives:
Example: how `faceMap(3)` shifts indices
0 -> 0
1 -> 1
2 -> 2
3 -> 4
4 -> 5
5 -> 6
The facets method uses the faceMaps to create a lazy view of the original collection that omits one element by shifting the indices by one starting from the index of the omitted element.
If I understand what you want correctly, in terms of handling duplicate values (i.e., duplicate values are to be preserved), here's something that should work. Given the following input:
import scala.util.Random
val nums = Vector.fill(20)(Random.nextInt)
This should get you what you need:
for (i <- Iterator.from(0).take(nums.size)) yield {
nums(i) -> (nums.take(i) ++ nums.drop(i + 1))
}
On the other hand, if you want to remove dups, I'd convert to Sets:
val numsSet = nums.toSet
for (num <- nums) yield {
num -> (numsSet - num)
}
seq.iterator.map { case x => x -> seq.filter(_ != x) }
This is quadratic, but I don't think there is very much you can do about that, because in the end of the day, creating a collection is linear, and you are going to need N of them.
import scala.annotation.tailrec
def prems(s : Seq[Int]):Map[Int,Seq[Int]]={
#tailrec
def p(prev: Seq[Int],s :Seq[Int],res:Map[Int,Seq[Int]]):Map[Int,Seq[Int]] = s match {
case x::Nil => res+(x->prev)
case x::xs=> p(x +: prev,xs, res+(x ->(prev++xs)))
}
p(Seq.empty[Int],s,Map.empty[Int,Seq[Int]])
}
prems(Seq(1,2,3,4))
res0: Map[Int,Seq[Int]] = Map(1 -> List(2, 3, 4), 2 -> List(1, 3, 4), 3 -> List(2, 1, 4),4 -> List(3, 2, 1))
I think you are looking for permutations. You can map the resulting lists into the structure you are looking for:
Seq(1,2,3).permutations.map(p => (p.head, p.tail)).toList
res49: List[(Int, Seq[Int])] = List((1,List(2, 3)), (1,List(3, 2)), (2,List(1, 3)), (2,List(3, 1)), (3,List(1, 2)), (3,List(2, 1)))
Note that the final toList call is only there to trigger the evaluation of the expressions; otherwise, the result is an iterator as you asked for.
In order to get rid of the duplicate heads, toMap seems like the most straight-forward approach:
Seq(1,2,3).permutations.map(p => (p.head, p.tail)).toMap
res50: scala.collection.immutable.Map[Int,Seq[Int]] = Map(1 -> List(3, 2), 2 -> List(3, 1), 3 -> List(2, 1))

Finding a map in list/vector of maps in Scala

I have a vector/list of maps (Map[String,Int]). How can I find if a key-value pair exists in one of these maps in the list of maps using .find?
val res = List(Map("1" -> 1), Map("2" -> 2)).find(t => t.exists(j => j == ("2", 2)))
println(res)
use find with exists to check whether it exists in maps.
chengpohi's solution is pretty inefficient, and also different to how I understand the question.
Let m: Map[String,Int].
Why chengpoi's solution is inefficient
First, using m.exists(j => j == ("2",2)), which can also be written m.contains("2" -> 2) looks at every entry of m, while m("2").toSeq.contains(2) performs only a single map lookup.
Note that m.contains("2" -> 2) will not work, as contains is overridden for Map to check for a key, i.e., m.contains("2") works—and is also fast.
To obtain the same result as chengpoi, but efficiently:
def mapExists[K,V](ms: List[Map[K,V]], k: K, v: V): Option[(K,V)] =
ms.get(k).filter(_ == v).map(_ => k -> v)
Note that this method returns its arguments, which is quite redundant.
How I understand the question
Second, I understood the question as checking whether the List contains a Map with a specific pair.
This would translate to
def mapExists[K,V](ms: List[Map[K,V]], k: K, v: V): Boolean =
ms.exists(_.get(k).contains(v))
It can be done even like this using just the key value we are interested to find:
scala> val res = List(Map("A" -> 10), Map("B" -> 20)).find(_.keySet.contains("B"))
res: Option[scala.collection.immutable.Map[String,Int]] = Some(Map(B -> 20))
scala>

Scala List difference with index information

I have two Lists
val l1 = List(1,2,3)
val l2 = List(1,3,3)
with
l1.diff(l2)
I can find the difference in the list; at the same time I am interested in index where the difference found also; can i know what is the solution in scala ?
Note : All the time both the list size is going to be same.
You can just add indexes to both lists and diff then:
val diff = l1.zipWithIndex.diff(l2.zipWithIndex)
-> List((2,1)) // different value is 2 and index is 1
val indexes = (l1 zip l2 zipWithIndex).filter(x => x._1._1 != x._1._2).map(_._2)
val indexesWithDiffValues = (l1 zip l2 zipWithIndex).filter(x => x._1._1 != x._1._2)
this code will give you a list of indexes you want.
Another way which shows you very easy which list and where you can find the value:
l1.diff(l2).map(v => (v, l1.indexOf(v), l2.indexOf(v)))
// res6: List[(Int, Int, Int)] = List((2,1,-1))

Update values of Map

I have a Map like:
Map("product1" -> List(Product1ObjectTypes), "product2" -> List(Product2ObjectTypes))
where ProductObjectType has a field usage. Based on the other field (counter) I have to update all ProductXObjectTypes.
The issue is that this update depends on previous ProductObjectType, and I can't find a way to get previous item when iterating over mapValues of this map. So basically, to update current usage I need: CurrentProduct1ObjectType.counter - PreviousProduct1ObjectType.counter.
Is there any way to do this?
I started it like:
val reportsWithCalculatedUsage =
reportsRefined.flatten.flatten.toList.groupBy(_._2.product).mapValues(f)
but I don't know in mapValues how to access previous list item.
I'm not sure if I understand completely, but if you want to update the values inside the lists based on their predecessors, this can generally be done with a fold:
case class Thing(product: String, usage: Int, counter: Int)
val m = Map(
"product1" -> List(Thing("Fnord", 10, 3), Thing("Meep", 0, 5))
//... more mappings
)
//> Map(product1 -> List(Thing(Fnord,10,3), Thing(Meep,0,5)))
m mapValues { list => list.foldLeft(List[Thing]()){
case (Nil, head) =>
List(head)
case (tail, head) =>
val previous = tail.head
val current = head copy (usage = head.usage + head.counter - previous.counter)
current :: tail
} reverse }
//> Map(product1 -> List(Thing(Fnord,10,3), Thing(Meep,2,5)))
Note that regular map is an unordered collection, you need to use something like TreeMap to have predictable order of iteration.
Anyways, from what I understand you want to get pairs of all values in a map. Try something like this:
scala> val map = Map(1 -> 2, 2 -> 3, 3 -> 4)
scala> (map, map.tail).zipped.foreach((t1, t2) => println(t1 + " " + t2))
(1,2) (2,3)
(2,3) (3,4)

Count occurrences of each element in a List[List[T]] in Scala

Suppose you have
val docs = List(List("one", "two"), List("two", "three"))
where e.g. List("one", "two") represents a document containing terms "one" and "two", and you want to build a map with the document frequency for every term, i.e. in this case
Map("one" -> 1, "two" -> 2, "three" -> 1)
How would you do that in Scala? (And in an efficient way, assuming a much larger dataset.)
My first Java-like thought is to use a mutable map:
val freqs = mutable.Map.empty[String,Int]
for (doc <- docs)
for (term <- doc)
freqs(term) = freqs.getOrElse(term, 0) + 1
which works well enough but I'm wondering how you could do that in a more "functional" way, without resorting to a mutable map?
Try this:
scala> docs.flatten.groupBy(identity).mapValues(_.size)
res0: Map[String,Int] = Map(one -> 1, two -> 2, three -> 1)
If you are going to be accessing the counts many times, then you should avoid mapValues since it is "lazy" and, thus, would recompute the size on every access. This version gives you the same result but won't require the recomputations:
docs.flatten.groupBy(identity).map(x => (x._1, x._2.size))
The identity function just means x => x.
docs.flatten.foldLeft(new Map.WithDefault(Map[String,Int](),Function.const(0))){
(m,x) => m + (x -> (1 + m(x)))}
What a train wreck!
[Edit]
Ah, that's better!
docs.flatten.foldLeft(Map[String,Int]() withDefaultValue 0){
(m,x) => m + (x -> (1 + m(x)))}
Starting Scala 2.13, after flattening the list of lists, we can use groupMapReduce which is a one-pass alternative to groupBy/mapValues:
// val docs = List(List("one", "two"), List("two", "three"))
docs.flatten.groupMapReduce(identity)(_ => 1)(_ + _)
// Map[String,Int] = Map("one" -> 1, "three" -> 1, "two" -> 2)
This:
flattens the List of Lists as a List
groups list elements (identity) (group part of groupMapReduce)
maps each grouped value occurrence to 1 (_ => 1) (map part of groupMapReduce)
reduces values within a group of values (_ + _) by summing them (reduce part of groupMapReduce).