What is a simple approach to testing whether a Map[A,B] is bijective, namely for
val m1 = Map( 1 -> "a", 2 -> "b")
val m2 = Map( 1 -> "a", 2 -> "a")
we have that m1 is bijective unlike m2.
You could do
val m = Map(1 -> "a", 2 -> "a")
val isBijective = m.values.toSet.size == m.size
A bijection is one-to-one and onto. The mapping defined by a Map is definitely onto. Every value has a corresponding key.
The only way it's not one-to-one is if two keys map to the same value. But that means that we'll have fewer values than keys.
As explained here -
For a pairing between X and Y (where Y need not be different from X) to be a bijection, four properties must hold:
1) each element of X must be paired with at least one element of Y
This is inherent nature of Map. In some cases it can be mapped to None which is again one of the element of Y.
2) no element of X may be paired with more than one element of Y,
Again inherent neture of Map.
3) each element of Y must be paired with at least one element of X, and
Every element in Y will have some association with X otherwise it won't exist.
4) no element of Y may be paired with more than one element of X.
Map doesnt have this constraint. So we need to check for this. If Y contains duplicates then it is violated.
So, sufficient check should be "No duplicates in Y")
val a = scala.collection.mutable.Map[Int, Option[Int]]()
a(10) = None
a(20) = None
if(a.values.toArray.distinct.size != a.values.size) println("No") else println("Yes") // No
val a = Map("a"->10, "b"->20)
if(a.values.toArray.distinct.size != a.values.size) println("No") else println("Yes") // Yes
val a = scala.collection.mutable.Map[Int, Option[Int]]()
a(10) = Some(100)
a(20) = None
a(30) = Some(300)
if(a.values.toArray.distinct.size != a.values.size) println("No") else println("Yes") // Yes
Related
Quite complex algorith is being applied to list of Spark Dataset's rows (list was obtained using groupByKey and flatMapGroups). Most rows are transformed 1 : 1 from input to output, but in some scenarios require more than one output per each input. The input row schema can change anytime. The map() fits the requirements quite well for the 1:1 transformation, but is there a way to use it producing 1 : n output?
The only work-around I found relies on foreach method which has unpleasant overhed cause by creating the initial empty list (remember, unlike the simplified example below, real-life list structure is changing randomly).
My original problem is too complex to share here, but this example demonstrates the concept. Let's have a list of integers. Each should be transformed into its square value and if the input is even it should also transform into one half of the original value:
val X = Seq(1, 2, 3, 4, 5)
val y = X.map(x => x * x) //map is intended for 1:1 transformation so it works great here
val z = X.map(x => for(n <- 1 to 5) (n, x * x)) //this attempt FAILS - generates list of five rows with emtpy tuples
// this work-around works, but newX definition is problematic
var newX = List[Int]() //in reality defining as head of the input list and dropping result's tail at the end
val za = X.foreach(x => {
newX = x*x :: newX
if(x % 2 == 0) newX = (x / 2) :: newX
})
newX
Is there a better way than foreach construct?
.flatMap produces any number of outputs from a single input.
val X = Seq(1, 2, 3, 4, 5)
X.flatMap { x =>
if (x % 2 == 0) Seq(x*x, x / 2) else Seq(x / 2)
}
#=> Seq[Int] = List(0, 4, 1, 1, 16, 2, 2)
flatMap in more detail
In X.map(f), f is a function that maps each input to a single output. By contrast, in X.flatMap(g), the function g maps each input to a sequence of outputs. flatMap then takes all the sequences produced (one for each element in f) and concatenates them.
The neat thing is .flatMap works not just for sequences, but for all sequence-like objects. For an option, for instance, Option(x)#flatMap(g) will allow g to return an Option. Similarly, Future(x)#flatMap(g) will allow g to return a Future.
Whenever the number of elements you return depends on the input, you should think of flatMap.
I have two Sequences, say:
val first = Array("B", "L", "T")
val second = Array("T70", "B25", "B80", "A50", "M100", "B50")
How do I get a product such that elements of the first array are joined with each element of the second array which startsWith the former and also yield a default empty result when no element in the second array meets the condition.
Effectively to get an Output:
expectedProductArray = Array("B-B25", "B-B80", "B-B50", "L-Default", "T-T70")
I tried doing,
val myProductArray: Array[String] = for {
f <- first
s <- second if s.startsWith(f)
} yield s"""$f-$s"""
and i get:
myProductArray = Array("B-B25", "B-B80", "B-B50", "T-T70")
Is there an Idiomatic way of adding a default value for values in first sequence not having a corresponding value in the second sequence with the given criteria? Appreciate your thoughts.
Here's one approach by making array second a Map and looking up the Map for elements in array first with getOrElse:
val first = Array("B", "L", "T")
val second = Array("T70", "B25", "B80", "A50", "M100", "B50")
val m = second.groupBy(_(0).toString)
// m: scala.collection.immutable.Map[String,Array[String]] =
// Map(M -> Array(M100), A -> Array(A50), B -> Array(B25, B80, B50), T -> Array(T70))
first.flatMap(x => m.getOrElse(x, Array("Default")).map(x + "-" + _))
// res1: Array[String] = Array(B-B25, B-B80, B-B50, L-Default, T-T70)
In case you prefer using for-comprehension:
for {
x <- first
y <- m.getOrElse(x, Array("Default"))
} yield s"$x-$y"
I have a Map which looks like this
var map = Map[Int, Map[Int, Int]]()
I need the functionality where I need to lookup values in the inner map and if the value is found the result is "added" and not just updated.
I have written this code like
map.get(x) match {
case Some(innerMap) =>
innerMap.get(y) match {
case Some(z) => map(x)(y) = z + a
case None => map(x)(y) = z
case None => map(x) = Map(y -> a)
If you pass the inputs x=2,y=2,a=10 and x=2,y=2,a=10 to the code above the map will contain x=2,y=2,a=20 because values are added.
But I don't like this nested pattern matching. (reminds me of the nested ifs) I tried to simplify this using getOrElseUpdate
map
.getOrElseUpdate(x, Map(y -> a))
.getOrElseUpdate(y, a)
This is good but doesn't handle the scenario where y in found in the inner map and we needed to do a z + a
Not necessarily the most efficient solution, but simple - you can instantiate values with zeroes, and then add the value itself assuming all the keys are already there:
map
.getOrElseUpdate(x, Map(y -> 0))
.getOrElseUpdate(y, 0)
map(x)(y) += a
I have two sets of (k,v) pairs:
val x = Set((1,2), (2,10), (3,5), (7,15))
val y = Set((1,200), (3,500))
How to find difference of these two sets by keys, to get:
Set((2,10),(7,15))
Any quick and simple solution?
val ym = y.toMap
x.toMap.filterKeys(k => !(ym contains k)).toSet
Sets don't have keys, maps do. So you convert to map. Then, you can't create a difference on maps, but you can filter the keys to exclude the ones you don't want. And then you're done save for converting back to a Set. (It's not the most efficient way to do this, but it's not bad and it's easy to write.)
Let val keys = y.map(_._1).toSet be the set of keys (first element in the pair) that must not occur as key in x; thus
for ( p <- x if !keys(p._1) ) yield p
as well as
x.collect { case p#(a,b) if !keys(a) => p }
and
x.filter ( p => !keys(p._1) )
x.filterNot ( p => keys(p._1) )
you can try this one:
x filter{ m => y map{_._1} contains m._1} toSet
Apologies: I'm well noob
I have an items class
class item(ind:Int,freq:Int,gap:Int){}
I have an ordered list of ints
val listVar = a.toList
where a is an array
I want a list of items called metrics where
ind is the (unique) integer
freq is the number of times that ind appears in list
gap is the minimum gap between ind and the number in the list before it
so far I have:
def metrics = for {
n <- 0 until 255
listVar filter (x == n) count > 0
}
yield new item(n, (listVar filter == n).count,0)
It's crap and I know it - any clues?
Well, some of it is easy:
val freqMap = listVar groupBy identity mapValues (_.size)
This gives you ind and freq. To get gap I'd use a fold:
val gapMap = listVar.sliding(2).foldLeft(Map[Int, Int]()) {
case (map, List(prev, ind)) =>
map + (ind -> (map.getOrElse(ind, Int.MaxValue) min ind - prev))
}
Now you just need to unify them:
freqMap.keys.map( k => new item(k, freqMap(k), gapMap.getOrElse(k, 0)) )
Ideally you want to traverse the list only once and in the course for each different Int, you want to increment a counter (the frequency) as well as keep track of the minimum gap.
You can use a case class to store the frequency and the minimum gap, the value stored will be immutable. Note that minGap may not be defined.
case class Metric(frequency: Int, minGap: Option[Int])
In the general case you can use a Map[Int, Metric] to lookup the Metric immutable object. Looking for the minimum gap is the harder part. To look for gap, you can use the sliding(2) method. It will traverse the list with a sliding window of size two allowing to compare each Int to its previous value so that you can compute the gap.
Finally you need to accumulate and update the information as you traverse the list. This can be done by folding each element of the list into your temporary result until you traverse the whole list and get the complete result.
Putting things together:
listVar.sliding(2).foldLeft(
Map[Int, Metric]().withDefaultValue(Metric(0, None))
) {
case (map, List(a, b)) =>
val metric = map(b)
val newGap = metric.minGap match {
case None => math.abs(b - a)
case Some(gap) => math.min(gap, math.abs(b - a))
}
val newMetric = Metric(metric.frequency + 1, Some(newGap))
map + (b -> newMetric)
case (map, List(a)) =>
map + (a -> Metric(1, None))
case (map, _) =>
map
}
Result for listVar: List[Int] = List(2, 2, 4, 4, 0, 2, 2, 2, 4, 4)
scala.collection.immutable.Map[Int,Metric] = Map(2 -> Metric(4,Some(0)),
4 -> Metric(4,Some(0)), 0 -> Metric(1,Some(4)))
You can then turn the result into your desired item class using map.toSeq.map((i, m) => new Item(i, m.frequency, m.minGap.getOrElse(-1))).
You can also create directly your Item object in the process, but I thought the code would be harder to read.