Scala List Regex match in Map and return key - scala

I want to match a string against the lists of regexes in a Map[String, List[Regex]] and return the key (a String) when one of them matches.
For example:
//Map[String, List[Regex]]
Map(m3 -> List(([^ ]*)(rule3)([^ ]*)), m1 -> List(([^ ]*)(rule1)([^ ]*)), m4 -> List(([^ ]*)(rule5)([^ ]*)), m2 -> List(([^ ]*)(rule2)([^ ]*)))
If the string is "***rule3****" it should return the key "m3"; similarly, if the string is "****rule5****" it should return the key "m4".
How do I implement this?
Here is what I tried, which does not work:
rulesMap.mapValues (y => y.par.foreach (x => x.findFirstMatchIn("description"))).keys.toString()

For Scala 2.13.x
rulesMap
.filter({ case (_, regexList) => regexList.exists(regex => regex.matches("yourString")) })
.keys
For Scala 2.12.x
rulesMap
.filter({ case (_, regexList) => regexList.exists(regex => regex.findFirstIn("yourString").isDefined) })
.keys
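One thing to watch with the two variants above: on a default (anchored) Regex, matches in 2.13 succeeds only if the whole input matches, while findFirstIn looks for the pattern anywhere in the input. With the patterns in the question both give the same answer because of the surrounding ([^ ]*) groups, but for a bare pattern they differ. A small sketch (the object and value names are made up for illustration; Scala 2.13+ is assumed for matches):

```scala
import scala.util.matching.Regex

object MatchVsFind {
  // Hypothetical bare pattern, without the ([^ ]*) wrappers used in the question.
  val rule: Regex = "rule3".r

  // matches on an anchored Regex requires the entire input to match.
  val wholeMatch: Boolean = rule.matches("***rule3****")

  // findFirstIn searches for the pattern anywhere inside the input.
  val partialMatch: Boolean = rule.findFirstIn("***rule3****").isDefined

  // unanchored switches matches (and extractors) to "find anywhere" semantics.
  val unanchoredMatch: Boolean = rule.unanchored.matches("***rule3****")
}
```

Here wholeMatch is false while partialMatch and unanchoredMatch are true.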

collect is the best way of both filtering and mapping a collection because it only does a single pass over the data.
def findKeys(s: String) =
rulesMap.collect {
case (key, exps) if exps.exists(_.findFirstIn(s).nonEmpty) => key
}
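For completeness, here is a self-contained sketch of the collect approach. The rulesMap contents are reconstructed from the question, and findKey (returning the first hit as an Option) is a helper I added for illustration, not something from the original code:

```scala
import scala.util.matching.Regex

object RuleLookup {
  // Rules modelled on the map printed in the question.
  val rulesMap: Map[String, List[Regex]] = Map(
    "m1" -> List("([^ ]*)(rule1)([^ ]*)".r),
    "m2" -> List("([^ ]*)(rule2)([^ ]*)".r),
    "m3" -> List("([^ ]*)(rule3)([^ ]*)".r),
    "m4" -> List("([^ ]*)(rule5)([^ ]*)".r)
  )

  // All keys whose regex list contains at least one pattern found in s.
  def findKeys(s: String): Iterable[String] =
    rulesMap.collect {
      case (key, exps) if exps.exists(_.findFirstIn(s).nonEmpty) => key
    }

  // Hypothetical helper: one matching key, if any.
  // Which key is returned is unspecified when several keys match.
  def findKey(s: String): Option[String] = findKeys(s).headOption
}
```

With this, RuleLookup.findKey("***rule3****") gives Some("m3") and RuleLookup.findKey("****rule5****") gives Some("m4").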

Related

How to transform input data into following format? - groupby

What I have is the following input data for a function in a piece of scala code I'm writing:
List(
(1,SubScriptionState(CNN,ONLINE,Seq(12))),
(1,SubScriptionState(SKY,ONLINE,Seq(12))),
(1,SubScriptionState(FOX,ONLINE,Seq(12))),
(2,SubScriptionState(CNN,ONLINE,Seq(12))),
(2,SubScriptionState(SKY,ONLINE,Seq(12))),
(2,SubScriptionState(FOX,ONLINE,Seq(12))),
(2,SubScriptionState(CNN,OFFLINE,Seq(13))),
(2,SubScriptionState(SKY,ONLINE,Seq(13))),
(2,SubScriptionState(FOX,ONLINE,Seq(13))),
(3,SubScriptionState(CNN,OFFLINE,Seq(13))),
(3,SubScriptionState(SKY,ONLINE,Seq(13))),
(3,SubScriptionState(FOX,ONLINE,Seq(13)))
)
SubscriptionState is just a case class here:
case class SubscriptionState(channel: Channel, state: ChannelState, subIds: Seq[Long])
I want to transform it into this:
Map(
1 -> Map(
SubScriptionState(SKY,ONLINE,Seq(12)) -> 1,
SubScriptionState(CNN,ONLINE,Seq(12)) -> 1,
SubScriptionState(FOX,ONLINE,Seq(12)) -> 1),
2 -> Map(
SubScriptionState(SKY,ONLINE,Seq(12,13)) -> 2,
SubScriptionState(CNN,ONLINE,Seq(12)) -> 1,
SubScriptionState(FOX,ONLINE,Seq(12,13)) -> 2,
SubScriptionState(CNN,OFFLINE,Seq(13)) -> 1),
3 -> Map(
SubScriptionState(SKY,ONLINE,Seq(13)) -> 1,
SubScriptionState(FOX,ONLINE,Seq(13)) -> 1,
SubScriptionState(CNN,OFFLINE,Seq(13)) -> 1)
)
How would I go about doing this in scala?
Here is my approach to the problem. I think it may not be a perfect solution, but it works as you would expect.
val result: Map[Int, Map[SubscriptionState, Int]] = list
.groupBy(_._1)
.view
.mapValues { statesById =>
statesById
.groupBy { case (_, subscriptionState) => (subscriptionState.channel, subscriptionState.state) }
.map { case (_, groupedStatesById) =>
val subscriptionState = groupedStatesById.head._2 // groupedStatesById should contain at least one element
val allSubIds = groupedStatesById.flatMap(_._2.subIds)
val updatedSubscriptionState = subscriptionState.copy(subIds = allSubIds)
updatedSubscriptionState -> allSubIds.size
}
}.toMap
This is a "simple" solution using groupMap and groupMapReduce (both require Scala 2.13+):
list
.groupMap(_._1)(_._2)
.view
.mapValues{
_.groupMapReduce(ss => (ss.channel, ss.state))(_.subIds)(_ ++ _)
.map{case (k,v) => SubscriptionState(k._1, k._2, v) -> v.length}
}
.toMap
The groupMap converts the data to a Map[Int, List[SubscriptionState]] and the mapValues converts each List to the appropriate Map. (Going through view and toMap avoids the deprecated Map.mapValues of 2.13 and yields a strict Map whose values are computed once.)
The groupMapReduce converts the List[SubscriptionState] into a Map[(Channel, ChannelState), Seq[Long]].
The map on this inner Map rearranges these values into the required Map[SubscriptionState, Int].
I'm not clear what the purpose of the inner Map is. The value is the length of the subIds field, so it could be obtained directly from the key rather than looked up in the Map.
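To make the steps above concrete, here is a runnable sketch on a reduced input. Channel and ChannelState are not shown in the question, so plain String aliases stand in for them here (an assumption), and the transform name is mine:

```scala
object SubscriptionGrouping {
  // Assumed stand-ins: the real Channel / ChannelState types are not in the question.
  type Channel = String
  type ChannelState = String
  case class SubscriptionState(channel: Channel, state: ChannelState, subIds: Seq[Long])

  def transform(list: List[(Int, SubscriptionState)]): Map[Int, Map[SubscriptionState, Int]] =
    list
      .groupMap(_._1)(_._2) // Map[Int, List[SubscriptionState]]
      .view
      .mapValues {
        // Merge the subIds of all states sharing the same (channel, state) pair,
        // then rebuild one SubscriptionState per pair and count its subIds.
        _.groupMapReduce(ss => (ss.channel, ss.state))(_.subIds)(_ ++ _)
          .map { case ((channel, state), ids) => SubscriptionState(channel, state, ids) -> ids.length }
      }
      .toMap

  // Reduced version of the input from the question.
  val input: List[(Int, SubscriptionState)] = List(
    1 -> SubscriptionState("CNN", "ONLINE", Seq(12L)),
    2 -> SubscriptionState("CNN", "ONLINE", Seq(12L)),
    2 -> SubscriptionState("SKY", "ONLINE", Seq(12L)),
    2 -> SubscriptionState("CNN", "OFFLINE", Seq(13L)),
    2 -> SubscriptionState("SKY", "ONLINE", Seq(13L))
  )

  val result: Map[Int, Map[SubscriptionState, Int]] = transform(input)
}
```

For id 2, the two SKY/ONLINE rows collapse into a single key with subIds Seq(12, 13) and value 2, while CNN/ONLINE and CNN/OFFLINE stay separate with value 1 each.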
An attempt using foldLeft:
list.foldLeft(Map.empty[Int, Map[SubscriptionState, Int]]) { (acc, next) =>
val subMap = acc.getOrElse(next._1, Map.empty[SubscriptionState, Int])
val channelSub = subMap.find { case (sub, _) => sub.channel == next._2.channel && sub.state == next._2.state }
acc + (next._1 -> channelSub.fold(subMap + (next._2 -> next._2.subIds.length)) { case (sub, _) =>
val subIds = sub.subIds ++ next._2.subIds
(subMap - sub) + (sub.copy(subIds = subIds) -> subIds.length)
})
}
I noticed that the count is not used while folding and can be calculated from subIds. Also, as subIds can vary, the inner Map is rather useless: you have to use find instead of get to fetch values from it. So if you have control over your ADTs, you could use an intermediary ADT like:
case class SubscriptionStateWithoutIds(channel: Channel, state: ChannelState)
then you can rewrite your foldLeft as follows:
list.foldLeft(Map.empty[Int, Map[SubscriptionStateWithoutIds, Seq[Long]]]) { (acc, next) =>
val subMap = acc.getOrElse(next._1, Map.empty[SubscriptionStateWithoutIds, Seq[Long]])
val withoutId = SubscriptionStateWithoutIds(next._2.channel, next._2.state)
val channelSub = subMap.get(withoutId)
acc + (next._1 -> (subMap + channelSub.fold(withoutId -> next._2.subIds) { seq => withoutId -> (seq ++ next._2.subIds) }))
}
The biggest advantage of an intermediary ADT is that you get a cleaner groupMapReduce version:
list.groupMap(_._1)(sub => SubscriptionStateWithoutIds(sub._2.channel, sub._2.state) -> sub._2.subIds)
.map { case (key, value) => key -> value.groupMapReduce(_._1)(_._2)(_ ++ _) }

Scala: Best way to remove tuples from Seq where one value is None

I wish to filter out None values where they appear in a Seq of tuples.
In the code below, I want to replace getOrElse with get. But then how do I remove the tuples where the first value is None?
Here is my code. I feel it is inelegant.
myFirstMap.map {
case (key, value) =>
val tuple = (myLookUpMap.getOrElse(key,MyCaseClass("", None)), value.toString)
tuple
}.filter(_._1.name.nonEmpty).toIndexedSeq
What is the correct way to do this?
NOTE: this method will be called thousands of times on Seq with length 40 to 100, so performance is important
It looks like .map() and .flatMap() should do the trick, which is what a for comprehension is all about.
(for {
(k, v) <- myFirstMap
mcc <- myLookUpMap.get(k)
} yield (mcc, v.toString)).toIndexedSeq
myFirstMap.collect { case (k, v) if myLookUpMap.contains(k) => (myLookUpMap(k), v.toString) }.toIndexedSeq
Maybe
myFirstMap.map {
case (key, value) =>
myLookUpMap.get(key).map( found => Tuple2( found, value.toString ) )
}.withFilter(_.nonEmpty).map( _.get ).toIndexedSeq
...or more readably...
val mbTuples = myFirstMap.map {
case (key, value) =>
myLookUpMap.get(key).map( found => Tuple2( found, value.toString ) )
}
val foundTuples = mbTuples.withFilter(_.nonEmpty).map( _.get )
val tupleSeq = foundTuples.toIndexedSeq
Or how about this approach:
val commonKeys = myFirstMap.keySet.intersect( myLookUpMap.keySet )
val tupleSeq = commonKeys.toIndexedSeq.map { key =>
( myLookUpMap(key), myFirstMap(key).toString )
}
You can use flatMap to drop the None values from the collection:
myFirstMap.flatMap { case (key, value) => myLookUpMap.get(key).map(entity => (entity, value.toString)) }
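One caveat worth checking in your case: calling flatMap (or a for comprehension) directly on a Map with a function that yields pairs builds another Map, so two entries that resolve to the same looked-up value would collapse into one. Going through iterator sidesteps that and also avoids the intermediate collection. A minimal sketch, assuming MyCaseClass has the two-field shape suggested by MyCaseClass("", None) in the question (the field names and sample data are made up):

```scala
object LookupJoin {
  // Assumed shape: the question only shows MyCaseClass("", None).
  case class MyCaseClass(name: String, id: Option[Int])

  // Hypothetical sample data.
  val myFirstMap: Map[String, Int] = Map("a" -> 1, "b" -> 2, "c" -> 3)
  val myLookUpMap: Map[String, MyCaseClass] = Map(
    "a" -> MyCaseClass("Alpha", Some(1)),
    "c" -> MyCaseClass("Gamma", None)
  )

  // flatMap over an Option keeps the Some results and drops the None ones,
  // so keys missing from myLookUpMap ("b" here) simply disappear.
  val tuples: IndexedSeq[(MyCaseClass, String)] =
    myFirstMap.iterator.flatMap { case (key, value) =>
      myLookUpMap.get(key).map(found => (found, value.toString))
    }.toIndexedSeq
}
```

Whether this is measurably faster than the other variants for sequences of 40 to 100 elements is something you would have to benchmark; I have not.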

Scala map validation

My program receives a Scala map, and the requirement is to validate this map (its key-value pairs). For example: validate a key's value, convert its type from string to int, and so on. In rare cases, we also update the key before passing the map to the lower layer.
It's not always necessary to update this map, only when we detect unsupported keys or values.
I'm doing something like this:
private def updateMap ( parameters: Map[String, String]): Map[String, String] = {
parameters.map{
case(k,v) => k match { case "checkPool" =>
(k, (if (k.contains("checkPool"))
v match {
case "1" => "true"
case _ => "false"
}
else v))
case "Newheader" => (k.replace("Newheader","header"),v)
case _ =>(k,v)
}
case _ => ("","")
}
}
Written like this, the code keeps growing as more validations and key/value conversions are added.
Is there a cleaner way of doing this validation in Scala for a map?
Regards
According to what I understood from your question, match case can be your solution
inOptions.map(kv => kv.keySet.contains(STR) match {
case true => mutable.HashMap(STR_UPDT->kv.get(STR).get)
case _ => kv
})
Edited
Since you updated your question with more requirements, simple if else condition matching seems to be the best choice.
def updateMap(parameters: Map[String, String]): Map[String, String] = {
parameters.map(kv => {
var key = kv._1
var value = kv._2
if(key.contains("checkPool")){
value = if(value.equals("1")) "true" else "false"
}
else if(key.contains("Newheader")){
key = key.replace("Newheader", "header")
}
(key, value)
})
}
You can add more else if conditions as new rules come up.
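As an alternative to the var-based version, the same rules can be written as one pattern match on the (key, value) pair, with one case per rule. This is only a sketch: it matches keys exactly (as the code in the question does) rather than with contains, and the object name is mine:

```scala
object ParamCleanup {
  // One case per validation/conversion rule; anything else passes through unchanged.
  def updateMap(parameters: Map[String, String]): Map[String, String] =
    parameters.map {
      case ("checkPool", "1") => "checkPool" -> "true"  // "1" means enabled
      case ("checkPool", _)   => "checkPool" -> "false" // any other value means disabled
      case ("Newheader", v)   => "header" -> v          // rename the unsupported key
      case other              => other
    }
}
```

For example, ParamCleanup.updateMap(Map("checkPool" -> "1", "Newheader" -> "x")) yields Map("checkPool" -> "true", "header" -> "x"). Adding a rule means adding a case line, and no mutable state is needed.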

Scala Filter a map for based on unique values within Map values

In Scala, I'm trying to filter a map based on a property of the Map values that should be unique.
case class Product(
item: Item,
)
productModels: Map[Int, Product]
How can I create a new Map (or filter productModels) to only contain values where Product.Item.someproperty is unique within the Map?
I've been trying foldLeft on productModels, but can't seem to get it. I'll keep trying but want to check with you all as well.
Thanks
You can do it the following way:
productModels
.groupBy(_._2) // produces Map[Product, Map[Int, Product]]
.filter {case (k,v) => v.size == 1} // filters unique values
.flatMap {case (_,v) => v}
The easiest way to do that is to transform your map into another map, where keys are desired fields of Item:
case class Product(item:String)
val productModels =
Map(
1 -> Product("a"),
2 -> Product("b"),
3 -> Product("c"),
4 -> Product("a")
)
// here I'm calculating distinct by Product.item for simplicity
productModels.map { case e @ (_, v) => v.item -> e }.values.toMap
Result:
Map(4 -> Product(a), 2 -> Product(b), 3 -> Product(c))
Note that the order of the elements is not guaranteed, as a generic Map has no particular key order. If you use a Map that keeps item order, such as ListMap, and want to preserve the order of elements, here is the necessary adjustment:
productModels.toList.reverse.map { case e @ (_, v) => v.item -> e }.toMap.values.toMap
Result:
res1: scala.collection.immutable.Map[Int,Product] = Map(1 -> Product(a), 3 -> Product(c), 2 -> Product(b))
case class Item(property:String)
case class Product(item:Item)
val xs = Map[Int, Product]() // your example has this data structure
// just filter the map based on the item property value
xs filter { case (k,v) => v.item.property == "some property value" }
Here is an implementation with foldLeft:
productModels.foldLeft(Map.empty[Int, Product]){
(acc, el) =>
if (acc.exists(_._2.item.someproperty == el._2.item.someproperty)) acc
else acc + el
}
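Note that the foldLeft above keeps one entry per property value (whichever is visited first). If "unique" is meant strictly, i.e. drop every entry whose property value occurs more than once, grouping by the property is one way to do it. A sketch, assuming someproperty is a String (the question does not say) and using made-up sample data:

```scala
object UniqueProducts {
  // someproperty is the placeholder name from the question; its String type is an assumption.
  case class Item(someproperty: String)
  case class Product(item: Item)

  // Hypothetical sample data: "a" occurs twice, "b" and "c" once.
  val productModels: Map[Int, Product] = Map(
    1 -> Product(Item("a")),
    2 -> Product(Item("b")),
    3 -> Product(Item("c")),
    4 -> Product(Item("a"))
  )

  // Group entries by property value and keep only the groups of size one.
  val strictlyUnique: Map[Int, Product] =
    productModels
      .groupBy { case (_, product) => product.item.someproperty }
      .collect { case (_, group) if group.size == 1 => group.head }
}
```

With this data, strictlyUnique contains only keys 2 and 3, because both entries with "a" are dropped.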

Create map based on condition from Future of List in Scala

I have a method whose parameter is of type Future[List[MyRes]]. MyRes has two Option fields, id and name. I want to build a map from id to name for the entries where both are present. I can build the map with default values as follows, but I don't want defaults; I just want to skip any entry where either field is missing.
def myMethod(myRes: Future[List[MyRes]]): Future[Map[Long, String]] = {
myRes.map (
_.map(
o =>
(o.id match {
case Some(id) => id.toLong
case _ => 0L
}) ->
(o.name match {
case Some(name) => name
case _ => ""
})
).toMap)
}
Any suggestion?
You are looking for collect :)
myRes.map {
_.iterator
.map { r => r.id -> r.name }
.collect { case (Some(id), Some(name)) => id -> name }
.toMap
}
If your MyRes thingy is a case class, then you don't need the first .map:
myRes.map {
_.collect { case MyRes(Some(id), Some(name)) => id -> name }
.toMap
}
collect is like .map, but it takes a PartialFunction, and skips over elements on which it is not defined. It is kinda like your match statement but without the defaults.
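To see the skipping behaviour in isolation, here is a small synchronous sketch without the Future. The field types of MyRes are assumptions (the question only says both are Options), and the sample data is made up:

```scala
object CollectDemo {
  // Assumed field types; the question only says id and name are Option fields.
  case class MyRes(id: Option[Long], name: Option[String])

  val results: List[MyRes] = List(
    MyRes(Some(1L), Some("one")),
    MyRes(None, Some("orphan")),
    MyRes(Some(3L), None)
  )

  // The partial function is defined only when both fields are Some,
  // so the second and third elements are skipped rather than given defaults.
  val idToName: Map[Long, String] =
    results.collect { case MyRes(Some(id), Some(name)) => id -> name }.toMap
}
```

Here idToName ends up as Map(1 -> "one"). Inside myRes.map { ... } the same expression is applied to the List once the Future completes.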
Update:
If I am reading your comment correctly, and you want to log a message when either field is a None, collect won't help with that, but you can do flatMap:
myRes.map {
_.flatMap {
case MyRes(Some(id), Some(name)) => Some(id -> name)
case x => logger.warn(s"Missing fields in $x."); None
}
.toMap
}
Try this:
def myMethod(myRes: Future[List[MyRes]]): Future[Map[Long, String]] = {
myRes.map (
_.flatMap(o =>
(for (id <- o.id; name <- o.name) yield (id.toLong -> name)).toList
).toMap
)
}
The trick is flattening List[Option[(Long,String)]] by using flatMap and converting the Option to a List.