Scala: batch process Map of index-based form fields

Part of a web app I'm working on handles forms that need to be bound to a collection of model (case class) instances. See this question
So, if I were to add several users at one time, form fields would be named email[0], email[1], password[0], password[1], etc.
Posting the form results in a Map[String, Seq[String]]
Now, what I would like to do is to process the Map in batches, by index, so that for each iteration I can bind a User instance, creating a List[User] as the final result of the bindings.
The hacked approach I'm thinking of is to regex match against "[\d]" in the Map keys and then find the highest index via filter or count; with that, then (0..n).toList map{ ?? } through the number of form field rows, calling the binding/validation method (which also takes a Map[String, Seq[String]]) accordingly.
What is a concise way to achieve this?

Assuming that:
All map keys are in the form "field[index]".
There is only one value in the Seq for each key.
If there is an entry for "email[x]", then there is an entry for "password[x]", and vice versa.
I would do something like this:
val request = Map(
  "email[0]" -> Seq("alice@example.com"),
  "email[1]" -> Seq("bob@example.com"),
  "password[0]" -> Seq("%vT*n7#4"),
  "password[1]" -> Seq("Bfts7B&^")
)
case class User(email: String, password: String)
val Field = """(.+)\[(\d+)\]""".r
val userList = request.groupBy { case (Field(_, idx), _) => idx.toInt }
  .mapValues { userMap =>
    def extractField(name: String) =
      userMap.collect { case (Field(`name`, _), values) => values.head }.head
    User(extractField("email"), extractField("password"))
  }
  .toList.sortBy(_._1).map(_._2)
// Exiting paste mode, now interpreting.
request: scala.collection.immutable.Map[String,Seq[String]] = Map(email[0] -> List(alice@example.com),
email[1] -> List(bob@example.com), password[0] -> List(%vT*n7#4), password[1] -> List(Bfts7B&^))
defined class User
Field: scala.util.matching.Regex = (.+)\[(\d+)\]
userList: List[User] = List(User(alice@example.com,%vT*n7#4), User(bob@example.com,Bfts7B&^))
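Since the question mentions a binding/validation method that itself takes a Map[String, Seq[String]], a variation may be useful: instead of building User directly, split the posted form into one small map per index, with the index stripped from the keys. This is only a sketch; the name rows is hypothetical, and keys that do not match "field[index]" are silently skipped, which may or may not be what you want.

```scala
// Hypothetical helper: one Map per row index, in ascending index order.
val Field = """(.+)\[(\d+)\]""".r

def rows(form: Map[String, Seq[String]]): List[Map[String, Seq[String]]] =
  form.toList
    // keep only keys shaped like name[idx]; anything else is dropped
    .collect { case (Field(name, idx), values) => (idx.toInt, name -> values) }
    .groupBy(_._1)
    .toList
    .sortBy(_._1)
    .map { case (_, entries) => entries.map(_._2).toMap }

val posted = Map(
  "email[0]" -> Seq("a"), "password[0]" -> Seq("x"),
  "email[1]" -> Seq("b"), "password[1]" -> Seq("y"),
  "csrf" -> Seq("token") // no index, so it is ignored
)
val perRow = rows(posted)
```

Each element of perRow can then be handed to the existing binding method, for example perRow.map(bind) if bind is the function that validates a single row.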

Related

Scala create immutable nested map

I have a situation here.
I have two strings:
val keyMap = "androidApp,key1;iosApp,key2;xyz,key3"
val tentMap = "androidApp,tenant1; iosApp,tenant1; xyz,tenant2"
What I want is to create a nested immutable map like this:
tenant1 -> (androidApp -> key1, iosApp -> key2),
tenant2 -> (xyz -> key3)
So basically I want to group by tenant and create a map from keyMap.
Here is what I tried, but it uses a mutable map, which I don't want. Is there a way to create this using an immutable map?
case class TenantSetting() {
val requesterKeyMapping = new mutable.HashMap[String, String]()
}
val requesterKeyMapping = keyMap.split(";")
.map { keyValueList => keyValueList.split(',')
.filter(_.size==2)
.map(keyValuePair => (keyValuePair[0],keyValuePair[1]))
.toMap
}.flatten.toMap
val config = new mutable.HashMap[String, TenantSetting]
tentMap.split(";")
.map { keyValueList => keyValueList.split(',')
.filter(_.size==2)
.map { keyValuePair =>
val requester = keyValuePair[0]
val tenant = keyValuePair[1]
if (!config.contains(tenant)) config.put(tenant, new TenantSetting)
config.get(tenant).get.requesterKeyMapping.put(requester, requesterKeyMapping.get(requester).get)
}
}
The logic to break the strings into a map can be the same for both, as they share the same syntax.
What you had for the first string was not quite right: the filter was applied to each string from the split result rather than to the array itself. That also shows in your use of [] on keyValuePair, which is of type String and not Array[String] as I think you were expecting. You also need a trim in there to cope with the spaces in the second string. You might want to trim the key and value as well, to avoid other whitespace issues.
Additionally, in this case the combination of map and filter can be done more succinctly with collect, as shown here:
How to convert an Array to a Tuple?
Using a pattern with 2 elements ensures you filter out anything with a length other than 2, as you wanted.
The iterator makes the combination of map and collect more efficient by requiring only one iteration of the collection returned from the first split.
With both strings turned into maps, it just needs the right use of groupBy to group the first map by the value the second map holds for the same key. Obviously this only works if every key of the first map is also present in the second map.
def toMap(str: String): Map[String, String] =
  str
    .split(";")
    .iterator
    .map(_.trim.split(','))
    .collect { case Array(key, value) => (key.trim, value.trim) }
    .toMap
val keyMap = toMap("androidApp,key1;iosApp,key2;xyz,key3")
val tentMap = toMap("androidApp,tenant1; iosApp,tenant1; xyz,tenant2")
val finalMap = keyMap.groupBy { case (k, _) => tentMap(k) }
Printing out finalMap gives:
Map(tenant2 -> Map(xyz -> key3), tenant1 -> Map(androidApp -> key1, iosApp -> key2))
Which is what you wanted.
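The groupBy above throws a NoSuchElementException if a key from the first map has no tenant in the second. If that can happen in your data, one possible variation is to look the tenant up with get and drop entries without one. This is a sketch under that assumption; the "orphan" entry is made up purely to show the behaviour.

```scala
val keyMap = Map(
  "androidApp" -> "key1", "iosApp" -> "key2", "xyz" -> "key3",
  "orphan" -> "key4" // hypothetical entry with no tenant
)
val tentMap = Map("androidApp" -> "tenant1", "iosApp" -> "tenant1", "xyz" -> "tenant2")

val grouped: Map[String, Map[String, String]] =
  keyMap.toList
    // tentMap.get returns an Option, so apps without a tenant vanish here
    .flatMap { case (app, key) => tentMap.get(app).map(tenant => (tenant, app -> key)) }
    .groupBy(_._1)
    .map { case (tenant, entries) => tenant -> entries.map(_._2).toMap }
```

Whether silently dropping such entries is acceptable depends on the application; reporting them instead may be safer.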

Scala : How to pass a class field into a method

I'm new to Scala and attempting to do some data analysis.
I have a CSV file with a few headers, let's say item no., item type, month, items sold.
I have made an Item class with fields matching the headers.
I split the CSV into a list, with each element being one row of the CSV file represented by the Item class.
I am attempting to make a method that will create maps based on the parameter I send in, for example to group the items sold by month or by item type. However, I am struggling to pass the Item field into a method.
For example, what I am attempting is something like:
makemaps(Item.month);
makemaps(Item.itemtype);
def makemaps(Item.field):
if (item.field==Item.month){}
else (if item.field==Item.itemType){}
However my logic for this appears to be wrong. Any ideas?
def makeMap[T](items: Iterable[Item])(extractKey: Item => T): Map[T, Iterable[Item]] =
items.groupBy(extractKey)
So given this example Item class:
case class Item(month: String, itemType: String, quantity: Int, description: String)
You could have (the explicit type arguments are optional, since the compiler can infer T from the extractor function, but they document the key type):
val byMonth = makeMap[String](items)(_.month)
val byType = makeMap[String](items)(_.itemType)
val byQuantity = makeMap[Int](items)(_.quantity)
val byDescription = makeMap[String](items)(_.description)
Note that _.month, for instance, creates a function taking an Item which results in the String contained in the month field (simplifying a little).
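To make this concrete, here is a small sketch with made-up sample rows (the Item values are assumptions for illustration) that groups by month and then totals the quantity sold per month:

```scala
case class Item(month: String, itemType: String, quantity: Int, description: String)

def makeMap[T](items: Iterable[Item])(extractKey: Item => T): Map[T, Iterable[Item]] =
  items.groupBy(extractKey)

// Made-up sample data standing in for the parsed CSV rows.
val items = List(
  Item("Jan", "book", 3, "novel"),
  Item("Jan", "pen", 10, "blue"),
  Item("Feb", "book", 1, "atlas")
)

// _.month is shorthand for (item: Item) => item.month
val byMonth = makeMap[String](items)(_.month)

// Reduce each group to the total quantity sold in that month.
val soldPerMonth: Map[String, Int] =
  byMonth.map { case (month, group) => month -> group.map(_.quantity).sum }
```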
You could, if so inclined, save the functions used for extracting keys in the companion object:
object Item {
val month: Item => String = _.month
val itemType: Item => String = _.itemType
val quantity: Item => Int = _.quantity
val description: Item => String = _.description
// Allows us to determine if using a predefined extractor or using an ad hoc one
val extractors: Set[Item => Any] = Set(month, itemType, quantity, description)
}
Then you can pass those around like so:
val byMonth = makeMap[String](items)(Item.month)
The only real change semantically is that you explicitly avoid possible extra construction of lambdas at runtime, at the cost of having the lambdas stick around in memory the whole time. A fringe benefit is that you might be able to cache the maps by extractor if you're sure that the source Items never change: for lambdas, equality is reference equality. This might be particularly useful if you have some class representing the collection of Items as opposed to just using a standard collection, like so:
import scala.collection.immutable

object Items {
  def makeMap[T](items: Iterable[Item])(extractKey: Item => T): Map[T, Iterable[Item]] =
    items.groupBy(extractKey)
}
class Items(val underlying: immutable.Seq[Item]) {
  def makeMap[T](extractKey: Item => T): Map[T, Iterable[Item]] =
    if (Item.extractors.contains(extractKey)) {
      // Predefined extractor: return the cached (lazily computed) grouping
      if (extractKey == Item.month) groupedByMonth.asInstanceOf[Map[T, Iterable[Item]]]
      else if (extractKey == Item.itemType) groupedByItemType.asInstanceOf[Map[T, Iterable[Item]]]
      else if (extractKey == Item.quantity) groupedByQuantity.asInstanceOf[Map[T, Iterable[Item]]]
      else if (extractKey == Item.description) groupedByDescription.asInstanceOf[Map[T, Iterable[Item]]]
      else throw new AssertionError("Shouldn't happen!")
    } else {
      // Ad hoc extractor: compute the grouping every time
      Items.makeMap(underlying)(extractKey)
    }
  lazy val groupedByMonth = Items.makeMap[String](underlying)(Item.month)
  lazy val groupedByItemType = Items.makeMap[String](underlying)(Item.itemType)
  lazy val groupedByQuantity = Items.makeMap[Int](underlying)(Item.quantity)
  lazy val groupedByDescription = Items.makeMap[String](underlying)(Item.description)
}
(that is almost certainly a personal record for asInstanceOfs in a small block of code... I'm not sure if I should be proud or ashamed of this snippet)

How to convert Seq[Object] to Map[User, Set[String]] in Scala

It's really hard to explain in the title, but here's what I want to do.
I'm pretty new to Scala. I have a User object; two users are considered equal if they have the same user id.
case class UserCustomFeature(
hobby: String,
users: Set[User]
) {}
My input is a Seq[UserCustomFeature], so basically a list of hobby -> users objects. For example,
[('tv' -> Set('user1', 'user2')),
('swimming' -> Set('user2', 'user3'))]
And I want the result to be
('user1' -> Set('tv')),
('user2' -> Set('tv', 'swimming')),
('user3' -> Set('swimming'))
I have something like this so far but I'm not sure how to group them later
userHobbyMap
.map({
case (hobby, users) => {
users.map(user => {
(user, hobby)
})
}
})
case class User(id: String)
case class UserCustomFeature(
  hobby: String,
  users: Set[User]
) {}
val input = Seq(
  UserCustomFeature("tv", Set(User("1"), User("2"))),
  UserCustomFeature("swimming", Set(User("2"), User("3")))
)
val output = (for (UserCustomFeature(h, us) <- input; u <- us) yield (u, h))
  .groupBy(_._1)
  .mapValues(_.map(_._2).toSet)
output foreach println
Generates output:
(User(1),Set(tv))
(User(3),Set(swimming))
(User(2),Set(tv, swimming))
Brief explanation:
The for-comprehension flattens the sequence of UserCustomFeatures into a list of (user, hobby) pairs.
groupBy groups hobbies by user (first component)
The map(_._2) drops the redundant user id from grouped pairs
toSet converts the resulting list of hobbies to a set of hobbies
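As an alternative to groupBy followed by mapValues, the same inversion can be sketched as a nested foldLeft that builds the immutable result map directly. This is just one possible formulation; it avoids the intermediate list of pairs, at the cost of being a little less declarative.

```scala
case class User(id: String)
case class UserCustomFeature(hobby: String, users: Set[User])

val input = Seq(
  UserCustomFeature("tv", Set(User("1"), User("2"))),
  UserCustomFeature("swimming", Set(User("2"), User("3")))
)

// Start from an empty map and add each hobby to the set of every user who has it.
val output: Map[User, Set[String]] =
  input.foldLeft(Map.empty[User, Set[String]]) { (acc, feature) =>
    feature.users.foldLeft(acc) { (m, user) =>
      m.updated(user, m.getOrElse(user, Set.empty[String]) + feature.hobby)
    }
  }
```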

Does this specific exercise lend itself well to a 'functional style' design pattern?

Say we have an array of one dimensional javascript objects contained in a file Array.json for which the key schema isn't known, that is the keys aren't known until the file is read.
Then we wish to output a CSV file with a header or first entry which is a comma delimited set of keys from all of the objects.
Each next line of the file should contain the comma separated values which correspond to each key from the file.
Array.json
[
abc:123,
xy:"yz",
s12:13,
],
...
[
abc:1
s:133,
]
A valid output:
abc,xy,s12,s
123,yz,13,
1,,,133
I'm teaching myself 'functional style' programming but I'm thinking that this problem doesn't lend itself well to a functional solution.
I believe that this problem requires some state to be kept for the output header and that subsequently each line depends on that header.
I'm looking to solve the problem in a single pass. My goals are efficiency for a large data set, minimal traversals, and if possible, parallelizability. If this isn't possible then can you give a proof or reasoning to explain why?
EDIT: Is there a way to solve the problem like this functionally?:
Say you pass through the array once, in some particular order. Then
from the start the header set looks like abc,xy,s12 for the first
object. With CSV entry 123,yz,13 . Then on the next object we add an
additional key to the header set so abc,xy,s12,s would be the header
and the CSV entry would be 1,,,133 . In the end we wouldn't need to
pass through the data set a second time. We could just append extra
commas to the result set. This is one way we could approach a single
pass....
Are there functional tools (functions) designed to solve problems like this, and what should I be considering? (By functional tools I mean monads, flatMap, filters, etc.) Alternatively, should I be considering things like Futures?
Currently I've been trying to approach this using Java 8, but I am open to solutions in Scala, etc. Ideally I would be able to determine whether Java 8's functional approach can solve the problem, since that's the language I'm currently working in.
Since the csv output will change with every new line of input, you must hold that in memory before writing it out. If you consider creating an output text format from an internal representation of a csv file another "pass" over the data (the internal representation of the csv is practically a Map[String,List[String]] which you must traverse to convert it to text) then it's not possible to do this in a single pass.
If, however, this is acceptable, then you can use a Stream to read a single item from your json file, merge that into the csv file, and do this until the stream is empty.
Assuming that the internal representation of the csv file is
trait CsvFile {
def merge(line: Map[String, String]): CsvFile
}
And you can represent a single item as
trait Item {
def asMap: Map[String, String]
}
You can implement it using foldLeft:
def toCsv(items: Stream[Item]): CsvFile =
items.foldLeft(CsvFile(Map()))((csv, item) => csv.merge(item.asMap))
or use recursion to get the same result
import scala.annotation.tailrec
@tailrec def toCsv(items: Stream[Item], prevCsv: CsvFile): CsvFile =
  items match {
    case Stream.Empty => prevCsv
    case item #:: rest =>
      val newCsv = prevCsv.merge(item.asMap)
      toCsv(rest, newCsv)
  }
Note: Of course you don't have to create types for CsvFile or Item, you can use Map[String,List[String]] and Map[String,String] respectively
UPDATE:
As more detail was requested for the CsvFile trait/class, here's an example implementation:
case class CsvFile(lines: Map[String, List[String]], rowCount: Int = 0) {
  def merge(line: Map[String, String]): CsvFile = {
    // Columns not seen before are back-filled with one empty cell per earlier row
    val orig = lines.withDefaultValue(List.fill(rowCount)(""))
    val current = line.withDefaultValue("")
    val newLines = (lines.keySet ++ line.keySet) map {
      k => (k, orig(k) :+ current(k))
    }
    CsvFile(newLines.toMap, rowCount + 1)
  }
}
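The traversal that turns this column-oriented representation into text, mentioned at the start of this answer, could look roughly like the sketch below. The render function is hypothetical and not part of the CsvFile shown above. Because a Map does not remember the order in which columns were added, the header is simply sorted here, which is an assumption; no quoting or escaping of values is attempted.

```scala
// Hypothetical renderer for the column-oriented representation.
def render(lines: Map[String, List[String]], rowCount: Int): String = {
  val header = lines.keys.toList.sorted // assumed column order: alphabetical
  val rows = (0 until rowCount).map { i =>
    header.map(k => lines(k)(i)).mkString(",")
  }
  (header.mkString(",") +: rows).mkString("\n")
}

// Columns as they would look after merging the two objects from the question.
val columns = Map(
  "abc" -> List("123", "1"),
  "xy"  -> List("yz", ""),
  "s12" -> List("13", ""),
  "s"   -> List("", "133")
)
val text = render(columns, rowCount = 2)
```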
This could be one approach:
val arr = Array(Map("abc" -> 123, "xy" -> "yz", "s12" -> 13), Map("abc" -> 1, "s" -> 133))
val keys = arr.flatMap(_.keys).distinct // get the distinct keys for header
arr.map(x => keys.map(y => x.getOrElse(y,""))) // get an array of rows
It's completely OK to have state in functional programming. What functional programming avoids is mutable state, that is, mutating state in place.
Functional programming advocates creating a new, changed state instead of mutating the existing one.
So it's fine to read and access state created in the program as long as you are not mutating it or causing side effects.
Coming to the point:
val list = List(List("abc" -> "123", "xy" -> "yz"), List("abc" -> "1"))
list.map { inner => inner.map { case (k, v) => k}}.flatten
list.map { inner => inner.map { case (k, v) => v}}.flatten
REPL
scala> val list = List(List("abc" -> "123", "xy" -> "yz"), List("abc" -> "1"))
list: List[List[(String, String)]] = List(List((abc,123), (xy,yz)), List((abc,1)))
scala> list.map { inner => inner.map { case (k, v) => k}}.flatten
res1: List[String] = List(abc, xy, abc)
scala> list.map { inner => inner.map { case (k, v) => v}}.flatten
res2: List[String] = List(123, yz, 1)
or use flatMap instead of map and flatten
val list = List(List("abc" -> "123", "xy" -> "yz"), List("abc" -> "1"))
list.flatMap { inner => inner.map { case (k, v) => k}}
list.flatMap { inner => inner.map { case (k, v) => v}}
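The point about creating new state rather than mutating it can be sketched for the CSV problem itself: the header is accumulated with foldLeft, where each step returns a new Vector instead of changing the old one. This is only an illustration under assumptions; ListMap is used so that each row's key order is well defined, and the rows are still traversed a second time to pad missing cells.

```scala
import scala.collection.immutable.ListMap

val input: List[Map[String, String]] = List(
  ListMap("abc" -> "123", "xy" -> "yz", "s12" -> "13"),
  ListMap("abc" -> "1", "s" -> "133")
)

// Header keys in first-seen order; every step yields a new immutable Vector.
val header: Vector[String] =
  input.foldLeft(Vector.empty[String]) { (seen, row) =>
    seen ++ row.keys.filterNot(k => seen.contains(k))
  }

// Second traversal: emit one cell per header column, empty when the key is missing.
val lines = input.map(row => header.map(k => row.getOrElse(k, "")).mkString(","))
val csv = (header.mkString(",") :: lines).mkString("\n")
```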
In functional programming, mutable state is avoided, but immutable state and values are fine.
Assuming that you have read your JSON file into a value input: List[Map[String, String]], the code below will solve your problem:
val input = List(Map("abc"->"123", "xy"->"yz" , "s12"->"13"), Map("abc"->"1", "s"->"33"))
val keys = input.flatMap(_.keys).distinct // a List keeps header and row columns in the same order
val values = input.map(kvs => keys.map(k => kvs.getOrElse(k, "")))
val result = keys.mkString(",") + "\n" + values.map(_.mkString(",")).mkString("\n")

Filtering futures using values in another future

I have two futures.
One future (idsFuture) holds the computation that gets the list of ids. The type of idsFuture is Future[List[Int]].
Another future (dataFuture) holds an array of A, where A is defined as case class A(id: Int, data: String). The type of dataFuture is Future[Array[A]].
I want to filter dataFuture's contents using the ids present in idsFuture.
For example-
case class A(id: Int, data: String)
val dataFuture = Future(Array(A(1,"a"), A(2,"b"), A(3,"c")))
val idsFuture = Future(List(1,2))
I should get another future holding Array(A(1,"a"), A(2,"b")).
I currently do
idsFuture.flatMap{
ids => dataFuture.map(datas => datas.filter(data => ids.contains(data.id)))}
Is there a better solution?
You could use for-comprehension here instead of flatMap + map like this:
for {
  ds <- dataFuture
  idsList <- idsFuture
  ids = idsList.toSet
} yield ds filter { d => ids(d.id) }
Note that apply on a Set is faster than contains on a List.
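For completeness, here is a self-contained sketch of the same idea using Future.zip, which pairs the two results without nesting. The global execution context and the blocking Await with a 5 second timeout are assumptions made only so the example can run on its own; in real code you would normally keep working with the resulting future.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

case class A(id: Int, data: String)

val dataFuture = Future(Array(A(1, "a"), A(2, "b"), A(3, "c")))
val idsFuture = Future(List(1, 2))

// zip combines both results into a tuple once both futures have completed.
val filtered: Future[Array[A]] =
  dataFuture.zip(idsFuture).map { case (ds, idsList) =>
    val ids = idsList.toSet // Set lookup instead of a linear List scan
    ds.filter(d => ids(d.id))
  }

// Blocking here is for demonstration only.
val result = Await.result(filtered, 5.seconds).toList
```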