Modifying generic maps in Scala - scala

I'm new to the Scala landscape after spending the last 10 years in Java and the last ~year in Groovy. Hi Scala!
For the life of me I can't get the following code snippet to compile, and it's just complicated enough that the Google Gods aren't helping me.
I have a map that will contain Strings for keys and Lists of Tuples for values. The tuples will be a String-Long pair. In Groovy this would look like:
Map<String,List<Tuple2<String,Long>>> data = [:]
I need to be able to add and modify keys and values for this map. Specifically, I need to:
Add to the List of Tuples for existing keys
If a key doesn't exist, instantiate a new List of Tuples, and then add the key and list as a map entry
In Groovy this would look like:
Map<String,List<Tuple2<String,Long>>> data = [:]
def addData(String key, String message) {
    Long currTime = System.currentTimeMillis()
    Tuple2<String,Long> tuple = new Tuple2<String,Long>(message, currTime)
    if (data.containsKey(key)) {
        data[key] << tuple
    } else {
        data[key] = new ArrayList<Tuple2<String,Long>>()
        data[key] << tuple
    }
}
I'm trying to do this in Scala, albeit unsuccessfully.
My best attempt thus far:
object MapUtils {
// var data : Map[String,ListBuffer[(String,Long)]] = Map()
val data = collection.mutable.Map[String, ListBuffer[(String, Long)]]()
def addData(key : String, message : String) : Unit = {
val newTuple = (message, System.currentTimeMillis())
val optionalOldValue = data.get(key)
optionalOldValue match {
case Some(olderBufferList) => olderBufferList += newTuple
case None => data
.put(key, ListBuffer[(String, Long)](newTuple))
}
}
}
The compiler complains with this error on the case Some(olderBufferList) => olderBufferList += newTuple line:
value += is not a member of Any
Any ideas what I can do to get this compiling & working?

You are missing an import for ListBuffer. The following code works perfectly fine in 2.9.1 (tested on TryScala), 2.11.7 (tested on IDEOne) and 2.11.8. Note the only addition is the first line adding the import:
import collection.mutable.ListBuffer
object MapUtils {
// var data : Map[String,ListBuffer[(String,Long)]] = Map()
val data = collection.mutable.Map[String, ListBuffer[(String, Long)]]()
def addData(key : String, message : String) : Unit = {
val newTuple = (message, System.currentTimeMillis())
val optionalOldValue = data.get(key)
optionalOldValue match {
case Some(olderBufferList) => olderBufferList += newTuple
case None => data
.put(key, ListBuffer[(String, Long)](newTuple))
}
}
}
MapUtils.addData("123", "message 1")
MapUtils.addData("456", "message 2")
MapUtils.data
//=> Map(456 -> ListBuffer((message 2,1472925061065)), 123 -> ListBuffer((message 1,1472925060926)))

The short version for your needs would be:
import scala.collection.mutable
import scala.collection.mutable.ListBuffer
val map = mutable.Map[String, ListBuffer[(String, Long)]]()
map.put(key, map.getOrElse(key, ListBuffer[(String, Long)]()) += ((message, System.currentTimeMillis())))
Your code has some syntax issues; if I were to rewrite addData, it would look like this:
def addData(key : String, message : String) : Unit = {
val newTuple = (message, System.currentTimeMillis())
val optionalOldValue = map.get(key)
optionalOldValue match {
case Some(olderBufferList) => olderBufferList += newTuple
case None => map.put(key, ListBuffer[(String, Long)](newTuple))
}
}
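The match on Option above can also be collapsed with getOrElseUpdate from scala.collection.mutable.Map, which returns the existing value or inserts and returns a default. The following is only a minimal sketch; the object name MessageStore is made up for illustration and is not from the question.

```scala
import scala.collection.mutable
import scala.collection.mutable.ListBuffer

// Hypothetical container object, named for illustration only.
object MessageStore {
  val data: mutable.Map[String, ListBuffer[(String, Long)]] = mutable.Map.empty

  def addData(key: String, message: String): Unit = {
    // Reuse the buffer stored under `key`, or insert a fresh empty one first.
    val buffer = data.getOrElseUpdate(key, ListBuffer.empty[(String, Long)])
    buffer += ((message, System.currentTimeMillis()))
  }
}
```

Calling MessageStore.addData twice with the same key appends both tuples to the same ListBuffer, in insertion order.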

Related

Read Hocon config as a Map[String, String] with key in dot notation and value

I have following HOCON config:
a {
b.c.d = "val1"
d.f.g = "val2"
}
HOCON represents the paths "b.c.d" and "d.f.g" as nested objects. I would like a reader that reads this config as a Map[String, String], e.g.:
Map("b.c.d" -> "val1", "d.f.g" -> "val2")
I've created a reader and am trying to do it recursively:
import scala.collection.mutable.{Map => MutableMap}
private implicit val mapReader: ConfigReader[Map[String, String]] = ConfigReader.fromCursor(cur => {
def concat(prefix: String, key: String): String = if (prefix.nonEmpty) s"$prefix.$key" else key
def toMap(): Map[String, String] = {
val acc = MutableMap[String, String]()
def go(
cur: ConfigCursor,
prefix: String = EMPTY,
acc: MutableMap[String, String]
): Result[Map[String, Object]] = {
cur.fluent.mapObject { obj =>
obj.value.valueType() match {
case ConfigValueType.OBJECT => go(obj, concat(prefix, obj.pathElems.head), acc)
case ConfigValueType.STRING =>
acc += (concat(prefix, obj.pathElems.head) -> obj.asString.right.getOrElse(EMPTY))
}
obj.asRight
}
}
go(cur, acc = acc)
acc.toMap
}
toMap().asRight
})
It gives me the correct result, but is there a way to avoid MutableMap here?
P.S. I would also like to keep the implementation as a pureconfig reader.
The solution given by Ivan Stanislavciuc isn't ideal. If the parsed config object contains values other than strings or objects, you don't get an error message (as you would expect) but instead some very strange output. For instance, if you parse a typesafe config document like this
"a":[1]
The resulting value will look like this:
Map(a -> [
# String: 1
1
])
And even if the input only contains objects and strings, it doesn't work correctly, because it erroneously adds double quotes around all the string values.
So I gave this a shot myself and came up with a recursive solution that reports an error for things like lists or null and doesn't add quotes that shouldn't be there.
implicit val reader: ConfigReader[Map[String, String]] = {
implicit val r: ConfigReader[String => Map[String, String]] =
ConfigReader[String]
.map(v => (prefix: String) => Map(prefix -> v))
.orElse { reader.map { v =>
(prefix: String) => v.map { case (k, v2) => s"$prefix.$k" -> v2 }
}}
ConfigReader[Map[String, String => Map[String, String]]].map {
_.flatMap { case (prefix, v) => v(prefix) }
}
}
Note that my solution doesn't mention ConfigValue or ConfigReader.Result at all. It only takes existing ConfigReader objects and combines them with combinators like map and orElse. This is, generally speaking, the best way to write ConfigReaders: don't start from scratch with methods like ConfigReader.fromFunction, use existing readers and combine them.
It seems a bit surprising at first that the above code works at all, because I'm using reader within its own definition. But it works because the orElse method takes its argument by name and not by value.
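The underlying recursion (descend into objects, join keys with dots, fail on anything that is neither a string nor an object) can be sketched without any mutable state and without pureconfig. In the sketch below a plain nested Map[String, Any] stands in for the parsed config tree; that stand-in, the object name FlattenKeys and the error message are assumptions made for illustration, not part of pureconfig or Typesafe Config.

```scala
// Illustration only: a nested Map[String, Any] stands in for the config tree.
object FlattenKeys {
  type Result = Either[String, Map[String, String]]

  // Recursively flattens nested maps into dot-separated keys.
  // Returns Left with a message for values that are neither strings nor maps.
  def flatten(tree: Map[String, Any], prefix: String = ""): Result =
    tree.foldLeft[Result](Right(Map.empty[String, String])) {
      case (acc, (key, value)) =>
        val path = if (prefix.isEmpty) key else s"$prefix.$key"
        val entries: Result = value match {
          case s: String    => Right(Map(path -> s))
          case m: Map[_, _] => flatten(m.asInstanceOf[Map[String, Any]], path)
          case other        => Left(s"unsupported value at '$path': $other")
        }
        // Combine the accumulated entries with the new ones; the first Left wins.
        for {
          soFar <- acc
          more  <- entries
        } yield soFar ++ more
    }
}
```

The accumulator is an immutable Map threaded through foldLeft, so each recursive call returns its own result instead of writing into a shared MutableMap.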
You can do the same without recursion by using the entrySet method, as follows:
import scala.jdk.CollectionConverters._
val hocon =
"""
|a {
| b.c.d = "val1"
| d.f.g = val2
|}""".stripMargin
val config = ConfigFactory.load(ConfigFactory.parseString(hocon))
val innerConfig = config.getConfig("a")
val map = innerConfig
.entrySet()
.asScala
.map { entry =>
entry.getKey -> entry.getValue.render()
}
.toMap
println(map)
Produces
Map(b.c.d -> "val1", d.f.g -> "val2")
Given that, it's possible to define a pureconfig.ConfigReader that reads a Map[String, String] as follows:
implicit val reader: ConfigReader[Map[String, String]] = ConfigReader.fromFunction {
case co: ConfigObject =>
Right(
co.toConfig
.entrySet()
.asScala
.map { entry =>
entry.getKey -> entry.getValue.render()
}
.toMap
)
case value =>
//Handle error case
Left(
ConfigReaderFailures(
ThrowableFailure(
new RuntimeException("cannot be mapped to map of string -> string"),
Option(value.origin())
)
)
)
}
I did not want to write custom readers to get a mapping of key value pairs. I instead changed my internal data type from a map to list of pairs (I am using kotlin), and then I can easily change that to a map at some later internal stage if I need to. My HOCON was then able to look like this.
additionalProperties = [
{first = "sasl.mechanism", second = "PLAIN"},
{first = "security.protocol", second = "SASL_SSL"},
]
additionalProducerProperties = [
{first = "acks", second = "all"},
]
Not the best for humans... but I prefer it to having to build custom parsing components.

Scala map : How to add new entries

I have created my scala map as :
val A:Map[String, String] = Map()
Then I am trying to add entries as :
val B = AttributeCodes.map { s =>
val attributeVal:String = <someString>
if (!attributeVal.isEmpty)
{
A + (s -> attributeVal)
}
else
()
}
And after this part of the code, I am seeing A is still empty. And, B is of type :
Pattern: B: IndexedSeq[Any]
I need a map to add entries and the same or different map in return to be used later in the code. However, I can not use "var" for that. Any insight on this problem and how to resolve this?
Scala uses immutability in many cases and encourages you to do the same.
Do not create an empty map, create a Map[String, String] with .map and .filter
val A = AttributeCodes.map { s =>
val attributeVal:String = <someString>
s -> attributeVal
}.toMap.filter(e => !e._1.isEmpty && !e._2.isEmpty)
In Scala, the default Map type is immutable. <Map> + <Tuple> creates a new map instance with the additional entry added.
There are 2 ways round this:
Use scala.collection.mutable.Map instead:
val A: mutable.Map[String, String] = mutable.Map()
AttributeCodes.foreach { s =>
  val attributeVal: String = <someString>
  if (!attributeVal.isEmpty) {
    A.put(s, attributeVal)
  }
}
Create an immutable map using a fold:
val A: Map[String, String] = AttributeCodes.foldLeft(Map.empty[String, String]) { (m, s) =>
  val attributeVal: String = <someString>
  if (!attributeVal.isEmpty) {
    m + (s -> attributeVal)
  } else {
    m
  }
}
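A related way to build the map without var or a mutable collection is to produce an Option per code and let flatMap drop the empty ones. This is only a sketch: attributeCodes and lookup are hypothetical stand-ins for the question's AttributeCodes and the elided <someString> expression.

```scala
object AttributeMaps {
  // Hypothetical inputs, for illustration only.
  val attributeCodes: Seq[String] = Seq("color", "size", "weight")

  // Hypothetical lookup; returns an empty string when there is no value.
  def lookup(code: String): String = code match {
    case "color"  => "red"
    case "weight" => "2kg"
    case _        => ""
  }

  // Keep only the codes with a non-empty value, then convert the pairs to a Map.
  val attributes: Map[String, String] =
    attributeCodes.flatMap { code =>
      val value = lookup(code)
      if (value.nonEmpty) Some(code -> value) else None
    }.toMap
}
```

Because every branch yields an Option of a pair, the result is a Map[String, String] rather than the IndexedSeq[Any] seen in the question.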

Create a list with empty map

I have a JSON string which is parsed and typecast to a map. I'm using this map to get a List[Map[String, Any]]. To make my code error-free, I have used getOrElse while typecasting.
JSON string looks similar to
{
  "map-key" : [
    {
      "list-object-1-key" : "list-object-1-value"
    },
    {
      "list-object-2-key" : "list-object-2-value"
    }
  ]
}
My code
val json = JSON.parseFull(string) match {
case Some(e) =>
val list = e.asInstanceOf[Map[String, Any]]
.getOrElse("map-key", List[Map[String,Any]]) // Error here
val info = list.asInstanceOf[List[Map[String, Any]]]
//iterate over each element in the list and perform my operations
case None => string
}
I can understand that whenever there is no result present in list object then info object is repeated code.
How can I improve this programme by giving the default value to list object?
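Do it in a more functional way, with pattern matching instead of asInstanceOf:
val parsed = JSON.parseFull(string)
parsed match {
  case Some(e: Map[String, Any]) =>
    e.get("map-key") match {
      case Some(a: List[Any]) =>
        a.foreach {
          case inner: Map[String, Any] => println(inner.toList)
          case _ => // skip elements that are not maps
        }
      case _ => // key missing, or its value is not a list
    }
  case _ => string // not a JSON object; fall back to the raw input
}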
Your default value is wrong. You're passing a type, not an empty list.
e.asInstanceOf[Map[String, Any]].getOrElse("map-key", List.empty[Map[String,Any]])
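The two points above (a real empty list as the default, and matching on shape rather than casting blindly) can be combined into one helper. The sketch below assumes the parser hands back an Option[Any], as the match on Some/None in the question suggests; the names JsonListExtraction and listOfMaps are made up for illustration.

```scala
object JsonListExtraction {
  // Returns the list of maps stored under `key`, or an empty list when the
  // input is missing, is not an object, or the value under `key` is not a list.
  def listOfMaps(parsed: Option[Any], key: String): List[Map[String, Any]] =
    parsed match {
      case Some(obj: Map[_, _]) =>
        obj.asInstanceOf[Map[String, Any]].get(key) match {
          case Some(items: List[_]) =>
            // Keep only the elements that really are maps.
            items.collect { case m: Map[_, _] => m.asInstanceOf[Map[String, Any]] }
          case _ => Nil
        }
      case _ => Nil
    }
}
```

The remaining asInstanceOf calls only refine type parameters that erasure cannot check at runtime; the outer shape (Map, List) is verified by the patterns first.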
Unfortunately I don't have an environment on this machine, but try something like the following.
First, convert the JSON to a map (this uses json4s):
def jsonStrToMap(jsonStr: String): Map[String, Any] = {
  implicit val formats = org.json4s.DefaultFormats
  parse(jsonStr).extract[Map[String, Any]]
}
Second, iterate over the map to get its entries as a list:
val list = jsonStrToMap(string).map { case (k, v) => (k.getBytes, v) }.toList

How to avoid any mutable things in this builder?

I have a simple Scala class like this:
class FiltersBuilder {
  def build(filter: CommandFilter) = {
    val result = collection.mutable.Map[String, String]()
    if (filter.activity.isDefined) {
      result += ("activity" -> """ some specific expression """)
    } // I know that manipulating an Option like this is not recommended,
      // it's just for the simplicity of the example
    if (filter.gender.isDefined) {
      result += ("gender" -> """ some specific expression """)
    }
    result.toMap // in order to return an immutable Map
  }
}
It uses this class:
case class CommandFilter(activity: Option[String] = None, gender: Option[String] = None)
The result content depends on the nature of the selected filters and their associated and hardcoded expressions (String).
Is there a way to transform this code snippet by removing this "mutability" of the mutable.Map?
Map each filter field to an optional tuple and collect the results in a Seq, then drop the Nones with flatten, and finally convert the Seq of tuples to a Map with toMap.
To filter on more fields, you just have to add a new line to the Seq.
def build(filter: CommandFilter) = {
  // map each filter field to the proper tuple
  // as they are options, map transforms only the Somes and leaves each None as None
  val result = Seq(
    filter.activity.map(value => "activity" -> s""" some specific expression using $value """),
    filter.gender.map(value => "gender" -> s""" some specific expression using $value """)
  ).flatten // flatten filters out all the Nones
  result.toMap // transform the list of tuples to a map
}
Hope it helps.
Gaston.
Since there are at most 2 elements in your Map:
val activity = filter.activity.map(_ => Map("activity" -> "xx"))
val gender = filter.gender.map(_ => Map("gender" -> "xx"))
val empty = Map[String, String]()
activity.getOrElse(empty) ++ gender.getOrElse(empty)
I've just managed to achieve it with this solution:
class FiltersBuilder(commandFilter: CommandFilter) {
  def build = {
    val result = Map[String, String]()
    buildGenderFilter(buildActivityFilter(result))
  }
  // Each step returns the incoming map unchanged when its filter is absent.
  private def buildActivityFilter(expressions: Map[String, String]) =
    commandFilter.activity.fold(expressions)(activity => expressions + ("activity" -> """ expression regarding activity """))
  private def buildGenderFilter(expressions: Map[String, String]) =
    commandFilter.gender.fold(expressions)(gender => expressions + ("gender" -> """ expression regarding gender """))
}
Any better way?
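One more variant, offered only as a sketch: on Scala 2.13 and 3, Option is an IterableOnce, so an optional tuple can be appended to an immutable Map directly with ++. The object name FiltersBuilderSketch and the expression strings below are placeholders made up for illustration.

```scala
object FiltersBuilderSketch {
  final case class CommandFilter(activity: Option[String] = None, gender: Option[String] = None)

  // Each Option contributes zero or one entry; a None adds nothing.
  def build(filter: CommandFilter): Map[String, String] =
    Map.empty[String, String] ++
      filter.activity.map(a => "activity" -> s"activity = '$a'") ++
      filter.gender.map(g => "gender" -> s"gender = '$g'")
}
```

Adding another filter field means adding one more ++ line, and no intermediate mutable state is involved.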

Scala: how to traverse stream/iterator collecting results into several different collections

I'm going through a log file that is too big to fit into memory and collecting two types of expressions. What is a better functional alternative to my iterative snippet below?
def streamData(file: File, errorPat: Regex, loginPat: Regex): List[(String, String)]={
val lines : Iterator[String] = io.Source.fromFile(file).getLines()
val logins: mutable.Map[String, String] = new mutable.HashMap[String, String]()
val errors: mutable.ListBuffer[(String, String)] = mutable.ListBuffer.empty
for (line <- lines){
line match {
case errorPat(date,ip)=> errors.append((ip,date))
case loginPat(date,user,ip,id) =>logins.put(ip, id)
case _ => ""
}
}
errors.toList.map(line => (logins.getOrElse(line._1,"none") + " " + line._1,line._2))
}
Here is a possible solution:
def streamData(file: File, errorPat: Regex, loginPat: Regex): List[(String,String)] = {
  val lines = Source.fromFile(file).getLines
  // Tag each matching line as either an error or a login, then split the two streams.
  val (err, log) = lines.collect {
    case errorPat(inf, ip) => (Some((ip, inf)), None)
    case loginPat(_, _, ip, id) => (None, Some((ip, id)))
  }.toList.unzip
  val ip2id = log.flatten.toMap
  err.collect { case Some((ip, inf)) => (ip2id.getOrElse(ip, "none") + " " + ip, inf) }
}
Corrections:
1) removed unnecessary type declarations
2) tuple deconstruction instead of ugly ._1
3) left fold instead of mutable accumulators
4) used the more convenient operator-like methods :+ and +
def streamData(file: File, errorPat: Regex, loginPat: Regex): List[(String, String)] = {
val lines = io.Source.fromFile(file).getLines()
val (logins, errors) =
((Map.empty[String, String], Seq.empty[(String, String)]) /: lines) {
case ((loginsAcc, errorsAcc), next) =>
next match {
case errorPat(date, ip) => (loginsAcc, errorsAcc :+ (ip -> date))
case loginPat(date, user, ip, id) => (loginsAcc + (ip -> id) , errorsAcc)
case _ => (loginsAcc, errorsAcc)
}
}
// more concise equivalent for
// errors.toList.map { case (ip, date) => (logins.getOrElse(ip, "none") + " " + ip) -> date }
for ((ip, date) <- errors.toList)
yield (logins.getOrElse(ip, "none") + " " + ip) -> date
}
I have a few suggestions:
Instead of a pair/tuple, it's often better to use your own class. It gives meaningful names to both the type and its fields, which makes the code much more readable.
Split the code into small parts. In particular, try to decouple pieces of code that don't need to be tied together. This makes your code easier to understand, more robust, less prone to errors and easier to test. In your case it'd be good to separate producing your input (lines of a log file) and consuming it to produce a result. For example, you'd be able to make automatic tests for your function without having to store sample data in a file.
As an example and exercise, I tried to make a solution based on Scalaz iteratees. It's a bit longer (includes some auxiliary code for IteratorEnumerator) and perhaps it's a bit overkill for the task, but perhaps someone will find it helpful.
import java.io._;
import scala.util.matching.Regex
import scalaz._
import scalaz.IterV._
object MyApp extends App {
// A type for the result. Having names keeps things
// clearer and shorter.
type LogResult = List[(String,String)]
// Represents a state of our computation. Not only it
// gives a name to the data, we can also put here
// functions that modify the state. This nicely
// separates what we're computing and how.
sealed case class State(
logins: Map[String,String],
errors: Seq[(String,String)]
) {
def this() = {
this(Map.empty[String,String], Seq.empty[(String,String)])
}
def addError(date: String, ip: String): State =
State(logins, errors :+ (ip -> date));
def addLogin(ip: String, id: String): State =
State(logins + (ip -> id), errors);
// Produce the final result from accumulated data.
def result: LogResult =
for ((ip, date) <- errors.toList)
yield (logins.getOrElse(ip, "none") + " " + ip) -> date
}
// An iteratee that consumes lines of our input. Based
// on the given regular expressions, it produces an
// iteratee that parses the input and uses State to
// compute the result.
def logIteratee(errorPat: Regex, loginPat: Regex):
IterV[String,List[(String,String)]] = {
// Consumes a single line.
def consume(line: String, state: State): State =
line match {
case errorPat(date, ip) => state.addError(date, ip);
case loginPat(date, user, ip, id) => state.addLogin(ip, id);
case _ => state
}
// The core of the iteratee. Every time we consume a
// line, we update our state. When done, compute the
// final result.
def step(state: State)(s: Input[String]): IterV[String, LogResult] =
s(el = line => Cont(step(consume(line, state))),
empty = Cont(step(state)),
eof = Done(state.result, EOF[String]))
// Return the iteratee waiting for its first input.
Cont(step(new State()));
}
// Converts an iterator into an enumerator. This
// should be more likely moved to Scalaz.
// Adapted from scalaz.ExampleIteratee
implicit val IteratorEnumerator = new Enumerator[Iterator] {
@annotation.tailrec def apply[E, A](e: Iterator[E], i: IterV[E, A]): IterV[E, A] = {
val next: Option[(Iterator[E], IterV[E, A])] =
if (e.hasNext) {
val x = e.next();
i.fold(done = (_, _) => None, cont = k => Some((e, k(El(x)))))
} else
None;
next match {
case None => i
case Some((es, is)) => apply(es, is)
}
}
}
// main ---------------------------------------------------
{
// Read a file as an iterator of lines:
// val lines: Iterator[String] =
// io.Source.fromFile("test.log").getLines();
// Create our testing iterator:
val lines: Iterator[String] = Seq(
"Error: 2012/03 1.2.3.4",
"Login: 2012/03 user 1.2.3.4 Joe",
"Error: 2012/03 1.2.3.5",
"Error: 2012/04 1.2.3.4"
).iterator;
// Create an iteratee.
val iter = logIteratee("Error: (\\S+) (\\S+)".r,
"Login: (\\S+) (\\S+) (\\S+) (\\S+)".r);
// Run the iteratee against the input
// (the enumerator is implicit)
println(iter(lines).run);
}
}
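The same two suggestions (a named state type, and processing decoupled from file input) can also be sketched with the standard library alone, using a left fold over any Iterator[String]. The names LogSummary, LogState and summarize are made up for illustration; the regexes and sample lines are the ones from the Scalaz example.

```scala
import scala.util.matching.Regex

object LogSummary {
  // Immutable accumulator: last login id seen per IP, plus errors in input order.
  final case class LogState(logins: Map[String, String], errors: Vector[(String, String)]) {
    def addError(date: String, ip: String): LogState = copy(errors = errors :+ (ip -> date))
    def addLogin(ip: String, id: String): LogState = copy(logins = logins + (ip -> id))
    // Pair each error with the login id for its IP, or "none" when unknown.
    def result: List[(String, String)] =
      errors.toList.map { case (ip, date) => (logins.getOrElse(ip, "none") + " " + ip) -> date }
  }

  // Takes the lines as an Iterator, so tests can pass in-memory data
  // and production code can pass Source.fromFile(file).getLines().
  def summarize(lines: Iterator[String], errorPat: Regex, loginPat: Regex): List[(String, String)] =
    lines.foldLeft(LogState(Map.empty, Vector.empty)) { (state, line) =>
      line match {
        case errorPat(date, ip)     => state.addError(date, ip)
        case loginPat(_, _, ip, id) => state.addLogin(ip, id)
        case _                      => state
      }
    }.result
}
```

Only the accumulated logins and errors are held in memory; the lines themselves are consumed one at a time from the iterator.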