concatenate lines having same title - scala

I have the following issue.
Given this input list, I want to combine the lines that have the same title by summing their integers:
val listIn= List("TitleB,Int,11,0",
"TitleB,Int,1,0",
"TitleB,Int,1,0",
"TitleB,Int,3,0",
"TitleA,STR,3,0",
"TitleC,STR,4,5")
I wrote the following function
def sumB(list: List[String]): List[String] = {
val itemPattern = raw"(.*)(\d+),(\d+)\s*".r
list.foldLeft(ListMap.empty[String, (Int,Int)].withDefaultValue((0,0))) {
case (line, stri) =>
val itemPattern(k,i,j) = stri
val (a, b) = line(k)
line.updated(k, (i.toInt + a, j.toInt + b))
}.toList.map { case (k, (i, j)) => s"$k$i,$j" }
}
Expected output would be:
List("TitleB,Int,16,0",
"TitleA,STR,3,0",
"TitleC,STR,4,5")

Since you want to preserve the order in which the titles appear in the input data, I suggest using a LinkedHashMap with foldLeft, as below:
import scala.collection.mutable
import scala.util.Try

val finalResult = listIn.foldLeft(mutable.LinkedHashMap.empty[String, (String, Int, Int)]) { (acc, line) =>
  val fields = line.split(",")
  // Missing or malformed fields fall back to "" or 0 instead of throwing.
  val title = Try(fields(0)).getOrElse("")
  val kind = Try(fields(1)).getOrElse("")
  val first = Try(fields(2).toInt).getOrElse(0)
  val second = Try(fields(3).toInt).getOrElse(0)
  // Add to the running sums; updating an existing key keeps its original position.
  val (_, sum1, sum2) = acc.getOrElse(title, (kind, 0, 0))
  acc.update(title, (kind, sum1 + first, sum2 + second))
  acc
}.toList.map { case (title, (kind, sum1, sum2)) => s"$title,$kind,$sum1,$sum2" }
finalResult should be your desired output
List("TitleB,Int,16,0", "TitleA,STR,3,0", "TitleC,STR,4,5")
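A note on the function in the question: its pattern starts with a greedy `(.*)`, so for a multi-digit number such as 11 the key group swallows the leading digit, and "TitleB,Int,11,0" ends up under a different key than "TitleB,Int,1,0". The sketch below shows the effect and one possible adjustment, which forces the key to end at a comma. The helper name `parse` and the adjusted pattern are assumptions made for illustration, not part of the original code.

```scala
import scala.util.matching.Regex

// Pattern from the question: the greedy (.*) may take digits that belong to the first number.
val greedy: Regex = raw"(.*)(\d+),(\d+)\s*".r
// Possible adjustment (an assumption, not the asker's code): force the key to end with a comma.
val anchored: Regex = raw"(.*,)(\d+),(\d+)\s*".r

// Hypothetical helper: returns (key, first, second) if the whole line matches, None otherwise.
def parse(re: Regex, line: String): Option[(String, Int, Int)] = line match {
  case re(k, a, b) => Some((k, a.toInt, b.toInt))
  case _           => None
}

val line = "TitleB,Int,11,0"
val withGreedy = parse(greedy, line)     // Some(("TitleB,Int,1", 1, 0))
val withAnchored = parse(anchored, line) // Some(("TitleB,Int,", 11, 0))
```

Returning an Option also avoids the MatchError that a bare pattern assignment throws on a line that does not match.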

Related

Improve Two sum problem using Map in scala

I am trying to solve the Two Sum problem in Scala:
val list = List(1,2,3,4,5)
val map = collection.mutable.Map.empty[Int, Int]
val sum = 9
for {
i <- 0 until list.size
} yield {
map.get(sum - list(i)) match {
case None => map += (list(i) -> i)
case Some(previousIndex) => println(s" Indexes $previousIndex $i")
}
}
Can anyone suggest an O(n) solution in Scala that does not use a mutable map?
If you are trying to solve the "Two Sum" problem, meaning you need to find two numbers in a given list whose sum equals a given value, you can go with:
val list = List(1,2,3,4,5)
val sum = 9
val set = list.toSet
val solution = list.flatMap { item =>
val rest = sum - item
val min = Math.min(item, rest)
val max = Math.max(item, rest)
if (set(rest)) Some(min, max) else None
}.toSet
println(solution)
Print result:
Set((4,5))
ScalaFiddle: https://scalafiddle.io/sf/LA6P3eh/0
UPDATE
The result is required to return indices, not values:
val list = List(1,2,3,4,5)
val sum = 9
val inputMap = list.zipWithIndex.toMap
val solution = list.zipWithIndex.flatMap { case (item, itemIndex) =>
inputMap.get(sum - item).map { restIndex =>
val minIndex = Math.min(itemIndex, restIndex)
val maxIndex = Math.max(itemIndex, restIndex)
minIndex -> maxIndex
}
}.toSet
println(solution)
Printout: Set((3,4))
ScalaFiddle: https://scalafiddle.io/sf/LA6P3eh/1
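One caveat with both snippets above: an element can be paired with itself. With sum = 8, the value 4 finds itself in the set (or its own index in the map) and is reported even though 4 occurs only once in the list. Below is a minimal sketch of the index version that skips this case, assuming a valid pair must use two different positions. The name `twoSumIndices` is made up for illustration.

```scala
def twoSumIndices(list: List[Int], sum: Int): Set[(Int, Int)] = {
  // value -> index; for duplicate values the last index wins
  val indexOf: Map[Int, Int] = list.zipWithIndex.toMap
  list.zipWithIndex.flatMap { case (item, i) =>
    indexOf.get(sum - item).collect {
      // keep the pair only if the partner sits at a different position
      case j if j != i => (math.min(i, j), math.max(i, j))
    }
  }.toSet
}

val forEight = twoSumIndices(List(1, 2, 3, 4, 5), 8) // Set((2,4)), no (3,3)
val forNine = twoSumIndices(List(1, 2, 3, 4, 5), 9)  // Set((3,4))
```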
You can try something as follows for the first result:
object Solution extends App {
def twoSums(xs: List[Int], target: Int): Option[(Int,Int)] = {
@annotation.tailrec def go(zipped: List[(Int,Int)], map: Map[Int,Int] = Map.empty): Option[(Int,Int)] = {
zipped match {
case Nil => None
case (ele, idx) :: tail =>
map.get(target - ele) match {
case Some(prevIdx) => Some((prevIdx, idx))
case None => go(tail, map + (ele -> idx))
}
}
}
go(xs.zipWithIndex)
}
val res = twoSums(List(1,2,3,4,5), 9)
println(res)
}
Or via foldLeft for all results:
object Solution extends App {
def twoSums(xs: List[Int], target: Int): List[(Int, Int)] = {
xs.zipWithIndex.foldLeft((Map.empty[Int,Int], List.empty[(Int,Int)])) {
case ((map, results), (ele, idx)) =>
map.get(target - ele) match {
case Some(prevIdx) =>(map, (prevIdx, idx) :: results)
case None => (map + (ele -> idx), results)
}
}
}._2
val res = twoSums(List(1,2,3,4,5), 9)
println(res)
}

groupby scala list of string

I am having trouble calculating the sum of elements in Scala that have the same title (my key in this case).
Currently my input can be described as:
val listInput1 =
List(
"itemA,CATA,2,4 ",
"itemA,CATA,3,1 ",
"itemB,CATB,4,5",
"itemB,CATB,4,6"
)
val listInput2 =
List(
"itemA,CATA,2,4 ",
"itemB,CATB,4,5",
"itemC,CATC,1,2"
)
The required output for lists in input should be
val listoutput1 =
List(
"itemA,CATA,5,5 ",
"itemB,CATB,8,11"
)
val listoutput2 =
List(
"itemA , CATA, 2,4 ",
"itemB,CATB,4,5",
"itemC,CATC,1,2"
)
I wrote the following function:
def sumByTitle(listInput: List[String]): List[String] =
listInput.map(_.split(",")).groupBy(_(0)).map {
case (title, features) =>
"%s,%s,%d,%d".format(
title,
features.head.apply(1),
features.map(_(2).toInt).sum,
features.map(_(3).toInt).sum)}.toList
It doesn't give me the expected result as it changes the order of lines.
How can I fix that?
The ListMap is designed to preserve the order of items inserted to the Map.
import collection.immutable.ListMap
def sumByTitle(listInput: List[String]): List[String] = {
val itemPttrn = raw"(.*,)(\d+),(\d+)\s*".r // the key must end at a comma, so multi-digit numbers stay whole
listInput.foldLeft(ListMap.empty[String, (Int,Int)].withDefaultValue((0,0))) {
case (lm, str) =>
val itemPttrn(k, a, b) = str //unsafe
val (x, y) = lm(k)
lm.updated(k, (a.toInt + x, b.toInt + y))
}.toList.map { case (k, (a, b)) => s"$k$a,$b" }
}
This is a bit unsafe in that it will throw if the input string doesn't match the regex pattern.
sumByTitle(listInput1)
//res0: List[String] = List(itemA,CATA,5,5, itemB,CATB,8,11)
sumByTitle(listInput2)
//res1: List[String] = List(itemA,CATA,2,4, itemB,CATB,4,5, itemC,CATC,1,2)
You'll note that the trailing space, if there is one, is not preserved.
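If you would rather stay close to the groupBy version from the question, one possible variation is to keep groupBy for the sums and take the output order from the input itself. This is a sketch that assumes every line has exactly four comma-separated fields with integers in the last two. The name `sumByTitleOrdered` is illustrative.

```scala
def sumByTitleOrdered(lines: List[String]): List[String] = {
  // Split each line into trimmed fields: title, category, first number, second number.
  val rows: List[Array[String]] = lines.map(_.split(",").map(_.trim))
  val byTitle: Map[String, List[Array[String]]] = rows.groupBy(_(0))
  // distinct keeps the first occurrence of each title, so the input order is preserved.
  rows.map(_(0)).distinct.map { title =>
    val group = byTitle(title)
    val category = group.head(1)
    s"$title,$category,${group.map(_(2).toInt).sum},${group.map(_(3).toInt).sum}"
  }
}

val summed = sumByTitleOrdered(List("itemA,CATA,2,4 ", "itemA,CATA,3,1 ", "itemB,CATB,4,5", "itemB,CATB,4,6"))
// summed == List("itemA,CATA,5,5", "itemB,CATB,8,11")
```

As with the ListMap version, trailing spaces are not preserved, and a malformed line still throws.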
If you are just interested in sorting you can simply return the sorted list:
val listInput1 =
List(
"itemA , CATA, 2,4 ",
"itemA , CATA, 3,1 ",
"itemB,CATB,4,5",
"itemB,CATB,4,6"
)
val listInput2 =
List(
"itemA , CATA, 2,4 ",
"itemB,CATB,4,5",
"itemC,CATC,1,2"
)
def sumByTitle(listInput: List[String]): List[String] =
listInput.map(_.split(",")).groupBy(_(0)).map {
case (title, features) =>
"%s,%s,%d,%d".format(
title,
features.head.apply(1),
features.map(_(2).trim.toInt).sum,
features.map(_(3).trim.toInt).sum)}.toList.sorted
println("LIST 1")
sumByTitle(listInput1).foreach(println)
println("LIST 2")
sumByTitle(listInput2).foreach(println)
You can find the code on Scastie for you to play around with.
As a side note, you may be interested in separating the serialization and deserialization from your business logic.
Here you can find another Scastie notebook with a relatively naive approach for a first step towards separating concerns.
def foldByTitle(listInput: List[String]): List[Item] =
listInput.map(Item.parseItem).foldLeft(List.empty[Item])(sumByTitle)
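val sumByTitle: (List[Item], Item) => List[Item] = (acc, curr) =>
  if (acc.exists(_.name == curr.name))
    // Update the matching item in place so it keeps its original position.
    acc.map(i => if (i.name == curr.name) i.copy(num1 = i.num1 + curr.num1, num2 = i.num2 + curr.num2) else i)
  else acc :+ curr // append new titles so the first-seen order is preserved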
case class Item(name: String, category: String, num1: Int, num2: Int)
object Item {
def parseItem(serializedItem: String): Item = {
val itemTokens = serializedItem.split(",").map(_.trim)
Item(itemTokens.head, itemTokens(1), itemTokens(2).toInt, itemTokens(3).toInt)
}
}
This way the initial order of the elements is kept.

scala for...yield excludes empty strings

In the code below
test("duplicatedParamGetsFirst2") {
val str = "A=B&C" //"A=B&A=C"
val res = for {
x <- str.split("&")
y <- if(x.indexOf("=") == -1) "" else x.substring(x.indexOf("=") + 1)
} yield (if (x.indexOf("=") == -1) x else x.substring(0, x.indexOf("=")), y)
res.foreach(x => println(x))
}
I expected the result (A,B)(C,) but I got just (A,B). How do I fix it?
Your goal isn't completely clear. Maybe this gets close.
"A=B&C".split("&").map(_.split("="))
// res0: Array[Array[String]] = Array(Array(A, B), Array(C))
You can use .toList, or some other collection cast, if you don't want the result in Arrays.
Leo C's solution works. Here is another snippet, generating an array of pairs, close in style to your original code:
val s = "A=B&C"
val res = for {
t <- s.split("&")
a = t.split("=")
} yield a(0) -> a.lift(1).getOrElse("")
res.foreach(println)
// (A,B)
// (C,)
I'm not sure the result data type is what you want: your for-comprehension yields an Array of Tuple2[String, Char], because y is of type Char when it is generated from a String. A simple way to generate your tuples is to apply split twice, as follows:
val str = "A=B&C"
str.split("&").
map( x => if (x contains "=") x.split("=") else Array(x, "") ).
map{ case Array(a, b) => (a, b) }
// res1: Array[(String, String)] = Array((A,B), (C,""))
If you must use for-comprehension, here's one way to do it:
val res = for {
x <- str.split("&")
} yield if (x contains "=")
x.split("=") match { case Array(a, b) => (a, b) } else
(x, "")
// res2: Array[(String, String)] = Array((A,B), (C,""))
The code should be:
val str = "A=B&C" //"A=B&A=C"
val res = for {
x <- str.split("&")
} yield
{
val y = if(x.indexOf("=") == -1) "" else x.substring(x.indexOf("=") + 1)
(if (x.indexOf("=") == -1) x else x.substring(0, x.indexOf("=")), y)
}
res.foreach(x => println(x))
As for why the original two-generator form of the for expression drops the element, I don't know how to explain it.
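For what it's worth, the difference comes down to `<-` versus `=` inside the for header. A generator (`y <- someString`) iterates over the characters of the string, and an empty string has no characters, so that element contributes nothing to the result. A definition (`y = someString`) just binds the whole string. Here is a small sketch of both, using Lists for clarity. The helpers `keyPart` and `valuePart` are made up for this illustration.

```scala
// Illustrative helpers (not from the original post): split "k=v" into key and value.
def keyPart(s: String): String = if (s.contains("=")) s.substring(0, s.indexOf("=")) else s
def valuePart(s: String): String = if (s.contains("=")) s.substring(s.indexOf("=") + 1) else ""

val params = "A=B&C".split("&").toList

// Generator: `y <-` loops over the characters of the value; "" has none, so "C" yields nothing.
val viaGenerator: List[(String, Char)] = for {
  x <- params
  y <- valuePart(x).toList
} yield (keyPart(x), y)
// viaGenerator == List(("A", 'B'))

// Definition: `y =` binds the whole string, so the empty value is kept.
val viaDefinition: List[(String, String)] = for {
  x <- params
  y = valuePart(x)
} yield (keyPart(x), y)
// viaDefinition == List(("A", "B"), ("C", ""))
```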

How to check if there's None in List[Option[_]] and return the element's name?

I have multiple Options. I want to check whether they hold a value. If an Option is None, I want to tell the user about it; otherwise, proceed.
This is what I have done:
val name:Option[String]
val email:Option[String]
val pass:Option[String]
val i = List(name,email,pass).find(x => x match{
case None => true
case _ => false
})
i match{
case Some(x) => Ok("Bad Request")
case None => {
//move forward
}
}
Above I can replace find with contains, but this is a very dirty way. How can I make it elegant and monadic?
Edit: I would also like to know what element was None.
Another way is as a for-comprehension:
val outcome = for {
nm <- name
em <- email
pwd <- pass
result = doSomething(nm, em, pwd) // where def doSomething(name: String, email: String, password: String): ResultType = ???
} yield (result)
This produces outcome as an Option: Some(result) if all three values are present, None otherwise. You can interrogate it in various ways (all the methods available to the collections classes: map, filter, foreach, etc.). E.g.:
outcome.map(result => Ok(result)).getOrElse(Ok("Bad Request"))
val ok = Seq(name, email, pass).forall(_.isDefined)
If you want to reuse the code, you can do
def allFieldValueProvided(fields: Option[_]*): Boolean = fields.forall(_.isDefined)
If you want to know all the missing values then you can find all missing values and if there is none, then you are good to go.
def findMissingValues(v: (String, Option[_])*) = v.collect {
case (name, None) => name
}
val missingValues = findMissingValues(("name1", option1), ("name2", option2), ...)
if(missingValues.isEmpty) {
Ok(...)
} else {
BadRequest("Missing values for " + missingValues.mkString(", "))
}
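The same idea can be sketched without any web framework by returning an Either: the three values when everything is present, or the labels of the missing fields otherwise. This assumes the three fields from the question. `validate` and `findMissing` are illustrative names, and `Option[Any]` is used here in place of `Option[_]`.

```scala
// Collect the labels of all fields that are None.
def findMissing(fields: (String, Option[Any])*): List[String] =
  fields.collect { case (label, None) => label }.toList

// Hypothetical wrapper: Right with the three values, or Left with the missing labels.
def validate(
    name: Option[String],
    email: Option[String],
    pass: Option[String]
): Either[List[String], (String, String, String)] =
  (name, email, pass) match {
    case (Some(n), Some(e), Some(p)) => Right((n, e, p))
    case _ => Left(findMissing("name" -> name, "email" -> email, "pass" -> pass))
  }

val missingEmail = validate(Some("Bob"), None, Some("boB")) // Left(List("email"))
val allPresent = validate(Some("a"), Some("b"), Some("c"))  // Right(("a", "b", "c"))
```

The caller can then fold over the result, building the error response from the Left side and proceeding with the Right side.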
val response = for {
n <- name
e <- email
p <- pass
} yield {
/* do something with n, e, p */
}
response getOrElse { /* bad request */ }
Or, with Scalaz:
val response = (name |@| email |@| pass) { (n, e, p) =>
/* do something with n, e, p */
}
response getOrElse { /* bad request */ }
if ((name :: email :: pass :: Nil) forall(!_.isEmpty)) {
} else {
// bad request
}
I think the most straightforward way would be this:
(name,email,pass) match {
case (Some(name), Some(email), Some(pass)) => // proceed
case _ => // Bad request
}
A version with stone knives and bear skins:
import util._
object Test extends App {
val zero: Either[List[Int], Tuple3[String,String,String]] = Right((null,null,null))
def verify(fields: List[Option[String]]) = {
(zero /: fields.zipWithIndex) { (acc, v) => v match {
case (Some(s), i) => acc match {
case Left(_) => acc
case Right(t) =>
val u = i match {
case 0 => t copy (_1 = s)
case 1 => t copy (_2 = s)
case 2 => t copy (_3 = s)
}
Right(u)
}
case (None, i) =>
val fails = acc match {
case Left(f) => f
case Right(_) => Nil
}
Left(i :: fails)
}
}
}
def consume(name: String, email: String, pass: String) = Console println s"$name/$email/$pass"
def fail(is: List[Int]) = is map List("name","email","pass") foreach (Console println "Missing: " + _)
val name:Option[String] = Some("Bob")
val email:Option[String]= None
val pass:Option[String] = Some("boB")
val res = verify(List(name,email,pass))
res.fold(fail, (consume _).tupled)
val res2 = verify(List(name, Some("bob@bob.org"),pass))
res2.fold(fail, (consume _).tupled)
}
The same thing, using reflection to generalize the tuple copy.
The downside is that you must tell it what tuple to expect back. In this form, reflection is like one of those Stone Age advances that were so magical they trended on twitter for ten thousand years.
def verify[A <: Product](fields: List[Option[String]]) = {
import scala.reflect.runtime._
import universe._
val MaxTupleArity = 22
def tuple = {
require (fields.length <= MaxTupleArity)
val n = fields.length
val tupleN = typeOf[Tuple2[_,_]].typeSymbol.owner.typeSignature member TypeName(s"Tuple$n")
val init = tupleN.typeSignature member nme.CONSTRUCTOR
val ctor = currentMirror reflectClass tupleN.asClass reflectConstructor init.asMethod
val vs = Seq.fill(n)(null.asInstanceOf[String])
ctor(vs: _*).asInstanceOf[Product]
}
def zero: Either[List[Int], Product] = Right(tuple)
def nextProduct(p: Product, i: Int, s: String) = {
val im = currentMirror reflect p
val ts = im.symbol.typeSignature
val copy = (ts member TermName("copy")).asMethod
val args = copy.paramss.flatten map { x =>
val name = TermName(s"_$i")
if (x.name == name) s
else (im reflectMethod (ts member x.name).asMethod)()
}
(im reflectMethod copy)(args: _*).asInstanceOf[Product]
}
(zero /: fields.zipWithIndex) { (acc, v) => v match {
case (Some(s), i) => acc match {
case Left(_) => acc
case Right(t) => Right(nextProduct(t, i + 1, s))
}
case (None, i) =>
val fails = acc match {
case Left(f) => f
case Right(_) => Nil
}
Left(i :: fails)
}
}.asInstanceOf[Either[List[Int], A]]
}
def consume(name: String, email: String, pass: String) = Console println s"$name/$email/$pass"
def fail(is: List[Int]) = is map List("name","email","pass") foreach (Console println "Missing: " + _)
val name:Option[String] = Some("Bob")
val email:Option[String]= None
val pass:Option[String] = Some("boB")
type T3 = Tuple3[String,String,String]
val res = verify[T3](List(name,email,pass))
res.fold(fail, (consume _).tupled)
val res2 = verify[T3](List(name, Some("bob@bob.org"),pass))
res2.fold(fail, (consume _).tupled)
I know this doesn't scale well, but would this suffice?
(name, email, pass) match {
case (None, _, _) => "name"
case (_, None, _) => "email"
case (_, _, None) => "pass"
case _ => "Nothing to see here"
}

Value assignment inside for-loop in Scala

Is there any difference between this code:
for(term <- term_array) {
val list = hashmap.get(term)
...
}
and:
for(term <- term_array; val list = hashmap.get(term)) {
...
}
Inside the loop I'm changing the hashmap with something like this
hashmap.put(term, string :: list)
While checking for the head of list it seems to be outdated somehow when using the second code snippet.
The difference is that a val in the for header (your second snippet) is a definition, which the compiler carries along via pattern matching, while a val in the loop body (your first snippet) is an ordinary value inside a function literal. See Programming in Scala, Section 23.1 For Expressions:
for {
p <- persons // a generator
n = p.name // a definition
if (n startsWith "To") // a filter
} yield n
You see the real difference when you compile sources with scalac -Xprint:typer <filename>.scala:
object X {
val x1 = for (i <- (1 to 5); x = i*2) yield x
val x2 = for (i <- (1 to 5)) yield { val x = i*2; x }
}
After code transforming by the compiler you will get something like this:
private[this] val x1: scala.collection.immutable.IndexedSeq[Int] =
scala.this.Predef.intWrapper(1).to(5).map[(Int, Int), scala.collection.immutable.IndexedSeq[(Int, Int)]](((i: Int) => {
val x: Int = i.*(2);
scala.Tuple2.apply[Int, Int](i, x)
}))(immutable.this.IndexedSeq.canBuildFrom[(Int, Int)]).map[Int, scala.collection.immutable.IndexedSeq[Int]]((
(x$1: (Int, Int)) => (x$1: (Int, Int) @unchecked) match {
case (_1: Int, _2: Int)(Int, Int)((i @ _), (x @ _)) => x
}))(immutable.this.IndexedSeq.canBuildFrom[Int]);
private[this] val x2: scala.collection.immutable.IndexedSeq[Int] =
scala.this.Predef.intWrapper(1).to(5).map[Int, scala.collection.immutable.IndexedSeq[Int]](((i: Int) => {
val x: Int = i.*(2);
x
}))(immutable.this.IndexedSeq.canBuildFrom[Int]);
This can be simplified to:
val x1 = (1 to 5).map {i =>
val x: Int = i * 2
(i, x)
}.map {
case (i, x) => x
}
val x2 = (1 to 5).map {i =>
val x = i * 2
x
}
Defining a value in the for header makes sense if you want to use it within the for expression itself, like:
for (i <- is; a = something; if (a)) {
...
}
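And the reason your list is outdated is that a definition in the header translates to a map call followed by a foreach call, roughly:
term_array.map {
term => val list = hashmap.get(term); (term, list)
} foreach {
case (term, list) => ...
}
So by the time the body (...) runs for a given term, every list has already been read from hashmap, and changes made by earlier iterations are not visible. The other example translates to: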
term_array.foreach {
term => val list= hashmap.get(term)
...
}
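To see the effect in isolation, the sketch below writes out by hand a map-then-foreach expansion, which is the shape Scala 2 gives to a definition in the header of a for loop, and compares it with the plain foreach shape. The map, the terms, and the buffer names are invented for the demonstration.

```scala
import scala.collection.mutable

val terms = List("a", "a")

// Header-definition shape, expanded by hand: every lookup happens before any body runs.
val m1 = mutable.Map("a" -> 0)
val seenInHeader = mutable.ListBuffer.empty[Int]
terms.map { t => (t, m1(t)) }.foreach { case (t, v) =>
  seenInHeader += v
  m1(t) = v + 1
}
// seenInHeader.toList == List(0, 0): the second iteration works with a stale value

// Body-val shape: each lookup happens right before its own body.
val m2 = mutable.Map("a" -> 0)
val seenInBody = mutable.ListBuffer.empty[Int]
terms.foreach { t =>
  val v = m2(t)
  seenInBody += v
  m2(t) = v + 1
}
// seenInBody.toList == List(0, 1)
```

If the body mutates the collection it reads from, putting the lookup inside the body is the safer choice.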