Append/Add JsObject into JsArray in Play Framework - scala

I am a newbie to the Play Framework. I need to append/add JsObject elements into a JsArray.
Aim (what I need):
{"s_no":1,"s_name":"one","sub_s":[{"sub_s_no":1,"sub_s_name":"one_sub","sub_s_desc":"one_sub"},{"sub_s_no":2,"sub_s_name":"two_sub","sub_s_desc":"two_sub"}]},
{"s_no":2,"s_name":"two","sub_s":[{"sub_s_no":2,"sub_s_name":"two_sub","sub_s_desc":"two_sub"},{"sub_s_no":3,"sub_s_name":"three_sub","sub_s_desc":"three_sub"}]}
What I Got
JsObject 1
{"s_no":1,"s_name":"one","sub_s":[{"sub_s_no":1,"sub_s_name":"one_sub","sub_s_desc":"one_sub"},{"sub_s_no":2,"sub_s_name":"two_sub","sub_s_desc":"two_sub"}]}
JsObject 2
{"s_no":2,"s_name":"two","sub_s":[{"sub_s_no":2,"sub_s_name":"two_sub","sub_s_desc":"two_sub"},
{"sub_s_no":3,"sub_s_name":"three_sub","sub_s_desc":"three_sub"}]}
I have got two JsObjects and will get more than two; I need to add/append all of these JsObjects into a JsArray.
I tried the .+: and append methods, which gave empty JsArray values.

The reason you are getting an empty JsArray is that JsArray is immutable, so the original JsArray will not be modified; each append returns a new JsArray. You need to assign the result of the append to a new variable for it to work the way you expect.
import play.api.libs.json._

val jsonString1 = """{"s_no":1,"sub_s":[1,2]}"""
val jsonString2 = """{"s_no":2,"sub_s":[3,4]}"""
val jsObj1 = Json.parse(jsonString1)
val jsObj2 = Json.parse(jsonString2)
val emptyArray = Json.arr()
// :+ returns a new JsArray; the result must be captured
val filledArray = emptyArray :+ jsObj1 :+ jsObj2
Json.prettyPrint(emptyArray)   // still "[ ]" -- the original is untouched
Json.prettyPrint(filledArray)
And some of the REPL output (this run used the full JsObjects from the question rather than the shortened strings above):
> filledArray: play.api.libs.json.JsArray = [{"s_no":1,"s_name":"one","sub_s":[{"sub_s_no":1,"sub_s_name":"one_sub","sub_s_desc":"one_sub"},{"sub_s_no":2,"sub_s_name":"two_sub","sub_s_desc":"two_sub"}]},{"s_no":2,"s_name":"two","sub_s":[{"sub_s_no":2,"sub_s_name":"two_sub","sub_s_desc":"two_sub"},{"sub_s_no":3,"sub_s_name":"three_sub","sub_s_desc":"three_sub"}]}]
> // pretty print of the empty array
> res1: String = [ ]
> // pretty print of the filled array
> res2: String = [ {"s_no" : 1 ...}, {"s_no" : 2 ...} ]
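Since each `:+` returns a fresh array, appending many objects naturally becomes a fold. Here is a library-free sketch of the same reassignment idea with a plain immutable `Vector`; the `Obj` case class is a hypothetical stand-in for a parsed `JsObject`:

```scala
// Hypothetical stand-in for the parsed JsObjects in the question.
case class Obj(s_no: Int, s_name: String)

val objects = List(Obj(1, "one"), Obj(2, "two"), Obj(3, "three"))

// Appending to an immutable collection returns a NEW collection each time,
// so the accumulator is re-bound on every step -- exactly like JsArray's :+ .
val filled = objects.foldLeft(Vector.empty[Obj])((acc, o) => acc :+ o)
```

The same fold shape works for any number of objects, which is what the question ("will get more than two") needs.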

Related

How to add to an Immutable map : Scala

I have a ResultSet object returned from Hive via JDBC.
I am trying to store the values from the ResultSet in a Scala immutable Map.
How can I add these values to an immutable Map while iterating over the ResultSet with a while loop?
val m: Map[String, String] = null
while (resultSet.next()) {
  val col = resultSet.getString("col_name")
  val data = resultSet.getString("data_type")
  m += (col -> data) // This gives a reassignment error: m is a val
}
I propose:
Iterator.continually(resultSet)
  .takeWhile(_.next())   // advance the cursor first...
  .map { rs =>           // ...then read the current row
    rs.getString("col_name") -> rs.getString("data_type")
  }
  .toMap
Instead of thinking "let's init an empty collection and fill it", which is IMHO the mutable way to think, this proposal instead thinks in terms of "let's declare how to build a collection with those elements in it and be done" :-) Note that the cursor must be advanced with next() before a row is read, which is why takeWhile comes before the map.
You might want to use scala.collection.Iterator[A] so that you can create an immutable Map out of your Java ResultSet:
val myMap: Map[String, String] = new Iterator[(String, String)] {
  // Note: hasNext advances the cursor, so it must be called exactly once per element.
  override def hasNext = resultSet.next()
  override def next() = {
    val col = resultSet.getString("col_name")
    val data = resultSet.getString("data_type")
    col -> data
  }
}.toMap
Otherwise you would have to use the mutable scala.collection.mutable.Map.
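To illustrate the "declare the whole collection at once" idea without a live JDBC connection, here is a sketch where a plain list of hypothetical (column, type) rows stands in for the ResultSet; the single `toMap` pass replaces any mutable accumulation:

```scala
// Hypothetical stand-in for the rows a ResultSet would yield.
val rows = List(("col_a", "string"), ("col_b", "int"), ("col_c", "double"))

// One pass, no mutable state: pair up the fields and materialize the Map.
val schema: Map[String, String] =
  rows.iterator.map { case (col, data) => col -> data }.toMap
```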

Scala : adding to Scala List

I am trying to append to a List[String] based on a condition, but the List shows up empty.
Here is the simple code:
object Mytester {
  def main(args: Array[String]): Unit = {
    val columnNames = List("t01354", "t03345", "t11858", "t1801566", "t180387685", "t015434")
    //println(columnNames)
    val prim = List[String]()
    for (i <- columnNames) {
      if (i.startsWith("t01"))
        println("Printing i : " + i)
      i :: prim :: Nil
    }
    println(prim)
  }
}
Output :
Printing i : t01354
Printing i : t015434
List()
Process finished with exit code 0
This line, i :: prim :: Nil, creates a new List, but that new List is not saved (i.e. assigned to a variable), so it is thrown away. prim is never changed, and it can't be, because it is a val.
If you want a new List of only those elements that meet a certain condition then filter the list.
val prim: List[String] = columnNames.filter(_.startsWith("t01"))
// prim: List[String] = List(t01354, t015434)
1) Why can't I add to a List?
List is immutable; you have to use a mutable list (called ListBuffer).
definition
scala> val list = scala.collection.mutable.ListBuffer[String]()
list: scala.collection.mutable.ListBuffer[String] = ListBuffer()
add elements
scala> list+="prayagupd"
res3: list.type = ListBuffer(prayagupd)
scala> list+="urayagppd"
res4: list.type = ListBuffer(prayagupd, urayagppd)
print list
scala> list
res5: scala.collection.mutable.ListBuffer[String] = ListBuffer(prayagupd, urayagppd)
2) Filtering a list in Scala
Also, in your case the best approach to solve the problem is to use List#filter; there is no need for a for loop.
scala> val columnNames = List("t01354", "t03345", "t11858", "t1801566", "t180387685", "t015434")
columnNames: List[String] = List(t01354, t03345, t11858, t1801566, t180387685, t015434)
scala> val columnsStartingWithT01 = columnNames.filter(_.startsWith("t01"))
columnsStartingWithT01: List[String] = List(t01354, t015434)
Related resources
Add element to a list In Scala
filter a List according to multiple contains
In addition to what jwvh explained: note that in Scala you would usually write this as
val prim = columnNames.filter(_.startsWith("t01"))
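For the curious, filter is equivalent to a fold that keeps only the matching elements; here is a small sketch of that equivalence with the question's data, showing the immutable "re-bind the accumulator" style rather than mutation:

```scala
val columnNames = List("t01354", "t03345", "t11858", "t1801566", "t180387685", "t015434")

// filter, the idiomatic form...
val byFilter = columnNames.filter(_.startsWith("t01"))

// ...and the same result built explicitly with foldLeft: start from an
// empty list and return a new accumulator on every step instead of mutating.
val byFold = columnNames.foldLeft(List.empty[String]) { (acc, name) =>
  if (name.startsWith("t01")) acc :+ name else acc
}
```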

Modifying List of String in scala

I have an input file that I would like to read as a Scala stream, then modify each record, and then output the file.
My input is as follows:
Name,id,phone-number
abc,1,234567
dcf,2,345334
I want to change the above input to the following:
Name,id,phone-number
testabc,test1,test234567
testdcf,test2,test345334
I am trying to read the file as a Scala stream as follows:
val inputList = Source.fromFile("/test.csv")("ISO-8859-1").getLines
After the above step I get an Iterator[String].
val newList = inputList.map { line =>
  line.split(',').map { s =>
    "test" + s
  }.mkString(",")
}.toList
but the new list is empty.
I am not sure if I can define an empty list and an empty array and then append each modified record to the list.
Any suggestions?
You might want to transform the iterator into a Stream (note that .tail here drops the header row):
val l = Source.fromFile("test.csv")
  .getLines()
  .toStream
  .tail
  .map { row =>
    row.split(',')
      .map { col => s"test$col" }
      .mkString(",")
  }
l.foreach(println)
testabc,test1,test234567
testdcf,test2,test345334
Here's a similar approach that returns a List[Array[String]]. You can use mkString, toString, or similar if you want a String returned.
scala> scala.io.Source.fromFile("data.txt")
         .getLines.drop(1)
         .map(l => l.split(",").map(x => "test" + x)).toList
res3: List[Array[String]] = List(Array(testabc, test1, test234567), Array(testdcf, test2, test345334))
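The transformation itself can be checked without touching the filesystem. In this sketch an in-memory list of the sample lines stands in for the file; unlike the answers above, it keeps the header untouched (as the desired output in the question does) and prefixes only the data rows:

```scala
// In-memory stand-in for test.csv, using the sample rows from the question.
val lines = List(
  "Name,id,phone-number",
  "abc,1,234567",
  "dcf,2,345334"
)

// Keep the header as-is; prefix every field of the remaining rows.
val transformed = lines.head :: lines.tail.map { row =>
  row.split(',').map("test" + _).mkString(",")
}
```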

Appending Data to List or any other collection Dynamically in scala [duplicate]

This question already has answers here:
Add element to a list In Scala
(4 answers)
Closed 6 years ago.
I am new to Scala.
Can we add/append data to a List or any other collection dynamically in Scala?
I mean, can we add data to a List or any collection using foreach (or any other loop)?
I am trying to do something like this:
var propertyData = sc.textFile("hdfs://ip:8050/property.conf")
var propertyList = new ListBuffer[(String,String)]()
propertyData.foreach { line =>
  var c = line.split("=")
  propertyList.append((c(0), c(1)))
}
And suppose the property.conf file contains:
"spark.shuffle.memoryFraction"="0.5"
"spark.yarn.executor.memoryOverhead"="712"
This compiles fine, but the values are not added to the ListBuffer.
I tried it using Darshan's code from his (updated) question:
val propertyData = List(""""spark.shuffle.memoryFraction"="0.5"""", """"spark.yarn.executor.memoryOverhead"="712" """)
val propertyList = new ListBuffer[(String,String)]()
propertyData.foreach { line =>
  val c = line.split("=")
  propertyList.append((c(0), c(1)))
}
println(propertyList)
It works as expected: it prints to the console:
ListBuffer(("spark.shuffle.memoryFraction","0.5"), ("spark.yarn.executor.memoryOverhead","712" ))
I didn't do it in a Spark context, although I will try that in a few minutes. So I provided the data in a list of Strings, which shouldn't make a difference. I also changed the "var" keywords to "val", since none of them needs to be a mutable variable, but of course that makes no difference either. The code works whether they are val or var.
See my comment below. (In cluster mode, RDD.foreach runs on the executors, so the original code appends to serialized copies of the ListBuffer rather than to the driver's buffer, which is why it stays empty.) But here is idiomatic Spark/Scala code which does behave exactly as you would expect:
import org.apache.spark.{SparkConf, SparkContext}

object ListTest extends App {
  val conf = new SparkConf().setAppName("listtest")
  val sc = new SparkContext(conf)
  val propertyData = sc.textFile("listproperty.conf")
  val propertyList = propertyData map { line =>
    val xs: Array[String] = line.split("""\=""")
    (xs(0), xs(1))
  }
  propertyList foreach ( println(_) )
}
Yes, that's possible using mutable collections (see this link), for example:
import scala.collection.mutable
val buffer = mutable.ListBuffer.empty[String]
// add elements
buffer += "a string"
buffer += "another string"
or in a loop:
val buffer = mutable.ListBuffer.empty[Int]
for (i <- 1 to 10) {
  buffer += i
}
You can either use a mutable collection (not functional), or return a new collection (functional and more idiomatic), as below:
scala> val a = List(1,2,3)
a: List[Int] = List(1, 2, 3)
scala> val b = a :+ 4
b: List[Int] = List(1, 2, 3, 4)
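Outside of Spark, the same parsing is a single map over the lines with no mutable buffer at all. Here is a sketch using the two sample lines from the question; split is given a limit of 2 so that only the first "=" separates key from value:

```scala
// The two sample lines from the question; the quadruple quotes keep the
// embedded double quotes as part of each string.
val propertyData = List(
  """"spark.shuffle.memoryFraction"="0.5"""",
  """"spark.yarn.executor.memoryOverhead"="712""""
)

// One pass, no mutable buffer: each line becomes a (key, value) pair.
val propertyList: List[(String, String)] = propertyData.map { line =>
  val c = line.split("=", 2)
  (c(0), c(1))
}
```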

Pass array as an UDF parameter in Spark SQL

I'm trying to transform a dataframe via a function that takes an array as a parameter. My code looks something like this:
def getCategory(categories: Array[String], input: String): String = {
  categories(input.toInt)
}
val myArray = Array("a", "b", "c")
val myCategories = udf(getCategory _)
val df = sqlContext.parquetFile("myfile.parquet")
val df1 = df.withColumn("newCategory", myCategories(lit(myArray), col("myInput")))
However, lit doesn't like arrays, and this script errors. I tried defining a new partially applied function and then the udf after that:
val newFunc = getCategory(myArray, _:String)
val myCategories = udf(newFunc)
val df1 = df.withColumn("newCategory", myCategories(col("myInput")))
This doesn't work either, as I get a NullPointerException; it appears myArray is not being recognized. Any ideas on how to pass an array as a parameter to a function used with a DataFrame?
On a separate note, any explanation as to why doing something simple like applying a function to a DataFrame is so complicated (define the function, redefine it as a UDF, etc.)?
Most likely not the prettiest solution but you can try something like this:
def getCategory(categories: Array[String]) = {
  udf((input: String) => categories(input.toInt))
}
df.withColumn("newCategory", getCategory(myArray)(col("myInput")))
You could also try an array of literals (note that an array column arrives in the UDF as a Seq, not an Array):
val getCategory = udf(
  (input: String, categories: Seq[String]) => categories(input.toInt))
df.withColumn(
  "newCategory", getCategory($"myInput", array(myArray.map(lit(_)): _*)))
On a side note using Map instead of Array is probably a better idea:
def mapCategory(categories: Map[String, String], default: String) = {
  udf((input: String) => categories.getOrElse(input, default))
}
val myMap = Map[String, String]("1" -> "a", "2" -> "b", "3" -> "c")
df.withColumn("newCategory", mapCategory(myMap, "foo")(col("myInput")))
Since Spark 1.5.0 you can also use an array function:
import org.apache.spark.sql.functions.array
val colArray = array(myArray map(lit _): _*)
myCategories(lit(colArray), col("myInput"))
See also Spark UDF with varargs
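The reason the first snippet works is plain closure capture rather than anything Spark-specific: the outer call fixes categories, and the returned function, which udf(...) would wrap, only needs the per-row input. Here is a Spark-free sketch of that shape, assuming (as in the question) that the input string is a numeric index into the array:

```scala
val myArray = Array("a", "b", "c")

// The outer call captures the array in a closure; the returned
// String => String function is what udf(...) would wrap in Spark.
def getCategory(categories: Array[String]): String => String =
  (input: String) => categories(input.toInt)

val categorize = getCategory(myArray)
```

Because the array is captured once at definition time, every subsequent call only pays for the index lookup.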