scala - how to avoid using mutable list and casting - scala

I have written the following code which works fine. However, I wanted to check if there is a way I can avoid using mutable list and casting datatype from any to string.
import scala.collection.mutable.ListBuffer
val databases = spark.catalog.listDatabases.select($"name").collect().map(_(0)).toList
var tables= new ListBuffer[String]()
databases.foreach{database =>
val t = spark.catalog.listTables(database.asInstanceOf[String]).filter($"isTemporary" === false).filter($"tableType" =!= "VIEW").select($"name").collect.map(database+"."+_(0).asInstanceOf[String]).toList
tables = tables ++ t
}
tables.foreach(println)

You can use row deconstructor to get rid of casting:
val databases: List[String] = listDatabases.select($"name").collect().map {
case Row(name: String) => name
}.to[List]
As for mutability, just use flatMap instead of foreach:
val tables = databases.flatMap { db => listTables(db).filter .... }

Related

Extend / Replicate Scala collections syntax to create your own collection?

I want to build a map however I want to discard all keys with empty values as shown below:
#tailrec
def safeFiltersMap(
map: Map[String, String],
accumulator: Map[String,String] = Map.empty): Map[String, String] = {
if(map.isEmpty) return accumulator
val curr = map.head
val (key, value) = curr
safeFiltersMap(
map.tail,
if(value.nonEmpty) accumulator + (key->value)
else accumulator
)
}
Now this is fine however I need to use it like this:
val safeMap = safeFiltersMap(Map("a"->"b","c"->"d"))
whereas I want to use it like the way we instantiate a map:
val safeMap = safeFiltersMap("a"->"b","c"->"d")
What syntax can I follow to achieve this?
The -> syntax isn't a special syntax in Scala. It's actually just a fancy way of constructing a 2-tuple. So you can write your own functions that take 2-tuples as well. You don't need to define a new Map type. You just need a function that filters the existing one.
def safeFiltersMap(args: (String, String)*): Map[String, String] =
Map(args: _*).filter {
result => {
val (_, value) = result
value.nonEmpty
}
}
Then call using
safeFiltersMap("a"->"b","c"->"d")

Anorm - generic insert

Is there a way to use Anorm like a regular ORM? I'd like to have a method that just inserts an element provided.
def insert[T](element: T)(implicit connection: Connection) = {
element.insert(connection)
}
I can definitely implement it by myself, but feels like I'm re-implementing an ORM... Old anorm version had this Magic[T] but I can't see it now
The documentation clearly states that Anorm is not an ORM (and will never be).
As indicated, to insert or update a T value, an instance of the ToStatement typeclass must be provided.
Some macros are provided to automatically materialize such instance.
import anorm.{ Macro, SQL, ToParameterList }
import anorm.NamedParameter
case class Bar(v: Int)
val bar1 = Bar(1)
// Convert all supported properties as parameters
val toParams1: ToParameterList[Bar] = Macro.toParameters[Bar]
val params1: List[NamedParameter] = toParams1(bar1)
// --> List(NamedParameter(v,ParameterValue(1)))
val names1: List[String] = params1.map(_.name)
// --> List(v)
val placeholders = names1.map { n => s"{$n}" } mkString ", "
// --> "{v}"
val generatedStmt = s"""INSERT INTO bar(${names1 mkString ", "}) VALUES ($placeholders)"""
val generatedSql1 = SQL(generatedStmt).on(params1: _*)

Scala Nested HashMaps, how to access Case Class value properties?

New to Scala, continue to struggle with Option related code. I have a HashMap built of Case Class instances that themselves contain hash maps with Case Class instance values. It is not clear to me how to access properties of the retrieved Class instances:
import collection.mutable.HashMap
case class InnerClass(name: String, age: Int)
case class OuterClass(name: String, nestedMap: HashMap[String, InnerClass])
// Load some data...hash maps are mutable
val innerMap = new HashMap[String, InnerClass]()
innerMap += ("aaa" -> InnerClass("xyz", 0))
val outerMap = new HashMap[String, OuterClass]()
outerMap += ("AAA" -> OuterClass("XYZ", innerMap))
// Try to retrieve data
val outerMapTest = outerMap.getOrElse("AAA", None)
val nestedMap = outerMapTest.nestedMap
This produces error: value nestedMap is not a member of Option[ScalaFiddle.OuterClass]
// Try to retrieve data a different way
val outerMapTest = outerMap.getOrElse("AAA", None)
val nestedMap = outerMapTest.nestedMap
This produces error: value nestedMap is not a member of Product with Serializable
Please advise on how I would go about getting access to outerMapTest.nestedMap. I'll eventually need to get values and properties out of the nestedMap HashMap as well.
Since you are using .getOrElse("someKey", None) which returns you a type Product (not the actual type as you expect to be OuterClass)
scala> val outerMapTest = outerMap.getOrElse("AAA", None)
outerMapTest: Product with Serializable = OuterClass(XYZ,Map(aaa -> InnerClass(xyz,0)))
so Product either needs to be pattern matched or casted to OuterClass
pattern match example
scala> outerMapTest match { case x : OuterClass => println(x.nestedMap); case _ => println("is not outerclass") }
Map(aaa -> InnerClass(xyz,0))
Casting example which is a terrible idea when outerMapTest is None, (pattern matching is favored over casting)
scala> outerMapTest.asInstanceOf[OuterClass].nestedMap
res30: scala.collection.mutable.HashMap[String,InnerClass] = Map(aaa -> InnerClass(xyz,0))
But better way of solving it would simply use .get which very smart and gives you Option[OuterClass],
scala> outerMap.get("AAA").map(outerClass => outerClass.nestedMap)
res27: Option[scala.collection.mutable.HashMap[String,InnerClass]] = Some(Map(aaa -> InnerClass(xyz,0)))
For key that does not exist, gives you None
scala> outerMap.get("I dont exist").map(outerClass => outerClass.nestedMap)
res28: Option[scala.collection.mutable.HashMap[String,InnerClass]] = None
Here are some steps you can take to get deep inside a nested structure like this.
outerMap.lift("AAA") // Option[OuterClass]
.map(_.nestedMap) // Option[HashMap[String,InnerClass]]
.flatMap(_.lift("aaa")) // Option[InnerClass]
.map(_.name) // Option[String]
.getOrElse("no name") // String
Notice that if either of the inner or outer maps doesn't have the specified key ("aaa" or "AAA" respectively) then the whole thing will safely result in the default string ("no name").
A HashMap will return None if a key is not found so it is unnecessary to do getOrElse to return None if the key is not found.
A simple solution to your problem would be to use get only as below
Change your first get as
val outerMapTest = outerMap.get("AAA").get
you can check the output as
println(outerMapTest.name)
println(outerMapTest.nestedMap)
And change the second get as
val nestedMap = outerMapTest.nestedMap.get("aaa").get
You can test the outputs as
println(nestedMap.name)
println(nestedMap.age)
Hope this is helpful
You want
val maybeInner = outerMap.get("AAA").flatMap(_.nestedMap.get("aaa"))
val maybeName = maybeInner.map(_.name)
Which if your feeling adventurous you can get with
val name: String = maybeName.get
But that will throw an error if its not there. If its a None
you can access the nestMap using below expression.
scala> outerMap.get("AAA").map(_.nestedMap).getOrElse(HashMap())
res5: scala.collection.mutable.HashMap[String,InnerClass] = Map(aaa -> InnerClass(xyz,0))
if "AAA" didnt exist in the outerMap Map object then the below expression would have returned an empty HashMap as indicated in the .getOrElse method argument (HashMap()).

Create Map in Scala using loop

I am trying to create a map after getting result for each items in the list. Here is what I tried so far:
val sourceList: List[(Int, Int)] = ....
val resultMap: Map[Int, Int] = for(srcItem <- sourceList) {
val result: Int = someFunction(srcItem._1)
Map(srcItem._1 -> result)
}
But I am getting type mismatch error in IntelliJ and I am definitely not writing proper syntax here. I don't think I can use yield as I don't want List of Map. What is correct way to create Map using for loop. Any suggestion?
The simplest way is to create the map out of a list of tuples:
val resultMap = sourceList.map(item => (item._1, someFunction(item._1))).toMap
Or, in the monadic way:
val listOfTuples = for {
(value, _) <- sourceList
} yield (value, someFunction(value))
val resultMap = listOfTuples.toMap
Alternatively, if you want to avoid the creation of listOfTuples you can make the transformation a lazy one by calling .view on sourceList and then call toMap:
val resultMap = sourceList.view
.map(item => (item._1, someFunction(item._1)))
.toMap
Finally, if you really want to avoid generating extra objects you can use a mutable Map instead and append the keys and values to it using += or .put

Filling a Scala immutable Map from a database table

I have a SQL database table with the following structure:
create table category_value (
category varchar(25),
property varchar(25)
);
I want to read this into a Scala Map[String, Set[String]] where each entry in the map is a set of all of the property values that are in the same category.
I would like to do it in a "functional" style with no mutable data (other than the database result set).
Following on the Clojure loop construct, here is what I have come up with:
def fillMap(statement: java.sql.Statement): Map[String, Set[String]] = {
val resultSet = statement.executeQuery("select category, property from category_value")
#tailrec
def loop(m: Map[String, Set[String]]): Map[String, Set[String]] = {
if (resultSet.next) {
val category = resultSet.getString("category")
val property = resultSet.getString("property")
loop(m + (category -> m.getOrElse(category, Set.empty)))
} else m
}
loop(Map.empty)
}
Is there a better way to do this, without using mutable data structures?
If you like, you could try something around
def fillMap(statement: java.sql.Statement): Map[String, Set[String]] = {
val resultSet = statement.executeQuery("select category, property from category_value")
Iterator.continually((resultSet, resultSet.next)).takeWhile(_._2).map(_._1).map{ res =>
val category = res.getString("category")
val property = res.getString("property")
(category, property)
}.toIterable.groupBy(_._1).mapValues(_.map(_._2).toSet)
}
Untested, because I don’t have a proper sql.Statement. And the groupBy part might need some more love to look nice.
Edit: Added the requested changes.
There are two parts to this problem.
Getting the data out of the database and into a list of rows.
I would use a Spring SimpleJdbcOperations for the database access, so that things at least appear functional, even though the ResultSet is being changed behind the scenes.
First, some a simple conversion to let us use a closure to map each row:
implicit def rowMapper[T<:AnyRef](func: (ResultSet)=>T) =
new ParameterizedRowMapper[T]{
override def mapRow(rs:ResultSet, row:Int):T = func(rs)
}
Then let's define a data structure to store the results. (You could use a tuple, but defining my own case class has advantage of being just a little bit clearer regarding the names of things.)
case class CategoryValue(category:String, property:String)
Now select from the database
val db:SimpleJdbcOperations = //get this somehow
val resultList:java.util.List[CategoryValue] =
db.query("select category, property from category_value",
{ rs:ResultSet => CategoryValue(rs.getString(1),rs.getString(2)) } )
Converting the data from a list of rows into the format that you actually want
import scala.collection.JavaConversions._
val result:Map[String,Set[String]] =
resultList.groupBy(_.category).mapValues(_.map(_.property).toSet)
(You can omit the type annotations. I've included them to make it clear what's going on.)
Builders are built for this purpose. Get one via the desired collection type companion, e.g. HashMap.newBuilder[String, Set[String]].
This solution is basically the same as my other solution, but it doesn't use Spring, and the logic for converting a ResultSet to some sort of list is simpler than Debilski's solution.
def streamFromResultSet[T](rs:ResultSet)(func: ResultSet => T):Stream[T] = {
if (rs.next())
func(rs) #:: streamFromResultSet(rs)(func)
else
rs.close()
Stream.empty
}
def fillMap(statement:java.sql.Statement):Map[String,Set[String]] = {
case class CategoryValue(category:String, property:String)
val resultSet = statement.executeQuery("""
select category, property from category_value
""")
val queryResult = streamFromResultSet(resultSet){rs =>
CategoryValue(rs.getString(1),rs.getString(2))
}
queryResult.groupBy(_.category).mapValues(_.map(_.property).toSet)
}
There is only one approach I can think of that does not include either mutable state or extensive copying*. It is actually a very basic technique I learnt in my first term studying CS. Here goes, abstracting from the database stuff:
def empty[K,V](k : K) : Option[V] = None
def add[K,V](m : K => Option[V])(k : K, v : V) : K => Option[V] = q => {
if ( k == q ) {
Some(v)
}
else {
m(q)
}
}
def build[K,V](input : TraversableOnce[(K,V)]) : K => Option[V] = {
input.foldLeft(empty[K,V]_)((m,i) => add(m)(i._1, i._2))
}
Usage example:
val map = build(List(("a",1),("b",2)))
println("a " + map("a"))
println("b " + map("b"))
println("c " + map("c"))
> a Some(1)
> b Some(2)
> c None
Of course, the resulting function does not have type Map (nor any of its benefits) and has linear lookup costs. I guess you could implement something in a similar way that mimicks simple search trees.
(*) I am talking concepts here. In reality, things like value sharing might enable e.g. mutable list constructions without memory overhead.