How to convert case class RDD to RDD[String]? - scala

I have a schema RDD. If I print that RDD, I get output like
caseclass_name(col a, col b,col c)
caseclass_name(col d,col e, col f)
.....
.....
I need to display it simply as (without the case class name in front)
col a, col b, col c
col d, col e, col f
How can I get this? Please assist.

The simplest solution is to override the toString method in your case class:
case class iclass(Id1:Int,Id2:Int,SaleDate:String,Code:String) {
  // Render the fields as a comma-separated line, without the class name.
  override def toString(): String = {
    s"$Id1,$Id2,$SaleDate,$Code"
  }
}
If you have an RDD[iclass] (say insureRDD1) and want to convert it to an RDD[String], you can then just map over it: insureRDD1.map(_.toString)
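If you would rather leave toString alone, one possible alternative relies on the fact that every case class is a Product. The sketch below uses a plain List in place of the RDD so that it runs without Spark; the Sale class, the toLine helper and the sample values are made up for illustration. The same map call should carry over to an RDD of the case class.

```scala
object CaseClassToCsv {
  // Hypothetical case class standing in for the one in the question.
  final case class Sale(id1: Int, id2: Int, saleDate: String, code: String)

  // Joins the fields of any case class (or tuple) with ", ".
  def toLine(p: Product): String = p.productIterator.mkString(", ")

  val sales: List[Sale] = List(
    Sale(1, 10, "2015-01-01", "A"),
    Sale(2, 20, "2015-01-02", "B")
  )

  // With Spark, the equivalent would be rdd.map(toLine) on an RDD[Sale].
  val lines: List[String] = sales.map(toLine)

  def main(args: Array[String]): Unit = lines.foreach(println)
}
```

This keeps the default toString available for debugging, at the cost of not controlling the order or formatting of individual fields.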

Related

List of objects of classes in Scala

I've created three classes A, B and C. Each class contains a list of elements and a method that prints them. Outside the classes I wrote a function that uses pattern matching to choose which class to print; it is given an object picked from a list of the objects of the classes. My code works and can choose which class to print, but my question is: what if the order of the objects in the list is not a, b, c but, say, c, a, b? How can someone then choose to print class A just by typing a, without knowing the order?
object printClass {
class A{
val a:List[Int] = List(1,2,3,4)
def printA(): Unit ={
println("Class A")
println(a)
}
}
class B{
val b:List[String] = List("Adam","Ali","Sara","Yara")
def printB(): Unit ={
println("Class B")
println(b)
}
}
class C{
val c:List[Char] = List('A','S','C','E')
def printC(): Unit ={
println("Class C")
println(c)
}
}
def prtClass(ch:Any): Unit ={
// The pattern variables below bind the matched value, so no extra instances are needed here.
ch match {
case a: A => a.printA()
case b: B => b.printB()
case c: C => c.printC()
case _ => print("Class not found")
}
}
def main(args: Array[String]): Unit = {
val a = new A()
val b = new B()
val c = new C()
val listOfObjects = List(a,b,c)
println("Choose the class to print (A,B,C) : ")
val choice:Int = scala.io.StdIn.readInt()
val ch = listOfObjects(choice)
prtClass(ch)
}
}
TL;DR: use a Map.
Instead of storing the objects a, b and c in a List, you could use a Map whose keys are the letters 'a', 'b', 'c' and whose values are the objects a, b, c:
val objects: Map[Char, Any] = Map('a' -> a, 'b' -> b, 'c' -> c)
Then parse the user input as a Char:
val choice = scala.io.StdIn.readChar()
The rest should now fall into place: objects are fetched by key rather than by position, and the result is passed to the prtClass function as before.
You could also define a parent class or trait for your A, B and C classes, so that the Map's value type can be restricted to those types instead of Any.
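A minimal sketch of the Map idea combined with a common parent trait might look like the following. The Printable trait, its render method and the describe helper are assumptions introduced for illustration; they are not part of the original code.

```scala
object PrintByKey {
  // Hypothetical common parent, so the Map's value type is not Any.
  sealed trait Printable {
    def name: String
    def elements: List[Any]
    def render: String = s"Class $name: ${elements.mkString(", ")}"
  }
  final class A extends Printable { val name = "A"; val elements = List(1, 2, 3, 4) }
  final class B extends Printable { val name = "B"; val elements = List("Adam", "Ali", "Sara", "Yara") }
  final class C extends Printable { val name = "C"; val elements = List('A', 'S', 'C', 'E') }

  // Insertion order is irrelevant: lookups go by key, not by position.
  val objects: Map[Char, Printable] = Map('c' -> new C, 'a' -> new A, 'b' -> new B)

  // Returns the rendered object for a letter, or a fallback message.
  def describe(choice: Char): String =
    objects.get(choice.toLower).map(_.render).getOrElse("Class not found")

  def main(args: Array[String]): Unit = {
    println("Choose the class to print (a, b, c): ")
    println(describe(scala.io.StdIn.readChar()))
  }
}
```

With a shared trait, the pattern match in prtClass is no longer needed, because every value in the Map already knows how to render itself.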

scala groupby different class

I have a list that may contain three different types, all of which extend class E: A extends E, B extends E and C extends E. I need to identify each element in the list and do some calculations accordingly. (The list may contain a few more subclasses of E in the future.)
I would prefer to use map, partition or groupBy rather than plain if statements, but I am getting confused. I am very new to Scala; can anyone share some ideas? Thank you!
val list = //some codes to get the list//
list.groupBy {
  _.getClass //so in this line, is it possible to call the calculation method accordingly?
}
trait A extends E {
  def calA = {...}
}
trait B extends E {
  def calB = {...}
}
trait C extends E {
  def calC = {...}
}
You can use pattern matching to handle the different classes:
val list = List(1, "s", "t")
list map {
case a: A => a.calA
case b: B => b.calB
case i: Int => i + 5
case s: String => s.toUpperCase
}
// -> List(6, "S", "T")
list groupBy {
case a: E => "E" // grouping A, B and C together
case i: Int => "Int"
case s: String => "String"
}
// -> Map("Int" -> List(1), "String" -> List("s", "t"))
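As a self-contained illustration of the same idea, the sketch below defines a small sealed hierarchy and applies one calculation per subtype. The case classes, the calc function and the label function are hypothetical stand-ins for the asker's A, B, C and their calA, calB, calC methods.

```scala
object GroupBySubtype {
  sealed trait E
  final case class A(x: Int) extends E
  final case class B(s: String) extends E
  final case class C(d: Double) extends E

  // One calculation per subtype (stand-ins for calA, calB, calC).
  def calc(e: E): String = e match {
    case A(x) => s"A:${x * 2}"
    case B(s) => s"B:${s.toUpperCase}"
    case C(d) => s"C:${d + 0.5}"
  }

  // A label per subtype, used as the grouping key.
  def label(e: E): String = e match {
    case _: A => "A"
    case _: B => "B"
    case _: C => "C"
  }

  val list: List[E] = List(A(1), B("s"), A(3), C(1.0))

  // Element-wise calculation, keeping the original order.
  val results: List[String] = list.map(calc)

  // Group by subtype first, then run the calculation inside each group.
  val grouped: Map[String, List[String]] =
    list.groupBy(label).map { case (k, vs) => k -> vs.map(calc) }
}
```

Because E is sealed, the compiler can warn when a new subclass is added and one of the matches has not been updated, which helps with the "more subclasses in the future" concern.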

Slick, how to map a query to an inheritance table model?

That is, I have tables A, B and C.
A is the "parent" table, and B and C are "child" tables.
What I would like to know is how to model this in Slick so that A is abstract and B and C are concrete types, and querying for a row in A yields a B or a C object.
Something like JPA's InheritanceType.TABLE_PER_CLASS.
We need to do a couple of things. First, find a way to map the hierarchy to a table. In this case I am using a column that stores the type, but you can use other tricks as well.
trait Base {
val a: Int
val b: String
}
case class ChildOne(val a: Int, val b: String, val c: String) extends Base
case class ChildTwo(val a: Int, val b: String, val d: Int) extends Base
class MyTable extends Table[Base]("SOME_TABLE") {
def a = column[Int]("a")
def b = column[String]("b")
def c = column[String]("c", O.Nullable)
def d = column[Int]("d", O.Nullable)
def e = column[String]("model_type")
//mapping based on model type column
def * = a ~ b ~ c.? ~ d.? ~ e <> ({ t => t match {
case (a, b, Some(c), _, "ChildOne") => ChildOne(a, b, c)
case (a, b, _, Some(d), "ChildTwo") => ChildTwo(a, b, d)
}}, {
case ChildOne(a, b, c) => Some((a, b, Some(c), None, "ChildOne"))
case ChildTwo(a, b, d) => Some((a, b, None, Some(d), "ChildTwo"))
})
}
Now, to determine the specific subtype, you can do the following:
Query(new MyTable).foreach {
  case ChildOne(a, b, c) => // handle ChildOne
  case ChildTwo(a, b, d) => // handle ChildTwo
}
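Independently of Slick, the two functions passed to the table projection above can be sketched in plain Scala as a pair of conversions between a flat row and the class hierarchy. The Row alias and the fromRow and toRow names are assumptions made for this illustration; unlike the mapping above, fromRow returns None for an unknown or inconsistent row instead of failing with a MatchError.

```scala
object DiscriminatorMapping {
  sealed trait Base { def a: Int; def b: String }
  final case class ChildOne(a: Int, b: String, c: String) extends Base
  final case class ChildTwo(a: Int, b: String, d: Int) extends Base

  // One flat row: shared columns, nullable subtype columns, then the type column.
  type Row = (Int, String, Option[String], Option[Int], String)

  // Row -> model, driven by the type (discriminator) column.
  def fromRow(row: Row): Option[Base] = row match {
    case (a, b, Some(c), _, "ChildOne") => Some(ChildOne(a, b, c))
    case (a, b, _, Some(d), "ChildTwo") => Some(ChildTwo(a, b, d))
    case _                              => None // unknown or inconsistent row
  }

  // Model -> row, filling the columns of the other subtype with None.
  def toRow(model: Base): Row = model match {
    case ChildOne(a, b, c) => (a, b, Some(c), None, "ChildOne")
    case ChildTwo(a, b, d) => (a, b, None, Some(d), "ChildTwo")
  }
}
```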
Slick does not support this directly. Some databases can help you with inheritance though, so you should be able to get something close to the desired effect.
Have a look at the inheritance documentation in PostgreSQL

Is for-yield-getOrElse paradigmatic Scala or is there a better way?

Basically I want to extract a bunch of Options a, b, etc. Is this the best way to do this in Scala? It looks kind of confusing to me to have the for-yield in parentheses.
(for {
a <- a
b <- b
c <- c
...
} yield {
...
}) getOrElse {
...
}
Try using map and flatMap instead. Assume you have the following class hierarchy:
case class C(x: Int)
case class B(c: Option[C])
case class A(b: Option[B])
val a = Some(A(Some(B(Some(C(42))))))
In order to extract 42 you can say:
a.flatMap(_.b).flatMap(_.c).map(_.x).getOrElse(-1)
This is roughly equivalent to:
for {
  a <- a
  b <- a.b
  c <- b.c
} yield c.x
except that the for comprehension returns Some(42) rather than 42, because it has no getOrElse at the end. In fact, a for comprehension is translated by the compiler into a sequence of map/flatMap calls.
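To compare the two styles side by side, here is a small runnable sketch. The viaFlatMap and viaFor names are made up for illustration. Binding the for comprehension to a named value first is one way to avoid wrapping the whole for-yield in parentheses before calling getOrElse.

```scala
object OptionChain {
  final case class C(x: Int)
  final case class B(c: Option[C])
  final case class A(b: Option[B])

  // Chained flatMap/map, with a default when any level is missing.
  def viaFlatMap(a: Option[A]): Int =
    a.flatMap(_.b).flatMap(_.c).map(_.x).getOrElse(-1)

  // The same logic as a for comprehension, bound to a name before getOrElse.
  def viaFor(a: Option[A]): Int = {
    val x: Option[Int] = for {
      a1 <- a
      b1 <- a1.b
      c1 <- b1.c
    } yield c1.x
    x.getOrElse(-1)
  }
}
```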

Is there such a thing as bidirectional maps in Scala?

I'd like to link two columns of unique identifiers and be able to look up a first-column value by a second-column value as well as a second-column value by a first-column value. Something like
Map(1 <-> "one", 2 <-> "two", 3 <-> "three")
Is there such a facility in Scala?
Actually I need even more: three columns, so that any element of a triplet can be looked up by another element of that triplet (no individual value ever occurs more than once in the entire map). But a two-column bidirectional map would help too.
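Guava has a BiMap that you can use along with
import scala.collection.JavaConversions._
(JavaConversions was deprecated in Scala 2.12 and removed in 2.13; on newer versions, import scala.jdk.CollectionConverters._ and call .asScala explicitly.)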
My BiMap approach:
object BiMap {
private[BiMap] trait MethodDistinctor
implicit object MethodDistinctor extends MethodDistinctor
}
case class BiMap[X, Y](map: Map[X, Y]) {
def this(tuples: (X,Y)*) = this(tuples.toMap)
private val reverseMap = map map (_.swap)
require(map.size == reverseMap.size, "no 1 to 1 relation")
def apply(x: X): Y = map(x)
def apply(y: Y)(implicit d: BiMap.MethodDistinctor): X = reverseMap(y)
val domain = map.keys
val codomain = reverseMap.keys
}
val biMap = new BiMap(1 -> "A", 2 -> "B")
println(biMap(1)) // A
println(biMap("B")) // 2
Of course one can add syntax for <-> instead of ->.
Here's a quick Scala wrapper for Guava's BiMap.
import com.google.common.{collect => guava}
import scala.collection.JavaConversions._
import scala.collection.mutable
import scala.language.implicitConversions
class MutableBiMap[A, B] private (
    // HashBiMap has no public constructor; use its static factory method.
    private val g: guava.BiMap[A, B] = guava.HashBiMap.create[A, B]()) {
  def inverse: MutableBiMap[B, A] = new MutableBiMap[B, A](g.inverse)
}
object MutableBiMap {
  def empty[A, B]: MutableBiMap[A, B] = new MutableBiMap()
  // Relies on the JavaConversions implicits to view the Guava map as a mutable.Map.
  implicit def toMap[A, B] (x: MutableBiMap[A, B]): mutable.Map[A,B] = x.g
}
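I have a really simple BiMap in Scala (strictly speaking a bidirectional multimap, since each lookup returns a Set):
import scala.reflect.ClassTag
case class BiMap[A, B](elems: (A, B)*) {
  // Groups pairs by their first element, collecting the second elements into a Set.
  def groupBy[X, Y](pairs: Seq[(X, Y)]) = pairs.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).toSet }
  val (left, right) = (groupBy(elems), groupBy(elems map {_.swap}))
  def apply(key: A) = left(key)
  // The unused ClassTag context bound only distinguishes this overload after erasure.
  def apply[C: ClassTag](key: B) = right(key)
}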
Usage:
val biMap = BiMap(1 -> "x", 2 -> "y", 3 -> "x", 1 -> "y")
assert(biMap(1) == Set("x", "y"))
assert(biMap("x") == Set(1, 3))
I don't think it exists out of the box, because the generic behavior is not easy to pin down: how should values that match several keys be handled in a clean API?
For specific cases, however, here is an exercise that might help. It would need to be improved, because no hashing is used, so getting a key or a value is O(n).
The idea is to let you write something similar to what you propose, but using a Seq instead of a Map.
With the help of implicits and traits, plus find, you can emulate what you need with a reasonably clean API (fromKey, fromValue).
The restriction is that a value is not supposed to appear in several places, at least in this implementation.
trait BiMapEntry[K, V] {
def key:K
def value:V
}
trait Sem[K] {
def k:K
def <->[V](v:V):BiMapEntry[K, V] = new BiMapEntry[K, V]() { val key = k; val value = v}
}
trait BiMap[K, V] {
def fromKey(k:K):Option[V]
def fromValue(v:V):Option[K]
}
object BiMap {
implicit def fromInt(i:Int):Sem[Int] = new Sem[Int] {
def k = i
}
implicit def fromSeq[K, V](s:Seq[BiMapEntry[K, V]]) = new BiMap[K, V] {
def fromKey(k:K):Option[V] = s.find(_.key == k).map(_.value)
def fromValue(v:V):Option[K] = s.find(_.value == v).map(_.key)
}
}
object test extends App {
import BiMap._
val a = 1 <-> "a"
val s = Seq(1 <-> "a", 2 <-> "b")
println(s.fromKey(2))
println(s.fromValue("a"))
}
Scala's default collections are immutable and store references rather than copies, so the extra memory is only what is needed to store the references. It is therefore better to use two maps, the first keyed by type A and mapped to B and the second keyed by type B and mapped to A, than to swap a map at run time. The swapping implementation has its own memory footprint, and the newly swapped hash map stays in memory until the enclosing call returns and the garbage collector runs. If the map has to be swapped frequently, you end up using as much memory as the naive two-map implementation, or more.
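The two-map idea can be sketched as follows. The BiMap class here, with its getByKey and getByValue methods, is a hypothetical illustration and not a standard library type; it assumes that keys and values are each unique and rejects input that violates this.

```scala
object TwoMapBiMap {
  // Immutable bidirectional map backed by one Map per direction.
  final class BiMap[A, B] private (forward: Map[A, B], backward: Map[B, A]) {
    def getByKey(a: A): Option[B] = forward.get(a)
    def getByValue(b: B): Option[A] = backward.get(b)
    def size: Int = forward.size
  }

  object BiMap {
    def apply[A, B](pairs: (A, B)*): BiMap[A, B] = {
      val forward = pairs.toMap
      val backward = pairs.map(_.swap).toMap
      // A duplicate key or value would silently collapse entries, so reject it.
      require(
        forward.size == pairs.size && backward.size == pairs.size,
        "keys and values must each be unique"
      )
      new BiMap(forward, backward)
    }
  }
}
```

Both lookups are hash-based, so each direction costs roughly what a single Map lookup costs, and both maps share the same key and value objects.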
One more approach you can try with a single map (it only works for getting a key from its mapped value, and it is a linear scan over the entries):
def getKeyByValue[A, B](map: Map[A, B], value: B): Option[A] = map.collectFirst { case (k, v) if v == value => k }
Code for Scala implementation of find by key:
/** Find entry with given key in table, null if not found.
 */
@deprecatedOverriding("No sensible way to override findEntry as private findEntry0 is used in multiple places internally.", "2.11.0")
protected def findEntry(key: A): Entry =
  findEntry0(key, index(elemHashCode(key)))
private[this] def findEntry0(key: A, h: Int): Entry = {
  var e = table(h).asInstanceOf[Entry]
  while (e != null && !elemEquals(e.key, key)) e = e.next
  e
}