Saving Nested Case Class With Spark Cassandra Connector

I'm trying to work out how to save nested case classes with Spark Cassandra Connector. As a simple example:
Case classes:
case class Foo(id: String, bar: Bar)
case class Bar(field1: String)
Cassandra table:
CREATE TABLE foo (id text, field1 text, PRIMARY KEY (id));
Spark code:
val foo = Foo("a", Bar("b"))
val fooRdd = sparkContext.parallelize(Seq(foo))
fooRdd.saveToCassandra(cassandraKeyspace, "foo")
Results in:
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Columns not found in Foo: [field1]
I realise I could make a new case class that flattens out Foo, but I'd rather not do this if possible. I've played around with Column Mappers but to no avail. Is there a better way?
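For reference, the inline flattening I'm trying to avoid would look roughly like this (mapping to a tuple and naming the target columns with SomeColumns):
import com.datastax.spark.connector._

val fooRdd = sparkContext.parallelize(Seq(Foo("a", Bar("b"))))

// Flatten the nested Bar into the top-level columns the table expects.
fooRdd
  .map(f => (f.id, f.bar.field1))
  .saveToCassandra(cassandraKeyspace, "foo", SomeColumns("id", "field1"))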

Related

Doobie Query for Entities with Value Class Fields

I am using Doobie to manage database persistence. I use value class fields in my entity Foo, i.e.
case class CreatedOn(value: LocalDateTime) extends AnyVal
case class Resource(value: String) extends AnyVal
case class Foo(id: Option[Int], resource: Resource, createdOn: CreatedOn)
implicit val fooRead: Read[Foo] = Read[(Option[Int], String, LocalDateTime)].map {
  case (oid, resource, createdOn) => Foo(oid, Resource(resource), CreatedOn(createdOn))
}
implicit val fooWrite: Write[Foo] = Write[(Option[Int], String, LocalDateTime)].contramap { e =>
  (e.id, e.resource.value, e.createdOn.value)
}
Yet the compiler complains about a missing Read[(Int, String, LocalDateTime)] instance.
Any suggestions on how I can fix this? Is using value classes a bad idea to begin with when it comes to entities? Thanks
Doobie is able to read query results into case classes. If you imagine case classes as tuples, and imagine that nested tuples can be flattened, then that is essentially what Doobie does when it reads things into a case class:
Foo(Some(1), Resource("test"), CreatedOn(time))
(Some(1), Tuple1("test"), Tuple1(time))
(Some(1), "test", time) // (Option[Int], String, LocalDateTime)
If the derivation fails, you can check which of these field types is missing an instance:
sql"".query[Int]
sql"".query[String]
sql"".query[LocalDateTime]
The line which fails to compile tells you which instance is missing.
In my experience it is usually the time instances. They have to be imported separately, which you probably haven't done:
import doobie.implicits.javatime._
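Putting it together, a rough sketch of what the instances plus that import look like (assuming a doobie version where the java.time instances live under doobie.implicits.javatime, as above):
import java.time.LocalDateTime

import doobie._
import doobie.implicits._
import doobie.implicits.javatime._ // Get/Put instances for java.time types

case class CreatedOn(value: LocalDateTime) extends AnyVal
case class Resource(value: String) extends AnyVal
case class Foo(id: Option[Int], resource: Resource, createdOn: CreatedOn)

object Foo {
  // Read the flat (Option[Int], String, LocalDateTime) row and wrap the value classes.
  implicit val fooRead: Read[Foo] =
    Read[(Option[Int], String, LocalDateTime)].map {
      case (id, resource, createdOn) => Foo(id, Resource(resource), CreatedOn(createdOn))
    }

  // Unwrap the value classes back into the flat tuple when writing.
  implicit val fooWrite: Write[Foo] =
    Write[(Option[Int], String, LocalDateTime)].contramap { f =>
      (f.id, f.resource.value, f.createdOn.value)
    }
}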

create cassandra table for scala nested case class

I am storing Scala case class data in a Cassandra table, and for that I need to define a user-defined type (UDT). I can write the CQL, but I do not know how to map the case class to it.
I have tried the com.datastax.driver.mapping.annotations.UDT annotation, but it does not work for me; I think I'm completely off track.
I have also tried the Session class from com.datastax.driver.core.
My conclusion is that I have no idea how to do this and have just been using trial and error.
case class Properties(name: String,
                      label: String,
                      description: String,
                      groupName: String,
                      fieldDataType: String,
                      options: Seq[OptionalData])

object Properties {
  implicit val format: Format[Properties] = Json.format[Properties]
}

case class OptionalData(label: String, name: String)

object OptionalData {
  implicit val format: Format[OptionalData] = Json.format[OptionalData]
}
and my queries are:
val optionalData: String =
  """
    |CREATE TYPE IF NOT EXISTS optionaldata(
    |  label text,
    |  name text
    |);
  """.stripMargin
val createPropertiesTable: String =
  """
    |CREATE TABLE IF NOT EXISTS prop(
    |  name text PRIMARY KEY,
    |  label text,
    |  description text,
    |  groupname text,
    |  fielddatatype text,
    |  options list<frozen<optionaldata>>
    |);
  """.stripMargin
java.util.concurrent.ExecutionException: com.datastax.driver.core.exceptions.InvalidQueryException: Unknown type leadpropdb3.optionaldata
at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:513)
at akka.persistence.cassandra.package$ListenableFutureConverter$$anon$2.$anonfun$run$2(package.scala:25)
at scala.util.Try$.apply(Try.scala:213)
at akka.persistence.cassandra.package$ListenableFutureConverter$$anon$2.run(package.scala:25)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Unknown type leadpropdb3.optionaldata
From the error message it's clear that the type wasn't created - you need to create it before creating the table. Be very careful when executing CQL statements from your code: you need to wait until the schema is in agreement before you execute the next statement. Here is an example of Java code that does this - it's easy to convert it into Scala.
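A rough Scala equivalent of that pattern, using the Java driver 3.x API (the contact point here is just a placeholder; the keyspace name is taken from the error above):
import com.datastax.driver.core.{Cluster, Session}

// Execute a DDL statement and block until all nodes agree on the new schema version.
def executeAndWait(session: Session, cql: String): Unit = {
  session.execute(cql)
  while (!session.getCluster.getMetadata.checkSchemaAgreement())
    Thread.sleep(200)
}

val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
val session = cluster.connect("leadpropdb3")

// Create the UDT first, then the table that references it.
executeAndWait(session, optionalData)
executeAndWait(session, createPropertiesTable)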
When you're using Object Mapper with Scala, you need to obey some rules (I hope that my blog post on that topic will be published soon):
You need to use Java types - List instead of Seq, etc., or use extra codecs for Scala;
Case classes should have a no-argument constructor.
Otherwise it's possible to use the object mapper with Scala, like this:
@UDT(name = "scala_udt")
case class UdtCaseClass(id: Integer, @(Field @field)(name = "t") text: String) {
  def this() {
    this(0, "")
  }
}

@Table(name = "scala_test_udt")
case class TableObjectCaseClassWithUDT(@(PartitionKey @field) id: Integer,
                                       udt: UdtCaseClass) {
  def this() {
    this(0, UdtCaseClass(0, ""))
  }
}
// ...
val mapperForUdtCaseClass = manager.mapper(classOf[TableObjectCaseClassWithUDT])
val objectCaseClassWithUDT = mapperForUdtCaseClass.get(new Integer(1))
println("ObjWithUdt(1)='" + objectCaseClassWithUDT + "'")
More examples are available in my repo.

scala-cass generic read from cassandra table as case class

I am attempting to use scala-cass in order to read from cassandra and convert the resultset to a case class using resultSet.as[CaseClass]. This works great when running the following.
import com.weather.scalacass.syntax._
case class TestTable(id: String, data1: Int, data2: Long)
val resultSet = session.execute(s"select * from test.testTable limit 10")
resultSet.one.as[TestTable]
Now I am attempting to make this more generic and I am unable to find the proper type constraint for the generic class.
import com.weather.scalacass.syntax._
case class TestTable(id: String, data1: Int, data2: Long)
abstract class GenericReader[T] {
  val table: String
  val keyspace: String

  def getRows(session: Session): T = {
    val resultSet = session.execute(s"select * from $keyspace.$table limit 10")
    resultSet.one.as[T]
  }
}
I implement this class with the desired case class and attempt to call getRows on the resulting object.
object TestTable extends GenericReader[TestTable] {
  val keyspace = "test"
  val table = "TestTable"
}
TestTable.getRows(session)
This fails with could not find implicit value for parameter ccd: com.weather.scalacass.CCCassFormatDecoder[T].
I am trying to add a type constraint to GenericReader in order to ensure the implicit conversion will work. However, I am unable to find the proper type. I am attempting to read through scala-cass in order to find the proper constraint but I have had no luck so far.
I would also be happy to use any other library that can achieve this.
Looks like as[T] requires an implicit value that you don't have in scope, so you'll need to require that implicit parameter in the getRows method as well.
def getRows(session: Session)(implicit cfd: CCCassFormatDecoder[T]): T
You could express this as a type constraint (what you were looking for in the original question) using context bounds:
abstract class GenericReader[T:CCCassFormatDecoder]
Rather than try to bound your generic T type, it might be easier to just pass through the missing implicit parameter:
abstract class GenericReader[T](implicit ccd: CCCassFormatDecoder[T]) {
  val table: String
  val keyspace: String

  def getRows(session: Session): T = {
    val resultSet = session.execute(s"select * from $keyspace.$table limit 10")
    resultSet.one.as[T]
  }
}
Finding a concrete value for that implicit can then be deferred to the point where you narrow T to a specific class (like object TestTable extends GenericReader[TestTable]).
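For completeness, a minimal sketch of the context-bound variant, using the same imports as the question:
import com.datastax.driver.core.Session
import com.weather.scalacass.CCCassFormatDecoder
import com.weather.scalacass.syntax._

// The context bound means every GenericReader[T] must have a CCCassFormatDecoder[T] in scope.
abstract class GenericReader[T: CCCassFormatDecoder] {
  val table: String
  val keyspace: String

  def getRows(session: Session): T = {
    val resultSet = session.execute(s"select * from $keyspace.$table limit 10")
    resultSet.one.as[T]
  }
}

case class TestTable(id: String, data1: Int, data2: Long)

// The decoder is resolved here, where the concrete type is known.
object TestTable extends GenericReader[TestTable] {
  val keyspace = "test"
  val table = "TestTable"
}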

Using Couchbase as cache over Slick

I'm trying to use Couchbase as a cache layer for a relational database that is accessed using Slick. The skeleton of my code that's relevant to the question is as follows:
class RdbTable[T <: Table[_]](implicit val bucket: CouchbaseBucket) {
  type ElementType = T#TableElementType

  private val table = TableQuery[T].baseTableRow

  private def cacheAll(implicit session: Session) =
    TableQuery[T].list foreach (elem => cache(elem))

  private def cache(elem: ElementType) =
    table.primaryKeys foreach (pk => bucket.set[ElementType](key(pk, elem), elem))

  private def key(pk: PrimaryKey, elem: ElementType) = ???

  .......
}
As you can see, I want to cache each element by all of its primary keys. For this purpose, I need to obtain the value of that key for the given element. But I don't see an obvious way to compute the value of a primary key (the column value, if single-column key; the tuple value, if multi-column).
Any suggestions on what to do? Note that the code MUST NOT know what the actual tables and their columns are. It must be completely general.
We're doing something similar, using Redis as the cache. Each of our records only has one primary key, but in some cases we need to include additional data with the cache key to avoid ambiguity (for example, we have a ManyToMany record that represents an association between two records; when we return a ManyToMany record we'll embed one (but not both) of the associated records, and so in the cache key we need to include the type of the associated record that we're returning).
trait Record {
  val cacheKey: CacheKey
}

trait ManyToManyRecord extends Record {
  override val cacheKey: ManyToManyCacheKey
}

class CacheKey(recordType: String, key: Int) {
  def getKey: String = recordType + ":" + key.toString
}

class ManyToManyCacheKey(recordType: String, key: Int, assocType: String)
    extends CacheKey(recordType, key) {
  override def getKey: String = recordType + ":" + key.toString + ":" + assocType
}
All of our tables use an integer primary key called "id", so it's easy for us to figure out what the value of "key" is. If you're working with a more complicated schema and don't want to manually write out the "def key: String" (or whatever) definitions for all of your record / table types, then you could try using Slick code generation to automatically generate record / table classes / objects with "def key" created directly from the schema. However, the learning curve for Slick code generation (or any other code generation tool) is steep, so if this is your only use for it then you'd probably be better off generating "def key" by hand. (We generate somewhere between 20-30% of our code using the code generation tool, so the initial investment in learning how to use the tool has paid off)
Slick doesn't come with a built-in primary key extractor for entities. What you can do is use either interfaces, type classes or reflection. E.g. variants of the following:
Either make your entities implement a trait
trait HasPrimaryKey {
  def primaryKey: Any
}

class RdbTable[T <: Table[_ <: HasPrimaryKey]](implicit val bucket: CouchbaseBucket) {
  ...
  private def key(elem: ElementType) = elem.primaryKey
}

// and for each entity:
case class Person( ... ) extends HasPrimaryKey {
  def primaryKey = ...
}
or a type class
trait KeyTypeClass[E, T <: Table[E]] {
  def key(e: E): Any
}

class RdbTable[T <: Table[_]](implicit val bucket: CouchbaseBucket,
                              keyTC: KeyTypeClass[T#TableElementType, T]) {
  ...
  private def key(elem: ElementType) = keyTC.key(elem)
}

// and for each entity:
implicit val personKey = new KeyTypeClass[Person, PersonTable] {
  def key(p: Person) = ...
}
or using reflection to iterate over the primary keys and pull the values out of corresponding fields of the entity.
Code generation as mentioned by Zim-Zam can help with the repetitive elements.

Scala Slick and complex types in my database

I am very new to both Scala and Slick, and while learning them I am writing a small application that works with a simple database.
Most of my previous experience comes from .NET and Entity Framework, so I was wondering whether Slick allows something like Entity Framework's ComplexType attribute.
Basically, some of my tables contain 1-1 relationships, and for some of them I would prefer to just create an object and use it as a complex type. Obviously in the database this is just extra columns on the table, but I was wondering if Slick could map those columns to an object in my Table class. See the example below.
I will use a Blog entry example.
My main class that extends Table is BlogEntry, and it contains the text of the entry. Then say I wanted, inside that class, another class called EntryDetails that contains the time the entry was posted and the time it was last updated.
In the database all those fields will be in the same table, but when read in it will be one object containing the other object. Is this possible with Slick?
I think that this solves your problem:
trait Mapping {
  // Need to change JdbcDriver to the driver that you will use in your code (MySQL, Postgres, etc.)
  import scala.slick.driver.JdbcDriver.profile.simple._

  import java.sql.Timestamp
  import org.joda.time.DateTime

  // Models
  case class EntryDetails(createDate: Option[DateTime] = None, updateDate: Option[DateTime] = None)
  case class Entry(id: Int, text: String, details: EntryDetails)

  // Implicit Joda mapper for DateTime columns
  implicit def timestamp2dateTime = MappedColumnType.base[DateTime, Timestamp](
    dateTime => new Timestamp(dateTime.getMillis),
    date => new DateTime(date))

  // Table mapping
  class Entries(tag: Tag) extends Table[Entry](tag, "entry") {
    def entryDetails = (createDate, updateDate) <> (EntryDetails.tupled, EntryDetails.unapply)
    def * = (id, text, entryDetails) <> (Entry.tupled, Entry.unapply)

    val id: Column[Int] = column[Int]("id")
    val text: Column[String] = column[String]("text")
    val createDate: Column[Option[DateTime]] = column[Option[DateTime]]("createDate")
    val updateDate: Column[Option[DateTime]] = column[Option[DateTime]]("updateDate")
  }

  // Table query, used in Slick 2.0 for querying a table
  lazy val EntryTableQuery = TableQuery[Entries]
}
I included everything in a Mapping trait to keep the code for the answer self-contained.
As far as I understood, you want to map a table to two objects, one inside the other. This can be achieved by creating another mapping method, here called entryDetails, that maps the relevant columns to the EntryDetails model object. You can then use this mapping method inside your row mapping, the * method, where entryDetails appears as just another element of the projection.
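If it helps, a rough usage sketch in Slick 2.0 style (this assumes the driver import inside Mapping has been switched to H2, as the comment in the trait suggests, and that joda-time is on the classpath):
import org.joda.time.DateTime
import scala.slick.driver.H2Driver.simple._

object Example extends Mapping {
  val db = Database.forURL("jdbc:h2:mem:test", driver = "org.h2.Driver")

  db.withSession { implicit session =>
    EntryTableQuery.ddl.create

    // The nested EntryDetails is flattened into the createDate/updateDate columns on insert...
    EntryTableQuery += Entry(1, "first post", EntryDetails(Some(DateTime.now()), None))

    // ...and rebuilt into Entry(id, text, EntryDetails(...)) when read back.
    val entries: List[Entry] = EntryTableQuery.list
    entries.foreach(println)
  }
}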