Non-relational database in statically typed languages (rethinkdb, Scala) - scala

I'm still pretty new to Scala and have hit a typing roadblock.
NoSQL databases such as MongoDB and RethinkDB do not enforce any schema for their tables and manage data in JSON format. I've been struggling to get the Java API for RethinkDB to work from Scala, and there seems to be surprisingly little information on how to actually use the results returned from the database.
Assuming a simple document schema such as this:
{
"name": "melvin",
"age": 42,
"tags": ["solution"]
}
I fail to see how to actually use this data in Scala. After running a query, for example r.table("test").run(connection), I receive an object from which I can iterate AnyRef objects. In the Python world, each of these would most likely be a simple dict. How do I convey the structure of this data to Scala, so I can use it in code (e.g., query fields of the returned documents)?

From a quick scan of the docs and code, the Java Rethink client uses Jackson to handle deserialization of the JSON received from the DB into JVM objects. Since by definition every JSON object received is going to be deserializable into a JSON AST (Abstract Syntax Tree: a representation in plain Scala objects of the structure of a JSON document), you could implement a custom Jackson ObjectMapper which, instead of doing the usual Jackson magic with reflection, always deserializes into the JSON AST.
For example, Play JSON defers the actual serialization/deserialization to/from JSON to Jackson: it installs a module into a vanilla ObjectMapper which specially takes care of instances of JsValue, which is the root type of Play JSON's AST. Then something like this should work:
import com.fasterxml.jackson.databind.ObjectMapper
import play.api.libs.json.JsonParserSettings
import play.api.libs.json.jackson.PlayJsonModule
// Use Play JSON's ObjectMapper... best to do this before connecting
RethinkDB.setResultMapper(new ObjectMapper().registerModule(new PlayJsonModule(JsonParserSettings())))
run(connection) returns a Result[AnyRef] in Scala notation. There's an alternative version, run(connection, typeRef), where the second argument specifies a result type; this is passed to the ObjectMapper to ensure that every document will either fail to deserialize or be an instance of that result type:
import play.api.libs.json.JsValue
val result = r.table("table").run(connection, classOf[JsValue]) : Result[JsValue]
You can then get the next element from the result as a JsValue and use the usual Play JSON machinery to convert the JsValue into your domain type:
import play.api.libs.json.{Json, OFormat}
case class MyDocument(name: String, age: Int, tags: Seq[String])
object MyDocument {
  // An explicit type on the implicit avoids inference problems
  implicit val jsonFormat: OFormat[MyDocument] = Json.format[MyDocument]
}
// result is a Result[JsValue]; the implicit format is picked up from MyDocument's companion object
// JsResult.asOpt takes no type parameter
val myDoc: Option[MyDocument] = Json.fromJson[MyDocument](result.next()).asOpt
Enrichment (extension) methods could be layered on top of the Java client to make much of this machinery more transparent from Scala.
You could do similar things with the other Scala JSON ASTs (e.g. Circe, json4s), but might have to implement functionality similar to what Play does with the ObjectMapper yourself.
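If pulling in a JSON library is not an option, the untyped results can also be pattern-matched by hand. The sketch below assumes (this is not confirmed by the question) that each document arrives as a java.util.Map whose values are String, java.lang.Number, or java.util.List instances; UntypedDocs and toDocument are hypothetical names used only for illustration.

```scala
import java.util.{List => JList, Map => JMap}

final case class MyDocument(name: String, age: Int, tags: Seq[String])

object UntypedDocs {
  // Assumption: a document is a java.util.Map with String keys.
  // Anything else, or a missing/mistyped field, yields None.
  def toDocument(raw: AnyRef): Option[MyDocument] = raw match {
    case m: JMap[_, _] =>
      val doc = m.asInstanceOf[JMap[String, AnyRef]]
      for {
        name <- Option(doc.get("name")).collect { case s: String => s }
        age  <- Option(doc.get("age")).collect { case n: Number => n.intValue() }
        tags <- Option(doc.get("tags")).collect { case l: JList[_] => l }
      } yield MyDocument(name, age, tags.toArray().toSeq.collect { case s: String => s })
    case _ => None
  }
}
```

This keeps the unsafe casts in one place; the rest of the code only ever sees Option[MyDocument].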

Related

Scala function TypeTag: T use type T in function

I need to parse several JSON fields, and I'm using Play JSON to do it. As parsing may fail, I need to throw a custom exception for each field.
To read a field, I use this:
val fieldData = parseField[String](json \ fieldName, "fieldName")
My parseField function:
def parseField[T](result: JsLookupResult, fieldName: String): T = {
result.asOpt[T].getOrElse(throw new IllegalArgumentException(s"""Can't access $fieldName."""))
}
However, I get an error that reads:
Error:(17, 17) No Json deserializer found for type T. Try to implement
an implicit Reads or Format for this type.
result.asOpt[T].getOrElse(throw new IllegalArgumentException(s"""Can't access $fieldName."""))
Is there a way to tell the asOpt[] to use the type in T?
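The compiler error itself goes away once parseField asks for the implicit deserializer with a context bound, i.e. def parseField[T: Reads](...), because asOpt[T] needs a Reads[T] in scope. That said, I strongly suggest that you do not throw exceptions. The Play JSON API has both JsSuccess and JsError types that will help you encode parsing errors.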
As per the documentation
To convert a Scala object to and from JSON, we use Json.toJson[T: Writes] and Json.fromJson[T: Reads] respectively. Play JSON provides the Reads and Writes typeclasses to define how to read or write specific types. You can get these either by using Play's automatic JSON macros, or by manually defining them. You can also read JSON from a JsValue using validate, as and asOpt methods. Generally it's preferable to use validate since it returns a JsResult which may contain an error if the JSON is malformed.
See https://github.com/playframework/play-json#reading-and-writing-objects
There is also a good example on the Play Discourse forum on how the API manifests in practice.
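The mechanism behind that compiler error can be shown without Play on the classpath. The following is a library-free sketch: FieldReads is a hypothetical stand-in for Play's Reads, and a Map[String, String] stands in for the JSON object. The point is only that the context bound makes the caller supply the decoder, and that an Either can carry the failure instead of an exception.

```scala
import scala.util.Try

object ContextBoundSketch {
  // Hypothetical stand-in for Play JSON's Reads[T]: decodes one raw value.
  trait FieldReads[T] {
    def reads(raw: String): Option[T]
  }

  object FieldReads {
    implicit val stringReads: FieldReads[String] = new FieldReads[String] {
      def reads(raw: String): Option[String] = Some(raw)
    }
    implicit val intReads: FieldReads[Int] = new FieldReads[Int] {
      def reads(raw: String): Option[Int] = Try(raw.trim.toInt).toOption
    }
  }

  // The context bound `T: FieldReads` requests the instance from the caller's
  // scope, so the generic body can decode without knowing T in advance.
  def parseField[T: FieldReads](fields: Map[String, String], fieldName: String): Either[String, T] =
    fields.get(fieldName)
      .flatMap(implicitly[FieldReads[T]].reads)
      .toRight(s"Can't access $fieldName.")
}
```

Without the context bound, the call to implicitly (or asOpt in the Play version) has no instance to resolve for an abstract T, which is exactly what the "No Json deserializer found for type T" message reports.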

Converting a nested List using Gson.toJson

I'm using Scala, but this applies to Java as well - it appears Gson is having issues converting a List nested within a case class/within another object:
case class Candy(name:String, price:Double)
case class PersonCandy(name:String, age:Int, candyList:List[Candy])
val candies = List(Candy("M&M's", 1.99), Candy("Snickers", 1.59), Candy("Butterfinger", 2.33))
val personC = PersonCandy("Mark", 19, candies)
val pollAsJson = new Gson().toJson(personC)
The REPL shows the resulting pollAsJson as follows:
pollAsJson: String = {"name":"Mark","age":19,"candyList":{}}
My workaround could be to convert the nested candyList, convert personC, and then, since both are just Strings, manually hack them together, but this is less than ideal. From reading blogs and usage examples, it seems that Gson can extract and convert nested inner classes just fine, but it seems to have issues when the nested class is a Collection. Any ideas?
The problem is not related to case classes or nesting, but rather to Scala collection types, which are not properly supported in Gson.
To prove that, let's change PersonCandy to
case class PersonCandy(name:String, age:Int, candyList: java.util.List[Candy])
And convert candies to a Java List:
import scala.collection.JavaConverters._
val candies = List(/* same items */).asJava
And the JSON is OK:
{"name":"Mark","age":19,"candyList":[{"name":"M\u0026M\u0027s","price":1.99},{"name":"Snickers","price":1.59},{"name":"Butterfinger","price":2.33}]}
And if you try to produce a JSON for the original candies list, it will produce:
{"head":{"name":"M\u0026M\u0027s","price":1.99},"tl":{}}
Which reflects the head :: tail structure of the list.
Gson is a lib primarily used with Java code.
For Scala, there is a variety of choices. I'd suggest to try some from the answers to this question.
You can use lift-json module to render json strings from case classes. It is a nice easy to use library with a really good DSL to create JSON strings without the need of case classes. You can find more about it at Lift-Json Github page
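The underlying reason for the Gson behaviour above can be checked directly: a Scala List is not a java.util.Collection, so it does not get Gson's collection handling, while the converted wrapper does. A small sketch, assuming Scala 2.13 or later (scala.jdk.CollectionConverters replaces the older JavaConverters import); GsonFriendly and toJavaList are hypothetical names.

```scala
import scala.jdk.CollectionConverters._

object GsonFriendly {
  final case class Candy(name: String, price: Double)

  // Expose a java.util.List at the boundary where a Java library takes over.
  def toJavaList(candies: List[Candy]): java.util.List[Candy] = candies.asJava
}
```

The asJava call wraps the list rather than copying it, so converting just before handing the object to Gson is cheap.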

Base64 encoding for akka ByteString

I've been searching for a way to replace Scala's Array[Byte] with some other immutable collection class. I'd like to refactor those byte arrays into the more elegant Akka ByteString, but I've run into a problem with encoding. As far as I know (I'm a Scala newbie), encoding ByteStrings and Array[Byte]s with Jackson serialization produces different Base64 results for the two datatypes, even with the same values. So my question is: is there a nice and clean way to refactor Arrays to immutable collections while keeping the same Base64 serialized values as the old Array had?
I'd use circe or argonaut instead of Jackson for the JSON serialization. They easily allow you to provide a custom encoder for different types.
// Note: new String(_) and utf8String decode the bytes as text; for Base64 output,
// map with java.util.Base64.getEncoder.encodeToString instead.
implicit val byteArrayEncoder: Encoder[Array[Byte]] =
  Encoder.encodeString.contramap[Array[Byte]](new String(_))
implicit val byteStringEncoder: Encoder[ByteString] =
  Encoder.encodeString.contramap[ByteString](_.utf8String)
Then you can use it like this:
import io.circe._, io.circe.generic.auto._, io.circe.parser._, io.circe.syntax._
case class Whatever(bytes: Array[Byte], bstring: ByteString)
Whatever(???).asJson.noSpaces
For more, see
https://circe.github.io/circe/
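To keep the serialized value identical to what Jackson emits for a byte[] (a standard Base64 string), the bytes can be encoded explicitly, whatever collection holds them. The sketch below uses only java.util.Base64; IndexedSeq[Byte] stands in for akka.util.ByteString (itself an IndexedSeq[Byte]) so the example stays dependency-free, and Base64Sketch is a hypothetical name.

```scala
import java.util.Base64

object Base64Sketch {
  // Encode a plain byte array.
  def encodeArray(bytes: Array[Byte]): String =
    Base64.getEncoder.encodeToString(bytes)

  // Encode any indexed byte sequence by copying it into an array first.
  // Assumption: a ByteString would be passed here in the real code.
  def encodeImmutable(bytes: IndexedSeq[Byte]): String =
    Base64.getEncoder.encodeToString(bytes.toArray)
}
```

Either function could then back a custom encoder (for example via Encoder.encodeString.contramap in circe), so both representations serialize to the same string.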

Convert a Seq[String] to a case class in a typesafe way

I have written a parser which transforms a String to a Seq[String] following some rules. This will be used in a library.
I am trying to transform this Seq[String] to a case class. The case class would be provided by the user (so there is no way to guess what it will be).
I thought of using the shapeless library because it seems to implement the right features and seems mature, but I have no idea how to proceed.
I have found this question with an interesting answer, but I can't see how to adapt it to my needs. Indeed, in the answer there is only one type to parse (String), and the library iterates inside the String itself. It probably requires a deep change in the way things are done, and I have no clue how.
Moreover, if possible, I want to make this process as easy as possible for the user of my library. So, if possible, unlike the answer in the link above, the HList type would be guessed from the case class itself (however, according to my searches, it seems the compiler needs this information).
I am a bit new to the type system and all these beautiful things; if anyone can give me advice on how to do this, I would be very happy!
Kind Regards
--- EDIT ---
As ziggystar requested, here is a possible version of the needed signatures:
//Let's say we are just parsing a CSV.
#onUserSide
case class UserClass(i:Int, j:Int, s:String)
val list = Seq("1,2,toto", "3,4,titi")
// User transforms his case class to a function with something like:
val f = UserClass.curried
// The function created in 1/ is injected in the parser
val parser = new Parser(f)
// The Strings to convert to case classes are provided as an argument to the parse() method.
val finalResult:Seq[UserClass] = parser.parse(list)
// The transformation is done in several steps inside the parse() method:
// 1/ first we have: val list = Seq("1,2,toto", "3,4,titi")
// 2/ then we have a call to internalParserImplementedSomewhereElse(list)
// val parseResult is now equal to Seq(Seq("1", "2", "toto"), Seq("3","4", "titi"))
// 3/ finally Shapeless does its magic trick and we have Seq(UserClass(1,2,"toto"), UserClass(3,4,"titi"))
#insideTheLibrary
class Parser[A](function:A) {
//The internal parser takes each String provided through argument of the method and transforms each String to a Seq[String]. So the Seq[String] provided is changed to Seq[Seq[String]].
private def internalParserImplementedSomewhereElse(l:Seq[String]): Seq[Seq[String]] = {
...
}
/*
* Class A and B are both related to the case class provided by the user:
* - A is the type of the case class as a function,
* - B is the type of the original case class (can be guessed from type A).
*/
private def convert2CaseClass[B](list:Seq[String]): B = {
//do something with Shapeless
//I don't know what to put inside ???
}
def parse(l:Seq[String]) = {
val parseResult:Seq[Seq[String]] = internalParserImplementedSomewhereElse(l)
val finalResult = parseResult.map(convert2CaseClass)
finalResult // it is a Seq[CaseClassProvidedByUser]
}
}
Inside the library, some implicits would be available to convert each String to the correct type as guessed by Shapeless (similar to the answer proposed in the link above), such as string.toInt, string.toDouble, and so on.
Maybe there are other ways to design it. This is just what I have in mind after playing with Shapeless for a few hours.
This uses a very simple library called product-collections
import com.github.marklister.collections.io._
case class UserClass(i:Int, j:Int, s:String)
val csv = Seq("1,2,toto", "3,4,titi").mkString("\n")
csv: String =
1,2,toto
3,4,titi
CsvParser(UserClass).parse(new java.io.StringReader(csv))
res28: Seq[UserClass] = List(UserClass(1,2,toto), UserClass(3,4,titi))
And to serialize the other way:
scala> res28.csvIterator.toList
res30: List[String] = List(1,2,"toto", 3,4,"titi")
product-collections is orientated towards csv and a java.io.Reader, hence the shims above.
This answer will not tell you how to do exactly what you want, but it will solve your problem. I think you're overcomplicating things.
What is it you want to do? It appears to me that you're simply looking for a way to serialize and deserialize your case classes - i.e. convert your Scala objects to a generic string format and the generic string format back to Scala objects. Your serialization step presently is something you seem to already have defined, and you're asking about how to do the deserialization.
There are a few serialization/deserialization options available for Scala. You do not have to hack away with Shapeless or Scalaz to do it yourself. Try to take a look at these solutions:
Java serialization/deserialization. The regular serialization/deserialization facilities provided by the Java environment. Requires explicit casting and gives you no control over the serialization format, but it's built in and doesn't require much work to implement.
JSON serialization: there are many libraries that provide JSON generation and parsing for Scala. Take a look at play-json, spray-json and Argonaut, for example.
The Scala Pickling library is a more general library for serialization/deserialization. Out of the box it comes with some binary and some JSON format, but you can create your own formats.
Out of these solutions, at least play-json and Scala Pickling use macros to generate serializers and deserializers for you at compile time. That means that they should both be typesafe and performant.
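For the narrow case in the question (a fixed-width row of strings into a user-supplied case class), a hand-written sketch without Shapeless is also possible. Everything here is hypothetical and only illustrates the shape of the idea: CellDecoder plays the role of the implicit String conversions the question mentions, and parser3 handles exactly three fields.

```scala
import scala.util.Try

object RowParserSketch {
  // Hypothetical typeclass: how to decode a single cell into a T.
  trait CellDecoder[T] { def decode(cell: String): Option[T] }

  object CellDecoder {
    implicit val intDecoder: CellDecoder[Int] = new CellDecoder[Int] {
      def decode(cell: String): Option[Int] = Try(cell.trim.toInt).toOption
    }
    implicit val stringDecoder: CellDecoder[String] = new CellDecoder[String] {
      def decode(cell: String): Option[String] = Some(cell)
    }
  }

  // Builds a row parser for a three-field constructor. None signals a wrong
  // number of cells or a cell that fails to decode.
  def parser3[A, B, C, R](build: (A, B, C) => R)(
      implicit da: CellDecoder[A], db: CellDecoder[B], dc: CellDecoder[C]
  ): Seq[String] => Option[R] = {
    case Seq(a, b, c) =>
      for {
        x <- da.decode(a)
        y <- db.decode(b)
        z <- dc.decode(c)
      } yield build(x, y, z)
    case _ => None
  }
}
```

A user would call something like parser3((i: Int, j: Int, s: String) => UserClass(i, j, s)) and map the result over the already-split rows. Generalizing this over arbitrary arity is the part that generic-programming libraries such as Shapeless automate.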

Storing an object to a file

I want to save an object (an instance of a class) to a file. I didn't find any valuable example of it. Do I need to use serialization for it?
How do I do that?
UPDATE:
Here is how I tried to do that
import scala.util.Marshal
import scala.io.Source
import scala.collection.immutable
import java.io._
object Example {
class Foo(val message: String) extends scala.Serializable
val foo = new Foo("qweqwe")
val out = new FileOutputStream("out123.txt")
out.write(Marshal.dump(foo))
out.close
}
First of all, out123.txt contains a lot of extra data and appears to be in the wrong encoding. My gut tells me there should be another, proper way.
At the last Scala Days, Heather introduced a new library that provides a cool new mechanism for serialization: pickling. I think it would be an idiomatic way to do serialization in Scala and is just what you want.
Check out a paper on this topic, slides and talk on ScalaDays'13
It is also possible to serialize to and deserialize from JSON using Jackson.
A nice wrapper that makes it Scala friendly is Jacks
JSON has the following advantages
a simple human readable text
a rather efficient format byte wise
it can be used directly by Javascript
and even be natively stored and queried using a DB like Mongo DB
(Edit) Example Usage
Serializing to JSON:
val json = JacksMapper.writeValueAsString[MyClass](instance)
... and deserializing
val obj = JacksMapper.readValue[MyClass](json)
Take a look at Twitter Chill to handle your serialization: https://github.com/twitter/chill. It's a Scala helper for the Kryo serialization library. The documentation and examples on the GitHub page look to be sufficient for your needs.
Just adding my answer here for the convenience of anyone in the same situation.
The pickling library mentioned by @4lex1v only supports Scala 2.10/2.11, but I'm using Scala 2.12, so I can't use it in my project.
Then I found BooPickle. It supports Scala 2.11 as well as 2.12!
Here's the example:
import boopickle.Default._
val data = Seq("Hello", "World!")
val buf = Pickle.intoBytes(data)
val helloWorld = Unpickle[Seq[String]].fromBytes(buf)
For more details, check here.
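For completeness, the built-in JVM route needs no extra library. The sketch below uses java.io.ObjectOutputStream and ObjectInputStream; FileStoreSketch, save, and load are hypothetical names. The output is a binary format, which is why a file written this way does not look like readable text when opened in an editor.

```scala
import java.io.{File, FileInputStream, FileOutputStream, ObjectInputStream, ObjectOutputStream}

object FileStoreSketch {
  // Case classes are Serializable by default.
  final case class Foo(message: String)

  // Write one object to the file, closing the stream even on failure.
  def save(obj: AnyRef, file: File): Unit = {
    val out = new ObjectOutputStream(new FileOutputStream(file))
    try out.writeObject(obj) finally out.close()
  }

  // Read one object back. The cast is unchecked: the caller must know the type.
  def load[T](file: File): T = {
    val in = new ObjectInputStream(new FileInputStream(file))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

The libraries mentioned above mainly improve on this by offering type-safe reads, smaller output, or a human-readable format.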