Best MongoDB driver for Play Framework 2.1

What's the best MongoDB driver for Play Framework 2.1?
I am trying ReactiveMongo at the moment, but I cannot find good documentation anywhere and I have my doubts about its future development.
Which driver is the most popular and supported?
Thanks,
GA

I didn't do any comparison, so I wouldn't claim it's the best, but when I started my current project the only option was salat with its Play! plugin. It's quite well documented (see its GitHub wiki) and under active development. I'd say it's production quality. If the documentation isn't enough for you, there are usage samples in the test suites in the repository.
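For a flavor of the API, round-tripping a case class through salat looks roughly like this (a minimal sketch based on the wiki; User is a made-up example class):

import com.novus.salat._
import com.novus.salat.global._
import com.mongodb.casbah.Imports._

case class User(name: String, age: Int)

// Round-trip: case class -> DBObject -> case class
val dbo: DBObject = grater[User].asDBObject(User("Alice", 30))
val user: User = grater[User].asObject(dbo)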

We were somewhat unsatisfied with the way Casbah works for deep objects and simple maps, and with its lack of real case class mapping support, so we rolled our own synchronous MongoDB Scala driver on top of the legacy Java driver, which I would like to shamelessly plug here with an example of how to store and retrieve a map and a simple case class. The driver does not involve a lot of magic, is easy to set up, and features a simple BSON implementation inspired by the Play 2 JSON implementation.
Here is how to use it with some simple values:
val client = MongoClient("hostname", 27017)
val db = client("dbname")
val coll = db("collectionname")
coll.save(Bson.doc("_id" -> 1, "vals" -> Map("key1" -> "val1")))
val docOpt = coll.findOneById(1) // => Option[BsonDoc]
for (doc <- docOpt)
  println(doc.as[Map[String, String]]("vals")("key1")) // => prints "val1"
With a case class, some mapping code is needed, but this was a deliberate design decision: we wanted the mapping to be completely customizable without having to understand any real framework:
case class DnsRecord(host: String = "", ttl: Long = 0, otherProps: Map[String, String] = Map())

case object DnsRecord {
  implicit object DnsRecordToBsonElement extends ToBsonElement[DnsRecord] {
    def toBson(v: DnsRecord): BsonElement = DnsRecordToBsonDoc.toBson(v)
  }

  implicit object DnsRecordFromBsonElement extends FromBsonElement[DnsRecord] {
    def fromBson(v: BsonElement): DnsRecord = DnsRecordFromBsonDoc.fromBson(v.asInstanceOf[BsonDoc])
  }

  implicit object DnsRecordFromBsonDoc extends FromBsonDoc[DnsRecord] {
    def fromBson(d: BsonDoc): DnsRecord = DnsRecord(
      d[String]("host"),
      d[Long]("ttl"),
      d[Map[String, String]]("op")
    )
  }

  implicit object DnsRecordToBsonDoc extends ToBsonDoc[DnsRecord] {
    def toBson(m: DnsRecord): BsonDoc = Bson.doc(
      "host" -> m.host,
      "ttl" -> m.ttl,
      "op" -> m.otherProps
    )
  }
}

coll.save(DnsRecord("test.de", 4456, Map("p2" -> "val1")))

for (r <- coll.findAs[DnsRecord](Bson.doc("host" -> "test.de")))
  println(r.host)

You could try Moscale. I implemented this library with my colleagues as part of another proprietary project, and it is used in production now. Documentation is lacking, but it is extremely useful and very simple. You can look at the tests instead of documentation, and there is a short example of simple usage.


How to print contents of a BSONDocument

I am trying to implement CRUD operations using ReactiveMongo, and here is my find function, from an online tutorial.
def findTicker(ticker: String) = {
  val query = BSONDocument("firstName" -> ticker)
  val future = collection.find(query).one
  future.onComplete {
    case Failure(e) => throw e
    case Success(result) => println(result)
  }
}
However, I am getting this result:
Some(BSONDocument(<non-empty>))
How can I see the actual, readable JSON data I am looking for:
{ "_id" : ObjectId("569914557b85c62b49634c1d"), "firstName" : "Stephane", "lastName" : "Godbillon", "age" : 29 }
You can do this without the Play Framework module. There is a pretty function specially for this:
result match {
  case Some(document) => println(BSONDocument.pretty(document))
  case None => println("No document")
}
With Play-ReactiveMongo
So you have a few options. It looks like you're using the Play Framework, so I assume the Play-ReactiveMongo plugin. If that's the case, check out this question. It's a bit different, but I think you can reuse the ideas from the submitted answer.
import play.modules.reactivemongo.json.BSONFormats._
and then in your success case
case Success(result) => {
  result.map { data =>
    Json.toJson(data)
  }
}
There are other options to convert BSONDocuments to JSON, but Play-ReactiveMongo makes things easier.
Without the Play-ReactiveMongo plugin, you will need to tell ReactiveMongo how to write and read your data. To do this, ReactiveMongo uses BSONDocumentReaders and BSONDocumentWriters. It provides a macro to generate these for most classes; this link has more info.
import reactivemongo.bson._

// let's say your domain/case class is called Person
implicit val personHandler: BSONHandler[BSONDocument, Person] = Macros.handler[Person]
A BSONHandler gathers both the BSONReader and BSONWriter traits, and you can place this implicit in Person's companion object.
ReactiveMongo's one method is generic in the type of entity it is looking for and takes an implicit reader for your entity.
def one[T](readPreference: ReadPreference)(implicit reader: Reader[T], ec: ExecutionContext): Future[Option[T]]
So in this example it would use the reader generated by the macro above to return a Future[Option[Person]] instead of a Future[Option[BSONDocument]]. Then you can use Play JSON to write your domain object as JSON.
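Concretely, the typed read then looks roughly like this (a sketch assuming the Person handler above is in scope; collection is a placeholder BSONCollection):

import reactivemongo.api.collections.bson.BSONCollection
import reactivemongo.bson.BSONDocument
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

val collection: BSONCollection = ??? // obtained from your database elsewhere

// With the implicit Person reader from the macro in scope,
// `one` deserializes the matching document straight to Person
val futurePerson: Future[Option[Person]] =
  collection.find(BSONDocument("firstName" -> "Stephane")).one[Person]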
For full disclosure: you can write your own custom writers rather than use the macro, and these end up being similar to writing Play JSON writers and readers.
THIS ANSWER IS BASED ON @Barry's PREVIOUS ANSWER, BEFORE THE EDITS:
I got it to work using the updated play2-reactivemongo version:
"org.reactivemongo" %% "play2-reactivemongo" % "0.11.9",
Now,
result.map { data =>
  println(Json.toJson(data))
}
returns what I want:
{"_id":0,"name":"MongoDB","type":"database","count":1,"info":{"x":203,"y":102}}

Scalding TypedPipe API External Operations pattern

I have a copy of Programming MapReduce with Scalding by Antonios Chalkiopoulos. In the book he discusses the External Operations design pattern for Scalding code. You can see an example on his website here. I have chosen to use the Type Safe API. Naturally, this introduces new challenges, but I prefer it over the Fields API, which is what is heavily discussed in the book mentioned above and on the site.
I am wondering how people have implemented the external operations pattern with the Type Safe API. My initial implementation is as follows:
1. I create a class that extends com.twitter.scalding.Job, which serves as my Scalding job class where I 'manage arguments, define taps, and use external operations to construct data processing pipelines'.
2. I create an object where I define the functions to be used in the Type Safe pipes. Because the Type Safe pipes take functions as arguments, I can then just pass the functions in the object as arguments to the pipes.
This creates code that looks like this:
class MyJob(args: Args) extends Job(args) {
  import MyOperations._

  val input_path = args(MyJob.inputArgPath)
  val output_path = args(MyJob.outputArgPath)

  val eventInput: TypedPipe[(LongWritable, Text)] = this.mode match {
    case m: HadoopMode => TypedPipe.from(WritableSequenceFile[LongWritable, Text](input_path))
    case _ => TypedPipe.from(WritableSequenceFile[LongWritable, Text](input_path))
  }

  val eventOutput: FixedPathSource with TypedSink[(LongWritable, Text)] with TypedSource[(LongWritable, Text)] = this.mode match {
    case m: HadoopMode => WritableSequenceFile[LongWritable, Text](output_path)
    case _ => TypedTsv[(LongWritable, Text)](output_path)
  }

  val validatedEvents: TypedPipe[(LongWritable, Either[Text, Event])] = eventInput.map(convertTextToEither).fork
  validatedEvents.filter(isEvent).map(removeEitherWrapper).write(eventOutput)
}
object MyOperations {
  def convertTextToEither(v: (LongWritable, Text)): (LongWritable, Either[Text, Event]) = {
    ...
  }

  def isEvent(v: (LongWritable, Either[Text, Event])): Boolean = {
    ...
  }

  def removeEitherWrapper(v: (LongWritable, Either[Text, Event])): (LongWritable, Text) = {
    ...
  }
}
As you can see, the functions passed to the Scalding Type Safe operations are kept separate from the job itself. While this is not as 'clean' as the external operations pattern presented, it is a quick way to write this kind of code. Additionally, I can use JUnitRunner for job-level integration tests and ScalaTest for function-level unit tests.
The main point of this post, though, is to ask how people are doing this sort of thing. The documentation around the internet for the Scalding Type Safe API is sparse. Are there more functional, Scala-friendly ways of doing this? Am I missing a key component of the design pattern? I feel somewhat nervous about this, because with the Fields API you can write unit tests on pipes with ScaldingTest; as far as I know, you can't do that with TypedPipes. Please let me know if there is a generally agreed-upon pattern for the Scalding Type Safe API, or how you create reusable, modular, and testable Type Safe API code. Thanks for the help!
Update 2 after Antonios' reply
Thank you for the reply. That was basically the answer I was looking for, but I wanted to continue the conversation. The main issue I see with your answer, as I commented, is that this implementation expects a specific type, but what if the types change throughout your job? I have explored this code and it seems to work, but it feels hacked on.
def self: TypedPipe[Any]

def testingPipe: TypedPipe[(LongWritable, Text)] = self.map(
  (firstVar: Any) => {
    val tester = firstVar.asInstanceOf[(LongWritable, Text)]
    (tester._1, tester._2)
  }
)
The upside to this is that I declare one implementation of self, but the downside is this ugly type casting. Additionally, I have not tested this in depth with a more complex pipeline. So basically: what are your thoughts on handling types as they change, with only one self implementation, for cleanliness/brevity?
Scala extension methods are implemented using implicit classes.
You give the compiler the capability to convert a TypedPipe into a (wrapper) class that contains your external operations:
import com.twitter.scalding._
import cascading.flow.FlowDef

class MyJob(args: Args) extends Job(args) {

  implicit class MyOperationsWrapper(val self: TypedPipe[Double]) extends MyOperations with Serializable

  val pipe = TypedPipe.from(TypedTsv[Double](args("input")))

  val result = pipe
    .operation1
    .operation2(x => x * 2)
    .write(TypedTsv[Double](args("output")))
}
trait MyOperations {
  def self: TypedPipe[Double]

  def operation1(implicit fd: FlowDef): TypedPipe[Double] =
    self.map { x =>
      println(s"Input: $x")
      x / 100
    }

  def operation2(datafn: Double => Double)(implicit fd: FlowDef): TypedPipe[Double] =
    self.map { x =>
      val result = datafn(x)
      println(s"Result: $result")
      result
    }
}
import org.apache.hadoop.util.ToolRunner
import org.apache.hadoop.conf.Configuration

object MyRunner extends App {
  ToolRunner.run(new Configuration(), new Tool, (classOf[MyJob].getName :: "--local" ::
    "--input" :: "doubles.tsv" ::
    "--output" :: "result.tsv" :: args.toList).toArray)
}
Regarding how to manage types across the pipes, my recommendation would be to work out some basic types that make sense and use case classes. To use your example, I would rename the method convertTextToEither to extractEvents:
case class LogInput(l: Long, text: Text)
case class Event(data: String)

def extractEvents(line: LogInput): TypedPipe[Event] =
  self.filter(isEvent(line))
      .map(getEvent(line.text))
Then you would have:
- LogInputOperations for LogInput types
- EventOperations for Event types
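A rough sketch of what those per-type operation traits could look like, following the wrapper pattern above (hypothetical names; only the shapes matter):

import com.twitter.scalding.TypedPipe
import org.apache.hadoop.io.{LongWritable, Text}

// Event is the case class defined above.
// Each trait fixes its own element type, so no asInstanceOf casting is needed.
trait LogInputOperations {
  def self: TypedPipe[(LongWritable, Text)]
  // log-input-specific operations go here
}

trait EventOperations {
  def self: TypedPipe[Event]
  // event-specific operations go here
}

// Wired up with implicit wrapper classes, as in the MyOperationsWrapper example:
// implicit class LogInputWrapper(val self: TypedPipe[(LongWritable, Text)]) extends LogInputOperations with Serializable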
I am not sure what problem you see with the snippet you showed, or why you think it is "less clean". It looks fine to me.
As for the question about unit testing jobs using the Type Safe API, take a look at JobTest; it seems to be just what you are looking for.
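For illustration, a JobTest for the MyJob example above could look roughly like this (a sketch; the file names and ScalaTest style are arbitrary choices):

import com.twitter.scalding._
import org.scalatest.WordSpec

class MyJobTest extends WordSpec {
  "MyJob" should {
    "divide by 100 and then double each input" in {
      JobTest(new MyJob(_))
        .arg("input", "doubles.tsv")
        .arg("output", "result.tsv")
        .source(TypedTsv[Double]("doubles.tsv"), List(1.0, 2.0))
        .sink[Double](TypedTsv[Double]("result.tsv")) { buf =>
          // operation1 divides by 100, operation2 doubles
          assert(buf.toList == List(0.02, 0.04))
        }
        .run
        .finish
    }
  }
}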

Transform Enumerator[T] into List[T]

I am integrating an application using ReactiveMongo with a legacy application.
As I must maintain the legacy application's interfaces, at some point I must block and/or transform my code into the specified interface types. I have distilled that code down to the example below.
Is there a better way than getChunks to consume all of an Enumerator with the output type being a List? What is the standard practice?
implicit def legacyAdapter[TInput, TResult]
    (block: Future[Enumerator[TInput]])
    (implicit translator: TInput => TResult,
     executionContext: ExecutionContext,
     timeOut: Duration): List[TResult] = {
  val iter = Iteratee.getChunks[TResult]
  val exhaustFuture = block.flatMap {
    enumy => enumy.map(i => translator(i)).run(iter)
  }
  Await.result(exhaustFuture, timeOut)
}
Iteratee.getChunks is the only utility offered by the Play Framework that builds a list by consuming all chunks of an enumerator. You can of course do the same thing using Iteratee.fold, but you would be reinventing the wheel, as Iteratee.getChunks itself uses Iteratee.fold.
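For instance, a hand-rolled equivalent might look like this (a sketch; it prepends and reverses to stay linear rather than appending to the list):

import play.api.libs.iteratee.Iteratee
import scala.concurrent.ExecutionContext.Implicits.global

// Roughly what Iteratee.getChunks does: fold every chunk into a list
def toList[T]: Iteratee[T, List[T]] =
  Iteratee.fold[T, List[T]](Nil)((acc, chunk) => chunk :: acc).map(_.reverse)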
We found using the collect method of http://reactivemongo.org/releases/0.10/api/index.html#reactivemongo.api.Cursor to be more performant than collecting the chunks of the Enumerator. So, in a sense, we solved our problem by changing the problem.
That did mean an API change for our data layer, but the performance improvements justified it.
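In ReactiveMongo 0.10 terms, that looks roughly like this (a sketch; the collection and query are placeholders):

import reactivemongo.api.collections.default.BSONCollection
import reactivemongo.bson.BSONDocument
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// Collect the cursor straight into a List instead of running an Enumerator
def fetchAll(collection: BSONCollection, query: BSONDocument): Future[List[BSONDocument]] =
  collection.find(query).cursor[BSONDocument].collect[List]()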

Scala pickling: Simple custom pickler for my own class?

I am trying to pickle some relatively-simple-structured but large-and-slow-to-create classes in a Scala NLP (natural language processing) app of mine. Because there's lots of data, it needs to pickle and esp. unpickle quickly and without bloat. Java serialization evidently sucks in this regard. I know about Kryo but I've never used it. I've also run into Apache Avro, which seems similar although I'm not quite sure why it's not normally mentioned as a suitable solution. Neither is Scala-specific and I see there's a Scala-specific package called Scala Pickling. Unfortunately it lacks almost all documentation and I'm not sure how to create a custom pickler.
I see a question here:
Scala Pickling: Writing a custom pickler / unpickler for nested structures
There's still some context lacking in that question, and also it looks like an awful lot of boilerplate to create a custom pickler, compared with the examples given for Kryo or Avro.
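For reference, the zero-boilerplate default usage looks roughly like this (a sketch; the exact imports vary across scala-pickling versions, and Doc is a made-up class):

import scala.pickling._
import scala.pickling.json._

case class Doc(id: Int, text: String)

// Macro-generated pickler: serialize to JSON and back
val pickled: JSONPickle = Doc(1, "hello").pickle
val restored: Doc = pickled.unpickle[Doc]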
Here are some of the classes I need to serialize:
trait ToIntMemoizer[T] {
  protected val minimum_raw_index: Int = 1
  protected var next_raw_index: Int = minimum_raw_index

  // For replacing items with ints. This is a wrapper around
  // gnu.trove.map.TObjectIntMap to make it look like mutable.Map[T, Int].
  // It behaves the same way.
  protected val value_id_map = trovescala.ObjectIntMap[T]()

  // Map in the opposite direction. This is a wrapper around
  // gnu.trove.map.TIntObjectMap to make it look like mutable.Map[Int, T].
  // It behaves the same way.
  protected val id_value_map = trovescala.IntObjectMap[T]()
  ...
}
class FeatureMapper extends ToIntMemoizer[String] {
  val features_to_standardize = mutable.BitSet()
  ...
}

class LabelMapper extends ToIntMemoizer[String] {
}

case class FeatureLabelMapper(
  feature_mapper: FeatureMapper = new FeatureMapper,
  label_mapper: LabelMapper = new LabelMapper
)

class DoubleCompressedSparseFeatureVector(
  var keys: Array[Int], var values: Array[Double],
  val mappers: FeatureLabelMapper
) { ... }
How would I create custom picklers/unpicklers in a way that uses as little boilerplate as possible (since I have a number of other classes that need similar treatment)?
Thanks!

Which driver can I use to access MongoDB in a Scala Swing application?

Hi, I am working with Scala and MongoDB, and now I want to access a MongoDB database from a Scala Swing application. Which drivers can I use for this, and which work most easily? Please reply.
I've been using Casbah (http://api.mongodb.org/scala/casbah/2.0.2/index.html) to talk to MongoDB from my Scala Swing application.
It's pretty easy to install and set up, and the API is quite Scala-esque.
The hardest part is understanding MongoDB itself (coming from an SQL background).
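For a quick taste, basic Casbah usage looks roughly like this (a sketch against the 2.x API; database and collection names are placeholders):

import com.mongodb.casbah.Imports._

// Connect, then pick a database and collection
// (MongoConnection is the 2.0.x entry point)
val conn = MongoConnection("localhost", 27017)
val coll = conn("mydb")("mycollection")

// Insert and query a document
coll += MongoDBObject("firstName" -> "Ada", "lastName" -> "Lovelace")
coll.findOne(MongoDBObject("firstName" -> "Ada")).foreach(println)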
We were somewhat unsatisfied with the way Casbah works for deep objects and simple maps, and with its lack of real case class mapping support, so we rolled our own synchronous MongoDB Scala driver on top of the legacy Java driver, which I would like to shamelessly plug here with an example of how to store and retrieve a map and a simple case class. The driver does not involve a lot of magic, is easy to set up, and features a simple BSON implementation inspired by the Play 2 JSON implementation.
Here is how to use it with some simple values:
val client = MongoClient("hostname", 27017)
val db = client("dbname")
val coll = db("collectionname")
coll.save(Bson.doc("_id" -> 1, "vals" -> Map("key1" -> "val1")))
val docOpt = coll.findOneById(1) // => Option[BsonDoc]
for (doc <- docOpt)
  println(doc.as[Map[String, String]]("vals")("key1")) // => prints "val1"
For a case class it is a little more complex, but it is all hand-rolled and there is no magic involved, so you can do whatever you like, however you need it, e.g. provide shorter key names in the doc:
case class DnsRecord(host: String = "", ttl: Long = 0, otherProps: Map[String, String] = Map())

case object DnsRecord {
  implicit object DnsRecordToBsonElement extends ToBsonElement[DnsRecord] {
    def toBson(v: DnsRecord): BsonElement = DnsRecordToBsonDoc.toBson(v)
  }

  implicit object DnsRecordFromBsonElement extends FromBsonElement[DnsRecord] {
    def fromBson(v: BsonElement): DnsRecord = DnsRecordFromBsonDoc.fromBson(v.asInstanceOf[BsonDoc])
  }

  implicit object DnsRecordFromBsonDoc extends FromBsonDoc[DnsRecord] {
    def fromBson(d: BsonDoc): DnsRecord = DnsRecord(
      d[String]("host"),
      d[Long]("ttl"),
      d[Map[String, String]]("op")
    )
  }

  implicit object DnsRecordToBsonDoc extends ToBsonDoc[DnsRecord] {
    def toBson(m: DnsRecord): BsonDoc = Bson.doc(
      "host" -> m.host,
      "ttl" -> m.ttl,
      "op" -> m.otherProps
    )
  }
}

coll.save(DnsRecord("test.de", 4456, Map("p2" -> "val1")))

for (r <- coll.findAs[DnsRecord](Bson.doc("host" -> "test.de")))
  println(r.host)
As an update for people finding this thread who are interested in MongoDB 3.x: we're using an async driver, which can be found at https://github.com/evojam/mongodb-driver-scala. Its API is built in a Scala way, with a new Play 2.4 module ready if you're using Play, but you can always take only the driver.