Hi,
I am working with Scala and MongoDB, and now I want to access a MongoDB database from a Scala Swing application. Which drivers can I use for this, and which is the easiest to work with?
I've been using casbah http://api.mongodb.org/scala/casbah/2.0.2/index.html to talk to mongodb from my scala swing application.
It's pretty easy to install and setup, and the API is quite scala-esque.
The hardest part is understanding MongoDB itself (coming from an SQL background).
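To give a feel for it, here is a minimal sketch of basic Casbah usage; the host, database, and collection names are placeholders, and note that in Casbah 2.0.x the entry point is MongoConnection (later releases renamed it MongoClient):

import com.mongodb.casbah.Imports._

// connect and pick a database/collection (names assumed)
val conn = MongoConnection("localhost", 27017)
val coll = conn("mydb")("users")

// insert and query with the MongoDBObject builder
coll += MongoDBObject("firstName" -> "Stephane", "age" -> 29)
coll.findOne(MongoDBObject("firstName" -> "Stephane")).foreach(println)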
We were somewhat unsatisfied with the way Casbah works for deep objects and simple maps, and with its lack of real case class mapping support, so we rolled our own synchronous MongoDB Scala driver on top of the legacy Java driver, which I would like to shamelessly plug here with an example of how to store and retrieve a map and a simple case class. The driver does not have a lot of magic, is easy to set up, and features a simple BSON implementation that was inspired by the Play 2 JSON implementation.
Here is how to use it with some simple values:
val client = MongoClient("hostname", 27017)
val db = client("dbname")
val coll = db("collectionname")
coll.save(Bson.doc("_id" -> 1, "vals" -> Map("key1" -> "val1")))
val docOpt = coll.findOneById(1) // => Option[BsonDoc]
for (doc <- docOpt)
  println(doc.as[Map[String, String]]("vals")("key1")) // => prints "val1"
For a case class it is a little more complex, but it is all hand-rolled and there is no magic involved, so you can do whatever you like, however you need it, e.g. provide shorter key names in the doc:
case class DnsRecord(host: String = "", ttl: Long = 0, otherProps: Map[String, String] = Map())
case object DnsRecord {
  implicit object DnsRecordToBsonElement extends ToBsonElement[DnsRecord] {
    def toBson(v: DnsRecord): BsonElement = DnsRecordToBsonDoc.toBson(v)
  }
  implicit object DnsRecordFromBsonElement extends FromBsonElement[DnsRecord] {
    def fromBson(v: BsonElement): DnsRecord = DnsRecordFromBsonDoc.fromBson(v.asInstanceOf[BsonDoc])
  }
  implicit object DnsRecordFromBsonDoc extends FromBsonDoc[DnsRecord] {
    def fromBson(d: BsonDoc): DnsRecord = DnsRecord(
      d[String]("host"),
      d[Long]("ttl"),
      d[Map[String, String]]("op")
    )
  }
  implicit object DnsRecordToBsonDoc extends ToBsonDoc[DnsRecord] {
    def toBson(m: DnsRecord): BsonDoc = Bson.doc(
      "host" -> m.host,
      "ttl" -> m.ttl,
      "op" -> m.otherProps
    )
  }
}
coll.save(DnsRecord("test.de", 4456, Map("p2" -> "val1")))
for (r <- coll.findAs[DnsRecord](Bson.doc("host" -> "test.de")))
  println(r.host)
As an update for people finding this thread who are interested in MongoDB 3.x: we're using an async driver, which can be found here: https://github.com/evojam/mongodb-driver-scala. Its API is built the Scala way, with a new Play 2.4 module ready if you're using Play, but you can always take just the driver.
Related
I'm trying to write a function that aggregates data from my MongoDB using ReactiveMongo (0.12), with Play JSON serialisation (similar to this question).
So this is what I have:
def getPopAggregate(col: JSONCollection) = {
  import col.BatchCommands.AggregationFramework.{AggregationResult, Group, Match, SumField}
  col.aggregate(
    Group(JsString("$rstId"))("totalPopulation" -> SumField("population")),
    List(Match(Json.obj("totalPopulation" -> Json.obj("$gte" -> 1000))))
  ).map(_.firstBatch)
}
This outputs Future[List[JsObject]]; however, I want to map the results to a list of my case class (i.e. Future[Seq[PopAggregate]]).
case class PopAggregate(rstId: Option[BSONObjectID], totalPopulation: Double)
object PopAggregate {
  implicit val popAggregateFormat = Json.format[PopAggregate]
}
I hope someone can spare a moment to help me past this one. Many thanks!
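A hedged sketch of one way to finish the mapping, reusing the getPopAggregate above: the Group stage emits its key as _id rather than rstId, so a dedicated Reads is needed. This assumes the plugin's JSON format for BSONObjectID is in scope (the import path varies across plugin versions), and documents that fail validation are silently dropped:

import play.api.libs.functional.syntax._
import play.modules.reactivemongo.json.BSONFormats._ // path varies across plugin versions

// map the aggregation output's "_id" key onto rstId
val popAggregateReads: Reads[PopAggregate] = (
  (__ \ "_id").readNullable[BSONObjectID] and
  (__ \ "totalPopulation").read[Double]
)(PopAggregate.apply _)

def getPopAggregates(col: JSONCollection): Future[Seq[PopAggregate]] =
  getPopAggregate(col).map(_.flatMap(_.validate(popAggregateReads).asOpt))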
I am working on a backend server in the Play Framework in Scala. However, I am calling an external library (written in Java) that returns a Java list (util.List). I created a Writes for the object that is contained in the list, but I don't know how to write the Writes for the actual list so that it can be generic (no need to write a Writes for both List[A] and List[B], just the Writes for A and B).
I know I could use JavaConversions to convert the Java list to a Scala Seq (which already has a Writes implemented), but since speed is essential, I would like to avoid the extra conversion.
Here is a possible implementation:
import play.api.libs.json.{JsArray, JsValue, Json, Writes}
import scala.collection.JavaConverters._

implicit def jListWrites[A: Writes]: Writes[java.util.List[A]] = new Writes[java.util.List[A]] {
  override def writes(o: java.util.List[A]): JsValue =
    JsArray(o.asScala.map(Json.toJson(_)))
}
You don't create a single Writes but rather a method that can create them for any type that has Writes defined.
You said you want to avoid JavaConversions, but as you can see that is difficult: JsArray expects a Seq[JsValue] anyway, so you need to construct a Scala Seq one way or another.
What I've shown here is more or less equivalent to converting the Java List to a Scala mutable.Buffer using asScala and then using the default Writes for Traversable.
Note that the conversions are probably not as expensive as you think; they just create a wrapper, with no copying involved.
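For illustration, a usage sketch under those definitions (the values are made up):

import java.util.Arrays

// with jListWrites in implicit scope, Json.toJson handles Java lists directly
val names: java.util.List[String] = Arrays.asList("a", "b", "c")
Json.toJson(names) // => ["a","b","c"]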
Here is the best I could come up with in terms of performance:
implicit def jListWrites[A: Writes]: Writes[java.util.List[A]] = new Writes[java.util.List[A]] {
  override def writes(o: java.util.List[A]): JsValue = {
    // fill a preallocated array by index to skip the wrapper layer entirely
    val buffer = new Array[JsValue](o.size)
    var i = 0
    while (i < o.size) {
      buffer(i) = Json.toJson(o.get(i))
      i += 1
    }
    JsArray(buffer)
  }
}
It takes 29 ms for 1,000,000 Ints, compared to 39 ms for the straightforward implementation. Note that Int is easy to convert; if your objects are more complex, the speedup will be smaller.
Converting 20,000 instances of case class C(num: Int, n2: Int, s: String) gives equal results (the straightforward version is even faster, by 0.14 ms).
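If you want to reproduce numbers like these, a rough timing sketch along the following lines works (not a rigorous benchmark; repeat the measurement and discard the first runs to warm up the JIT):

// build a large Java list of Ints
val data: java.util.List[Int] = {
  val l = new java.util.ArrayList[Int](1000000)
  var i = 0
  while (i < 1000000) { l.add(i); i += 1 }
  l
}

// time a single conversion using the Writes above
val t0 = System.nanoTime()
Json.toJson(data)
println(s"took ${(System.nanoTime() - t0) / 1000000} ms")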
You can code a Writes that reuses the existing one for the Scala List:
import java.util.{List => JList}
import scala.collection.JavaConverters._

implicit def jListWrites[T](implicit sw: Writes[List[T]]): Writes[JList[T]] = Writes[JList[T]] { jlist =>
  sw.writes(jlist.asScala.toList)
}
I am trying to implement CRUD operations using ReactiveMongo, and here is my find function from a tutorial online.
def findTicker(ticker: String) = {
  val query = BSONDocument("firstName" -> ticker)
  val future = collection.find(query).one[BSONDocument]
  future.onComplete {
    case Failure(e) => throw e
    case Success(result) =>
      println(result)
  }
}
However, I am getting this result:
Some(BSONDocument(<non-empty>))
How can I actually see the readable JSON data I am looking for?
{ "_id" : ObjectId("569914557b85c62b49634c1d"), "firstName" : "Stephane", "lastName" : "Godbillon", "age" : 29 }
You can do this without the Play framework module; there is a pretty function specially for this:
result match {
  case Some(document) => println(BSONDocument.pretty(document))
  case None           => println("No document")
}
With Play-ReactiveMongo
So you have a few options. It looks like you're using the Play framework, and I assume the Play-ReactiveMongo plugin. If that's the case, check out this question. It's a bit different, but I think you can re-use the ideas from the submitted answer.
import play.modules.reactivemongo.json.BSONFormats._
and then in your success case
case Success(result) => {
  result.map { data =>
    Json.toJson(data)
  }
}
There are other options to convert BSONDocuments to JSON but Play-ReactiveMongo makes things easier.
Without the Play-ReactiveMongo plugin you will need to tell ReactiveMongo how to write and read your data. To do this, ReactiveMongo uses BSONDocumentReaders and BSONDocumentWriters. It also provides a macro to generate these for most classes; this link has more info:
import reactivemongo.bson._

// let's say your domain/case class is called Person
implicit val personHandler: BSONHandler[BSONDocument, Person] = Macros.handler[Person]
A BSONHandler gathers both the BSONReader and BSONWriter traits, and you can place this implicit in Person's companion object.
ReactiveMongo's one method is generic in the type of entity it is looking for and takes an implicit reader for your entity:
def one[T](readPreference: ReadPreference)(implicit reader: Reader[T], ec: ExecutionContext): Future[Option[T]]
So in this example it would use the reader generated by the macro above to return a Future[Option[Person]] instead of a Future[Option[BSONDocument]]. Then you can use Play JSON to write your domain object as JSON.
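Putting it together, a hedged sketch (the collection and the query are placeholders):

// hypothetical usage: with personHandler in implicit scope, 'one' can
// deserialize straight to Person ('collection' is a placeholder)
val futurePerson: Future[Option[Person]] =
  collection.find(BSONDocument("firstName" -> "Stephane")).one[Person]

futurePerson.foreach {
  case Some(person) => println(person) // or Json.toJson(person), given a Play JSON Writes[Person]
  case None         => println("No person found")
}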
For full disclosure, you can write your own custom writers rather than use the macro; these end up being similar to writing Play JSON writers and readers.
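For example, a hand-rolled pair for a hypothetical Person(firstName: String, age: Int), roughly what the macro would generate (the fallback defaults on missing fields are a choice of this sketch):

import reactivemongo.bson._

case class Person(firstName: String, age: Int)

object Person {
  implicit object PersonReader extends BSONDocumentReader[Person] {
    def read(doc: BSONDocument): Person = Person(
      doc.getAs[String]("firstName").getOrElse(""),
      doc.getAs[Int]("age").getOrElse(0)
    )
  }
  implicit object PersonWriter extends BSONDocumentWriter[Person] {
    def write(p: Person): BSONDocument =
      BSONDocument("firstName" -> p.firstName, "age" -> p.age)
  }
}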
THIS ANSWER IS BASED ON @Barry's PREVIOUS ANSWER BEFORE THE EDITS:
I got it to work using the updated play2-reactivemongo version:
"org.reactivemongo" %% "play2-reactivemongo" % "0.11.9",
Now,
result.map { data =>
  println(Json.toJson(data))
}
returns what I want:
{"_id":0,"name":"MongoDB","type":"database","count":1,"info":{"x":203,"y":102}}
I have a copy of Programming MapReduce with Scalding by Antonios Chalkiopoulos. In the book he discusses the External Operations design pattern for Scalding code. You can see an example on his website here. I have made a choice to use the Type Safe API. Naturally, this introduces new challenges, but I prefer it over the Fields API, which is what is heavily discussed in the book and on the site.
I am wondering how people have implemented the external operations pattern with the Type Safe API. My initial implementation is as follows:
I create a class that extends com.twitter.scalding.Job, which serves as my Scalding job class where I 'manage arguments, define taps, and use external operations to construct data processing pipelines'.
I create an object where I define the functions to be used in the Type Safe pipes. Because the Type Safe pipes take functions as arguments, I can then just pass the functions in the object as arguments to the pipes.
This creates code that looks like this:
import com.twitter.scalding._
import org.apache.hadoop.io.{LongWritable, Text}

class MyJob(args: Args) extends Job(args) {
  import MyOperations._

  val input_path = args(MyJob.inputArgPath)
  val output_path = args(MyJob.outputArgPath)

  val eventInput: TypedPipe[(LongWritable, Text)] = this.mode match {
    case m: HadoopMode => TypedPipe.from(WritableSequenceFile[LongWritable, Text](input_path))
    case _ => TypedPipe.from(WritableSequenceFile[LongWritable, Text](input_path))
  }
  val eventOutput: FixedPathSource with TypedSink[(LongWritable, Text)] with TypedSource[(LongWritable, Text)] = this.mode match {
    case m: HadoopMode => WritableSequenceFile[LongWritable, Text](output_path)
    case _ => TypedTsv[(LongWritable, Text)](output_path)
  }
  val validatedEvents: TypedPipe[(LongWritable, Either[Text, Event])] = eventInput.map(convertTextToEither).fork
  validatedEvents.filter(isEvent).map(removeEitherWrapper).write(eventOutput)
}

object MyOperations {
  def convertTextToEither(v: (LongWritable, Text)): (LongWritable, Either[Text, Event]) = {
    ...
  }
  def isEvent(v: (LongWritable, Either[Text, Event])): Boolean = {
    ...
  }
  def removeEitherWrapper(v: (LongWritable, Either[Text, Event])): (LongWritable, Text) = {
    ...
  }
}
As you can see, the functions that are passed to the Scalding Type Safe operations are kept separate from the job itself. While this is not as 'clean' as the external operations pattern presented, this is a quick way to write this kind of code. Additionally, I can use JUnitRunner for doing job level integration tests and ScalaTest for function level unit tests.
The main point of this post though is to ask how people are doing this sort of thing? The documentation around the internet for Scalding Type Safe API is sparse. Are there more functional Scala friendly ways for doing this? Am I missing a key component here for the design pattern? I sort of feel nervous about this because with the Fields API you can write unit tests on pipes with ScaldingTest. As far as I know, you can't do that with TypedPipes. Please let me know if there is a generally agreed upon pattern for Scalding Type Safe API or how you create reusable, modular, and testable Type Safe API code. Thanks for the help!
Update 2 after Antonios' reply
Thank you for the reply; that was basically the answer I was looking for, and I wanted to continue the conversation. The main issue I see in your answer, as I commented, is that this implementation expects a specific type, but what if the types change throughout your job? I have explored the following code and it seems to work, but it feels hacked on.
def self: TypedPipe[Any]

def testingPipe: TypedPipe[(LongWritable, Text)] = self.map(
  (firstVar: Any) => {
    val tester = firstVar.asInstanceOf[(LongWritable, Text)]
    (tester._1, tester._2)
  }
)
The upside to this is that I declare one implementation of self, but the downside is the ugly type casting. Additionally, I have not tested this in depth with a more complex pipeline. So basically, what are your thoughts on how to handle types as they change, with only one self implementation, for cleanliness/brevity?
Scala extension methods are implemented using implicit classes.
You add to the compiler the capability of converting a TypedPipe into a (wrapper) class that contains your external operations:
import com.twitter.scalding._
import cascading.flow.FlowDef

class MyJob(args: Args) extends Job(args) {
  implicit class MyOperationsWrapper(val self: TypedPipe[Double]) extends MyOperations with Serializable

  val pipe = TypedPipe.from(TypedTsv[Double](args("input")))

  val result = pipe
    .operation1
    .operation2(x => x * 2)
    .write(TypedTsv[Double](args("output")))
}

trait MyOperations {
  def self: TypedPipe[Double]

  def operation1(implicit fd: FlowDef): TypedPipe[Double] =
    self.map { x =>
      println(s"Input: $x")
      x / 100
    }

  def operation2(datafn: Double => Double)(implicit fd: FlowDef): TypedPipe[Double] =
    self.map { x =>
      val result = datafn(x)
      println(s"Result: $result")
      result
    }
}
import org.apache.hadoop.util.ToolRunner
import org.apache.hadoop.conf.Configuration
import com.twitter.scalding.Tool

object MyRunner extends App {
  ToolRunner.run(new Configuration(), new Tool, (classOf[MyJob].getName :: "--local" ::
    "--input" :: "doubles.tsv" ::
    "--output" :: "result.tsv" :: args.toList).toArray)
}
Regarding how to manage types across the pipes, my recommendation would be to work out some basic types that make sense and use case classes. To use your example, I would rename the method convertTextToEither to extractEvents:
case class LogInput(l: Long, text: Text)
case class Event(data: String)

// in a LogInputOperations trait, where self: TypedPipe[LogInput]
// (isEvent and getEvent are helpers to be defined for your data)
def extractEvents: TypedPipe[Event] =
  self.filter(isEvent)
      .map(line => getEvent(line.text))
Then you would have
LogInputOperations for LogInput types
EventOperations for Event types
I am not sure what problem you see with the snippet you showed, or why you think it is "less clean"; it looks fine to me.
As for the question about unit testing jobs that use the typed API, take a look at JobTest; it seems to be just what you are looking for.
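For reference, a hedged sketch of what a typed-API JobTest could look like against the MyJob shown above (the expected numbers follow from operation1 dividing by 100 and operation2 doubling; exact JobTest method names can vary slightly across Scalding versions):

import com.twitter.scalding._
import org.scalatest.WordSpec

class MyJobTest extends WordSpec {
  "MyJob" should {
    "divide each value by 100 and double it" in {
      JobTest(new MyJob(_))
        .arg("input", "input.tsv")
        .arg("output", "output.tsv")
        .source(TypedTsv[Double]("input.tsv"), List(100.0, 250.0))
        .typedSink(TypedTsv[Double]("output.tsv")) { out =>
          assert(out.toList == List(2.0, 5.0))
        }
        .run
        .finish
    }
  }
}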
What's the best MongoDB driver for Play Framework 2.1?
I am trying ReactiveMongo at the moment, but I cannot find good documentation anywhere and I have my doubts about its future development.
Which driver is the most popular and best supported?
Thanks,
GA
I didn't do any comparison, so I wouldn't claim it's the best, but when I started my current project there was only salat with its Play! plugin. It's quite well documented (see its GitHub wiki) and under active development; I'd say it has production quality. If the documentation isn't enough for you, there are samples of usage in the test suites in the repository.
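To give a flavor, a minimal hedged sketch of salat usage via its DAO layer (the database and collection names are assumed; see the wiki for the real setup):

import com.mongodb.casbah.Imports._
import com.novus.salat._
import com.novus.salat.global._
import com.novus.salat.dao.SalatDAO

case class Person(_id: ObjectId = new ObjectId, name: String)

// DAO wired to an assumed database/collection
object PersonDAO extends SalatDAO[Person, ObjectId](
  collection = MongoConnection()("mydb")("person"))

PersonDAO.insert(Person(name = "GA"))
val found: Option[Person] = PersonDAO.findOne(MongoDBObject("name" -> "GA"))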
You could try Moscale. I implemented this library with my colleagues as part of another proprietary project, and it is used in production now. There is a lack of documentation, but it is extremely useful and very simple. You can look at the tests instead of documentation; there is also a short example of simple usage.