Scala drivers for CouchDB and partial schemas

One question I have about current Scala CouchDB drivers is whether they can work with "partial" schemas. I'll try to explain what I mean: the libraries I've seen all seem to want to do a complete conversion from the JSON docs in the database to a Scala object, handle the Scala object, and convert it back to JSON. This is fine if your application knows everything about that type of object, especially if it is the sole piece of software interacting with that database. However, what if I want to write a little application that only knows about part of the JSON object: for example, what if I'm only interested in a 'mybook' component embedded like this:
{
  _id: "0ea56a7ec317138700743cdb740f555a",
  _rev: "2-3e15c3acfc3936abf10ea4f84a0aeced",
  type: "user",
  profiles: {
    mybook: {
      key: "AGW45HWH",
      secret: "g4juh43ui9hg929gk4"
    },
    .. 6 or 7 other profiles
  },
  .. lots of other stuff
}
I really don't want to convert the whole JSON AST to a Scala object. On the other hand, in CouchDB you must save back the entire JSON doc, so this needs to be preserved somehow. I think what I really want is something like this:
class MyBook {
  private val userJson: JObject = ... // full JSON retrieved from the database
  lazy val _id: String = ...          // parsed from the JSON
  lazy val _rev: String = ...         // parsed from the JSON
  lazy val key: String = ...          // parsed from the JSON
  lazy val secret: String = ...       // (ditto)
  def withSecret(secret: String): MyBook = ... // new object with altered userJson
  def save(db: CouchDB) = ...         // save userJson back to CouchDB
}
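Concretely, such a wrapper can be hand-rolled on top of a JSON AST. Here is a minimal sketch, assuming lift-json (whose JObject the snippet above already references); the class and method names are the hypothetical ones from the snippet, not any existing driver's API:
import net.liftweb.json._
import net.liftweb.json.JsonDSL._

class MyBook(private val userJson: JObject) {
  private implicit val formats = DefaultFormats

  lazy val _id: String    = (userJson \ "_id").extract[String]
  lazy val _rev: String   = (userJson \ "_rev").extract[String]
  lazy val key: String    = (userJson \ "profiles" \ "mybook" \ "key").extract[String]
  lazy val secret: String = (userJson \ "profiles" \ "mybook" \ "secret").extract[String]

  // 'merge' deep-merges the patch into the full document, so all the
  // fields this class knows nothing about are carried along untouched.
  def withSecret(newSecret: String): MyBook = {
    val patch: JObject = ("profiles" -> ("mybook" -> ("secret" -> newSecret)))
    new MyBook(userJson merge patch)
  }
}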
Advantages:
computationally cheaper to extract only needed fields
don't have to sync with database evolution except for 'mybook' part
more suitable for development with partial schemas
safer, because there is less chance of inadvertently deleting fields if we didn't keep up with the database schema
Disadvantages:
domain objects in Scala are not pristinely independent of couch/JSON
more memory use per object
Is this possible with any of the current Scala drivers? With either scouchdb or the new Sohva library, it seems not.

As long as you have a good JSON library and a good HTTP client library, implementing a schemaless CouchDB client library is really easy.
Here is an example in Java: code, tests.
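The same idea is only a handful of lines in Scala as well. A rough sketch, assuming spray-json for the AST; httpGet and httpPut are hypothetical stand-ins for whatever HTTP client you pick:
import spray.json._
import DefaultJsonProtocol._

// Hypothetical HTTP helpers; replace with your client of choice.
def httpGet(url: String): String = ???
def httpPut(url: String, body: String): Unit = ???

def load(dbUrl: String, id: String): JsObject =
  httpGet(s"$dbUrl/$id").parseJson.asJsObject

def save(dbUrl: String, doc: JsObject): Unit = {
  val id = doc.fields("_id").convertTo[String]
  // the whole document goes back, so fields you never looked at survive
  httpPut(s"$dbUrl/$id", doc.compactPrint)
}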

My CouchDB library uses spray-json for (de)serialization, which is very flexible and would enable you to ignore parts of a document but still save it. Let's look at a simplified example.
Say we have a document like this:
{
  dontcare: {
    ...
  },
  important: "foo"
}
Then you could declare a class to hold information from this document and define how the conversion is done:
case class Dummy(js: JsValue)
case class PartialDoc(dontcare: Dummy, important: String)

implicit object DummyFormat extends JsonFormat[Dummy] {
  override def read(js: JsValue): Dummy = Dummy(js)
  override def write(d: Dummy): JsValue = d.js
}

implicit val productFormat = jsonFormat2(PartialDoc)
This will ignore anything in dontcare but still save it as a raw JSON AST. Of course this example is not as complex as the one in your question, but it should give you an idea of how to solve your problem.
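A round trip might then look like this (a sketch; assumes the formats above are in scope together with import spray.json._ and DefaultJsonProtocol._):
val json = """{ "dontcare": { "deeply": ["nested", "stuff"] }, "important": "foo" }""".parseJson
val doc = json.convertTo[PartialDoc]      // dontcare is kept as an opaque JsValue
val updated = doc.copy(important = "bar") // touch only the field we know about
updated.toJson                            // dontcare serializes back unchanged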

Related

Scala.js - Convert Uint8Array to Array[Byte]

How do I implement the following method in Scala.js?
import scala.scalajs.js
def toScalaArray(input: js.typedarray.Uint8Array): Array[Byte] =
  ??? // code in question
tl;dr:
input.view.map(_.toByte).toArray
Original answer
I'm not intimately familiar with Scala-js, but I can elaborate on some of the questions that came up in the comments, and improve upon your self-answer.
Also I don't quite get why I need toByte calls
class Uint8Array extends Object with TypedArray[Short, Uint8Array]
Scala treats a Uint8Array as a collection of Short, whereas you are expecting it to be a collection of Byte
Uint8Array's toArray method notes:
This member is added by an implicit conversion from Uint8Array to
IterableOps[Short] performed by method iterableOps in scala.scalajs.js.LowestPrioAnyImplicits.
So the method is returning an Array[Short] which you then .map to convert the Shorts to Bytes.
In your answer you posted
input.toArray.map(_.toByte)
which is technically correct, but it has the downside of allocating an intermediate array of Shorts. To avoid this allocation, you can perform the .map operation on a .view of the array, then call .toArray on the view.
Views in Scala (and by extension Scala.js) are lightweight objects that reference an original collection plus some kind of transformation/filtering function, which can be iterated like any other collection. You can compose many transformation/filters on a view without having to allocate intermediate collections to represent the results. See the docs page (linked) for more.
input.view.map(_.toByte).toArray
Depending on how you intend to pass the resulting value around, you may not even need to call .toArray. For example if all you need to do is iterate the elements later on, you could just pass the view around as an Iterable[Byte] without ever having to allocate a separate array.
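For instance, a hypothetical toBytes helper could expose the view directly:
import scala.scalajs.js

// Returns a lazy view: elements are converted on access and no array
// (intermediate or final) is ever allocated.
def toBytes(input: js.typedarray.Uint8Array): Iterable[Byte] =
  input.view.map(_.toByte)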
All the current answers require iterating over the array in user space.
Scala.js has optimizer-supported conversions for typed arrays (in fact, an Array[Byte] is backed by a typed array in modern configs). You'll likely get better performance by doing this:
import scala.scalajs.js.typedarray._

def toScalaArray(input: Uint8Array): Array[Byte] = {
  // Create a view as Int8 on the same underlying data.
  new Int8Array(input.buffer, input.byteOffset, input.length).toArray
}
The additional new Int8Array is necessary to re-interpret the underlying buffer as signed (the Byte type is signed). Only then will Scala.js provide the built-in conversion to Array[Byte].
When looking at the generated code, you'll see that no user-space loop is necessary: the built-in slice method is used to copy the TypedArray. This will almost certainly not be beatable in terms of performance by any user-space loop.
$c_Lhelloworld_HelloWorld$.prototype.toScalaArray__sjs_js_typedarray_Uint8Array__AB = (function(input) {
  var array = new Int8Array(input.buffer, $uI(input.byteOffset), $uI(input.length));
  return new $ac_B(array.slice())
});
If we compare this with the currently accepted answer (input.view.map(_.toByte).toArray) we see quite a difference (comments mine):
$c_Lhelloworld_HelloWorld$.prototype.toScalaArray__sjs_js_typedarray_Uint8Array__AB = (function(input) {
  var this$2 = new $c_sjs_js_IterableOps(input);
  var this$5 = new $c_sc_IterableLike$$anon$1(this$2);
  // We need a function
  var f = new $c_sjsr_AnonFunction1(((x$1$2) => {
    var x$1 = $uS(x$1$2);
    return ((x$1 << 24) >> 24)
  }));
  new $c_sc_IterableView$$anon$1();
  // Here's the view: So indeed no intermediate allocations.
  var this$8 = new $c_sc_IterableViewLike$$anon$6(this$5, f);
  var len = $f_sc_TraversableOnce__size__I(this$8);
  var result = new $ac_B(len);
  // This function actually will traverse.
  $f_sc_TraversableOnce__copyToArray__O__I__V(this$8, result, 0);
  return result
});
import scala.scalajs.js

def toScalaArray(input: js.typedarray.Uint8Array): Array[Byte] =
  input.toArray.map(_.toByte)

Extract Json from Spray POST as string, not by marshaling to entity

There is an existing question that has much of what I'm after:
Extracting Raw JSON as String inside a Spray POST route
But it stops short without explaining how to get the actual Json string representation out of the Directive[String]. I'm trying to send Json data to Kafka as a string (which the Kafka Producer serializes), so I'm trying to extract the Json in string form. I will do the marshalling to an entity at the other end in the Kafka consumer. The answer link above gets me close:
def rawJson = extract { _.request.entity.asString }
case "value2" => rawJson { json => /* use the json */ }
But I end up with Directive[String]. How do I get the String out?
The example you referenced should work. You would use the rawJson directive they defined to wrap your inner route, and the json string would be made available within that inner route.
In the example below, personJson is a String, extracted from the body of the request via the rawJson directive and made available to the inner route where the rest of the work is done.
def rawJson = extract { _.request.entity.asString }

val personRoute = {
  (post & path("persons")) {
    rawJson { personJson =>
      onSuccess(personService.addPerson(personJson)) { personWithId =>
        complete(StatusCodes.Created, personWithId)
      }
    }
  }
}
I came up with the following syntax, which accomplishes my need to extract the JSON in string form. At first I thought it inefficient to unmarshal and then remarshal again, but then I realized that this provides a form of immediate JSON validation. But there may be more efficient ways to do that.
The API is all Spray. handleWith uses an implicit conversion to the RawWeatherData case class.
path("weather"/"data"/"json") {
handleWith { rawRecord: RawWeatherData =>
val rawJsonStr = rawRecord.toJson.toString
kafkaJsonRecordIngest(rawJsonStr)
rawRecord
}
}
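For reference, the implicit conversion handleWith relies on can be supplied by spray-json support; a sketch, assuming RawWeatherData is a simple case class (the two field names here are hypothetical):
import spray.json._
import spray.httpx.SprayJsonSupport._

case class RawWeatherData(stationId: String, temperature: Double)

object WeatherJsonProtocol extends DefaultJsonProtocol {
  // provides the Unmarshaller/Marshaller pair that handleWith needs
  implicit val rawWeatherDataFormat = jsonFormat2(RawWeatherData)
}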

How to generate a unique ID for a class instance in Scala?

I have a class that needs to write to a file to interface with some legacy C++ application.
Since it will be instantiated several times in a concurrent manner,
it is a good idea to give the file a unique name.
I could use System.currentTimeMillis or hashCode, but there is the possibility of collisions.
Another solution is to put a var field inside a companion object.
As an example, the code below shows one such class using that last solution, but I am not sure it is the best way to do it (for one thing, the unsynchronized counter does not look thread-safe):
case class Test(id: Int, data: Seq[Double]) {
  // several methods writing files...
}

object Test {
  var counter = 0
  def new_Test(data: Seq[Double]) = {
    counter += 1
    new Test(counter, data)
  }
}
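As an aside, a thread-safe variant of the counter approach might use AtomicInteger (a sketch against the Test case class above):
import java.util.concurrent.atomic.AtomicInteger

object Test {
  private val counter = new AtomicInteger(0)

  // incrementAndGet is atomic, so concurrent callers always get distinct ids
  def newTest(data: Seq[Double]) = Test(counter.incrementAndGet(), data)
}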
Did you try this:
def uuid = java.util.UUID.randomUUID.toString
See UUID javadoc, and also How unique is UUID? for a discussion of uniqueness guarantee.
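For the file-naming use case that could look like this (the prefix and extension are arbitrary):
// collision-resistant file name for each instance
val fileName = s"legacy-interface-${java.util.UUID.randomUUID}.dat"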
it is a good idea to give the file a unique name
Since all you want is a file, not an id, the best solution is to create a file with a unique name, not a class with a unique id.
You could use File.createTempFile:
val uniqFile = File.createTempFile("myFile", ".txt", new File("/home/user/my_dir"))
Vladimir Matveev mentioned that there is a better solution in Java 7 and later - Files.createTempFile:
val uniqPath = Files.createTempFile(Paths.get("/home/user/my_dir"), "myFile", ".txt")

Setting the property of an object using a variable value in Scala

I'm trying to create a RESTful method to update data in the database; I'm using Scala on the Play! framework. I have a model called Application, and I want to be able to update an application in the database. The PUT request only requires the id of the application you want to update, plus the optional properties you want to update.
So in my routes I have this:
PUT /v1/auth/application controllers.Auth.update_application(id: Long)
The method I currently have is this:
def update_application(id: Long) = Action { implicit request =>
  var app = Application.find(id)
  for ((key, value) <- request.queryString) {
    app."$key" = value(0) // not valid Scala; Groovy-style dynamic assignment, see below
    //app.name = value(0)
  }
  val update = Application.update(id, app)
  Ok(generate(
    Map(
      "status" -> "success",
      "data" -> update
    )
  )).as("application/json")
}
In the method above I am looping through the request's query string as key/value pairs and setting the corresponding properties on the app object, which is then saved through the model. I know an easier way would be to build a map from the request parameters and construct the object from that, but I am doing it this way for learning purposes. I'm new to Play! and Scala, barely a week in.
Is there a way to set a property dynamically from a variable like that? The line at the end of the loop is how I would update an object's property in Groovy, so I'm looking for the equivalent in Scala. If Scala can't do this, what is the best way to go about accomplishing this task? Reflection? I don't want to over-complicate things.
Play! provides play.api.data.Forms, which allows creating a Form that uses the apply() and unapply() methods of a case class. You can then call form.fill(app) to create a form with the initial state, bindFromRequest(request) to update the form with the request parameters, and get() to obtain a new instance with updated fields.
Define the form only once:
val appForm = Form(
  mapping(
    "field1" -> text,
    "field2" -> number
  )(Application.apply)(Application.unapply)
)
Then in your controller:
val app = appForm.fill(Application.find(id)).bindFromRequest(request).get
val update = Application.update(id, app)
Check out Constructing complex objects from the documentation.

How to test methods based on Salat with ScalaTest

I'm writing a web app using Play 2 and Salat (for MongoDB binding). I would like to test some methods in the Lesson model (for instance, test that I retrieve the right lesson by id). The problem is that I don't want to pollute my current DB with dummy lessons. How can I use a fake DB with Salat and ScalaTest? Here is one of my test files. It creates two lessons, inserts them into the DB, and runs some tests on them.
class LessonSpec extends FlatSpec with ShouldMatchers {

  object FakeApp extends FakeApplication()

  val newLesson1 = Lesson(
    title = "lesson1",
    level = 5,
    explanations = "expl1",
    questions = Seq.empty)
  LessonDAO.insert(newLesson1)

  val newLesson2 = Lesson(
    title = "lesson2",
    level = 5,
    explanations = "expl2",
    questions = Seq.empty)
  LessonDAO.insert(newLesson2)

  "Lesson Model" should "be retrieved by level" in {
    running(FakeApp) {
      assert(Lesson.findByLevel(5).size === 2)
    }
  }

  it should "be of size 0 if no lesson of the level is found" in {
    running(FakeApp) {
      Lesson.findByLevel(4) should be(Nil)
    }
  }

  it should "be retrieved by title" in {
    running(FakeApp) {
      Lesson.findOneByTitle("lesson1") should be(Some(Lesson("lesson1", 5, "expl1", List())))
    }
  }
}
I searched on the web but I can't find a good link or project that uses Salat and ScalaTest.
Salat developer here. My recommendation would be to have a separate test only database. You can populate it with test data to put your test database in a known state - see the casbah tests for how to do this - and then test against it however you like, clearing out collections as necessary.
I use specs2, not scalatest, but the principle is the same - see the source code for the Salat tests.
Here's a good test to get you started:
https://github.com/novus/salat/blob/master/salat-core/src/test/scala/com/novus/salat/test/dao/SalatDAOSpec.scala
Note that in my base spec I clear out my test database - this gets run before each spec:
trait SalatSpec extends Specification with Logging {
  override def is =
    Step {
      // log.info("beforeSpec: registering BSON conversion helpers")
      com.mongodb.casbah.commons.conversions.scala.RegisterConversionHelpers()
      com.mongodb.casbah.commons.conversions.scala.RegisterJodaTimeConversionHelpers()
    } ^
      super.is ^
      Step {
        // log.info("afterSpec: dropping test MongoDB '%s'".format(SalatSpecDb))
        MongoConnection().dropDatabase(SalatSpecDb)
      }
}
And then in SalatDAOSpec, I run my tests inside scopes which create, populate and/or clear out individual collections so that the tests can run in an expected state. One hitch: if you run your tests concurrently on the same collection, they may fail due to unexpected state. The solution is either to run your tests in isolated special purpose collections, or to force your tests to run in series so that operations on a single collection don't step on each other as different test cases modify the collection.
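With specs2, forcing serial execution can be as simple as declaring the spec sequential; a minimal sketch:
import org.specs2.mutable.Specification

class LessonDAOSpec extends Specification {
  sequential // examples run one at a time, so collection state stays predictable

  "LessonDAO" should {
    "find lessons by level after an insert" in {
      // exercise the collection here without interference from other examples
      ok
    }
  }
}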
If you post to the Scalatest mailing list (http://groups.google.com/group/scalatest-users), I'm sure someone can recommend the correct way to set this up.
In my applications, I use a parameter in application.conf to specify the Mongo database name. When initializing my FakeApplication, I override that parameter so that my unit tests can use a real Mongo instance but do not see any of my production data.
Glossing over a few details specific to my application, my tests look something like this:
// wipe any existing data
db.collectionNames.foreach { colName =>
  if (colName != "system.indexes") db.getCollection(colName).drop
}

app = FakeApplication(additionalConfiguration = Map("mongo.db.name" -> "unit-test"))
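On the application side, the corresponding lookup might be (a sketch; the "myapp" fallback is a hypothetical default):
import com.mongodb.casbah.MongoConnection
import play.api.Play.current

// Resolve the database name from Play config: under the FakeApplication
// above this yields "unit-test"; in production, whatever application.conf says.
val dbName = current.configuration.getString("mongo.db.name").getOrElse("myapp")
val db = MongoConnection()(dbName)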