I have a class which I want to begin indexing into Elasticsearch using the Scala client elastic4s. I have extended DocumentMap to allow me to insert the documents. Simple values like String and Int work, but I cannot seem to get a List of another class to map correctly.
The documents look similar to this:
case class AThing(UserName: String, Comment: String, Time: String)
  extends DocumentMap {
  override def map: Map[String, Any] = Map(
    "UserName" -> UserName,
    "Comment" -> Comment,
    "Time" -> Time
  )
}

case class ThingsThatHappened(Id: String, Things: Seq[AThing] = Nil)
  extends DocumentMap {
  override def map: Map[String, Any] = Map(
    "Id" -> Id,
    "Things" -> Things
  )
}
It maps the Id field fine, but when the document is inserted into Elasticsearch I get an incorrect value which looks similar to this:
List(AThing(user_name_a,typed_in_comment,2015-03-12))
Obviously this is wrong, and I am expecting something akin to this JSON structure once it has been inserted into Elasticsearch:
"events" : [
{
"UserName" :"user_name_a",
"Comment": "typed_in_comment",
"Time": "2015-03-12"
}
]
Does anyone know a way to map an array of complex types when indexing data using elastic4s?
Elastic4s (and the underlying Java client) isn't currently smart enough to figure out that you have a nested sequence or array, but it would work if it were a nested Java map (still a bit rubbish from the Scala point of view).
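For illustration, a minimal sketch of that nested-Java-map workaround (my sketch, not from the answer), reusing each AThing's own map via JavaConverters:

import scala.collection.JavaConverters._

case class ThingsThatHappened(Id: String, Things: Seq[AThing] = Nil)
  extends DocumentMap {
  // Convert each nested AThing to a java.util.Map so the Java client
  // serializes it as an object instead of calling toString on it
  override def map: Map[String, Any] = Map(
    "Id" -> Id,
    "Things" -> Things.map(_.map.asJava).asJava
  )
}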
I think the best thing to do is to use the new Indexable typeclass that was added in 1.4.13.
So, given
case class AThing(UserName: String, Comment: String, Time: String)
Then create an instance of the typeclass and bring it into scope:
implicit object AThingIndexable extends Indexable[AThing] {
  def json = ... // create the JSON here using Jackson or similar, which will handle nested sequences properly
}
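For concreteness, a minimal sketch of such an instance using Jackson's Scala module; this assumes elastic4s 1.4.x, where Indexable lives in com.sksamuel.elastic4s.source and exposes json(t: T): String:

import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.sksamuel.elastic4s.source.Indexable

// One shared mapper; DefaultScalaModule teaches Jackson about case
// classes and Scala collections, so a nested Seq[AThing] serializes
// as a proper JSON array of objects
object JsonMapper {
  val mapper = new ObjectMapper().registerModule(DefaultScalaModule)
}

implicit object AThingIndexable extends Indexable[AThing] {
  override def json(t: AThing): String = JsonMapper.mapper.writeValueAsString(t)
}

The same one-liner works for an Indexable[ThingsThatHappened], which is where the nested Seq[AThing] actually bites.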
Then you should be able to do:
client.execute { index into "myIndex/AThings" source aThing }
It's not quite as automatic as using the DocumentMap but gives you more control.
See a unit test here with it in action
First of all, you need to create the index in elastic4s; I assume you have done this.
client.execute {
  create index "myIndex" mappings (
    "AThings" as (
      "UserName" typed StringType,
      "Comment" typed StringType,
      "Time" typed StringType
    )
  )
}
If you create this index, you can then put the case class into it directly.
val aThing = AThing("username", "comment", "time")
client.execute { index into "myIndex/AThings" doc aThing }
Related
I am trying to prevent empty values from being inserted into my MongoDB collection. The field in question looks like this:
MongoDB Field
"stadiumArr" : [
"Old Trafford",
"El Calderon",
...
]
Sample of (mapped) case class
case class FormData(_id: Option[BSONObjectID], stadiumArr: Option[List[String]], ..)
Sample of Scala form
object MyForm {
  val form = Form(
    mapping(
      "_id" -> ignored(Option.empty[BSONObjectID]),
      "stadiumArr" -> optional(list(text)),
      ...
    )(FormData.apply)(FormData.unapply)
  )
}
I am also using the Repeated Values functionality in Play Framework like so:
Play Template
@(myForm: Form[models.db.FormData])(implicit request: RequestHeader, messagesProvider: MessagesProvider)

@import helper._

@repeatWithIndex(myForm("stadiumArr"), min = 5) { (stadium, idx) =>
  @inputText(stadium, '_label -> ("stadium #" + (idx + 1)))
}
This ensures that at least 5 input boxes are created, whether or not the array holds 5 values. However, if one (or more) of the input boxes is empty when the form is submitted, an empty string is still added as a value in the array, e.g.:
"stadiumArr" : [
"Old Trafford",
"El Calderon",
"",
"",
""
]
Based on some other ways of converting types from/to the database, I've tried playing around with a few solutions, such as:
implicit val arrayWrite: Writes[List[String]] = new Writes[List[String]] {
  def writes(list: List[String]): JsValue = Json.arr(list.filterNot(_.isEmpty))
}
.. but this isn't working. Any ideas on how to prevent empty values from being inserted into the database collection?
Without knowing the specific versions or libraries you're using it's hard to give an answer, but since you linked to the Play 2.6 documentation I'll assume that's what you're using, and I'll also assume you're on the ReactiveMongo library. Whether or not you're using the Play plugin for that library is why I'm giving you two different answers here:
With that library and no plugin, you'll have defined a BSONDocumentReader and a BSONDocumentWriter for your case class. These might be auto-generated for you with macros or written by hand, but however you get them, they have useful methods you can use to transform the reads/writes you have into new ones. So, let's say I defined a reader and writer for you like this:
import reactivemongo.bson._
case class FormData(_id: Option[BSONObjectID], stadiumArr: Option[List[String]])
implicit val formDataReaderWriter = new BSONDocumentReader[FormData] with BSONDocumentWriter[FormData] {
  def read(bson: BSONDocument): FormData = {
    FormData(
      _id = bson.getAs[BSONObjectID]("_id"),
      stadiumArr = bson.getAs[List[String]]("stadiumArr").map(_.filterNot(_.isEmpty))
    )
  }
  def write(formData: FormData) = {
    BSONDocument(
      "_id" -> formData._id,
      "stadiumArr" -> formData.stadiumArr
    )
  }
}
Great, you say, that works! You can see that in the read I went ahead and filtered out any empty strings, so even if they're in the data they can be cleaned up. That's nice and all, but notice that I didn't do the same for the write. I did that so I can show you how to use a useful method called afterWrite. So pretend the reader and writer weren't the same class and were separate; then I can do this:
val initialWriter = new BSONDocumentWriter[FormData] {
  def write(formData: FormData) = {
    BSONDocument(
      "_id" -> formData._id,
      "stadiumArr" -> formData.stadiumArr
    )
  }
}

implicit val cleanWriter = initialWriter.afterWrite { bsonDocument =>
  val fixedField = bsonDocument.getAs[List[String]]("stadiumArr").map(_.filterNot(_.isEmpty))
  bsonDocument.remove("stadiumArr") ++ BSONDocument("stadiumArr" -> fixedField)
}
Note that cleanWriter is the implicit one; that means when the insert call on the collection happens, it is the one that will be used.
Now, that's all a bunch of work. If you're using the plugin/module for Play that lets you use JSONCollections, then you can get by with just defining Play JSON Reads and Writes. If you look at the documentation you'll see that the Reads trait has a useful map function you can use to transform one Reads into another.
So, you'd have:
val jsonReads = Json.reads[FormData]
implicit val cleanReads = jsonReads.map(formData => formData.copy(stadiumArr = formData.stadiumArr.map(_.filterNot(_.isEmpty))))
And again, because only the clean Reads is implicit, the collection methods for mongo will use that.
NOW, all of that said, doing this at the database level is one thing, but really, I personally think you should be dealing with this at your Form level.
val form = Form(
  mapping(
    "_id" -> ignored(Option.empty[BSONObjectID]),
    "stadiumArr" -> optional(list(text)),
    ...
  )(FormData.apply)(FormData.unapply)
)
Mainly because, surprise surprise, Form has a way to deal with this - specifically, the Mapping class itself. If you look there you'll find a transform method you can use to filter out empty values easily. Just call it on the mapping you need to modify, for example:
"stadiumArr" -> optional(
list(text).transform(l => l.filter(_.nonEmpty), l => l.filter(_.nonEmpty))
)
To explain a little more about this method, in case you're not used to reading the signatures in the scaladoc:
def transform[B](f1: (T) ⇒ B, f2: (B) ⇒ T): Mapping[B]
This says that by calling transform on a Mapping[T] you can create a new Mapping[B]. In order to do this you must provide functions that convert in each direction. So the code above turns the list mapping (a Mapping[List[String]]) into another Mapping[List[String]] (the type does not change here), removing any empty elements as it does. If I break this code down a little it might be clearer:
def convertFromTtoB(list: List[String]): List[String] = list.filter(_.nonEmpty)
def convertFromBtoT(list: List[String]): List[String] = list.filter(_.nonEmpty)
...
list(text).transform(convertFromTtoB, convertFromBtoT)
You might be wondering why you need to provide both functions. The reason is that when you call Form.fill and the form is populated with values, the second function is called so that the data goes into the format the Play form expects. This is more obvious if the type actually changes. For example, if you had a text area where people could enter CSV but you wanted to map it to a form model with a proper List[String], you might do something like:
def convertFromTtoB(raw: String): List[String] = raw.split(",").filter(_.nonEmpty)
def convertFromBtoT(list: List[String]): String = list.mkString(",")
...
text.transform(convertFromTtoB, convertFromBtoT)
Note that when I've done this in the past, I've sometimes had to write a separate method and pass it in if I didn't want to fully specify all the types, but you should be able to work from here given the documentation and the type signature for the transform method on Mapping.
The reason I suggest doing this in the form binding is that the form/controller should be the one concerned with dealing with your user data and cleaning things up. But you can always have multiple layers of cleaning; it's not bad to be safe!
I've gone for this (which always seems obvious when it's written and tested):
implicit val arrayWrite: Writes[List[String]] = new Writes[List[String]] {
  def writes(list: List[String]): JsValue = Json.toJson(list.filterNot(_.isEmpty).toIndexedSeq)
}
But I would be interested to know how to .map the existing Reads rather than redefining it from scratch, as @cchantep suggests.
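For illustration, one way to do that mapping (a sketch using the stock Reads.list, not from the thread):

import play.api.libs.json._

// Start from the default Reads[List[String]] and post-process the
// successfully parsed value, dropping empty entries
implicit val cleanStadiumReads: Reads[List[String]] =
  Reads.list[String].map(_.filterNot(_.isEmpty))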
I'm trying to write a function that aggregates data from my MongoDB using ReactiveMongo (0.12), with Play JSON serialisation (similar to this question).
So this is what I have:
def getPopAggregate(col: JSONCollection) = {
  import col.BatchCommands.AggregationFramework.{AggregationResult, Group, Match, SumField}

  col.aggregate(
    Group(JsString("$rstId"))("totalPopulation" -> SumField("population")),
    List(Match(Json.obj("totalPopulation" -> Json.obj("$gte" -> 1000))))
  ).map(_.firstBatch)
}
This outputs a Future[List[JsObject]], but I want to map the results to a list of my case class (i.e. Future[Seq[PopAggregate]]).
case class PopAggregate(rstId: Option[BSONObjectID], totalPopulation: Double)

object PopAggregate {
  implicit val popAggregateFormat = Json.format[PopAggregate]
}
I hope someone can spare a moment to help me past this one. Many thanks!
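For what it's worth, a minimal sketch of one way to finish this (my assumption, not from the post): parse each JsObject in the batch with a hand-written Reads. Note the group key comes back under "_id" rather than "rstId", so the derived format won't line up as-is, and the BSONObjectID format comes from the reactivemongo-play-json import:

import scala.concurrent.{ExecutionContext, Future}
import play.api.libs.json._
import play.api.libs.functional.syntax._
import reactivemongo.bson.BSONObjectID
import reactivemongo.play.json._ // supplies the Format[BSONObjectID] used below

// The aggregation returns the group key as "_id", so read that field
val popAggregateReads: Reads[PopAggregate] = (
  (__ \ "_id").readNullable[BSONObjectID] and
  (__ \ "totalPopulation").read[Double]
)(PopAggregate.apply _)

def getPopAggregates(col: JSONCollection)(implicit ec: ExecutionContext): Future[Seq[PopAggregate]] =
  getPopAggregate(col).map(_.map(_.as[PopAggregate](popAggregateReads)))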
I am currently integrating part of our system with MongoDB and we decided to use the official scala driver for it.
We have a case class with joda.DateTime parameters:
case class Schema(templateId: Muid,
                  createdDate: DateTime,
                  updatedDate: DateTime,
                  columns: Seq[Column])
We also defined a Format for it:
implicit lazy val checklistSchemaFormat: Format[Schema] = (
  (__ \ "templateId").format[Muid] and
  (__ \ "createdDate").format[DateTime] and
  (__ \ "updatedDate").format[DateTime] and
  (__ \ "columns").format[Seq[Column]]
)((Schema.apply _), unlift(Schema.unapply))
When I serialize this object to JSON and write it to Mongo, createdDate and updatedDate get converted to Long (which is technically fine). This is how we do it:
val bsonDoc = Document(Json.toJson(schema).toString())
collection(DbViewSchemasCollectionName).replaceOne(filter, bsonDoc, new UpdateOptions().upsert(true)).subscribe(new Observer[UpdateResult] {
  override def onNext(result: UpdateResult): Unit = logger.info(s"Successfully updated checklist template schema with result: $result")
  override def onError(e: Throwable): Unit = logger.info(s"Failed to update checklist template schema with error: $e")
  override def onComplete(): Unit = {}
})
As a result, Mongo holds this type of object:
{
  "_id": ObjectId("56fc4247eb3c740d31b04f05"),
  "templateId": "gaQP3JIB3ppJtro9rO9BAw",
  "createdDate": NumberLong(1459372615507),
  "updatedDate": NumberLong(1459372615507),
  "columns": [
    ...
  ]
}
Now, I am trying to read it like so:
collection(DbViewSchemasCollectionName).find(filter).first().head() map { document =>
  ChecklistSchema.checklistSchemaFormat reads Json.parse(document.toJson()) match {
    case JsSuccess(value, _) => Some(value)
    case JsError(e) => throw JsResultException(e)
  }
} recover { case e =>
  logger.info("ERROR " + e)
  None
}
And at this point the read always fails, since createdDate and updatedDate now look like this:
"createdDate" : { "$numberLong" : "1459372615507" }, "updatedDate" :
{"$numberLong" : "1459372615507" }
How do I deal with this situation? Is there an easier conversion between bson.Document and JsObject? Or am I digging in completely the wrong direction?
Thanks,
You can use the following approach to resolve your issue.
First, I used json4s for reading and writing JSON to case classes. For example:
import org.joda.time.DateTime
import org.json4s.DefaultFormats
import org.json4s.ext.JodaTimeSerializers
import org.json4s.jackson.Serialization.{read, write}
import org.mongodb.scala.Document

// json4s needs an implicit Formats in scope; JodaTimeSerializers (json4s-ext) handles the joda DateTime fields
implicit val formats = DefaultFormats ++ JodaTimeSerializers.all

case class User(_id: Option[Int], username: String, firstName: String, createdDate: DateTime, updatedDate: DateTime)

// A small wrapper to convert a case class to a Document
def toBson[A <: AnyRef](x: A): Document = Document(write[A](x))

def today() = DateTime.now

val user = User(Some(212), "binkabir", "lacmalndl", today, today)
val bson = toBson(user)
usersCollection.insertOne(bson).subscribe((x: Completed) => println(x))

val f = usersCollection.find(equal("_id", 212)).toFuture()
f.map(_.head.toJson).map(x => read[User](x)).foreach(println)
The code above creates a User case class, converts it to a Document, saves it to Mongo, queries the DB and prints the returned User case class. I hope this makes sense!
To answer your second question (bson.Document <-> JsObject) - YES; this is a solved problem, check out Play2-ReactiveMongo, which makes it seem like you're storing/retrieving JsObject instances - plus it's fully asynchronous and super-easy to get going in a Play application.
You can even go a step further and use a library like Mondrian (full disclosure: I wrote it!) to get the basic CRUD operations on top of ReactiveMongo for your Play-JSON domain case-classes.
Obviously I'm biased, but I think these solutions are a great fit if you've already defined your models as case classes in Play - you can forget about the whole BSONDocument family and stick to Json._ and JsObject etc that you already know well.
EDIT:
At the risk of further downvotes, I will demonstrate how I would go about storing and retrieving the OP's Schema object, using Mondrian. I'm going to show pretty-much everything for completeness; bear in mind you've actually already done most of this. Your final code will have fewer lines than your current code, as you'd expect when you use a library.
Model Objects
There's a Column class here whose definition is never shown; for simplicity, let's just say it's:
case class Column(name: String, width: Int)
Now we can get on with the Schema, which is just:
import com.themillhousegroup.mondrian._
case class Schema(_id: Option[MongoId],
                  createdDate: DateTime,
                  updatedDate: DateTime,
                  columns: Seq[Column]) extends MongoEntity
So far we've just implemented the MongoEntity trait, which simply required the templateId field to be renamed and given the required type.
JSON Converters
import com.themillhousegroup.mondrian._
import play.api.libs.json._
import play.api.libs.functional.syntax._
object SchemaJson extends MongoJson {
  implicit lazy val columnFormat = Json.format[Column]

  // Pick one - easy:
  implicit lazy val schemaFormat = Json.format[Schema]

  // Pick one - backwards-compatible (uses "templateId"):
  implicit lazy val checklistSchemaFormat: Format[Schema] = (
    (__ \ "templateId").formatNullable[MongoId] and
    (__ \ "createdDate").format[DateTime] and
    (__ \ "updatedDate").format[DateTime] and
    (__ \ "columns").format[Seq[Column]]
  )((Schema.apply _), unlift(Schema.unapply))
}
The JSON converters are standard Play-JSON stuff; we pick up the MongoId Format by extending MongoJson. I've shown two different ways of defining the Format for Schema. If you have clients out in the wild using templateId (or if you prefer it) then use the second, more verbose declaration.
Service layer
For brevity I'll skip the application configuration; you can read the Mondrian README.md for that. Let's define the SchemaService that is responsible for persistence operations on Schema instances:
import com.themillhousegroup.mondrian._
import SchemaJson._
class SchemaService extends TypedMongoService[Schema]("schemas")
That's it. We've linked the model object, the name of the MongoDB collection ("schemas") and (implicitly) the necessary converters.
Saving a Schema and finding a Schema based on some criteria
Now we start to realize the value of Mondrian. save and findOne are standard operations - we get them for free in our Service, which we inject into our controllers in the standard way:
class SchemaController @Inject() (schemaService: SchemaService) extends Controller {
  ...

  // Returns a Future[Boolean]
  schemaService.save(mySchema).map { saveOk =>
    ...
  }

  ...

  // Define a filter criteria using standard Play-JSON:
  val targetDate = new DateTime()
  val criteria = Json.obj("createdDate" -> Json.obj("$gt" -> targetDate.getMillis))

  // Returns a Future[Option[Schema]]
  schemaService.findOne(criteria).map { maybeFoundSchema =>
    ...
  }
}
So there we go. No sign of the BSON family, just the Play JSON that, as you say, we all know and love. You'll only need to reach for the Mongo documentation when you need to construct a JSON query (that $gt stuff), although in some cases you can use Mondrian's overloaded findOne(example: Schema) method if you are just looking for a simple object match, and avoid even that :-)
We are in the process of migrating an existing REST service from Spring/Java to Spray, using ReactiveMongo. One of the requirements for the migration (the first phase of it, anyway) is that all inputs and outputs must match the current system. The issue is that the business objects allow null values - both at rest in the datastore and when returned in GET methods on the service. Fields can be missing as input to the service for PUT/POST, but the corresponding values must still be written to the datastore as null and returned the same way.
Normally 'not required' fields aren't an issue for Scala/Spray thanks to Option, but the issue I'm having is actually writing the values of the Option fields as null when persisting, and setting the fields to None when reading the same null back from Mongo.
In the research I've been doing, I have not been able to find a way to do this.
Here are snippets of my code:
UserPersistent
case class UserPersistent(id: Option[String], name: Option[String])
PersistentUser
object PersistentUser {
  implicit object PersistentUserReader extends BSONDocumentReader[UserPersistent] {
    def read(doc: BSONDocument): UserPersistent = UserPersistent(
      id = doc.getAs[String]("_id"),
      name = doc.getAs[String]("name")
    )
  }
  implicit object PersistentUserWriter extends BSONDocumentWriter[UserPersistent] {
    override def write(persisted: UserPersistent): BSONDocument = {
      BSONDocument(
        "_id" -> persisted.id,
        "name" -> persisted.name
      )
    }
  }
}
I have tried the following on the write() side; although the code compiles and runs, it throws a NullPointerException when executed:
"name" -> {
val nnn = persisted.name match {
case Some(n) => n
case _ => null
}
nnn
}
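(For reference, a minimal sketch of one way around that NPE: write an explicit BSONNull rather than a raw null. The stringOrNull helper is hypothetical:)

import reactivemongo.bson.{BSONDocument, BSONNull, BSONString, BSONValue}

// Hypothetical helper: fold the Option into a concrete BSONValue,
// producing BSONNull when the field is absent
def stringOrNull(o: Option[String]): BSONValue =
  o.fold[BSONValue](BSONNull)(BSONString(_))

BSONDocument(
  "_id" -> stringOrNull(persisted.id),
  "name" -> stringOrNull(persisted.name)
)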
I have used OptionFormat for the 'presentation' of the data, which returns nulls (but for everything), but I need to take care of the Mongo side of this.
Surely there's a way to do this - what am I missing?
Try this:
object PersistentUser {
  implicit val reader: BSONDocumentReader[UserPersistent] = Macros.reader[UserPersistent]
  implicit val writer: BSONDocumentWriter[UserPersistent] = Macros.writer[UserPersistent]
}
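A quick usage sketch: the macro-generated handlers round-trip the Option fields, reading a missing field back as None.

import PersistentUser._

val user = UserPersistent(Some("42"), None)

// Serialize via the macro-generated writer...
val doc = writer.write(user)

// ...and read it back: the absent "name" field becomes None
val roundTripped = reader.read(doc)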
My database looks like
[
  {
    name: "domenic",
    records: {
      today: 5,
      yesterday: 1.5
    }
  },
  {
    name: "bob",
    records: { ... }
  }
]
When I try queries like
val result: Option[DBObject] = myCollection.findOne(
  MongoDBObject("name" -> "domenic"),
  MongoDBObject("records" -> 1)
)

val records = result.get.getAs[BasicDBObject]("records").get
grater[Map[String, Number]].asObject(records)
it fails (at runtime!) with
GRATER GLITCH - unable to find or instantiate a grater using supplied path name
REASON: Class scala.collection.immutable.Map is an interface
Context: 'global'
Path from pickled Scala sig: 'scala.collection.immutable.Map'
I think I could make this work by creating a case class whose only field is a Map[String, Number] and then getting its property. Is that really necessary?
grater doesn't take a collection as a type argument, only a case class or a trait/abstract class whose concrete representations are case classes. Since you're just querying for a map, just extract the values you need out of the DBObject using getAs[T].
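For illustration, a minimal sketch of that extraction (my sketch, assuming Casbah's Imports are in scope and the records values are numeric):

import com.mongodb.casbah.Imports._
import scala.collection.JavaConverters._

// Pull the nested "records" object out and walk its keys directly,
// instead of asking Salat to instantiate a Map
val records: Map[String, Number] =
  result
    .flatMap(_.getAs[DBObject]("records"))
    .map(dbo => dbo.keySet.asScala.map(k => k -> dbo.get(k).asInstanceOf[Number]).toMap)
    .getOrElse(Map.empty)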
Number may not be a supported type in Salat - I've certainly never tried it. If you need Number you can write a custom transformer or send a pull request to add real support to Salat.