I'm writing a Scala web application that uses MongoDB as its database and ReactiveMongo as the driver.
I have a collection named recommendation.correlation in which I store the correlation between a product and a category.
A document has the following form:
{ "_id" : ObjectId("544f76ea4b7f7e3f6e2db224"), "category" : "c1", "attribute" : "c3:p1", "value" : { "average" : 0, "weight" : 3 } }
Now I'm writing a method as follows:
def calculateCorrelation: Future[Boolean] = {
def calculate(category: String, tag: String, similarity: List[Similarity]): Future[(Double, Int)] = {
println("Calculate correlation of " + category + " " + tag)
val value = similarity.foldLeft(0.0, 0)( (r, c) => if(c.tag1Name.split(":")(0) == category && c.tag2Name == tag) (r._1 + c.eq, r._2 + 1) else r
) //fold the tags
val sum = value._1
val count = value._2
val result = if(count > 0) (sum/count, count) else (0.0, 0)
Future{result}
}
play.Logger.debug("Start Correlation")
Similarity.all.toList flatMap { tagsMatch =>
val tuples =
for {
i<- tagsMatch
} yield (i.tag1Name.split(":")(0), i.tag2Name) // create e List[(String, String)] containing the category and productName
val res = tuples map { el =>
calculate(el._1, el._2, tagsMatch) flatMap { value =>
val correlation = Correlation(el._1, el._2, value._1, value._2) // create the correlation
val query = Json.obj("category" -> value._1, "attribute" -> value._2)
Correlations.find(query).one flatMap(element => element match {
case Some(x) => Correlations.update(query, correlation) flatMap {status => status match {
case LastError(ok, _, _, _, _, _, _) => Future{true}
case _ => Future{false}
}
}
case None => Correlations.save(correlation) flatMap {status => status match {
case LastError(ok, _, _, _, _, _, _) => Future{true}
case _ => Future{false}
}
}
}
)
}
}
val result = if(res.exists(_ equals false)) false else true
Future{result}
}
The problem is that the method inserts duplicate documents.
Why does this happen?
I've worked around it using db.recommendation.correlation.ensureIndex({"category": 1, "attribute": 1}, {"unique": true, "dropDups":true }), but how can I fix the problem without using indexes?
What's wrong?
What you want to do is an in-place update. To do that with ReactiveMongo you need to use an update operator to tell it which fields to update, and how. Instead, you've passed correlation (which I assume is some sort of BSONDocument) to the collection's update method. That simply requests replacement of the whole document, which, if the unique index value is different, will cause a new document to be added to the collection. Instead of passing correlation you should pass a BSONDocument that uses one of the update operators, such as $set (set a field) or $inc (increment a numeric field by a given amount). For details on doing that, please see the MongoDB Documentation, Modify Document.
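To make the difference between replacing a document and applying $set concrete, here is a minimal sketch in plain Scala. It does not use ReactiveMongo at all: Doc, replace and set are hypothetical helpers that only model the semantics on an in-memory Map, and the flat field names are an assumption made for brevity.

```scala
object UpdateSketch {
  // Hypothetical in-memory model of a stored document (not a driver type).
  type Doc = Map[String, Any]

  // Models a whole-document replacement: only _id survives from the stored
  // document, every other field comes from the replacement.
  def replace(stored: Doc, replacement: Doc): Doc =
    replacement + ("_id" -> stored("_id"))

  // Models a $set-style update: listed fields are overwritten,
  // all other stored fields are kept.
  def set(stored: Doc, fields: Doc): Doc =
    stored ++ fields

  def main(args: Array[String]): Unit = {
    val stored: Doc = Map("_id" -> 1, "category" -> "c1", "attribute" -> "c3:p1", "weight" -> 3)
    println(set(stored, Map("weight" -> 4)))     // category and attribute are kept
    println(replace(stored, Map("weight" -> 4))) // category and attribute are gone
  }
}
```

With the real driver, the equivalent of set would be an update document whose top-level key is $set; the exact builder calls depend on the ReactiveMongo version in use.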
I have a Seq[String] in Scala, and if the Seq contains certain Strings, I append a relevant message to another list.
Is there a more 'scalaesque' way to do this, rather than a series of if statements appending to a list like I have below?
val result = new ListBuffer[Err]()
val malformedParamNames = // A Seq[String]
if (malformedParamNames.contains("$top")) result += IntegerMustBePositive("$top")
if (malformedParamNames.contains("$skip")) result += IntegerMustBePositive("$skip")
if (malformedParamNames.contains("modifiedDate")) result += FormatInvalid("modifiedDate", "yyyy-MM-dd")
...
result.toList
If you want to use some of Scala's collection sugar, I would use:
sealed trait Err
case class IntegerMustBePositive(msg: String) extends Err
case class FormatInvalid(msg: String, format: String) extends Err
val malformedParamNames = Seq[String]("$top", "aa", "$skip", "ccc", "ddd", "modifiedDate")
val result = malformedParamNames.map { v =>
v match {
case "$top" => Some(IntegerMustBePositive("$top"))
case "$skip" => Some(IntegerMustBePositive("$skip"))
case "modifiedDate" => Some(FormatInvalid("modifiedDate", "yyyy-MM-dd"))
case _ => None
}
}.flatten
result.toList
Be warned: if you ask for a Scala-esque way of doing things, there are many possibilities.
The map function combined with flatten can be simplified by using flatMap:
sealed trait Err
case class IntegerMustBePositive(msg: String) extends Err
case class FormatInvalid(msg: String, format: String) extends Err
val malformedParamNames = Seq[String]("$top", "aa", "$skip", "ccc", "ddd", "modifiedDate")
val result = malformedParamNames.flatMap {
case "$top" => Some(IntegerMustBePositive("$top"))
case "$skip" => Some(IntegerMustBePositive("$skip"))
case "modifiedDate" => Some(FormatInvalid("modifiedDate", "yyyy-MM-dd"))
case _ => None
}
result
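As a further variation, collect takes a partial function and drops the names it is not defined for, so neither Some/None nor the catch-all case is needed. The sketch below is one possible shape; the CollectErrors object and the errorsFor helper are names introduced here for illustration only.

```scala
object CollectErrors {
  sealed trait Err
  final case class IntegerMustBePositive(msg: String) extends Err
  final case class FormatInvalid(msg: String, format: String) extends Err

  // collect = filter + map in one pass: names without a matching case are skipped.
  def errorsFor(names: Seq[String]): List[Err] =
    names.collect {
      case n @ ("$top" | "$skip") => IntegerMustBePositive(n)
      case "modifiedDate"         => FormatInvalid("modifiedDate", "yyyy-MM-dd")
    }.toList

  def main(args: Array[String]): Unit =
    println(errorsFor(Seq("$top", "aa", "$skip", "ccc", "ddd", "modifiedDate")))
}
```

Like the map and flatMap versions, this follows the order of the input and emits one error per occurrence, so call .distinct on the input first if a name can appear more than once.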
The most 'scalaesque' version I can think of, while keeping it readable, would be:
val map = scala.collection.immutable.ListMap(
"$top" -> IntegerMustBePositive("$top"),
"$skip" -> IntegerMustBePositive("$skip"),
"modifiedDate" -> FormatInvalid("modifiedDate", "yyyy-MM-dd"))
val result = for {
(k,v) <- map
if malformedParamNames contains k
} yield v
//or
val result2 = map.filterKeys(malformedParamNames.contains).values.toList
Benoit's is probably the most scala-esque way of doing it, but depending on who's going to be reading the code later, you might want a different approach.
// Some type definitions omitted
val malformations = Seq[(String, Err)](
("$top", IntegerMustBePositive("$top")),
("$skip", IntegerMustBePositive("$skip")),
("modifiedDate", FormatInvalid("modifiedDate", "yyyy-MM-dd"))
)
If you need a list and the order is significant:
val result = (malformations.foldLeft(List.empty[Err]) { (acc, pair) =>
if (malformedParamNames.contains(pair._1)) {
pair._2 +: acc // prepend a single element to the list for faster performance
} else acc
}).reverse // and reverse since we were prepending
If the order isn't significant (although if the order's not significant, you might consider wanting a Set instead of a List):
val result = (malformations.foldLeft(Set.empty[Err]) { (acc, pair) =>
if (malformedParamNames.contains(pair._1)) {
acc + pair._2 // add a single element to the set
} else acc
}).toList // omit the .toList if you're OK with just a Set
If the predicates in the repeated ifs are more complex/less uniform, then the type for malformations might need to change, as they would if the responses changed, but the basic pattern is very flexible.
In this solution we define a list of mappings that take your IF condition and THEN statement in pairs and we iterate over the inputted list and apply the changes where they match.
// IF THEN
case class Operation(matcher :String, action :String)
def processInput(input :List[String]) :List[String] = {
val operations = List(
Operation("$top", "integer must be positive"),
Operation("$skip", "skip value"),
Operation("$modify", "modify the date")
)
input.flatMap { in =>
operations.find(_.matcher == in).map { _.action }
}
}
println(processInput(List("$skip","$modify", "$skip")));
A breakdown
operations.find(_.matcher == in) // find an operation in our
// list matching the input we are
// checking. Returns Some or None
.map { _.action } // if some, replace input with action
// if none, do nothing
input.flatMap { in => // inputs are processed, converted
// to some(action) or none and the
// flatten removes the some/none
// returning just the strings.
I've run into a read/write problem over the last few days and cannot fix an issue in my test. I have a JSON document based on the following model:
package models
import play.api.libs.json._
object Models {
case class Record
(
id : Int,
samples : List[Double]
)
object Record {
implicit val recordFormat = Json.format[Record]
}
}
I have two functions: one to read a record and another one to update it.
case class MongoIO(futureCollection : Future[JSONCollection]) {
def readRecord(id : Int) : Future[Option[Record]] =
futureCollection
.flatMap { collection =>
collection.find(Json.obj("id" -> id)).one[Record]
}
def updateRecord(id : Int, newSample : Double) : Future[UpdateWriteResult]= {
readRecord(id) flatMap { recordOpt =>
recordOpt match {
case None =>
Future { UpdateWriteResult(ok = false, -1, -1, Seq(), Seq(), None, None, None) }
case Some(record) =>
val newRecord =
record.copy(samples = record.samples :+ newSample)
futureCollection
.flatMap { collection =>
collection.update(Json.obj("id" -> id), newRecord)
}
}
}
}
}
Now, I have a List[Future[UpdateWriteResult]] that corresponds to many updates on the document, but what I want is this: wait for the first future to complete before executing the second one, then wait for the second to complete before executing the third. I tried to do that with foldLeft and flatMap like this:
val l : List[Future[UpdateWriteResult]] = ...
println(l.size) // give me 10
l
.foldLeft(Future.successful(UpdateWriteResult(ok = false, -1, -1, Seq(), Seq(), None, None, None))) {
case (cur, next) => cur.flatMap(_ => next)
}
but the document is never updated as expected: instead of a document with a samples list of size 10, I get a list with a single sample. So the read seems to be faster than the write (that is my impression), and the combination of foldLeft / flatMap does not seem to wait for the current future to complete. How can I fix this properly (without Await)?
Update
val futureCollection = DB.getCollection("foo")
val mongoIO = MongoIO(futureCollection)
val id = 1
val samples = List(1.1, 2.2, 3.3)
val l : List[Future[UpdateWriteResult]] = samples.map(sample => mongoIO.updateRecord(id, sample))
You have to do the foldLeft on the samples:
(mongoIO.updateRecord(id, samples.head) /: samples.tail) {(acc, next) =>
acc.flatMap(_ => mongoIO.updateRecord(id, next))
}
updateRecord is what triggers the future, so you have to make sure not to call it until the previous one finishes.
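The point about eager futures can be reproduced without MongoDB. The following is a minimal sketch in which an in-memory list stands in for the stored document; SequentialUpdates, the simplified updateRecord, updateAll, reset and current are all hypothetical names, and the Thread.sleep only simulates latency between the read and the write.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequentialUpdates {
  // Hypothetical stand-in for the samples list stored in the document.
  @volatile private var stored: List[Double] = Nil

  // Stand-in for mongoIO.updateRecord: read the list, then write back list :+ sample.
  def updateRecord(sample: Double): Future[Unit] = Future {
    val read = stored
    Thread.sleep(10) // simulated latency between read and write
    stored = read :+ sample
  }

  // Fold over the samples, not over already-created futures:
  // the next update is only started inside flatMap, after the previous one completed.
  def updateAll(samples: List[Double]): Future[Unit] =
    samples.foldLeft(Future.successful(())) { (acc, sample) =>
      acc.flatMap(_ => updateRecord(sample))
    }

  def reset(): Unit = { stored = Nil }
  def current: List[Double] = stored

  def main(args: Array[String]): Unit = {
    // Await is used here only so the demo does not exit before the futures finish.
    Await.result(updateAll(List(1.1, 2.2, 3.3)), 5.seconds)
    println(current)
  }
}
```

Building the list with samples.map(updateRecord) instead would start every read-then-write at once, which is the lost-update behaviour described in the question.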
I took the liberty of making some minor modifications to your original sources:
case class MongoIO(futureCollection : Future[JSONCollection]) {
def readRecord(id : Int) : Future[Option[Record]] =
futureCollection.flatMap(_.find(Json.obj("id" -> id)).one[Record])
def updateRecord(id : Int, newSample : Double) : Future[UpdateWriteResult] =
readRecord(id) flatMap {
case None => Future.successful(UpdateWriteResult(ok = false, -1, -1, Nil, Nil, None, None, None))
case Some(record) =>
val newRecord = record.copy(samples = record.samples :+ newSample)
futureCollection.flatMap(_.update(Json.obj("id" -> id), newRecord))
}
}
Then we only need to write a serialized/sequential update:
def update(samples: Seq[Double], id: Int): Future[Unit] = samples match {
case sample +: remainingSamples => updateRecord(id, sample).flatMap(_ => update(remainingSamples, id))
case _ => Future.successful(())
}
I have a method with a parameter of type Future[List[MyRes]]. MyRes has two Option fields, id and name. Now I want to create a map from id to name when both are present. I am able to create the map with default values as follows, but I don't want default values; I just want to skip any entry where either field is empty.
def myMethod(myRes: Future[List[MyRes]]): Future[Map[Long, String]] = {
myRes.map (
_.map(
o =>
(o.id match {
case Some(id) => id.toLong
case _ => 0L
}) ->
(o.name match {
case Some(name) => name
case _ => ""
})
).toMap)
}
Any suggestion?
You are looking for collect :)
myRes.map {
_.iterator
.map { r => r.id -> r.name }
.collect { case (Some(id), Some(name)) => id -> name }
.toMap
}
If your MyRes thingy is a case class, then you don't need the first .map:
myRes.map {
_.collect { case MyRes(Some(id), Some(name)) => id -> name }
.toMap
}
collect is like .map, but it takes a PartialFunction, and skips over elements on which it is not defined. It is kinda like your match statement but without the defaults.
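Here is a small self-contained sketch of that behaviour on a plain List, leaving the Future out. The shape of MyRes (id: Option[Long], name: Option[String]) is an assumption, since the question does not show the class, and toIdNameMap is a hypothetical helper name.

```scala
object CollectDemo {
  // Assumed shape: the question only says MyRes has two optional fields.
  final case class MyRes(id: Option[Long], name: Option[String])

  // Entries where either field is None do not match the pattern and are skipped.
  def toIdNameMap(rs: List[MyRes]): Map[Long, String] =
    rs.collect { case MyRes(Some(id), Some(name)) => id -> name }.toMap

  def main(args: Array[String]): Unit =
    println(toIdNameMap(List(
      MyRes(Some(1L), Some("a")),
      MyRes(None, Some("b")),
      MyRes(Some(3L), None)
    )))
}
```

Inside the original method the same function would simply be applied under the Future, as in myRes.map(toIdNameMap).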
Update:
If I am reading your comment correctly, and you want to log a message when either field is a None, collect won't help with that, but you can do flatMap:
myRes.map {
_.flatMap {
case MyRes(Some(id), Some(name)) => Some(id -> name)
case x => logger.warn(s"Missing fields in $x."); None
}
.toMap
}
Try this:
def myMethod(myRes: Future[List[MyRes]]): Future[Map[Long, String]] = {
myRes.map (
_.flatMap(o =>
(for (id <- o.id; name <- o.name) yield (id.toLong -> name)).toList
).toMap
)
}
The trick is flattening List[Option[(Long,String)]] by using flatMap and converting the Option to a List.
I'm parallelising over a collection to count the number of identical item values in a List. The list in this case is uniqueSetOfLinks:
for (iListVal <- uniqueSetOfLinks.par) {
try {
val num : Int = listOfLinks.count(_.equalsIgnoreCase(iListVal))
linkTotals + iListVal -> num
}
catch {
case e : Exception => {
e.printStackTrace()
}
}
}
linkTotals is an immutable Map. To gain a reference to the total number of links, do I need to change linkTotals so that it is mutable?
I can then do something like :
linkTotals.put(iListVal, num)
You can't update an immutable collection; all you can do is combine the immutable collection with an additional element to get a new immutable collection, like this:
val newLinkTotals = linkTotals + (iListVal -> num)
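A short sketch of what that line does, with hypothetical names (ImmutableMapDemo, withCount): the original map is left untouched and the result must be captured. The parentheses also matter, because + and -> have the same precedence, so linkTotals + iListVal -> num is parsed as (linkTotals + iListVal) -> num, a tuple that is then thrown away.

```scala
object ImmutableMapDemo {
  // Returns a new map; `totals` itself is never modified.
  def withCount(totals: Map[String, Int], link: String, n: Int): Map[String, Int] =
    totals + (link -> n)

  def main(args: Array[String]): Unit = {
    val linkTotals = Map("a.html" -> 2)
    val newLinkTotals = withCount(linkTotals, "b.html", 5)
    println(linkTotals)    // still contains only a.html
    println(newLinkTotals) // contains both entries
  }
}
```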
For a whole collection, you could create a new collection of pairs and then add all the pairs to the map:
val optPairs =
for (iListVal <- uniqueSetOfLinks.par)
yield
try {
val num : Int = listOfLinks.count(_.equalsIgnoreCase(iListVal))
Some(iListVal -> num)
}
catch {
case e : Exception => e.printStackTrace()
None
}
val newLinkTotals = linkTotals ++ optPairs.flatten // for non-empty initial map
val map = optPairs.flatten.toMap // in case there is no initial map
Note that you are using parallel collections (.par), so you should not use mutable state, like linkTotals += iListVal -> num.
A possible variation of @senia's answer (getting rid of the explicit flatten):
val optPairs =
(for {
iListVal <- uniqueSetOfLinks.par
count <- {
try
Some(listOfLinks.count(_.equalsIgnoreCase(iListVal)))
catch {
case e: Exception =>
e.printStackTrace()
None
}
}
} yield iListVal -> count) toMap
I think that you need some form of MapReduce in order to count the items in parallel.
In your problem you already have all unique links. The partial intermediate result of map is simply a pair. And "reduce" is just toMap. So you can simply par-map the link to pair (link-> count) and then finally construct a map:
def count(iListVal:String) = listOfLinks.count(_.equalsIgnoreCase(iListVal))
val listOfPairs = uniqueSetOfLinks.par.map(iListVal => Try( (iListVal, count(iListVal)) ))
("map" operation is par-map)
Then remove exceptions:
val clearListOfPairs = listOfPairs.flatMap(_.toOption)
And then simply convert it to a map ("reduce"):
val linkTotals = clearListOfPairs.toMap
(if you need to inspect the exceptions instead of discarding them, keep the Failure values, for example via Try#failed)
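Put together, the three steps might look like the sketch below. It is deliberately sequential so that it runs on any Scala version without extra dependencies (on Scala 2.13 and later, .par lives in the separate scala-parallel-collections module); LinkCounts and countLinks are hypothetical names.

```scala
import scala.util.Try

object LinkCounts {
  def countLinks(uniqueLinks: Seq[String], allLinks: Seq[String]): Map[String, Int] =
    uniqueLinks
      .map(link => Try(link -> allLinks.count(_.equalsIgnoreCase(link)))) // "map" step
      .flatMap(_.toOption)                                               // drop failed computations
      .toMap                                                             // "reduce" step

  def main(args: Array[String]): Unit =
    println(countLinks(Seq("a.html", "b.html"), Seq("A.html", "a.html", "b.html")))
}
```

To parallelise it, uniqueLinks.par.map(...) can be swapped in for the first step, provided the parallel collections module is available; the rest of the pipeline stays the same.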
I am having problems writing this function, which takes a string and returns a list of strings associated to it.
(I'm expecting entries like {_id: ...., hash: "abcde", n: ["a","b","ijojoij"]} in mongodb)
def findByHash(hash: Hash) = {
val dbobj = mongoColl.findOne(MongoDBObject("hash" -> hash.hashStr))
val n = dbobj match {
case Some(doc: com.mongodb.casbah.Imports.DBObject) => {
doc("n") match {
case Some(n: com.mongodb.casbah.Imports.DBObject) => {
Some(List[String]() ++ n map { x => x.asInstanceOf[String] })
}
case _ => {
None // hash match but no n in object
}
}
}
case _ => {
None // no hash match
}
}
n
}
Is there anything wrong with the code? Do you know how to correct it?
doc("n") returns AnyRef, so you should explicitly cast it to BasicDBList.
val n = doc("n").asInstanceOf[BasicDBList]
Some(List[String]() ++ n map { x => x.asInstanceOf[String] })