I have run into a read/write problem over the last few days and I cannot fix an issue in my test. I have a JSON document based on the following model:
package models
import play.api.libs.json._
object Models {
case class Record
(
id : Int,
samples : List[Double]
)
object Record {
implicit val recordFormat = Json.format[Record]
}
}
I have two functions: one to read a record and another one to update it.
case class MongoIO(futureCollection : Future[JSONCollection]) {
def readRecord(id : Int) : Future[Option[Record]] =
futureCollection
.flatMap { collection =>
collection.find(Json.obj("id" -> id)).one[Record]
}
def updateRecord(id : Int, newSample : Double) : Future[UpdateWriteResult]= {
readRecord(id) flatMap { recordOpt =>
recordOpt match {
case None =>
Future { UpdateWriteResult(ok = false, -1, -1, Seq(), Seq(), None, None, None) }
case Some(record) =>
val newRecord =
record.copy(samples = record.samples :+ newSample)
futureCollection
.flatMap { collection =>
collection.update(Json.obj("id" -> id), newRecord)
}
}
}
}
}
Now I have a List[Future[UpdateWriteResult]] that corresponds to many updates on the same document. What I want is to wait for the first future to complete before executing the second one, then wait for the second to complete before executing the third, and so on. I tried to do that with foldLeft and flatMap like this:
val l : List[Future[UpdateWriteResult]] = ...
println(l.size) // give me 10
l
.foldLeft(Future.successful(UpdateWriteResult(ok = false, -1, -1, Seq(), Seq(), None, None, None))) {
case (cur, next) => cur.flatMap(_ => next)
}
but the document is never updated as expected: instead of a document with a samples list of size 10, I get a list with 1 sample. My impression is that the reads are faster than the writes, and that the foldLeft / flatMap combination does not wait for the current future to complete. How can I fix this issue properly (without Await)?
Update
val futureCollection = DB.getCollection("foo")
val mongoIO = MongoIO(futureCollection)
val id = 1
val samples = List(1.1, 2.2, 3.3)
val l : List[Future[UpdateWriteResult]] = samples.map(sample => mongoIO.updateRecord(id, sample))
You have to do the foldLeft on the samples:
(mongoIO.updateRecord(id, samples.head) /: samples.tail) {(acc, next) =>
acc.flatMap(_ => mongoIO.updateRecord(id, next))
}
updateRecord is what triggers the future, so you have to make sure not to call it until the previous one finishes.
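To make the point concrete, here is a minimal sketch that swaps MongoDB for an in-memory store. The Store class, the update and updateAll functions, and the simulated latency are all hypothetical stand-ins and not part of ReactiveMongo; the only thing the sketch is meant to show is that the next update is created inside flatMap, so its read cannot start before the previous write has finished.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequentialUpdates {
  // Hypothetical in-memory stand-in for the Mongo document.
  final class Store {
    @volatile private var samples: List[Double] = Nil
    def read(): List[Double] = samples
    def write(s: List[Double]): Unit = { samples = s }
  }

  // Mimics updateRecord: a non-atomic read-modify-write that runs asynchronously.
  def update(store: Store, sample: Double): Future[Unit] =
    Future {
      val current = store.read()
      Thread.sleep(20) // simulated latency between the read and the write
      store.write(current :+ sample)
    }

  // Folds over the samples (not over futures), so each update is only
  // started once the previous one has completed.
  def updateAll(store: Store, samples: List[Double]): Future[Unit] =
    samples.foldLeft(Future.successful(())) { (acc, sample) =>
      acc.flatMap(_ => update(store, sample))
    }

  def main(args: Array[String]): Unit = {
    val store = new Store
    // Await is used here only so the demo does not exit early.
    Await.result(updateAll(store, List(1.1, 2.2, 3.3)), 5.seconds)
    println(store.read())
  }
}
```

Mapping the samples to futures first, as in the question, would start all the read-modify-write cycles at once, and each one would likely read the same initial state.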
I took the liberty of making some minor modifications to your original sources:
case class MongoIO(futureCollection : Future[JSONCollection]) {
def readRecord(id : Int) : Future[Option[Record]] =
futureCollection.flatMap(_.find(Json.obj("id" -> id)).one[Record])
def updateRecord(id : Int, newSample : Double) : Future[UpdateWriteResult] =
readRecord(id) flatMap {
case None => Future.successful(UpdateWriteResult(ok = false, -1, -1, Nil, Nil, None, None, None))
case Some(record) =>
val newRecord = record.copy(samples = record.samples :+ newSample)
futureCollection.flatMap(_.update(Json.obj("id" -> id), newRecord))
}
}
Then we only need to write a serialized/sequential update:
def update(samples: Seq[Double], id: Int): Future[Unit] = samples match {
case sample +: remainingSamples => updateRecord(id, sample).flatMap(_ => update(remainingSamples, id))
case _ => Future.successful(())
}
I have a list of integers as input, and I would like to store the intermediate result of every comparison in a Scala match expression in a ListBuffer. How can I achieve that?
Below is the code that I have written. Currently I am only able to store the result of the last comparison, not the intermediate ones.
import scala.collection.mutable.ListBuffer
object HelloWorld {
def main(args: Array[String]) {
var stor = ListBuffer[String]()
val inpLst = List(1, 2, 2, 2, 1)
for (i <- inpLst) {
stor = i match {
case 1 => "ok"
case 2 => "notok"
}
}
println(stor)
}
}
This is the output that I want.
opList = List("ok","notok","notok","notok","ok")
@Mario Galic's answer is a good approach to your problem. If you still insist on writing it your own way, below is how to do it.
import scala.collection.mutable.ListBuffer
val inpLst = List(1,2,2,2,1)
val stor = ListBuffer.empty[String]
for (i <- inpLst) {
val str = i match {
case 1 => "ok"
case 2 => "notOk"
}
stor += str
}
println(stor)
This outputs:
ListBuffer(ok, notOk, notOk, notOk, ok)
Instead of using ListBuffer, consider mapping over the List like so:
inpLst.map {
case 1 => "ok"
case 2 => "notok"
case _ => "unknown"
}
which outputs
res0: List[String] = List(ok, notok, notok, notok, ok)
Applying Krzysztof's suggestion, we could omit case _ => "unknown" if we use collect like so
List(1,2,3).collect {
case 1 => "ok"
case 2 => "notok"
}
which outputs
res1: List[String] = List(ok, notok)
If you only need to convert the list into "ok" or "notok" values, you can use the map function:
val inpLst = List(1, 2, 2, 2, 1)
inpLst.map{
case 1 => "ok"
case 2 => "notok"
}
Otherwise, if you need an intermediate container at each step of processing the list, you can use foldLeft:
inpLst.foldLeft(List.empty[String]){
(buffer: List[String], i: Int) =>
buffer ++ (i match {
case 1 => List("ok")
case 2 => List("notok")
})
}
At each step, buffer contains the results of the match for all previous elements.
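If the intermediate states themselves have to be kept, and not only the final list, scanLeft is one option. The following is a small sketch under that assumption; the names label and steps are made up for the illustration, and the "unknown" fallback is an addition to avoid a MatchError on other inputs.

```scala
object IntermediateResults {
  // Maps a code to its label; unknown codes get an explicit fallback.
  def label(i: Int): String = i match {
    case 1 => "ok"
    case 2 => "notok"
    case _ => "unknown"
  }

  // scanLeft returns every intermediate accumulator, starting with the empty list.
  def steps(input: List[Int]): List[List[String]] =
    input.scanLeft(List.empty[String])((buffer, i) => buffer :+ label(i))

  def main(args: Array[String]): Unit =
    steps(List(1, 2, 2)).foreach(println)
}
```

The last element of steps(...) is the same list that the foldLeft version returns.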
Could you please help me in understanding the following method:
def extractGlobalID(custDimIndex :Int)(gaData:DataFrame) : DataFrame = {
val getGlobId = udf[String,Seq[GenericRowWithSchema]](genArr => {
val globId: List[String] =
genArr.toList
.filter(_(0) == custDimIndex)
.map(custDim => custDim(1).toString)
globId match {
case Nil => ""
case x :: _ => x
}
})
gaData.withColumn("globalId", getGlobId('customDimensions))
}
The method applies a UDF to the DataFrame. The UDF seems intended to extract a single ID from a column of type array<struct>, where the first element of each struct is an index and the second one is an ID.
You could rewrite the code to be more readable:
def extractGlobalID(custDimIndex :Int)(gaData:DataFrame) : DataFrame = {
val getGlobId = udf((genArr : Seq[Row]) => {
genArr
.find(_(0) == custDimIndex)
.map(_(1).toString)
.getOrElse("")
})
gaData.withColumn("globalId", getGlobId('customDimensions))
}
or even shorter with collectFirst:
def extractGlobalID(custDimIndex :Int)(gaData:DataFrame) : DataFrame = {
val getGlobId = udf((genArr : Seq[Row]) => {
genArr
.collectFirst{case r if(r.getInt(0)==custDimIndex) => r.getString(1)}
.getOrElse("")
})
gaData.withColumn("globalId", getGlobId('customDimensions))
}
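The lookup inside the UDF can be tried out without Spark. The sketch below models each struct as a plain (Int, String) tuple, which is an assumption made only for the illustration (the real column holds Row values), and applies the same collectFirst / getOrElse logic.

```scala
object GlobalIdLookup {
  // Hypothetical stand-in for one struct of the customDimensions array: (index, value).
  type CustomDimension = (Int, String)

  // Returns the value of the first dimension with the given index, or "" if none matches.
  def globalId(custDimIndex: Int)(dims: Seq[CustomDimension]): String =
    dims
      .collectFirst { case (index, value) if index == custDimIndex => value }
      .getOrElse("")

  def main(args: Array[String]): Unit = {
    val dims = Seq((1, "abc"), (4, "G-123"), (4, "G-456"))
    println(globalId(4)(dims)) // first match wins: G-123
    println(globalId(9)(dims).isEmpty) // no match gives the empty string: true
  }
}
```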
I have case classes of Contact and Person:
case class Contact(id: String, name: String)
case class Person(id: String, name: String, age: Int, contacts: List[Contact])
Let's say I have a list of Person:
val pesonList = List(
Person("1", "john", 30, List(Contact("5","mark"),Contact("6","tamy"),Contact("7","mary"))),
Person("2", "jeff", 40, List(Contact("8","lary"),Contact("9","gary"),Contact("10","sam")))
)
I need to flatten this pesonList and transform it into a list of:
case class FlattenPerson(personId: String, contactId: Option[String], personName: String)
so the results would be:
val flattenPersonList = List(
FlattenPerson("1", None, "john"),
FlattenPerson("1", Some("5"), "mark"),
FlattenPerson("1", Some("6"), "tamy"),
FlattenPerson("1", Some("7"), "mary"),
FlattenPerson("2", None, "jeff"),
FlattenPerson("2", Some("8"), "lary"),
FlattenPerson("2", Some("9"), "gary"),
FlattenPerson("2", Some("10"), "sam")
)
I found one way that looks like it's working but doesn't seem like the right way. It might break, and Scala probably has a more efficient way.
This is what I could come up with:
val people = pesonList.map(person => {
FlattenPerson(person.id, None, person.name)
})
val contacts = pesonList.flatMap(person => {
person.contacts.map(contact => {
FlattenPerson(person.id, Some(contact.id), contact.name)
})
})
val res = people ++ contacts
This would also have bad performance: I need to do it for each API call my app gets, and there can be a lot of calls, plus I need to filter res.
I would love to get some help here.
I think flatMap() can do what you're after.
personList.flatMap{pson =>
FlattenPerson(pson.id, None, pson.name) ::
pson.contacts.map(cntc => FlattenPerson(pson.id, Some(cntc.id), cntc.name))
}
//res0: List[FlattenPerson] = List(FlattenPerson(1,None,john)
// , FlattenPerson(1,Some(5),mark)
// , FlattenPerson(1,Some(6),tamy)
// , FlattenPerson(1,Some(7),mary)
// , FlattenPerson(2,None,jeff)
// , FlattenPerson(2,Some(8),lary)
// , FlattenPerson(2,Some(9),gary)
// , FlattenPerson(2,Some(10),sam))
For reference, here is a recursive version of this algorithm that includes filtering in a single pass. This appears to perform somewhat faster than calling .filter(f) on the result. The non-filtered recursive version has no real performance advantage.
def flattenPeople(people: List[Person], f: FlattenPerson => Boolean): List[FlattenPerson] = {
@annotation.tailrec
def loop(person: Person, contacts: List[Contact], people: List[Person], res: List[FlattenPerson]): List[FlattenPerson] =
contacts match {
case Contact(id, name) :: tail =>
val newPerson = FlattenPerson(person.id, Some(id), name)
if (f(newPerson)) {
loop(person, tail, people, newPerson +: res)
} else {
loop(person, tail, people, res)
}
case _ =>
val newPerson = FlattenPerson(person.id, None, person.name)
val newRes = if (f(newPerson)) newPerson +: res else res
people match {
case p :: tail =>
loop(p, p.contacts, tail, newRes)
case Nil =>
newRes.reverse
}
}
people match {
case p :: tail => loop(p, p.contacts, tail, Nil)
case _ => Nil
}
}
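As a possible middle ground between the flatMap one-liner and the hand-written recursion, the flattening and the filtering can also be fused by going through an Iterator, so that no unfiltered intermediate list is built. This is only a sketch: the case classes are repeated to keep it self-contained, the parameter name keep is made up, and no performance claim is made for it.

```scala
object FlattenPeople {
  final case class Contact(id: String, name: String)
  final case class Person(id: String, name: String, age: Int, contacts: List[Contact])
  final case class FlattenPerson(personId: String, contactId: Option[String], personName: String)

  // Emits the person row followed by one row per contact, lazily,
  // and filters the rows before the result list is materialized.
  def flattenPeople(people: List[Person], keep: FlattenPerson => Boolean): List[FlattenPerson] =
    people.iterator
      .flatMap { p =>
        Iterator.single(FlattenPerson(p.id, None, p.name)) ++
          p.contacts.iterator.map(c => FlattenPerson(p.id, Some(c.id), c.name))
      }
      .filter(keep)
      .toList

  def main(args: Array[String]): Unit = {
    val people = List(
      Person("1", "john", 30, List(Contact("5", "mark"), Contact("6", "tamy")))
    )
    // Keep only the rows that describe a contact.
    flattenPeople(people, _.contactId.isDefined).foreach(println)
  }
}
```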
I have a Seq[String] in Scala, and if the Seq contains certain Strings, I append a relevant message to another list.
Is there a more 'scalaesque' way to do this, rather than a series of if statements appending to a list like I have below?
val result = new ListBuffer[Err]()
val malformedParamNames = // A Seq[String]
if (malformedParamNames.contains("$top")) result += IntegerMustBePositive("$top")
if (malformedParamNames.contains("$skip")) result += IntegerMustBePositive("$skip")
if (malformedParamNames.contains("modifiedDate")) result += FormatInvalid("modifiedDate", "yyyy-MM-dd")
...
result.toList
If you want to use some Scala collection sugar, I would use:
sealed trait Err
case class IntegerMustBePositive(msg: String) extends Err
case class FormatInvalid(msg: String, format: String) extends Err
val malformedParamNames = Seq[String]("$top", "aa", "$skip", "ccc", "ddd", "modifiedDate")
val result = malformedParamNames.map { v =>
v match {
case "$top" => Some(IntegerMustBePositive("$top"))
case "$skip" => Some(IntegerMustBePositive("$skip"))
case "modifiedDate" => Some(FormatInvalid("modifiedDate", "yyyy-MM-dd"))
case _ => None
}
}.flatten
result.toList
Be warned: if you ask for a Scala-esque way of doing things, there are many possibilities.
The map function combined with flatten can be simplified by using flatMap:
sealed trait Err
case class IntegerMustBePositive(msg: String) extends Err
case class FormatInvalid(msg: String, format: String) extends Err
val malformedParamNames = Seq[String]("$top", "aa", "$skip", "ccc", "ddd", "modifiedDate")
val result = malformedParamNames.flatMap {
case "$top" => Some(IntegerMustBePositive("$top"))
case "$skip" => Some(IntegerMustBePositive("$skip"))
case "modifiedDate" => Some(FormatInvalid("modifiedDate", "yyyy-MM-dd"))
case _ => None
}
result
The most 'Scala-esque' version I can think of while keeping it readable would be:
val map = scala.collection.immutable.ListMap(
"$top" -> IntegerMustBePositive("$top"),
"$skip" -> IntegerMustBePositive("$skip"),
"modifiedDate" -> FormatInvalid("modifiedDate", "yyyy-MM-dd"))
val result = for {
(k,v) <- map
if malformedParamNames contains k
} yield v
//or
val result2 = map.filterKeys(malformedParamNames.contains).values.toList
Benoit's is probably the most scala-esque way of doing it, but depending on who's going to be reading the code later, you might want a different approach.
// Some type definitions omitted
val malformations = Seq[(String, Err)](
("$top", IntegerMustBePositive("$top")),
("$skip", IntegerMustBePositive("$skip")),
("modifiedDate", FormatInvalid("modifiedDate", "yyyy-MM-dd"))
)
If you need a list and the order is significant:
val result = (malformations.foldLeft(List.empty[Err]) { (acc, pair) =>
if (malformedParamNames.contains(pair._1)) {
pair._2 :: acc // prepend to list for faster performance
} else acc
}).reverse // and reverse since we were prepending
If the order isn't significant (although if the order's not significant, you might consider wanting a Set instead of a List):
val result = (malformations.foldLeft(Set.empty[Err]) { (acc, pair) =>
if (malformedParamNames.contains(pair._1)) {
acc + pair._2
} else acc
}).toList // omit the .toList if you're OK with just a Set
If the predicates in the repeated ifs are more complex/less uniform, then the type for malformations might need to change, as they would if the responses changed, but the basic pattern is very flexible.
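As an illustration of that last point, here is one way the type could change when the conditions are no longer simple contains checks: each entry pairs an arbitrary predicate with its error. The Rule case class, the check function, and the case-insensitive third rule are hypothetical and only there to show a non-uniform predicate.

```scala
object MalformationRules {
  sealed trait Err
  final case class IntegerMustBePositive(param: String) extends Err
  final case class FormatInvalid(param: String, format: String) extends Err

  // Hypothetical rule type: a predicate over the malformed names plus the error to report.
  final case class Rule(applies: Seq[String] => Boolean, err: Err)

  val rules: List[Rule] = List(
    Rule(names => names.contains("$top"), IntegerMustBePositive("$top")),
    Rule(names => names.contains("$skip"), IntegerMustBePositive("$skip")),
    // A less uniform predicate: match the name regardless of case.
    Rule(names => names.exists(_.equalsIgnoreCase("modifiedDate")),
      FormatInvalid("modifiedDate", "yyyy-MM-dd"))
  )

  // Collects the errors of all rules whose predicate holds, in rule order.
  def check(malformedParamNames: Seq[String]): List[Err] =
    rules.collect { case Rule(applies, err) if applies(malformedParamNames) => err }

  def main(args: Array[String]): Unit =
    check(Seq("$skip", "MODIFIEDDATE", "other")).foreach(println)
}
```

The result order follows the order of the rules, not the order of the input names.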
In this solution we define a list of operations that pair each IF condition with its THEN action, then iterate over the input list and apply the action wherever a condition matches.
// IF THEN
case class Operation(matcher :String, action :String)
def processInput(input :List[String]) :List[String] = {
val operations = List(
Operation("$top", "integer must be positive"),
Operation("$skip", "skip value"),
Operation("$modify", "modify the date")
)
input.flatMap { in =>
operations.find(_.matcher == in).map { _.action }
}
}
println(processInput(List("$skip","$modify", "$skip")));
A breakdown
operations.find(_.matcher == in) // find an operation in our
// list matching the input we are
// checking. Returns Some or None
.map { _.action } // if some, replace input with action
// if none, do nothing
input.flatMap { in => // inputs are processed, converted
// to some(action) or none and the
// flatten removes the some/none
// returning just the strings.
I'm writing a Scala web application that uses MongoDB as its database and ReactiveMongo as the driver.
I have a collection named recommendation.correlation in which I save the correlation between a product and a category.
A document has the following form:
{ "_id" : ObjectId("544f76ea4b7f7e3f6e2db224"), "category" : "c1", "attribute" : "c3:p1", "value" : { "average" : 0, "weight" : 3 } }
Now I'm writing a method as follows:
def calculateCorrelation: Future[Boolean] = {
def calculate(category: String, tag: String, similarity: List[Similarity]): Future[(Double, Int)] = {
println("Calculate correlation of " + category + " " + tag)
val value = similarity.foldLeft(0.0, 0)( (r, c) => if(c.tag1Name.split(":")(0) == category && c.tag2Name == tag) (r._1 + c.eq, r._2 + 1) else r
) //fold the tags
val sum = value._1
val count = value._2
val result = if(count > 0) (sum/count, count) else (0.0, 0)
Future{result}
}
play.Logger.debug("Start Correlation")
Similarity.all.toList flatMap { tagsMatch =>
val tuples =
for {
i<- tagsMatch
} yield (i.tag1Name.split(":")(0), i.tag2Name) // create e List[(String, String)] containing the category and productName
val res = tuples map { el =>
calculate(el._1, el._2, tagsMatch) flatMap { value =>
val correlation = Correlation(el._1, el._2, value._1, value._2) // create the correlation
val query = Json.obj("category" -> value._1, "attribute" -> value._2)
Correlations.find(query).one flatMap(element => element match {
case Some(x) => Correlations.update(query, correlation) flatMap {status => status match {
case LastError(ok, _, _, _, _, _, _) => Future{true}
case _ => Future{false}
}
}
case None => Correlations.save(correlation) flatMap {status => status match {
case LastError(ok, _, _, _, _, _, _) => Future{true}
case _ => Future{false}
}
}
}
)
}
}
val result = if(res.exists(_ equals false)) false else true
Future{result}
}
The problem is that the method inserts duplicate documents.
Why does this happen?
I've solved it using db.recommendation.correlation.ensureIndex({"category": 1, "attribute": 1}, {"unique": true, "dropDups":true }), but how can I fix the problem without using indexes?
What's wrong?
What you want to do is an in-place update. To do that with ReactiveMongo you need to use an update operator to tell it which fields to update, and how. Instead, you've passed correlation (which I assume is some sort of BSONDocument) to the collection's update method. That simply requests replacement of the document, which, if the unique index value is different, will cause a new document to be added to the collection. Instead of passing correlation you should pass a BSONDocument that uses one of the update operators, such as $set (set a field) or $inc (increment a numeric field by a given amount). For details on doing that, please see the MongoDB Documentation, Modify Document
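To illustrate the difference between replacing a document and applying update operators, here is a tiny in-memory simulation. It does not use ReactiveMongo or any MongoDB API: the Doc type and the replace, set and inc functions are hypothetical models of the semantics only (full replacement, $set, and $inc respectively), with documents flattened to a map of numeric fields.

```scala
object UpdateOperatorSketch {
  // Hypothetical flat document model: field name -> numeric value.
  type Doc = Map[String, Double]

  // Full replacement: every field of the stored document is discarded.
  def replace(doc: Doc, replacement: Doc): Doc = replacement

  // Models $set: only the listed fields are overwritten (or added).
  def set(doc: Doc, fields: Doc): Doc = doc ++ fields

  // Models $inc: each listed field is increased by the given amount;
  // a field that does not exist yet is treated as 0.
  def inc(doc: Doc, amounts: Doc): Doc =
    amounts.foldLeft(doc) { case (acc, (field, by)) =>
      acc.updated(field, acc.getOrElse(field, 0.0) + by)
    }

  def main(args: Array[String]): Unit = {
    val stored: Doc = Map("average" -> 0.0, "weight" -> 3.0)
    println(set(stored, Map("average" -> 0.5)))     // weight is kept
    println(inc(stored, Map("weight" -> 1.0)))      // weight goes from 3.0 to 4.0
    println(replace(stored, Map("average" -> 0.5))) // weight is lost
  }
}
```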