I'm in the process of trying to combine some nested calls with reactivemongo in my play2 application.
I get a list of objects returned from createObjects. I then loop over them, check whether each object exists in the collection, and insert it if it does not:
def dostuff() = Action {
implicit request =>
errors => BadRequest(views.html.invite(errors)),
form => {
val objectsReadyForSave = createObjects(form.companyId, form.companyName, sms_pattern.findAllIn(form.phoneNumbers).toSet)
Async {
for(object <- objectsReadyForSave) {
collection.find(BSONDocument("cId" -> object.get.cId,"userId" ->
object.userId.get)) { maybeFound => { found =>"Found record, do not insert")
} getOrElse {
I feel that this approach is not as good as it could be and is not idiomatic "play2" or "reactivemongo".
So my question is: how should I structure my nested calls to get the result I want
and find out which objects have been inserted?

I am not an expert in MongoDB or ReactiveMongo, but it seems that you are trying to use a NoSQL database in the same way as you would use a standard SQL database. Note that ReactiveMongo operations are asynchronous, which means they may complete at some point in the future, and insert/update operations do not return the affected documents. Regarding your questions:
1 To insert the objects if they do not exist and get the information of which objects that have been inserted?
You should probably look at the mongoDB db.collection.update() method and call it with the upsert parameter as true. If you can afford it, this will either update documents if they already exist in database or insert them otherwise. Again, this operation does not return affected documents but you can check how many documents have been affected by accessing the last error. See reactivemongo.api.collections.GenericCollection#update which returns a Future[LastError].
2 For all the objects that are inserted, add them to a list and then return it with the Ok() call.
Once again, inserted/updated documents will not be returned. If you really need to return the complete affected document back, you will need to make another query to retrieve matching documents.
I would probably rewrite your code this way (without error/failure handling):
def dostuff() = Action {
implicit request =>
errors => BadRequest(views.html.invite(errors)),
form => {
val objectsReadyForSave = createObjects(form.companyId, form.companyName, sms_pattern.findAllIn(form.phoneNumbers).toSet)
Async {
val operations = for {
data <- objectsReadyForSave
} yield collection.update(BSONDocument("cId" -> data.cId.get, "userId" -> data.userId.get), data, upsert = true)
Future.sequence(operations).map {
lastErrors =>
Ok("Documents probably inserted/updated!")
See also Scala Futures:
This is really useful! ;)

Here's how I'd rewrite it.
def dostuff() = Action { implicit request =>
errors => BadRequest(views.html.invite(errors)),
form => {
// ...
// In the model
// ...
def ƒ(cId: Option[String], userId: Option[String], logger: Logger) = {
// You need to handle the case where obj.cId or obj.userId are None
collection.find(BSONDocument("cId" -> obj.cId.get, "userId" -> obj.userId.get))
.map { maybeFound =>
maybeFound map { _ =>"Record found, do not insert")
} getOrElse {
There may be some syntax errors, but the idea is there.


Firestore flutter query on multi fields with OR principle

From the docs:
You can also chain multiple where() methods to create more specific queries (logical AND).
How can I perform an OR query?
Give me all documents where the field status is open OR upcoming
Give me all documents where the field status == open OR createdAt <= <somedatetime>
OR isn't supported as it's hard for the server to scale it (requires keeping state to dedup). The work around is to issue 2 queries, one for each condition, and dedup on the client.
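The client-side dedup step of that workaround can be sketched roughly as follows. This is only an illustration on in-memory arrays: the `Doc` shape, the `mergeById` helper and the sample data are assumptions standing in for the results of the two real queries.

```typescript
// Hypothetical document shape; `id` stands in for the Firestore document ID.
interface Doc {
  id: string;
  status: string;
  createdAt: number;
}

// Merge the results of several queries, keeping the first copy of each ID.
function mergeById(...resultSets: Doc[][]): Doc[] {
  const seen = new Set<string>();
  const merged: Doc[] = [];
  for (const results of resultSets) {
    for (const doc of results) {
      if (!seen.has(doc.id)) {
        seen.add(doc.id);
        merged.push(doc);
      }
    }
  }
  return merged;
}

// Stand-ins for the two query results (status == open, createdAt <= 100).
const openDocs: Doc[] = [
  { id: "a", status: "open", createdAt: 50 },
  { id: "b", status: "open", createdAt: 200 },
];
const oldDocs: Doc[] = [
  { id: "a", status: "open", createdAt: 50 },
  { id: "c", status: "closed", createdAt: 80 },
];

// Document "a" satisfies both conditions but appears only once.
const openOrOld = mergeById(openDocs, oldDocs);
```

Keying on the document ID matters here, because a document that satisfies both conditions is returned (and billed) by both queries.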
Edit (Nov 2019):
Cloud Firestore now supports IN queries which are a limited type of OR query.
For the example above you could do:
// Get all documents in 'foo' where status is open or upcoming
However it's still not possible to do a general OR condition involving multiple fields.
With the recent addition of IN queries, Firestore supports "up to 10 equality clauses on the same field with a logical OR"
A possible solution to (1) would be:
documents.where('status', 'in', ['open', 'upcoming']);
See Firebase Guides: Query Operators | in and array-contains-any
I suggest also storing a numeric value for each status:
{ name: "a", statusValue: 10, status: 'open' }
{ name: "b", statusValue: 20, status: 'upcoming' }
{ name: "c", statusValue: 30, status: 'close' }
You can then query with ref.where('statusValue', '<=', 20), and both 'a' and 'b' will be found.
This can reduce your query cost and improve performance.
Note that it does not cover every case.
I would have no "status" field, but status-related fields, updating them to true or false based on the request, like
{ name: "a", status_open: true, status_upcoming: false, status_closed: false}
Alternatively, check Firebase Cloud Functions. You could have a function listening for status changes and updating the status-related properties, like
{ name: "a", status: "open", status_open: true, status_upcoming: false, status_closed: false}
Either way, your query could be just
Hope it helps.
This doesn't solve all cases, but for "enum" fields, you can emulate an "OR" query by making a separate boolean field for each enum-value, then adding a where("enum_<value>", "==", false) for every value that isn't part of the "OR" clause you want.
For example, consider your first desired query:
Give me all documents where the field status is open OR upcoming
You can accomplish this by splitting the status: string field into multiple boolean fields, one for each enum-value:
status_open: bool
status_upcoming: bool
status_suspended: bool
status_closed: bool
To perform your "where status is open or upcoming" query, you then do this:
where("status_suspended", "==", false).where("status_closed", "==", false)
How does this work? Well, because it's an enum, you know one of the values must have true assigned. So if you can determine that all of the other values don't match for a given entry, then by deduction it must match one of the values you originally were looking for.
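To make the deduction concrete, here is a small in-memory sketch of the idea. The helper names (`statusFlags`, `excludedFields`, `matches`) are made up for illustration, and `matches` only imitates what chaining where(field, "==", false) calls would do on the server.

```typescript
type Status = "open" | "upcoming" | "suspended" | "closed";
const ALL_STATUSES: Status[] = ["open", "upcoming", "suspended", "closed"];

// Build the boolean fields stored in place of a single `status` string.
function statusFlags(status: Status): Record<string, boolean> {
  const flags: Record<string, boolean> = {};
  for (const s of ALL_STATUSES) {
    flags[`status_${s}`] = s === status;
  }
  return flags;
}

// List the fields that must be false for a document to match `wanted`.
function excludedFields(wanted: Status[]): string[] {
  return ALL_STATUSES
    .filter((s) => wanted.indexOf(s) === -1)
    .map((s) => `status_${s}`);
}

// In-memory stand-in for chaining where(field, "==", false) calls.
function matches(doc: Record<string, boolean>, wanted: Status[]): boolean {
  return excludedFields(wanted).every((field) => doc[field] === false);
}
```

For "open OR upcoming", `excludedFields` yields status_suspended and status_closed, which are exactly the two fields used in the query above.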
I don't like everyone saying it's not possible.
It is, if you create another "hacky" field in the model to build a composite.
For instance, create an array on each document that holds all the logical OR elements,
then query with .where("field", arrayContains: [...])
you can bind two Observables using the rxjs merge operator.
Here you have an example.
import { Observable } from 'rxjs/Observable';
import 'rxjs/add/observable/merge';
getCombinatedStatus(): Observable<any> {
  return Observable.merge(this.db.collection('foo', ref => ref.where('status','==','open')).valueChanges(),
    this.db.collection('foo', ref => ref.where('status','==','upcoming')).valueChanges());
}
Then you can subscribe to the new Observable updates using the above method:
this.getCombinatedStatus().subscribe(results => console.log(results));
I hope this can help you, greetings from Chile!!
We have the same problem right now. Luckily, the only possible values for ours are A, B, C, D (4), so we have to query for things like A||B, A||C, A||B||C, D, etc.
A few months ago Firebase added a new array-contains query, so what we do is build an array and pre-process the OR values into it:
if (a) {
    [array addObject:@"a"];
}
if (b) {
    [array addObject:@"b"];
}
if (a||b) {
    [array addObject:@"a||b"];
}
And we do this for every combination of the 4 values (15 non-empty combinations), or however many combos there are.
Then we can simply check the query [document arrayContains:@"a||c"] or whatever type of condition we need.
So if something only qualified for conditional A of our 4 conditionals (A,B,C,D) then its array would contain the following literal strings: @["A", "A||B", "A||C", "A||D", "A||B||C", "A||B||D", "A||C||D", "A||B||C||D"]
Then for any of those OR combinations we can just search array-contains on whatever we may want (e.g. "A||C")
Note: This is only a reasonable approach if you have a small number of possible values to compare OR with.
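The pre-processing step can be generated rather than written by hand. Below is a rough sketch, with a made-up helper `orKeysFor`, that lists every "X||Y" key a document should store for one value. It assumes keys are always written in one fixed order of the possible values, so the writer and the query agree on the spelling.

```typescript
// Return every "X||Y||..." key (in the order of `all`) whose combination
// includes `value`. `all` is the full, ordered list of possible values.
function orKeysFor(value: string, all: string[]): string[] {
  const keys: string[] = [];
  // Each bitmask from 1 to 2^n - 1 selects one non-empty subset of `all`.
  for (let mask = 1; mask < (1 << all.length); mask++) {
    const subset = all.filter((_, i) => (mask & (1 << i)) !== 0);
    if (subset.indexOf(value) !== -1) {
      keys.push(subset.join("||"));
    }
  }
  return keys;
}

// The 8 keys stored on a document that only qualifies for A.
const keysForA = orKeysFor("A", ["A", "B", "C", "D"]);
```

The number of keys per document doubles with each additional possible value, which is why this only stays practical for small sets.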
More info on Array-contains here, since it's newish to firebase docs
If you have a limited number of fields, definitely create new fields with true and false like in the example above. However, if you don't know what the fields are until runtime, you have to just combine queries.
Here is a tags OR example...
// the ids of students in class
const students = [studentID1, studentID2,...];
// get all docs where student.studentID1 = true
const results = this.afs.collection('classes',
  ref => ref.where(`students.${students[0]}`, '==', true)
).valueChanges({ idField: 'id' }).pipe(
  switchMap((r: any) => {
    // get all docs where student.studentID2...studentIDX = true
    const docs = students.slice(1).map(
      (student: any) => this.afs.collection('classes',
        ref => ref.where(`students.${student}`, '==', true)
      ).valueChanges({ idField: 'id' })
    );
    return combineLatest(docs).pipe(
      // combine results by reducing array
      map((a: any[]) => {
        const g: any[] = a.reduce(
          (acc: any[], cur: any) => acc.concat(cur)
        ).concat(r); // also keep the results of the first query
        // filter out duplicates by 'id' field
        return g.filter(
          (b: any, n: number, a: any[]) => a.findIndex(
            (v: any) => v.id === b.id) === n
        );
      })
    );
  })
);
Unfortunately there is no other way to combine more than 10 items (use array-contains-any if < 10 items).
There is also no other way to avoid duplicate reads, as you don't know the ID fields that will be matched by the search. Luckily, Firebase has good caching.
For those of you that like promises...
const p = await results.pipe(take(1)).toPromise();
For more info on this, see this article I wrote.
OR isn't supported,
but if you need it you can do it in your own code.
Example: query products where (size equals XL OR XXL) AND gender is Male.
//1* first get query where can firestore handle it
.whereEqualTo("gender", "Male")
.addSnapshotListener((queryDocumentSnapshots, e) -> {
if (queryDocumentSnapshots == null) return;
List<Product> productList = new ArrayList<>();
for (DocumentSnapshot snapshot : queryDocumentSnapshots.getDocuments()) {
Product product = snapshot.toObject(Product.class);
//2* then check your query OR Condition because firestore just support AND Condition
if (product.getSize().equals("XL") || product.getSize().equals("XXL"))
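The same two-step idea (let the server handle the AND part, then apply the OR part in your own code) looks roughly like this on plain data. The `Product` shape and the sample array are assumptions standing in for the documents returned by the gender == "Male" query.

```typescript
interface Product {
  name: string;
  gender: string;
  size: string;
}

// Stand-in for the documents returned by the gender == "Male" query.
const maleProducts: Product[] = [
  { name: "shirt", gender: "Male", size: "XL" },
  { name: "jacket", gender: "Male", size: "M" },
  { name: "hoodie", gender: "Male", size: "XXL" },
];

// Apply the OR condition on the client.
const wantedSizes = ["XL", "XXL"];
const matchingProducts = maleProducts.filter(
  (p) => wantedSizes.indexOf(p.size) !== -1
);
```

Keep in mind that every document returned by the first query is still read, including the ones the client-side filter discards.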
For Flutter dart language use this:
db.collection("projects").where("status", whereIn: ["public", "unlisted", "secret"]);
Actually, I found that @Dan McGrath's answer works. Here is a rewrite of his answer:
private void query() {
FirebaseFirestore db = FirebaseFirestore.getInstance();
.whereIn("status", Arrays.asList("open", "upcoming")) // you can add up to 10 different values like : Arrays.asList("open", "upcoming", "Pending", "In Progress", ...)
.addSnapshotListener(new EventListener<QuerySnapshot>() {
public void onEvent(@Nullable QuerySnapshot queryDocumentSnapshots, @Nullable FirebaseFirestoreException e) {
for (DocumentSnapshot documentSnapshot : queryDocumentSnapshots) {
// I assume you have a model class called MyStatus
MyStatus status= documentSnapshot.toObject(MyStatus.class);
if (status!= null) {
//do something...!

Iterate over row and create batch: DataFrame

I have a DataFrame with millions of rows and I am iterating over them using the following code:
df.foreachPartition { dataSetPartition => {
dataSetPartition.foreach(row => {
// DO SOMETHING like DB write/ s3 publish
Now I want to batch the rows, so I changed the code to:
df.foreachPartition { dataSetPartition => {
val rowBuffer = scala.collection.mutable.ListBuffer[Row]()
dataSetPartition.foreach(row => {
rowBuffer += row
if (rowBuffer.size == 1000) {
// DO ACTION like DB write/s3 publish <- DO_ACTION
if (rowBuffer.size > 0) {
// DO ACTION like DB write/s3 publish <-DO_ACTION
The problem with this approach is that DO_ACTION is repeated twice. I do not want to call dataSetPartition.size to get the row count beforehand, as the partition is lazily evaluated and that might be a costly operation.
Scala: 2.11
Spark: 2.2.1
I would suggest using Scala's grouped method to create batches:
df.foreachPartition { dataSetPartition => {
  dataSetPartition.grouped(1000).foreach(batch => {
    // DO ACTION like DB write/s3 publish <- DO_ACTION
  })
}}
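For readers unfamiliar with grouped, the behaviour relied on here can be sketched outside Scala. The TypeScript generator below is only an approximation written for illustration (it is not Spark or Scala code): it walks the input once, never asks for its size, and emits a shorter final batch if the items do not divide evenly.

```typescript
// Lazily split any iterable into arrays of at most `size` items.
function* grouped<T>(items: Iterable<T>, size: number): Generator<T[]> {
  if (size < 1) {
    throw new Error("size must be at least 1");
  }
  let batch: T[] = [];
  for (const item of items) {
    batch.push(item);
    if (batch.length === size) {
      yield batch;
      batch = [];
    }
  }
  // Emit the final, possibly shorter, batch.
  if (batch.length > 0) {
    yield batch;
  }
}

// Hypothetical usage: the batch action appears once, in the loop body.
const batchSizes: number[] = [];
for (const batch of grouped([1, 2, 3, 4, 5, 6, 7], 3)) {
  batchSizes.push(batch.length); // stand-in for a DB write or S3 publish
}
```

The leftover-batch check lives inside the helper, so the caller's action is written only once.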

mongodb: collection.update inside cursor.each resolves before update is completed

First of all, I'm using mongodb-promise as a wrapper to MongoClient.
I need to fetch all records from a collection "people" that matches specific criteria and then update each of them.
For that I have this code to find all people:
return db.collection('people')
.then( (collection) => {
// Store reference to collection for future use
peopleCollection = collection;
return collection.find({a:1})
And then invoke this to update each record:
.then( (people) => {
// Process each people
return people.each( (person) => {
person.b = 2;
// Where peopleCollection is a reference to my collection
return peopleCollection.update({_id: person._id}, person)
I then add another promise chain to fetch all people where b != 2, and I find many records, which I counted. But when I execute this script repeatedly, the count decreases, which means mongo is still updating other records after the promise has already resolved. What am I missing here?
.then( (people) => {
// Process each people
return people.each( (person) => {
// Where peopleCollection is a reference to my collection
return peopleCollection.update({_id: person._id}, {$set:{b:2}})
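One plausible explanation, assuming the wrapper's each does not wait for promises returned from its callback, is that the chain resolves once iteration ends rather than once every update has finished. A minimal sketch of the usual remedy is to collect one promise per update and wait for all of them. Here `updateOne` and the in-memory `people` array are hypothetical stand-ins, not the mongodb-promise API.

```typescript
interface Person {
  _id: number;
  a: number;
  b?: number;
}

// In-memory stand-in for the matched records; not the real driver API.
const people: Person[] = [
  { _id: 1, a: 1 },
  { _id: 2, a: 1 },
  { _id: 3, a: 1 },
];

// Hypothetical async update: resolves only after the write is applied.
function updateOne(person: Person, b: number): Promise<void> {
  return Promise.resolve().then(() => {
    person.b = b;
  });
}

// Collect one promise per update and resolve only when all have finished.
function updateAll(records: Person[]): Promise<void[]> {
  return Promise.all(records.map((p) => updateOne(p, 2)));
}
```

Chaining the next step onto the promise returned by `updateAll` guarantees that it runs only after every update has completed.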