How do you do an AND query on an array in MongoDB?

I have an array of tags which is part of a document, e.g.
["red", "green", "blue", "white", "black"]
Now I want to find all documents that have both red AND blue.

Use the $all operator to find documents whose tags array contains both "red" and "blue".
db.my_collection.find({tags: { $all : ["red","blue"]}})
If you want documents whose tags array contains either "red" or "blue", use the $in operator instead.
db.my_collection.find({tags: { $in : ["red","blue"]}})
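To make the difference between the two operators concrete, here is a minimal plain-JavaScript sketch that mimics how $all and $in treat an array field. It runs without a database; the sample documents and the helpers matchesAll and matchesIn are made up for illustration and are not MongoDB APIs.

```javascript
// In-memory sketch of $all vs $in on an array field.
// The documents and helper names below are hypothetical.
const docs = [
  { _id: 1, tags: ["red", "green", "blue", "white", "black"] },
  { _id: 2, tags: ["red", "yellow"] },
  { _id: 3, tags: ["green", "white"] },
];

// $all: every wanted value must appear in the array (AND).
const matchesAll = (tags, wanted) => wanted.every((t) => tags.includes(t));

// $in: at least one wanted value must appear in the array (OR).
const matchesIn = (tags, wanted) => wanted.some((t) => tags.includes(t));

const andIds = docs
  .filter((d) => matchesAll(d.tags, ["red", "blue"]))
  .map((d) => d._id);
const orIds = docs
  .filter((d) => matchesIn(d.tags, ["red", "blue"]))
  .map((d) => d._id);

console.log(andIds); // [ 1 ]
console.log(orIds); // [ 1, 2 ]
```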

In case someone is wondering how to negate part of such a search, i.e. find documents that match both "red" and "blue" but contain neither "green" nor "white": this can be done by combining $all with the $nin operator. That may not be obvious from the $nin docs (http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24nin - I found the description tricky to parse):
db.my_collection.find({tags: { $all : ["red","blue"], $nin : ["green", "white" ]}})
This is very cool, as it allows a relatively nice search syntax with negation:
tokenRequired1 tokenRequired2 !tokenForbidden1 !tokenForbidden2
Very natural, Gmail-style search.
As has been suggested elsewhere, you could do full-text search this way by storing an array of all tokens from a record, though I have no idea whether it's efficient or even the best way to do it.
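As a rough sketch of how such a search string could be turned into the $all/$nin filter shown above: the function buildTagFilter, its "!" prefix convention, and the default field name are assumptions made for this example, not part of MongoDB.

```javascript
// Hypothetical helper: turn "red blue !green !white" into a MongoDB filter object.
function buildTagFilter(search, field = "tags") {
  const required = [];
  const forbidden = [];
  for (const token of search.trim().split(/\s+/)) {
    if (token === "" || token === "!") continue; // skip empty input and a bare "!"
    if (token.startsWith("!")) forbidden.push(token.slice(1));
    else required.push(token);
  }
  const condition = {};
  // Leave out empty operators: an empty $all array matches no documents.
  if (required.length > 0) condition.$all = required;
  if (forbidden.length > 0) condition.$nin = forbidden;
  return Object.keys(condition).length > 0 ? { [field]: condition } : {};
}

const filter = buildTagFilter("red blue !green !white");
console.log(JSON.stringify(filter));
// {"tags":{"$all":["red","blue"],"$nin":["green","white"]}}
```

The resulting object could then be passed to db.my_collection.find(...).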

Related

How to get first elements from a pyspark array?

I have an array, and I keep just one specific column from it in PySpark, like this:
c = [ "green", "blue", "red", "yellow",... ]
I want to keep the first n elements of this column.
How can I do this? I know it is a simple question, but I can't find the solution.
Thanks

MongoDB query and operation for multiple negative conditions

I'm stuck on a query. Maybe I have been working too long, but could you help me find which operator I should use for this step?
Example collection
[
{color: "red",size: "m"},
{color: "red",size: "s"},
{color: "blue",size: "m"},
{color: "blue",size: "s"}
]
I want to query the documents which are not red and not m.
db.foo.find({color:{$ne:'red'}, size:{$ne:'m'}}) returns only [{color:'blue', size:'s'}], but I want to get [{color:'red', size:'s'},{color:'blue', size:'m'},{color:'blue', size:'s'}]
Thanks for the help.
You are using the wrong filter. The two $ne conditions are combined with AND, so only documents that are neither red nor m match. To exclude only the documents that are both red and m, update the filter to use $or:
{"$or":[{color:{$ne:'red'}}, {size:{$ne:'m'}}]}
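The difference between the two filters is De Morgan's law: NOT (red AND m) equals (NOT red) OR (NOT m). A small in-memory sketch over the example collection, using plain JavaScript predicates in place of the MongoDB operators, shows which documents each form keeps:

```javascript
// The example collection from the question.
const items = [
  { color: "red", size: "m" },
  { color: "red", size: "s" },
  { color: "blue", size: "m" },
  { color: "blue", size: "s" },
];

// Original filter: both $ne conditions must hold (implicit AND).
const neitherRedNorM = items.filter((d) => d.color !== "red" && d.size !== "m");

// $or filter: at least one $ne condition holds, i.e. NOT (red AND m).
const notBothRedAndM = items.filter((d) => d.color !== "red" || d.size !== "m");

console.log(neitherRedNorM.length); // 1
console.log(notBothRedAndM.length); // 3
```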

Query by item in array of document, and sort by item's position in the array

I have a collection dinosaurs with documents of this structure:
doc#1:
name: "Tyrannosaurus rex",
dominantColors: [
"beige",
"blue",
"green"
]
doc#2:
name: "Velociraptor",
dominantColors: [
"green",
"orange",
"white"
]
I want to query the collection by color name (for example: green) to get documents sorted by color's position in dominantColors array. First get the documents in which green occurs higher in the array, then those in which it is lower. So, in the provided case I would get doc#2 first, then doc#1.
Each dominantColors array contains 3 elements, with elements sorted from most dominant to least.
I am looking through documentation, but am not able to find a solution. Maybe I need a different data structure altogether?
Cloud Firestore doesn't support querying arrays by ranked index. The only way you can query an array is using an array-contains type query.
What you could do instead is organize your colors using maps where the color is the key and their rank is the value:
name: "Tyrannosaurus rex",
dominantColors: {
"beige": 1,
"blue": 2,
"green": 3
}
Then you can order the query by the value of the map's property. So, in JavaScript, it would be something like this:
firebase
.firestore()
.collection('dinosaurs')
.where('dominantColors.green', '>', 0)
.orderBy('dominantColors.green')
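To see what that query is meant to return, here is an in-memory sketch of the same filter-then-sort logic in plain JavaScript. It does not call Firestore; the function byColorRank is a made-up helper, and the third document is invented to show that non-matching documents are dropped.

```javascript
// Documents restructured as color -> rank maps, as proposed above.
const dinosaurs = [
  { name: "Tyrannosaurus rex", dominantColors: { beige: 1, blue: 2, green: 3 } },
  { name: "Velociraptor", dominantColors: { green: 1, orange: 2, white: 3 } },
  // Hypothetical extra document without the queried color.
  { name: "Triceratops", dominantColors: { brown: 1, grey: 2, beige: 3 } },
];

// Hypothetical helper: keep documents that have the color, lowest rank first.
function byColorRank(list, color) {
  return list
    .filter((d) => d.dominantColors[color] > 0) // undefined > 0 is false
    .sort((a, b) => a.dominantColors[color] - b.dominantColors[color])
    .map((d) => d.name);
}

console.log(byColorRank(dinosaurs, "green"));
// [ 'Velociraptor', 'Tyrannosaurus rex' ]
```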

Which style is preferred in MongoDB?

In MongoDB, which style is better, 1) or 2)? With 1), can I retrieve only the line name instead of getting the whole record from db.record.find({"line":"east"})?
1.
{
"line": "east",
"station":[
{ "name": "ABC", "_id": "1" },
{ "name": "DEF", "_id": "2" }
]
}
2.
{ "line": "east", "l_id":"1"},
{"station":"ABC", "_id":"1", "l_id":"1"},
{"station":"DEF", "_id":"2", "l_id":"1"}
Note: line and station have a one-to-many relationship.
If you most commonly get stations by line and often need all stations of a line, alt 1 is best. If you commonly retrieve single stations or a subset of stations, alt 2 is best. Alt 2 can lead to more queries, since MongoDB doesn't really have relations. Alt 1 can lead to larger reads and, because of the larger objects, make it harder to keep the working set in RAM. Alt 1 also has a minor drawback if a station appears on multiple lines and you change its values: you then have to update its instance in each of the line documents containing it.
To get a partial object, i.e. not all stations in alt 1, you can use a projection to filter out what you don't want. The whole object is still read into memory first, so you wouldn't gain much performance from doing that.
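For alt 1, a projection that returns only the line name could look like db.record.find({line: "east"}, {line: 1, _id: 0}). The sketch below imitates that inclusion projection in plain JavaScript so the result can be inspected without a database. The helper applyInclusion and the "L1" _id are made up for this example, and the helper handles top-level fields only.

```javascript
// The two arguments one might pass to find(), shown as plain objects.
const filterArg = { line: "east" };
const projectionArg = { line: 1, _id: 0 }; // include "line", exclude "_id"

// Hypothetical in-memory stand-in for an inclusion projection (top-level fields only).
function applyInclusion(doc, projection) {
  const out = {};
  // _id is returned by default unless it is explicitly excluded.
  if (projection._id !== 0 && "_id" in doc) out._id = doc._id;
  for (const [field, flag] of Object.entries(projection)) {
    if (field !== "_id" && flag === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

// A document in the alt 1 style; the "L1" _id is an assumption.
const lineDoc = {
  _id: "L1",
  line: "east",
  station: [
    { name: "ABC", _id: "1" },
    { name: "DEF", _id: "2" },
  ],
};

console.log(applyInclusion(lineDoc, projectionArg)); // { line: 'east' }
```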

Search with flexible ranking

Can anyone suggest a search engine that has flexible ranking calculation?
What is flexible ranking calculation?
For example, I have two documents:
obj1 = {
title: "new record",
tags: [
{value:"tag1", weight:1},
{value:"tag2", weight:0.8},
{value:"tag3", weight:2},
]
}
obj2 = {
title: "new record with tag1 in title",
tags: [
{value:"tag1", weight:0.5},
{value:"tag2", weight:1},
{value:"tag3", weight:0.01},
]
}
Let's assume the weight for the "title" property is 0.25.
When I search for "tag1" in all properties,
I want the search to return ranking = 1 for obj1 and ranking = 0.75 for obj2.
I know Solr can do it, but do you have any other suggestions?
You mention a weight for title, but the scores you describe seem to map directly to tag values. I'm not sure whether you left out a detail on how these two connect.
Assuming you want the score of the title match to play a role in the overall document score, in addition to boosting documents that match a particular tag or value range, you can do this with Azure Search using scoring profiles if you want a search-as-a-service solution. If you prefer to deploy and manage your own infrastructure, you can do it with Solr or Elasticsearch by including boosts as part of the query. In Elasticsearch, for example, there are function boosts that let you use the value of a field as input to the boost computation.
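One reading of the question's numbers is an additive score: the weight of the matching tag plus the title weight when the title contains the term. The sketch below implements only that assumed rule in plain JavaScript. The function rank is hypothetical and does not reflect how Solr, Elasticsearch, or Azure Search compute scores.

```javascript
const TITLE_WEIGHT = 0.25; // title weight given in the question

const obj1 = {
  title: "new record",
  tags: [
    { value: "tag1", weight: 1 },
    { value: "tag2", weight: 0.8 },
    { value: "tag3", weight: 2 },
  ],
};
const obj2 = {
  title: "new record with tag1 in title",
  tags: [
    { value: "tag1", weight: 0.5 },
    { value: "tag2", weight: 1 },
    { value: "tag3", weight: 0.01 },
  ],
};

// Assumed rule: sum of matching tag weights, plus TITLE_WEIGHT on a title match.
function rank(doc, term) {
  const tagScore = doc.tags
    .filter((t) => t.value === term)
    .reduce((sum, t) => sum + t.weight, 0);
  const titleScore = doc.title.includes(term) ? TITLE_WEIGHT : 0;
  return tagScore + titleScore;
}

console.log(rank(obj1, "tag1")); // 1
console.log(rank(obj2, "tag1")); // 0.75
```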