I have a MongoDB collection that I would like to index in Solr using the Data Import Handler. However, I don't want documents whose foodType field is "variant" to be indexed during a full import. How can I do that?
If you already have a query that fetches the documents from MongoDB,
modify that query by adding a filter condition.
The filter should stop MongoDB from returning documents whose foodType is "variant".
e.g
db.collection.find({ "fieldToCheck" : { $exists : true, $ne : null } })
You can add conditions like the one above to your query to keep the unwanted data from being indexed in Solr.
The query is missing from your data-config, which is why every document is fetched from MongoDB. If you add a query with the correct criteria, only the selected documents will be retrieved.
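For the foodType case in the question, the filter could look like the sketch below. It is only an illustration: the sample documents and the matchesFilter helper are made up for the example, and how the filter is passed to the importer depends on the MongoDB plugin used with the Data Import Handler, which is not shown here.

```javascript
// Filter document that excludes documents whose foodType is "variant".
// Note: $ne also matches documents that have no foodType field at all.
const filter = { foodType: { $ne: "variant" } };

// In the mongo shell this would be used as: db.collection.find(filter)

// Local stand-in for the server-side match, for illustration only.
function matchesFilter(doc) {
  return doc.foodType !== filter.foodType.$ne;
}

// Hypothetical sample documents.
const sample = [
  { name: "apple", foodType: "base" },
  { name: "apple-red", foodType: "variant" },
  { name: "pear" }, // no foodType field
];

const kept = sample.filter(matchesFilter).map((d) => d.name);
console.log(kept);
```

If documents without a foodType field should also be skipped, the $exists condition from the example above can be combined with $ne.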
For more on MongoDB queries, please refer to the MongoDB reference documentation.
Related
I have a collection product in my DB. Below is one sample document:
{
"sku_id":"12345678",
"priduct_name":"milk",
"product_rank":3,
"product_price": 2.4
}
There are 100k such unique documents in our collection.
I want to query this collection using an $in query, as shown below.
db.product.find({"sku_id" :{$in :["12345678","23213"]}}).sort( { product_rank: 1 } )
Our requirement is to search for documents with an $in query and sort on any field in the document (ascending or descending).
I have created both ascending and descending indexes on all fields of this collection.
Note: The sku_id array inside the $in query can contain 1000+ sku_ids.
My question is: if I filter with $in on an array of sku_ids and sort the result on any field, will the sort use the index or will the sort happen at query time?
MongoDB lets you find out whether a query will use an index. Because the find operation returns a cursor, you can extend the method chain with an explain() call, which does exactly what you need. I suggest you use db.product.find(...).sort(...).explain('executionStats').
Will it use the index for sorting or will it sort at query time?
The index on product_rank will be used by the query, but an index on sku_id alone will not. Instead, create a compound index on both product_rank and sku_id (ascending and descending).
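To check this on your own data, look at the winning plan in the explain() output: a SORT stage means the server sorted the results in memory, while its absence means the order came from an index. The helper below is a minimal sketch of that check. The function names are made up, the example plan is hand-written rather than real server output, and the exact shape of explain() output can vary between server versions.

```javascript
// Walk an explain() plan tree and collect the stage names.
function collectStages(plan, out = []) {
  if (!plan) return out;
  out.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, out);
  (plan.inputStages || []).forEach((p) => collectStages(p, out));
  return out;
}

// True when the winning plan contains an in-memory SORT stage.
function sortsInMemory(explainOutput) {
  const stages = collectStages(explainOutput.queryPlanner.winningPlan);
  return stages.includes("SORT");
}

// Hand-written, abbreviated example; not real server output.
const examplePlan = {
  queryPlanner: {
    winningPlan: {
      stage: "SORT",
      inputStage: { stage: "FETCH", inputStage: { stage: "IXSCAN" } },
    },
  },
};

console.log(sortsInMemory(examplePlan));
```

In the shell, the argument would be the result of db.product.find(...).sort(...).explain().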
Say you're querying documents based on 2 data points. One is a simple bool parameter, and the other is a complicated $geoWithin calculation.
db.collection.find( {"geoField": { "$geoWithin" : ...}, "boolField" : true} )
Will mongo reorder these parameters, so that it checks the boolField 1st, before running the complicated check?
MongoDB uses indexes like any other database. What matters to MongoDB is whether the query fields are indexed, not the order in which they appear in the query. At least, nothing in the documentation says that MongoDB tries to check primitive query fields first. So in your example, if boolField has an index, MongoDB checks that field first and eliminates the documents whose boolField is false. But if geoField has an index, MongoDB runs the query on that field first.
So what happens if neither field is indexed, or both are? Presumably the fields are evaluated in the order given in the query, since the query optimization page of the MongoDB documentation mentions nothing besides indexes. You can always test the performance of your queries by appending .explain("executionStats").
So compare the performance of db.collection.find( {"geoField": { "$geoWithin" : ...}, "boolField" : true} ) and db.collection.find( { "boolField" : true, "geoField": { "$geoWithin" : ...} } ). And let us know :)
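One way to compare the two runs is to pull the same few counters out of each explain("executionStats") result. The sketch below assumes the executionStats fields executionTimeMillis, totalKeysExamined, totalDocsExamined and nReturned; the summarize helper and the numbers in the two example objects are made up purely for illustration.

```javascript
// Extract the counters that matter when comparing two query shapes.
function summarize(explainOutput) {
  const s = explainOutput.executionStats;
  return {
    ms: s.executionTimeMillis,
    keysExamined: s.totalKeysExamined,
    docsExamined: s.totalDocsExamined,
    returned: s.nReturned,
  };
}

// Hypothetical results for the two field orders; not measured data.
const geoFirst = {
  executionStats: { executionTimeMillis: 12, totalKeysExamined: 0, totalDocsExamined: 5000, nReturned: 40 },
};
const boolFirst = {
  executionStats: { executionTimeMillis: 12, totalKeysExamined: 0, totalDocsExamined: 5000, nReturned: 40 },
};

const a = summarize(geoFirst);
const b = summarize(boolFirst);
console.log(a, b);
```

If both summaries show the same keys and documents examined, the field order in the query document made no difference to the plan.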
To add to the above response: if you want MongoDB to use a specific index, you can use cursor.hint(). https://docs.mongodb.com/manual/core/query-plans/ explains how the default index selection is done.
I would like to find out how old/stale a collection is. Is there a way to know when the last query was made to a collection, or even to get a list of the last access dates of all collections?
If the _id of the documents in your MongoDB collection has the format "_id" : ObjectId("57bee0cbc9735bf0b80c23e0"), then MongoDB stores the creation timestamp of the document inside it.
It can be retrieved by executing the following query:
db.newcollection.findOne({"_id" : ObjectId("57bee0cbc9735bf0b80c23e0")})._id.getTimestamp();
The result is an ISODate like this: ISODate("2016-08-25T12:12:59Z")
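getTimestamp() works because the first four bytes of an ObjectId hold the creation time in seconds since the Unix epoch. The sketch below decodes the same value from the hex string outside the shell; the function name objectIdToDate is made up for the example.

```javascript
// Decode the creation time embedded in an ObjectId hex string.
// The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
function objectIdToDate(hex) {
  const seconds = parseInt(hex.substring(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdToDate("57bee0cbc9735bf0b80c23e0");
console.log(created.toISOString()); // 2016-08-25T12:12:59.000Z
```

Note that this is the creation time of one document, not the time the collection was last read.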
find out how old/stale a collection
MongoDB has no built-in facility for tracking how stale a collection is. But it is doable by maintaining a log in which you add an entry each time a collection is accessed.
References
ObjectID.getTimestamp()
Log messages
Rotate Log files
db.collection.stats()
I have a Mongo database with 16 collections. All collections have a common field, domain_id.
How can I remove the documents with a specified domain_id from all collections?
I only know how to remove documents from a single collection.
db.getCollection('collectionName1').remove({domain_id : '123'})
Use the method db.getCollectionNames() to get a list of all the collections in your database, then iterate over the list with JavaScript's forEach() method to remove the documents from each collection:
db.getCollectionNames().forEach(function (col) {
    db.getCollection(col).remove({domain_id : '123'});
});
Unfortunately, Mongo doesn't allow linking collections, so you do have to do it for each collection separately.
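The per-collection loop can be wrapped in a small function that also reports how many documents were deleted from each collection. This is only a sketch: it uses deleteMany in place of remove (newer shells mark remove as deprecated), and the names removeDomainEverywhere and makeFakeDb, along with the sample data, are made up. The fake db object exists only so the example runs outside the shell.

```javascript
// Delete every document with the given domain_id from all collections.
// `db` is expected to behave like the mongo shell's db object.
function removeDomainEverywhere(db, domainId) {
  const deletedPerCollection = {};
  db.getCollectionNames().forEach(function (name) {
    const result = db.getCollection(name).deleteMany({ domain_id: domainId });
    deletedPerCollection[name] = result.deletedCount;
  });
  return deletedPerCollection;
}

// Tiny in-memory stand-in for `db`, for illustration only.
function makeFakeDb(data) {
  return {
    getCollectionNames: () => Object.keys(data),
    getCollection: (name) => ({
      deleteMany: (filter) => {
        const before = data[name].length;
        data[name] = data[name].filter((d) => d.domain_id !== filter.domain_id);
        return { deletedCount: before - data[name].length };
      },
    }),
  };
}

// Hypothetical sample data.
const fakeDb = makeFakeDb({
  users: [{ domain_id: "123" }, { domain_id: "456" }],
  orders: [{ domain_id: "123" }, { domain_id: "123" }],
});

const report = removeDomainEverywhere(fakeDb, "123");
console.log(report);
```

In the shell, the call would be removeDomainEverywhere(db, '123').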
I want to compare two fields of the same collection in MongoDB with CakePHP (MySQL query example: "SELECT * FROM table AS t WHERE t.field1 > t.field2;"). I cannot use MongoDB's '$where' or aggregate, because I am also using other MongoDB operators such as $or and $and, and I am using MongoDB's find.
Example: the collection has two integer fields, per_day_budget and today_spent, and I want to get the list of records where today_spent is less than or equal to per_day_budget. I hope this helps you better understand my question.
Kindly suggest a solution.
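You can try the $expr operator (MongoDB 3.6+), which compares fields of the same document inside find():
db.collection.find({ $expr : { $lte : ["$today_spent", "$per_day_budget"] } });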
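A field-to-field comparison of this kind can sit alongside $or and $and in the same find() filter when it is written with $expr, available in MongoDB 3.6 and later. The sketch below only builds the filter document. The status field and its values are hypothetical, added just to show the combination.

```javascript
// Compare two fields of the same document: today_spent <= per_day_budget.
const withinBudget = {
  $expr: { $lte: ["$today_spent", "$per_day_budget"] },
};

// Combine with ordinary operators; "status" and its values are hypothetical.
const filter = {
  $and: [
    withinBudget,
    { $or: [{ status: "active" }, { status: "paused" }] },
  ],
};

// mongo shell usage: db.collection.find(filter)
console.log(JSON.stringify(filter));
```

The same filter document can be passed through any driver that accepts a plain query array or object.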