I have tried to enter 10000 elements, but I'm unable to insert more than 1 element:
for( i = 0; i< 10000; ++i)
{
db.posts.insert({"Student_id" : i, "Name" : "Mark"});
}
I'm not sure how Studio3T works, but I can add 10k elements this way using the Node driver; the only difference is that I have to call db.collection('posts').insert(...). In any case, the better approach is to build a list containing all the documents first and insert them in one call with insertMany.
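A minimal sketch of the "build the list first" idea, in plain JavaScript so it runs without a server. The helper name buildStudents is made up for illustration; the actual insert calls are shown only as comments and assume an already connected shell or Node driver handle named db.

```javascript
// Build all documents in memory first (buildStudents is a hypothetical helper).
function buildStudents(n) {
  const docs = [];
  for (let i = 0; i < n; i++) {
    docs.push({ Student_id: i, Name: "Mark" });
  }
  return docs;
}

const students = buildStudents(10000);
console.log(students.length);

// The whole array would then go to the server in a single call.
// In the mongo shell:
//   db.posts.insertMany(students);
// With the Node driver, assuming `db` is a connected Db handle:
//   await db.collection("posts").insertMany(students);
```

Sending one array instead of 10000 separate inserts also avoids depending on how a particular GUI client evaluates a loop.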
I have this array in MongoDB:
I want to iterate and make queries on it:
> for(var i=0; i < AllegriTeams.length; i++) {a[i]=db.team.find({
_id:AllegriTeams[i].team_id}, {_id:0, official_name:1})}
At the end of the for loop, the array a contains just the first two official names; I lose the last official_name.
Your loop looks correct and it is not clear why you would not receive three items in your array a.
Check that your variable AllegriTeams has three elements.
> AllegriTeams.length
3
I mimicked your setup and I received the results you expected where a has three elements. Here's what I did:
// 1. Log into mongo and use the "test" database, for example
> use test
// 2. Create data
> db.team.insert({"_id": "Juv.26", "official_name":"Juv.26.xxx"})
WriteResult({ "nInserted" : 1 })
> db.team.insert({"_id": "Mil.74", "official_name":"Mil.74.xxx"})
WriteResult({ "nInserted" : 1 })
> db.team.insert({"_id": "Cag.00", "official_name":"Cag.00.xxx"})
WriteResult({ "nInserted" : 1 })
// 3. Create the AllegriTeams variable
> var AllegriTeams = [ { "team_id":"Juv.26"}, {"team_id":"Mil.74"}, {"team_id":"Cag.00"}]
// 4. Create the "a" array
> var a = []
// 5. Run the for loop. Consider using "findOne" instead of "find".
> for (var i=0; i < AllegriTeams.length; i++) { a[i]=db.team.find({ _id:AllegriTeams[i].team_id}, {_id:0, official_name:1})}
{ "official_name" : "Cag.00.xxx" }
// 6. Get length of "a"
> a.length
3
As an aside, note that the find() function returns a cursor, so the values stored in your a array will be cursors rather than documents. Consider using findOne() instead, since it returns a document.
Again, check that your AllegriTeams variable has three array elements.
I want to search in the first 1000 records of my collection, which is named CityDB. I used the following code:
db.CityDB.find({'index.2':"London"}).limit(1000)
but it does not work: it returns the first 1000 matches, whereas I want to search only within the first 1000 records, not all records. Could you please help me?
Thanks,
Amir
Note that there is no guarantee that a query returns your documents in any particular order unless you sort explicitly. Documents in a new collection are usually returned in insertion order, but various things can cause that order to change unexpectedly, so don't rely on it. By the way: auto-generated _id values start with a timestamp, so when you sort by _id, the documents are returned roughly in creation order.
Now about your actual question. When you first want to limit the documents and then perform a filter-operation on this limited set, you can use the aggregation pipeline. It allows you to use $limit-operator first and then use the $match-operator on the remaining documents.
db.CityDB.aggregate([
// { $sort: { _id: 1 } }, // <- uncomment when you want the first 1000 by creation-time
{ $limit: 1000 },
{ $match: { 'index.2':"London" } }
])
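To see why the order of the two steps matters, here is a small sketch that imitates both orders on a plain JavaScript array. The four sample documents are invented, a limit of 2 stands in for 1000, and the check on index[2] assumes index is an array.

```javascript
// Invented stand-in for the CityDB collection.
const cities = [
  { _id: 1, index: ["a", "b", "London"] },
  { _id: 2, index: ["a", "b", "Paris"] },
  { _id: 3, index: ["a", "b", "London"] },
  { _id: 4, index: ["a", "b", "London"] },
];

const isLondon = (doc) => doc.index[2] === "London";

// Like find({...}).limit(2): filter everything, then keep the first 2 matches.
const matchThenLimit = cities.filter(isLondon).slice(0, 2);

// Like [{ $limit: 2 }, { $match: {...} }]: keep the first 2 documents, then filter those.
const limitThenMatch = cities.slice(0, 2).filter(isLondon);

console.log(matchThenLimit.map((d) => d._id)); // [ 1, 3 ]
console.log(limitThenMatch.map((d) => d._id)); // [ 1 ]
```

The second form can return fewer documents than the limit, which is what "search only within the first 1000 records" asks for.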
I can think of two ways to achieve this:
1) You have a global counter and every time you input data into your collection you add a field count = currentCounter and increase currentCounter by 1. When you need to select your first k elements, you find it this way
db.CityDB.find({
'index.2':"London",
count : {
'$gte' : currentCounter - k
}
})
This is not atomic and may sometimes give you more than k elements on a heavily loaded system (but it can use indexes).
Here is another approach, which works nicely in the shell:
2) Create your dummy data:
var k = 100;
for(var i = 1; i<k; i++){
db.a.insert({
_id : i,
z: Math.floor(1 + Math.random() * 10)
})
}
output = [];
And now find in the first k records where z == 3
k = 10;
db.a.find().sort({$natural : -1}).limit(k).forEach(function(el){
if (el.z == 3){
output.push(el)
}
})
As you can see, output contains the correct elements:
output
I think it is pretty straightforward to modify my example for your needs.
P.S. Also take a look at the aggregation framework; there might be a way to achieve what you need with it.
I have a mongo collection with documents. Every document has one field that is either 0 or 1. I need to randomly sample 1000 records from the database and count the number of documents that have that field set to 1. I need to do this sampling 1000 times. How do I do it?
For people coming to this answer: you should now use the $sample aggregation stage, introduced in MongoDB 3.2.
https://docs.mongodb.org/manual/reference/operator/aggregation/sample/
db.collection_of_things.aggregate(
[ { $sample: { size: 15 } } ]
)
Then add another stage that uses $group to count the 0s and 1s. The MongoDB docs include an example.
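As a rough sketch of what the combined pipeline could look like: the helper below only builds the pipeline array, so it runs without a database. The helper name and the field name flag are assumptions for illustration; substitute the real 0/1 field.

```javascript
// Builds an aggregation pipeline: draw `size` random documents, then count
// how many of them carry each value of `field`.
// Both the helper and the field name passed to it are hypothetical.
function buildSampleCountPipeline(size, field) {
  return [
    { $sample: { size: size } },
    { $group: { _id: "$" + field, count: { $sum: 1 } } },
  ];
}

const pipeline = buildSampleCountPipeline(1000, "flag");
console.log(JSON.stringify(pipeline));

// In the mongo shell, one sampling run would then be:
//   db.collection_of_things.aggregate(pipeline).toArray()
// which yields at most one result document per distinct value of `flag`.
// Repeating the sampling 1000 times means calling aggregate in a loop.
```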
For MongoDB 3.0 and before, I use an old trick from SQL days (which I think Wikipedia uses for its random page feature). I store a random number between 0 and 1 in every object I need to randomize; let's call that field "r". You then add an index on "r".
db.coll.ensureIndex({r: 1});
Now to get random x objects, you use:
var startVal = Math.random();
db.coll.find({r: {$gt: startVal}}).sort({r: 1}).limit(x);
This gives you random objects in a single find query. Depending on your needs, this may be overkill, but if you are going to be doing lots of sampling over time, this is a very efficient way without putting load on your backend.
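One thing to watch with this trick: when startVal lands near 1, fewer than x documents may have a larger r. The sketch below imitates the query on a plain array and adds a wrap-around to the smallest r values. The wrap-around is an addition for illustration, not part of the approach described above, and pickRandomRun plus the sample data are made up.

```javascript
// Imitates find({r: {$gt: startVal}}).sort({r: 1}).limit(x) on an array,
// then wraps around to the smallest r values if fewer than x documents matched.
function pickRandomRun(docs, x, startVal) {
  const byR = [...docs].sort((a, b) => a.r - b.r);
  const after = byR.filter((d) => d.r > startVal).slice(0, x);
  if (after.length === x) return after;
  const wrapped = byR.filter((d) => d.r <= startVal).slice(0, x - after.length);
  return after.concat(wrapped);
}

// Invented documents with pre-assigned random values.
const sampleDocs = [0.5, 0.1, 0.9, 0.3, 0.8].map((r, i) => ({ _id: i, r }));

// A fixed startVal keeps the output predictable; normally it would be Math.random().
const picked = pickRandomRun(sampleDocs, 3, 0.75);
console.log(picked.map((d) => d.r)); // [ 0.8, 0.9, 0.1 ]
```

Against a real collection the wrap-around would be a second find with {r: {$lte: startVal}}, issued only when the first query comes back short.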
Here's an example in the mongo shell, assuming a collection named collname and a value of interest in thefield:
var total = db.collname.count();
var count = 0;
var numSamples = 1000;
for (i = 0; i < numSamples; i++) {
var random = Math.floor(Math.random()*total);
var doc = db.collname.find().skip(random).limit(1).next();
if (doc.thefield) {
count += (doc.thefield == 1);
}
}
I was going to edit my comment on @Stennie's answer with this, but you could also use a separate auto-incrementing ID index as an alternative if you would otherwise skip over huge numbers of records (and I do mean huge).
I wrote an answer to a very similar question, where someone was trying to find the nth record of a collection:
php mongodb find nth entry in collection
The second half of my answer basically describes one potential method by which you could approach this problem. You would still need to loop 1000 times to get the random row of course.
If you are using mongoengine, you can use a SequenceField to generate an incremental counter.
class User(db.DynamicDocument):
    counter = db.SequenceField(collection_name="user.counters")
Then to fetch a random list of say 100, do the following
import random
def get_random_users(number_requested):
    total = User.objects.count()  # count once instead of twice
    users_to_fetch = random.sample(range(1, total + 1), min(number_requested, total))
    return User.objects(counter__in=users_to_fetch)
where you would call
get_random_users(100)
"tags" : ["MongoDB", "Map/Reduce", "Recipe"]
m = Code("function () {"" this.tags.forEach(function(z) {"" emit(z, 1);"" });""}")
r = Code("function (key, values) {var count = 0;for (var i = 0; i < values.length; i++) {count += values[i];}return count;}")
db.coll.map_reduce(m,r, out = "map_tags",query={"tags": {"$ne": ''},"organization":orgid},safe=True)
The code above gives me the correct result, but I need an alternative solution, because map_reduce creates additional collections in my db.
If you don't want the results of a map-reduce persisted then use inline_map_reduce instead of map_reduce:
results = db.coll.inline_map_reduce(m, r, query={"tags": {"$ne": ''}, "organization":orgid}, safe=True)
Note that you can only use this approach if your result set fits within the 16MB limit of a single document.
I have a Lucene.net index with a bunch of documents. I pull these with an MVC request and return them to the client as JSON. I want to return only the top N documents, starting from an index of my choosing, to minimize data flow between server and client.
What I need is something like:
1) First query- Get top 20 docs
2) Second query - Get the next 20 docs, starting after the first 20 (documents 21 - 40)
3) .... and so on
Lucene allows me to set the number of top items, but it only counts them from the beginning of the index. Is there a built-in way to set a start index? Perhaps there is some advanced indexer in Lucene.net that I am missing.
Thanks!
Take a look at this blog that explains pagination in lucene.
The crux of it is this:
int start = 20; int pageSize = 20;
Query query = qp.parse(searchTerm);
// Request enough hits to cover the requested page: at least start + pageSize.
TopDocs hits = searcher.search(query, start + pageSize);
for (int i = start; i < start + pageSize && i < hits.scoreDocs.length; i++) {
int docId = hits.scoreDocs[i].doc;
}
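The index arithmetic behind this kind of paging is independent of Lucene, so here it is sketched in plain JavaScript. The function pageWindow and its parameter names are hypothetical; pages are numbered from 0.

```javascript
// For a zero-based page number, compute how many top hits to request from the
// searcher and which slice [start, end) of those hits belongs to the page.
function pageWindow(totalHits, pageNumber, pageSize) {
  const fetchCount = (pageNumber + 1) * pageSize; // hits to request from the searcher
  const start = Math.min(pageNumber * pageSize, totalHits);
  const end = Math.min(start + pageSize, totalHits);
  return { fetchCount, start, end };
}

console.log(pageWindow(45, 0, 20)); // { fetchCount: 20, start: 0, end: 20 }
console.log(pageWindow(45, 2, 20)); // { fetchCount: 60, start: 40, end: 45 }
console.log(pageWindow(45, 5, 20)); // { fetchCount: 120, start: 45, end: 45 }
```

A page past the end yields an empty slice (start equals end), so the caller can return an empty JSON array instead of failing.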