MongoDB : query result size greater than collection size

I'm analyzing a MongoDB data source to check its quality.
I want to know whether every document contains the attribute time, so I ran these two commands:
> db.droppay.find().count();
291822
> db.droppay.find({time: {$exists : true}}).count()
293525
How can there be more documents with a given field than there are documents in the whole collection? What's going wrong? I'm unable to find the mistake.
If necessary, I can post the expected structure of the document.
The mongo shell version is 1.8.3 and the MongoDB version is 1.8.3.
Thanks in advance
This is the expected structure of the document entry:
{
"_id" : ObjectId("4e6729cc96babe974c710611"),
"action" : "send",
"event" : "sent",
"job_id" : "50a1b7ac-7482-4ad6-ba7d-853249d6a123",
"result_code" : "0",
"sender" : "",
"service" : "webcontents",
"service_name" : "webcontents",
"tariff" : "0",
"time" : "2011-09-07 10:22:35",
"timestamp" : "1315383755",
"trace_id" : "372",
"ts" : "2011-09-07 09:28:42"
}

My guess is that this is an issue with the index. I bet that droppay has an index on time, and some unsafe operation updated the underlying collection without updating the index.
Can you try repairing the db and see if that makes it better?
Good luck.

There are probably time values that are of type array.
You can run db.droppay.find({time: {$type : 4}}) to find such documents.
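
To illustrate the hypothesis in this answer, here is a minimal sketch in plain JavaScript on hypothetical in-memory documents. It assumes that a count served from an index sees one entry per array element, which is an assumption about the cause and not something confirmed in the question. Also note that MongoDB versions before 3.6 apply $type: 4 to the elements of an array field rather than to the field itself, so on an old server the query above may only match arrays that contain arrays; that is worth verifying against your server version.

```javascript
// Hypothetical sample documents; the second one stores "time" as an array.
const docs = [
  { _id: 1, time: "2011-09-07 10:22:35" },
  { _id: 2, time: ["2011-09-07 10:22:35", "2011-09-07 10:22:36"] },
  { _id: 3, time: "2011-09-07 10:22:37" },
];

// Documents whose "time" value is an array (what the answer suggests looking for).
const arrayTimed = docs.filter((d) => Array.isArray(d.time));

// Assumption: an index-driven count sees one entry per array element.
const indexEntries = docs.reduce(
  (n, d) => n + (Array.isArray(d.time) ? d.time.length : 1),
  0
);

console.log(docs.length, indexEntries, arrayTimed.map((d) => d._id));
```

Under that assumption, a single array-valued document is enough to make the filtered count larger than the plain document count.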

Related

MongoDB delete documents between custom fields

I have the documents below in MongoDB and am trying to delete those whose referenceId field lies between the values X0000000005 and X00000000010. I couldn't find any articles on deleting MongoDB documents based on a custom field. Can someone please help me do this deletion, if it's possible?
{
"_id" : ObjectId("5a0f13ad0a83924b84d16b7d"),
"senderId" : "783",
"clientId" : "146196",
"referenceId" : "X00000000001",
"file" : "jAAAAAECAAABaAAAAKQAAJyMKYqPYvFQKJrZ/fqYjDKNdXdOMK58tPQ"
}
{
"_id" : ObjectId("5a0f13ad0a83924b84d16b7e"),
"senderId" : "783",
"clientId" : "146196",
"referenceId" : "X00000000002",
"file" : "jAAAAAECAAABaAAAAKQAAJyMKYqPYvFQKJrZ/fqYjDKNdXdOMK58tPQ"
}
.
.
.
.
.
.
{
"_id" : ObjectId("5a0f13ad0a83924b84d16b7f"),
"senderId" : "783",
"clientId" : "146196",
"referenceId" : "X00000000020",
"file" : "jAAAAAECAAABaAAAAKQAAJyMKYqPYvFQKJrZ/fqYjDKNdXdOMK58tPQ"
}

The following simple query should work. Both bounds go in one sub-document, because a key repeated in a JavaScript object literal keeps only its last value:
db.collection.remove({"referenceId": {$gte: "X00000000005", $lte: "X00000000010"}})
You might want to run a find() with the same filter first to make sure that remove() will affect the right records:
db.collection.find({"referenceId": {$gte: "X00000000005", $lte: "X00000000010"}})
Also, depending on exactly what you mean by "between the values X0000000005 and X00000000010", you might need to swap the $gte and $lte operators for something else ($gt and/or $lt).
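
As a way to reason about the filter before touching real data, the range check can be tried on an in-memory array. This is only a sketch in plain JavaScript: the sample documents and the helper name inRange are hypothetical, and it relies on the IDs being zero-padded to the same width so that string comparison follows numeric order. It also shows that a repeated key in an object literal keeps only its last value, which is why both bounds belong in one sub-document.

```javascript
// Hypothetical sample data: referenceId values X00000000001 .. X00000000020.
const docs = [];
for (let i = 1; i <= 20; i++) {
  docs.push({ referenceId: "X" + String(i).padStart(11, "0") });
}

// Hypothetical helper mirroring {$gte: low, $lte: high} with string comparison.
function inRange(value, low, high) {
  return value >= low && value <= high;
}

const matched = docs
  .filter((d) => inRange(d.referenceId, "X00000000005", "X00000000010"))
  .map((d) => d.referenceId);

// A key repeated in an object literal keeps only the last value.
const repeated = {
  referenceId: { $gte: "X00000000005" },
  referenceId: { $lte: "X00000000010" },
};

console.log(matched.length);                     // number of IDs inside the range
console.log(Object.keys(repeated.referenceId));  // only the last condition remains
```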

db.collection.remove({"referenceId": {"$gte": "X00000000005", "$lte": "X00000000010"}})

How do I remove duplicates in MongoDB?

I have a database that consists of a few collections, and I tried copying from one collection to another.
During this process the connection was lost and I had to copy the documents again.
Now I find around 40000 duplicate records.
Format of my data:
{
"_id" : ObjectId("555abaf625149715842e6788"),
"reviewer_name" : "Sudarshan A",
"emp_name" : "Wilson Erica",
"evaluation_id" : NumberInt(550056),
"teamleader_id" : NumberInt(17199),
"reviewer_id" : NumberInt(1659),
"team_manager" : "Las Vegas",
"teammanager_id" : NumberInt(12245),
"team_leader" : "Thomas Donald",
"emp_id" : NumberInt(7781)
}
Here only evaluation_id is unique.
Query that I have tried:
ensureIndex({id:1}, {unique:true, dropDups:true})

dropDups was removed in MongoDB around version 2.7 (it is no longer available as of 3.0).
Here is another way to do it,
but I haven't tested it.
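
The alternative method this answer points to is not reproduced here. As a general illustration only, and not necessarily what the answer had in mind, one way to think about deduplication is to group documents by evaluation_id, keep the first _id seen in each group, and collect the rest for removal. The sketch below does this on a hypothetical in-memory array; the function name findDuplicateIds and the sample data are assumptions.

```javascript
// Hypothetical helper: returns the _ids of every document after the first
// one seen for each evaluation_id.
function findDuplicateIds(docs) {
  const seen = new Set();
  const duplicates = [];
  for (const doc of docs) {
    if (seen.has(doc.evaluation_id)) {
      duplicates.push(doc._id);
    } else {
      seen.add(doc.evaluation_id);
    }
  }
  return duplicates;
}

// Hypothetical sample: "c" repeats the evaluation_id of "a".
const sample = [
  { _id: "a", evaluation_id: 550056 },
  { _id: "b", evaluation_id: 550057 },
  { _id: "c", evaluation_id: 550056 },
];
const toRemove = findDuplicateIds(sample);
console.log(toRemove);
```

On a real collection, the resulting list could then be passed to a remove with a filter such as {_id: {$in: toRemove}}; trying it on a copy of the data first is the safer route.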

MongoDB extra Collections

I have a db called index that has only one collection, named student.
When I run the query db.student.find({}).count()
it shows 1000000 docs in it.
But when I run db.stats(), it shows a result like this:
{
"db" : "index",
"collections" : 3,
"objects" : 1000004,
"avgObjSize" : 59.95997216011136,
"dataSize" : 59960212,
"storageSize" : 87420928,
"numExtents" : 14,
"indexes" : 1,
"indexSize" : 32458720,
"fileSize" : 520093696,
"nsSizeMB" : 16,
"ok" : 1
}
How can there be 3 collections?
The number of objects is 1000004, which is 4 more than expected. Why?
Finally, I ran db.getCollectionNames()
and it shows [ "student", "system.indexes" ]
What is system.indexes?
Could anybody please elaborate on it?
I am new to the world of MongoDB.

The mysterious 2 collections
There are two collections created when a user stores data in a database for the first time or a database is created explicitly.
The first one, system.indexes, holds the information about the indexes defined on the various collections of the database. You can even access it using
db.system.indexes.find()
The hidden one, system.namespaces, holds some metadata about the database: the names of all existing entities from the point of view of the database management.
Although it is not shown by db.getCollectionNames(), you can still access it:
db.system.namespaces.find()
Warning: Don't fiddle with either of them. Your database may well become unusable. You have been warned!
There can be even more than those two. Read System Collections in the MongoDB docs for details.
The mysterious 4 objects
If you have tried to access the system collections as shown above, this one becomes very easy. In a database called foobardb with a collection foo and the default index on _id, querying system.indexes will give a result like this (prettified):
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "foobardb.foo"
}
Note that this is a single document. The prettified output of the second query looks like this:
{ "name" : "foobardb.foo" }
{ "name" : "foobardb.system.indexes" }
{ "name" : "foobardb.foo.$_id_" }
Here, we have three documents. So we have 4 additional documents inside the metadata.
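
The arithmetic behind both numbers can be written out as a small check. This is plain JavaScript using only the figures from the question and the explanation above; the variable names are made up for illustration.

```javascript
// Figures taken from the question and the explanation above.
const studentDocs = 1000000;      // documents in the student collection
const systemIndexesDocs = 1;      // one document for the default _id index
const systemNamespacesDocs = 3;   // three namespace documents

const totalObjects = studentDocs + systemIndexesDocs + systemNamespacesDocs;
const collections = ["student", "system.indexes", "system.namespaces"];

console.log(totalObjects, collections.length);
```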

I have a big database on MongoDB and can't find and use my info

This is my code:
db.test.find()
{
"_id" : ObjectId("4d3ed089fb60ab534684b7e9"),
"title" : "Sir",
"name" : {
"_id" : ObjectId("4d3ed089fb60ab534684b7ff"),
"first_name" : "Farid"
},
"addresses" : [
{
"city" : "Baku",
"country" : "Azerbaijan"
},{
"city" : "Susha",
"country" : "Azerbaijan"
},{
"city" : "Istanbul",
"country" : "Turkey"
}
]
}
I want the output to contain only all the cities, or only all the countries. How can I do it?

I'm not 100% sure about your code example, because if you 'find' by ID there's no need to search by anything else, but I wonder whether the following can help:
db.test.insert({name:'farid', addresses:[
{"city":"Baku", "country":"Azerbaijan"},
{"city":"Susha", "country":"Azerbaijan"},
{"city" : "Istanbul","country" : "Turkey"}
]});
db.test.insert({name:'elena', addresses:[
{"city" : "Ankara","country" : "Turkey"},
{"city":"Baku", "country":"Azerbaijan"}
]});
Then the following will show all countries:
db.test.aggregate([
{$unwind: "$addresses"},
{$group: {_id: null, countries: {$addToSet: "$addresses.country"}}}
]);
The result will be:
{ "result" : [
{ "_id" : null,
"countries" : [ "Turkey", "Azerbaijan"]
}
],
"ok" : 1
}
Maybe there are other ways, but that's one I know.
With 'cities' you might want to take more care (because I know cities with the same name in different countries...).
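
To make the unwind-then-collect idea concrete, here is a minimal sketch in plain JavaScript that does the same thing on an in-memory copy of the two documents inserted above. The helper name distinctAddressField is hypothetical, and the values are sorted only to make the output stable.

```javascript
// Hypothetical in-memory copy of the two documents inserted above.
const people = [
  { name: "farid", addresses: [
    { city: "Baku", country: "Azerbaijan" },
    { city: "Susha", country: "Azerbaijan" },
    { city: "Istanbul", country: "Turkey" },
  ] },
  { name: "elena", addresses: [
    { city: "Ankara", country: "Turkey" },
    { city: "Baku", country: "Azerbaijan" },
  ] },
];

// Hypothetical helper: flatten every addresses array (like $unwind) and
// keep each distinct value of one key (like $addToSet).
function distinctAddressField(docs, key) {
  const values = docs.flatMap((d) => d.addresses.map((a) => a[key]));
  return [...new Set(values)].sort();
}

const countries = distinctAddressField(people, "country");
const cities = distinctAddressField(people, "city");
console.log(countries, cities);
```

If same-named cities in different countries need to stay separate, the key could be a city and country pair instead of the city alone.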

Based on your question, there may be two underlying issues here:
First, it looks like you are trying to query a collection called "test". Often, "test" is the name of the actual database you are using. My concern, then, is that you are trying to query the database "test" to find any collections that have the key "city" or "country" in any of their documents. If this is the case, what you actually need to do is identify all of the collections in your database and search them individually to see whether any of them contain documents that include the keys you are looking for.
(For more information on how the db.collection.find() method works, check the MongoDB documentation here: http://docs.mongodb.org/manual/reference/method/db.collection.find/#db.collection.find)
Second, if this is actually what you are trying to do, all you need to do for each collection is define a query that returns only the key you are looking for. If the query returns more than 0 results, you know the collection has documents with the "city" key. If it returns no results, you can ignore that collection. One caveat is that data about "city" may sit in embedded documents within a collection. If this is the case, you may need to have some idea of which embedded documents contain the key you are looking for.

MongoDB 2.4 new Text Index feature

So I have a weirdly configured DB imported into MongoDB, looks like this:
"_id" : ObjectId("51191d45890311d9b2a0865d"),
"field1" : "randomtextstuff",
"field2" : "randomtextstuff",
"field3" : "randomtextstuff",
"field4" : "randomtextstuff",
"field5" : "randomtextstuff"
Some documents have 100 fields, others have none.
I wanted to test the new text search, so I attempted the following index:
db.profile_specialties.ensureIndex({"field1":"text",
"field2":"text",
"field3":"text",
"field4":"text",
"field5":"text",
"field6":"text",
... All the way to 100
"field96":"text",
"field97":"text",
"field98":"text",
"field99":"text",
"field100":"text"})
The returned error message was:
{
"err" : "ns name too long, max size is 128",
"code" : 10080,
"n" : 0,
"connectionId" : 1,
"ok" : 1
}
Has anyone else experienced this problem?

With MongoDB 2.4 text search you can use the new wildcard specifier ($**) to index all fields with string content:
db.profile_specialties.ensureIndex({"$**":"text"})
Keep in mind, though, that a text index across all fields is going to be very large.
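
A rough way to see why the 100-field index hits the "ns name too long" limit is to estimate the length of the index namespace. The sketch below is plain JavaScript and rests on two assumptions: that the default index name joins each field and its type with underscores, and that the namespace has the form database.collection.$indexName. The database name mydb and both helper names are hypothetical.

```javascript
// Assumption: the default index name joins "<field>_<type>" pairs with "_".
function defaultIndexName(keys) {
  return Object.entries(keys)
    .map(([field, type]) => field + "_" + type)
    .join("_");
}

// Assumption: the index namespace is "<db>.<collection>.$<indexName>".
function indexNamespace(db, collection, keys) {
  return db + "." + collection + ".$" + defaultIndexName(keys);
}

// Build the 100-field text index specification from the question.
const hundredFields = {};
for (let i = 1; i <= 100; i++) {
  hundredFields["field" + i] = "text";
}

// "mydb" is a hypothetical database name.
const longNs = indexNamespace("mydb", "profile_specialties", hundredFields);
const wildcardNs = indexNamespace("mydb", "profile_specialties", { "$**": "text" });

// Compare both namespace lengths against the 128 limit from the error message.
console.log(longNs.length > 128, wildcardNs.length <= 128);
```

Giving the index an explicit short name through the name option is another way to stay under the limit, although it does not reduce the size of the index itself.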
