How to determine the index size with pymongo? - mongodb

With MongoDB, I can run db.collection.stats() to find the size, in bytes, of each index.
PyMongo seems to be missing this operation.
Is there a way to find this information with PyMongo?

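import pymongo
# pymongo.Connection was removed in PyMongo 3; MongoClient replaces it
client = pymongo.MongoClient('mongodb://localhost')
db = client.test
# collStats is the server command behind db.collection.stats()
db.command('collStats', 'collection')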
Result:
{
u'count': 2,
u'ns': u'test.test2',
u'ok': 1.0,
u'lastExtentSize': 8192,
u'avgObjSize': 94.0,
u'totalIndexSize': 8176,
u'systemFlags': 1,
u'userFlags': 0,
u'numExtents': 1,
u'nindexes': 1,
u'storageSize': 8192,
u'indexSizes': {u'_id_': 8176},
u'paddingFactor': 1.0,
u'size': 188
}
P.S. test2 in the result is my collection name
http://api.mongodb.org/python/current/api/pymongo/database.html
http://docs.mongodb.org/manual/reference/collection-statistics/
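To pull just the per-index sizes out of that result, a minimal sketch could look like the following. It assumes the result is a dict shaped like the one above; the helper name index_sizes is hypothetical and not part of PyMongo.

```python
def index_sizes(stats):
    """Return (index name, size in bytes) pairs, largest first."""
    sizes = stats.get('indexSizes', {})
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)


# Sample shaped like the collStats result above, trimmed to the relevant keys.
stats = {'nindexes': 1, 'totalIndexSize': 8176, 'indexSizes': {'_id_': 8176}}

for name, size in index_sizes(stats):
    print(name, size)

# With a live server, the dict would come from:
#     stats = db.command('collStats', 'test2')
```

The sizes are in bytes, as in the shell output of db.collection.stats().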

Related

PySNMP how to convert to dot oid

How can I convert SNMPv2-SMI.enterprises.9.1.283 to a dotted OID in Python?
SNMPv2-SMI.enterprises.9.1.283 -> 1.3.6.1.4.1.9.1.283
Thanks,
Alexey
You can get useful data with:
print(o.getLabel())
print(o.getMibNode())
print(o.getMibSymbol())
print(o.getOid())
print(o.prettyPrint())
A pysnmp implementation that resolves MIB names to OIDs:
from pysnmp.smi import builder, view, rfc1902, error
mibBuilder = builder.MibBuilder()
mibView = view.MibViewController(mibBuilder)
mibVar = rfc1902.ObjectIdentity('SNMPv2-SMI', 'enterprises', 9, 1, 283)
mibVar.resolveWithMib(mibView)
print(mibVar.prettyPrint()) # prints SNMPv2-SMI::enterprises.9.1.283
print(tuple(mibVar)) # prints (1, 3, 6, 1, 4, 1, 9, 1, 283)
print(str(mibVar)) # prints 1.3.6.1.4.1.9.1.283

mongo: update $push failed with "Resulting document after update is larger than 16777216"

I want to extend a large array using the update(.. $push ..) operation.
Here are the details:
I have a large collection 'A' with many fields. Amongst the fields, I want to extract the values of the 'F' field, and transfer them into one large array stored inside one single field of a document in collection 'B'.
I split the process into steps to limit the memory used.
Here is the python program:
...
steps = 1000 # number of steps
step = 10000 # each step will handle this number of documents
start = 0
for j in range(steps):
    print('step:', j, 'start:', start)
    project = {'$project': {'_id': 0, 'F': 1}}
    skip = {'$skip': start}
    limit = {'$limit': step}
    cursor = A.aggregate([skip, limit, project], allowDiskUse=True)
    a = []
    for i, o in enumerate(cursor):
        value = o['F']
        a.append(value)
    print('len:', len(a))
    B.update({'_id': 1}, {'$push': {'v': {'$each': a}}})
    start += step
Here is the output of this program:
step: 0 start: 0
step: 1 start: 100000
step: 2 start: 200000
step: 3 start: 300000
step: 4 start: 400000
step: 5 start: 500000
step: 6 start: 600000
step: 7 start: 700000
step: 8 start: 800000
step: 9 start: 900000
step: 10 start: 1000000
Traceback (most recent call last):
  File "u_psfFlux.py", line 109, in <module>
    lsst[k].update( {'_id': 1}, { '$push': {'v' : { '$each': a } } } )
  File "/home/ubuntu/.local/lib/python3.5/site-packages/pymongo/collection.py", line 2503, in update
    collation=collation)
  File "/home/ubuntu/.local/lib/python3.5/site-packages/pymongo/collection.py", line 754, in _update
    _check_write_command_response([(0, result)])
  File "/home/ubuntu/.local/lib/python3.5/site-packages/pymongo/helpers.py", line 315, in _check_write_command_response
    raise WriteError(error.get("errmsg"), error.get("code"), error)
pymongo.errors.WriteError: Resulting document after update is larger than 16777216
Apparently the $push operation has to fetch the complete array! (I expected this operation to need the same amount of memory every time, since each step appends the same number of values to the array.)
In short, I don't understand why the update/$push operation fails with this error.
Or is there a way to avoid this unneeded buffering?
Thanks for your suggestion
Christian
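The number in the error message, 16777216 bytes, is 16 MiB, which is MongoDB's maximum size for a single BSON document. The limit applies to the whole document that results from the update, so it is reached once the accumulated array grows large enough, however little each step pushes. One possible workaround, sketched here under the assumption that the values may be spread over several documents in 'B' instead of a single one, is to store them in fixed-size buckets. The helper name make_buckets and the bucket size are hypothetical.

```python
def make_buckets(values, bucket_size, first_id=0):
    """Split values into bucket documents holding at most bucket_size items."""
    buckets = []
    for offset in range(0, len(values), bucket_size):
        buckets.append({
            '_id': first_id + offset // bucket_size,
            'v': values[offset:offset + bucket_size],
        })
    return buckets


# Stand-in for the 'F' values collected during one step of the loop.
a = list(range(25))
docs = make_buckets(a, bucket_size=10)
print([(d['_id'], len(d['v'])) for d in docs])

# With PyMongo 3+, the bucket documents could then be stored with:
#     B.insert_many(docs)
```

The bucket size would have to be chosen so that each bucket document stays well below the 16 MiB limit.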

Kapacitor Join not Performing Full Outer Join

I have a TICK script (shown below) with two queries, both performing groupBy on the same tag. The script then joins the two queries on that tag and specifies a full outer join with a fill of 'null'. However, Kapacitor seems to be treating it as an inner join, as shown by the stats (also below). The queries emit 33 and 32 points respectively, and the join emits 32. Shouldn't a full outer join emit at least as many points as the query with the greater point count (33)? When I |log() the queries, I can identify the record that is dropped by the join: it was emitted by one query and not the other.
Any suggestions on how to further troubleshoot this?
TICK script:
var raw_event = batch
    |query('''select rsum from jsx.autogen.raw_event''')
        .period(1m)
        .every(1m)
        .offset(1h)
        .align()
        .groupBy('location')
var event_latency = batch
    |query('''select rsum from jsx.autogen.event_latency''')
        .period(1m)
        .every(1m)
        .offset(1h)
        .align()
        .groupBy('location', 'glocation')
raw_event
    |join(event_latency)
        .fill('null')
        .as('raw_event','event_latency')
        .on('location')
        .streamName('join_stream')
    |log()
Stats:
"node-stats": {
"batch0": {
"avg_exec_time_ns": 0,
"collected": 65,
"emitted": 0
},
"join4": {
"avg_exec_time_ns": 11523,
"collected": 65,
"emitted": 32
},
"log5": {
"avg_exec_time_ns": 0,
"collected": 32,
"emitted": 0
},
"query1": {
"avg_exec_time_ns": 0,
"batches_queried": 33,
"collected": 33,
"emitted": 33,
"points_queried": 33,
"query_errors": 0
},
"query2": {
"avg_exec_time_ns": 0,
"batches_queried": 32,
"collected": 32,
"emitted": 32,
"points_queried": 32,
"query_errors": 0
}

Convert powershell cmdlet to C#

How can I convert the following PowerShell command to C# code, especially the parameters for -Index?
Get-Mailbox | select-object -index 0, 1, 2, 3, 4, 5
I want to retrieve the mailboxes in several batches to avoid excessive memory usage.
How to set 0, 1, 2, 3, 4, 5 to CommandParameters?
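I'm not a programmer, but this should get you closer:
// Command lives in System.Management.Automation.Runspaces
Command cmdMailbox = new Command("Get-Mailbox");
// C# string literals need double quotes; 'someone' would not compile
cmdMailbox.Parameters.Add("Identity", "someone");
Command cmdSelect = new Command("Select-Object");
// Pass the -Index values as an int array
int[] indexes = new int[] {0, 1, 2, 3, 4, 5};
cmdSelect.Parameters.Add("Index", indexes);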

mongodb: issues using $lte and $gte

Look at this bizarre result:
list(db.users.find({"produit_up.spec.prix":{"$gte":0, "$lte": 1000}}, {"_id":0,"produit_up":1}))
Out[5]:
[{u'produit_up': [{u'avatar': {u'avctype': u'image/jpeg',
u'orientation': u'portrait',
u'photo': ObjectId('506867863a5f3a0ea84dcd6c')},
u'spec': {u'abus': 0,
u'date': u'2012-09-30',
u'description': u"portable tr\xe8s solide, peu servi, avec batterie d'une autonomie de 3 heures.",
u'id': u'alucaard134901952647',
u'namep': u'nokia 3310',
u'nombre': 1,
u'prix': 1000,
u'tags': [u'portable', u'nokia', u'3310'],
u'vendu': False}},
{u'avatar': {u'avctype': u'image/jpeg',
u'orientation': u'portrait',
u'photo': ObjectId('50686d013a5f3a04a8923b3e')},
u'spec': {u'abus': 0,
u'date': u'2012-09-30',
u'description': u'\u0646\u0628\u064a\u0639 \u0623\u064a \u0641\u0648\u0646 \u062c\u062f\u064a\u062f \u0641\u064a \u0627\u0644\u0628\u0648\u0627\u0637 \u0645\u0639\u0627\u0647 \u0634\u0627\u0631\u062c\u0648\u0631 \u062f\u0648\u0631\u064a\u062c \u064a\u0646',
u'id': u'alucaard134902092967',
u'namep': u'iphone 3gs',
u'nombre': 1,
u'prix': 20000,
u'tags': [u'iphone', u'3gs', u'apple'],
u'vendu': False}},
{u'avatar': {u'avctype': u'image/jpeg',
u'orientation': u'paysage',
u'photo': ObjectId('50686d3e3a5f3a04a8923b40')},
u'spec': {u'abus': 0,
u'date': u'2012-09-30',
u'description': u'vends 206 toutes options 2006 hdi.',
u'id': u'alucaard134902099082',
u'namep': u'peugeot 206',
u'nombre': 1,
u'prix': 500000,
u'tags': [u'voiture', u'206', u'hdi'],
u'vendu': False}}]}]
list(db.users.find({"produit_up.spec.prix":{"$gte":0, "$lte": 100}}, {"_id":0,"produit_up":1}))
Out[6]: []
pymongo.version
Out[8]: '2.3+'
And it gives me the same result in the mongo shell:
db.version()
2.2.0
Here is the answer from Bernie Hackett:
You have three values for "produit_up.spec.prix", 1000, 20000, 500000.
Why would you think that {"$gte":0, "$lte": 100} would match any of
those values? 100 is less than all of those values.
The reason that {"$gte":0, "$lte": 1000} returns all three documents
is that they are all subdocuments in an array. Since one of the
subdocuments in the array is matched the entire enclosing document
is a match for your query. Since you did a projection on only
"produit_up", just that array (including all array members) is
returned. Use $elemMatch in MongoDB 2.2 to only return the exact
matching array element.
MongoDB and PyMongo are working as designed here.
To get the behavior I think you're asking for see the $elemMatch operator:
http://docs.mongodb.org/manual/reference/projection/elemMatch/
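As a rough sketch of that suggestion, the filter and projection for the price-range query could be built as follows. The helper name price_range_query is hypothetical. Note that an $elemMatch projection returns only the first matching array element, not every match.

```python
def price_range_query(low, high):
    """Build a filter and projection that both target a single array element."""
    price_range = {'$gte': low, '$lte': high}
    # Filter: documents with at least one produit_up element in the price range.
    query_filter = {'produit_up': {'$elemMatch': {'spec.prix': price_range}}}
    # Projection: return only the first element that satisfies the condition.
    projection = {
        '_id': 0,
        'produit_up': {'$elemMatch': {'spec.prix': price_range}},
    }
    return query_filter, projection


query_filter, projection = price_range_query(0, 1000)
print(query_filter)
print(projection)

# With a live server (MongoDB 2.2 or later), the call would be:
#     list(db.users.find(query_filter, projection))
```

Using $elemMatch in the filter as well ensures that both bounds are checked against the same array element.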