PyMongo sending "authenticate" for every query - mongodb

I am seeing an authenticate command sent to MongoDB for every query I run using PyMongo's MongoClient. This seems expensive and unnecessary:
2015-02-13T09:38:08.091-0800 [conn375243] authenticate db: { authenticate: 1, user: "", nonce: "xxx", key: "xxx" }
2015-02-13T09:38:08.876-0800 [conn375243] end connection xxx (15 connections now open)
2015-02-13T09:38:08.962-0800 [initandlisten] connection accepted from xxx:42554 #375244 (16 connections now open)
2015-02-13T09:38:08.966-0800 [conn375244] authenticate db: { authenticate: 1, user: "", nonce: "xxx", key: "xxx" }
As far as I can tell, I'm using the same MongoClient (although it's hidden behind MongoEngine) and not intentionally disconnecting it at any point:
19:20:45 {'default': MongoClient('xxx-a0.mongolab.com', 39931)}
19:20:45 [139726027002480]
19:28:35 {'default': MongoClient('xxx-a0.mongolab.com', 39931)} # print mongo_client_instance
19:28:35 [139726027002480] # print id(mongo_Client_instance)
When I set a pdb breakpoint in the authenticate function, I get the stack trace below. I cannot figure out why asking the cursor to refresh requires a fresh authentication. Am I misunderstanding something, and is this part of the MongoDB protocol? My goal is to send as few "authenticate" commands as possible, since right now they make up 50% of the commands logged on the server.
/home/ubuntu/workspace//metadata/jobs.py(24)get()
-> b = Item.objects.get_or_create(id=i['id'])[0]
/home/ubuntu/workspace//venv/local/lib/python2.7/site-packages/mongoengine/queryset/base.py(241)get_or_create()
-> doc = self.get(*q_objs, **query)
/home/ubuntu/workspace//venv/local/lib/python2.7/site-packages/mongoengine/queryset/base.py(182)get()
-> result = queryset.next()
/home/ubuntu/workspace//venv/local/lib/python2.7/site-packages/mongoengine/queryset/base.py(1137)next()
-> raw_doc = self._cursor.next()
/home/ubuntu/workspace//venv/local/lib/python2.7/site-packages/pymongo/cursor.py(1058)next()
-> if len(self.__data) or self._refresh():
/home/ubuntu/workspace//venv/local/lib/python2.7/site-packages/pymongo/cursor.py(1002)_refresh()
-> self.__uuid_subtype))
/home/ubuntu/workspace//venv/local/lib/python2.7/site-packages/pymongo/cursor.py(915)__send_message()
-> res = client._send_message_with_response(message, **kwargs)
/home/ubuntu/workspace//venv/local/lib/python2.7/site-packages/pymongo/mongo_client.py(1194)_send_message_with_response()
-> sock_info = self.__socket(member)
/home/ubuntu/workspace//venv/local/lib/python2.7/site-packages/pymongo/mongo_client.py(922)__socket()
-> self.__check_auth(sock_info)
/home/ubuntu/workspace//venv/local/lib/python2.7/site-packages/pymongo/mongo_client.py(503)__check_auth()
-> sock_info, self.__simple_command)
> /home/ubuntu/workspace//venv/local/lib/python2.7/site-packages/pymongo/auth.py(239)authenticate()
-> mechanism = credentials[0]
Additional information that might be useful is that these calls are from a Python RQ worker. I am trying to set up the connection before the fork step, but it's possible something is happening there to cause this.
(Pdb) os.getpid()
10507
... next query...
(Pdb) os.getpid()
10510

Got it!
The default Python RQ worker uses the fork model, and forking prevented PyMongo from sharing connection sockets, so each forked process opened and authenticated its own connection.
I switched to the GeventWorker and now the sockets are shared by default.

Related

How to debug session leaking or close all sessions in python MongoDB?

This is my first time using MongoDB to manage an image dataset (~10 million images).
My environment is MongoDB 5.0.6 with PyMongo 4.0.2 and Python 3.9.6 on Ubuntu 18.04.
My dataset class reads from MongoDB through PyMongo and is then used to train a DNN in PyTorch. My code warns at the beginning:
UserWarning: MongoClient opened before fork.
Create MongoClient only after forking.
See PyMongo's documentation for details: https://pymongo.readthedocs.io/en/stable/faq.html#is-pymongo-fork-safe
(I checked this URL and found nothing useful. I believe I do recreate the client each time the class is instantiated.)
After running for a while, my code crashes and exits with:
Unable to add session ID ffd152cf-97d3-454a-882a-c6fc693e2985 - 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU= into the cache because the number of active sessions is too high, full error:
{'ok': 0.0,
'errmsg': 'Unable to add session ID ffd152cf-97d3-454a-882a-c6fc693e2985 - 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU= into the cache because the number of active sessions is too high',
'code': 261,
'codeName': 'TooManyLogicalSessions'}
Unable to add session ID 85b35e6c-fc83-41d1-915b-83e6841c5467 - 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU= into the cache because the number of active sessions is too high, full error:
{'ok': 0.0,
'errmsg': 'Unable to add session ID 85b35e6c-fc83-41d1-915b-83e6841c5467 - 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU= into the cache because the number of active sessions is too high',
'code': 261,
'codeName': 'TooManyLogicalSessions'}
As the error message says, it seems I have opened too many sessions.
How can I check the current active sessions from PyMongo? How can I close all sessions? Or how can I debug this problem further?
Thank you very much! My session numbers and a sample of my dataset code are shown below:
1. Check the active sessions
My active connections (db.serverStatus().connections) are indeed increasing:
{
current: 6,
available: 51194,
totalCreated: 199,
active: 2,
threaded: 6,
exhaustIsMaster: 0,
exhaustHello: 1,
awaitingTopologyChanges: 1
}
to
{
current: 156,
available: 51044,
totalCreated: 361,
active: 54,
threaded: 156,
exhaustIsMaster: 0,
exhaustHello: 51,
awaitingTopologyChanges: 51
}
and 5 minutes later, my program crashed. (I don't know how to get this result from PyMongo instead of mongosh, so I can only check it manually from time to time. At this point there still seem to be 51044 connections available, which should not be used up in only 5 minutes.)
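The same counters can be read from Python, since serverStatus is an ordinary database command. A minimal sketch, assuming client is a connected pymongo.MongoClient whose user is allowed to run serverStatus; the helper name connection_stats is made up for illustration:

```python
def connection_stats(client):
    """Return the 'connections' section of serverStatus as a dict."""
    # Database.command() runs a raw database command; here serverStatus
    # is run against the admin database.
    status = client.admin.command("serverStatus")
    return status["connections"]


# Example usage (assumes a local server, as in the question):
#   from pymongo import MongoClient
#   client = MongoClient("127.0.0.1", 27017)
#   print(connection_stats(client)["current"])
```

Calling this periodically from the training script would replace the manual checks in mongosh.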
2. My sample dataset code
It looks like this:
from torch.utils.data import Dataset
from pymongo import MongoClient
class MongoDataset(Dataset):
    def __init__(self, dbName):
        # connect=False defers the actual connection until the first operation
        client = MongoClient(host='127.0.0.1', port=27017, connect=False)
        db = client[dbName]
        self.dataTable = db["dataTable"]
    def getData(self, _id):
        return self.dataTable.find_one({"_id": _id})
    def __len__(self):
        return self.dataTable.estimated_document_count()
This class will be automatically forked and recreated.

Connecting to postgresql from lapis

I decided to play with lapis - https://github.com/leafo/lapis, but the application crashes when I try to query the database (PostgreSQL), with the following output:
2017/07/01 16:04:26 [error] 31284#0: *8 lua entry thread aborted: runtime error: attempt to yield across C-call boundary
stack traceback:
coroutine 0:
[C]: in function 'require'
/usr/local/share/lua/5.1/lapis/init.lua:15: in function 'serve'
content_by_lua(nginx.conf.compiled:22):2: in function , client: 127.0.0.1, server: , request: "GET / HTTP/1.1", host: "localhost:8080"
The code that causes the error:
local db = require("lapis.db")
local res = db.query("SELECT * FROM users");
config.lua:
config({ "development", "production" }, {
  postgres = {
    host = "0.0.0.0",
    port = "5432",
    user = "wars_base",
    password = "12345",
    database = "wars_base"
  }
})
The database is running, the table exists, and it contains one record.
What could be the problem?
Solution: https://github.com/leafo/lapis/issues/556
You need to specify the correct server IP in the host parameter.
The address you specified, 0.0.0.0, is not a valid address to connect to; it is normally used as a listen address, where it means "every address".
During development you can usually use '127.0.0.1'.
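The difference between a listen address and a connect address can be illustrated outside of lapis. This is only a sketch using plain TCP sockets, with made-up helper names; it has nothing to do with how lapis or PostgreSQL are implemented. The server binds to 0.0.0.0 ("every address"), while the client connects to the concrete address 127.0.0.1:

```python
import socket


def listen_everywhere():
    """Bind a TCP server socket to 0.0.0.0 (all local interfaces) on a free port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("0.0.0.0", 0))  # port 0 asks the OS for any free port
    server.listen(1)
    return server


def connect_loopback(port):
    """Connect to the server through a concrete address, 127.0.0.1."""
    return socket.create_connection(("127.0.0.1", port), timeout=5)


server = listen_everywhere()
port = server.getsockname()[1]
client = connect_loopback(port)
conn, peer = server.accept()
client.sendall(b"ping")
received = conn.recv(4)
for s in (client, conn, server):
    s.close()
```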

Fiware Orion - pepProxy

I'm part of a team developing an application that uses the Fiware GEs as part of the Smart-AgriFood accelerator.
We are using the Orion Context Broker to gather the data provided by the sensor network, and we intend to use a PEP Proxy to authenticate the sensor nodes that access the Orion instance. We have tried the following PEP Proxies:
https://github.com/telefonicaid/fiware-orion-pep
https://github.com/ging/fi-ware-pep-proxy
We have only succeeded with the second one (fi-ware-pep-proxy). With fiware-orion-pep we haven't been able to connect to the global Keystone instance (account.lab.fi-ware.org); we have tried both account.lab... and cloud.lab... My questions are:
1) Is the Keystone (IDM) instance for authentication account.lab or cloud.lab? Which ports and addresses should be used?
2) Is fiware-orion-pep prepared to authenticate at account.lab.fi-ware.org? Here is why I ask:
This request works with curl against cloud.lab.fiware.org:4730/v2.0/tokens:
{
    "auth": {
        "passwordCredentials": {
            "username": "<my_user>",
            "password": "<my_password>"
        }
    }
}
This request doesn't work with curl against account.lab.fi-ware.org:5000/v3/auth/tokens:
{
    "auth": {
        "identity": {
            "methods": [
                "password"
            ],
            "password": {
                "user": {
                    "domain": {
                        "name": "<my_domain>"
                    },
                    "name": "<my_user>",
                    "password": "<my_password>"
                }
            }
        }
    }
}
3) Which implementation should I be using to authenticate the devices or other calls to the Orion instance?
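For a side-by-side comparison, the two request bodies from question 2 can be built programmatically. This is only a sketch that mirrors the JSON shown above; the function names are made up, and nothing is sent to either host.

```python
import json


def keystone_v2_body(username, password):
    """Body for POST /v2.0/tokens using password credentials."""
    return {
        "auth": {
            "passwordCredentials": {"username": username, "password": password}
        }
    }


def keystone_v3_body(username, password, domain_name):
    """Body for POST /v3/auth/tokens using the password method."""
    return {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "domain": {"name": domain_name},
                        "name": username,
                        "password": password,
                    }
                },
            }
        }
    }


print(json.dumps(keystone_v3_body("<my_user>", "<my_password>", "<my_domain>"), indent=2))
```

Note that the v3 body needs a domain for the user, which the v2.0 body does not have, and that Keystone v3 returns the token in the X-Subject-Token response header rather than in the response body.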
Here is the configuration that I used:
fiware-orion-pep
config.authentication = {
    checkHeaders: true,
    module: 'keystone',
    user: '<my_user>',
    password: '<my_password>',
    domainName: '<my_domain>',
    retries: 3,
    cacheTTLs: {
        users: 1000,
        projectIds: 1000,
        roles: 60
    },
    options: {
        protocol: 'http',
        host: 'account.lab.fiware.org',
        port: 5000,
        path: '/v3/role_assignments',
        authPath: '/v3/auth/tokens'
    }
};
fi-ware-pep-proxy (this one works); I have set the listening port to 1026 in the source code
var config = {};
config.account_host = 'https://account.lab.fiware.org';
config.keystone_host = 'cloud.lab.fiware.org';
config.keystone_port = 4731;
config.app_host = 'localhost';
config.app_port = '10026';
config.username = 'pepProxy';
config.password = 'pepProxy';
// in seconds
config.chache_time = 300;
config.check_permissions = false;
config.magic_key = undefined;
module.exports = config;
Thanks in advance for the time ... :)
There are currently some differences in how the two PEP Proxies authenticate and validate against the global instances, so they do not behave in exactly the same way.
The one in telefonicaid/fiware-orion-pep was developed to fulfill the PEP Proxy requirements (authentication and validation against a Keystone and an Access Control) in individual projects with their own Keystone and Keypass (a flavour of Access Control) installations, so it evolved faster than the one in ging/fi-ware-pep-proxy and in a slightly different direction. For example, the former supports multitenancy using the fiware-service and fiware-servicepath headers, while the latter is transparent to those mechanisms. This development direction also means that its functionality occasionally differs slightly from that of the global instance.
That being said, the concrete answer:
- Both PEP Proxies should be able to contact the global instance. If one doesn't, please file a bug in the issues of the GitHub repository and we will fix it as soon as possible.
- The ging/fi-ware-pep-proxy was specifically designed for accessing the global instance, so you should be able to use it as expected.
If you proceed with telefonicaid/fiware-orion-pep, please also note that:
- the configuration flag authentication.checkHeaders should be false, as the global instance does not currently support multitenancy.
- the current stable release (0.5.0) is about to be replaced by the next version (probably today), so some of the problems may be solved by the update.
Hope this clarifies some of your doubts.
[EDIT]
1) I have already installed telefonicaid/fiware-orion-pep (v 0.6.0) from sources and from the rpm package created by following the tutorial available on GitHub. The rpm package is created with the name pep-proxy-0.4.0_next-0.noarch.rpm.
2) Here is the configuration that I used:
/opt/fiware-orion-pep/config.js
var config = {};
config.resource = {
    original: {
        host: 'localhost',
        port: 10026
    },
    proxy: {
        port: 1026,
        adminPort: 11211
    }
};
config.authentication = {
    checkHeaders: false,
    module: 'keystone',
    user: '<##################>',
    password: '<###################>',
    domainName: 'admin_domain',
    retries: 3,
    cacheTTLs: {
        users: 1000,
        projectIds: 1000,
        roles: 60
    },
    options: {
        protocol: 'http',
        host: 'cloud.lab.fiware.org',
        port: 4730,
        path: '/v3/role_assignments',
        authPath: '/v3/auth/tokens'
    }
};
config.ssl = {
    active: false,
    keyFile: '',
    certFile: ''
};
config.logLevel = 'DEBUG'; // List of component
config.middlewares = {
    require: 'lib/plugins/orionPlugin',
    functions: [
        'extractCBAction'
    ]
};
config.componentName = 'orion';
config.resourceNamePrefix = 'fiware:';
config.bypass = false;
config.bypassRoleId = '';
module.exports = config;
/etc/sysconfig/pepProxy
# General Configuration
############################################################################
# Port where the proxy will listen for requests
PROXY_PORT=1026
# User to execute the PEP Proxy with
PROXY_USER=pepproxy
# Host where the target Context Broker is located
# TARGET_HOST=localhost
# Port where the target Context Broker is listening
# TARGET_PORT=10026
# Maximum level of logs to show (FATAL, ERROR, WARNING, INFO, DEBUG)
LOG_LEVEL=DEBUG
# Indicates what component plugin should be loaded with this PEP: orion, keypass, perseo
COMPONENT_PLUGIN=orion
#
# Access Control Configuration
############################################################################
# Host where the Access Control (the component who knows the policies for the incoming requests) is located
# ACCESS_HOST=
# Port where the Access Control is listening
# ACCESS_PORT=
# Host where the authentication authority for the Access Control is located
# AUTHENTICATION_HOST=
# Port where the authentication authority is listening
# AUTHENTICATION_PORT=
# User name of the PEP Proxy in the authentication authority
PROXY_USERNAME=XXXXXXXXXXXXX
# Password of the PEP Proxy in the Authentication authority
PROXY_PASSWORD=XXXXXXXXXXXXX
In the files above I have tried the following parameters:
Keystone instance: account.lab.fiware.org or cloud.lab.fiware.org
User: pep or pepProxy or "user from fiware account"
Pass: pep or pepProxy or "user password from account"
Port: 4730, 4731, 5000
The result is the same as before: telefonicaid/fiware-orion-pep is unable to authenticate:
log file at /var/log/pepProxy/pepProxy
time=2015-04-13T14:49:24.718Z | lvl=ERROR | corr=71a34c8b-10b3-40a3-be85-71bd3ce34c8a | trans=71a34c8b-10b3-40a3-be85-71bd3ce34c8a | op=/v1/updateContext | msg=VALIDATION-GEN-003] Error connecting to Keystone authentication: KEYSTONE_AUTHENTICATION_ERROR: There was a connection error while authenticating to Keystone: 500
time=2015-04-13T14:49:24.721Z | lvl=DEBUG | corr=71a34c8b-10b3-40a3-be85-71bd3ce34c8a | trans=71a34c8b-10b3-40a3-be85-71bd3ce34c8a | op=/v1/updateContext | msg=response-time: 50745 statusCode: 500
result from the client console
{
"message": "There was a connection error while authenticating to Keystone: 500",
"name": "KEYSTONE_AUTHENTICATION_ERROR"
}
Am I doing something wrong here?

Mongorestore not restoring data

I have an existing mongodump of a single collection that I am trying to restore. After running mongorestore, no errors show up, but the data is not in the collection. Are there any known reasons why this could happen? I would expect that if the data weren't inserted for some reason, an error would appear in the log.
To create and attempt to restore the dump, I followed the answer provided for this question:
How to use mongodump for 1 collection
I've created a new database on a different server and it has an empty collection. I've checked the mongo log file and there are no errors; it shows the connection being opened and authenticated, then disconnected on the next line.
mongorestore -vvvvv -u user -p 'password' --db=MyDatabase --collection=MyCollection dump1/MyCollection.bson
2015-03-04T18:20:31.331+0000 creating new connection to:127.0.0.1:27017
2015-03-04T18:20:31.332+0000 [ConnectBG] BackgroundJob starting: ConnectBG
2015-03-04T18:20:31.332+0000 connected to server 127.0.0.1:27017 (127.0.0.1)
2015-03-04T18:20:31.332+0000 connected connection!
connected to: 127.0.0.1
2015-03-04T18:20:31.333+0000 drillDown: dump1/MyCollection.bson
2015-03-04T18:20:31.333+0000 dump1/MyCollection.bson
2015-03-04T18:20:31.333+0000 going into namespace [MyDatabase.MyCollection]
Restoring to MyDatabase.MyCollection without dropping. Restored data will be inserted without raising errors; check your server log
file size: 94876
130 objects found
2015-03-04T18:20:31.336+0000 Creating index: { key: { _id: 1 }, name: "_id_", ns: "MyDatabase.MyCollection" }
2015-03-04T18:20:31.340+0000 Creating index: { key: { geometry: "2dsphere" }, name: "geometry_2dsphere", ns: "MyDatabase.MyCollection", 2dsphereIndexVersion: 2 }
Log file:
2015-03-04T18:20:31.333+0000 [conn874] authenticate db: MyDatabase { authenticate: 1, nonce: "xxx", user: "user", key: "xxx" }
2015-03-04T18:20:31.342+0000 [conn874] end connection 127.0.0.1:59420 (25 connections now open)
The query I am using on the origin and destination is:
db.MyCollection.find()
On the origin server, the collection has 130 elements, which is what is also shown in the mongorestore output "130 objects found".
Edit:
I added the --drop option to the mongorestore command. The log file output clearly shows that it is creating the index on an empty collection.
2015-03-20T15:03:57.565+0000 [conn61965] authenticate db: MyDatabase { authenticate: 1, nonce: "xxx", user: "user", key: "xxx" }
2015-03-20T15:03:57.566+0000 [conn61965] CMD: drop MyDatabase.MyCollection
2015-03-20T15:03:57.631+0000 [conn61965] build index on: MyDatabase.MyCollection properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "MyDatabase.MyCollection" }
2015-03-20T15:03:57.631+0000 [conn61965] added index to empty collection
2015-03-20T15:03:57.652+0000 [conn61965] build index on: MyDatabase.MyCollection properties: { v: 1, key: { geometry: "2dsphere" }, name: "geometry_2dsphere", ns: "MyDatabase.MyCollection", 2dsphereIndexVersion: 2 }
2015-03-20T15:03:57.652+0000 [conn61965] added index to empty collection
2015-03-20T15:03:57.654+0000 [conn61965] end connection 127.0.0.1:59456 (21 connections now open)
So the issue ended up being that the user I was trying to do the restore with only had the read and dbAdmin roles. I had made a separate user so that the regular user used by the application did not have administrative rights. After changing my user's role from read to readWrite, it worked as expected.
To be honest, if the user didn't have the correct permissions, I really would have expected the log to show an error of some sort when it tries to run the restore without the correct permission.
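Since the server does not report the failure, a check of the user's roles before running mongorestore would have caught this. The sketch below inspects a document shaped like the result of the usersInfo command (with PyMongo it could be obtained via db.command("usersInfo", "user")). The role sets and the helper name can_write are assumptions for illustration and are not exhaustive, and the sample documents are made up to match the roles described above.

```python
# Built-in roles assumed here to allow inserting documents. This list is
# illustrative only; custom roles and inherited roles are not considered.
WRITE_ROLES_ON_DB = {"readWrite", "dbOwner"}
WRITE_ROLES_ON_ADMIN = {"readWriteAnyDatabase", "root", "restore"}


def can_write(users_info, db_name):
    """Return True if any user in a usersInfo-style result can write to db_name."""
    for user in users_info.get("users", []):
        for role in user.get("roles", []):
            if role["db"] == db_name and role["role"] in WRITE_ROLES_ON_DB:
                return True
            if role["db"] == "admin" and role["role"] in WRITE_ROLES_ON_ADMIN:
                return True
    return False


# Hypothetical documents matching the situation before and after the fix.
before = {"users": [{"user": "user", "roles": [
    {"role": "read", "db": "MyDatabase"},
    {"role": "dbAdmin", "db": "MyDatabase"},
]}]}
after = {"users": [{"user": "user", "roles": [
    {"role": "readWrite", "db": "MyDatabase"},
    {"role": "dbAdmin", "db": "MyDatabase"},
]}]}

print(can_write(before, "MyDatabase"))  # False
print(can_write(after, "MyDatabase"))   # True
```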

Mongodb authentication using elixir-mongo

I have just started using Elixir, so I figure I have some basic misunderstanding going on here. Here is the code...
defmodule Mdb do
  def connect(collection, this_db \\ "db-test") do
    {:ok, mongo} = Mongo.connect("db-test.some-mongo-server.com", 12345)
    db = mongo |> Mongo.db(this_db)
    db |> Mongo.auth("user", "secretpassword")
    db
  end
end
I start with iex -S mix
and when I try db = Mdb.connect("users") I get
** (UndefinedFunctionError) undefined function: Mongo.auth/3
Mongo.auth(%Mongo.Db{auth: nil, mongo: %Mongo.Server{host: 'db-test.some-mongo-server.com', id_prefix: 12641, mode: :passive, opts: %{}, port: 12345, socket: #Port<0.5732>, timeout: 6000}, name: "db-stage", opts: %{mode: :passive, timeout: 6000}}, "user", "secretpassword")
(mdb_play) lib/mdb.ex:7: Mdb.connect/2
It looks like Mongo.auth/3 is undefined, but that makes no sense to me. Can anyone point me towards my error?
Thanks for the help.
I just played around with it and hit the same error. As the error message says, Mongo.auth does not seem to be defined; it may be Mongo.Db.auth instead. However, I then got another error (ArgumentError) from Mongo.Db.auth as well. It may be an issue in the library.
** (ArgumentError) argument error
:erlang.byte_size
...
(mongo) lib/mongo_request.ex:43: Mongo.Request.cmd/3
(mongo) lib/mongo_db.ex:44: Mongo.Db.auth/1
I'm not familiar with the library, but after a small change in Mongo.Db.auth, a normal call seems to work.
I tried the following sequence:
mongo = Mongo.connect!(server, port)
db = mongo |> Mongo.db(db_name)
db |> Mongo.Db.auth(user_name, password)
collection = db |> Mongo.Db.collection(collection_name)
collection |> Mongo.Collection.count()
The change I tried is in the following forked repo:
https://github.com/parroty/elixir-mongo