Kubectl connecting to Azure ACS cluster - kubernetes

I have deployed a cluster to Azure successfully. Now trying to set up kubectl to work with it.
Running:
az acs kubernetes get-credentials --resource-group=group --name=cluster
Results in:
scp: .kube/config: No such file or directory
Traceback (most recent call last):
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/main.py", line 36, in main
cmd_result = APPLICATION.execute(args)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/core/application.py", line 212, in execute
result = expanded_arg.func(params)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/core/commands/__init__.py", line 377, in __call__
return self.handler(*args, **kwargs)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/core/commands/__init__.py", line 626, in _execute_command
reraise(*sys.exc_info())
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/core/commands/__init__.py", line 603, in _execute_command
result = op(client, **kwargs) if client else op(**kwargs)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/command_modules/acs/custom.py", line 814, in k8s_get_credentials
_k8s_get_credentials_internal(name, acs_info, path, ssh_key_file)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/command_modules/acs/custom.py", line 835, in _k8s_get_credentials_internal
'.kube/config', path_candidate, key_filename=ssh_key_file)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/command_modules/acs/acs_client.py", line 64, in secure_copy
scp.get(src, dest)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/scp.py", line 198, in get
self._recv_all()
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/scp.py", line 348, in _recv_all
raise SCPException(asunicode(msg[1:]))
SCPException: scp: .kube/config: No such file or directory
File ~/.kube/config exists, and I can ssh to the master node of the cluster.
Any idea what on can be wrong with what I am doing?
UPDATE.
Tried running the command per one of the comments here:
az acs kubernetes get-credentials --resource-group=group --name=cluster -f /Users/me/.kube/config --debug
And received the following info:
Password for private key:
paramiko.transport : starting thread (client mode): 0x9659890L
paramiko.transport : Local version/idstring: SSH-2.0-paramiko_2.3.1
paramiko.transport : Remote version/idstring: SSH-2.0-OpenSSH_7.2p2 Ubuntu-4ubuntu2.2
paramiko.transport : Connected (version 2.0, client OpenSSH_7.2p2)
paramiko.transport : kex algos:[u'curve25519-sha256#libssh.org', u'ecdh-sha2-nistp256', u'ecdh-sha2-nistp384', u'ecdh-sha2-nistp521', u'diffie-hellman-group-exchange-sha256', u'diffie-hellman-group14-sha1'] server key:[u'ssh-rsa', u'rsa-sha2-512', u'rsa-sha2-256', u'ecdsa-sha2-nistp256', u'ssh-ed25519'] client encrypt:[u'chacha20-poly1305#openssh.com', u'aes128-ctr', u'aes192-ctr', u'aes256-ctr', u'aes128-gcm#openssh.com', u'aes256-gcm#openssh.com'] server encrypt:[u'chacha20-poly1305#openssh.com', u'aes128-ctr', u'aes192-ctr', u'aes256-ctr', u'aes128-gcm#openssh.com', u'aes256-gcm#openssh.com'] client mac:[u'umac-64-etm#openssh.com', u'umac-128-etm#openssh.com', u'hmac-sha2-256-etm#openssh.com', u'hmac-sha2-512-etm#openssh.com', u'hmac-sha1-etm#openssh.com', u'umac-64#openssh.com', u'umac-128#openssh.com', u'hmac-sha2-256', u'hmac-sha2-512', u'hmac-sha1'] server mac:[u'umac-64-etm#openssh.com', u'umac-128-etm#openssh.com', u'hmac-sha2-256-etm#openssh.com', u'hmac-sha2-512-etm#openssh.com', u'hmac-sha1-etm#openssh.com', u'umac-64#openssh.com', u'umac-128#openssh.com', u'hmac-sha2-256', u'hmac-sha2-512', u'hmac-sha1'] client compress:[u'none', u'zlib#openssh.com'] server compress:[u'none', u'zlib#openssh.com'] client lang:[u''] server lang:[u''] kex follows?False
paramiko.transport : Kex agreed: ecdh-sha2-nistp256
paramiko.transport : HostKey agreed: ecdsa-sha2-nistp256
paramiko.transport : Cipher agreed: aes128-ctr
paramiko.transport : MAC agreed: hmac-sha2-256
paramiko.transport : Compression agreed: none
paramiko.transport : kex engine KexNistp256 specified hash_algo <built-in function openssl_sha256>
paramiko.transport : Switch to new keys ...
paramiko.transport : Trying SSH key 765f702f5fff7fe30260d53b5bbb57eb
paramiko.transport : userauth is OK
paramiko.transport : Authentication (publickey) successful!
paramiko.transport : [chan 0] Max packet in: 32768 bytes
paramiko.transport : Received global request "hostkeys-00#openssh.com"
paramiko.transport : Rejecting "hostkeys-00#openssh.com" global request from server.
paramiko.transport : [chan 0] Max packet out: 32768 bytes
paramiko.transport : Secsh channel 0 opened.
paramiko.transport : [chan 0] Sesch channel 0 request ok
paramiko.transport : [chan 0] EOF received (0)
paramiko.transport : [chan 0] EOF sent (0)
scp: .kube/config: No such file or directory
Traceback (most recent call last):
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/main.py", line 36, in main
cmd_result = APPLICATION.execute(args)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/core/application.py", line 212, in execute
result = expanded_arg.func(params)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/core/commands/__init__.py", line 377, in __call__
return self.handler(*args, **kwargs)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/core/commands/__init__.py", line 626, in _execute_command
reraise(*sys.exc_info())
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/core/commands/__init__.py", line 603, in _execute_command
result = op(client, **kwargs) if client else op(**kwargs)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/command_modules/acs/custom.py", line 814, in k8s_get_credentials
_k8s_get_credentials_internal(name, acs_info, path, ssh_key_file)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/command_modules/acs/custom.py", line 835, in _k8s_get_credentials_internal
'.kube/config', path_candidate, key_filename=ssh_key_file)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/azure/cli/command_modules/acs/acs_client.py", line 64, in secure_copy
scp.get(src, dest)
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/scp.py", line 198, in get
self._recv_all()
File "/Users/me/lib/azure-cli/lib/python2.7/site-packages/scp.py", line 348, in _recv_all
raise SCPException(asunicode(msg[1:]))
SCPException: scp: .kube/config: No such file or directory
Version of azure cli is 2.0.18. Would appreciate advice.

Try Using --file -f flag to specify the absolute path of the "config" file in .kube directory and you can use --debug to debug further.
az acs kubernetes get-credentials --resource-group=group --name=cluster -f /.../.../.kube/config --debug

In my case, .kube/config file wasn't created on the master. There was an error during the attempt to create .kube/config file, which I have noticed by ssh-ing to my master node and looking into /var/log/azure directory. The error was related to my service principal password. With a new service principal, new clusters work just fine.

Related

Couldn't resolve module/action 'k8s_exec' on Ansible Playbook

I am trying to make a reboot to a pod on a Rancher Cluster through Ansible and I am having this error:
ERROR! couldn't resolve module/action 'k8s_exec'. This often indicates a misspelling, missing collection, or incorrect module path.
The error appears to be in '/home/ansible/ansible/GetKubectlPods': line 27, column 7, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:
-
name: Reboot Machine
^ here
I don't know why I am getting an error since I am also using k8s_info and he is working be fine.
Here is the playbook I am running:
---
- hosts: localhost
connection: local
remote_user: root
vars:
ansible_python_interpreter: '{{ ansible_playbook_python }}'
tasks:
-
name: Get the pods in the specific namespace
k8s_info:
kubeconfig: '/etc/ansible/RCCloudConfig'
kind: Pod
namespace: redmine
register: pod_list
-
name: Print pod names
debug:
msg: "pod_list: {{ pod_list | json_query('resources[*].status.podIP') }} "
- set_fact:
pod_names: "{{pod_list|json_query('resources[*].metadata.name')}}"
-
name: Reboot Machine
k8s_exec:
kubeconfig: '/etc/ansible/RCCloudConfig'
namespace: redmine
pod: redmine_quick-testing-6c57cc5d65-lwkww #pod name
command: reboot
Ansible Version:
ansible 2.9.9
config file = /etc/ansible/ansible.cfg
configured module search path = ['/home/ansible/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python3.6/site-packages/ansible
executable location = /usr/bin/ansible
python version = 3.6.8 (default, Apr 16 2020, 01:36:27) [GCC 8.3.1 20191121 (Red Hat 8.3.1-5)]
In case I edit my playbook and add the community.kubernetes collection, I get the following error:
fatal: [localhost]: FAILED! => {"changed": false, "module_stderr": "/usr/local/lib/python3.6/site-packages/kubernetes/client/apis/__init__.py:12: DeprecationWarning: The package kubernetes.client.apis is renamed and deprecated, use kubernetes.client.api instead (please note that the trailing s was removed).\n DeprecationWarning\nTraceback (most recent call last):\n File \"/usr/local/lib/python3.6/site-packages/kubernetes/stream/ws_client.py\", line 296, in websocket_call\n client = WSClient(configuration, get_websocket_url(url), headers, capture_all)\n File \"/usr/local/lib/python3.6/site-packages/kubernetes/stream/ws_client.py\", line 94, in __init__\n self.sock.connect(url, header=header)\n File \"/usr/local/lib/python3.6/site-packages/websocket/_core.py\", line 226, in connect\n self.handshake_response = handshake(self.sock, *addrs, **options)\n File \"/usr/local/lib/python3.6/site-packages/websocket/_handshake.py\", line 80, in handshake\n status, resp = _get_resp_headers(sock)\n File \"/usr/local/lib/python3.6/site-packages/websocket/_handshake.py\", line 165, in _get_resp_headers\n raise WebSocketBadStatusException(\"Handshake status %d %s\", status, status_message, resp_headers)\nwebsocket._exceptions.WebSocketBadStatusException: Handshake status 404 Not Found\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/home/ansible/.ansible/tmp/ansible-tmp-1592530746.4685414-46123-84159696554463/AnsiballZ_k8s_exec.py\", line 102, in <module>\n _ansiballz_main()\n File \"/home/ansible/.ansible/tmp/ansible-tmp-1592530746.4685414-46123-84159696554463/AnsiballZ_k8s_exec.py\", line 94, in _ansiballz_main\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n File \"/home/ansible/.ansible/tmp/ansible-tmp-1592530746.4685414-46123-84159696554463/AnsiballZ_k8s_exec.py\", line 40, in invoke_module\n runpy.run_module(mod_name='ansible_collections.community.kubernetes.plugins.modules.k8s_exec', init_globals=None, run_name='__main__', alter_sys=True)\n File \"/usr/lib64/python3.6/runpy.py\", line 205, in run_module\n return _run_module_code(code, init_globals, run_name, mod_spec)\n File \"/usr/lib64/python3.6/runpy.py\", line 96, in _run_module_code\n mod_name, mod_spec, pkg_name, script_name)\n File \"/usr/lib64/python3.6/runpy.py\", line 85, in _run_code\n exec(code, run_globals)\n File \"/tmp/ansible_k8s_exec_payload_q6oom26c/ansible_k8s_exec_payload.zip/ansible_collections/community/kubernetes/plugins/modules/k8s_exec.py\", line 148, in <module>\n File \"/tmp/ansible_k8s_exec_payload_q6oom26c/ansible_k8s_exec_payload.zip/ansible_collections/community/kubernetes/plugins/modules/k8s_exec.py\", line 135, in main\n File \"/usr/local/lib/python3.6/site-packages/kubernetes/stream/stream.py\", line 35, in stream\n return func(*args, **kwargs)\n File \"/usr/local/lib/python3.6/site-packages/kubernetes/client/api/core_v1_api.py\", line 841, in connect_get_namespaced_pod_exec\n (data) = self.connect_get_namespaced_pod_exec_with_http_info(name, namespace, **kwargs) # noqa: E501\n File \"/usr/local/lib/python3.6/site-packages/kubernetes/client/api/core_v1_api.py\", line 941, in connect_get_namespaced_pod_exec_with_http_info\n collection_formats=collection_formats)\n File \"/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py\", line 345, in call_api\n _preload_content, _request_timeout)\n File \"/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py\", line 176, in __call_api\n _request_timeout=_request_timeout)\n File \"/usr/local/lib/python3.6/site-packages/kubernetes/stream/stream.py\", line 30, in _intercept_request_call\n return ws_client.websocket_call(config, *args, **kwargs)\n File \"/usr/local/lib/python3.6/site-packages/kubernetes/stream/ws_client.py\", line 302, in websocket_call\n raise ApiException(status=0, reason=str(e))\nkubernetes.client.rest.ApiException: (0)\nReason: Handshake status 404 Not Found\n\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}
Solution:
Find your k8s_exec.py file and edit your import settings from Kubernetes.apis to Kubernetes.api
Thank you in advance!

Unable to run airflow scheduler

I have recently installed airflow on an AWS server by using this guide for ubuntu 16.04. After a painful and successful install started the webserver. I tried a sample dag as follows
from airflow.operators.python_operator import PythonOperator
from airflow.operators.dummy_operator import DummyOperator
from datetime import timedelta
from airflow import DAG
import airflow
# DEFAULT ARGS
default_args = {
'owner': 'airflow',
'start_date': airflow.utils.dates.days_ago(2),
'depends_on_past': False}
dag = DAG('init_run', default_args=default_args, description='DAG SAMPLE',
schedule_interval='#daily')
def print_something():
print("HELLO AIRFLOW!")
with dag:
task_1 = PythonOperator(task_id='do_it', python_callable=print_something)
task_2 = DummyOperator(task_id='dummy')
task_1 << task_2
But when i open the UI the tasks in the dag are still in "No Status" no matter how many times i trigger manually or refresh the page.
Later i found out that airflow scheduler is not running and shows the following error:
{celery_executor.py:228} ERROR - Error sending Celery task:No module named 'MySQLdb'
Celery Task ID: ('init_run', 'dummy', datetime.datetime(2019, 5, 30, 18, 0, 24, 902499, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1)
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/airflow/executors/celery_executor.py", line 118, in send_task_to_executor
result = task.apply_async(args=[command], queue=queue)
File "/usr/local/lib/python3.7/site-packages/celery/app/task.py", line 535, in apply_async
**options
File "/usr/local/lib/python3.7/site-packages/celery/app/base.py", line 728, in send_task
amqp.send_task_message(P, name, message, **options)
File "/usr/local/lib/python3.7/site-packages/celery/app/amqp.py", line 552, in send_task_message
**properties
File "/usr/local/lib/python3.7/site-packages/kombu/messaging.py", line 181, in publish
exchange_name, declare,
File "/usr/local/lib/python3.7/site-packages/kombu/connection.py", line 510, in _ensured
return fun(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/kombu/messaging.py", line 194, in _publish
[maybe_declare(entity) for entity in declare]
File "/usr/local/lib/python3.7/site-packages/kombu/messaging.py", line 194, in <listcomp>
[maybe_declare(entity) for entity in declare]
File "/usr/local/lib/python3.7/site-packages/kombu/messaging.py", line 102, in maybe_declare
return maybe_declare(entity, self.channel, retry, **retry_policy)
File "/usr/local/lib/python3.7/site-packages/kombu/common.py", line 121, in maybe_declare
return _maybe_declare(entity, channel)
File "/usr/local/lib/python3.7/site-packages/kombu/common.py", line 145, in _maybe_declare
entity.declare(channel=channel)
File "/usr/local/lib/python3.7/site-packages/kombu/entity.py", line 608, in declare
self._create_queue(nowait=nowait, channel=channel)
File "/usr/local/lib/python3.7/site-packages/kombu/entity.py", line 617, in _create_queue
self.queue_declare(nowait=nowait, passive=False, channel=channel)
File "/usr/local/lib/python3.7/site-packages/kombu/entity.py", line 652, in queue_declare
nowait=nowait,
File "/usr/local/lib/python3.7/site-packages/kombu/transport/virtual/base.py", line 531, in queue_declare
self._new_queue(queue, **kwargs)
File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 82, in _new_queue
self._get_or_create(queue)
File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 70, in _get_or_create
obj = self.session.query(self.queue_cls) \
File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 65, in session
_, Session = self._open()
File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 56, in _open
engine = self._engine_from_config()
File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 51, in _engine_from_config
return create_engine(conninfo.hostname, **transport_options)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/__init__.py", line 443, in create_engine
return strategy.create(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/strategies.py", line 87, in create
dbapi = dialect_cls.dbapi(**dbapi_args)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 104, in dbapi
return __import__("MySQLdb")
ModuleNotFoundError: No module named 'MySQLdb'
Here is the setting in the config file (airflow.cfg):
sql_alchemy_conn = postgresql+psycopg2://airflow#localhost:5432/airflow
broker_url = sqla+mysql://airflow:airflow#localhost:3306/airflow
result_backend = db+postgresql://airflow:airflow#localhost/airflow
I been struggling with this issue for two days now, Please help
In your airflow.cfg, there should also be a config option for celery_result_backend. Are you able to let us know what this value is set to? If it is not present in your config, set it to the same value as the result_backend
i.e:
celery_result_backend = db+postgresql://airflow:airflow#localhost/airflow
And then restart the airflow stack to ensure the configuration changes apply.
(I wanted to leave this as a comment but don't have enough rep to do so)
I think the example you are following didnt told you to install mysql and it seems you are using it in broker URL.
you can install mysql and than configure it. (for python 3.5+)
pip install mysqlclient
Alternatively, for a quick fix. You can also use rabbit MQ(Rabbitmq is a message broker, that you will require to rerun airflow dags with celery) guest user login
and then your broker_url will be
broker_url = amqp://guest:guest#localhost:5672//
if not already installed, Rabbitmq can be installed with following command.
sudo apt install rabbitmq-server
Change configuration NODE_IP_ADDRESS=0.0.0.0 in configuration file located at
/etc/rabbitmq/rabbitmq-env.conf
start RabbitMQ service
sudo service rabbitmq-server start

MySQL Python connector TypeError: wrap_socket() got an unexpected keyword argument 'ciphers'

I am running into an issue with MySQL python connection.
Traceback (most recent call last):
File "./expconfig.py", line 176, in <module>
cnx = mysql.connector.connect(**config)
File "/usr/lib/python2.6/site-packages/mysql/connector/__init__.py", line 179, in connect
return MySQLConnection(*args, **kwargs)
File "/usr/lib/python2.6/site-packages/mysql/connector/connection.py", line 95, in __init__
self.connect(**kwargs)
File "/usr/lib/python2.6/site-packages/mysql/connector/abstracts.py", line 728, in connect
self._open_connection()
File "/usr/lib/python2.6/site-packages/mysql/connector/connection.py", line 228, in _open_connection
self._ssl)
File "/usr/lib/python2.6/site-packages/mysql/connector/connection.py", line 150, in _do_auth
ssl_options.get('cipher'))
File "/usr/lib/python2.6/site-packages/mysql/connector/network.py", line 420, in switch_to_ssl
ssl_version=ssl.PROTOCOL_TLSv1, ciphers=cipher)
TypeError: wrap_socket() got an unexpected keyword argument 'ciphers'
MySQL server has "ssl_diabled" , so the client doesn't need SSL connection. However it invokes
I have following,
Python : 2.6.6 MySql Python connector mysql-connector-python-2.1.7
OS : RHEL 6.6
MySQL server Server version : 5.7.17-enterprise-commercial-advanced
Code
try:
flags = [ClientFlag.FOUND_ROWS,-ClientFlag.SSL]
config = {
'user' : 'ed30_user',
'password' : 'mypassword',
'host' : options.remHost,
'database' : 'config',
'client_flags': [-ClientFlag.SSL],
'ssl_disabled' : False
}
cnx = mysql.connector.connect(**config)
cur=cnx.cursor(dictionary=True)
The reason for the error was that Pythong 2.6.6 has "ssl" module with different method signature than expected by the MysqlConnector version 2.1.7.
The MySQL web https://dev.mysql.com/doc/connector-python/en/connector-python-versions.html had in correct version table
I dropped back to connector version mysql-connector-python-2.0.5-1.el6.noarch.rpm and the error went away.
just set 'ssl_disabled' : True if using config file
or ssl_disabled='True' if your config is in arguments to mysql.connector.connect function

Mongo connector unable to connect to mongos

I am connecting to mongo with a user with clusterAdmin and backup roles, but I get the error:
2017-02-09 17:51:23,254 [ERROR] mongo_connector.util:96 - Fatal Exception
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/mongo_connector/util.py", line 94, in wrapped
func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/mongo_connector/connector.py", line 370, in run
'listShards')['shards']:
File "/usr/lib/python2.7/site-packages/mongo_connector/util.py", line 78, in retry_until_ok
return func(*args, **kwargs)
File "/usr/lib64/python2.7/site-packages/pymongo/database.py", line 494, in command
codec_options, **kwargs)
File "/usr/lib64/python2.7/site-packages/pymongo/database.py", line 406, in _command
parse_write_concern_error=parse_write_concern_error)
File "/usr/lib64/python2.7/site-packages/pymongo/pool.py", line 419, in command
collation=collation)
File "/usr/lib64/python2.7/site-packages/pymongo/network.py", line 116, in command
parse_write_concern_error=parse_write_concern_error)
File "/usr/lib64/python2.7/site-packages/pymongo/helpers.py", line 210, in _check_command_response
raise OperationFailure(msg % errmsg, code, response)
OperationFailure: not authorized on admin to execute command { listShards: 1 }
This page under Required Permissions says The simplest way to get mongo-connector running is to create a user with the backup role:
https://github.com/mongodb-labs/mongo-connector/wiki/Usage-with-Authentication
db.getSiblingDB("admin").createUser({ user:"backup",pwd:"password_here", roles: ["backup"] })
But I cant even connect with such a user (Authentication error):
2017-02-10 16:52:01,448 [ERROR] mongo_connector.util:96 - Fatal Exception
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/mongo_connector/util.py", line 94, in wrapped
func(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/mongo_connector/connector.py", line 398, in run
hosts, replicaSet=repl_set)
File "/usr/lib/python2.7/site-packages/mongo_connector/connector.py", line 299, in create_authed_client
client['admin'].authenticate(self.auth_username, self.auth_key)
File "/usr/lib64/python2.7/site-packages/pymongo/database.py", line 1048, in authenticate
connect=True)
File "/usr/lib64/python2.7/site-packages/pymongo/mongo_client.py", line 505, in _cache_credentials
sock_info.authenticate(credentials)
File "/usr/lib64/python2.7/site-packages/pymongo/pool.py", line 523, in authenticate
auth.authenticate(credentials, self)
File "/usr/lib64/python2.7/site-packages/pymongo/auth.py", line 470, in authenticate
auth_func(credentials, sock_info)
File "/usr/lib64/python2.7/site-packages/pymongo/auth.py", line 450, in _authenticate_default
return _authenticate_scram_sha1(credentials, sock_info)
File "/usr/lib64/python2.7/site-packages/pymongo/auth.py", line 201, in _authenticate_scram_sha1
res = sock_info.command(source, cmd)
File "/usr/lib64/python2.7/site-packages/pymongo/pool.py", line 419, in command
collation=collation)
File "/usr/lib64/python2.7/site-packages/pymongo/network.py", line 116, in command
parse_write_concern_error=parse_write_concern_error)
File "/usr/lib64/python2.7/site-packages/pymongo/helpers.py", line 210, in _check_command_response
raise OperationFailure(msg % errmsg, code, response)
OperationFailure: Authentication failed.
When I log into mongos with both these users and run the command
db.getSiblingDB("admin").runCommand( { listShards: 1 } )
I get a shard listing no probs
{
"shards" : [
{
"_id" : "shard001",
"host" : "shard001/timgrhlmdb01:27020,timgrhlmdb02:27020",
"state" : 1
},
{
"_id" : "shard002",
"host" : "shard002/timgrhlmdb03:27020,timgrhlmdb04:27020",
"state" : 1
}
],
"ok" : 1
}
So what does this mean:
OperationFailure: not authorized on admin to execute command { listShards: 1 }
Update
I rebuilt the cluster from scratch and still have the same problem: OperationFailure: not authorized on admin to execute command { listShards: 1 }
I have also tried the user 'backup' with only the roles 'clusterManager' and 'readAnyDatabase'. This allows the user to list shards, but now mongo-connector fails with 'Authentication failed':
{ "_id" : "admin.backup", "user" : "backup", "db" : "admin", "credentials" : { "SCRAM-SHA-1" : { "iterationCount" : 10000, "salt" : "pWcEU7uFqfHPgGe8z+E9Wg==", "storedKey" : "k2tapXQPtM2dHlxYnJiWVxO/rtg=", "serverKey" : "EGG8M4i27OYBy+fLYaL13+Nn4mc=" } }, "roles" : [ { "role" : "readAnyDatabase", "db" : "admin" }, { "role" : "clusterManager", "db" : "admin" } ] }
Check out users by running this command:
db.system.users.find({})
Make sure that the user you created is with a backup role,if you can log in as the backup user and you can also run those commands,that means backup user was created and granted a role and its privileges.
Make sure that you have the role of clusterManager to perform this.
Provides management and monitoring actions on the cluster. A user with
this role can access the config and local databases, which are used in
sharding and replication, respectively.
Provides the following actions on the cluster as a whole:
addShard
appendOplogNote
applicationMessage
cleanupOrphaned
flushRouterConfig
listShards
removeShard
etc
Have a look at built-in-roles.
By the way,have a look at this issue.Hope this helps.
Response from bug submitted to mongodb-labs/mongo-connector:
This is indeed a subtle bug introduced in #563. We changed a find on
config.shards into a call to listShards assuming that it would have no
change in behavior. Unfortunately (and annoyingly), the backup role
has privileges to read the list of shards in the config.shards
collection but, as you can see, does not have the privilege to run the
listShards command. I'll revert this change to fix the problem in
the upcoming 2.5.1 bug-fix release.
In the meantime, you will need to grant the mongo-connector user the
backup AND clusterMonitor roles.
An important point that is not yet mentioned in the documentation is
that the user must be created on a mongos and all the shards. This
enables mongo-connector to authenticate to the cluster as a whole and
to each shard individually.
This now works! yay
That will teach me for following the manual lol!

Error with remote sharded DB cluster

I connect to remote sharded DB cluster. It connects OK, but if I insert document I get error:
File "/home/df/SlickJump/WishNuServer/server.py", line 522, in POST
unit = wndb.pages.find_one({'pagehash': thash, 'version': Version})
File "/usr/local/lib/python2.7/dist-packages/pymongo/collection.py", line 598, in find_one
for result in self.find(spec_or_id, *args, **kwargs).limit(-1):
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 814, in next
if len(self.__data) or self._refresh():
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 763, in _refresh
self.__uuid_subtype))
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 720, in __send_message
self.__uuid_subtype)
File "/usr/local/lib/python2.7/dist-packages/pymongo/helpers.py", line 105, in _unpack_response
error_object["$err"])
OperationFailure: database error: setShardVersion failed host: db1s0:27017 { errmsg: "exception: remote client 5.101.101.150:46117 tried to initialize this host (db1
7) as shard shard0000, but shard name was previously initialized...", code: 13298, ok: 0.0 }
In MongoDB logs in looks:
2014-07-09T13:26:40.666+0000 [conn11] Assertion failed while processing query op for wishnudb.pages :: caused by :: 10429 setShardVersion failed host: db1s0:27017 {
"exception: remote client 5.101.101.150:46219 tried to initialize this host (db1s0:27017) as shard shard0000, but shard name was previously initialized...", code: 1
k: 0.0 }
Why do i get this error? There is a reason?
It happened to me when I had two shards that are actually one mongod:
{ "_id" : "shard0000", "host" : "localhost:27027" }
{ "_id" : "shard0001", "host" : "127.0.0.1:27027" }
When I removed the second shard, the error gone.