Kubespray Kubernetes Installation Fails - dockerd[8296]: unable to configure the Docker daemon with file /etc/docker/daemon.json

The above error occurred when installing Kubernetes using Kubespray.
The installation fails, and through journalctl -xe I see the following:
node1 systemd[1]: Starting Docker Application Container Engine...
-- Subject: Unit docker.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker.service has begun starting up.
Dec 09 23:37:01 node1 dockerd[8296]: unable to configure the Docker daemon with file /etc/docker/daemon.json: the following directives are specified both as a flag and in the configuration file: lo
Dec 09 23:37:01 node1 systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Dec 09 23:37:01 node1 systemd[1]: Failed to start Docker Application Container Engine.
How do I troubleshoot and fix this issue? Is there something I am missing or should be looking into?
The JSON file is as follows:
[root@k8s-master01 kubespray]# cat /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ]
}
The docker.yml file is as follows:
cat inventory/sample/group_vars/all/docker.yml
---
## Uncomment this if you want to force overlay/overlay2 as docker storage driver
## Please note that overlay2 is only supported on newer kernels
# docker_storage_options: -s overlay2
## Enable docker_container_storage_setup, it will configure devicemapper driver on Centos7 or RedHat7.
docker_container_storage_setup: false
## It must be define a disk path for docker_container_storage_setup_devs.
## Otherwise docker-storage-setup will be executed incorrectly.
# docker_container_storage_setup_devs: /dev/vdb
## Uncomment this if you have more than 3 nameservers, then we'll only use the first 3.
docker_dns_servers_strict: false
# Path used to store Docker data
docker_daemon_graph: "/var/lib/docker"
## Used to set docker daemon iptables options to true
docker_iptables_enabled: "false"
# Docker log options
# Rotate container stderr/stdout logs at 50m and keep last 5
docker_log_opts: "--log-opt max-size=50m --log-opt max-file=5"
# define docker bin_dir
docker_bin_dir: "/usr/bin"
# keep docker packages after installation; speeds up repeated ansible provisioning runs when '1'
# kubespray deletes the docker package on each run, so caching the package makes sense
docker_rpm_keepcache: 0
## An obvious use case is allowing insecure-registry access to self hosted registries.
## Can be ipaddress and domain_name.
## example define 172.19.16.11 or mirror.registry.io
# docker_insecure_registries:
# - mirror.registry.io
# - 172.19.16.11
## Add other registry,example China registry mirror.
# docker_registry_mirrors:
# - https://registry.docker-cn.com
# - https://mirror.aliyuncs.com
## If non-empty will override default system MountFlags value.
## This option takes a mount propagation flag: shared, slave
## or private, which control whether mounts in the file system
## namespace set up for docker will receive or propagate mounts
## and unmounts. Leave empty for system default
# docker_mount_flags:
## A string of extra options to pass to the docker daemon.
## This string should be exactly as you wish it to appear.
docker_options: >-
The setup.cfg file is as below:
[root@k8s-master01 kubespray]# cat setup.cfg
[metadata]
name = kubespray
summary = Ansible modules for installing Kubernetes
description-file =
README.md
author = Kubespray
author-email = smainklh@gmail.com
license = Apache License (2.0)
home-page = https://github.com/kubernetes-sigs/kubespray
classifier =
License :: OSI Approved :: Apache Software License
Development Status :: 4 - Beta
Intended Audience :: Developers
Intended Audience :: System Administrators
Intended Audience :: Information Technology
Topic :: Utilities
[global]
setup-hooks =
pbr.hooks.setup_hook
[files]
data_files =
usr/share/kubespray/playbooks/ =
cluster.yml
upgrade-cluster.yml
scale.yml
reset.yml
remove-node.yml
extra_playbooks/upgrade-only-k8s.yml
usr/share/kubespray/roles = roles/*
usr/share/kubespray/library = library/*
usr/share/doc/kubespray/ =
LICENSE
README.md
usr/share/doc/kubespray/inventory/ =
inventory/sample/inventory.ini
etc/kubespray/ =
ansible.cfg
etc/kubespray/inventory/sample/group_vars/ =
inventory/sample/group_vars/etcd.yml
etc/kubespray/inventory/sample/group_vars/all/ =
inventory/sample/group_vars/all/all.yml
inventory/sample/group_vars/all/azure.yml
inventory/sample/group_vars/all/coreos.yml
inventory/sample/group_vars/all/docker.yml
inventory/sample/group_vars/all/oci.yml
inventory/sample/group_vars/all/openstack.yml
[wheel]
universal = 1
[pbr]
skip_authors = True
skip_changelog = True
[bdist_rpm]
group = "System Environment/Libraries"
requires =
ansible
python-jinja2
python-netaddr

Take a look at the storage driver you have defined in the daemon.json file:
"storage-driver": "overlay2",
"storage-opts": [
  "overlay2.override_kernel_check=true"
]
At the same time, in the docker.yml file you did not enable the storage driver options:
## Uncomment this if you want to force overlay/overlay2 as docker storage driver
## Please note that overlay2 is only supported on newer kernels
# docker_storage_options: -s overlay2
Please uncomment the docker_storage_options: -s overlay2 line.
Make sure you have followed every step of the tutorial.
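Beyond that, a hedged troubleshooting sketch (the conflicting directive is cut off in the journal line above, so exactly which key it is remains an assumption): compare the flags Kubespray passes to dockerd through the systemd unit with the keys in daemon.json, and keep each setting in only one of the two places.
# Show the flags the systemd unit passes to dockerd (docker_options / docker_log_opts typically end up here)
systemctl cat docker.service
# Compare them with the keys defined in the file
cat /etc/docker/daemon.json
# After removing the duplicated setting from one side, reload and restart
systemctl daemon-reload && systemctl restart docker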

How to connect python s3fs client to a running Minio docker container?

For test purposes, I'm trying to connect a module that introduces an abstraction layer over s3fs with custom business logic.
It seems like I have trouble connecting the s3fs client to the Minio container.
Here's how I created the container and attached the s3fs client (below I describe how I validated that the container is running properly):
import s3fs
import docker

client = docker.from_env()
container = client.containers.run('minio/minio',
    "server /data --console-address ':9090'",
    environment={
        "MINIO_ACCESS_KEY": "minio",
        "MINIO_SECRET_KEY": "minio123",
    },
    ports={
        "9000/tcp": 9000,
        "9090/tcp": 9090,
    },
    volumes={'/tmp/minio': {'bind': '/data', 'mode': 'rw'}},
    detach=True)
container.reload()  # why reload: https://github.com/docker/docker-py/issues/2681

fs = s3fs.S3FileSystem(
    anon=False,
    key='minio',
    secret='minio123',
    use_ssl=False,
    client_kwargs={
        'endpoint_url': "http://localhost:9000"  # tried 127.0.0.1:9000 with no success
    }
)
===========
>>> fs.ls('/')
[]
>>> fs.ls('/data')
Bucket doesn't exist exception
Check that the container is running:
➜ ~ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
127e22c19a65 minio/minio "/usr/bin/docker-ent…" 56 seconds ago Up 55 seconds 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp, 0.0.0.0:9090->9090/tcp, :::9090->9090/tcp hardcore_ride
Check that the relevant volume is attached:
➜ ~ docker exec -it 127e22c19a65 bash
[root@127e22c19a65 /]# ls -l /data/
total 4
-rw-rw-r-- 1 1000 1000 4 Jan 11 16:02 foo.txt
[root@127e22c19a65 /]# exit
Since I proved that the volume binding is working properly by shelling into the container, I expected to see the same results when attaching to the container's filesystem via the s3fs client.
What is the bucket name that was created as part of this setup?
From the docs, I'm seeing that you have to use the <bucket_name>/<object_path> syntax to access the resources:
fs.ls('my-bucket')
['my-file.txt']
Also, if you look at the docs below, there are a couple of other ways to access it using fs.open. Can you give that a try?
https://buildmedia.readthedocs.org/media/pdf/s3fs/latest/s3fs.pdf
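As a minimal sketch of what that could look like (the bucket name my-bucket is hypothetical): in this single-disk mode MinIO exposes top-level directories under /data as buckets, so create a bucket first and then address objects as <bucket_name>/<object_path> rather than the container path /data.
fs.mkdir('my-bucket')                 # creates the bucket through the S3 API
with fs.open('my-bucket/foo.txt', 'wb') as f:   # write an object into it
    f.write(b'hello')
print(fs.ls('my-bucket'))             # expect ['my-bucket/foo.txt']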

Unable to bring up Eclipse che on Kubernetes

Getting ERR_TIMEOUT: Timeout set to pod wait timeout 300000 while downloading images.
I am new to Eclipse Che and Kubernetes. I got Kubernetes installed on Ubuntu and am trying to run chectl server:start, but it is failing. What am I doing wrong? Below is the trace I get. Is there a log file where I could get more details? Please help.
Details:
✔ Verify Kubernetes API...OK
✔  Looking for an already existing Che instance
✔ Verify if Che is deployed into namespace "che"
✔ Found running che deployment
✔ Found running plugin registry deployment
✔ Found running devfile registry deployment
✔  Starting already deployed Che
✔ Scaling up Che Deployments...done.
❯ ✅ Post installation checklist
❯ Che pod bootstrap
✔ scheduling...done.
✖ downloading images
→ ERR_TIMEOUT: Timeout set to pod wait timeout 300000
starting
Retrieving Che Server URL
Che status check
Error: ERR_TIMEOUT: Timeout set to pod wait timeout 300000
at KubeHelper.<anonymous> (/usr/local/lib/chectl/lib/api/kube.js:578:19)
at Generator.next (<anonymous>)
at fulfilled (/usr/local/lib/chectl/node_modules/tslib/tslib.js:107:62)
Values.yaml
#
# Copyright (c) 2012-2017 Red Hat, Inc.
# This program and the accompanying materials are made
# available under the terms of the Eclipse Public License 2.0
# which is available at https://www.eclipse.org/legal/epl-2.0/
#
# SPDX-License-Identifier: EPL-2.0
#
# the following section is for secure registries. when uncommented, a pull secret will be created
#registry:
# host: my-secure-private-registry.com
# username: myUser
# password: myPass
cheWorkspaceHttpProxy: ""
cheWorkspaceHttpsProxy: ""
cheWorkspaceNoProxy: ""
cheImage: eclipse/che-server:nightly
cheImagePullPolicy: Always
cheKeycloakRealm: "che"
cheKeycloakClientId: "che-public"
#customOidcUsernameClaim: ""
#customOidcProvider: ""
#workspaceDefaultRamRequest: ""
#workspaceDefaultRamLimit: ""
#workspaceSidecarDefaultRamLimit: ""
global:
  cheNamespace: ""
  multiuser: false
  # This value can be passed if custom Oidc provider is used, and there is no need to deploy keycloak in multiuser mode
  # default (if empty) is true
  #cheDedicatedKeycloak: false
  ingressDomain: <xx.xx.xx.xx.nip.io>
  # See --annotations-prefix flag (https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/cli-arguments.md)
  ingressAnnotationsPrefix: "nginx."
  # options: default-host, single-host, multi-host
  serverStrategy: multi-host
  tls:
    enabled: false
    useCertManager: true
    useStaging: true
    secretName: che-tls
  gitHubClientID: ""
  gitHubClientSecret: ""
  pvcClaim: "1Gi"
  cheWorkspacesNamespace: ""
  workspaceIdleTimeout: "-1"
  log:
    loggerConfig: ""
    appenderName: "plaintext"
Try to increase the timeout by setting --k8spodreadytimeout=500000.
[1] https://github.com/che-incubator/chectl
Following https://github.com/eclipse/che/issues/13871 (which is for Minishift):
kubectl delete namespaces che
chectl server:start --platform minikube
Give it a try.
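A hedged sketch combining the two suggestions above (the --k8spodreadytimeout flag name is taken from the first answer and --platform minikube from the second; verify both against your chectl version and cluster):
# Remove the stuck deployment, then redeploy with a longer pod wait timeout (milliseconds)
kubectl delete namespaces che
chectl server:start --platform minikube --k8spodreadytimeout=500000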
I hope by now you have installed Eclipse Che on Kubernetes successfully.

Removing pool 'mon_allow_pool_delete config option to true before you can destroy a pool1_U (500)

I'm running Proxmox and I'm trying to remove a pool which I created by mistake.
However, it keeps giving this error:
mon_command failed - pool deletion is disabled; you must first set the mon_allow_pool_delete config option to true before you can destroy a pool1_U (500)
OK
But:
root@kvm-01:~# ceph -n mon.0 --show-config | grep mon_allow_pool_delete
mon_allow_pool_delete = true
root@kvm-01:~# ceph -n mon.1 --show-config | grep mon_allow_pool_delete
mon_allow_pool_delete = true
root@kvm-01:~# ceph -n mon.2 --show-config | grep mon_allow_pool_delete
mon_allow_pool_delete = true
root@kvm-01:~# cat /etc/ceph/ceph.conf
[global]
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cluster network = 10.0.0.0/24
filestore xattr use omap = true
fsid = 41fa3ff6-e751-4ebf-8a76-3f4a445823d2
keyring = /etc/pve/priv/$cluster.$name.keyring
osd journal size = 5120
osd pool default min size = 1
public network = 10.0.0.0/24
[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring
[mon.0]
host = kvm-01
mon addr = 10.0.0.1:6789
mon allow pool delete = true
[mon.2]
host = kvm-03
mon addr = 10.0.0.3:6789
mon allow pool delete = true
[mon.1]
host = kvm-02
mon addr = 10.0.0.2:6789
mon allow pool delete = true
So that's my full config. Any idea why I am unable to delete my pools?
Another approach:
ceph tell mon.\* injectargs '--mon-allow-pool-delete=true'
ceph osd pool rm test-pool test-pool --yes-i-really-really-mean-it
You can set the config via the CLI or via the dashboard of Ceph under Cluster -> Configuration (advanced settings).
The CLI command is the following:
ceph config set mon mon_allow_pool_delete true
You need to run:
systemctl restart ceph-mon.target
Otherwise, you can restart the server an infinite number of times and nothing will happen.
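Putting these suggestions together, the full sequence looks like this (test-pool is the example pool name from above; substitute your own):
# Allow pool deletion cluster-wide, make the monitors pick it up, then delete the pool
ceph config set mon mon_allow_pool_delete true
systemctl restart ceph-mon.target
ceph osd pool rm test-pool test-pool --yes-i-really-really-mean-it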
After editing the config, you need to reboot the node. After the reboot, everything went smoothly!
After adding the following lines to /etc/ceph/ceph.conf or /etc/ceph/ceph.d/ceph.conf and restarting the ceph.target service, the issue still exists:
[mon.1]
host = kvm-02
mon addr = 10.11.110.112:6789
mon allow pool delete = true

dashDB Local on fedora 25 - error code 130

I tried the 30-day trial of dashDB Local. I followed the steps described in this link:
https://www.ibm.com/support/knowledgecenter/en/SS6NHC/com.ibm.swg.im.dashdb.doc/admin/linux_deploy.html
I did not create a node configuration file because mine is an SMP setup.
I logged into my Docker Hub account and pulled the image:
docker login -u xxx -p yyyyy
docker pull ibmdashdb/local:latest-linux
The pull took 5 minutes or so. I waited for the image download to complete.
I ran the following command. It completed successfully.
docker run -d -it --privileged=true --net=host --name=dashDB -v /mnt/clusterfs:/mnt/bludata0 -v /mnt/clusterfs:/mnt/blumeta0 ibmdashdb/local:latest-linux
Then I ran the logs command:
docker logs --follow dashDB
This showed that dashDB did not start but exited with error code 130:
# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0f008f8e413d ibmdashdb/local:latest-linux "/usr/sbin/init" 16 seconds ago Exited (130) 1 seconds ago dashDB
#
The logs command shows this:
2017-05-17T17:48:11.285582000Z Detected virtualization docker.
2017-05-17T17:48:11.286078000Z Detected architecture x86-64.
2017-05-17T17:48:11.286481000Z
2017-05-17T17:48:11.294224000Z Welcome to dashDB Local!
2017-05-17T17:48:11.294621000Z
2017-05-17T17:48:11.295022000Z Set hostname to <orion>.
2017-05-17T17:48:11.547189000Z Cannot add dependency job for unit systemd-tmpfiles-clean.timer, ignoring: Unit is masked.
2017-05-17T17:48:11.547619000Z [ OK ] Reached target Timers.
<snip>
2017-05-17T17:48:13.361610000Z [ OK ] Started The entrypoint script for initializing dashDB local.
2017-05-17T17:48:19.729980000Z [100209.207731] start_dashDB_local.sh[161]: /usr/lib/dashDB_local_common_functions.sh: line 1816: /tmp/etc_profile-LOCAL.cfg: No such file or directory
2017-05-17T17:48:20.236127000Z [100209.713223] start_dashDB_local.sh[161]: The dashDB Local container's environment is not set up yet.
2017-05-17T17:48:20.275248000Z [ OK ] Stopped Create Volatile Files and Directories.
<snip>
2017-05-17T17:48:20.737471000Z Sending SIGTERM to remaining processes...
2017-05-17T17:48:20.840909000Z Sending SIGKILL to remaining processes...
2017-05-17T17:48:20.880537000Z Powering off.
So it looks like start_dashDB_local.sh is failing at line 1816 of /usr/lib/dashDB_local_common_functions.sh? I exported the image, and this is the code around line 1816 of dashDB_local_common_functions.sh:
update_etc_profile()
{
    local runtime_env=$1
    local cfg_file

    # Check if /etc/profile/dashdb_env.sh is already updated
    grep -q BLUMETAHOME /etc/profile.d/dashdb_env.sh
    if [ $? -eq 0 ]; then
        return
    fi

    case "$runtime_env" in
        "AWS" | "V1.5" ) cfg_file="/tmp/etc_profile-V15_AWS.cfg"
            ;;
        "V2.0" ) cfg_file="/tmp/etc_profile-V20.cfg"
            ;;
        "LOCAL" ) # dashDB Local Case and also the default
            cfg_file="/tmp/etc_profile-LOCAL.cfg"
            ;;
        *) logger_error "Invalid ${runtime_env} value"
            return
            ;;
    esac
I also see /tmp/etc_profile-LOCAL.cfg in the image. Did I miss any step here?
I also created the /mnt/clusterfs/nodes file ... but it did not help. The same docker run command failed in the same way.
Please help.
I am using x86_64 Fedora 25.
# docker version
Client:
Version: 1.12.6
API version: 1.24
Package version: docker-common-1.12.6-6.gitae7d637.fc25.x86_64
Go version: go1.7.4
Git commit: ae7d637/1.12.6
Built: Mon Jan 30 16:15:28 2017
OS/Arch: linux/amd64
Server:
Version: 1.12.6
API version: 1.24
Package version: docker-common-1.12.6-6.gitae7d637.fc25.x86_64
Go version: go1.7.4
Git commit: ae7d637/1.12.6
Built: Mon Jan 30 16:15:28 2017
OS/Arch: linux/amd64
#
# cat /etc/fedora-release
Fedora release 25 (Twenty Five)
# uname -r
4.10.15-200.fc25.x86_64
#
Thanks for bringing this to our attention. I reached out to our developer team. It seems this is happening because, inside the container, tmpfs gets mounted onto /tmp and wipes out all the scripts.
We have seen this issue, and moving to the latest version of Docker seems to fix it. Your docker version output shows it is an older version.
So please install the latest Docker version, retry the deployment of dashDB Local, and update here.
Regards
Murali
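For completeness, a hedged sketch of that retry on Fedora (the distro docker package name is an assumption; your setup may instead use docker-ce from Docker's own repository):
sudo dnf upgrade -y docker            # upgrade Docker to the latest available version
sudo systemctl restart docker
docker rm dashDB                      # remove the failed container before re-running
docker run -d -it --privileged=true --net=host --name=dashDB \
    -v /mnt/clusterfs:/mnt/bludata0 -v /mnt/clusterfs:/mnt/blumeta0 \
    ibmdashdb/local:latest-linux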

How to set kube-scheduler print log to file

The Kubernetes version is 1.2.
I want to watch the scheduler's log, so how do I make kube-scheduler print its log to a file?
The kube-scheduler's configuration is at this path: /etc/kubernetes/scheduler.
The global configuration is at this path: /etc/kubernetes/config.
There we can see these notes:
# logging to stderr means we get it in the systemd journal
KUBE_LOGTOSTDERR="--logtostderr=true"
# journal message level, 0 is debug
KUBE_LOG_LEVEL="--v=0"
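If you really do want a file instead of the journal, here is a hedged sketch based on the flag-file layout shown above (--logtostderr and --log-dir are glog flags, which the 1.2 components use; KUBE_SCHEDULER_ARGS is the usual variable in /etc/kubernetes/scheduler on RPM-based installs, so treat the exact variable name as an assumption):
# /etc/kubernetes/scheduler -- scope the change to the scheduler only
KUBE_SCHEDULER_ARGS="--logtostderr=false --log-dir=/var/log/kubernetes"
# then create the directory and restart the service
mkdir -p /var/log/kubernetes
systemctl restart kube-scheduler
# glog then writes files such as /var/log/kubernetes/kube-scheduler.INFO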
You can tail the output of the service with journalctl if it runs under systemd (the scheduler's unit is typically named kube-scheduler): journalctl -u kube-scheduler -f
Or, if it runs as a container, find the container ID of the scheduler and tail it with Docker: docker logs -f <container-id>