I am trying to set up a Ceph cluster. I have 4 nodes: 1 admin node, 1 monitor, and 2 object storage device (OSD) nodes. The installation guide I am following is here:
http://ceph.com/docs/master/start/quick-ceph-deploy/.
When I try to add the initial monitor (step 5 in the guide), I get the following error:
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cloud-user/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.21): /usr/bin/ceph-deploy mon create-initial
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts worker-1-full
[ceph_deploy.mon][DEBUG ] detecting platform for host worker-1-full ...
[worker-1-full][DEBUG ] connection detected need for sudo
[worker-1-full][DEBUG ] connected to host: worker-1-full
[worker-1-full][DEBUG ] detect platform information from remote host
[worker-1-full][DEBUG ] detect machine type
[ceph_deploy.mon][INFO ] distro info: Ubuntu 14.04 trusty
[worker-1-full][DEBUG ] determining if provided host has same hostname in remote
[worker-1-full][DEBUG ] get remote short hostname
[worker-1-full][DEBUG ] deploying mon to worker-1-full
[worker-1-full][DEBUG ] get remote short hostname
[worker-1-full][DEBUG ] remote hostname: worker-1-full
[worker-1-full][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[worker-1-full][DEBUG ] create the mon path if it does not exist
[worker-1-full][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-worker-1-full/done
[worker-1-full][DEBUG ] create a done file to avoid re-doing the mon deployment
[worker-1-full][DEBUG ] create the init path if it does not exist
[worker-1-full][DEBUG ] locating the `service` executable...
[worker-1-full][INFO ] Running command: sudo initctl emit ceph-mon cluster=ceph id=worker-1-full
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[worker-1-full][WARNIN] monitor: mon.worker-1-full, might not be running yet
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[worker-1-full][WARNIN] monitor worker-1-full does not exist in monmap
[worker-1-full][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[worker-1-full][WARNIN] monitors may not be able to form quorum
[ceph_deploy.mon][INFO ] processing monitor mon.worker-1-full
[worker-1-full][DEBUG ] connection detected need for sudo
[worker-1-full][DEBUG ] connected to host: worker-1-full
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph_deploy.mon][WARNIN] mon.worker-1-full monitor is not yet in quorum, tries left: 5
[ceph_deploy.mon][WARNIN] waiting 5 seconds before retrying
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph_deploy.mon][WARNIN] mon.worker-1-full monitor is not yet in quorum, tries left: 4
[ceph_deploy.mon][WARNIN] waiting 10 seconds before retrying
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph_deploy.mon][WARNIN] mon.worker-1-full monitor is not yet in quorum, tries left: 3
[ceph_deploy.mon][WARNIN] waiting 10 seconds before retrying
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph_deploy.mon][WARNIN] mon.worker-1-full monitor is not yet in quorum, tries left: 2
[ceph_deploy.mon][WARNIN] waiting 15 seconds before retrying
[worker-1-full][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.worker-1-full.asok mon_status
[worker-1-full][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph_deploy.mon][WARNIN] mon.worker-1-full monitor is not yet in quorum, tries left: 1
[ceph_deploy.mon][WARNIN] waiting 20 seconds before retrying
[ceph_deploy.mon][ERROR ] Some monitors have still not reached quorum:
[ceph_deploy.mon][ERROR ] worker-1-full
"worker-1-full" is the node I am trying to set up as my monitor. The command I used is:
"ceph-deploy mon create-initial". Please help. Thanks in advance!
Please check your ceph-deploy version:
ceph-deploy --version
I hit the same problem on 1.5.30;
see http://docs.ceph.com/ceph-deploy/docs/changelog.html#id34:
that version defaults to the "infernalis" release.
With ceph-deploy 1.5.29 and the "hammer" release it works fine
(other combinations may also work).
Good luck!
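Separately, the log warns that neither public_addr nor public_network is defined, so the monitors may not form quorum. A minimal sketch, assuming your monitor sits on the 10.0.0.0/24 subnet (a placeholder; use your own network): add to the [global] section of ceph.conf in your admin working directory
public_network = 10.0.0.0/24
and then re-run
ceph-deploy --overwrite-conf mon create-initial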
I have the following shell script, which I am running in the UserData section of a CloudFormation template:
#!/bin/sh
wget -qO - https://www.mongodb.org/static/pgp/server-4.4.asc | apt-key add -
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-4.4.list
apt-get update
apt-get install -y mongodb-org
apt install mongodb-clients
systemctl start mongod
systemctl status mongod
systemctl enable mongod
echo "ABOUT TO ENTER WHILE LOOP"
while :
do
echo "waiting to RUN MONGO COMMANDS"
echo "$(systemctl show -p ActiveState --value mongod)"
if [ "$(systemctl show -p ActiveState --value mongod)" = "active" ]
then
echo "RUNNING MONGO COMMANDS"
mongo crawler --eval "db.websites.insertOne({ customerId: '1', url: 'https://dootli.com' })"
mongo crawler --eval "db.createUser({ user: 'username', pwd: 'password', roles: 'clusterAdmin' })"
break
fi
done
As far as I can tell the script is valid, yet I'm getting the following output (and error) when it runs during initialization of the EC2 instance:
Reading package lists...
Building dependency tree...
Reading state information...
The following package was automatically installed and is no longer required:
mongodb-database-tools
Use 'apt autoremove' to remove it.
The following additional packages will be installed:
libboost-filesystem1.71.0 libboost-iostreams1.71.0
libboost-program-options1.71.0 libgoogle-perftools4 libpcrecpp0v5
libsnappy1v5 libtcmalloc-minimal4 libyaml-cpp0.6 mongo-tools
The following packages will be REMOVED:
mongodb-org mongodb-org-database-tools-extra mongodb-org-mongos
mongodb-org-server mongodb-org-shell mongodb-org-tools
The following NEW packages will be installed:
libboost-filesystem1.71.0 libboost-iostreams1.71.0
libboost-program-options1.71.0 libgoogle-perftools4 libpcrecpp0v5
libsnappy1v5 libtcmalloc-minimal4 libyaml-cpp0.6 mongo-tools mongodb-clients
0 upgraded, 10 newly installed, 6 to remove and 87 not upgraded.
Need to get 35.2 MB of archives.
After this operation, 44.4 MB disk space will be freed.
Do you want to continue? [Y/n] Abort.
● mongod.service - MongoDB Database Server
Loaded: loaded (/lib/systemd/system/mongod.service; disabled; vendor preset: enabled)
Active: active (running) since Sat 2021-02-06 20:38:12 UTC; 15ms ago
Docs: https://docs.mongodb.org/manual
Main PID: 2738 (mongod)
Memory: 288.0K
CGroup: /system.slice/mongod.service
└─2738 /usr/bin/mongod --config /etc/mongod.conf
Feb 06 20:38:12 ip-172-31-64-168 systemd[1]: Started MongoDB Database Server.
Created symlink /etc/systemd/system/multi-user.target.wants/mongod.service → /lib/systemd/system/mongod.service.
ABOUT TO ENTER WHILE LOOP
waiting to RUN MONGO COMMANDS
RUNNING MONGO COMMANDS
MongoDB shell version v4.4.3
connecting to: mongodb://127.0.0.1:27017/crawler?compressors=disabled&gssapiServiceName=mongodb
Error: couldn't connect to server 127.0.0.1:27017, connection attempt failed: SocketException: Error connecting to 127.0.0.1:27017 :: caused by :: Connection refused :
connect#src/mongo/shell/mongo.js:374:17
#(connect):2:6
exception: connect failed
exiting with code 1
MongoDB shell version v4.4.3
connecting to: mongodb://127.0.0.1:27017/crawler?compressors=disabled&gssapiServiceName=mongodb
Error: couldn't connect to server 127.0.0.1:27017, connection attempt failed: SocketException: Error connecting to 127.0.0.1:27017 :: caused by :: Connection refused :
connect#src/mongo/shell/mongo.js:374:17
#(connect):2:6
exception: connect failed
exiting with code 1
Cloud-init v. 20.3-2-g371b392c-0ubuntu1~20.04.1 running 'modules:final' at Sat, 06 Feb 2021 20:37:42 +0000. Up 27.35 seconds.
ci-info: no authorized SSH keys fingerprints found for user ubuntu.
Cloud-init v. 20.3-2-g371b392c-0ubuntu1~20.04.1 finished at Sat, 06 Feb 2021 20:38:12 +0000. Datasource DataSourceEc2Local. Up 56.97 seconds
I modified the code to make it work. The main issue is mongodb-clients: as the apt output above shows, installing it removes the mongodb-org packages, which breaks mongod. Your db.createUser command is also invalid and will fail, but I did not fix that, as it's not related to your connection-refused issue; you can ask a new question about why it is incorrect (it's MongoDB-specific).
#!/bin/sh
wget -qO - https://www.mongodb.org/static/pgp/server-4.4.asc | apt-key add -
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/4.4 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-4.4.list
apt update
apt install -y mongodb-org
systemctl enable mongod
systemctl start mongod
echo "ABOUT TO ENTER WHILE LOOP"
while :
do
echo "waiting to RUN MONGO COMMANDS"
sleep 5
echo "$(systemctl show -p ActiveState --value mongod)"
if [ "$(systemctl show -p ActiveState --value mongod)" = "active" ]
then
echo "RUNNING MONGO COMMANDS"
mongo crawler --eval "db.websites.insertOne({ customerId: '1', url: 'https://dootli.com' })"
[ $? != 0 ] && continue
echo "db.websites.insertOne command successful"
break
fi
done
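As an aside on the invalid db.createUser call: the shell expects roles as an array of role documents, and the clusterAdmin role is defined on the admin database. A hedged sketch of what a corrected call might look like (user name and password are placeholders):
mongo crawler --eval "db.createUser({ user: 'username', pwd: 'password', roles: [ { role: 'clusterAdmin', db: 'admin' } ] })"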
I am trying to install Kubernetes with kubeadm on my laptop, which runs Ubuntu 16.04. I have disabled swap, since kubelet does not work with swap on. The command I used is:
swapoff -a
I also commented out the reference to swap in /etc/fstab.
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda1 during installation
UUID=1d343a19-bd75-47a6-899d-7c8bc93e28ff / ext4 errors=remount-ro 0 1
# swap was on /dev/sda5 during installation
#UUID=d0200036-b211-4e6e-a194-ac2e51dfb27d none swap sw 0 0
I confirmed swap is turned off by running the following:
free -m
total used free shared buff/cache available
Mem: 15936 2108 9433 954 4394 12465
Swap: 0 0 0
When I start kubeadm, I get the following error:
kubeadm init --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.14.2
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
I also tried restarting my laptop, but I get the same error. What could the reason be?
The warning below was the root cause:
detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd".
You need to update the Docker cgroup driver. Apply the fix below:
cat > /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.override_kernel_check=true"
]
}
EOF
mkdir -p /etc/systemd/system/docker.service.d
# Restart Docker
systemctl daemon-reload
systemctl restart docker
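You can confirm Docker picked up the new driver, since docker info reports it:
docker info | grep -i 'cgroup driver'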
You could try kubeadm reset, then kubeadm init --ignore-preflight-errors=Swap.
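For example, reusing the pod network CIDR from the question:
kubeadm reset
kubeadm init --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=Swap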
First, try with sudo:
sudo swapoff -a
Then check whether anything is still swapped:
cat /proc/swaps
and
free -h
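To keep swap off across reboots, you can also comment out the swap entry in /etc/fstab non-interactively; a sketch (the pattern assumes a standard fstab line with a swap field):
sudo sed -i.bak '/\sswap\s/ s/^#*/#/' /etc/fstab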
I am trying to deploy a k8s cluster in OpenStack Rocky, but after a long time it fails. I've checked the orchestration stack and see that the kube_minions resource never completes. Checking the log output for all the instances created:
[ 196.817505] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 215.082433] random: crng init done
Fedora 27 (Atomic Host)
Kernel 4.14.18-300.fc27.x86_64 on an x86_64 (ttyS0)
host-10-0-0-3 login: [ 691.438618] bridge: filtering via
arp/ip/ip6tables is no longer available by default. Update your scripts
to load br_netfilter if you need this.
[ 691.516277] Bridge firewalling registered
[ 692.149217] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[ 701.932912] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
Digging deeper into the instances, I've found that the master node cannot start the heat-container-agent service:
_prefix=docker.io/openstackmagnum/
atomic install --storage ostree --system --system-package no --set
REQUESTS_CA_BUNDLE=/etc/pki/tls/certs/ca-bundle.crt --name heat-container-
agent docker.io/openstackmagnum/heat-container-agent:rocky-stable
systemctl start heat-container-agent
Failed to start heat-container-agent.service: Unit heat-container-
agent.service not found.
2019-04-04 14:57:40,238 - util.py[WARNING]: Failed running
/var/lib/cloud/instance/scripts/part-013 [5]
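A couple of generic checks may help narrow down why the unit is missing; a sketch, assuming a systemd host with cloud-init (as on Fedora Atomic):
systemctl list-unit-files | grep -i heat
journalctl -u cloud-final --no-pager | tail -n 50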
I am new to Ceph and have been following some tutorials. Unfortunately, when I try to execute the OSD creation command:
ceph-deploy osd create --data /dev/vdb node1
I encounter this error:
[ceph-vm2][INFO ] Running command: sudo /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb
[ceph-vm2][WARNIN] --> RuntimeError: Unable to create a new OSD id
[ceph-vm2][DEBUG ] Running command: /usr/bin/ceph-authtool --gen-print-key
[ceph-vm2][DEBUG ] Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new d64885d8-866c-4e26-bdda-94a6b8a79366
[ceph-vm2][DEBUG ] stderr: [errno 1] error connecting to the cluster
[ceph-vm2][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdb
[ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs
Be sure /dev/sdb holds no important data first!
1. umount /dev/sdb1 (or /dev/sdb2)
2. vim /etc/fstab and comment out the /dev/sdb (UUID) mount entry
3. parted -s /dev/sdb mklabel gpt mkpart primary xfs 0% 100%
4. reboot
5. mkfs.xfs /dev/sdb -f
6. ceph-deploy osd create --data /dev/sdb node1
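Assuming ceph-deploy 2.x (which matches the osd create --data syntax above), its disk zap helper can wipe the device in one step instead of steps 3 and 5; a sketch:
ceph-deploy disk zap node1 /dev/sdb
ceph-deploy osd create --data /dev/sdb node1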
What is installed for minikube:
$ ls -al /usr/local/bin/
-rwxr-xr-x 1 root root 26406912 Jun 14 12:05 docker-machine
-rwxrwxr-x 1 me libvirtd 11889064 Jun 14 12:07 docker-machine-driver-kvm
-rwxrwxr-x 1 me me 70232912 Jun 14 11:58 kubectl
-rwxrwxr-x 1 me me 82512696 Jun 14 11:57 minikube
Trying to start the cluster with minikube:
$ minikube start --vm-driver=kvm
Starting local Kubernetes v1.6.4 cluster...
Starting VM...
E0614 12:07:39.515994 14655 start.go:127] Error starting host: Error creating host: Error creating machine: Error in driver during machine creation: virError(Code=8, Domain=44, Message='invalid argument: could not find capabilities for domaintype=kvm ').
Retrying.
E0614 12:07:39.517076 14655 start.go:133] Error starting host: Error creating host: Error creating machine: Error in driver during machine creation: virError(Code=8, Domain=44, Message='invalid argument: could not find capabilities for domaintype=kvm ')
I am new to Kubernetes. Any idea how to fix this? Thanks.
UPDATE
sudo /usr/sbin/kvm-ok
INFO: /dev/kvm does not exist
HINT: sudo modprobe kvm_intel
INFO: Your CPU supports KVM extensions
INFO: KVM (vmx) is disabled by your BIOS
HINT: Enter your BIOS setup and enable Virtualization Technology (VT),
and then hard poweroff/poweron your system
KVM acceleration can NOT be used
$ dmesg | grep kvm
[ 2.114855] kvm: disabled by bios
[ 2.327746] kvm: disabled by bios
[ 120.423249] kvm: disabled by bios
[ 222.250977] kvm: disabled by bios
My UPDATE above is close to the solution: the fix is to enable virtualization in the BIOS.
1. Power on your PC and enter the BIOS setup.
2. Go to the security section and enable virtualization.
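After enabling it and power-cycling, you can confirm from the OS that KVM is usable; a sketch:
sudo kvm-ok
lsmod | grep kvm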
You need to install the KVM driver's prerequisite packages; see:
https://github.com/kubernetes/minikube/blob/master/docs/drivers.md#kvm-driver
# Install libvirt and qemu-kvm on your system, e.g.
# Debian/Ubuntu
$ sudo apt install libvirt-bin qemu-kvm
# Fedora/CentOS/RHEL
$ sudo yum install libvirt-daemon-kvm kvm
# Add yourself to the libvirtd group (use libvirt group for rpm based distros) so you don't need to sudo
# Debian/Ubuntu (NOTE: For Ubuntu 17.04 change the group to `libvirt`)
$ sudo usermod -a -G libvirtd $(whoami)
# Fedora/CentOS/RHEL
$ sudo usermod -a -G libvirt $(whoami)
# Update your current session for the group change to take effect
# Debian/Ubuntu (NOTE: For Ubuntu 17.04 change the group to `libvirt`)
$ newgrp libvirtd
# Fedora/CentOS/RHEL
$ newgrp libvirt
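# Once the group change is active, you can verify libvirt access without sudo
# and then retry the failing command; a sketch:
$ virsh --connect qemu:///system list --all
$ minikube start --vm-driver=kvm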