Using ceph-deploy when APT repos are down - ceph

Running ceph-deploy install ... fails when the download.ceph.com website/repo is down. Is there a way to install from a mirror? The docs mention a --repo-url option, but it still seems to download from download.ceph.com. See below:
ceph-deploy install --repo-url http://eu.ceph.com/debian-jewel/ ogw01
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/bstor/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.35): /usr/bin/ceph-deploy install --repo-url http://eu.ceph.com/debian-jewel/ ogw01
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] testing : None
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7faec06d8638>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] dev_commit : None
[ceph_deploy.cli][INFO ] install_mds : False
[ceph_deploy.cli][INFO ] stable : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] adjust_repos : True
[ceph_deploy.cli][INFO ] func : <function install at 0x7faec0b1b230>
[ceph_deploy.cli][INFO ] install_all : False
[ceph_deploy.cli][INFO ] repo : False
[ceph_deploy.cli][INFO ] host : ['ogw01']
[ceph_deploy.cli][INFO ] install_rgw : False
[ceph_deploy.cli][INFO ] install_tests : False
[ceph_deploy.cli][INFO ] repo_url : http://eu.ceph.com/debian-jewel/
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] install_osd : False
[ceph_deploy.cli][INFO ] version_kind : stable
[ceph_deploy.cli][INFO ] install_common : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] dev : master
[ceph_deploy.cli][INFO ] nogpgcheck : False
[ceph_deploy.cli][INFO ] local_mirror : None
[ceph_deploy.cli][INFO ] release : None
[ceph_deploy.cli][INFO ] install_mon : False
[ceph_deploy.cli][INFO ] gpg_url : None
[ceph_deploy.install][DEBUG ] Installing stable version jewel on cluster ceph hosts ogw01
[ceph_deploy.install][DEBUG ] Detecting platform for host ogw01 ...
[ogw01][DEBUG ] connection detected need for sudo
[ogw01][DEBUG ] connected to host: ogw01
[ogw01][DEBUG ] detect platform information from remote host
[ogw01][DEBUG ] detect machine type
[ceph_deploy.install][INFO ] Distro info: Ubuntu 16.04 xenial
[ogw01][INFO ] installing Ceph on ogw01
[ceph_deploy.install][WARNIN] --gpg-url was not used, will fallback
[ceph_deploy.install][WARNIN] using GPG fallback: https://download.ceph.com/keys/release.asc
[ogw01][INFO ] using custom repository location: http://eu.ceph.com/debian-jewel/
[ogw01][INFO ] Running command: sudo wget -O release.asc https://download.ceph.com/keys/release.asc
[ogw01][WARNIN] --2016-10-11 14:06:38-- https://download.ceph.com/keys/release.asc
[ogw01][WARNIN] Resolving download.ceph.com (download.ceph.com)... 173.236.253.173, 2607:f298:6050:51f3:f816:3eff:fe71:9135

The following lines:
[ceph_deploy.install][WARNIN] --gpg-url was not used, will fallback
[ceph_deploy.install][WARNIN] using GPG fallback: https://download.ceph.com/keys/release.asc
indicate that download.ceph.com is only contacted for the GPG key. You can also pass the --gpg-url option. The following command should work, since it fetches the release key from the mirror as well:
ceph-deploy install --repo-url http://eu.ceph.com --gpg-url http://eu.ceph.com/keys/release.asc ogw01
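If ceph-deploy still ends up contacting download.ceph.com, another option is to configure the mirror on the target node by hand and then run the install with repo adjustment disabled. A minimal sketch, assuming Ubuntu xenial, the eu.ceph.com jewel mirror, and the conventional file name ceph.list:
# on the target node (ogw01): trust the mirror's key and add the mirror repo
wget -qO - http://eu.ceph.com/keys/release.asc | sudo apt-key add -
echo "deb http://eu.ceph.com/debian-jewel/ xenial main" | sudo tee /etc/apt/sources.list.d/ceph.list
sudo apt-get update && sudo apt-get install -y ceph ceph-common
# from the admin node: install without touching the repos
ceph-deploy install --no-adjust-repos ogw01
The --no-adjust-repos flag tells ceph-deploy to leave the existing APT configuration on the node alone.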

Related

Failed to execute command: env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q update

I'm new to Ceph, and I'm trying to install and configure a Ceph cluster.
After successfully installing the cluster I ran into some storage issues, so I decided to purge everything and re-install, following this guide:
https://www.howtoforge.com/tutorial/how-to-install-a-ceph-cluster-on-ubuntu-16-04/
which worked fine the first time I installed it.
But on my second attempt I get this error after running the install command:
ceph-deploy install ceph-admin ceph-osd1 ceph-osd2 ceph-osd3 mon1
And this is my output:
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cep/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/local/bin/ceph-deploy install ceph-admin ceph-osd1 ceph-osd2 ceph-osd3 mon1
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] testing : None
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f11585e0b90>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] dev_commit : None
[ceph_deploy.cli][INFO ] install_mds : False
[ceph_deploy.cli][INFO ] stable : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] adjust_repos : True
[ceph_deploy.cli][INFO ] func : <function install at 0x7f115851d9d0>
[ceph_deploy.cli][INFO ] install_mgr : False
[ceph_deploy.cli][INFO ] install_all : False
[ceph_deploy.cli][INFO ] repo : False
[ceph_deploy.cli][INFO ] host : ['ceph-admin', 'ceph-osd1', 'ceph-osd2', 'ceph-osd3', 'mon1']
[ceph_deploy.cli][INFO ] install_rgw : False
[ceph_deploy.cli][INFO ] install_tests : False
[ceph_deploy.cli][INFO ] repo_url : None
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] install_osd : False
[ceph_deploy.cli][INFO ] version_kind : stable
[ceph_deploy.cli][INFO ] install_common : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] dev : master
[ceph_deploy.cli][INFO ] nogpgcheck : False
[ceph_deploy.cli][INFO ] local_mirror : None
[ceph_deploy.cli][INFO ] release : None
[ceph_deploy.cli][INFO ] install_mon : False
[ceph_deploy.cli][INFO ] gpg_url : None
[ceph_deploy.install][DEBUG ] Installing stable version mimic on cluster ceph hosts ceph-admin ceph-osd1 ceph-osd2 ceph-osd3 mon1
[ceph_deploy.install][DEBUG ] Detecting platform for host ceph-admin ...
cep#ceph-admin's password:
[ceph-admin][DEBUG ] connection detected need for sudo
cep#ceph-admin's password:
[ceph-admin][DEBUG ] connected to host: ceph-admin
[ceph-admin][DEBUG ] detect platform information from remote host
[ceph-admin][DEBUG ] detect machine type
[ceph_deploy.install][INFO ] Distro info: Ubuntu 18.04 bionic
[ceph-admin][INFO ] installing Ceph on ceph-admin
[ceph-admin][INFO ] Running command: sudo env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q update
[ceph-admin][DEBUG ] Hit:1 http://mirrors.service.networklayer.com/ubuntu bionic InRelease
[ceph-admin][DEBUG ] Hit:2 http://mirrors.service.networklayer.com/ubuntu bionic-updates InRelease
[ceph-admin][DEBUG ] Hit:3 http://mirrors.service.networklayer.com/ubuntu bionic-backports InRelease
[ceph-admin][DEBUG ] Hit:4 http://mirrors.service.networklayer.com/ubuntu bionic-security InRelease
[ceph-admin][DEBUG ] Hit:5 https://download.docker.com/linux/ubuntu bionic InRelease
[ceph-admin][DEBUG ] Ign:6 https://artifactory.haifa.ibm.com/artifactory/hrl-site-layer-ubuntu bionic InRelease
[ceph-admin][DEBUG ] Hit:7 https://artifactory.haifa.ibm.com/artifactory/hrl-site-layer-ubuntu bionic Release
[ceph-admin][DEBUG ] Hit:8 http://apt.puppetlabs.com bionic InRelease
[ceph-admin][DEBUG ] Ign:9 https://download.ceph.com/rpm-nautilus/el7 bionic InRelease
[ceph-admin][DEBUG ] Err:10 https://download.ceph.com/rpm-nautilus/el7 bionic Release
[ceph-admin][DEBUG ] 404 Not Found [IP: 158.69.68.124 443]
[ceph-admin][DEBUG ] Ign:12 https://pkg.duosecurity.com/Ubuntu bionic InRelease
[ceph-admin][DEBUG ] Hit:13 https://pkg.duosecurity.com/Ubuntu bionic Release
[ceph-admin][DEBUG ] Hit:15 http://ppa.launchpad.net/wireguard/wireguard/ubuntu bionic InRelease
[ceph-admin][DEBUG ] Reading package lists...
[ceph-admin][WARNIN] E: The repository 'https://download.ceph.com/rpm-nautilus/el7 bionic Release' does not have a Release file.
[ceph-admin][ERROR ] RuntimeError: command returned non-zero exit status: 100
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: env DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical apt-get --assume-yes -q update
I've already tried using a different machine as the admin node and re-installing, but nothing seems to work. I hope someone here can help :)
Thanks!
I managed to sort it out by pointing the installation at the debian-15.2.4 repository instead:
ceph-deploy install ceph-admin ceph-osd1 ceph-osd2 ceph-osd3 mon1 --repo-url=https://download.ceph.com/debian-15.2.4/
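For completeness: the 404 in the log above comes from a stale https://download.ceph.com/rpm-nautilus/el7 entry that APT on the node keeps trying to read, so cleaning that up is another option. A rough sketch (the file name ceph.list is an assumption; act on whatever the grep actually finds):
# find which APT source still points at the broken rpm repo
grep -rn "download.ceph.com/rpm-nautilus" /etc/apt/sources.list /etc/apt/sources.list.d/
# remove or edit the offending file, then refresh the package index
sudo rm /etc/apt/sources.list.d/ceph.list   # assumption: the entry lives here
sudo apt-get update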

Error when setting up glusterfs on Kubernetes: volume create: heketidbstorage: failed: Host not connected

I'm following this instruction to set up GlusterFS on my Kubernetes cluster. At the heketi-client/bin/heketi-cli setup-openshift-heketi-storage step, heketi-cli tells me:
Error: volume create: heketidbstorage: failed: Host 192.168.99.25 not connected
or sometimes:
Error: volume create: heketidbstorage: failed: Staging failed on 192.168.99.26. Error: Host 192.168.99.25 not connected
heketi.json is
{
  "_port_comment": "Heketi Server Port Number",
  "port": "8080",
  "_use_auth": "Enable JWT authorization. Please enable for deployment",
  "use_auth": false,
  "_jwt": "Private keys for access",
  "jwt": {
    "_admin": "Admin has access to all APIs",
    "admin": {
      "key": "7319"
    },
    "_user": "User only has access to /volumes endpoint",
    "user": {
      "key": "7319"
    }
  },
  "_glusterfs_comment": "GlusterFS Configuration",
  "glusterfs": {
    "_executor_comment": "Execute plugin. Possible choices: mock, kubernetes, ssh",
    "executor": "kubernetes",
    "_db_comment": "Database file name",
    "db": "/var/lib/heketi/heketi.db",
    "kubeexec": {
      "rebalance_on_expansion": true
    },
    "sshexec": {
      "rebalance_on_expansion": true,
      "keyfile": "/etc/heketi/private_key",
      "fstab": "/etc/fstab",
      "port": "22",
      "user": "root",
      "sudo": false
    }
  },
  "_backup_db_to_kube_secret": "Backup the heketi database to a Kubernetes secret when running in Kubernetes. Default is off.",
  "backup_db_to_kube_secret": false
}
topology-sample.json is
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": [
                "redis-test25"
              ],
              "storage": [
                "192.168.99.25"
              ]
            },
            "zone": 1
          },
          "devices": [
            {
              "name": "/dev/sda7",
              "destroydata": true
            }
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "redis-test26"
              ],
              "storage": [
                "192.168.99.26"
              ]
            },
            "zone": 1
          },
          "devices": [
            {
              "name": "/dev/sda7",
              "destroydata": true
            }
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "redis-test01"
              ],
              "storage": [
                "192.168.99.113"
              ]
            },
            "zone": 1
          },
          "devices": [
            {
              "name": "/dev/sda7",
              "destroydata": true
            }
          ]
        }
      ]
    }
  ]
}
The heketi-cli is v8.0.0 and kubernetes is v1.12.3
How do I fix this problem?
Update: I just found that I had missed the iptables part, but now the message becomes:
Error: volume create: heketidbstorage: failed: Host 192.168.99.25 is not in 'Peer in Cluster' state
It seems that one of the GlusterFS pods cannot connect to the others. I tried kubectl exec -i glusterfs-59ftx -- gluster peer status:
Number of Peers: 2
Hostname: 192.168.99.26
Uuid: 6950db9a-3d60-4625-b642-da5882396bee
State: Peer Rejected (Disconnected)
Hostname: 192.168.99.113
Uuid: 78983466-4499-48d2-8411-2c3e8c70f89f
State: Peer Rejected (Disconnected)
while the other one said:
Number of Peers: 1
Hostname: 192.168.99.26
Uuid: 23a0114d-65b8-42d6-8067-7efa014af68d
State: Peer in Cluster (Connected)
I solved these problems by myself.
For the first part, the cause was that I hadn't set up iptables on every node according to the Infrastructure Requirements.
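The rules in question, to be applied on every storage node — the port list (2222 for the pod's sshd, 24007/24008 for glusterd, and 49152-49251 for the bricks) is the one documented in the gluster-kubernetes setup guide, so treat this as a sketch to adapt to your own firewall tooling:
iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 2222 -j ACCEPT    # glusterfs pod sshd
iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 24007 -j ACCEPT   # glusterd
iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 24008 -j ACCEPT   # glusterfs management
iptables -I INPUT -p tcp -m state --state NEW -m multiport --dports 49152:49251 -j ACCEPT   # brick ports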
For the second part, following this article, delete all files in /var/lib/glusterd except glusterd.info and then start over from the Kubernetes deploy step.
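A sketch of that cleanup on each affected node, assuming glusterd.info sits directly under /var/lib/glusterd and that glusterd (or the GlusterFS pod) is stopped around the deletion:
systemctl stop glusterd                                        # or stop/restart the glusterfs pod instead
find /var/lib/glusterd -mindepth 1 ! -name glusterd.info -delete
systemctl start glusterd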

Heketi no space error

I'm trying to set up a GlusterFS cluster with Kubernetes.
I managed to start the glusterd pods on all the nodes (3 nodes),
and I also managed to load the topology successfully. However, when I run
heketi-cli setup-openshift-heketi-storage
I get the following error:
Error: No space
This is the output of
heketi-cli topology load --json=gluster-kubernetes/deploy/topology.json
Found node vps01 on cluster 1a36667e4275773fc353f2caaaaaaa
Adding device /dev/loop0 ... OK
Found node vps02 on cluster 1a36667e4275773fc353faaaaaaaa
Found device /dev/loop0
Found node vps04 on cluster 1a36667e4275773fc353faaaaaaa
Adding device /dev/loop0 ... OK
Output of
heketi-cli topology info
Cluster Id: 1a36667e4275773fc353f2caaaaaa
File: true
Block: true
Volumes:
Nodes:
Node Id: 1752dcf447c8eb6eaad45aaaa
State: online
Cluster Id: 1a36667e4275773fc353f2caaa
Zone: 1
Management Hostnames: vps01
Storage Hostnames: XX.XX.XX.219
Devices:
Id:50396d72293c4723504810108bd75d41 Name:/dev/loop0 State:online Size (GiB):12 Used (GiB):0 Free (GiB):12
Bricks:
Node Id: 56b8c1942b347a863ee73a005758cc27
State: online
Cluster Id: 1a36667e4275773fc353f2c8eb2dd2a3
Zone: 1
Management Hostnames: vps04
Storage Hostnames: XX.XX.XX.227
Devices:
Id:dc75ad8154234ebcf9174b018d0bc30a Name:/dev/loop0 State:online Size (GiB):9 Used (GiB):4 Free (GiB):5
Bricks:
Node Id: f82cb81a026884764d3d953c7c9b6a9f
State: online
Cluster Id: 1a36667e4275773fc353f2c8eb2dd2a3
Zone: 1
Management Hostnames: vps02
Storage Hostnames: XX.XX.XX.157
Devices:
Id:1914102b7ae395f12797981a0e3cf5a4 Name:/dev/loop0 State:online Size (GiB):4 Used (GiB):4 Free (GiB):0
Bricks:
There is no more space on device 1914102b7ae395f12797981a0e3cf5a4, even though I haven't stored anything on the device yet.
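One way to see where the space actually went is to look at the LVM layout heketi creates on each device it manages (a quick check with standard LVM tools; the volume group names are whatever heketi generated):
pvs    # shows /dev/loop0 and the heketi volume group carved out of it
vgs    # free vs. used space in each volume group
lvs    # any brick logical volumes left over from earlier attempts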
For info here is the topology.json file:
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": [
                "vps01"
              ],
              "storage": [
                "XX.XX.XX.219"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/loop0"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "vps02"
              ],
              "storage": [
                "XX.XX.XX.157"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/loop0"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "vps04"
              ],
              "storage": [
                "XX.XX.XX.227"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/loop0"
          ]
        }
      ]
    }
  ]
}
You can try this:
# ./gk-deploy -g --abort
# dmsetup remove_all # In each server.
# dmsetup ls
# rm -fr /var/lib/glusterd/vols/* # In each server.
# rm -fr /var/lib/heketi/* # In each server.
# wipefs -a /dev/<device> # In each server.
source: https://github.com/gluster/gluster-kubernetes/issues/369#issuecomment-383247722
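If the wipe alone doesn't help, note that in the topology info above the /dev/loop0 on vps02 is only 4 GiB with 0 GiB free, and setup-openshift-heketi-storage needs room for a heketidbstorage brick on each node it uses. A rough sketch for enlarging a file-backed loop device before re-running the deployment (the backing-file path is hypothetical — check it with losetup first):
losetup -l /dev/loop0                      # shows which file backs the loop device
truncate -s 20G /srv/gluster-backing.img   # grow the (hypothetical) backing file
losetup -c /dev/loop0                      # make the loop device pick up the new size
wipefs -a /dev/loop0                       # wipe it before heketi re-adds it
# then re-run ./gk-deploy -g ... to reload the topology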

Sharding in orientdb distributed database

I am creating a distributed database in OrientDB 2.2.6 with 3 nodes, namely master1, master2 and master3. I modified the hazelcast.xml and orientdb-server-config.xml files on each of the nodes. I used a common default-distributed-db-config.json on all 3 nodes, which looks as shown below.
{
  "autoDeploy": true,
  "readQuorum": 1,
  "writeQuorum": "majority",
  "executionMode": "undefined",
  "readYourWrites": true,
  "failureAvailableNodesLessQuorum": false,
  "servers": {
    "*": "master"
  },
  "clusters": {
    "internal": {
    },
    "address": {
      "owner": "master1",
      "servers": [ "master1" ]
    },
    "address_1": {
      "owner": "master1",
      "servers": [ "master1" ]
    },
    "ip": {
      "owner": "master2",
      "servers": [ "master2" ]
    },
    "ip_1": {
      "owner": "master2",
      "servers": [ "master2" ]
    },
    "id": {
      "owner": "master3",
      "servers": [ "master3" ]
    },
    "id_1": {
      "owner": "master3",
      "servers": [ "master3" ]
    },
    "*": {
      "servers": [ "<NEW_NODE>" ]
    }
  }
}
Then I started the distributed server on master1, master2 and master3 in that order and let them synchronize the default DB. Then, on master1, I created a database and three classes (Address, IP, ID) with their properties and indexes. As specified in the default-distributed-db-config.json file, the Address class has two clusters residing on master1, and the IP class has two clusters residing on master2.
When I insert values into the Address class, they go into master1's clusters as expected, following the round-robin strategy. But when I insert values into IP from the master2 machine, a new cluster is created on master1 and the values are inserted into that new cluster. Basically, all the values end up on master1. When I run List Clusters, the clusters on master2 and master3 are empty.
So I could not distribute the data across the three nodes; everything is stored on a single machine. How do I shard the data? Is there an issue with the way I am inserting it?
Thanks
In current OrientDB releases, write operations (create/update/delete) are not forwarded; only reads are. For this reason, the client should be connected to the server that handles the cluster you want your data written to.
Usually this isn't a problem, because a local cluster is selected, but writing to a specific cluster on a remote server is not supported yet.
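As a concrete illustration of "connect to the server that owns the cluster" (a sketch only — the database name mydb, the admin/admin credentials and the field name are assumptions), you could open a console session against master2 and target its cluster explicitly when inserting IP records:
# run the OrientDB console in batch mode against the node that owns the "ip" cluster
$ORIENTDB_HOME/bin/console.sh "CONNECT remote:master2/mydb admin admin; INSERT INTO cluster:ip SET address = '192.168.0.10';"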

AWS CloudFormation - cfn-init failed to run command

I am using CloudFormation to install Elasticsearch.
I download and extract a tar.gz.
The following is my EC2 instance section:
"masterinstance": {
"Type": "AWS: : EC2: : Instance",
"Metadata": {
"AWS: : CloudFormation: : Init": {
"configSets" : {
"ascending" : [ "config1" , "config2" ]
},
"config1": {
"sources": {
"/home/ubuntu/": "https: //s3.amazonaws.com/xxxxxxxx/elasticsearch.tar.gz"
},
"files": {
"/home/ubuntu/elasticsearch/config/elasticsearch.yml": {
"content": {
"Fn: : Join": [
"",
[
xxxxxxxx
]
]
}
}
}
},
"config2" : {
"commands": {
"runservice": {
"command": "~/elasticsearch/bin/elasticsearch",
"cwd" : "~",
"test" : "~/elasticsearch/bin/elasticsearch > test.txt",
"ignoreErrors" : "false"
}
}
}
}
},
"Properties": {
"ImageId": "ami-xxxxxxxxxx",
"InstanceType": {
"Ref": "InstanceTypeParameter"
},
"Tags": [
xxxxxxxx
],
"KeyName": "everybody",
"NetworkInterfaces": [
{
"GroupSet": [
{
"Ref": "newSecurity"
}
],
"AssociatePublicIpAddress": "true",
"DeviceIndex": "0",
"SubnetId": {
"Ref": "oneSubnet"
}
}
],
"UserData": {
"Fn: : Base64": {
"Fn: : Join": [
"",
[
"#!/bin/bash\n",
"sudo add-apt-repository-yppa: webupd8team/java\n",
"sudo apt-get update\n",
"echo'oracle-java8-installershared/accepted-oracle-license-v1-1selecttrue'|sudo debconf-set-selections\n",
"sudo apt-getinstall-yoracle-java8-installer\n",
"apt-get update\n",
"apt-get-y installpython-setuptools\n",
"easy_installhttps: //s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz\n",
"/usr/local/bin/cfn-init",
"--stack Elasticsearch",
"--resource masterinstance",
"--configsets ascending",
"-v\n"
]
]
}
}
}
}
I am using AWS::CloudFormation::Init for configuration and other settings.
After extracting the tar I want to start Elasticsearch, which I am doing through the commands section of AWS::CloudFormation::Init, but
after the stack has been created completely, when I ssh into my instances, I do not see the elasticsearch service running.
Everything else, like extracting the tar and creating the file, works correctly.
I have gone through cfn-init.log; it gives me the following information:
2016-07-19 05:53:15,776 P2745 [INFO] Test for Command runservice
2016-07-19 05:53:15,778 P2745 [INFO] -----------------------Command Output-----------------------
2016-07-19 05:53:15,778 P2745 [INFO] /bin/sh: 1: ~/elasticsearch/bin/elasticsearch: not found
2016-07-19 05:53:15,778 P2745 [INFO] ------------------------------------------------------------
2016-07-19 05:53:15,779 P2745 [ERROR] Exited with error code 127
If I run the command ~/elasticsearch/bin/elasticsearch directly on the instance, it works perfectly.
What am I doing wrong here?
Thank you.
I'm guessing that the home directory (~) is evaluating to a different user's home (not ubuntu's) when the command is run. I think cfn-init runs as the root user instead of as ubuntu/ec2-user. Try changing the paths in the config2 command block to fully qualified paths (/home/ubuntu/elasticsearch).
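A sketch of what the command string could become (only the string that goes into config2 -> commands -> runservice -> "command"; running it via su is an added assumption, since older Elasticsearch releases refuse to start as root, and -d simply daemonizes the process):
su ubuntu -c '/home/ubuntu/elasticsearch/bin/elasticsearch -d'
The "cwd" entry would likewise point at /home/ubuntu instead of ~.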