Ceph storage OSD disk upgrade (replace with larger drive)

I have three servers, each with 1 x SSD (Ceph base OS) and 6 x 300 GB SAS drives. At the moment I'm only using 4 drives on each server as the OSDs in my Ceph storage array, and everything is fine.
My question: now that I have built this and got everything up and running, if in 6 months or so I need to replace these OSDs because the storage array is running out of space, is it possible to remove one disk at a time from each server and replace it with a larger drive?
For example, if server 1 had OSD 0-5, server 2 has OSD 6-11 and server 3 has OSD 12-17, could I one day remove OSD 0 and replace it with a 600 GB SAS drive, wait for the cluster to heal, then do the same with OSD 6, then OSD 12, and so on until all the disks are replaced? Would this then give me a larger storage pool?

OK, just for anyone looking for this answer in the future: you can upgrade your drives in the way I describe above. Here are the steps I took (please note that this is in a lab, not production):
Mark the OSD as down
Mark the OSD as Out
Remove the drive in question
Install new drive (must be either the same size or larger)
I needed to reboot the server in question for the new disk to be seen by the OS
Add the new disk into Ceph as normal
Wait for the cluster to heal then repeat on a different server
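For anyone who wants the command-line version, the steps above look roughly like this with the classic ceph-volume tooling; osd.0, the host and /dev/sdX below are just placeholders for whichever disk you're swapping, so adjust to your own IDs and device names.
# stop the daemon so the OSD is marked down, then take it out of data placement
systemctl stop ceph-osd@0        # on the host that owns osd.0
ceph osd out 0
# remove it from the CRUSH map, its auth key, and the OSD map
ceph osd crush remove osd.0
ceph auth del osd.0
ceph osd rm 0
# swap the physical disk (reboot if the OS doesn't see it), then create a new OSD on the new drive
ceph-volume lvm create --data /dev/sdX
# (on a cephadm-managed cluster: ceph orch daemon add osd <host>:/dev/sdX)
# watch recovery and wait for HEALTH_OK before moving on to the next server
ceph -w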
I have now done this with 6 of my 15 drives across the 3 servers, and each time the size of the Ceph storage has increased a little (I'm only going from 320 GB to 400 GB drives, as this is just a test and I have some of those drives sitting unused).
I plan on starting this on the live production servers next week now that I know it works, and going from 300 GB to 600 GB drives I should see a larger increase in storage (I hope).

Related

How to recover a Fedora Server 36 storage pool after upgrading to v37?

I recently upgraded Fedora Server (F) v36 to v37. The F36 server had a volume group consisting of three physical drives combined to form a storage pool, which I named “BigDrive”. During the upgrade the logical volume information seems to have been lost, and BigDrive didn't appear or mount on the F37 server. I've been unable to find any backup of the logical volume information. At present the 3 drives are installed in the F37 server. I would welcome advice on how to recombine the three drives, recover the logical volume information, and access the data stored in the shared pool. Can anyone suggest a process to do that, or a utility that could rebuild the storage pool from the physical drives?
I haven't found anything helpful in the Fedora documentation or the usual websites that doesn't rely on the backed-up logical volume information, which didn't survive the upgrade because the OS hard drive was wiped and repartitioned as part of it. The drives that formed the storage pool were not formatted, nor do they store any OS or application files; they were purely data storage.
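For what it's worth, the direction I'm planning to try first is a plain LVM rescan, since a copy of the VG metadata should also live on the drives themselves; something roughly like this (BigDrive is my pool name from above, the device and LV names are placeholders, and I haven't verified this is the right approach):
pvscan                         # should list the three drives as PVs belonging to BigDrive
vgscan
vgchange -ay BigDrive          # activate the volume group and its logical volumes
lvs                            # confirm the logical volume(s) are visible again
mount /dev/BigDrive/<lvname> /mnt   # mount read-only first if you're nervous about the data
# if that fails, the on-disk metadata can be inspected with pvck and restored with vgcfgrestore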

How can I fix ceph commands hanging after a reboot?

I'm pretty new to Ceph, so I've included all my steps I used to set up my cluster since I'm not sure what is or is not useful information to fix my problem.
I have 4 CentOS 8 VMs in VirtualBox set up to teach myself how to bring up Ceph: 1 is a client and 3 are Ceph monitors. Each Ceph node has six 8 GB drives. Once I learned how the networking worked, it was pretty easy.
I set each VM to have a NAT (for downloading packages) and an internal network that I called "ceph-public". This network would be accessed by each VM on the 10.19.10.0/24 subnet. I then copied the ssh keys from each VM to every other VM.
I followed this documentation to install cephadm, bootstrap my first monitor, and add the other two nodes as hosts. Then I added all available devices as OSDs, created my pools, created my images, and copied my /etc/ceph folder from the bootstrapped node to my client node. On the client, I ran rbd map mypool/myimage to map the image as a block device, used mkfs to create a filesystem on it, and I was able to write data and see the I/O from the bootstrapped node. All was well.
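Condensed, the commands I ran were roughly along these lines (the monitor IP and the host names ceph-node2/ceph-node3 here are placeholders; mypool/myimage are the pool and image mentioned above):
# on the first monitor node
cephadm bootstrap --mon-ip 10.19.10.11      # an address on the ceph-public subnet
ceph orch host add ceph-node2
ceph orch host add ceph-node3
ceph orch apply osd --all-available-devices
# pool and image
ceph osd pool create mypool
rbd pool init mypool
rbd create mypool/myimage --size 4096
# on the client, after copying /etc/ceph from the bootstrapped node
rbd map mypool/myimage
mkfs.xfs /dev/rbd0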
Then, as a test, I shutdown and restarted the bootstrapped node. When it came back up, I ran ceph status but it just hung with no output. Every single ceph and rbd command now hangs and I have no idea how to recover or properly reset or fix my cluster.
Has anyone ever had the ceph command hang on their cluster, and what did you do to solve it?
Let me share a similar experience. Some time ago I also tried to run some tests on Ceph (Mimic, I think) and my VMs in VirtualBox acted very strangely, nothing comparable to actual bare-metal servers, so please bear this in mind... the tests are not really representative.
As regarding your problem, try to see the following:
have at least 3 monitors (an odd number in general); it's possible the hang is caused by a monitor election.
make sure the networking part is OK (separate VLANs for Ceph servers and clients).
DNS is resolving OK (you have added the server names to /etc/hosts).
...just my 2 cents...
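If it helps, this is roughly how I would check whether the monitors are up and in quorum instead of waiting on the stuck CLI. It assumes a cephadm deployment like yours; daemon names and admin-socket locations can differ on other setups.
# on the monitor host that was rebooted: are the mon/mgr containers running at all?
cephadm ls
# are the monitor ports listening?
ss -tlnp | grep -E '3300|6789'
# ask the local monitor for its view directly over its admin socket
cephadm shell -- ceph daemon mon.$(hostname -s) mon_status
# a timeout instead of an endless hang at least confirms a quorum/connectivity problem
ceph status --connect-timeout 10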

Mongodb on the cloud

I'm preparing my production environment on the Hetzner cloud, but I have some doubts (I'm more a developer than a devops).
I will get 3 servers for the replica set, each with 8 cores, 32 GB RAM and a 240 GB SSD. I'm a bit worried about the size of the SSD the servers come with, and Hetzner offers the possibility to create volumes that can be attached to the servers. Since MongoDB uses a single folder for the db data, I was wondering how I can use the 240 GB that comes with the server in combination with external volumes. At the beginning I can use the 240 GB, but then I will have to move the data folder to a volume when it reaches capacity. I'm fine with this, but it looks to me like once I move to volumes, the 240 GB will not be used any more (yes, I can use it to hold the mongo journal, as they suggest storing it on a separate partition).
So, my noob question is: how can I use both the disk that comes with the server and the external volumes?
Thank you
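Edit: to make the question concrete, the layout I had in mind is roughly this, with the data on the attached volume and the journal kept on the local SSD via a symlink. The device name and paths are made up, and I haven't verified this is the best approach:
# format and mount the attached Hetzner volume (device name is an example)
mkfs.ext4 /dev/sdb
mkdir -p /mnt/mongo-data
mount /dev/sdb /mnt/mongo-data
# /etc/mongod.conf: point storage.dbPath at the volume
#   storage:
#     dbPath: /mnt/mongo-data
# keep the journal on the local SSD with a symlink (stop mongod and move any existing journal dir first)
mkdir -p /var/lib/mongodb-journal
ln -s /var/lib/mongodb-journal /mnt/mongo-data/journal
chown -R mongodb:mongodb /mnt/mongo-data /var/lib/mongodb-journal   # user is 'mongod' on RHEL-based distros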

Deleting files in Ceph does not free up space

I am using Ceph, uploading many files through radosgw. Afterwards, I want to delete the files. I am trying to do that in Python, like this:
bucket = conn.get_bucket(BUCKET)
for key in bucket.list():
    bucket.delete_key(key)
Afterwards, I use bucket.list() to list files in the bucket, and this says that the bucket is now empty, as I intended.
However, when I run ceph df on the mon, it shows that the OSDs still have high utilization (e.g. %RAW USED 90.91). If I continue writing (thinking that the status data just hasn't caught up with the state yet), Ceph essentially locks up (100% utilization).
What's going on?
Note: I do have these standing out in ceph status:
health HEALTH_WARN
3 near full osd(s)
too many PGs per OSD (2168 > max 300)
pool default.rgw.buckets.data has many more objects per pg than average (too few pgs?)
From what I gather online, this wouldn't cause my particular issue. But I'm new to Ceph and could be wrong.
I have one mon and 3 OSDs. This is just for testing.
You can check whether the objects have really been deleted with rados -p $pool ls.
I know that for CephFS, deleting a file returns OK as soon as the MDS marks it
as deleted in its local memory; the real delete happens later, by sending delete messages to the related OSDs.
Maybe radosgw uses the same design to speed up deletes.
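If radosgw is indeed deferring the deletes, its garbage collector is what eventually frees the space, so I would look there next. Roughly (run on a node with an admin keyring; the pool name is the one from your ceph status output):
# objects still sitting in the data pool even though the bucket listing is empty
rados -p default.rgw.buckets.data ls | head
# what the rgw garbage collector still has queued
radosgw-admin gc list --include-all | head
# force a gc pass instead of waiting for the next scheduled one
radosgw-admin gc process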

Getting access to a SAN disk / LUN from a virtual machine. Is it possible?

Resources:
node1: Physical cluster node 1.
node2: Physical cluster node 2.
cluster1: Cluster containing node1 and node2 used to host virtual machines.
san1: Dell md3200 highly available storage device (SAN).
lun1: A lun dedicated to file server storage located on san1.
driveZ: A hard drive, currently a resource on node1, that is 100GB and has the drive letter Z:\. This drive is lun1, which resides on san1.
virtual1: A virtual server used as a file server only.
Synopsis / Goals:
I have two nodes/servers on my network. These two nodes (node1 and node2) are part of a cluster (cluster1) that is used for hosting all my virtual machines. There is a SAN involved (san1) that has many LUNs created on it, one of which (lun1) will be used to store all data dedicated to a virtual machine (virtual1). Eventually lun1 is created, given the name "storage", and used strictly by the virtual machine "virtual1" to store and access data.
What I have currently in place:
- I currently have created the SAN (san1), created a disk group with the virtual disk (storage), and assigned a LUN (lun1) to it.
- I have set up two physical servers that are connected to the SAN via SAS cables (multipath).
- I have set up the clustering feature on those two servers and have the Hyper-V role installed on each as well.
- I have created a cluster (cluster1) with server members node1 and node2.
- I have created a virtual server (virtual1) and made it highly available on the cluster (cluster1).
Question:
Is it possible to have lun1 (drive z) brought up and accessed by virtual1?
What I have tried:
I had lun1, aka driveZ, showing up in node1's Disk Management. I then added it as a resource to the cluster storage area. I tried two different things. (1) I tried to add it as a Cluster Shared Volume; shortly after, I realized that only the cluster members could see/access it, not the virtual machines, even though those were created as a service under the cluster. (2) I tried to move the resource (driveZ) to the virtual machine (virtual1) within cluster1. After doing that I went into the virtual machine settings, added the drive as a SCSI drive (using lun1, 100GB), and refreshed Disk Management on the virtual machine (virtual1). The drive showed up and allowed me to assign a drive letter, then asked me if I wanted to format it... What about all my data that's on it?? Was that a bust? Anyway, that's where I'm at right now... Ideas?
Thoughts:
Just so I'm clear, all of this is for testing atm... Actual sizes of resources in production differ greatly. I was thinking about adding driveZ (lun1) as a Cluster Shared Volume, and then adding a new Hyper-V virtual SCSI drive (say 50G, so later I can try to expand it to 100G, the full size of the physical/SAN drive) to my VM, storing the fixed VHD (Virtual Hard Disk) inside the Cluster Shared Volume "driveZ". I'm testing it out now... But I have concerns... 1) What happens when I try to create a really large VHD (around 7TB)? 2) Can the fixed-disk VHD be expanded in any way? I plan on making my new SAN virtual disk larger than 7TB in the future... Currently it's going to stay at 7TB, but that will expand at some point...
Figured it out!
The correct way to do it is...
Set up a SAN, create a disk group with two virtual disks, and assign LUNs to them.
Set up your 2 physical servers with Win Server 2008 R2 and connect them both to the SAN.
Add the Failover Clustering feature and the Hyper-V role to both servers.
For the two drives (from the SAN), bring them online and initialize them both. Create a simple volume on each drive if you wish; even format them if you want.
Create a cluster, and add one of the virtual disks from the SAN as a Cluster Shared Volume. This will be used to store the virtual machines on.
Create a virtual machine and store it on the CSV, e.g. C:\ClusterStorage\Volume1\, then power it up.
The second drive you need to take offline. This should just be a drive on the host server, and it has to be offline! After you right-click and choose Offline, right-click again and go to Properties. On that page look for the LUN number and write it down.
Open up the VM settings, go down to the SCSI controller and add a drive. Choose physical drive and pick the correct LUN number. Hit OK and it should show up in the VM's storage manager.
As a helpful tool check these pages out...
Configuring Disks and Storage
Hyper-V Clustering Video 1
Hyper-V Clustering Video 2
Hyper-V Clustering Video 3
Hyper-V Clustering Video 4
Hyper-V Clustering Video 5