Can I increase the disk size while Kafka is running? - apache-kafka

I have an Apache Kafka 2.7 cluster in a production environment. The disks we use for data storage are 90% full. How can I increase the disk capacity? The disks are in a virtual environment, but could you also share the answer for the case where physical disks are attached?

Assuming you've configured log.dirs to use specific volume mounts, then no.
Each directory listed in that value contains the topic-partition folders directly; it is not a parent folder of all volumes, which is what would make dynamically adding directories feasible.
Drives may be hot-swappable on the physical server (or VM), yes, but you'll still need to edit the Kafka properties file to include the new mount path and restart the broker.
Also, if you add new disks, Kafka will not rebalance existing data, meaning you'll still have disks that are 90% full. In any environment, you'd need to create/acquire a new, larger disk, shut down the machine, replicate the disk, then attach the new one and restart.
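For example, attaching a second volume boils down to a server.properties change and a broker restart; the mount paths below are placeholders:
# server.properties - before
log.dirs=/mnt/kafka-data-1
# server.properties - after attaching and mounting a new volume
log.dirs=/mnt/kafka-data-1,/mnt/kafka-data-2
Only partitions created after the change land in the new directory; existing partitions stay where they are unless you move them yourself, for example with the kafka-reassign-partitions tool.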

Related

Install Postgres on a removable volume on Linux?

Cloud platforms like Linode.com often provide hot-pluggable storage volumes that you can easily attach and detach from a Linux virtual machine without restarting it.
I am looking for a way to install Postgres so that its data and configuration end up on a volume that I have mounted to the virtual machine. The end result should allow me to shut down the machine, detach the volume, spin up another machine with an identical version of Postgres already installed, attach the volume, and have Postgres work just like it did on the old machine, with all the data, file system permissions and server-wide configuration intact.
Is such a thing possible? Is there a reliable way to move installations (i.e. databases and configuration, not the actual binaries) of Postgres across machines?
CLARIFICATION: the virtual machine has two disks:
the "built-in" one which is created when the VM is created and mounted to /. That's where Postgres gets installed to and you can't move this disk.
the hot-pluggable disk which you can easily attach and detach from a running VM. This is where I want Postgres data and configuration to be so I can just detach the disk (after shutting down the VM to prevent data loss/corruption) and attach it to another VM when I want my data to move so it behaves like it did on the old VM (i.e. no failures to start Postgres, no errors about permissions or missing files, etc).
This works just fine. It is not really any different to starting and stopping PostgreSQL and not removing the disk. There are a couple of things to consider though.
You have to make sure it is stopped and writes are synced before unmounting the volume. Obvious enough, and I can't believe you'd be able to unmount before the sync completed, but worth repeating.
You will want the same version of PostgreSQL, probably on the same version of operating system with the same locales too. Different distributions might compile it with different options.
Although you can put configuration and data in the same directory hierarchy, most distros tend to put config in /etc. If you compile from source yourself this won't be a problem. Alternatively, you can usually override the default locations or, and this is probably simpler, bind-mount the data and config directories into the places your distro expects.
Note that if your storage allows you to connect the same volume to multiple hosts in some sort of "read only" mode, that won't work.
Edit: steps from comment moved into body for easier reading.
1. Start up PG, create a table, put one row in it.
2. Stop PG.
3. Mount your volume at /mnt/db
4. rsync /var/lib/postgresql/NN/main to /mnt/db/pg_data and /etc/postgresql/NN/main to /mnt/db/pg_etc
5. Rename /var/lib/postgresql/NN/main to add .OLD to the name, and do the same with the /etc directory.
6. Bind-mount the dirs from /mnt to replace them.
7. Restart PG.
8. Test.
9. Repeat.
10. Return to step 8 until you are happy.
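A rough shell sketch of steps 2 through 7, assuming a Debian-style layout; the PostgreSQL version NN, device name and mount point are placeholders:
sudo systemctl stop postgresql
sudo mount /dev/sdb1 /mnt/db                    # the hot-pluggable volume
sudo mkdir -p /mnt/db/pg_data /mnt/db/pg_etc
sudo rsync -a /var/lib/postgresql/NN/main/ /mnt/db/pg_data/   # -a preserves ownership and permissions
sudo rsync -a /etc/postgresql/NN/main/ /mnt/db/pg_etc/
sudo mv /var/lib/postgresql/NN/main /var/lib/postgresql/NN/main.OLD
sudo mv /etc/postgresql/NN/main /etc/postgresql/NN/main.OLD
sudo mkdir /var/lib/postgresql/NN/main /etc/postgresql/NN/main
sudo mount --bind /mnt/db/pg_data /var/lib/postgresql/NN/main
sudo mount --bind /mnt/db/pg_etc /etc/postgresql/NN/main
sudo systemctl start postgresql
To survive reboots, the volume mount and the two bind mounts also need entries in /etc/fstab.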

What is in the ZooKeeper dataDir and how to clean it up?

I found that my ZooKeeper dataDir is huge. I would like to understand:
What is in the dataDir?
How do I clean it up? Does it clean up automatically after a certain period?
Thanks
According to ZooKeeper's administrator guide:
The ZooKeeper Data Directory contains files which are a persistent copy of the znodes stored by a particular serving ensemble. These are the snapshot and transactional log files. As changes are made to the znodes these changes are appended to a transaction log. Occasionally, when a log grows large, a snapshot of the current state of all znodes will be written to the filesystem. This snapshot supersedes all previous logs.
So in short, for your first question, you can assume that dataDir is used to store Zookeeper's state.
As for your second question, there is no automatic cleanup. From the doc:
A ZooKeeper server will not remove old snapshots and log files, this is the responsibility of the operator. Every serving environment is different and therefore the requirements of managing these files may differ from install to install (backup for example).
The PurgeTxnLog utility implements a simple retention policy that administrators can use. The API docs contains details on calling conventions (arguments, etc...).
In the following example the last <count> snapshots and their corresponding logs are retained and the others are deleted. The value of <count> should typically be greater than 3 (although not required, this provides 3 backups in the unlikely event a recent log has become corrupted). This can be run as a cron job on the ZooKeeper server machines to clean up the logs daily.
java -cp zookeeper.jar:log4j.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>
If this is a dev instance, I guess you could just purge the folder almost completely (except for some files like myid, if it's there). But for a production instance you should follow the cleanup procedure shown above.
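For example, a daily crontab entry along these lines (the install directory, data directory, user and retention count are placeholders for your own setup; newer ZooKeeper releases also ship a zkCleanup.sh wrapper around the same utility):
# /etc/cron.d/zookeeper-purge: keep the 5 most recent snapshots, run daily at 03:00
0 3 * * * zookeeper cd /opt/zookeeper && java -cp zookeeper.jar:log4j.jar:conf org.apache.zookeeper.server.PurgeTxnLog /var/lib/zookeeper /var/lib/zookeeper -n 5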

How to increase the boot partition of hdd.img in Yocto

I created an hdd.img using Yocto and copied it to a pen drive using the dd command.
On a PC it is mounted as a single partition, "boot".
When I checked on the command line using sudo fdisk -l, it showed four partitions.
I booted my hardware from the pen drive; now I want to remove rootfs.img from the boot partition mounted at /media/realroot and copy the same rootfs.img back to the same partition.
It gives a "No disk space" error, even though I am copying the same rootfs.img that I deleted.
Why is it using only 700 MB for the boot partition of a 16 GB pen drive, and how can I increase the boot partition size in Yocto to make use of the full 16 GB?
I'm not sure I understood the real problem, but if you are looking for a way to increase the free space on the rootfs, take a look at IMAGE_ROOTFS_EXTRA_SPACE as well as IMAGE_OVERHEAD_FACTOR.
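A minimal sketch of how those variables are set in conf/local.conf (or in an image recipe); the values are only examples:
# Add roughly 2 GB of extra free space to the rootfs (the value is in kilobytes)
IMAGE_ROOTFS_EXTRA_SPACE = "2097152"
# Or scale the computed rootfs size by a factor (the default is 1.3)
IMAGE_OVERHEAD_FACTOR = "1.5"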

How to control various log sizes

I have a cluster running in Azure.
I have multiple gigabytes of log data under D:\SvcFab\Log\Traces. Is there a way to control the amount of trace data that is collected/stored? Will the logs grow indefinitely?
Also the D:\SvcFab\ReplicatorLog has 8GB of preallocated data as specified by SharedLogSizeInMB parameter (https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-reliable-services-configuration). How can I change this setting in Azure cluster or should it always be kept default?
For Azure clusters the SvcFab\Log folder will grow up to 5 GB. It will also shrink if it detects that your disk is running out of space (<1 GB). There are no controls for this in Azure.
This may be old, but if you still have this issue, the solution is to add a parameter in the ARM template for the Service Fabric cluster. There are some other ways to do this, but this one is the most reliable:
https://techcommunity.microsoft.com/t5/azure-paas-developer-blog/reduce-log-size-on-service-fabric-node/ba-p/1017493
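For illustration, the linked post amounts to adding fabricSettings entries to the cluster resource in the ARM template. The section and parameter names below (Diagnostics/MaxDiskQuotaInMB for the trace folder, KtlLogger/SharedLogSizeInMB for the replicator log) are taken from the Service Fabric cluster settings documentation as I recall it, and the values are examples only, so verify them against the post and current docs before applying:
"fabricSettings": [
  {
    "name": "Diagnostics",
    "parameters": [
      { "name": "MaxDiskQuotaInMB", "value": "2048" }
    ]
  },
  {
    "name": "KtlLogger",
    "parameters": [
      { "name": "SharedLogSizeInMB", "value": "4096" }
    ]
  }
]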

Copy file using ZooKeeper

I have a distributed application, and I use ZooKeeper to manage configuration data across all the distributed servers. My service on each server needs some DLLs to run. I am trying to build a centralized system from which I can copy my DLLs to all the servers.
Can I achieve that using ZooKeeper?
I am aware that "ZooKeeper is generally not designed for large size storage". My DLL files are each less than 3 MB.
There is a 1 MB soft limit on how large znode data can get. According to the docs, you can increase the maximum data size:
jute.maxbuffer:
(Java system property: jute.maxbuffer)
This option can only be set as a Java system property. There is no zookeeper prefix on it. It specifies the maximum size of the data that can be stored in a znode. The default is 0xfffff, or just under 1M. If this option is changed, the system property must be set on all servers and clients otherwise problems will arise. This is really a sanity check. ZooKeeper is designed to store data on the order of kilobytes in size.
I would not recommend using ZooKeeper for this purpose (you could much more easily host the binaries on a web server instead), but it does seem possible in theory.
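If you do try it, the property has to be set on every server and every client, as the quoted docs warn. A sketch of one common way to do that (the 4 MB value and the conf/zookeeper-env.sh location are assumptions about a standard ZooKeeper install; on the client side you pass the same -D flag to your own application's JVM):
# conf/zookeeper-env.sh, sourced by zkEnv.sh: allow znode data up to ~4 MB
SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Djute.maxbuffer=4194304"
CLIENT_JVMFLAGS="$CLIENT_JVMFLAGS -Djute.maxbuffer=4194304"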
ZooKeeper is designed to transfer messages inside the cluster.
The best thing you can do is create a parent znode, say Znode_A, that will contain child znodes, and watch it for changes. Each child znode under Znode_A represents a DLL and contains that DLL's path. Every node in the cluster watches Znode_A for changes, so when a new DLL (znode) is created, the nodes know to copy that DLL from a main repository.
To transfer the actual files you can use SCP: the znode data carries the file path of the DLL, and each node pulls the file from the base repository over SCP.
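A rough sketch of that flow using the stock zkCli.sh client and scp; the /dlls parent znode, host names and paths are invented for illustration, and a real deployment would set the watch from a client library (Curator, Kazoo, etc.) rather than running commands by hand:
# Publisher: register a new DLL by storing its repository path in a child znode
zkCli.sh -server zk1:2181 create /dlls ""
zkCli.sh -server zk1:2181 create /dlls/mylib.dll "/repo/share/mylib.dll"
# Consumer (each cluster node, triggered by its watch on /dlls):
# read the path stored in the new znode, then pull the file from the repo host
zkCli.sh -server zk1:2181 get /dlls/mylib.dll    # prints /repo/share/mylib.dll
scp repohost:/repo/share/mylib.dll /opt/myservice/bin/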