How can I clear all children nodes of a data node, but NOT delete the data node itself in zookeeper? - apache-zookeeper

I have a znode: /test
And /test has two children nodes: /test/data1, /test/data2
How can I delete /test/data1 and /test/data2, but at the same time, NOT delete the node /test?

You can execute something like the following:
zkCli.sh -server xxx ls /test | \
grep "^\[" | \
grep -o -P "\w*" | \
while read znode ; do zkCli.sh -server xxx delete /test/$znode ; done
This uses only zkCli.sh and standard shell commands, but it is not optimal: it connects to the ZooKeeper server multiple times (once for each direct child deleted, plus once to fetch the children list). A more robust approach is to use a ZooKeeper client library such as kazoo or the ZooKeeper Java API for this kind of task.
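If connecting repeatedly with zkCli.sh is a concern, the same clean-up can be sketched with kazoo, the Python client mentioned above. The host address and znode path below are placeholders; child_paths is pure logic, while clear_children assumes a reachable ZooKeeper server and an installed kazoo package:

```python
def child_paths(parent, children):
    """Build the full path of each direct child of `parent`."""
    parent = parent.rstrip("/")
    return ["{}/{}".format(parent, c) for c in children]

def clear_children(hosts, parent):
    """Delete every direct child of `parent`, but keep `parent` itself."""
    from kazoo.client import KazooClient  # pip install kazoo
    zk = KazooClient(hosts=hosts)
    zk.start()
    try:
        for path in child_paths(parent, zk.get_children(parent)):
            zk.delete(path, recursive=True)  # also clears grandchildren
    finally:
        zk.stop()

# clear_children("xxx:2181", "/test")  # uncomment with a real server address
```

This does in one session what the shell pipeline does with one connection per child.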

Related

Why do some buckets not appear after a gsutil ls?

When I do gsutil ls -p myproject-id I get a list of buckets (in my case 2 buckets), which I expect to be the list of all my buckets in the project:
gs://bucket-one/
gs://bucket-two/
But, if I do gsutil ls -p myproject-id gs://asixtythreecharacterlongnamebucket I actually get the elements of that long-named bucket:
gs://asixtythreecharacterlongnamebucket/somefolder/
So my question is: why, when I list the buckets of the project, does the long-named bucket not appear in the results?
The only explanation that made sense to me was this: https://stackoverflow.com/a/34738829/3457432
But I'm not sure. Is this the reason? Or could it be other ones?
Are you sure that asixtythreecharacterlongnamebucket belongs to myproject-id? It really sounds like asixtythreecharacterlongnamebucket was created in a different project.
You can verify this by checking the bucket ACLs for asixtythreecharacterlongnamebucket and bucket-one and seeing if the project numbers in the listed entities match:
$ gsutil ls -Lb gs://asixtythreecharacterlongnamebucket | grep projectNumber
$ gsutil ls -Lb gs://bucket-one | grep projectNumber
Also note that the -p argument to ls has no effect in your second command when you're listing objects in some bucket. The -p argument only affects which project should be used when you're listing buckets in some project, as in your first command. Think of ls as listing the children resources belonging to some parent -- the parent of a bucket is a project, while the parent of an object is a bucket.
You don't perform the same request!
gsutil ls -p myproject-id
Here you ask for all the buckets that belong to a project.
gsutil ls -p myproject-id gs://asixtythreecharacterlongnamebucket
Here you ask for all the objects that belong to the bucket asixtythreecharacterlongnamebucket; myproject-id is used only as the quota project.
In both cases, you need permission to access the resources.

Copy a file into kubernetes pod without using kubectl cp

I have a use case where my pod runs as a non-root user, and it is running a Python app.
Now I want to copy a file from the master node to the running pod. But when I try to run
kubectl cp app.py 103000-pras-dev/simplehttp-777fd86759-w79pn:/tmp
the command hangs, whereas when I run the pod as the root user the same command executes successfully. I was going through the code of kubectl cp, and internally it uses the tar command.
tar has multiple flags like --overwrite, --no-same-owner, --no-same-permissions and a few others, but kubectl cp gives no way to pass those flags through to tar. Is there any way I can copy the file using kubectl exec, or any other way?
kubectl exec simplehttp-777fd86759-w79pn -- cp app.py /tmp/ **flags**
If the source file is a simple text file, here's my trick:
#!/usr/bin/env bash

function copy_text_to_pod() {
  namespace=$1
  pod_name=$2
  src_filename=$3
  dest_filename=$4
  base64_text=$(cat "$src_filename" | base64)
  kubectl exec -n "$namespace" "$pod_name" -- bash -c "echo \"$base64_text\" | base64 -d > $dest_filename"
}

copy_text_to_pod my-namespace my-pod-name /path/of/source/file /path/of/target/file
base64 may not strictly be necessary; I use it in case there are special characters in the source file.
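The encode-then-decode idea above can also be driven from Python. This is a sketch, not the answerer's script: build_copy_command and its arguments are illustrative, and the round-trip assertion shows why base64 makes special characters safe to pass through the shell:

```python
import base64
import shlex

def build_copy_command(namespace, pod, data: bytes, dest):
    """Build the same kind of kubectl-exec command as the bash function
    above: base64-encode locally, decode on the pod. All names here
    (namespace, pod, dest) are illustrative placeholders."""
    encoded = base64.b64encode(data).decode("ascii")
    remote = "echo {} | base64 -d > {}".format(encoded, shlex.quote(dest))
    return ["kubectl", "exec", "-n", namespace, pod, "--", "bash", "-c", remote]

# base64 round-trips arbitrary bytes, which is why special characters are safe
payload = b"print('hi')\n\x00\xff"
assert base64.b64decode(base64.b64encode(payload)) == payload

cmd = build_copy_command("my-namespace", "my-pod", b"hello\n", "/tmp/app.py")
```

Running the returned list through subprocess.run would perform the actual copy, assuming kubectl is configured.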
Meanwhile I found a hack. Disclaimer: this is not exactly kubectl cp, just a workaround.
I have written a Go program with a goroutine that reads the file and attaches it to stdin, then runs kubectl exec tar with the proper flags. Here is what I did:
reader, writer := io.Pipe()
cmd := exec.CommandContext(ctx, "kubectl", "exec", pod.Name,
    "--namespace", pod.Namespace, "-c", container.Name, "-i",
    "--", "tar", "xmf", "-", "-C", "/", "--no-same-owner") // pass all the flags you want to
cmd.Stdin = reader
go func() {
    defer writer.Close()
    if err := util.CreateMappedTar(writer, "/", files); err != nil {
        logrus.Errorln("Error creating tar archive:", err)
    }
}()
Helper function definition
func CreateMappedTar(w io.Writer, root string, pathMap map[string]string) error {
    tw := tar.NewWriter(w)
    defer tw.Close()
    for src, dst := range pathMap {
        if err := addFileToTar(root, src, dst, tw); err != nil {
            return err
        }
    }
    return nil
}
Obviously this still fails because of the permission issue, but I was able to pass the tar flags.
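For readers who don't use Go, the same src-to-dst tar mapping can be sketched with Python's stdlib tarfile module. The function and archive paths below are illustrative, not the answerer's helper:

```python
import io
import os
import tarfile
import tempfile

def create_mapped_tar(fileobj, path_map):
    """Write a tar archive to `fileobj`, storing each local `src` file
    under the archive name `dst` -- the same src->dst mapping idea as
    the Go helper above."""
    with tarfile.open(fileobj=fileobj, mode="w") as tw:
        for src, dst in path_map.items():
            tw.add(src, arcname=dst)

# demo: pack one temporary file under a different archive path
with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "app.py")
    with open(src, "w") as f:
        f.write("print('hi')\n")
    buf = io.BytesIO()
    create_mapped_tar(buf, {src: "tmp/app.py"})
    buf.seek(0)
    names = tarfile.open(fileobj=buf).getnames()
```

Piping the resulting bytes into `kubectl exec ... tar xmf -` would reproduce the Go approach.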
If it is only a text file, it can also be "copied" via netcat.
1) You have to be logged in to both pods:
$ kubectl exec -ti <pod_name> bash
2) Make sure netcat is available; if not, install it:
$ apt-get update
$ apt-get install netcat-openbsd
3) Go to a folder where you have write permission, e.g.
/tmp
4) Inside the container where you have the Python file, write:
$ cat app.py | nc -l <random_port>
Example
$ cat app.py | nc -l 1234
It will start listening on the provided port.
5) Inside the container where you want the file:
$ nc <PodIP_where_you_have_py_file> <random_port> > app.py
Example
$ nc 10.36.18.9 1234 > app.py
It must be the pod IP; nc will not resolve the pod name. To get the IP, use kubectl get pods -o wide.
It will copy the content of app.py to the file in the other container. Unfortunately, you will need to fix permissions manually, or you can use a script like this (the sleep is required due to the speed of "copying"):
#!/bin/sh
nc 10.36.18.9 1234 > app.py; sleep 2; chmod 770 app.py
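The nc listen/fetch pattern in the steps above can be sketched with plain sockets. This demo runs entirely on localhost; in a cluster, the receiver would dial the sender pod's IP and port instead:

```python
import socket
import threading

def serve_file(srv: socket.socket, data: bytes):
    """Like `cat app.py | nc -l <port>`: send `data` to the first client."""
    conn, _ = srv.accept()
    conn.sendall(data)
    conn.close()

def fetch_file(host: str, port: int) -> bytes:
    """Like `nc <host> <port> > app.py`: read until the sender closes."""
    chunks = []
    with socket.create_connection((host, port)) as cli:
        while True:
            chunk = cli.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)

# demo on localhost with an OS-chosen free port
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]
t = threading.Thread(target=serve_file, args=(srv, b"print('hi')\n"))
t.start()
received = fetch_file("127.0.0.1", port)
t.join()
srv.close()
```

The sender closing the connection is what signals end-of-file, which is exactly why the nc receiver stops on its own.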
kubectl cp is a bit of a pain to work with. For example:
installing kubectl and configuring it (you might need it on multiple machines). In our company, most people only have restricted kubectl access from the Rancher web GUI; no CLI access is provided to most people.
network restrictions in enterprises
large file downloads/uploads may stall or freeze, probably because the traffic goes through the k8s API server
weird tar-related errors keep popping up, etc.
One of the reasons for the lack of support for copying files from a pod (or the other way around) is that k8s pods were never meant to be used like VMs; they are meant to be ephemeral. So the expectation is to not store/create any files on the pod/container disk.
But sometimes we are forced to do this, especially while debugging issues or using external volumes.
Below is the solution we found effective. This might not be right for you/your team.
We now instead use Azure blob storage as a mediator to exchange files between a Kubernetes pod and any other location. The container image is modified to include the azcopy utility (the Dockerfile RUN instruction below installs azcopy in your container).
RUN /bin/bash -c 'wget https://azcopyvnext.azureedge.net/release20220511/azcopy_linux_amd64_10.15.0.tar.gz && \
tar -xvzf azcopy_linux_amd64_10.15.0.tar.gz && \
cp ./azcopy_linux_amd64_*/azcopy /usr/bin/ && \
chmod 775 /usr/bin/azcopy && \
rm azcopy_linux_amd64_10.15.0.tar.gz && \
rm -rf azcopy_linux_amd64_*'
Check out this SO question for more on azcopy installation.
When we need to download a file,
we simply use azcopy to copy the file from within the pod to Azure blob storage. This can be done either programmatically or manually.
Then we download the file to the local machine using Azure Storage Explorer, or some job/script picks the file up from the blob container.
A similar thing is done for uploads. The file is first placed in the blob storage container, either manually using Storage Explorer or programmatically. Then, from within the pod, azcopy pulls the file from blob storage and places it inside the pod.
The same can be done with AWS (S3) or GCP or using any other cloud provider.
Probably even SCP, SFTP, or rsync could be used.

How to copy partial topic data from one cluster to another

I have a use case where I need to copy the data from one topic to another topic in a different cluster but I need to copy only from a given offset. What can I use for the above use case?
I have looked into MirrorMaker, as it copies data from one cluster to another, but I cannot see how to specify the starting offset.
Is there any utility I can use?
If, as you say, "this will be a one time operation", you can use kafkacat with its -o option.
For example (the easiest case):
kafkacat -C -b mybroker_cluster_1:9092 -t mytopic1 -o <offset> | \
kafkacat -P -b mybroker_cluster_2:9092 -t mytopic1
You probably still need to add a few parameters to the consumer:
-X message.max.bytes=<value> -X fetch.message.max.bytes=<value> -X receive.message.max.bytes=<value>
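If the one-off copy needs to be scripted, the same producer-to-consumer pipe can be wired up with Python's subprocess module. The kafkacat invocation in the comment is an untested placeholder (broker addresses, topic, and offset are assumptions; note the -e flag so the consumer exits at end of partition), while the runnable demo uses harmless stand-in commands:

```python
import subprocess

def pipe(producer_cmd, consumer_cmd) -> int:
    """Connect producer stdout to consumer stdin, like `a | b` in the shell.
    Returns the consumer's exit code."""
    p1 = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(consumer_cmd, stdin=p1.stdout)
    p1.stdout.close()  # let the consumer see EOF when the producer exits
    p2.wait()
    p1.wait()
    return p2.returncode

# the kafkacat form would look like this (all values are placeholders):
# pipe(["kafkacat", "-C", "-b", "cluster1:9092", "-t", "mytopic1", "-o", "1000", "-e"],
#      ["kafkacat", "-P", "-b", "cluster2:9092", "-t", "mytopic1"])

# harmless stand-in so the sketch is runnable anywhere:
rc = pipe(["echo", "hello"], ["cat"])
```

Without -e, the consuming kafkacat keeps waiting for new messages and the pipeline never terminates on its own.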

Can we see transfer progress with kubectl cp?

Is it possible to know the progress of file transfer with kubectl cp for Google Cloud?
No, this doesn't appear to be possible.
kubectl cp appears to be implemented by doing the equivalent of
kubectl exec podname -c containername \
tar cf - /whatever/path \
| tar xf -
This means two things:
tar(1) doesn't print any useful progress information. (You could in principle add a v flag to print out each file name as it goes by to stderr, but that won't tell you how many files in total there are or how large they are.) So kubectl cp as implemented doesn't have any way to get this out.
There's not a richer native Kubernetes API to copy files.
If moving files in and out of containers is a key use case for you, it will probably be easier to build, test, and run by adding a simple HTTP service. You can then rely on things like the HTTP Content-Length: header for progress metering.
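As a sketch of that suggestion, Python's stdlib is enough to show how a Content-Length header turns progress into a simple ratio. The in-memory payload and handler below are illustrative, not a production file server:

```python
import http.server
import threading
import urllib.request

# serve one in-memory "file"; in a real pod this would come from disk
PAYLOAD = b"x" * 50_000

class FileHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()
        self.wfile.write(PAYLOAD)

    def log_message(self, *args):  # keep the demo quiet
        pass

srv = http.server.HTTPServer(("127.0.0.1", 0), FileHandler)
threading.Thread(target=srv.serve_forever, daemon=True).start()

# client side: Content-Length gives the total, so progress is a simple ratio
resp = urllib.request.urlopen("http://127.0.0.1:{}/file".format(srv.server_port))
total = int(resp.headers["Content-Length"])
received = 0
while True:
    chunk = resp.read(8192)
    if not chunk:
        break
    received += len(chunk)
    # e.g. print("{}/{} bytes ({}%)".format(received, total, 100 * received // total))
srv.shutdown()
```

This is exactly the information tar-over-exec cannot give you: a known total before the transfer starts.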
One option is to use pv which will show time elapsed, data transferred and throughput (eg MB/s):
$ kubectl exec podname -c containername -- tar cf - /whatever/path | pv | tar xf -
14.1MB 0:00:10 [1.55MB/s] [ <=> ]
If you know the expected transfer size ahead of time you can also pass this to pv and it will then calculate a % progress and also an ETA, eg for a 100m transfer:
$ kubectl exec podname -c containername -- tar cf - /whatever/path | pv -s 100m | tar xf -
13.4MB 0:00:09 [1.91MB/s] [==> ] 13% ETA 0:00:58
You obviously need to have pv installed (locally) for any of the above to work.
It's not possible, but you can find here how to implement rsync with Kubernetes; rsync shows you the progress of the file transfer:
rsync files to a kubernetes pod
I figured out a hacky way to do this. If you have shell access to the container you're copying to, you can run something like wc -c <file> on the remote side and compare that to the size locally. du -h <file> is another option; it gives human-readable output, so it may be better.
On macOS, there is still the hacky option of opening Activity Monitor on the Network tab. If you are copying with kubectl cp from your local machine to a distant pod, the total transfer is shown in the "Sent Bytes" column.
Not of super high precision, but it sort of does the job without installing anything new.
I know it doesn't show active progress for each file, but it does output a status, including a byte count, for each completed file, which for multiple files run via scripts is almost as good as active progress:
kubectl cp local.file container:/path/on/container --v=4
Note that --v=4 is verbose mode and will give you output; I found kubectl cp shows this output at verbosity levels 3 through 5.

howto: elastic beanstalk + deploy docker + graceful shutdown

Hi great people of Stack Overflow,
We're hosting a Docker container on EB with Node.js-based code running in it.
When redeploying our Docker container, we'd like the old one to do a graceful shutdown.
I've found help & guides on how our code could receive the SIGTERM signal produced by the 'docker stop' command.
However, further investigation into the EB machine running Docker, at:
/opt/elasticbeanstalk/hooks/appdeploy/enact/01flip.sh
shows that when "flipping" from the current container to the newly staged one, the old one is killed with 'docker kill'.
Is there any way to change this behaviour to 'docker stop'?
Or, in general, is there a recommended approach to handling graceful shutdown of the old container?
Thanks!
Self answering as I've found a solution that works for us:
tl;dr: use .ebextensions scripts to run your script before 01flip; your script will make sure whatever is inside the container shuts down gracefully.
first,
your app (or whatever you're running in Docker) has to be able to catch a signal, SIGINT for example, and shut down gracefully upon it.
This is totally unrelated to Docker; you can test it running anywhere (locally, for example).
There is a lot of info on the net about getting this kind of behaviour done for different kinds of apps (be it Ruby, Node.js, etc.).
Second,
your EB/Docker-based project can have a .ebextensions folder that holds all kinds of scripts to execute while deploying.
We put 2 custom scripts into it, gracefulshutdown_01.config and gracefulshutdown_02.config, which look something like this:
# gracefulshutdown_01.config
commands:
  backup-original-flip-hook:
    command: cp -f /opt/elasticbeanstalk/hooks/appdeploy/enact/01flip.sh /opt/elasticbeanstalk/hooks/appdeploy/01flip.sh.bak
    test: '[ ! -f /opt/elasticbeanstalk/hooks/appdeploy/01flip.sh.bak ]'
  cleanup-custom-hooks:
    command: rm -f 05gracefulshutdown.sh
    cwd: /opt/elasticbeanstalk/hooks/appdeploy/enact
    ignoreErrors: true
and:
# gracefulshutdown_02.config
commands:
  reorder-original-flip-hook:
    command: mv /opt/elasticbeanstalk/hooks/appdeploy/enact/01flip.sh /opt/elasticbeanstalk/hooks/appdeploy/enact/10flip.sh
    test: '[ -f /opt/elasticbeanstalk/hooks/appdeploy/enact/01flip.sh ]'

files:
  "/opt/elasticbeanstalk/hooks/appdeploy/enact/05gracefulshutdown.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/bin/sh

      # find currently running docker
      EB_CONFIG_DOCKER_CURRENT_APP_FILE=$(/opt/elasticbeanstalk/bin/get-config container -k app_deploy_file)
      EB_CONFIG_DOCKER_CURRENT_APP=""
      if [ -f $EB_CONFIG_DOCKER_CURRENT_APP_FILE ]; then
        EB_CONFIG_DOCKER_CURRENT_APP=`cat $EB_CONFIG_DOCKER_CURRENT_APP_FILE | cut -c 1-12`
        echo "Graceful shutdown on app container: $EB_CONFIG_DOCKER_CURRENT_APP"
      else
        echo "NO CURRENT APP TO GRACEFUL SHUTDOWN FOUND"
        exit 0
      fi

      # give graceful kill command to all running .js files (not stats!!)
      docker exec $EB_CONFIG_DOCKER_CURRENT_APP sh -c "ps x -o pid,command | grep -E 'workers' | grep -v -E 'forever|grep'" | awk '{print $1}' | xargs docker exec $EB_CONFIG_DOCKER_CURRENT_APP kill -s SIGINT
      echo "sent kill signals"

      # wait (max 5 mins) until processes are done and terminate themselves
      TRIES=100
      until [ $TRIES -eq 0 ]; do
        PIDS=`docker exec $EB_CONFIG_DOCKER_CURRENT_APP sh -c "ps x -o pid,command | grep -E 'workers' | grep -v -E 'forever|grep'" | awk '{print $1}' | cat`
        echo TRIES $TRIES PIDS $PIDS
        if [ -z "$PIDS" ]; then
          echo "finished graceful shutdown of docker $EB_CONFIG_DOCKER_CURRENT_APP"
          exit 0
        else
          TRIES=$((TRIES-1))
          sleep 3
        fi
      done

      echo "failed to graceful shutdown, please investigate manually"
      exit 1
gracefulshutdown_01.config is a small util that backs up the original 01flip and deletes (if it exists) our custom script.
gracefulshutdown_02.config is where the magic happens.
It creates a 05gracefulshutdown enact script and makes sure the flip happens afterwards by renaming 01flip to 10flip.
05gracefulshutdown, the custom script, basically does this:
find the currently running Docker container
find all processes that need to be sent a SIGINT (for us, it's processes with 'workers' in their name)
send a SIGINT to the above processes
loop:
check if the processes from before were killed
continue looping for a number of tries
if the tries run out, exit with status 1 and don't continue to 10flip; manual intervention is needed.
This assumes you only have one Docker container running on the machine, and that you are able to hop on manually to check what's wrong in case it fails (for us, this hasn't happened yet).
I imagine it can also be improved in many ways, so have fun.
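The wait loop in 05gracefulshutdown can be expressed as a small, testable function. This Python sketch mirrors the shell logic (poll, decrement tries, sleep); the fake process table in the demo stands in for the `docker exec ... ps` pipeline:

```python
import time

def wait_for_shutdown(list_pids, tries=100, interval=3.0):
    """Poll `list_pids()` until it returns an empty list (all workers
    gone) or `tries` runs out -- the same loop as the shell script.
    Returns True on clean shutdown, False if manual action is needed."""
    while tries > 0:
        if not list_pids():
            return True
        tries -= 1
        time.sleep(interval)
    return False

# demo with a fake process table that empties after three polls
remaining = [[101, 102], [101], []]
ok = wait_for_shutdown(lambda: remaining.pop(0), tries=10, interval=0)
```

With tries=100 and interval=3.0, the function gives workers the same five-minute budget as the script before declaring failure.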