Kubectl appears to be discarding standard output

I'm trying to copy the contents of a large (~350 files, ~40MB total) directory from a Kubernetes pod to my local machine. I'm using the technique described here.
Sometimes it succeeds, but very frequently the standard output piped to the tar xf command on my host appears to get truncated. When that happens, I see errors like:
<some file in the archive being transmitted over the pipe>: Truncated tar archive
The files in the source directory don't change. The file named in the error message usually differs between runs (i.e., the archive appears to be truncated in a different place each time).
For reference (copied from the document linked to above), this is analogous to what I'm running (I'm using a different pod name and directory names):
kubectl exec -n my-namespace my-pod -- tar cf - /tmp/foo | tar xf - -C /tmp/bar
After running it, I expect the contents of my local /tmp/bar to be the same as those in the pod.
However, more often than not, it fails. My current theory (I have a very limited understanding of how kubectl works, so this is all speculation) is that when kubectl determines that the tar command has completed, it terminates -- regardless of whether or not there are remaining bytes in transit (over the network) containing the contents of standard output.
I've tried various combinations of:
stdbuf
Changing tar's blocking factor
Making the command take longer to run (by adding && sleep <x>)
I'm not going to list all combinations I've tried, but this is an example that uses everything:
kubectl exec -n my-namespace my-pod -- stdbuf -o 0 tar -b 1 -c -f - -C /tmp/foo . && sleep 2 | tar xf - -C /tmp/bar
There are combinations of that command that I can make work pretty reliably. For example, forgetting about stdbuf and -b 1 and just sleeping for 100 seconds, i.e.:
kubectl exec -n my-namespace my-pod -- tar -c -f - -C /tmp/foo . && sleep 100 | tar xf - -C /tmp/bar
But even more experimentation led me to believe that the block size of tar (512 bytes, I believe?) was still too large (the argument to -b is a count of blocks, not the size of those blocks). This is the command I'm using for now:
kubectl exec -n my-namespace my-pod -- bash -c 'dd if=<(tar cf - -C /tmp/foo .) bs=16 && sleep 10' | tar xf - -C /tmp/bar
And yes, I HAD to make bs that small and sleep "that big" to make it work. But this at least gives me two variables I can mess with. I did find that if I set bs=1, I didn't have to sleep... but it took a LONG time to move all the data (one byte at a time).
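For what it's worth, here's a minimal sketch of how this workaround could be wrapped into a reusable script (the bs and sleep values are just the ones that happened to work for me and would need tuning):
#!/usr/bin/env bash
# Sketch of the dd/sleep workaround above; all arguments are placeholders to tune.
NAMESPACE="$1"; POD="$2"; SRC_DIR="$3"; DEST_DIR="$4"
BS="${5:-16}"     # dd block size: smaller seems safer, but slower
SLEEP="${6:-10}"  # seconds to linger so in-flight bytes can drain
mkdir -p "$DEST_DIR"
kubectl exec -n "$NAMESPACE" "$POD" -- \
  bash -c "dd if=<(tar cf - -C '$SRC_DIR' .) bs=$BS && sleep $SLEEP" \
  | tar xf - -C "$DEST_DIR"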
So, I guess my questions are:
Is my theory correct that kubectl truncates standard output once it determines the command given to exec has finished?
Is there a better solution to this problem?

Maybe you haven't been specific enough with kubectl about what the full command really is. There may be ambiguity as to who is responsible for the pipe: the "--" probably doesn't direct kubectl to include the pipe as part of the remote command, so it is being intercepted by your local shell instead.
Have you tried wrapping all of it in double quotes?
CMD="tar cf - /tmp/foo | tar xf - -C /tmp/bar"
kubectl exec -n my-namespace my-pod -- "${CMD}"
Note that kubectl exec does not invoke a shell on its own, so sh -c is needed for the pipe to be interpreted inside the pod. That way, saving at the target is included in the process that kubectl monitors for completion.

Related

Error while trying to copy a file from a container to my local mac

I'm trying to copy a file from a container to my Mac. I also looked into this question, How to copy files from kubernetes Pods to local system, but it didn't help.
at first, I tried
kubectl cp -n company -c company-web-falcon company/company-web-falcon-bb86d79cf-6jcqq:/etc/identity/ca/security-ca.pem /etc/identity/ca/security-ca.pem
which resulted in this error
tar: Removing leading `/' from member names
So I tried this
kubectl cp -n company -c company-web-falcon company/company-web-falcon-bb86d79cf-6jcqq:/etc/identity/ca/security-ca.pem /etc/identity/ca/security-ca.pem .
which resulted in this error
error: source and destination are required
How should I fix it?
Based on the comments, it seems WORKDIR is set either in the Dockerfile or in the deployment; in that case it will fail. So if you want to achieve this, you should first move the file to the working directory:
kubectl exec company-web-falcon-bb86d79cf-6jcqq -c company-web-falcon -- bash -c "cp /etc/identity/ca/security-ca.pem ." && kubectl cp -n company -c company-web-falcon company/company-web-falcon-bb86d79cf-6jcqq:security-ca.pem /etc/identity/ca/security-ca.pem
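If you need this regularly, a small helper script along these lines could automate the staging and cleanup (a sketch; it assumes cp and a writable working directory in the container):
#!/usr/bin/env bash
# Sketch: stage the file in the pod's working directory, copy it out, then clean up.
NS=company
POD=company-web-falcon-bb86d79cf-6jcqq
CONTAINER=company-web-falcon
SRC=/etc/identity/ca/security-ca.pem
FILE=$(basename "$SRC")
kubectl exec -n "$NS" "$POD" -c "$CONTAINER" -- cp "$SRC" . \
  && kubectl cp -n "$NS" -c "$CONTAINER" "$NS/$POD:$FILE" "./$FILE" \
  && kubectl exec -n "$NS" "$POD" -c "$CONTAINER" -- rm "$FILE"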
You can copy the file with the kubectl cp command:
Copy /tmp/foo_dir local directory to /tmp/bar_dir in a remote pod in the default namespace
kubectl cp /tmp/foo_dir <some-pod>:/tmp/bar_dir
Copy /tmp/foo local file to /tmp/bar in a remote pod in a specific container
kubectl cp /tmp/foo <some-pod>:/tmp/bar -c <specific-container>
Copy /tmp/foo local file to /tmp/bar in a remote pod in namespace
kubectl cp /tmp/foo <some-namespace>/<some-pod>:/tmp/bar
Copy /tmp/foo from a remote pod to /tmp/bar locally
kubectl cp <some-namespace>/<some-pod>:/tmp/foo /tmp/bar
As for the warning you receive, you can find more details in the GitHub issues.
There are multiple accepted explanations. The best one is the following:
You move the wanted file to the working dir in the pod (the directory that is opened automatically when you open bash on it):
user@podname:/usr/src# ls data.txt
data.txt
In this case it is the /usr/src folder.
Then, in the local bash terminal:
user@local:~$ kubectl cp podname:data.txt data.txt
user@local:~$ ls data.txt
data.txt

Why does the kubectl cp command terminate with exit code 126?

I am trying to copy files from the pod to local using following command:
kubectl cp /namespace/pod_name:/path/in/pod /path/in/local
But the command terminates with exit code 126 and copy doesn't take place.
Similarly while trying from local to pod using following command:
kubectl cp /path/in/local /namespace/pod_name:/path/in/pod
It throws the following error:
OCI runtime exec failed: exec failed: container_linux.go:367: starting container process caused: exec: "tar": executable file not found in $PATH: unknown
Please help me through this.
kubectl cp is actually a very small wrapper around kubectl exec whatever tar c | tar x. A side effect of this is that you need a working tar executable in the target container, which you do not appear to have.
In general kubectl cp is best avoided; it's usually only good for weird debugging stuff.
kubectl cp requires tar to be present in your container, as the help says:
!!!Important Note!!!
Requires that the 'tar' binary is present in your container
image. If 'tar' is not present, 'kubectl cp' will fail.
Make sure your container contains the tar binary in its $PATH
An alternative way to copy a file from local filesystem into a container:
cat [local file path] | kubectl exec -i -n [namespace] [pod] -c [container] "--" sh -c "cat > [remote file path]"
A useful command to copy a file from a pod to the local machine:
kubectl exec -n <namespace> <pod> -- cat <filename with path> > <filename>
For me the cat worked like this:
cat <file name> | kubectl exec -i <pod-id> -- sh -c "cat > <filename>"
Example:
cat file.json | kubectl exec -i server-77b7976cc7-x25s8 -- sh -c "cat > /tmp/file.json"
I didn't need to specify the namespace since I ran the command from a specific project, and since we have one container, I didn't need to specify it either.

kubectl cp "error: one of src or dest must be a remote file specification"

When I try to copy some files in an existing directory with a wildcard, I receive the error:
kubectl cp localdir/* my-namespace/my-pod:/remote-dir/
error: one of src or dest must be a remote file specification
It looks like wildcard support has been removed, but I have many files to copy and my remote dir is not empty, so I can't use a recursive copy.
How can I run a similar operation?
As a workaround you can use:
find localdir/* | xargs -I{} kubectl cp {} my-namespace/my-pod:/remote-dir/
In find you can use a wildcard to specify the files you are looking for, and xargs will copy each one to the pod.
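If any of the file names contain spaces, a null-delimited variant of the same idea should be safer (a sketch, same placeholder names):
find localdir -maxdepth 1 -type f -print0 \
  | xargs -0 -I{} kubectl cp {} my-namespace/my-pod:/remote-dir/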
Here is what I have come up with:
kubectl exec -n <namespace> <pod_name> -- mkdir -p <dest_dir> \
&& tar cf - -C <src_dir> . | kubectl exec -i -n <namespace> <pod_name> -- tar xf - -C <dest_dir>
Notice there are two parts: the first makes sure the destination directory exists, and the second uses tar to archive the files, send the stream, and unpack it in the container.
Remember, that in order for this to work, it is required that the tar and mkdir binaries are present in your container.
The advantage of this solution over the one proposed earlier (the one with xargs) is that it is faster, because it sends all the files at once rather than one by one.
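For example, to push a local ./config directory into /app/config in the pod (the names here are placeholders):
kubectl exec -n my-namespace my-pod -- mkdir -p /app/config \
&& tar cf - -C ./config . | kubectl exec -i -n my-namespace my-pod -- tar xf - -C /app/config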

Copy a file into kubernetes pod without using kubectl cp

I have a use case where my pod runs as a non-root user and is running a Python app.
Now I want to copy a file from the master node to the running pod. But when I try to run
kubectl cp app.py 103000-pras-dev/simplehttp-777fd86759-w79pn:/tmp
this command hangs, but when I run the pod as the root user and then run the same command, it executes successfully. I was going through the code of kubectl cp, which internally uses the tar command.
tar has multiple flags like --overwrite, --no-same-owner, --no-preserve and a few others, but from kubectl cp we can't pass those flags down to tar. Is there any way I can copy the file using kubectl exec, or some other way?
kubectl exec simplehttp-777fd86759-w79pn -- cp app.py /tmp/ **flags**
If the source file is a simple text file, here's my trick:
#!/usr/bin/env bash
function copy_text_to_pod() {
  namespace=$1
  pod_name=$2
  src_filename=$3
  dest_filename=$4
  base64_text=$(cat "$src_filename" | base64)
  kubectl exec -n "$namespace" "$pod_name" -- bash -c "echo \"$base64_text\" | base64 -d > $dest_filename"
}
copy_text_to_pod my-namespace my-pod-name /path/of/source/file /path/of/target/file
Maybe base64 is not necessary. I put it here in case there is some special character in the source file.
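If special characters aren't a concern, a variant without base64 could simply stream the file over stdin (a sketch using the same placeholder names):
kubectl exec -i -n "$namespace" "$pod_name" -- sh -c "cat > $dest_filename" < "$src_filename"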
Meanwhile I found a hack. Disclaimer: this is not exactly kubectl cp, just a workaround.
I have written a Go program that creates a goroutine to read the file, attaches it to stdin, and runs the kubectl exec tar command with the proper flags. Here is what I did:
// Stream a tar archive into the pod over stdin, so arbitrary tar flags can be used.
reader, writer := io.Pipe()
copy := exec.CommandContext(ctx, "kubectl", "exec", pod.Name, "--namespace", pod.Namespace, "-c", container.Name, "-i",
    "--", "tar", "xmf", "-", "-C", "/", "--no-same-owner") // pass all the flags you want to
copy.Stdin = reader
go func() {
    defer writer.Close()
    // Build the tar archive and write it into the pipe feeding kubectl's stdin.
    if err := util.CreateMappedTar(writer, "/", files); err != nil {
        logrus.Errorln("Error creating tar archive:", err)
    }
}()
Helper function definition
func CreateMappedTar(w io.Writer, root string, pathMap map[string]string) error {
    tw := tar.NewWriter(w)
    defer tw.Close()
    for src, dst := range pathMap {
        if err := addFileToTar(root, src, dst, tw); err != nil {
            return err
        }
    }
    return nil
}
Obviously, this still doesn't work because of the permission issue, but I was able to pass tar flags.
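For comparison, a plain-shell equivalent of what the goroutine above does might look like this (a sketch; the target directory is a placeholder), and it likewise lets you pass the tar flags that kubectl cp hides:
tar cf - app.py | kubectl exec -i simplehttp-777fd86759-w79pn -- tar xmf - -C /tmp --no-same-owner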
If it is only a text file, it can also be "copied" via netcat.
1) You have to be logged into both pods:
$ kubectl exec -ti <pod_name> bash
2) Make sure you have netcat; if not, install it:
$ apt-get update
$ apt-get install netcat-openbsd
3) Go to a folder where you have write permissions, e.g.:
/tmp
4) Inside the container where you have the Python file, run:
$ cat app.py | nc -l <random_port>
Example
$ cat app.py | nc -l 1234
It will start listening on provided port.
5) Inside the container where you want the file, run:
$ nc <PodIP_where_you_have_py_file> <port> > app.py
Example
$ nc 10.36.18.9 1234 > app.py
It must be the pod IP; it will not recognize the pod name. To get the IP, use kubectl get pods -o wide.
It will copy the content of the app.py file to the file in the other container. Unfortunately, you will need to fix the permissions manually, or you can use a script like this (the sleep is required due to the speed of the "copying"):
#!/bin/sh
nc 10.36.18.9 1234 > app.py; sleep 2; chmod 770 app.py
kubectl cp is a bit of a pain to work with. For example:
installing kubectl and configuring it (you might need it on multiple machines). In our company, most people only have restricted kubectl access from the Rancher web GUI; no CLI access is provided to most people.
network restrictions in enterprises
large file downloads/uploads may sometimes stop or freeze, probably because the traffic goes through the k8s API server
weird tar-related errors keep popping up, etc.
One of the reasons for the lack of support for copying files from a pod (or the other way around) is that k8s pods were never meant to be used like VMs; they are meant to be ephemeral. So the expectation is not to store/create any files on the pod/container disk.
But sometimes we are forced to do this, especially while debugging issues or using external volumes.
Below is the solution we found effective. This might not be right for you/your team.
We now instead use Azure Blob Storage as a mediator to exchange files between a Kubernetes pod and any other location. The container image is modified to include the azcopy utility (the Dockerfile RUN instruction below will install azcopy in your container).
RUN /bin/bash -c 'wget https://azcopyvnext.azureedge.net/release20220511/azcopy_linux_amd64_10.15.0.tar.gz && \
tar -xvzf azcopy_linux_amd64_10.15.0.tar.gz && \
cp ./azcopy_linux_amd64_*/azcopy /usr/bin/ && \
chmod 775 /usr/bin/azcopy && \
rm azcopy_linux_amd64_10.15.0.tar.gz && \
rm -rf azcopy_linux_amd64_*'
Check out this SO question for more on azcopy installation.
When we need to download a file,
we simply use azcopy to copy the file from within the pod to Azure Blob Storage. This can be done either programmatically or manually.
Then we download the file to the local machine with Azure Storage Explorer. Or some job/script can pick up the file from the blob container.
A similar thing is done for upload as well. The file is first placed in the blob storage container, either manually using Storage Explorer or programmatically. Then, from within the pod, azcopy can pull the file from blob storage and place it inside the pod.
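As a rough sketch, the two azcopy legs look like this (the storage account, container, file names, and SAS token are all placeholders):
# inside the pod: push the file to blob storage
azcopy copy "/data/report.csv" "https://<account>.blob.core.windows.net/<container>/report.csv?<SAS-token>"
# from the local machine: pull it back down
azcopy copy "https://<account>.blob.core.windows.net/<container>/report.csv?<SAS-token>" "./report.csv"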
The same can be done with AWS (S3) or GCP or using any other cloud provider.
Probably even SCP, SFTP, or rsync could be used.

Can we see transfer progress with kubectl cp?

Is it possible to know the progress of file transfer with kubectl cp for Google Cloud?
No, this doesn't appear to be possible.
kubectl cp appears to be implemented by doing the equivalent of
kubectl exec podname -c containername \
tar cf - /whatever/path \
| tar xf -
This means two things:
tar(1) doesn't print any useful progress information. (You could in principle add a v flag to print out each file name as it goes by to stderr, but that won't tell you how many files in total there are or how large they are.) So kubectl cp as implemented doesn't have any way to get this out.
There's not a richer native Kubernetes API to copy files.
If moving files in and out of containers is a key use case for you, it will probably be easier to build, test, and run by adding a simple HTTP service. You can then rely on things like the HTTP Content-Length: header for progress metering.
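As a rough sketch of that idea, assuming Python is available in the container (the pod name, port, and file names are placeholders):
# inside the pod: serve the directory over HTTP
kubectl exec podname -c containername -- python3 -m http.server 8000 --directory /whatever/path &
# locally: tunnel the port, then download with curl, which shows progress
kubectl port-forward podname 8000:8000 &
curl --progress-bar -O http://localhost:8000/somefile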
One option is to use pv which will show time elapsed, data transferred and throughput (eg MB/s):
$ kubectl exec podname -c containername -- tar cf - /whatever/path | pv | tar xf -
14.1MB 0:00:10 [1.55MB/s] [ <=> ]
If you know the expected transfer size ahead of time you can also pass this to pv and it will then calculate a % progress and also an ETA, eg for a 100m transfer:
$ kubectl exec podname -c containername -- tar cf - /whatever/path | pv -s 100m | tar xf -
13.4MB 0:00:09 [1.91MB/s] [==> ] 13% ETA 0:00:58
You obviously need to have pv installed (locally) for any of the above to work.
It's not possible, but you can find here how to implement rsync with Kubernetes; rsync shows you the progress of the file transfer.
rsync files to a kubernetes pod
I figured out a hacky way to do this. If you have bash access to the container you're copying to, you can run something like wc -c <file> on the remote side, then compare that to the size locally. du -h <file> is another option, which gives human-readable output, so it may be better.
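For example (a sketch with placeholder paths):
# byte count on the remote side
kubectl exec podname -- wc -c /remote/path/file
# byte count locally, for comparison
wc -c /local/path/file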
On macOS, there is still the hacky way of opening Activity Monitor on the "Network" tab. If you are copying with kubectl cp from your local machine to a remote pod, the total transfer is shown in the "Sent Bytes" column.
It's not super precise, but it sort of does the job without installing anything new.
I know it doesn't show active progress for each file, but it does output a status including a byte count for each completed file, which for multiple files run via scripts is almost as good as active progress:
kubectl cp local.file container:/path/on/container --v=4
Note that --v=4 enables verbose mode and will give you output. I found that kubectl cp shows this output from v=3 through v=5.