How to analyze disk usage of a Docker container - mongodb

I can see that Docker takes 12GB of my filesystem:
2.7G /var/lib/docker/vfs/dir
2.7G /var/lib/docker/vfs
2.8G /var/lib/docker/devicemapper/mnt
6.3G /var/lib/docker/devicemapper/devicemapper
9.1G /var/lib/docker/devicemapper
12G /var/lib/docker
But, how do I know how this is distributed over the containers?
I tried to attach to the containers by running (the new v1.3 command)
docker exec -it <container_name> bash
and then running 'df -h' to analyze the disk usage. It seems to be working, but not with containers that use 'volumes-from'.
For example, I use a data-only container for MongoDB, called 'mongo-data'.
When I run docker run -it --volumes-from mongo-data busybox and then df -h inside the container, it says that the filesystem mounted on /data/db (my 'mongo-data' data-only container) uses 11.3G, but when I run du -h /data/db, it says that it uses only 2.1G.
So, how do I analyze a container/volume disk usage? Or, in my case, how do I find out the 'mongo-data' container size?

To see the file size of your containers, you can use the --size argument of docker ps:
docker ps --size
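If you only want the name and size columns, docker ps also accepts a format template; a minimal sketch using the standard placeholders:
docker ps --size --format "table {{.Names}}\t{{.Size}}"
The SIZE column shows the writable layer, with the image's virtual size in parentheses.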

Since version 1.13.0, Docker includes a new command, docker system df, to show Docker disk usage.
$ docker system df
TYPE            TOTAL   ACTIVE  SIZE        RECLAIMABLE
Images          5       1       2.777 GB    2.647 GB (95%)
Containers      1       1       0 B         0B
Local Volumes   4       1       3.207 GB    2.261 (70%)
To show more detailed information on space usage:
$ docker system df --verbose
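If the RECLAIMABLE column shows a lot of unused data, the usual follow-up (not part of this answer, and destructive, so review what it will delete first) is to prune:
docker system prune            # removes stopped containers, dangling images, unused networks
docker system prune --volumes  # additionally removes unused volumes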

Posting this as an answer because my comments above got hidden:
List the size of a container:
du -d 2 -h /var/lib/docker/devicemapper | grep `docker inspect -f "{{.Id}}" <container_name>`
List the sizes of a container's volumes:
docker inspect -f "{{.Volumes}}" <container_name> | sed 's/map\[//' | sed 's/]//' | tr ' ' '\n' | sed 's/.*://' | xargs sudo du -d 1 -h
Edit:
List all running containers' sizes and volumes:
for d in `docker ps -q`; do
d_name=`docker inspect -f {{.Name}} $d`
echo "========================================================="
echo "$d_name ($d) container size:"
sudo du -d 2 -h /var/lib/docker/devicemapper | grep `docker inspect -f "{{.Id}}" $d`
echo "$d_name ($d) volumes:"
docker inspect -f "{{.Volumes}}" $d | sed 's/map\[//' | sed 's/]//' | tr ' ' '\n' | sed 's/.*://' | xargs sudo du -d 1 -h
done
NOTE: Change 'devicemapper' according to your Docker storage driver (e.g. 'aufs').
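If you are not sure which storage driver your daemon uses, it can be read from docker info, for example:
docker info --format '{{.Driver}}'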

The volume part did not work anymore, so if anyone is interested, I just changed the above script a little bit:
for d in `docker ps | awk '{print $1}' | tail -n +2`; do
d_name=`docker inspect -f {{.Name}} $d`
echo "========================================================="
echo "$d_name ($d) container size:"
sudo du -d 2 -h /var/lib/docker/aufs | grep `docker inspect -f "{{.Id}}" $d`
echo "$d_name ($d) volumes:"
for mount in `docker inspect -f "{{range .Mounts}} {{.Source}}:{{.Destination}}
{{end}}" $d`; do
size=`echo $mount | cut -d':' -f1 | sudo xargs du -d 0 -h`
mnt=`echo $mount | cut -d':' -f2`
echo "$size mounted on $mnt"
done
done

I use docker stats $(docker ps --format={{.Names}}) --no-stream to get:
- CPU usage
- Mem usage / total mem allocated to the container (can be allocated with the docker run command)
- Mem %
- Block I/O
- Net I/O
A narrowed-down format is sketched below.
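If you only need a few of those columns, docker stats also accepts a format template; keep in mind it reports memory and I/O, not filesystem disk usage:
docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.BlockIO}}\t{{.NetIO}}"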

Improving Maxime's answer:
docker ps --size
You'll see something like this:
+---------------+---------------+--------------------+
| CONTAINER ID | IMAGE | SIZE |
+===============+===============+====================+
| 6ca0cef8db8d | nginx | 2B (virtual 183MB) |
| 3ab1a4d8dc5a | nginx | 5B (virtual 183MB) |
+---------------+---------------+--------------------+
When starting a container, the image that the container is started from is mounted read-only (virtual).
On top of that, a writable layer is mounted, in which any changes made to the container are written.
So the Virtual size (183MB in the example) is used only once, regardless of how many containers are started from the same image - I can start 1 container or a thousand; no extra disk space is used.
The "Size" (2B in the example) is unique per container though, so the total space used on disk is:
183MB + 5B + 2B
Be aware that the size shown does not include all disk space used for a container.
Things that are not currently included are:
- volumes
- swapping
- checkpoints
- disk space used for log-files generated by container
https://github.com/docker/docker.github.io/issues/1520#issuecomment-305179362
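For the last item in that list, the container log files, their size can be checked directly on the host; this assumes the default json-file logging driver:
sudo du -sh $(docker inspect -f '{{.LogPath}}' <container_name>)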

(this answer is not useful, but leaving it here since some of the comments may be)
docker images will show the 'virtual size', i.e. the total including all the lower layers. So there is some double-counting if you have containers that share the same base image.

You can use
docker history IMAGE_ID
to see how the image size is distributed between its various sub-components.
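If the default output is truncated and hard to read, docker history also supports a format template, for example:
docker history --no-trunc --format "table {{.CreatedBy}}\t{{.Size}}" IMAGE_ID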

Keep in mind that docker ps --size may be an expensive command, taking more than a few minutes to complete. The same applies to container list API requests with size=1. It's better not to run it too often.
Take a look at alternatives we compiled, including the du -hs option for the docker persistent volume directory.
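A rough sketch of that du -hs approach, assuming the default data root /var/lib/docker (adjust the path to your setup):
sudo du -hs /var/lib/docker/volumes
sudo du -sh /var/lib/docker/volumes/*   # per named volume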

Alternative to docker ps --size
As "docker ps --size" produces heavy IO load on host, it is not feasable running such command every minute in a production environment. Therefore we have to do a workaround in order to get desired container size or to be more precise, the size of the RW-Layer with a low impact to systems perfomance.
This approach gathers the "device name" of every container and then checks size of it using "df" command. Those "device names" are thin provisioned volumes that a mounted to / on each container. One problem still persists as this observed size also implies all the readonly-layers of underlying image. In order to address this we can simple check size of used container image and substract it from size of a device/thin_volume.
One should note that every image layer is realized as a kind of a lvm snapshot when using device mapper. Unfortunately I wasn't able to get my rhel system to print out those snapshots/layers. Otherwise we could simply collect sizes of "latest" snapshots. Would be great if someone could make things clear. However...
After some tests, it seems that creating a container always adds an overhead of approx. 40 MiB (tested with containers based on the image "httpd:2.4.46-alpine"):
docker run -d --name apache httpd:2.4.46-alpine   # now get the device name from docker inspect and look it up using df
df -T -> 90 MB, whereas "Virtual Size" from "docker ps --size" states 50 MB and a very small payload of 2 bytes -> mysterious overhead of 40 MB
curl/download of a 100 MB file within the container
df -T -> 190 MB, whereas "Virtual Size" from "docker ps --size" states 150 MB and a payload of 100 MB -> overhead of 40 MB
The following shell script prints results (in bytes) that match the results from "docker ps --size" (but keep in mind the mentioned overhead of 40 MB):
for c in $(docker ps -q); do
  # container name without the leading slash
  container_name=$(docker inspect -f "{{.Name}}" ${c} | sed 's/^\///g')
  # device mapper thin device backing the container's root filesystem
  device_n=$(docker inspect -f "{{.GraphDriver.Data.DeviceName}}" ${c} | sed 's/.*-//g')
  # space used on that device according to df (KiB), converted to bytes
  device_size_kib=$(df -T | grep ${device_n} | awk '{print $4}')
  device_size_byte=$((1024 * ${device_size_kib}))
  # size of the underlying image, so its read-only layers can be subtracted
  image_sha=$(docker inspect -f "{{.Image}}" ${c} | sed 's/.*://g')
  image_size_byte=$(docker image inspect -f "{{.Size}}" ${image_sha})
  container_size_byte=$((${device_size_byte} - ${image_size_byte}))

  echo my_node_dm_device_size_bytes\{cname=\"${container_name}\"\} ${device_size_byte}
  echo my_node_dm_container_size_bytes\{cname=\"${container_name}\"\} ${container_size_byte}
  echo my_node_dm_image_size_bytes\{cname=\"${container_name}\"\} ${image_size_byte}
done
Further reading about device mapper: https://test-dockerrr.readthedocs.io/en/latest/userguide/storagedriver/device-mapper-driver/

The docker system df command displays information regarding the amount of disk space used by the docker daemon.
docker system df -v

Related

Can we see transfer progress with kubectl cp?

Is it possible to know the progress of file transfer with kubectl cp for Google Cloud?
No, this doesn't appear to be possible.
kubectl cp appears to be implemented by doing the equivalent of
kubectl exec podname -c containername \
tar cf - /whatever/path \
| tar xf -
This means two things:
- tar(1) doesn't print any useful progress information. (You could in principle add a v flag to print each file name to stderr as it goes by, but that won't tell you how many files there are in total or how large they are.) So kubectl cp as implemented doesn't have any way to get this out.
- There's no richer native Kubernetes API for copying files.
If moving files in and out of containers is a key use case for you, it will probably be easier to build, test, and run by adding a simple HTTP service. You can then rely on things like the HTTP Content-Length: header for progress metering.
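As an illustration of that idea, here is a hedged sketch: it assumes the pod already runs some HTTP server (e.g. nginx) that serves the file, and the pod name, port, and path are placeholders. curl can then show progress because it knows the Content-Length:
kubectl port-forward podname 8080:80 &
curl --progress-bar -o backup.tar http://localhost:8080/whatever/path/backup.tar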
One option is to use pv, which will show time elapsed, data transferred and throughput (e.g. MB/s):
$ kubectl exec podname -c containername -- tar cf - /whatever/path | pv | tar xf -
14.1MB 0:00:10 [1.55MB/s] [ <=> ]
If you know the expected transfer size ahead of time, you can also pass this to pv and it will then calculate a % progress and an ETA, e.g. for a 100m transfer:
$ kubectl exec podname -c containername -- tar cf - /whatever/path | pv -s 100m | tar xf -
13.4MB 0:00:09 [1.91MB/s] [==> ] 13% ETA 0:00:58
You obviously need to have pv installed (locally) for any of the above to work.
It's not possible, but you can find here how to implement rsync with Kubernetes; rsync shows you the progress of the file transfer:
rsync files to a kubernetes pod
I figured out a hacky way to do this. If you have bash access to the container you're copying to, you can run something like wc -c <file> on the remote side, then compare that to the size locally. du -h <file> is another option, which gives human-readable output, so it may be better.
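A minimal sketch of that comparison (pod, container and file names are placeholders):
kubectl exec podname -c containername -- wc -c /remote/path/file
wc -c local/path/file
Re-run the first command until the two byte counts match.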
On macOS, there is still the hacky way of opening the "Activity Monitor" on the "Network" tab. If you are copying with kubectl cp from your local machine to a distant pod, then the total transfer is shown in the "Sent Bytes" column.
Not of super high precision, but it sort of does the job without installing anything new.
I know it doesn't show active progress for each file, but it does output a status including a byte count for each completed file, which, for multiple files run via scripts, is almost as good as active progress:
kubectl cp local.file container:/path/on/container --v=4
Note that --v=4 enables verbose mode and will give you output. I found that kubectl cp shows output from v=3 through v=5.

Docker + Crontab: find container ID from service name for use in crontab

The context is that I'm trying to set up a cron job to back up a database within a postgres docker container. The crontab line I'm using is:
45 1 * * * docker exec e2fa9f0adbe0 pg_dump -Z 9 -U pguser -d pgdb | curl -u ftpuser:ftppwd ftp.mydomain.com/my-db-backups/db-backup_`date '+\%Y-\%m-\%d_\%H-\%M-\%S'`.sql.gz --ftp-create-dirs -T -
It works fine. But I'm trying to refine it because at present the container ID e2fa9f0adbe0 is hard-coded into the crontab, so it will break whenever the service under which the container is placed is restarted and the container re-appears under a new ID. On the other hand, the service name will always be the same.
So is there a way of altering the above cron command to extract the container ID from the service name (let's say my-postgres-service)?
Well I tried to edit Olmpc's answer to make it more complete (and so that I can mark it as Accepted), but my edit was rejected (thanks). So I'll post my own answer:
To answer the actual question, the cron command can be altered as follows so that it is based on the service name (which is fixed) rather than the container ID (which is subject to change):
45 1 * * * docker exec `docker ps -qf name=my-postgres-service` pg_dump -Z 9 -U pguser -d pgdb | curl -u ftpuser:ftppwd ftp.mydomain.com/my-db-backups/db-backup_`date '+\%Y-\%m-\%d_\%H-\%M-\%S'`.sql.gz --ftp-create-dirs -T -
This is working nicely. Note: it relies on there being only one container associated with the service, so that only a single container ID is returned by docker ps -qf name=my-postgres-service.
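One refinement, in case other containers have my-postgres-service as a substring of their name: the name filter is a regex, so anchoring it should keep the match exact (hedged; verify against your Docker version):
docker ps -qf "name=^my-postgres-service$"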
You can use the following command to get the id from the name:
docker ps -aqf "name=my-postgres-service"
And the following to have more details about the options used above:
docker ps -h
So, the full crontab line would be:
45 1 * * * docker exec `docker ps -aqf name=my-postgres-service` pg_dump -Z 9 -U pguser -d pgdb | curl -u ftpuser:ftppwd ftp.mydomain.com/my-db-backups/db-backup_`date '+\%Y-\%m-\%d_\%H-\%M-\%S'`.sql.gz --ftp-create-dirs -T -

MongoDB Docker container: ERROR: Cannot write pid file to /tmp/tmp.aLmNg7ilAm: No space left on device

I started a MongoDB container like so:
docker run -d -p 27017:27017 --net=cdt-net --name cdt-mongo mongo
I saw that my MongoDB container exited:
0e35cf68a29c mongo "docker-entrypoint.s…" Less than a second ago Exited (1) 3 seconds ago cdt-mongo
I checked my Docker logs, I see:
$ docker logs 0e35cf68a29c
about to fork child process, waiting until server is ready for connections.
forked process: 21
2018-01-12T23:42:03.413+0000 I CONTROL [main] ***** SERVER RESTARTED *****
2018-01-12T23:42:03.417+0000 I CONTROL [main] ERROR: Cannot write pid file to /tmp/tmp.aLmNg7ilAm: No space left on device
ERROR: child process failed, exited with error number 1
Does anyone know what this error is about? Not enough space in the container?
I had to delete old Docker images to free up space; here are the commands I used:
# remove all unused / orphaned images
echo -e "Removing unused images..."
docker rmi -f $(docker images --no-trunc | grep "<none>" | awk "{print \$3}") 2>&1 | cat;
echo -e "Done removing unused images"
# clean up stuff -> using these instructions https://lebkowski.name/docker-volumes/
echo -e "Cleaning up old containers..."
docker ps --filter status=dead --filter status=exited -aq | xargs docker rm -v 2>&1 | cat;
echo -e "Cleaning up old volumes..."
docker volume ls -qf dangling=true | xargs docker volume rm 2>&1 | cat;
We've experienced this problem recently while using docker-compose with mongo and a bunch of other services. There are two fixes which have worked for us.
Clear down unused stuff
# close down all services
docker-compose down
# clear unused docker images
docker system prune
# press y
Increase the disk image size available to Docker - this will depend on your installation of Docker. On Mac, for example, it defaults to 64 GB and we doubled it to 128 GB via the UI.
We've had this problem in both Windows and Mac and the above fixed it.

ZFS mount dataset for zone

I shut down my non-global zone and unmounted its ZFS zonepath.
The command used to unmount:
zfs unmount -f zones-pool/one-zone
details:
zfs list | grep one
zones-pool/one-zone 15,2G 9,82G 32K /zones-fs/one-zone
zones-pool/one/rpool/ROOT/solaris 15,2G 9,82G 7,83G /zones-fs/one/root
In the above, you can see that there is occupied space: 9.82G of 15.2G.
more details:
# zfs get mountpoint zones-pool/one-zone
NAME PROPERTY VALUE SOURCE
zones-pool/one-zone mountpoint /zones-fs/one-zone local
# zfs get mounted zones-pool/one-zone
NAME PROPERTY VALUE SOURCE
zones-pool/one-zone mounted no -
But if I mount the ZFS mountpoint, I cannot see the content.
step 1 mount:
zfs mount zones-pool/one-zone
step 2 see mount with df -h:
df -h | grep one
zones-pool/one-zone/rpool/ROOT/solaris 25G 32K 9,8G 1% /zones-fs/one-zone/root
zones-pool/one-zone 25G 32K 9,8G 1% /zones-fs/one-zone
step 3 list content:
ls -l /zones-fs/one-zone/root
total 0
Why? Also, in step 2, you can see that df -h prints 1% used, which I do not understand.
To view the contents of a zoned dataset, you need to start the zone or mount the dataset directly.
The zone's files (its root filesystem) are located in the dataset
zones-pool/one-zone/rpool/ROOT/solaris
To mount it, you need to change its "zoned" option to off and set its "mountpoint" option to the path where you want it mounted.
This may be done via
zfs set zoned=off zones-pool/one-zone/rpool/ROOT/solaris
zfs set mountpoint=/zones-pool/one-zone-root-fs zones-pool/one-zone/rpool/ROOT/solaris
Space in a dataset may be occupied by snapshots and clones; you can check for them with these commands:
zfs list -r -t snap zones-pool
zfs get -H -r -o value,name origin zones-pool | grep -v '^-'
The first command displays all snapshots; the second displays datasets that depend on a snapshot (i.e. their origin property is not "-").
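If one of the listed snapshots is no longer needed, destroying it frees the space it holds (the snapshot name below is only a placeholder; make sure no clone depends on it first):
zfs destroy zones-pool/one-zone@old-snapshot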

How to copy docker volume from one machine to another?

I have created a docker volume for postgres on my local machine.
docker volume create postgres-data
Then I used this volume to run a Docker container:
docker run -it -v postgres-data:/var/lib/postgresql/9.6/main postgres
After that, I did some database operations which got stored automatically in postgres-data. Now I want to copy that volume from my local machine to another remote machine. How do I do that?
Note: the database size is very large.
If the second machine has SSH enabled you can use an Alpine container on the first machine to map the volume, bundle it up and send it to the second machine.
That would look like this:
docker run --rm -v <SOURCE_DATA_VOLUME_NAME>:/from alpine ash -c \
"cd /from ; tar -cf - . " | \
ssh <TARGET_HOST> \
'docker run --rm -i -v <TARGET_DATA_VOLUME_NAME>:/to alpine ash -c "cd /to ; tar -xpvf - "'
You will need to change:
SOURCE_DATA_VOLUME_NAME
TARGET_HOST
TARGET_DATA_VOLUME_NAME
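Filled in for the postgres-data volume from the question (the target host is just an illustrative placeholder; the target volume keeps the same name), it might look like this:
docker run --rm -v postgres-data:/from alpine ash -c \
"cd /from ; tar -cf - . " | \
ssh user@target-host \
'docker run --rm -i -v postgres-data:/to alpine ash -c "cd /to ; tar -xpvf - "'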
Or, you could try using this helper script https://github.com/gdiepen/docker-convenience-scripts
Hope this helps.
I had the exact same problem, but in my case both volumes were in separate VPCs and I couldn't expose SSH to the outside world. I ended up creating dvsync, which uses ngrok to create a tunnel between them and then uses rsync over SSH to copy the data. In your case, you could start the dvsync-server on your machine:
$ docker run --rm -e NGROK_AUTHTOKEN="$NGROK_AUTHTOKEN" \
--mount source=postgres-data,target=/data,readonly \
quay.io/suda/dvsync-server
and then start the dvsync-client on the target machine:
docker run -e DVSYNC_TOKEN="$DVSYNC_TOKEN" \
--mount source=MY_TARGET_VOLUME,target=/data \
quay.io/suda/dvsync-client
The NGROK_AUTHTOKEN can be found in the ngrok dashboard, and the DVSYNC_TOKEN is shown by the dvsync-server in its stdout.
Once the synchronization is done, the dvsync-client container will stop.