How to find out how much disk space each Solaris zone is occupying?

If I use the df command, I can only see how much disk space is being used on the Solaris server as a whole. But I want to know how much disk space a particular Solaris zone is occupying.

The problem I found with these solutions is that they do not take directory inheritance into account. Yes, you will find out how much "space" is under a certain directory. But if you want to find out how much extra space a zone is actually taking, you have to go a different route.
Do a
zonecfg -z zonename info
where zonename is the name of the zone, and look at each inherit-pkg-dir line:
inherit-pkg-dir:
    dir: /lib
inherit-pkg-dir:
    dir: /sbin
Any directory listed there is loopback-mounted from the global zone, so you will be double counting against the global zone if you simply do a
du -sh /zonepath/zonename
Basically you have to count only the directories (excluding /proc, and maybe /tmp) that aren't listed in any inherit-pkg-dir line:
cd /zonepath/zonename/root
du -sh /bin /dev /etc /home /kernel /opt /system /var ...
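If you want to script that, here is a rough sketch (the zone name is hypothetical, it assumes the zonecfg output format shown above, and the set of pseudo-filesystems to skip may differ on your system):

#!/bin/ksh
# Sum a zone's disk usage, skipping directories inherited from the global zone.
zone=myZone1                                 # hypothetical zone name
zroot=/zonepath/$zone/root                   # adjust to your zonepath
inherited=$(zonecfg -z $zone info inherit-pkg-dir | awk '$1 == "dir:" {print $2}')
cd "$zroot" || exit 1
for d in */
do
    d="/${d%/}"
    echo "$inherited" | grep "^$d\$" >/dev/null && continue  # inherited: skip
    case "$d" in /proc|/tmp) continue ;; esac                # pseudo/temp fs: skip
    du -sk ".$d"
done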

Try the du command.

Yes - definitely su to root to do this.
I was able to run /usr/xpg4/bin/du -sk -h -x /zonepath/zonename to get the space used by the UFS root partition of a zone.
For example, /usr/xpg4/bin/du -sk -h -x /zonepath/zonename returned the following:
3.5G /zonepath/zonename
With -x, du evaluates only those files that are on the same device as the file specified by the file operand.
The -x option only seems to work when calling du via this path: /usr/xpg4/bin/du
This also worked to display the used space of the ZFS-attached drives in the zone! We mounted one ZFS LUN at /zonepath/zonename/data, and running this matched the output of "zfs list" for the data filesystem:
# /usr/xpg4/bin/du -sk -h -x /zonepath/zonename/data
11G /zonepath/zonename/data
If you run # /usr/xpg4/bin/du -sk -h -x /zonepath/zonename
then you should get an overall total of the used space in the zone, such as:
53G /zonepath/zonename
It will not include NFS-attached drives, nor will it include directories that root does not own.
I hope this helps!

Since I tried both John's solution and Pierre-Luc's solution, what works for me is:
List all the zones (from the global zone):
tcsh> zoneadm list -civ
ID NAME     STATUS   PATH                    BRAND   IP
 0 global   running  /                       native  shared
 1 myZone1  running  /export/zones/myZone1   native  shared
 2 myZone2  running  /export/zones/myZone2   native  shared
Run du -sk as root (since local zones are not readable from the global zone, I had to run du -sk on them as root):
tcsh> sudo du -sk /export/zones/myZone1
9930978 /export/zones/myZone1

According to Solaris Operating System: Managing ZFS in Solaris 10 Containers, the following command will give you the information you require:
zfs list

If you installed the zone on a ZFS dataset then you can use the ZFS tools ("zfs list") to quickly see how much space has been used.
Otherwise you'll have to use "du", as you already discovered (which will be much slower).
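For example, with a zone installed on its own dataset (a hypothetical layout; the names and numbers are illustrative only):

# zfs list -r rpool/zones
NAME                  USED  AVAIL  REFER  MOUNTPOINT
rpool/zones          12.4G  50.2G    32K  /export/zones
rpool/zones/myZone1  12.4G  50.2G  12.4G  /export/zones/myZone1

The USED column for the zone's dataset is the zone's total consumption, including any snapshots.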


gsutil cp using file size

I am trying to copy files from a directory on my Google Compute Engine instance to a Google Cloud Storage bucket. I have it working, however there are ~35k files but only ~5k have any data in them.
Is there any way to only copy files above a certain size?
I've not tried this but...
You should be able to do this using a resumable transfer and setting the threshold to 5k (it defaults to 8 MiB). See: https://cloud.google.com/storage/docs/gsutil/commands/cp#resumable-transfers
It may be advisable to set BOTO_CONFIG specifically for this copy, (a) to be intentional and (b) to remind yourself how it works. See: https://cloud.google.com/storage/docs/boto-gsutil
Resumable uploads have the added benefit, of course, of resuming if there are any failures.
Recommendation: try this on a small subset first and confirm it works to your satisfaction.
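A minimal sketch of what that could look like (the threshold value, paths, and bucket name are assumptions, not tested settings):

# one-off boto config that lowers the resumable threshold to ~5 KB
cat > /tmp/boto_5k.cfg <<'EOF'
[GSUtil]
resumable_threshold = 5120
EOF
BOTO_CONFIG=/tmp/boto_5k.cfg gsutil -m cp -r /my/dir gs://my-bucket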
While it's not possible to do this with gsutil alone, you can do it by filtering the file names and using the -I flag on the cp command to process them. If you're using a Linux Compute Engine instance you can do it with the du and awk commands:
du -b * | awk '{if ($1 > 1000) print $2}' | gsutil -m cp -I gs://bucket2
The command gets the size in bytes of each file in the current directory with du -b * (plain du reports 1 KiB blocks, hence the -b), keeps only the names of files larger than 1000 bytes, and copies those to bucket2; change that value to adjust the threshold to your needs.
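If any of your file names contain spaces, the awk pipeline above will mangle them; a find-based variant is more robust (a sketch: -size +1000c means "larger than 1000 bytes", and gsutil cp -I reads one path per line from stdin):

find . -maxdepth 1 -type f -size +1000c | gsutil -m cp -I gs://bucket2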

How to reduce par folder size in /tmp?

I have a Perl project packaged with Par::Packer and distributed via RPM on Red Hat.
I found that a par-<6 hex chars> directory is created for each user who runs my command.
Under /tmp:
lirian 19M par-6c3973
andy 19M par-6d6a7a
raymond 19M par-726679
john 19M par-736a69
Can I reduce the par directory size, or use one shared par folder among all users?
More: I've read the instructions for pp and PAR::FAQ#RPM.
I use pp -d to package the par file.
I think I found a not-so-perfect but good-enough answer.
Now I use "pp -d -T lirian-tool" to package my Perl script.
According to the pp option page:
-T, --tempcache
Set the program unique part of the cache directory name that is used if the program is run without -C. If not set, a hash of the executable is used.
When the program is run, its contents are extracted to a temporary directory. On Unix systems, this is commonly /tmp/par-USER/cache-XXXXXXX. USER is replaced by the name of the user running the program, but "spelled" in hex. XXXXXXX is either a hash of the executable or the value passed to the -T or --tempcache switch.
So now we have /tmp/par-USER/cache-lirian-tool. Each user's cache directory may take 20M, which is acceptable.
And you can have the program clean up its cache by packaging it with "pp -C":
-C, --clean
Clean up temporary files extracted from the application at runtime. By default, these files are cached in the temporary directory; this allows the program to start up faster next time.
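Putting it together, the packaging step might look like this (a sketch; the script and output names are made up):

# -T fixes the cache directory name so every run reuses the same per-user cache
pp -d -T lirian-tool -o lirian-tool lirian-tool.pl
./lirian-tool
ls -d /tmp/par-*/cache-lirian-tool   # one ~20M cache dir per user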

Compare file sizes and download if they're different via wget

I'm downloading some .mp3 files (all legal) via wget:
wget -r -nc files.myserver.com
I have to stop the download sometimes, and at those times the file is left partially downloaded. For example, a 10-minute record.mp3 becomes a 4-minute record.mp3. It plays correctly but is incomplete.
If I use the same command again, wget skips record.mp3 because it already exists on my local computer, even though it isn't complete.
I wonder if there is a way to check the file sizes and, if the sizes on the remote server and on my local computer aren't the same, re-download the file. (I've learned that --spider gives the file size, but is there a way to automatically compare sizes and decide whether to download?)
I would go with wget's -N option for timestamping, but note that wget will only compare the file sizes if you also specify the --no-if-modified-since option. Without it, incomplete files are indeed skipped on the next run because they receive a timestamp of the current time, which is newer than that on the server.
The reason is probably that with only -N, a GET request is sent for the file with the If-Modified-Since field set. The server responds with either 200 or 304, but the 304 doesn't contain the file size so wget can't check it.
With --no-if-modified-since wget sends a HEAD request instead to get the timestamp and file size, and checks both.
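Applied to the command from the question, that would be something like the following (note that -nc conflicts with -N, so it has to be dropped):

wget -r -N --no-if-modified-since files.myserver.com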
What I use for recursive download of a folder:
wget -T 300 -nv -t 1 -r -nd -np -l 1 -N --no-if-modified-since -P $my_folder $my_url
With:
-T 300: Set the network timeout to 300 seconds
-nv: Turn off verbose without being completely quiet
-t 1: Set number of tries to 1
-r: Turn on recursive retrieving
-nd: Do not create a hierarchy of directories when retrieving recursively
-np: Do not ever ascend to the parent directory when retrieving recursively
-l 1: Specify recursion maximum depth 1
-N: Turn on time-stamping
--no-if-modified-since: Do not send If-Modified-Since header in '-N' mode, send preliminary HEAD request instead
You may try the -c option to continue the download of partially downloaded files; however, the manual gives an explicit warning:
You need to be especially careful of this when using -c in conjunction with -r, since every file will be considered as an "incomplete download" candidate.
While there is no perfect solution to this problem, you could try the -N option to turn on timestamping. This might prevent errors when the file has changed on the server, but only if the server supports timestamping and partial downloads. Try it and see how it goes.
wget -r -N -c files.myserver.com
If you need to check whether a file was partially downloaded (has a different size) or was updated on the remote server (by timestamp), and in either case update it locally, use the -N option.
Here is some additional info about the -N (--timestamping) option from the Wget docs:
If the local file does not exist, or the sizes of the files do not match, Wget will download the remote file no matter what the time-stamps say.
From: https://www.gnu.org/software/wget/manual/wget.html (Chapter 5: Time-Stamping)

Why does grep hang when run against the / directory?

My question is in two parts :
1) Why does grep hang when I grep all files under "/"?
For example:
grep -r 'h' ./
(Note: right before the hang/crash, I see some "no such device or address" messages regarding sockets.)
Of course, I know that grep shouldn't be run against a socket, but I would think that since sockets are just files in Unix, it should return a negative result rather than crashing.
2) Now, my follow-up question: in any case, how can I grep the whole filesystem? Are there certain *NIX directories which we should leave out when doing this? In particular, I'm looking for all recently written log files.
As @ninjalj said, if you don't use -D skip, grep will try to read all your device files, socket files, and FIFO files. In particular, on a Linux system (and many Unix systems), it will try to read /dev/zero, which appears to be infinitely long.
You'll be waiting for a while.
If you're looking for a system log, starting from /var/log is probably the best approach.
If you're looking for something that really could be anywhere in your file system, you can do something like this:
find / -xdev -type f -print0 | xargs -0 grep -H pattern
The -xdev argument to find tells it to stay within a single filesystem; this will avoid /proc and /dev (as well as any mounted filesystems). -type f limits the search to ordinary files. -print0 prints the file names separated by null characters rather than newlines; this avoids problems with files having spaces or other funny characters in their names.
xargs reads a list of file names (or anything else) on its standard input and invokes the specified command on everything in the list. The -0 option works with find's -print0.
The -H option to grep tells it to prefix each match with the file name. By default, grep does this only if there are two or more file names on its command line. Since xargs splits its arguments into batches, it's possible that the last batch will have just one file, which would give you inconsistent results.
Consider using find ... -name '*.log' to limit the search to files with names ending in .log (assuming your log files have such names), and/or using grep -I ... to skip binary files.
Note that all this depends on GNU-specific features. Some of these options might not be available on macOS (whose userland is BSD-based) or on other Unix systems. Consult your local documentation, and consider installing GNU findutils (for find and xargs) and/or GNU grep.
Before trying any of this, use df to see just how big your root filesystem is. Mine is currently 268 gigabytes; searching all of it would probably take several hours. A few minutes spent (a) restricting the files you search and (b) making sure the command is correct will be well worth the time you spend.
By default, grep tries to read every file. Use -D skip to skip device files, socket files and FIFO files.
If you keep seeing error messages, then grep is not hanging. Keep iotop open in a second window to see how hard your system is working to pull all the contents off its storage media into main memory, piece by piece. An operation like this is bound to be slow unless you have a very barebones system.
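For example (a sketch; the extra --binary-files=without-match also silences matches from binary files, if you want that):

grep -r -D skip --binary-files=without-match 'h' /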
Now, my follow-up question: in any case, how can I grep the whole filesystem? Are there certain *NIX directories which we should leave out when doing this? In particular, I'm looking for all recently written log files.
Grepping the whole FS is very rarely a good idea. Try grepping the directory where the log files should have been written, most likely /var/log. Even better, if you know anything about the names of the files you're looking for (say, they have the extension .log), then use find or locate and grep just the files those programs report.
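For the "recently written log files" part in particular, something like this sketch (assuming GNU find; -mmin -60 means "modified within the last hour") narrows the search considerably:

find /var/log -type f -mmin -60 -exec grep -H 'pattern' {} +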

How to read a block in a storage pool (zpool) using dd?

I want to read a block in a ZFS storage pool (zpool) using the dd command. Since a zpool doesn't create a device file like other volume managers such as VxVM, I don't know which block device to use for reading. Is there any way to read data block by block in a zpool?
You can probably use the zdb command. Here is a PDF about it, and the help output:
http://www.bruningsystems.com/osdevcon_draft3.pdf
# zdb --help
zdb: illegal option -- -
Usage: zdb [-CumdibcsDvhL] poolname [object...]
       zdb [-div] dataset [object...]
       zdb -m [-L] poolname [vdev [metaslab...]]
       zdb -R poolname vdev:offset:size[:flags]
       zdb -S poolname
       zdb -l [-u] device
       zdb -C
    Dataset name must include at least one separator character '/' or '#'
    If dataset name is specified, only that dataset is dumped
    If object numbers are specified, only those objects are dumped
    Options to control amount of output:
        -u uberblock
        -d dataset(s)
        -i intent logs
        -C config (or cachefile if alone)
        -h pool history
        -b block statistics
        -m metaslabs
        -c checksum all metadata (twice for all data) blocks
        -s report stats on zdb's I/O
        -D dedup statistics
        -S simulate dedup to measure effect
        -v verbose (applies to all others)
        -l dump label contents
        -L disable leak tracking (do not load spacemaps)
        -R read and display block from a device
    Below options are intended for use with other options (except -l):
        -A ignore assertions (-A), enable panic recovery (-AA) or both (-AAA)
        -F attempt automatic rewind within safe range of transaction groups
        -U <cachefile_path> -- use alternate cachefile
        -X attempt extreme rewind (does not work with dataset)
        -e pool is exported/destroyed/has altroot/not in a cachefile
        -p <path> -- use one or more with -e to specify path to vdev dir
        -P print numbers parsable
        -t <txg> -- highest txg to use when searching for uberblocks
    Specify an option more than once (e.g. -bb) to make only that option verbose
    Default is to dump everything non-verbosely
Unfortunately, I don't know how to use it.
# zdb
tank:
    version: 28
    name: 'tank'
    ...
    vdev_tree:
        ...
        children[0]:
            ...
            children[0]:
                ...
                path: '/dev/label/bank1d1'
                phys_path: '/dev/label/bank1d1'
                ...
So I took the array indexes 0 0 to get my first disk (bank1d1) and ran this command. It did something, but I don't know how to read the output.
zdb -R tank 0:0:4e00:200 | strings
Have fun... try not to destroy anything. Here is your warning from the man page:
The zdb command is used by support engineers to diagnose failures and gather statistics. Since the ZFS file system is always consistent on disk and is self-repairing, zdb should only be run under the direction of a support engineer.
And please tell us what you actually were looking for. Was Alan right that you wanted to do backups?
You can read from the underlying raw devices in the pool, but as far as I can tell there's no concept of a single contiguous block device representing the whole pool.
A ZFS pool is not the single contiguous run of sectors that 'classic' volume managers present. ZFS's internal structure is closer to a tree, which would be somewhat challenging to represent as a flat array of blocks.
Ben Rockwood's blog post "zdb: Examining ZFS At Point-Blank Range" may help getting better idea of what's under the hood.
No idea what might be useful about doing so, but you certainly can read blocks from the underlying devices used by the pool. They are shown by the zpool status command. If you are really asking about zvols instead of zpools, those are accessible under /dev/zvol/rdsk/pool-name/zvol-name. If you want to look at internal zpool data, you probably want to use zdb.
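A rough sketch of what reading from an underlying device might look like (the pool and device names are hypothetical; on Solaris the raw devices live under /dev/rdsk):

zpool status tank                      # lists the pool's vdevs, e.g. c0t0d0
dd if=/dev/rdsk/c0t0d0s0 bs=512 skip=1000 count=8 | od -c | head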
If you want to backup ZFS filesystems you should be using the following tools:
'zfs snapshot' to create a stable snapshot of the filesystem
'zfs send' to send a copy of the snapshot to somewhere else
'zfs receive' to go back from a snapshot to a filesystem.
'dd' is almost certainly not the tool you should be using. In your case you could 'zfs send' and redirect the output into a file on your other filesystem.
See chapter 7 of the ZFS administration guide for more details.
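For example (a minimal sketch, assuming a pool named tank with a filesystem tank/home and a second filesystem mounted at /otherfs):

zfs snapshot tank/home@backup1                 # stable point-in-time snapshot
zfs send tank/home@backup1 > /otherfs/home-backup1.zfs
zfs receive tank/home_restored < /otherfs/home-backup1.zfs   # restore later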