File movement issue on NFS file system on Unix box - perl

Currently there are 4.5 million files in a single directory on an NFS file system. As a result, any read or write operation on that directory causes a huge delay.
To overcome this problem, all the files in that directory will be moved into different directories based on their year of creation.
Apparently, the find command that we are using with the -ctime option is not working because of the huge file volume.
We tried listing the files by year of creation and then feeding the list to a script that moves them in a for loop, but even this failed because ls -lrt hung.
Is there any other way to tackle this problem?
Please help.
Script contents:
1) filelist.sh
ls -tlr|awk '{print $8,$9,$6,$7}'|grep ^2011|awk '{print $2,$1,$3,$4}' 1>>inboundstore_$1.txt 2>>Error_$1.log
ls -tlr|awk '{print $8,$9,$6,$7}'|grep ^2011|wc -l 1>>count_$1.log
2) filemove.sh
INPUT_FILE=$1 ##text file which has the list of files from the previous script
FINAL_LOCATION=$2 ##destination directory
if [ -r $INPUT_FILE ]
then
for file in `cat $INPUT_FILE`
do
echo "TIME OF FILE COPY OF [$file] IS : `date`" >> xyz/IBSCopyTime.log
mv $file $FINAL_LOCATION
done
else
echo "$INPUT_FILE does not exist"
fi

Use the readdir iterator.
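In Perl, that means opening the directory with opendir and pulling entries one at a time from readdir in a while loop, instead of letting ls or a glob build the whole 4.5-million-entry list in memory first. Here is a minimal sketch of that idea, assuming the crowded directory is /data/inbound (a hypothetical path) and that the modification year is an acceptable stand-in for the creation year, which Unix filesystems do not record:
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Path qw(make_path);

my $src = '/data/inbound';    # hypothetical source directory

opendir(my $dh, $src) or die "Cannot open $src: $!";
# readdir in a while loop hands back one entry per iteration, so memory use
# stays flat no matter how many files the directory holds.
while (defined(my $entry = readdir $dh)) {
    next if $entry eq '.' || $entry eq '..';
    my $path = "$src/$entry";
    next unless -f $path;                      # skip the per-year subdirectories we create

    my $mtime = (stat _)[9];                   # reuse the stat buffer filled by -f
    my $year  = (localtime $mtime)[5] + 1900;  # e.g. 2011

    my $dest_dir = "$src/$year";
    make_path($dest_dir) unless -d $dest_dir;
    move($path, "$dest_dir/$entry") or warn "Could not move $path: $!";
}
closedir $dh;
Since the per-year directories sit on the same NFS mount, each move is a rename within the filesystem rather than a copy.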

Related

Logrotate files in multiple sub directories to backup location in same folder structure

I'm trying to use logrotate with very little experience. Currently I have the files rotating, compressing, and renaming into the same folder. Now, instead of dropping the files in the same place, I need to have them dropped in another location. They also need to keep the same folder structure, and if it isn't there then it needs to create the new folder. All the compressed files need to be added without overwriting the existing files.
I'm thinking that olddir will drop them into a destination folder, but I'm not sure how to have it drop them into the corresponding folder, or create it if it's not already there.
Example source
var/log/device1/*.log
var/log/device2/*.log
var/log/device3/*.log
Example Destination to drop .gz files into
opt/archive/device1/
opt/archive/device2/
(needs to create opt/archive/device3 and put rotated file in here)
I didn't end up finding a way to move the files with logrotate, but I came up with a script to do the same sort of thing. It's pretty simplistic and won't work for more than one level of subfolders.
#!/bin/bash
source="/opt/log/host"
destination="/opt/archive/"
for i in $(find "$source" -maxdepth 2 -type f -name "*.gz")
do
#removing /opt/log/host from string
dd="$( echo "$i" | sed -e 's#^/opt/log/host/##' )"
#removing everything after the first /
ff=$( echo "$dd" | cut -f1 -d"/" )
#setting the correct destination string
ee=$destination$dd
#create new folders if they do not exist
mkdir -p -- "$destination$ff"
#move files
mv "$i" "$ee"
done

Folders not showing up in Bucket storage

So my problem is that I have a few files not showing up in gcsfuse when the bucket is mounted. I see them in the online console and if I 'ls' with gsutil.
Also, if I manually create the folder in the bucket, I can then see the files inside it, but I need to create it first. Any suggestions?
gs://mybucket/
    dir1/
        ok.txt
    dir2/
        lafu.txt
If I mount mybucket with gcsfuse and do 'ls', it only returns dir1/ok.txt.
Then, if I manually create the folder dir2 at the root of the mount point, 'lafu.txt' suddenly shows up.
By default, gcsfuse won't show a directory "implicitly" defined by a file with a slash in its name. For example, if your bucket contains an object named dir/foo.txt, you won't be able to find it unless there is also an object named dir/.
You can work around this by setting the --implicit-dirs flag, but there are good reasons why this is not the default. See the documentation for more information.
Google Cloud Storage doesn't have folders. The various interfaces use different tricks to pretend that folders exist, but ultimately there's just an object whose name contains a bunch of slashes. For example, "pictures/january/0001.jpg" is the full name of a single object.
If you need to be sure that a "folder" exists, put an object inside it.
@Brandon Yarbrough suggests creating the needed directory entries in the GCS bucket. This avoids the performance penalty described by @jacobsa.
Here is a bash script for doing so:
# 1. Mount $BUCKET_NAME at $MOUNT_PT
# 2. Run this script
MOUNT_PT=${1:-$HOME/mnt}
BUCKET_NAME=$2
DEL_OUTFILE=${3:-y} # Set to y or n
echo "Reading objects in $BUCKET_NAME"
OUTFILE=dir_names.txt
gsutil ls -r gs://$BUCKET_NAME/** | while read BUCKET_OBJ
do
dirname "$BUCKET_OBJ"
done | sort -u > $OUTFILE
echo "Processing directories found"
cat $OUTFILE | while read DIR_NAME
do
LOCAL_DIR=`echo "$DIR_NAME" | sed "s=gs://$BUCKET_NAME/==" | sed "s=gs://$BUCKET_NAME=="`
#echo $LOCAL_DIR
TARG_DIR="$MOUNT_PT/$LOCAL_DIR"
if ! [ -d "$TARG_DIR" ]
then
echo "Creating $TARG_DIR"
mkdir -p "$TARG_DIR"
fi
done
if [ $DEL_OUTFILE = "y" ]
then
rm $OUTFILE
fi
echo "Process complete"
I wrote this script, and have shared it at https://github.com/mherzog01/util/blob/main/sh/mk_bucket_dirs.sh.
This script assumes that you have mounted a GCS bucket locally on a Linux (or similar) system. The script first specifies the GCS bucket and location where the bucket is mounted. It then identifies all "directories" in the GCS bucket which are not visible locally, and creates them.
This (for me) fixed the issue with folders (and associated objects) not showing up in the mounted folder structure.

Recursively replace colons with underscores in Linux

First of all, this is my first post here and I must specify that I'm a total Linux newb.
We have recently bought a QNAP NAS box for the office, on this box we have a large amount of data which was copied off an old Mac XServe machine. A lot of files and folders originally had forward slashes in the name (HFS+ should never have allowed this in the first place), which when copied to the NAS were all replaced with a colon.
I now want to rename all colons to underscores, and have found the following commands in another thread here: pitfalls in renaming files in bash
However, the flavour of Linux on this box does not understand the rename command, so I'm having to use mv instead. I have tried the code below, but it only works for the files in the current folder; is there a way I can change this to include all subfolders?
for f in *.*; do mv -- "$f" "${f//:/_}"; done
I have found that I can find all the files and folders in question using the find command, as follows:
Files:
find . -type f -name "*:*"
Folders:
find . -type d -name "*:*"
I have been able to export a list of the results above by using
find . -type f -name "*:*" > files.txt
I tried using the command below, but I'm getting an error message from find saying it doesn't understand the -exec switch. So is there a way to pipe this all into one command, or could I somehow use the files I exported previously?
find . -depth -name "*:*" -exec bash -c 'dir=${1%/*} base=${1##*/}; mv "$1" "$dir/${base//:/_}"' _ {} \;
Thank you!
Vincent
So your for loop works, but only in the current directory. Also, you are able to use find to build a file listing all the files with : in the filename.
So, as you've already done all this, I would just loop over each line of your file, and perform the same mv command.
Something like this:
for f in `cat files.txt`; do mv $f "${f//:/_}"; done
EDIT:
As pointed out by tripleee, using a while loop is a better solution
EG
while read -r f; do mv "$f" "${f//:/_}"; done <files.txt
Hope this helps.
Will

Change file extensions of multiple files in a directory with terminal/bash?

I'm developing a simple launchdaemon that copies files from one directory to another. I've gotten the files to transfer over fine.
I just want the files in the directory to be .mp3's instead of .dat's
Some of the files look like this:
6546785.8786.dat
3678685.9834.dat
4658679.4375.dat
I want them to look like this:
6546785.8786.mp3
3678685.9834.mp3
4658679.4375.mp3
This is what I have at the end of the bash script to rename the file extensions.
cd $mp3_dir
mv *.dat *.mp3
exit 0
The problem is that the file comes out named *.mp3 instead of 6546785.8786.mp3,
and when another 6546785.8786.dat file is imported into $mp3_dir, that *.mp3 is overwritten with the new .mp3.
I need to change just the .dat extension to .mp3 and keep the rest of the filename.
Ideas? Suggestions?
Try:
for file in *.dat; do mv "$file" "${file%dat}mp3"; done
Or, if your shell has it:
rename .dat .mp3 *.dat
Now, why your command didn't work: first of all, it is almost certain that you only had one .dat file in your directory when it was renamed to *.mp3; otherwise mv would have failed with *.mp3: not a directory.
And mv does NOT do any magic with file globs; it is the shell which expands globs. Which means, if you had this file in the directory:
t.dat
and you typed:
mv *.dat *.mp3
the shell would have expanded *.dat to t.dat. However, as nothing would match *.mp3, the shell would have left it as is, meaning the fully expanded command is:
mv t.dat *.mp3
Which will create a file named, literally, *.mp3.
If, on the other hand, you had several files named *.dat, as in:
t1.dat t2.dat
the command would have expanded to:
mv t1.dat t2.dat *.mp3
But this will fail: if there are more than two arguments to mv, it expects the last argument (i.e., *.mp3) to be a directory.
For anyone on a Mac, this is quite easy if you have Homebrew; if you don't have it, my advice is to get it. Then, once brew is installed, simply do this:
$ brew install rename
Then, once rename is installed, just type (in the directory where the files are):
$ rename -s dat mp3 *

How can I remove a file based on its creation date time in Perl?

My webapp is hosted on a unix server using MySQL as database.
I wrote a Perl script to back up my database. The Perl script is inside the cgi-bin folder and it is working. I only need to set up the cron job to run the Perl script once a day.
The backups are stored in a folder named db_backups. However, I also want to add a command inside my Perl script to remove any files in the db_backups folder that are older than, say, 10 days.
I have searched high and low for Unix commands and cannot find anything that matches what I needed.
if (-M $file > 10) { unlink $file }
or, coupled with File::Find::Rule
my $ten_days_ago = time() - 10 * 86400;
my @to_delete = File::Find::Rule->file()
                ->mtime("<=$ten_days_ago")
                ->in("/path/to/db_backup");
unlink @to_delete;
On Unix you can't, because the file's creation date is not stored in the filesystem.
You may want to check out stat, and -M (modification time)/-C (inode change time)/-A (access time) if you want a simple expression with relative timestamps (how long ago).
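As a quick illustration (the backup path below is made up), all three operators return an age in days relative to when the script started:
# Sketch: print the three relative ages Perl exposes for a file, in days.
my $file = '/path/to/db_backups/backup.sql';   # hypothetical backup file
printf "modified %.1f, inode changed %.1f, accessed %.1f days ago\n",
    -M $file, -C $file, -A $file;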
I have searched high and low for Unix commands and cannot find anything that matches what I needed.
Check out find(1) and xargs(1). Warning: these commands may change your life at the shell prompt.
$ find /path/to/backup -type f -mtime +10 -print0 | xargs -0 echo rm -f
When you're confident that will Do What You Want (tm), remove the echo. It says, roughly, starting in /path/to/backup, descend looking for plain files whose mtime is greater than 10 days, and print their names to xargs, which will pass those names to rm in batches.
(-print0 and its complement -0 are GNU extensions -- you mentioned you were on Linux -- which let you deal with whitespace in filenames safely.)
You should be able to do it without resorting to Unix commands. Loop through the files in your directory, use stat on each file to get its last modification time, then use unlink on the file to delete it if it's older than you want.
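A minimal sketch of that approach, assuming the backups live in /path/to/db_backups (a hypothetical path) and anything older than 10 days should be deleted:
use strict;
use warnings;

my $backup_dir = '/path/to/db_backups';        # hypothetical backup folder
my $cutoff     = time() - 10 * 24 * 60 * 60;   # 10 days ago, in epoch seconds

opendir(my $dh, $backup_dir) or die "Cannot open $backup_dir: $!";
while (defined(my $entry = readdir $dh)) {
    my $path = "$backup_dir/$entry";
    next unless -f $path;                      # skips . and .. and any subdirectories
    my $mtime = (stat $path)[9];               # element 9 of stat() is the mtime
    if ($mtime < $cutoff) {
        unlink $path or warn "Could not delete $path: $!";
    }
}
closedir $dh;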