How to check for activity or lack thereof on a Unix file directory using Perl or Unix commands - perl

Scenario:
I have a process where many files are being copied (scp'd) to a DestinationServer by Host1, Host2, Host3, and Host4, for example, all going to the same common directory: DestinationServer:/home/target. All the files are unique, so no files will be overwritten. Host1-Host4 each have a cron job that launches their scp script to the DestinationServer. The caveat is that the hosts are in different time zones and locations, so they will finish at different times.
Need:
Since the files are being scp'd to DestinationServer:/home/target, what is the best way to programmatically check when those scp transfers from the other hosts are done?
Options:
My options are to do this programmatically in either Perl or shell, if possible.
What should I look for, and what Unix commands or Perl modules could I use to help determine when the transfers have finished? Any ideas or examples would be great! Thanks.

Use a Maildir-style approach: copy all files to a temporary directory, then, after the transfer is complete, have the originating host perform a rename into the target directory via ssh. That way, when a file appears in the target directory, you know that it is complete.
I suggest this because if you just scp files into the target directory and monitor the directory in whatever way, you cannot distinguish a complete transfer from an interrupted scp command or a network failure.
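For illustration, a minimal sketch of what each originating host might run, assuming a staging directory such as /home/target/.incoming (the staging directory name is an assumption, not part of the original setup):

#!/usr/bin/perl
# Hypothetical originating-host script: stage each file in a temporary
# directory on the destination, then publish it with a rename over ssh.
use strict;
use warnings;
use File::Basename qw(basename);

my $host    = 'DestinationServer';
my $staging = '/home/target/.incoming';   # assumed staging directory
my $target  = '/home/target';

for my $file (@ARGV) {
    my $name = basename($file);

    # copy into the staging directory first
    system('scp', $file, "$host:$staging/$name") == 0
        or die "scp of $file failed\n";

    # the file only appears under /home/target once it is complete
    system('ssh', $host, 'mv', "$staging/$name", "$target/$name") == 0
        or die "rename of $file failed\n";
}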

SGI::FAM and Sys::Gamin are Perl interfaces to the File Alteration Monitor (FAM) and its Gamin reimplementation; they can notify your script when entries in the watched directory are created, changed or deleted.

A similar but alternative approach to Jouni's is to use semaphore files: before scp-ing its files, each originating host puts up a semaphore file, and when it has finished, it removes it. Once all the semaphore files are gone, you know it's time. A minimal sketch of the destination-side check follows.
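For example, the destination side could poll for the semaphore files like this (the "Host1.lock" naming convention is made up for illustration):

#!/usr/bin/perl
# Hypothetical destination-side check: wait until every host has removed
# the semaphore file it created before starting its scp.
use strict;
use warnings;

my $dir   = '/home/target';
my @hosts = qw(Host1 Host2 Host3 Host4);

# poll until no "<host>.lock" semaphore file remains
while (grep { -e "$dir/$_.lock" } @hosts) {
    sleep 30;
}
print "All hosts have finished copying to $dir\n";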

Related

Zip files with encryption in a remote share, keeping original names and location

My team faces the need to encrypt all files in a repository with AES256. For this purpose, we decided to zip each file with that encryption, using the same key for all of them.
The problem we have is that these files sit on a NAS, so from Windows boxes they are accessed via a UNC (\\server\share-style) path.
The directory structure is something like this:
Original Structure:
Root
-1
|--folder1
|---file1.ext
|---file2.ext
|--folder2
|---filea.ext
|---fileb.ext
|--folder2.a
|---filec.ext
and so on...
Essentially, what we need is to have all the original files contained in a zip file, keeping their original names, which would be something like this:
Desired Outcome:
|-Root
|-1
|--folder1
|---file1.zip
|---file2.zip
|--folder2
|---filea.zip
|---fileb.zip
|--folder2a
|---filec.zip
and so on...
To accomplish this, we tried a batch script that calls 7-Zip, but it only works if it's run from the root directory, which is something we cannot do, as the files are on a network location rather than on a local server.
Here is the syntax of the batch script we came up with:
FOR /R %%i IN ("*.wmv") DO "C:\Program Files\7-Zip\7z.exe" a -mx0 -tzip -pPasswordHere "%%~dpni.zip" "%%i"
But, as I wrote previously, it only works when run from the root folder, which is something we cannot do as the files sit on a network location.
Mapping the drive or making a symbolic link to it doesn't do the trick either.
I've also looked at having 7-Zip do this itself, namely by making use of its "-r" switch, but I couldn't find a way to get the desired outcome (namely, recursing through all folders in the remote tree structure, and there are a lot of them, while keeping the original file names).
I'm open to any suggestions, as any kind of script, trick or gizmo that gets the job done will be more than welcome. =)
Thanks a million in advance!
Sebas.
----SOLUTION----
I actually found a solution here, mapping the drive in a different way (it's so simple it just made me feel stupid(er), but it's altogether beautiful).
In the batch script, the remote share can be mapped and entered like so:
You can map a drive using
net use X: \\server\directory
and then you can change to that directory using
pushd X:
(Post from which the answer was taken: Batch File Iterating through files on a local network server)

PowerShell: Copy-Item -Recurse -Force is not copying all sub-files

I have a one-liner that is baked into a larger script for some high-level forensics. It is just a simple Copy-Item command that writes the destination folder and its contents back to my server. The code works great, BUT even with the switches:
-Recurse -Force
It is not returning the file with the extension .dat. As you can probably guess, I need that .dat file for analysis. I am running this from a privileged account. My only thought was that it is a read/write conflict and the host machine is currently using the file (or another system file). What switch am I missing? The "mode" for the file that will not copy over is -a---, so it is not hidden, just not copying. Suggestions elsewhere have said to use xcopy/robocopy; if possible, I do not want to call another dependency, since I'm already using PowerShell for the majority of the script and I'd prefer to stick with it. Any thoughts? Thanks in advance, this one has been tickling my brain for a little while...
The only way to copy a file that is in use is to find the locking handle, close it, and then retry the copy operation (handle.exe).
From your question it looks like you are trying to remotely copy user profiles, which include ntuser.dat and other files that would be needed to keep the profile working properly. Even if you did manage to find a way to unload the .dat file(s), you would have to consider the impact that would have on the remote system.
Shadow copy is typically used by backup programs to copy files that are in use, so your best bet would be to find the latest backup of each remote computer and try to extract the needed files from the backed-up copies, or maybe wait for the users to log off and then try again.

Execute robocopy from PowerShell continuously between two established times

I have a program that creates temporary files in a specific folder. Then, automatically, after a few seconds, these files are deleted.
I want to copy those temporary files to a specific folder, and I would like to use a PowerShell script to do this:
robocopy startFolder destinationFolder *.TIFF *.JPEG *.jpg *.PNG *.GIF *.BMP *.ICO *.PBM *.PGM *.PPM /s /XO
My problem is that I couldn't use a scheduled task (because of its limitation on intervals measured in seconds) or install the script as a Windows service via PowerShell (as far as I know, that is bad practice). I need this PowerShell script running all the time, grabbing files at the moment they are created, before they are deleted.
Could you give me a hand please? Thanks!
Not sure it's quite what you want, but robocopy does have directory-monitoring functionality built in. You could add /mon:1, which should monitor the source directory and re-run the copy when it detects one change (a new or changed file, for example).
However, a downside of this method is that robocopy won't exit; it will run until you kill it.
Edit: I've just noticed you specify in your question title that this should run between two established times, in which case you could add the /rh:hhmm-hhmm option to specify times between which new copies can be started. For example, /rh:1000-1200 should only perform the copies (and hence monitoring) between 10am and midday.
Caveat: I've not tried using the "monitor" option of robocopy, so I'm not sure what sort of delay there would be between a change taking place, and the copy being re-run, but it's worth a shot.

How can I resume downloads in Perl?

I have a project that depends upon some binaries being downloaded from the web at install time. For this, what I do is:
# $file is the local name and $url the download location (placeholders)
if (-e "src/$file") {
    # skip that file
} else {
    # use wget to download the file
    system('wget', '-P', 'src/', $url) == 0
        or warn "wget failed for $url\n";
}
The problem with this approach is that when I interrupt a download in the middle and invoke the script the next time, the partially downloaded file is also skipped (which is not desired); I also want wget to resume the download of the partially downloaded file.
How should I go about it?
Possible solutions I could think of:
Download the file to a temporary name, say download_tmp, and move it to the real name if successful.
Handle $SIG{'INT'} to run proper cleanup code.
But neither of these helps resume the partial download. Any insights?
First, I don't understand what this has to do with Perl, since you're using wget to do the downloading... You could use libwww-perl (perldoc LWP) and have more control over the download process.
Then I second your idea of downloading to a "tmp" filename and moving the file on success.
However, I think you need to go further and verify the integrity of the files. Computing an MD5 or SHA hash is very easy, and you can match the downloaded file's hash against the one you're expecting. You can keep a short file on the server containing the checksum (filename.md5), and declare success only when you have a match.
Note that catching all the signals and generally trying to make the process unkillable, and then expecting it to have worked is bound to fail at one point or another. There could be a network timeout, a crash, power failure, configuration problem on the server ... you should instead assume downloads can fail, because they will, and code so that your process can recover.
Finally, you're not telling us what kind of binaries you're downloading or what you're doing with them. Since you use wget, I'm going to assume you're on Unix; you should consider using RPM+Yum or the like, as they handle all of this for you. RPMs are easy to write, really.
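To illustrate the tmp-file-plus-checksum suggestion, here is a rough sketch using LWP::UserAgent and Digest::MD5; the "<url>.md5" convention and the command-line interface are assumptions, not a finished implementation:

#!/usr/bin/perl
# Sketch: download to a temporary name, verify against a published .md5
# checksum, and only rename into place on a match.
use strict;
use warnings;
use LWP::UserAgent;
use Digest::MD5;

my ($url, $dest) = @ARGV;
my $tmp = "$dest.download_tmp";

my $ua  = LWP::UserAgent->new;
my $res = $ua->get($url, ':content_file' => $tmp);
die 'download failed: ' . $res->status_line . "\n" unless $res->is_success;

# fetch the expected checksum (assumes the server publishes "<url>.md5")
my $md5_res = $ua->get("$url.md5");
die "no checksum available for $url\n" unless $md5_res->is_success;
my ($expected) = $md5_res->decoded_content =~ /([0-9a-f]{32})/i;

open my $fh, '<:raw', $tmp or die "cannot read $tmp: $!\n";
my $actual = Digest::MD5->new->addfile($fh)->hexdigest;
close $fh;

if (defined $expected && lc $actual eq lc $expected) {
    rename $tmp, $dest or die "rename failed: $!\n";
} else {
    unlink $tmp;
    die "checksum mismatch for $url\n";
}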
Use your first approach:
download to "FileName".tmp
move "FileName".tmp to "FileName" (move, not copy!)
once per diem, clean out all .tmp files (paranoia rulez)
You could just use wget's -N and -c options and remove the entire "if file exists" logic.
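If you stay with wget, a tiny wrapper along those lines might look like this (the src/ target directory is just a placeholder); -c continues a partial download and -N only re-fetches when the remote copy is newer:

#!/usr/bin/perl
# Hypothetical wrapper: let wget handle resuming (-c) and freshness (-N).
use strict;
use warnings;

for my $url (@ARGV) {
    system('wget', '-c', '-N', '-P', 'src/', $url) == 0
        or warn "wget failed for $url\n";
}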

How can I copy a directory but ignore some files in Perl?

In my Perl code, I need to copy a directory from one location to another on the same host excluding some files/patterns (e.g. *.log, ./myDir/abc.cl).
What would be the optimum way of doing this in Perl across all the platforms?
On Windows, xcopy is one such solution. On Unix platforms, is there a way to do this in Perl?
I think you're looking for rsync. It's not Perl, but it's going to work a lot better than anything you make in Perl:
% rsync --exclude='*.log' --exclude='./myDir/abc.cl' SOURCE DEST
If you have a bunch of patterns, you can put those all in a file:
*.log
./myDir/abc.cl
Now ignore all the patterns in a file:
% rsync --exclude-from=do_not_sync.txt SOURCE DEST
I'd use File::Find and step over each file, but instead of calling File::Copy's copy() on every file unconditionally, first test whether it matches one of the patterns and skip it (next) if it does; a rough sketch follows.
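A hedged sketch of that idea; the source and destination paths and the exclude patterns are only examples:

#!/usr/bin/perl
# Walk the source tree with File::Find, skip excluded patterns, copy the rest.
use strict;
use warnings;
use File::Find;
use File::Copy qw(copy);
use File::Path qw(make_path);
use File::Basename qw(dirname);
use File::Spec;

my ($src, $dst) = ('/path/to/source', '/path/to/dest');   # placeholders
my @exclude = (qr/\.log$/, qr{^myDir/abc\.cl$});           # example patterns

find({ no_chdir => 1, wanted => sub {
    return unless -f $File::Find::name;                    # files only
    my $rel = File::Spec->abs2rel($File::Find::name, $src);
    return if grep { $rel =~ $_ } @exclude;                # "next" on a match

    my $target = "$dst/$rel";
    make_path(dirname($target));                           # create parent dirs
    copy($File::Find::name, $target)
        or warn "copy of $rel failed: $!\n";
}}, $src);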
On *nix, you can use the native tar command with its --exclude options. Then, after creating the tar file, you can bring it over to your destination and untar it.