How to detect if cronned script is stuck - perl

I have a few Perl scripts on a Solaris (SunOS) system which connect to other nodes on the network and fetch/process command logs. They run correctly 99% of the time when run manually, but sometimes they get stuck. When that happens, I simply interrupt the script and run it again.
Now, I intend to cron them, and I would like to know if there is a way to detect if the script got stuck in the middle of execution (for whatever reason), and preferably exit as soon as that happens, in order to release any system resources it may be occupying.
Any help much appreciated.

TMTOWTDI, but one possibility:
At the start of your script, write the process id to a temporary file.
At the end of the script, remove the temporary file.
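In another script, see if there are any of these temporary files more than a few minutes/hours old. A file that old means the script that wrote it is probably stuck, and the process id stored in it tells you which process to kill.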
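The three steps above can be sketched roughly as follows. This is only an illustration, written in Python rather than Perl; the helper names (write_pidfile, remove_pidfile, find_stale_pidfiles), the marker name and the temporary lock directory are all assumptions made for the example. The same logic should port to Perl using $$, unlink and stat.

```python
import os
import tempfile
import time


def write_pidfile(directory, name):
    """Step 1: record this process's id in <directory>/<name>.pid."""
    path = os.path.join(directory, name + ".pid")
    with open(path, "w") as handle:
        handle.write(str(os.getpid()))
    return path


def remove_pidfile(path):
    """Step 2: remove the marker when the script finishes."""
    if os.path.exists(path):
        os.remove(path)


def find_stale_pidfiles(directory, max_age_seconds, now=None):
    """Step 3: return (path, pid) for every marker older than max_age_seconds."""
    now = time.time() if now is None else now
    stale = []
    for entry in sorted(os.listdir(directory)):
        if not entry.endswith(".pid"):
            continue
        path = os.path.join(directory, entry)
        # The file's modification time is when the script started.
        if now - os.path.getmtime(path) > max_age_seconds:
            with open(path) as handle:
                stale.append((path, int(handle.read().strip())))
    return stale


if __name__ == "__main__":
    lock_dir = tempfile.mkdtemp()      # hypothetical location for the markers
    marker = write_pidfile(lock_dir, "fetch_logs")
    try:
        pass                           # the real work would go here
    finally:
        remove_pidfile(marker)         # runs on normal exit and on exceptions
    print(find_stale_pidfiles(lock_dir, max_age_seconds=3600))
```

The checking half could itself be run from cron. For each (path, pid) it reports, it could send a signal to that pid and delete the marker, after first confirming that the pid still belongs to the stuck script and not to an unrelated process that reused the number.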

Related

Powershell: Detecting that a specifically opened program is running (and closing it)

I'm trying to automate a workflow. The automation script is mainly written in PowerShell. It consists of these steps: 1) opening a program, 2) communicating with the API, reading values, etc., 3) closing the program. This script will be run many times a day, so it would suffice not to close the program every time the script finishes, but rather to check at the beginning of the script whether the program is already open and, if not, open it. I'd like to implement both, then decide which solution to use later on.
The code for opening the program is complete, but running the .exe file alone is not enough to open the program: to load the correct settings and GUI, I also have to pass -s and -c on the command line. I wrapped all this in runProgram.cmd, so in the PowerShell script I only run this file to open the program. However, I am unsure how to detect that the program is already open, and how to close it. I believe a solution might use processes, with the help of Get-Process, but I'm unsure of its capabilities and limitations (how do I check if my program's process is not amongst the list of running processes?), and whether there is a better way of dealing with this problem.
I have found the solution:
Open the program and open Powershell, and type Get-Process (this will list all the currently running processes)
Search yours (by name). If you don't know which process is the one you're looking for, you can close your program, then type Get-Process again, and look for the process that disappeared from the list, since you closed it. Let's assume the name of it is "yourprocess".
In the code, type $val = Get-Process -Name yourprocess. If the process is running, $val holds the process object(s); if it is not running, Get-Process reports an error (which -ErrorAction SilentlyContinue suppresses) and $val is $null, not 0. Therefore, if you want to check whether it's open, you should use:
if($null -ne $val){...}
Finally, stopping the process: Stop-Process -Name yourprocess.

Running MATLAB system command in background with stdout

I'm using MATLAB and calling an .exe via the system command.
[status,cmdout] = system(command_s);
where command_s is a command string that is formatted earlier in my script to pass all the desired options to the .exe. The .exe would normally write to a .csv file via the > redirection operator in Windows/DOS. Instead, this output is going to cmdout where I use it later in the MATLAB script. It is working correctly and as expected. I'm doing it this way so that the process just uses memory and does not write a very large file to the disk, which would then have to be read from the disk and then deleted after I'm done with it. In the end, it saves a .mat file that's usually in hundreds of KB instead of 10s/100s of MBs as the .csv file would be (some unneeded data is thrown out in the end).
The issue I'm having is that, since I'm dealing with large files, the executable can take a significant amount of time. I typically have to wait about 2 minutes after executing this command. In the meantime, I have no feedback to know it is progressing and that my system hasn't frozen. I know I could add the & symbol to the end of my string, command_s, and run MATLAB code while this is running in the background (or asynchronously, as some would say), but that brings up an external window AND makes cmdout empty, so I cannot use the output, forcing me to sit there for 2 minutes wondering each time it executes.
Is there any way to run in the background AND get the stdout from the command?
Maybe you could try system(command_s,'-echo')?

Reduce relocatable win32 Perl to as few files and bytes as possible

I'm trying to use a perl program on a Windows HTCondor computing cluster. The way HTCondor on windows works is it copies all dependencies into a temporary directory (used as a chroot of sorts) and then it deletes the directory after the specified outputs are moved to a designated place.
If I take only perl.exe and perl514.dll and make a job like this: perl -e "print qq/hello\n/" and tell the cluster to run it 200 times, then each replication winds up taking about 15 seconds, which is acceptable overhead. That's almost all time spent repeatedly copying the files over the network and then deleting them. echo_hello.bat run 200 times takes more like two seconds per replication.
The problem I have is that when I try to use my full blown perl distribution of 55MB and 2,289 files, a single "hello" rep takes something like four minutes of copying and deleting, which is unacceptable. When I try to do many runs the disks on the machines grind to a halt trying to concurrently handle all the file operations across all the reps, so it doesn't work at all. I don't know how long it might take to eventually finish because I gave up after half an hour and no jobs had finished.
I figured PAR::Packer might fix the issue, but nope. I tried print_hello.exe created like this: pp -o print_hello.exe -e "print qq/hello\n/". It still makes things grind to a halt, apparently by swamping the filesystem. I think a PAR::Packer executable makes a ton of temporary files as it pulls out files it needs from the archive. I think the windows file system totally chokes when there are a bunch of concurrent small file operations.
So how can I go about cutting down the perl I built to something like 6MB and a dozen files? I'm really only using a tiny number of core modules and don't need most of the crap in bin and lib, but I have no idea how to proceed ripping out stuff in a sane way.
Is there an automated way to strip away un-needed files and modules?
I know TCL has a bunch of facilities for packing files into a single uncompressed archive that can then be accessed through a "virtual filesystem" without expanding the file. Is there some way to do this with perl itself sort of like with PAR? The problem is PAR compresses everything and then has to extract to temporary files, rather than directly work through a virtual filesystem layer. (If I understand correctly.)
My usage of perl is actually as a scripting layer. It's embedded in a simulation. So I'm really running my_simulation.exe which depends on perl514.dll, but you get the idea. I also cannot realistically do anything to the HTCondor cluster other than use it. So there's no need to think outside the box on what I should be using instead of perl and what I could administratively tweak in Windows and HTCondor, thanks.
You can use Module::ScanDeps to get a list of your Perl program's actual dependencies. I found it terrible that PAR::Packer took a significant amount of time to unpack the whole application, so I decided to build the executable myself.
Here is my ready to use script which gathers perl dependencies into some directory; it might be useful for you to reduce the number of perl-modules, e.g. by manually removing some dependencies after copying.
In theory (I have never tried this), your next step could be to merge all pure-Perl dependencies into a single file (like deps.pm), although it might be non-trivial due to Perl's autoload magic and some other tricks.
You can list the modules that are needed by your program using the very nice ListDependencies module.
To my knowledge it isn't downloadable anywhere, but it is simple to copy and paste into your own ListDependencies.pm file.
You should read the POD documentation within the module for usage instructions.

MATLAB doesn't find files I downloaded while the script is running

My problem is as described. My script downloads files through an external call to cmd (using the system function and then .NET to make keypresses). The issue is that when it tries to fopen these files I downloaded (filenames from a text file I write as I download), it doesn't find them, causing an error. When I run the script again after seeing it fail, it works but only up to the point where it's trying to download/call new files again, where it runs into the same problem.
Are new files downloaded while a script is running somehow not visible to the search path? Because the folder is most definitely on my search path (seeing as it works outside of during-script downloads). It's not that it isn't getting the files fast enough either, because they appear in my folder almost instantly, and I've tried a delay to allow for it to recognize them, but that didn't work either.
I'm not sure if it's important to note that the script calls an external function which tries to read the files from the .txt list I create in the main script.
Any ideas?
The script to download the files looks like so:
NET.addAssembly('System.Windows.Forms');
sendkey = @(strkey) System.Windows.Forms.SendKeys.SendWait(strkey);
system('start cygwinbatch.bat')
pause(.1)
sendkey(callStr1)
sendkey('{ENTER}')
pause(.1)
sendkey(callStr2)
sendkey('{ENTER}')
pause(.1)
sendkey('exit')
pause(2)
sendkey('{ENTER}')
But that is not the main reason I am asking: I am confident that the downloads are occurring when the script calls them, because I see them appearing in my folder as it is called. I am more confused as to why MATLAB doesn't seem to know they are there while the script is running, and I have to stop it and run it again for it to recognize the ones I've downloaded already.
Thank you,
Aaron
The answer here is probably to run the 'rehash' function. Matlab does not look for new files while executing an operation, and in some environments misses new files even during interactive activity.
Running the rehash function forces Matlab to search through its full path and determine if there are any new files.
I've never tried to run rehash in the middle of an operation though. ...
My guess is that the MATLAB interpreter is trying to look ahead and is throwing errors based on a snapshot of what the filesystem looked like before the files were downloaded. Do you get different behavior if you run it one line at a time using F9? If that's the case then you may be able to prevent the interpreter from looking ahead by using eval().

Automatically running a script to read particular information from a .txt file ? (Perl Script, or suggest)

My scenario: text files will keep arriving in a folder. I need to detect each new text file and read particular information from it; the format might be "word : info", or a word with a column of info under it, etc. This process needs to keep running indefinitely.
Problem: how should I go about doing this? I guess I should use a Perl script, but where do I go from there? I am getting ideas and help from the internet, but I thought asking here might make my thoughts clearer.
Kindly help, please suggest a path to do this.
Regards,
Chirayu
First thing: you want a daemon process, so you may want to have a look at Proc::Daemon.
Second thing, you need to read & parse your file. Parsing it depends on its format, and your question is not really clear about it.
Finally, you may want to consider moving a newly detected file (or renaming it) while processing it, and then (possibly) deleting it after having processed it. This depends on the requirements that you have. Alternatively, you may want to move the newly detected file into an archive directory after having processed it.
One approach might be to have a perl process that regularly (say every 5 seconds, every 5 minutes or every 5 hours, your call really) scans said directory and, as soon as any new text file appears, spawns a child process that processes it.
The child process might be another perl script which gets the name of the text file as its argument and which reads the file, detects the word you mention and then extracts the information you are interested in (and then does whatever you consider necessary with that information).
One thing to look out for is what to do with the text files once they are processed. Are they supposed to stay around? Then you need to keep track of which of them you have processed, so they do not get processed again in case your master process (the one that scans the directory and spawns perl children) has to be restarted (due to either a crash or a deliberate restart).
If the text files are supposed to disappear once they are processed, then I assume it could be a good idea to either let the children remove them after completion or to let the master process remove them, provided the master process always waits for the children to complete before it continues running. The drawback with a master process waiting for children to complete is that children then cannot be run in parallel but have to be run in strict sequence (not necessarily a drawback, depending on your situation).
(If you have a master process always waiting for the child process to run, you can actually skip having child processes altogether and create a subroutine in the master program which reads and processes the text file).
High level description but hope it helps.
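The polling approach described above, in its simplest form (no child processes, just a subroutine in the master program), could look roughly like this. It is a sketch in Python rather than Perl; the function names (parse_key_value_lines, scan_once), the .txt suffix and the "word : info" line format are assumptions taken from the question's loose description, not a known file format.

```python
import os
import tempfile


def parse_key_value_lines(path):
    """Collect 'word : info' pairs from a text file; other lines are ignored."""
    fields = {}
    with open(path) as handle:
        for line in handle:
            key, separator, value = line.partition(":")
            if separator and key.strip():
                fields[key.strip()] = value.strip()
    return fields


def scan_once(directory, processed):
    """Process every .txt file not seen before and return {filename: fields}."""
    results = {}
    for name in sorted(os.listdir(directory)):
        if name.endswith(".txt") and name not in processed:
            results[name] = parse_key_value_lines(os.path.join(directory, name))
            processed.add(name)   # remember it so a later scan skips it
    return results


if __name__ == "__main__":
    incoming = tempfile.mkdtemp()           # stand-in for the watched folder
    with open(os.path.join(incoming, "a.txt"), "w") as handle:
        handle.write("host : node1\nstatus : ok\n")
    processed = set()
    print(scan_once(incoming, processed))   # first scan picks up a.txt
    print(scan_once(incoming, processed))   # second scan finds nothing new
```

A long-running version would call scan_once in a loop with a sleep between scans. To survive restarts, it would also need to save the processed set to disk, or move finished files into an archive directory.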
What is the operating system you are using?
On Windows, you can use Win32::ChangeNotify and on Linux, you can use Linux::Inotify2 to be notified of changes to the contents of a directory.
Your script can simply wait to be notified and take action when notified instead of polling the contents of the directory which will either waste resources or potentially miss some changes.