Memory-efficient calling of an external command in Python on Solaris

I have a Python script that needs to load lots of data by calling lots of external commands.
After a couple of hours this always crashes with:
....
process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, close_fds=True)
File "/usr/lib/python2.6/subprocess.py", line 621, in __init__
errread, errwrite)
File "/usr/lib/python2.6/subprocess.py", line 1037, in _execute_child
self.pid = os.fork()
OSError: [Errno 12] Not enough space
abort: Not enough space
... even though the machine has much more memory available than the script is actually using.
The root cause seems to be that every fork() momentarily requires twice as much memory as the parent process, and that memory is only released once exec() is called
(see: http://www.oracle.com/technetwork/server-storage/solaris10/subprocess-136439.html).
In my case this is made even worse because I'm loading the data from multiple threads.
So do you see any creative way to work around this issue?

Do you just need to launch an external command, or do you actually need to fork the process you're running?
If you just need to run another script, try:
subprocess.call()
e.g.
subprocess.call(["ls", "-l"])
os.fork() copies your whole current process (environment and all), which might be unnecessary depending on your usage.
Don't forget to import the subprocess module before using it.
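For instance, a minimal sketch of the suggested approach (the ["ls", "-l"] command is just a placeholder for your real command):

import subprocess

# Run the external command and wait for it; the return code tells you
# whether it succeeded.
ret = subprocess.call(["ls", "-l"])
if ret != 0:
    print("command failed with exit status %d" % ret)

# If you need the command's output, check_output (Python 2.7+) avoids
# holding on to a Popen object and its pipes; on 2.6 you would use
# subprocess.Popen(...).communicate() instead.
output = subprocess.check_output(["ls", "-l"])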
More info:
https://docs.python.org/2/library/subprocess.html
https://docs.python.org/2/library/os.html#os.fork
How do I execute a program from python? os.system fails due to spaces in path

Related

Set environment variables by batch file called from perl script

Let's consider the following perl script:
#!/usr/bin/perl
system("C:/Program Files (x86)/Microsoft Visual Studio/2017/Enterprise/Common7/Tools/VsDevCmd.bat");
system("msbuild");
The batch file invoked with the first system call is supposed to set up some environment variables so that the msbuild executable in the second system call can be found.
When I run this perl script I get the following error:
'msbuild' is not recognized as an internal or external command,
operable program or batch file.
So it looks like the environment variables set in the batch file are not made available to the context of the perl script. What can I do to make this work?
Note 1
Running the batch file first from a console window and then running msbuild works fine. So the batch file works as expected and msbuild is actually available.
Note 2
My real-world Perl script is much longer. This example is a massive simplification which still allows me to reproduce the problem. So I cannot easily replace the Perl script with a batch file, for example.
Note 3
The funny thing is: I've been using this perl script for one or two years without any problems. Then suddenly it stopped working.
Your process has an associated environment which contains things like the search path.
When a sub-process starts, the new process has a new, separate, environment which starts as a copy of the parent process's environment.
Any process (including a sub-process) can change its own environment. It cannot, however, change its parent process's environment.
Running system() starts a sub-process and therefore creates a new environment.
So when you call system() to set up your environment, it starts a new sub-process with a new environment. Your batch program then changes this new environment. But then the sub-process exits and its environment ceases to exist - taking all of the changes with it.
You need to run the batch file in a parent process, before running your Perl program.
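To see the principle in isolation, here is a minimal sketch in Python (Perl's system() behaves the same way; DEMO_VAR is just an illustrative variable name):

import os
import subprocess
import sys

os.environ.pop("DEMO_VAR", None)

# The child receives a *copy* of this environment; whatever it changes
# disappears again when it exits.
subprocess.call([sys.executable, "-c",
                 "import os; os.environ['DEMO_VAR'] = 'set in child'"])

print(os.environ.get("DEMO_VAR"))  # None - the parent never sees the change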

Multiple addpath commands result in slow second call

I have a startup script which sets my default settings, defines my working directory, and adds all relevant paths. In this script there is the command
addpath(genpath(pwd))
which simply adds all of the subfolders within my current directory.
Recently I got a new SSD and moved most of my non-program files over to it. Both drives work fine. However, I now notice that if I call the command twice, the first call executes in less than a second whereas the second call keeps running (20+ minutes and still going).
I am fairly certain I did not have this problem before, and it occurs as soon as a single file covered by the addpath is already on the MATLABPATH. Furthermore, adding the folders to the permanent MATLABPATH and restarting MATLAB results in the same soft hang (it runs forever without any actual error). This happens for paths on both drives; the only change is the new drive.
Edit: It appears to be getting stuck on line 94 of "addpath"
path(p, mp);
I am using Windows 10 with MATLAB R2017b.
Thank you for your help

/usr/local/bin/perl5: bad interpreter: Permission denied

I have a unix command (a script) which has a nested Perl script in it.
When I run this unix command from the command line it works fine.
If I run the same command from a Tcl file using exec, I get the following error:
'sh: /cmdpath/cmd.pl: /usr/local/bin/perl5: bad interpreter: Permission denied'
Any idea what could be causing this? My Tcl code executes this command several times (more than 100 times).
Thanks
Ruchi
Almost certainly your Perl script is encoded in DOS/Windows line-ending format, which uses \r\n to terminate lines. Since Unix terminates lines with \n only, the \r is interpreted as belonging to the executable name, so that the kernel tries to run a program named perl5\r and fails.
Deleting the trailing \r on this line should fix the problem.
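A quick way to confirm this (sketched in Python; the path comes from your error message):

# Inspect the script's shebang line for a stray carriage return.
with open("/cmdpath/cmd.pl", "rb") as fh:
    shebang = fh.readline()

if shebang.endswith(b"\r\n"):
    interpreter = shebang[2:].rstrip(b"\n")   # still carries the trailing \r
    print("the kernel will try to execute: %r" % interpreter)

Running dos2unix on the script (or deleting the \r by hand, as described above) then fixes it.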
Alternatively, it may be that the perl5 executable either does not exist at the given path, or exists but lacks the execute permission bit. If you have this executable living somewhere else in the filesystem, update the path on the first line of the script to point to it. To fix the latter problem, run
chmod +x /usr/local/bin/perl5
You will need to be root to do this.
Given the output you are showing, you are likely executing "sh cmd.pl"; sh in turn is trying to execute the Perl interpreter.
Why not spawn "/usr/local/bin/perl5 cmd.pl" directly? This will be more efficient, especially if you are doing it hundreds of times.

Should I turn a perl script that parses a /var/log/.* file into a daemon?

I am writing a perl script to parse, for example, /var/log/syslog.
The perl script triggers further subsequent tasks when particular events in the log appear. The log is parsed following the advice of this post:
Command line: monitor log file and add data to database
which, as I understand it, amounts to reading the log through a pipe.
Now I'd like this script to run in the background forever.
This sounds like a daemon to me, and the daemon program referenced in the following question seems ideal:
How can I run a Perl script as a system daemon in linux?
But from this post, it seems clear that daemons have no open file handles. So how can I have a daemon, or a perl script that becomes a daemon, that monitors a logfile?
It sounds like what you want is a daemon. In that case the advice given in the second post you reference is the best practice. However, you do have other options like daemontools, which removes the fork complexity.
Daemons are allowed to have filehandles, but you should close STDIN, STDOUT, and STDERR because you shouldn't really use them anymore. A lot of this has to do with the way fork works in *nix systems. Just open the pipe filehandle after your second fork and you shouldn't have any issues.
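For illustration, here is the shape of that pattern sketched in Python rather than Perl (the tail -F pipe stands in for whatever filehandle you actually need):

import os
import subprocess
import sys

def daemonize():
    # Classic double fork: detach from the parent, start a new session,
    # then fork again so the daemon can never reacquire a terminal.
    if os.fork() > 0:
        sys.exit(0)
    os.setsid()
    if os.fork() > 0:
        sys.exit(0)
    # Point STDIN/STDOUT/STDERR at /dev/null; the daemon no longer uses them.
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)

daemonize()

# Only after the second fork do we open the handle we actually need.
tail = subprocess.Popen(["tail", "-F", "/var/log/syslog"],
                        stdout=subprocess.PIPE)
for line in iter(tail.stdout.readline, b""):
    pass  # trigger the follow-up tasks when interesting events appear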
This doesn't answer your question, but it is another route to consider which may or may not be appropriate for you:
rsyslog can execute a program when a certain message is logged
See Filter Conditions for setting up the trigger, Templates for formatting the output that's passed to the script, and Actions > Shell Execute for specifying the executable.
Be sure to read the security implications, and note that rsyslog blocks while the external program runs. But if your script runs reliably quickly, it may be an option.

Error handling in sets of batch files running in Windows task scheduler

Let's say I have 5 batch files that run sequentially one after another (executed via the Windows task scheduler on a normal Windows XP PC):
Script1.bat
Script2.bat
Script3.bat
Script4.bat
Script5.bat
Suppose one of the scripts fails (an error condition is detected -- the details of how this happens are not important for my question here). How do I stop the remaining scripts from running if they all run within the task scheduler? For example, if Script1.bat fails, I don't want to run Script2-5.bat; if Script3.bat fails, I don't want to run Script4-5.bat, etc.
I thought about writing a flag value to a temporary file that each script would read. At the beginning of each script (except the first one), it would check whether the flag is valid. The first script would clear this flag each time the set of batch files runs.
Surely there is a better way to do this or maybe there is a standard for how to handle this type of situation? Thanks!
Write a master.bat file that conditionally calls each of the scripts in sequence. Then schedule the master instead of directly scheduling the 5 scripts.
@echo off
call Script1.bat
if %errorlevel%==0 call Script2.bat
if %errorlevel%==0 call Script3.bat
if %errorlevel%==0 call Script4.bat
if %errorlevel%==0 call Script5.bat
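Note that call propagates each script's exit code back into %errorlevel%, and a failed if check leaves %errorlevel% untouched, so a failure in (say) Script3.bat also skips Script4.bat and Script5.bat.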