What's behind the 'system' function in Perl?

I thought that it opens a shell, executes the parameter (a shell command) and returns the result in a scalar.
But executing the system function in a Perl script is faster than running a shell command.
Does it call this command in C?
If yes, what's the difference between
rmdir foo
system('rmdir foo');

The difference between the two is that the second one will fork a child process (which will run the rmdir command), while the first one makes a direct Unix system call through the API without creating a process. Creating a child process is expensive resource-wise.
A system() call will always create a child process, BUT it may either start a shell, which will in turn fork off the desired program as its own child process (thus resulting in 2 child processes), or fork off the program as a child process directly.
The choice of when Perl will open a shell during a system() call is spelled out in perldoc -f system. The short (and not 100% accurate) version is:
If there is only one parameter to the system call, and the parameter evaluates to a string that contains shell metacharacters, a shell will be forked first.
If there is only one parameter and it contains no metacharacters, or there is a list of more than one parameter, the program is forked directly, bypassing the shell.
system("rmdir foo"); # forks off rmdir command directly
system("rmdir", "foo"); # forks off rmdir command directly
system("rmdir foo > out.txt"); # forks off a shell because of ">"

Your system call starts a separate process while Perl's own rmdir will call the native C function in the context of the Perl process. Both methods end up doing the same system calls, but opening a new process is less efficient.*
It's best practice to use Perl's own functions (such as rmdir): they are portable, and you don't have to worry about interpreting shell exit codes, escaping metacharacters in filenames to prevent security risks, etc.
*system will create an additional sh process if your command contains pipes, input/output redirection, && etc. This does not apply in your case, but can make the system option even slower.
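A minimal sketch contrasting the two approaches, assuming a Unix-like system with an external rmdir command on the PATH. The directory names are made up for illustration. The list form of system is used so that no shell ever sees the awkward second name.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch area that is cleaned up automatically at exit.
my $base = tempdir(CLEANUP => 1);

# 1. Perl's built-in rmdir: no child process, errors are reported via $!.
my $dir1 = "$base/foo";
mkdir $dir1 or die "mkdir $dir1: $!";
rmdir $dir1 or die "rmdir $dir1: $!";

# 2. External rmdir via the list form of system: one child process, no shell,
#    so the spaces and '>' in the name are passed through literally.
my $dir2 = "$base/bar > baz";
mkdir $dir2 or die "mkdir $dir2: $!";
my $status = system('rmdir', $dir2);

# Decode the wait status that system returns (it is also left in $?).
my $exit_code;
if ($status == -1) {
    warn "could not start rmdir: $!\n";
} elsif ($status & 127) {
    warn sprintf "rmdir died from signal %d\n", $status & 127;
} else {
    $exit_code = $status >> 8;
}
```

The built-in needs a single error check, while the external command needs the three-way decoding of the wait status.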


Does a background process complete if its parent finishes first?

I am not sure if this question is specific to Perl, but it is the language I am using. Say I launch a background process to save a web page to a local file like this:
system("curl http://google.com > output_file.html &");
I know this will launch a background process, though I'm not totally sure of the details (for example does it get its own PID?). But what's particularly important to me is, what happens if the process that launched it terminates before curl finishes downloading? Will curl be allowed to continue, or will it terminate too?
Is there any reason the solution wouldn't be to prepend the above command with nohup (nohup curl ...)? See http://linux.101hacks.com/unix/nohup-command/
Yes, your backgrounded process should complete even if the script exits first.
The system call forks, which means that at that point a new, independent process is created as a near clone of the parent. That process is then replaced by the command to run, or by a shell that will run the command.† The system call then waits for the child process to complete.
The & in the command makes sure that it is a shell that is run by the system, which then executes the command. The shell itself forks a process (subshell), in which the command is executed, and doesn't wait for it but returns right away.
At that point system's job is done and it returns control to the script.
The fate of the process forked by the shell has nothing more to do with the shell, or with your script, and the process will run to its completion.
The parent may well exit right away. See this with
use warnings;
use strict;
use feature 'say';
system("(sleep 5; echo hi) &");
say "Parent exiting.";
or, from a terminal
perl -wE'system("(sleep 3; echo me)&"); say "done"'
Once in the shell, the () starts a sub-shell, used here to put multiple commands in the background for this example (and representing your command). Keep that in mind when tracking process IDs via the bash internal variables $BASHPID, $$, $PPID (here $$ differs from $BASHPID):
perl -wE'say $$; system("
( sleep 30; echo \$BASHPID; echo \$PPID; echo \$\$ ) &
"); say "done"'
Then view processes while this sleeps (by ps aux on my system, with | tail -n 10).
Most of the time the PID of a system-run command will be two greater than that of the script, as there is a shell between them (for a backgrounded process as well, on my system). In the example above it should be greater by 3, because of an additional () subshell with multiple commands.
This assumes that the /bin/sh which system uses is in fact bash (or is linked to it).
Note: when the parent exits first the child is re-parented to init and all is well (no zombies).
† From system
Does exactly the same thing as exec, except that a fork is done first and the parent process waits for the child process to exit. Note that argument processing varies depending on the number of arguments. If there is more than one argument in LIST, or if LIST is an array with more than one value, starts the program given by the first element of the list with arguments given by the rest of the list. If there is only one scalar argument, the argument is checked for shell metacharacters, and if there are any, the entire argument is passed to the system's command shell for parsing (this is /bin/sh -c on Unix platforms, but varies on other platforms). If there are no shell metacharacters in the argument, it is split into words and passed directly to execvp, which is more efficient. ...
The "... starts the program given by the first element ..." also means by execvp, see exec.
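The fork, exec and wait sequence described in the quoted documentation can be sketched in plain Perl. This is a simplified stand-in for the list form of system, not how perl implements it internally; the name my_system is made up for illustration.

```perl
use strict;
use warnings;
use POSIX ();

# Simplified stand-in for the list form of system(): fork, exec, wait.
# (The real system also ignores SIGINT and SIGQUIT in the parent while waiting.)
sub my_system {
    my @cmd = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: replace this process image with the requested program.
        exec { $cmd[0] } @cmd
            or do {
                warn "exec $cmd[0] failed: $!\n";
                POSIX::_exit(127);    # leave without running END blocks
            };
    }

    # Parent: block until the child exits, then return its wait status.
    waitpid($pid, 0);
    return $?;
}

# $^X is the path of the running perl, so no other program is needed.
my $status = my_system($^X, '-e', 'exit 3');
printf "child exit code: %d\n", $status >> 8;    # prints "child exit code: 3"
```

A backgrounded command differs only in that the shell in the middle does not wait for its own child, so the waitpid above returns as soon as that shell exits.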

How to change working directory using Net::SSH2?

use Net::SSH2;
my $ssh2 = Net::SSH2->new();
$chan = $ssh2->channel();
$chan->exec("cd dir1");
$chan->exec("command file1.txt");
The above doesn't work and command cannot find dir1/file1.txt. How do you change the working directory using Net::SSH2?
According to the documentation, each invocation of $chan->exec() runs in its own process on the remote. The cd dir1 in the first exec affects only that execution. The next exec is a completely separate process.
The simplest way to solve the problem would be to pass the full path in the command, i.e.
$chan->exec("command dir1/file1.txt");
You could also try setting the PATH variable using $chan->setenv() but that probably will be prohibited by the remote side.
Note also (from the process section):
... it is also possible to launch a remote shell (using shell) and simulate the user interaction printing commands to its stdin stream and reading data back from its stdout and stderr. But this approach should be avoided if possible; talking to a shell is difficult and, in general, unreliable.
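Another option, since each exec starts afresh, is to put the cd and the command into one exec string so that both run in the same remote process. The sketch below only builds that string. It assumes the remote side hands the command to a POSIX-like shell, and the helper names sh_quote and in_dir are hypothetical.

```perl
use strict;
use warnings;

# Quote one word for a POSIX shell: wrap it in single quotes and
# write any embedded single quote as '\''.
sub sh_quote {
    my ($word) = @_;
    $word =~ s/'/'\\''/g;
    return "'$word'";
}

# Build "cd DIR && CMD ARGS..." so the cd and the command share one shell.
sub in_dir {
    my ($dir, @cmd) = @_;
    return 'cd ' . sh_quote($dir) . ' && ' . join(' ', map { sh_quote($_) } @cmd);
}

my $remote = in_dir('dir1', 'command', 'file1.txt');
print "$remote\n";    # prints: cd 'dir1' && 'command' 'file1.txt'
```

The resulting string would then go to a single $chan->exec($remote) call on a channel obtained from a connected and authenticated Net::SSH2 object.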

Source configuration file in terminal through Perl script

I need to source the configuration file 'eg.conf' into the terminal through a Perl script. I am using the system command but it's not working.
system('. /etc/eg.conf')
Basically, I am writing a script that will later use the environment variables (set in the conf file) to execute other processes.
It is not clear what you are trying to achieve, but if you want to make the config available from within Perl AND your config file is valid Perl code you can use do or require (see perldoc for more information).
What you are doing in your code is to spawn a shell with system, include the config inside this shell (which must be in shell syntax) and then exit the shell again which of course throws all the config away on close. I guess this is not what you intend to do, but your real intention is not clear.
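If the config file can be written in Perl syntax, the do approach might look like the sketch below. The file contents and the names DB_HOST and DB_PORT are invented for the example; a temporary file stands in for the real config.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny config file in Perl syntax; its last expression is a hash ref.
my ($fh, $conf_file) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} <<'CONF';
+{
    DB_HOST => 'localhost',
    DB_PORT => 5432,
};
CONF
close $fh or die "close $conf_file: $!";

# do FILE compiles and runs the file and returns its last value.
my $config = do $conf_file;
die "cannot parse $conf_file: $@" if $@;
die "cannot read $conf_file: $!"  unless defined $config;

# Copy the settings into %ENV so that processes started later inherit them.
$ENV{$_} = $config->{$_} for keys %$config;

system($^X, '-e', 'print "DB_HOST=$ENV{DB_HOST}\n"');    # prints "DB_HOST=localhost"
```

Because the values land in the Perl process's own %ENV, every later system or exec call sees them, which a sourced-and-discarded shell cannot offer.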
What is your goal? Do you need to source eg.conf to set up further calculations from within a perl controlled shell, or are you trying to affect the parent shell that is running the perl script?
Your example call to system('. /etc/eg.conf') creates a new shell subprocess. /etc/eg.conf is sourced into that shell at which point the shell exits. Nothing is changed within the perl script nor in the parent process that spawned the perl script.
One cannot modify the environment of a parent process from a child process without the assistance of the parent process[1]. One generally returns code for the parent shell to source or to eval.
1: ok, one could theoretically affect the parent process by directly poking into its memory space. Don't do that.
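A sketch of the "return code for the parent shell to eval" approach: the Perl script prints export statements and the calling shell evaluates them. The variable names and the helper export_lines are made up for illustration, and a POSIX-like parent shell is assumed.

```perl
use strict;
use warnings;

# Turn a list of name/value pairs into shell "export" statements.
sub export_lines {
    my (%vars) = @_;
    my @lines;
    for my $name (sort keys %vars) {
        (my $value = $vars{$name}) =~ s/'/'\\''/g;    # escape single quotes
        push @lines, "export $name='$value'";
    }
    return join("\n", @lines) . "\n";
}

print export_lines(EG_HOME => '/opt/eg', EG_MODE => "it's on");
```

If this were saved under a hypothetical name such as set_env.pl, the parent shell would pick up the variables with eval "$(perl set_env.pl)".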

How can I find out what script, program, or shell executed my Perl script?

How would I determine what script, program, or shell executed my Perl script?
Example: I might want to have human readable output if executed from shell (customized for each type of shell), a different type of output if called as a script from another perl script, and a machine readable format if executed from a program such as a continuous integration server.
Motivation: I have a tool that changes its output based on which shell executes it. I'd normally implement this behavior as an option to the script, but this tool's design doesn't allow for options. Other shells have environment variables that indicate what shell is running. I'm working on a patch to support Powershell, which has no such special variable.
Edit: Many of these answers happen to be Linux specific. Unfortunately, PowerShell is for Windows. getppid, the $ENV{SHELL} variable, and shelling out to ps won't help in this case. This script needs to run cross-platform.
You use getppid(). Take this snippet in child.pl:
my $ppid = getppid();
system("ps --no-headers $ppid");
If you run it from the command line, system will show bash or similar (among other things). Execute it with system("perl child.pl"); in another script, e.g. parent.pl, and you will see that perl parent.pl executed it.
To capture just the name of the process with arguments (thanks to ikegami for the correct ps syntax):
my $ppid = getppid();
my $ps = `ps --no-headers -o cmd $ppid`;
chomp $ps;
EDIT: An alternative to this approach might be to create soft links to your script, make the different contexts use different links to access your script, and inspect $0 to build logic around that.
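A rough sketch of the $0 idea, assuming the platform supports links at all. The link names report-shell and report-ci and the helper mode_for are hypothetical.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Map the name the script was invoked under to an output mode.
# The link names here are hypothetical.
sub mode_for {
    my ($invoked_as) = @_;
    my %mode = (
        'report-shell' => 'human',
        'report-ci'    => 'machine',
    );
    return $mode{ basename($invoked_as) } // 'human';
}

print mode_for($0), "\n";
```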
I would suggest a different approach to accomplish your goal. Instead of guessing at the context, make it more explicit. Each use case is wholly separate, so have three different interfaces.
1. A function which can be called inside a Perl program. This would likely return a Perl data structure. This is far easier, faster and more reliable than parsing script output. It would also serve as the basis for the scripts.
2. A script which outputs for the current shell. It can look at $ENV{SHELL} to discover what shell is running. For bonus points, provide a switch to explicitly override.
3. A script which can be called inside a non-Perl program, such as your continuous integration server, and issue machine readable output. XML and/or JSON or whatever.
2 and 3 would be just thin wrappers to format the data coming out of 1.
Each is tailored to fit its specific need. Each will work without heuristics. Each will be far simpler than trying to guess the context and what the user wants.
If you can't separate 2 and 3, have the continuous integration server set an environment variable and look for it.
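The layering described in this answer could look something like the sketch below: one function that returns data, and two thin formatters on top of it. The data, the function names, and the environment variable MYTOOL_FORMAT are all invented for illustration.

```perl
use strict;
use warnings;
use JSON::PP ();

# 1. The core function: returns a plain Perl data structure.
sub get_status {
    return { name => 'demo', passed => 12, failed => 1 };
}

# 2. Thin wrapper: human-readable text for an interactive shell.
sub format_human {
    my ($s) = @_;
    return sprintf "%s: %d passed, %d failed\n", @{$s}{qw(name passed failed)};
}

# 3. Thin wrapper: machine-readable JSON (canonical means sorted keys).
sub format_machine {
    my ($s) = @_;
    return JSON::PP->new->canonical->encode($s) . "\n";
}

# Explicit selection instead of guessing; MYTOOL_FORMAT is a made-up variable
# that a continuous integration server could set.
my $format = ($ENV{MYTOOL_FORMAT} // 'human') eq 'json'
    ? \&format_machine
    : \&format_human;
print $format->(get_status());
```

Other Perl programs would call get_status directly and never touch the formatters.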
Depending on your environment, you may be able to pick it up from the environment variables. Consider the following code:
/usr/bin/perl -MData::Dumper -e 'print Dumper(\%ENV);' | grep sh
On my Ubuntu system, it gets me:
'SHELL' => '/bin/bash',
So I guess that says I'm running perl from a bash shell. If you use something else, the SHELL variable may give you a hint.
But let's say you know you're in bash, but perl is run from a subshell. Then try:
/bin/sh -c "/usr/bin/perl -MData::Dumper -e 'print Dumper(\%ENV);'" | grep sh
You will find:
'_' => '/bin/sh',
'SHELL' => '/bin/bash',
So the shell is still bash, but bash has a variable $_ which also shows the absolute filename of the shell or script being executed, which may give a valuable hint. Similarly, for other environments there will most probably be clues left in the perl %ENV hash.
If you're running PowerShell 2.0 or above (most likely), you can infer the shell as a parent process by examining the environment variable %psmodulepath%. By default, it points to the system modules under %windir%\system32\windowspowershell\v1.0\modules; this is what you would see if you examine the variable from cmd.exe.
However, when PowerShell starts up, it prepends the user's default module search path to this environment variable which looks like: %userprofile%\documents\windowspowershell\modules. This is inherited by child processes. So, your logic would be to test if %psmodulepath% starts with %userprofile% to detect powershell 2.0 or higher. This won't work in PowerShell 1.0 because it does not support modules.
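The prefix test described above might be sketched like this in Perl. It is only the heuristic from this answer, so it will misfire wherever the user's module path does not sit under %userprofile%; the function name launched_from_powershell is hypothetical.

```perl
use strict;
use warnings;

# Heuristic from the text: PowerShell 2.0+ prepends a path under the user
# profile to PSModulePath. Takes a hash ref so it can be tried on fake data.
sub launched_from_powershell {
    my ($env) = @_;
    # Windows environment names are case-insensitive, so normalise the keys.
    my %e = map { uc($_) => $env->{$_} } keys %$env;
    my $modules = $e{PSMODULEPATH} // return 0;
    my $profile = $e{USERPROFILE}  // return 0;
    return 0 unless length $profile;
    # Case-insensitive prefix match; \Q...\E keeps the backslashes literal.
    return $modules =~ /^\Q$profile\E/i ? 1 : 0;
}

print launched_from_powershell(\%ENV)
    ? "PowerShell\n"
    : "not PowerShell (or unknown)\n";
```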
This is on Windows XP with PowerShell v2.0, so take it with a grain of salt.
In a cmd.exe shell, I get:
whereas in the PowerShell console window, I get:
where E:\Home\user is where my "My Documents" folder is. So, one heuristic may be to check if PSModulePath contains a user dependent path.
In addition, in a console window, I get:
in the environment. From the PowerShell ISE, I get:
!C:=C:\Documents and Settings\user

Sourcing shell scripts in Perl

I want to source a shell script from within Perl and have the environment variables be available in Perl, but I'm not sure if there's an elegant way to do it. Obviously, using system() won't work since it runs in a forked process, and all environment changes will be lost. I think there's a CPAN module that can do it, but I prefer not to use external modules.
I've seen two solutions that would not work in my case:
Have a wrapper that calls the shell script, and then calls the Perl script. I do not know ahead of time which of my shell scripts I need to call.
Manually opening the shell script and scraping for arg=value pairs. This won't work either because the shell script is not a simple list of ARG=VALUE, but rather contains a bunch of conditionals, and variables can have different values depending on certain conditions.
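sh -c ". ./script; env" should output the environment at the end of script as name=value pairs, which you can then parse from your Perl script (as Perl is a language made for parsing, this should be easy). Note the . instead of source: source is a bash extension that a plain /bin/sh such as dash does not provide.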
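A minimal sketch of that idea using only core modules, assuming a Unix-like system with sh and env available. A temporary script with made-up variables (DEMO_MODE, DEMO_NAME) stands in for the real one, and the line-based parsing assumes that the script prints nothing itself and that no value contains a newline.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A stand-in shell script with a conditional, as in the question.
my ($fh, $script) = tempfile(SUFFIX => '.sh', UNLINK => 1);
print {$fh} <<'SH';
if [ -n "$HOME" ]; then DEMO_MODE=home; else DEMO_MODE=nohome; fi
DEMO_NAME="hello world"
export DEMO_MODE DEMO_NAME
SH
close $fh or die "close $script: $!";

# Source the file in a child sh, dump its environment, parse name=value lines.
sub env_after_sourcing {
    my ($file) = @_;
    # List form of open: the file name reaches sh as $1, with no quoting issues.
    open(my $out, '-|', 'sh', '-c', '. "$1"; env', 'sh', $file)
        or die "cannot run sh: $!";
    my %env;
    while (my $line = <$out>) {
        chomp $line;
        my ($name, $value) = split /=/, $line, 2;
        $env{$name} = $value if defined $value;
    }
    close $out or warn "sh exited with status $?\n";
    return %env;
}

# Copy only the variables of interest into this process's environment.
my %sourced = env_after_sourcing($script);
@ENV{qw(DEMO_MODE DEMO_NAME)} = @sourced{qw(DEMO_MODE DEMO_NAME)};
print "DEMO_NAME=$ENV{DEMO_NAME}\n";    # prints "DEMO_NAME=hello world"
```

Since the shell evaluates the conditionals itself, Perl only ever sees the final values.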
You can do this with the external CPAN module Shell::Source:
use Shell::Source;
my $env_path = Shell::Source->new(shell => "tcsh", file => "../path/to/file/temp.csh");
Because Perl runs as a child process of the shell that started it, it cannot set environment variables for that main shell. A child cannot set the environment of its parent.
For as long as the Perl process runs, though, you can access everything set in temp.csh through Shell::Source.