Why does fish's echo builtin behave differently from bash's builtin?

Given a function bar that runs echo bar every second forever, I expect bar | cat to print one line every second. It does not. It doesn't print anything until I hit ^C. There seems to be some superfluous buffering going on.
yes | cat works just fine. bar | cat works as expected in bash.
As to why this matters, please see this other question.

We have a command with a builtin in a pipeline:
echo bar | cat
bash forks new processes to execute builtins in pipelines. So there are two new processes: one for cat, the other to run the builtin echo. The processes are created with pipes connecting them, and they run independently.
fish always runs builtins directly, without creating new processes for them. So it executes echo bar, then sends the output to cat. The input and output for builtins and functions (but not external commands) is fully buffered. We're working on removing that limitation, but we're not there yet.
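To get a feel for what "fully buffered" means here, this is a rough Python sketch (an analogy only, not taken from fish's source). Data written through a buffered writer does not reach the read end of a pipe until the buffer is flushed. The function name and the 4096-byte buffer size are illustrative assumptions.

```python
import os


def buffered_pipe_demo():
    """Return what the reader sees before and after the writer flushes."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)  # an empty pipe now raises instead of hanging

    # A fully buffered writer, standing in for a builtin's buffered output.
    writer = open(write_fd, "wb", buffering=4096)
    writer.write(b"bar\n")

    try:
        before_flush = os.read(read_fd, 100)
    except BlockingIOError:
        before_flush = b""  # nothing has reached the pipe yet

    writer.flush()
    after_flush = os.read(read_fd, 100)

    writer.close()
    os.close(read_fd)
    return before_flush, after_flush


if __name__ == "__main__":
    print(buffered_pipe_demo())
```

A function that loops forever never reaches the point where its buffer is handed over, which would match what the question describes.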
The bash behavior may seem more desirable in this case, but there's a heavy price for it, which is that builtins and functions do different things (or just don't work) inside pipelines. Example:
echo foo | read some_var ; echo $some_var
In fish, this will set some_var=foo, as you would expect. In bash, this silently fails: the assignment "works," but only within the transient process spawned to execute the read builtin, so the value isn't visible to the next command. This should give you some appreciation for why fish always runs builtins directly.
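The bash half of this can be imitated in Python on a Unix system. This is only a sketch of the general fork behavior, not of bash internals: an assignment made in a forked child changes the child's copy of memory and is lost when the child exits. The name assign_in_child is made up for the illustration.

```python
import os


def assign_in_child(value):
    """Set a variable in a forked child and report what each process sees."""
    state = {"some_var": ""}
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: the assignment happens only in this process's copy of memory.
        os.close(read_fd)
        state["some_var"] = value
        os.write(write_fd, state["some_var"].encode())
        os._exit(0)

    # Parent: read what the child saw, then look at our own copy.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_saw = reader.read()
    os.waitpid(pid, 0)
    return child_saw, state["some_var"]


if __name__ == "__main__":
    print(assign_in_child("foo"))
```

The child reports foo while the parent's copy stays empty, which is the same shape as read succeeding inside a transient process.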

Related

Does a background process complete if its parent finishes first?

I am not sure if this question is specific to Perl, but it is the language I am using. Say I launch a background process to save a web page to a local file like this:
system("curl http://google.com > output_file.html &");
I know this will launch a background process, though I'm not totally sure of the details (for example does it get its own PID?). But what's particularly important to me is, what happens if the process that launched it terminates before curl finishes downloading? Will curl be allowed to continue, or will it terminate too?
Is there any reason the solution wouldn't be to prepend the above command with nohup (nohup curl ...)? See http://linux.101hacks.com/unix/nohup-command/
Yes, your backgrounded process should complete even if the script exits first.
The system call forks, which means that at that point a new, independent process is created as a near clone of the parent. That process is then replaced by the command to run, or by a shell that will run the command.† The system then waits for the child process to complete.
The & in the command makes sure that it is a shell that is run by the system, which then executes the command. The shell itself forks a process (subshell), in which the command is executed, and doesn't wait for it but returns right away.
At that point system's job is done and it returns control to the script.
The fate of the process forked by the shell has nothing more to do with the shell, or with your script, and the process will run to its completion.
The parent may well exit right away. See this with
use warnings;
use strict;
use feature 'say';
system("(sleep 5; echo hi) &");
say "Parent exiting.";
or, from a terminal
perl -wE'system("(sleep 3; echo me)&"); say "done"'
Once in the shell, the () starts a sub-shell, used here to put multiple commands in the background for this example (and representing your command). Keep that in mind when tracking process IDs via bash internal variables $BASHPID, $$, $PPID (here $$ differs from $BASHPID)
perl -wE'say $$; system("
( sleep 30; echo \$BASHPID; echo \$PPID; echo \$\$ ) &
"); say "done"'
Then view processes while this sleeps (by ps aux on my system, with | tail -n 10).
Most of the time the PID of a system-run command will be two greater than that of the script, as there is a shell between them (for a backgrounded process as well, on my system). In the example above it should be greater by 3, because of an additional () subshell with multiple commands.
This assumes that the /bin/sh which system uses does get delegated to bash.
Note: when the parent exits first the child is re-parented by init and all is well (no zombies).
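The same behavior can be checked outside Perl. Here is a small Python sketch, assuming a Unix system with sh and sleep available: the shell that starts the job returns at once, and the backgrounded subshell still finishes its work afterwards. The function name and the temporary file are only there for the demonstration.

```python
import os
import subprocess
import tempfile
import time


def launch_background_job(delay=1):
    """Start a backgrounded subshell and report how long the launch took."""
    fd, path = tempfile.mkstemp()
    os.close(fd)

    start = time.monotonic()
    # sh backgrounds the ( ... ) subshell and exits; run() only waits for sh.
    subprocess.run(
        ["sh", "-c", '(sleep "$1"; echo hi > "$2") &', "sh", str(delay), path],
        check=True,
    )
    launch_seconds = time.monotonic() - start

    # The launching shell is long gone, yet the job completes on its own.
    time.sleep(delay + 1)
    with open(path) as handle:
        content = handle.read()
    os.remove(path)
    return launch_seconds, content


if __name__ == "__main__":
    print(launch_background_job())
```

The launch time should be well under the delay, and the file should contain hi once the job has had time to finish.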
† From system
Does exactly the same thing as exec, except that a fork is done first and the parent process waits for the child process to exit. Note that argument processing varies depending on the number of arguments. If there is more than one argument in LIST, or if LIST is an array with more than one value, starts the program given by the first element of the list with arguments given by the rest of the list. If there is only one scalar argument, the argument is checked for shell metacharacters, and if there are any, the entire argument is passed to the system's command shell for parsing (this is /bin/sh -c on Unix platforms, but varies on other platforms). If there are no shell metacharacters in the argument, it is split into words and passed directly to execvp, which is more efficient. ...
The "... starts the program given by the first element ..." also means by execvp, see exec.
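For comparison, Python's subprocess module makes the same distinction explicit instead of inspecting the string for metacharacters. This is a sketch of the analogous behavior, not of Perl's implementation; the two helper names are made up.

```python
import subprocess


def run_via_shell(command):
    """One string, parsed by /bin/sh: metacharacters such as | are honored."""
    done = subprocess.run(command, shell=True, capture_output=True, text=True)
    return done.stdout


def run_directly(argv):
    """A list, executed without a shell: every element is a literal argument."""
    done = subprocess.run(argv, capture_output=True, text=True)
    return done.stdout


if __name__ == "__main__":
    print(run_via_shell("echo a | tr a b"), end="")
    print(run_directly(["echo", "a", "|", "tr", "a", "b"]), end="")
```

In the first call the shell builds a pipeline; in the second, echo simply receives | and the following words as arguments.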

Terminal prompt disappears when a named pipe is used

I'm trying to use named pipes in a project. I have two terminals open, Terminal A and Terminal B.
In terminal A, I issued this command:
mkfifo myFifo && tail -f myFifo | csh -s
It seems as if standard out is being redirected somewhere else, though, because my prompt disappears and some commands aren't reflected in terminal A.
For example, if in terminal B I begin a python session via issuing echo "python" > myFifo, then echo "print 'Hello, World'" > myFifo, I don't see Hello, World in terminal A.
However, if I issue echo ls > myFifo within terminal B, I see the correct output from ls in terminal A.
Does anyone know why sometimes the output appears and sometimes it doesn't?
I'm running on CentOS 6.6
Thanks,
erip
You read from the FIFO with csh. If you start an interactive Python shell in csh, then csh won't be reading from the FIFO, because it's busy running python.
Python doesn't somehow automagically do a REPL on the FIFO. How should it even know about the FIFO? It has no knowledge of it.
You could, perhaps, tell Python to read commands from the FIFO with something like:
>>> import os, sys, time
>>> fifo = open(os.open('myFifo', os.O_RDONLY | os.O_NONBLOCK), 'r')
And then:
$ echo 'print(42+5)' > ! myFifo
Will give you:
>>> eval(fifo.read())
47
Perhaps there's also a way to tell Python to read commands from myFifo by overwriting sys.stdin, but I can't get that working in my testing.
It's a bit unclear to me what exactly you're trying to achieve here, though. I suspect there might be another solution which is much more appropriate to the problem you're having.
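If the goal really is to have Python evaluate whatever arrives on the FIFO, a minimal sketch could look like this. It assumes a Unix system and treats each line as a Python expression; eval_lines_from_fifo and demo are hypothetical names, and the writer thread stands in for the echo commands in terminal B.

```python
import os
import tempfile
import threading


def eval_lines_from_fifo(path):
    """Evaluate each line written to the FIFO as a Python expression."""
    results = []
    # Opening for reading blocks until some writer opens the FIFO.
    with open(path) as fifo:
        for line in fifo:  # the loop ends once every writer has closed its end
            line = line.strip()
            if line:
                # eval runs arbitrary code, so only feed it trusted input.
                results.append(eval(line))
    return results


def demo():
    """Create a FIFO, write two expressions from a thread, return the results."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "myFifo")
    os.mkfifo(path)

    def writer():
        with open(path, "w") as fifo:
            fifo.write("42+5\n'a'*2\n")

    thread = threading.Thread(target=writer)
    thread.start()
    results = eval_lines_from_fifo(path)
    thread.join()

    os.remove(path)
    os.rmdir(directory)
    return results


if __name__ == "__main__":
    print(demo())
```

Unlike the tail -f setup in the question, this reader stops as soon as the last writer closes the FIFO, so a long-running version would need to reopen it in a loop.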

In IPython every shell command is run by prefixing it with "!", but a few commands run without it. What is the reason behind this?

In case of "ls" command it runs with and without the prefix "!". In case of "cat fileName" it's the same, but when you consider "wc -l fileName" it works only with "!" prefix.
When you combine cat and wc command "cat fileName | wc -l" executed successfully without "!" prefix.
I don't understand the logic behind this prefix "!" in ipython.
Thank you in advance
(I am new to python programming, if it sounds silly question please forgive me.)
IPython tries to make interactive programming as comfortable as possible. Some commands like ls, cd or cat are basic commands for navigating in Unix shells. IPython, as a "Python shell", provides the same functionality for convenience, along with features like colored output.
The !command is for executing arbitrary shell code and is much more powerful. It can be used to run any command you can type in a normal shell and can also catch its output.
Compare ls with !ls. The former will print the content in your current directory with nice coloring. The latter will print the same list, but just plain text.
But note that you can do really cool things with !command:
files = !ls
for f in files:
    print("I like this file:", f)
This reads the output of ls into a Python list, files, which you can use in your code just like any other list.
To sum up: if you just want to navigate, you usually use the standard commands, if available. If you need to capture the output or run arbitrary programs, you have to use the !command syntax.
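Outside IPython, where the ! syntax is not available, roughly the same capture can be done with the standard library. This is a sketch of an equivalent, not of how IPython implements it; shell_lines is a made-up helper name.

```python
import subprocess


def shell_lines(command):
    """Run a shell command and return its standard output as a list of lines."""
    done = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return done.stdout.splitlines()


if __name__ == "__main__":
    for name in shell_lines("ls"):
        print("I like this file:", name)
```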

How can I find out what script, program, or shell executed my Perl script?

How would I determine what script, program, or shell executed my Perl script?
Example: I might want to have human readable output if executed from shell (customized for each type of shell), a different type of output if called as a script from another perl script, and a machine readable format if executed from a program such as a continuous integration server.
Motivation: I have a tool that changes its output based on which shell executes it. I'd normally implement this behavior as an option to the script, but this tool's design doesn't allow for options. Other shells have environment variables that indicate what shell is running. I'm working on a patch to support Powershell, which has no such special variable.
Edit: Many of these answers happen to be Linux specific. Unfortunately, PowerShell is for Windows. getppid, the $ENV{SHELL} variable, and shelling out to ps won't help in this case. This script needs to run cross-platform.
You use getppid(). Take this snippet in child.pl:
my $ppid = getppid();
system("ps --no-headers $ppid");
If you run it from the command line, system will show bash or similar (among other things). Execute it with system("perl child.pl"); in another script, e.g. parent.pl, and you will see that perl parent.pl executed it.
To capture just the name of the process with arguments (thanks to ikegami for the correct ps syntax):
my $ppid = getppid();
my $ps = `ps --no-headers -o cmd $ppid`;
chomp $ps;
EDIT: An alternative to this approach might be to create soft links to your script, make the different contexts use different links to access your script, and inspect $0 to build logic around that.
I would suggest a different approach to accomplish your goal. Instead of guessing at the context, make it more explicit. Each use case is wholly separate, so have three different interfaces.
1. A function which can be called inside a Perl program. This would likely return a Perl data structure. This is far easier, faster and more reliable than parsing script output. It would also serve as the basis for the scripts.
2. A script which outputs for the current shell. It can look at $ENV{SHELL} to discover what shell is running. For bonus points, provide a switch to explicitly override.
3. A script which can be called inside a non-Perl program, such as your continuous integration server, and issue machine readable output. XML and/or JSON or whatever.
2 and 3 would be just thin wrappers to format the data coming out of 1.
Each is tailored to fit its specific need. Each will work without heuristics. Each will be far simpler than trying to guess the context and what the user wants.
If you can't separate 2 and 3, have the continuous integration server set an environment variable and look for it.
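The layering described above might look something like this, sketched in Python rather than Perl. Everything named here is hypothetical: collect_status and its placeholder data stand in for the real tool, and TOOL_OUTPUT is an invented variable that the continuous integration server would have to set.

```python
import json
import os


def collect_status():
    """Core interface: return plain data (the values here are placeholders)."""
    return {"name": "example", "ok": True}


def format_for_shell(data):
    """Thin wrapper: human readable lines."""
    return "\n".join(f"{key}: {value}" for key, value in data.items())


def format_for_machine(data):
    """Thin wrapper: machine readable JSON."""
    return json.dumps(data, sort_keys=True)


def render(env=None):
    """Choose the wrapper from an explicit variable instead of guessing."""
    env = os.environ if env is None else env
    data = collect_status()
    # TOOL_OUTPUT is a hypothetical variable the CI server would set.
    if env.get("TOOL_OUTPUT") == "machine":
        return format_for_machine(data)
    return format_for_shell(data)


if __name__ == "__main__":
    print(render())
```

The point of the split is that callers who can import the core function never have to parse text at all.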
Depending on your environment, you may be able to pick it up from the environment variables. Consider the following code:
/usr/bin/perl -MData::Dumper -e 'print Dumper(\%ENV);' | grep sh
On my Ubuntu system, it gets me:
'SHELL' => '/bin/bash',
So I guess that says I'm running perl from a bash shell. If you use something else, the SHELL variable may give you a hint.
But let's say you know you're in bash, but perl is run from a subshell. Then try:
/bin/sh -c "/usr/bin/perl -MData::Dumper -e 'print Dumper(\%ENV);'" | grep sh
You will find:
'_' => '/bin/sh',
'SHELL' => '/bin/bash',
So the shell is still bash, but bash has a variable $_ which also shows the absolute filename of the shell or script being executed, which may also give a valuable hint. Similarly, for other environments there will most probably be clues left in the perl %ENV hash that should give you valuable hints.
If you're running PowerShell 2.0 or above (most likely), you can infer the shell as a parent process by examining the environment variable %psmodulepath%. By default, it points to the system modules under %windir%\system32\windowspowershell\v1.0\modules; this is what you would see if you examine the variable from cmd.exe.
However, when PowerShell starts up, it prepends the user's default module search path to this environment variable which looks like: %userprofile%\documents\windowspowershell\modules. This is inherited by child processes. So, your logic would be to test if %psmodulepath% starts with %userprofile% to detect powershell 2.0 or higher. This won't work in PowerShell 1.0 because it does not support modules.
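As a sketch of that test (in Python here, and only under the assumptions stated above), the check comes down to comparing the first PSModulePath entry with the user profile directory. The function name is made up, and the environment is passed in as a plain mapping so the logic can be tried on any platform.

```python
def launched_from_powershell(env):
    """Heuristic: does PSModulePath start with the user's profile directory?"""
    module_path = env.get("PSModulePath", "")
    user_profile = env.get("USERPROFILE", "")
    if not module_path or not user_profile:
        return False
    first_entry = module_path.split(";")[0]
    # Windows paths are case-insensitive, so compare lowercased.
    return first_entry.lower().startswith(user_profile.lower())


if __name__ == "__main__":
    import os

    print(launched_from_powershell(os.environ))
```

This would miss setups where the documents folder has been moved out of the profile directory, so treat it as a hint rather than proof.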
This is on Windows XP with PowerShell v2.0, so take it with a grain of salt.
In a cmd.exe shell, I get:
PSModulePath=C:\WINDOWS\system32\WindowsPowerShell\v1.0\Modules\
whereas in the PowerShell console window, I get:
PSModulePath=E:\Home\user\WindowsPowerShell\Modules;C:\WINDOWS\system32\WindowsPowerShell\v1.0\Modules\
where E:\Home\user is where my "My Documents" folder is. So, one heuristic may be to check if PSModulePath contains a user dependent path.
In addition, in a console window, I get:
!::=::\
in the environment. From the PowerShell ISE, I get:
!::=::\
!C:=C:\Documents and Settings\user

Running command using XARG

I have example file: test_file
//--- Test File--
RUN_THIS
RUN_THIS00
RUN_THIS01
DONT_RUN00
DONT_RUN00
RUN_THIS02
where RUN_THIS* & DONT_RUN* are commands.
I would like to run only RUN_THIS commands from test_file without editing the file.
I am looking for option like
cat test_file | grep RUN_THIS | xargs {Some option to be provided to run run_this}
I cannot start a new shell.
Something like this perhaps?
for cmd in $(grep RUN_THIS < test_file); do
    $cmd --some-option-to-be-provided-to-run-this
done
That should work okay as long as there are no spaces in the commands in test_file.
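If spaces or quoted arguments do turn up, one way around the word-splitting problem is to parse each line separately. Here is a sketch in Python, which also avoids starting a shell because the commands are executed directly; select_commands and run_selected are made-up names, and the RUN_THIS marker comes from the question.

```python
import shlex
import subprocess


def select_commands(lines, marker="RUN_THIS"):
    """Return an argv list for every line that contains the marker."""
    commands = []
    for line in lines:
        line = line.strip()
        if marker in line:
            # shlex.split keeps quoted arguments together, like a shell would.
            commands.append(shlex.split(line))
    return commands


def run_selected(lines, marker="RUN_THIS"):
    """Run each selected command without a shell and collect the exit codes."""
    return [subprocess.run(argv).returncode for argv in select_commands(lines, marker)]
```

Usage would be something like opening test_file and passing the file object to run_selected, since iterating over a file yields its lines.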
eval `grep RUN_THIS test_file`
Note also the avoidance of a Useless Use of Cat.
Actually, you may have to add a semicolon to the end of each command in test_file, or change the grep to something which adds the necessary semicolons.
eval `awk '/RUN_THIS/ { print; print ";" }' test_file`
I'm not sure I understand the requirement to not start a new shell. Under the hood, the backticks run a subshell, so this might violate that requirement (but then ultimately every external command starts a new process, which starts out as a fork of the current shell process when you run a shell script). If you are scared of security implications, you should not be using a shell script in the first place, anyhow.
To run new shells you need to incorporate "ksh" in your command.
In its simplest form
RUN_THIS00='ls'
echo $RUN_THIS00 | ksh