posix shell: reading entries with spaces from a textfile into variables - sh

I want to pass around (lots of) information between scripts, ideally having in- /output of a script persistent.
Using simple textfiles seems rather fitting, but problems arise as soon as entries have spaces.
Whats the preferable way of storing variables in one shell script and reading them in another?
In the examples below I expect the output to be
'publicname' 'x86' '"test directory"' '5' rem:'some other parameters in the future'
Version 1:
This would be my prefered format of the textfile, no escaping necessary, spaces would be entered within quotes.
echo 'publicname x86 "test directory" 5 some other parameters in the future
' >/tmp/input.txt
while read -r name archname vcdir rev rem; do
set -e
[ -n "$name" -a "$name" != "#" ] || continue
echo "'$name' '$archname' '$vcdir' '$rev' rem:'$rem'"
done < /tmp/input.txt
Outputs
'publicname' 'x86' '"test' 'directory"' rem:'5 some other parameters in the future'
Version 2:
Produces correct output, put the input needs to be escaped for the shell. This is problematic since ideally the input would be used with other languages, and be generated by shell scripts too (need to escape variables for export)
echo 'publicname x86 test\ directory 5 some other parameters in the future
' >/tmp/input.txt
while read name archname vcdir rev rem; do
set -e
[ -n "$name" -a "$name" != "#" ] || continue
echo "'$name' '$archname' '$vcdir' '$rev' rem:'$rem'"
done < /tmp/input.txt

The general rule is: use a delimiter that can't appear in a field. Let's say a comma is safe:
echo 'publicname,x86,test directory,5,some other parameters in the future' > /tmp/input.txt
while IFS=, read -r name archname vcdir rev rem; do
printf "'%s' '%s' '%s' '%s' rem:'%s'\n" "$name" "$archname" "$vcdir" "$rev" "$rem"
done < /tmp/input.txt
If a comma could appear in a field, you might want to try a less-common delimiter, such as an ASCII control character.
printf 'publicname\037x86\037test directory\0375\037some other parameters in the future\n' > /tmp/input.txt
while IFS=$(printf '\037') read -r name archname vcdir rev rem; do
printf "'%s' '%s' '%s' '%s' rem:'%s'\n" "$name" "$archname" "$vcdir" "$rev" "$rem"
done < /tmp/input.txt
A simpler alternative is to use a newline, then call read multiple times.
printf 'publicname\nx86\ntest directory\n5\nsome other parameters in the future\n' > /tmp/input.txt
while read -r name; read -r archname; read -r vcdir; read -r rev; read -r rem; do
printf "'%s' '%s' '%s' '%s' rem:'%s'\n" "$name" "$archname" "$vcdir" "$rev" "$rem"
done < /tmp/input.txt
As a last resort, you can try using the null character as the delimiter. [No example, because I am blanking on how to handle null-delimited data in Posix shell, but if you've gotten this far without a feasible solution, shell is probably not the right language to be using.]

Related

Bash adding single quotes in variable [duplicate]

Let's say, you have a Bash alias like:
alias rxvt='urxvt'
which works fine.
However:
alias rxvt='urxvt -fg '#111111' -bg '#111111''
won't work, and neither will:
alias rxvt='urxvt -fg \'#111111\' -bg \'#111111\''
So how do you end up matching up opening and closing quotes inside a string once you have escaped quotes?
alias rxvt='urxvt -fg'\''#111111'\'' -bg '\''#111111'\''
seems ungainly although it would represent the same string if you're allowed to concatenate them like that.
If you really want to use single quotes in the outermost layer, remember that you can glue both kinds of quotation. Example:
alias rxvt='urxvt -fg '"'"'#111111'"'"' -bg '"'"'#111111'"'"
# ^^^^^ ^^^^^ ^^^^^ ^^^^
# 12345 12345 12345 1234
Explanation of how '"'"' is interpreted as just ':
' End first quotation which uses single quotes.
" Start second quotation, using double-quotes.
' Quoted character.
" End second quotation, using double-quotes.
' Start third quotation, using single quotes.
If you do not place any whitespaces between (1) and (2), or between (4) and (5), the shell will interpret that string as a one long word.
I always just replace each embedded single quote with the sequence: '\'' (that is: quote backslash quote quote) which closes the string, appends an escaped single quote and reopens the string.
I often whip up a "quotify" function in my Perl scripts to do this for me. The steps would be:
s/'/'\\''/g # Handle each embedded quote
$_ = qq['$_']; # Surround result with single quotes.
This pretty much takes care of all cases.
Life gets more fun when you introduce eval into your shell-scripts. You essentially have to re-quotify everything again!
For example, create a Perl script called quotify containing the above statements:
#!/usr/bin/perl -pl
s/'/'\\''/g;
$_ = qq['$_'];
then use it to generate a correctly-quoted string:
$ quotify
urxvt -fg '#111111' -bg '#111111'
result:
'urxvt -fg '\''#111111'\'' -bg '\''#111111'\'''
which can then be copy/pasted into the alias command:
alias rxvt='urxvt -fg '\''#111111'\'' -bg '\''#111111'\'''
(If you need to insert the command into an eval, run the quotify again:
$ quotify
alias rxvt='urxvt -fg '\''#111111'\'' -bg '\''#111111'\'''
result:
'alias rxvt='\''urxvt -fg '\''\'\'''\''#111111'\''\'\'''\'' -bg '\''\'\'''\''#111111'\''\'\'''\'''\'''
which can be copy/pasted into an eval:
eval 'alias rxvt='\''urxvt -fg '\''\'\'''\''#111111'\''\'\'''\'' -bg '\''\'\'''\''#111111'\''\'\'''\'''\'''
Since Bash 2.04 syntax $'string' allows a limit set of escapes.
Since Bash 4.4, $'string' also allows the full set of C-style escapes, making the behavior differ slightly in $'string' in previous versions. (Previously the $('string') form could be used.)
Simple example in Bash 2.04 and newer:
$> echo $'aa\'bb'
aa'bb
$> alias myvar=$'aa\'bb'
$> alias myvar
alias myvar='aa'\''bb'
In your case:
$> alias rxvt=$'urxvt -fg \'#111111\' -bg \'#111111\''
$> alias rxvt
alias rxvt='urxvt -fg '\''#111111'\'' -bg '\''#111111'\'''
Common escaping sequences works as expected:
\' single quote
\" double quote
\\ backslash
\n new line
\t horizontal tab
\r carriage return
Below is copy+pasted related documentation from man bash (version 4.4):
Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded as follows:
\a alert (bell)
\b backspace
\e
\E an escape character
\f form feed
\n new line
\r carriage return
\t horizontal tab
\v vertical tab
\\ backslash
\' single quote
\" double quote
\? question mark
\nnn the eight-bit character whose value is the octal
value nnn (one to three digits)
\xHH the eight-bit character whose value is the hexadecimal
value HH (one or two hex digits)
\uHHHH the Unicode (ISO/IEC 10646) character whose value is
the hexadecimal value HHHH (one to four hex digits)
\UHHHHHHHH the Unicode (ISO/IEC 10646) character whose value
is the hexadecimal value HHHHHHHH (one to eight
hex digits)
\cx a control-x character
The expanded result is single-quoted, as if the dollar sign had not been present.
See Quotes and escaping: ANSI C like strings on bash-hackers.org wiki for more details. Also note that "Bash Changes" file (overview here) mentions a lot for changes and bug fixes related to the $'string' quoting mechanism.
According to unix.stackexchange.com How to use a special character as a normal one? it should work (with some variations) in bash, zsh, mksh, ksh93 and FreeBSD and busybox sh.
I don't see the entry on his blog (link pls?) but according to the gnu reference manual:
Enclosing characters in single quotes
(‘'’) preserves the literal value of
each character within the quotes. A
single quote may not occur between
single quotes, even when preceded by a
backslash.
so bash won't understand:
alias x='y \'z '
however, you can do this if you surround with double quotes:
alias x="echo \'y "
> x
> 'y
I can confirm that using '\'' for a single quote inside a single-quoted string does work in Bash, and it can be explained in the same way as the "gluing" argument from earlier in the thread. Suppose we have a quoted string: 'A '\''B'\'' C' (all quotes here are single quotes). If it is passed to echo, it prints the following: A 'B' C.
In each '\'' the first quote closes the current single-quoted string, the following \' glues a single quote to the previous string (\' is a way to specify a single quote without starting a quoted string), and the last quote opens another single-quoted string.
Both versions are working, either with concatenation by using the escaped single quote character (\'), or with concatenation by enclosing the single quote character within double quotes ("'").
The author of the question did not notice that there was an extra single quote (') at the end of his last escaping attempt:
alias rxvt='urxvt -fg'\''#111111'\'' -bg '\''#111111'\''
│ │┊┊| │┊┊│ │┊┊│ │┊┊│
└─STRING──┘┊┊└─STRIN─┘┊┊└─STR─┘┊┊└─STRIN─┘┊┊│
┊┊ ┊┊ ┊┊ ┊┊│
┊┊ ┊┊ ┊┊ ┊┊│
└┴─────────┴┴───┰───┴┴─────────┴┘│
All escaped single quotes │
│
?
As you can see in the previous nice piece of ASCII/Unicode art, the last escaped single quote (\') is followed by an unnecessary single quote ('). Using a syntax-highlighter like the one present in Notepad++ can prove very helpful.
The same is true for another example like the following one:
alias rc='sed '"'"':a;N;$!ba;s/\n/, /g'"'"
alias rc='sed '\'':a;N;$!ba;s/\n/, /g'\'
These two beautiful instances of aliases show in a very intricate and obfuscated way how a file can be lined down. That is, from a file with a lot of lines you get only one line with commas and spaces between the contents of the previous lines. In order to make sense of the previous comment, the following is an example:
$ cat Little_Commas.TXT
201737194
201802699
201835214
$ rc Little_Commas.TXT
201737194, 201802699, 201835214
Simple example of escaping quotes in shell:
$ echo 'abc'\''abc'
abc'abc
$ echo "abc"\""abc"
abc"abc
It's done by finishing already opened one ('), placing escaped one (\'), then opening another one ('). This syntax works for all commands. It's very similar approach to the 1st answer.
I'm not specifically addressing the quoting issue because, well, sometimes, it's just reasonable to consider an alternative approach.
rxvt() { urxvt -fg "#${1:-000000}" -bg "#${2:-FFFFFF}"; }
which you can then call as:
rxvt 123456 654321
the idea being that you can now alias this without concern for quotes:
alias rxvt='rxvt 123456 654321'
or, if you need to include the # in all calls for some reason:
rxvt() { urxvt -fg "${1:-#000000}" -bg "${2:-#FFFFFF}"; }
which you can then call as:
rxvt '#123456' '#654321'
then, of course, an alias is:
alias rxvt="rxvt '#123456' '#654321'"
(oops, i guess i kind of did address the quoting :)
How to escape single quotes (') and double quotes (") with hex and octal chars
If using something like echo, I've had some really complicated and really weird and hard-to-escape (think: very nested) cases where the only thing I could get to work was using octal or hex codes!
Here are some basic examples just to demonstrate how it works:
1. Single quote example, where ' is escaped with hex \x27 or octal \047 (its corresponding ASCII code):
hex \x27
echo -e "Let\x27s get coding!"
# OR
echo -e 'Let\x27s get coding!'
Result:
Let's get coding!
octal \047
echo -e "Let\047s get coding!"
# OR
echo -e 'Let\047s get coding!'
Result:
Let's get coding!
2. Double quote example, where " is escaped with hex \x22 or octal \042 (its corresponding ASCII code).
Note: bash is nuts! Sometimes even the ! char has special meaning, and must either be removed from within the double quotes and then escaped "like this"\! or put entirely within single quotes 'like this!', rather than within double quotes.
# 1. hex; also escape `!` by removing it from within the double quotes
# and escaping it with `\!`
$ echo -e "She said, \x22Let\x27s get coding"\!"\x22"
She said, "Let's get coding!"
# OR put it all within single quotes:
$ echo -e 'She said, \x22Let\x27s get coding!\x22'
She said, "Let's get coding!"
# 2. octal; also escape `!` by removing it from within the double quotes
$ echo -e "She said, \042Let\047s get coding"\!"\042"
She said, "Let's get coding!"
# OR put it all within single quotes:
$ echo -e 'She said, \042Let\047s get coding!\042'
She said, "Let's get coding!"
# 3. mixed hex and octal, just for fun
# also escape `!` by removing it from within the double quotes when it is followed by
# another escape sequence
$ echo -e "She said, \x22Let\047s get coding! It\x27s waaay past time to begin"\!"\042"
She said, "Let's get coding! It's waaay past time to begin!"
# OR put it all within single quotes:
$ echo -e 'She said, \x22Let\047s get coding! It\x27s waaay past time to begin!\042'
She said, "Let's get coding! It's waaay past time to begin!"
Note that if you don't properly escape !, when needed, as I've shown two ways to do above, you'll get some weird errors, like this:
$ echo -e "She said, \x22Let\047s get coding! It\x27s waaay past time to begin!\042"
bash: !\042: event not found
OR:
$ echo -e "She said, \x22Let\x27s get coding!\x22"
bash: !\x22: event not found
One more alternative: this allows mixed expansion and non-expansion all within the same bash string
Here is another demo of an alternative escaping technique.
First, read the main answer by #liori to see how the 2nd form below works. Now, read these two alternative ways of escaping characters. Both examples below are identical in their output:
CMD="gs_set_title"
# 1. 1st technique: escape the $ symbol with a backslash (\) so it doesn't
# run and expand the command following it
echo "$CMD '\$(basename \"\$(pwd)\")'"
# 2. 2nd technique (does the same thing in a different way): escape the
# $ symbol using single quotes around it, and the single quote (') symbol
# using double quotes around it
echo "$CMD ""'"'$(basename "$(pwd)")'"'"
Sample output:
gs_set_title '$(basename "$(pwd)")'
gs_set_title '$(basename "$(pwd)")'
Note: for my gs_set_title bash function, which I have in my ~/.bash_aliases file somewhere around here, see my other answer here.
References:
https://en.wikipedia.org/wiki/ASCII#Printable_characters
https://serverfault.com/questions/208265/what-is-bash-event-not-found/208266#208266
See also my other answer here: How do I write non-ASCII characters using echo?.
I just use shell codes.. e.g. \x27 or \\x22 as applicable. No hassle, ever really.
Since one cannot put single quotes within single quoted strings, the simplest and most readable option is to use a HEREDOC string
command=$(cat <<'COMMAND'
urxvt -fg '#111111' -bg '#111111'
COMMAND
)
alias rxvt=$command
In the code above, the HEREDOC is sent to the cat command and the output of that is assigned to a variable via the command substitution notation $(..)
Putting a single quote around the HEREDOC is needed since it is within a $()
IMHO the real answer is that you can't escape single-quotes within single-quoted strings.
Its impossible.
If we presume we are using bash.
From bash manual...
Enclosing characters in single quotes preserves the literal value of each
character within the quotes. A single quote may not occur
between single quotes, even when preceded by a backslash.
You need to use one of the other string escape mechanisms " or \
There is nothing magic about alias that demands it use single quotes.
Both the following work in bash.
alias rxvt="urxvt -fg '#111111' -bg '#111111'"
alias rxvt=urxvt\ -fg\ \'#111111\'\ -bg\ \'#111111\'
The latter is using \ to escape the space character.
There is also nothing magic about #111111 that requires single quotes.
The following options achieves the same result the other two options, in that the rxvt alias works as expected.
alias rxvt='urxvt -fg "#111111" -bg "#111111"'
alias rxvt="urxvt -fg \"#111111\" -bg \"#111111\""
You can also escape the troublesome # directly
alias rxvt="urxvt -fg \#111111 -bg \#111111"
A minimal answer is needed so that people can get going without spending a lot of time as I had to sifting through people waxing eloquent.
There is no way to escape single quotes or anything else within single quotes.
The following is, perhaps surprisingly, a complete command:
$ echo '\'
whose output is:
\
Backslashes, surprisingly to even long-time users of bash, have no meaning inside single quotes. Nor does anything else.
Most of these answers hit on the specific case you're asking about. There is a general approach that a friend and I have developed that allows for arbitrary quoting in case you need to quote bash commands through multiple layers of shell expansion, e.g., through ssh, su -c, bash -c, etc. There is one core primitive you need, here in native bash:
quote_args() {
local sq="'"
local dq='"'
local space=""
local arg
for arg; do
echo -n "$space'${arg//$sq/$sq$dq$sq$dq$sq}'"
space=" "
done
}
This does exactly what it says: it shell-quotes each argument individually (after bash expansion, of course):
$ quote_args foo bar
'foo' 'bar'
$ quote_args arg1 'arg2 arg2a' arg3
'arg1' 'arg2 arg2a' 'arg3'
$ quote_args dq'"'
'dq"'
$ quote_args dq'"' sq"'"
'dq"' 'sq'"'"''
$ quote_args "*"
'*'
$ quote_args /b*
'/bin' '/boot'
It does the obvious thing for one layer of expansion:
$ bash -c "$(quote_args echo a'"'b"'"c arg2)"
a"b'c arg2
(Note that the double quotes around $(quote_args ...) are necessary to make the result into a single argument to bash -c.) And it can be used more generally to quote properly through multiple layers of expansion:
$ bash -c "$(quote_args bash -c "$(quote_args echo a'"'b"'"c arg2)")"
a"b'c arg2
The above example:
shell-quotes each argument to the inner quote_args individually and then combines the resulting output into a single argument with the inner double quotes.
shell-quotes bash, -c, and the already once-quoted result from step 1, and then combines the result into a single argument with the outer double quotes.
sends that mess as the argument to the outer bash -c.
That's the idea in a nutshell. You can do some pretty complicated stuff with this, but you have to be careful about order of evaluation and about which substrings are quoted. For instance, the following do the wrong things (for some definition of "wrong"):
$ (cd /tmp; bash -c "$(quote_args cd /; pwd 1>&2)")
/tmp
$ (cd /tmp; bash -c "$(quote_args cd /; [ -e *sbin ] && echo success 1>&2 || echo failure 1>&2)")
failure
In the first example, bash immediately expands quote_args cd /; pwd 1>&2 into two separate commands, quote_args cd / and pwd 1>&2, so the CWD is still /tmp when the pwd command is executed. The second example illustrates a similar problem for globbing. Indeed, the same basic problem occurs with all bash expansions. The problem here is that a command substitution isn't a function call: it's literally evaluating one bash script and using its output as part of another bash script.
If you try to simply escape the shell operators, you'll fail because the resulting string passed to bash -c is just a sequence of individually-quoted strings that aren't then interpreted as operators, which is easy to see if you echo the string that would have been passed to bash:
$ (cd /tmp; echo "$(quote_args cd /\; pwd 1\>\&2)")
'cd' '/;' 'pwd' '1>&2'
$ (cd /tmp; echo "$(quote_args cd /\; \[ -e \*sbin \] \&\& echo success 1\>\&2 \|\| echo failure 1\>\&2)")
'cd' '/;' '[' '-e' '*sbin' ']' '&&' 'echo' 'success' '1>&2' '||' 'echo' 'failure' '1>&2'
The problem here is that you're over-quoting. What you need is for the operators to be unquoted as input to the enclosing bash -c, which means they need to be outside the $(quote_args ...) command substitution.
Consequently, what you need to do in the most general sense is to shell-quote each word of the command not intended to be expanded at the time of command substitution separately, and not apply any extra quoting to the shell operators:
$ (cd /tmp; echo "$(quote_args cd /); $(quote_args pwd) 1>&2")
'cd' '/'; 'pwd' 1>&2
$ (cd /tmp; bash -c "$(quote_args cd /); $(quote_args pwd) 1>&2")
/
$ (cd /tmp; echo "$(quote_args cd /); [ -e *$(quote_args sbin) ] && $(quote_args echo success) 1>&2 || $(quote_args echo failure) 1>&2")
'cd' '/'; [ -e *'sbin' ] && 'echo' 'success' 1>&2 || 'echo' 'failure' 1>&2
$ (cd /tmp; bash -c "$(quote_args cd /); [ -e *$(quote_args sbin) ] && $(quote_args echo success) 1>&2 || $(quote_args echo failure) 1>&2")
success
Once you've done this, the entire string is fair game for further quoting to arbitrary levels of evaluation:
$ bash -c "$(quote_args cd /tmp); $(quote_args bash -c "$(quote_args cd /); $(quote_args pwd) 1>&2")"
/
$ bash -c "$(quote_args bash -c "$(quote_args cd /tmp); $(quote_args bash -c "$(quote_args cd /); $(quote_args pwd) 1>&2")")"
/
$ bash -c "$(quote_args bash -c "$(quote_args bash -c "$(quote_args cd /tmp); $(quote_args bash -c "$(quote_args cd /); $(quote_args pwd) 1>&2")")")"
/
$ bash -c "$(quote_args cd /tmp); $(quote_args bash -c "$(quote_args cd /); [ -e *$(quote_args sbin) ] && $(quote_args echo success) 1>&2 || $(quote_args echo failure) 1>&2")"
success
$ bash -c "$(quote_args bash -c "$(quote_args cd /tmp); $(quote_args bash -c "$(quote_args cd /); [ -e *sbin ] && $(quote_args echo success) 1>&2 || $(quote_args echo failure) 1>&2")")"
success
$ bash -c "$(quote_args bash -c "$(quote_args bash -c "$(quote_args cd /tmp); $(quote_args bash -c "$(quote_args cd /); [ -e *$(quote_args sbin) ] && $(quote_args echo success) 1>&2 || $(quote_args echo failure) 1>&2")")")"
success
etc.
These examples may seem overwrought given that words like success, sbin, and pwd don't need to be shell-quoted, but the key point to remember when writing a script taking arbitrary input is that you want to quote everything you're not absolutely sure doesn't need quoting, because you never know when a user will throw in a Robert'; rm -rf /.
To better understand what is going on under the covers, you can play around with two small helper functions:
debug_args() {
for (( I=1; $I <= $#; I++ )); do
echo -n "$I:<${!I}> " 1>&2
done
echo 1>&2
}
debug_args_and_run() {
debug_args "$#"
"$#"
}
that will enumerate each argument to a command before executing it:
$ debug_args_and_run echo a'"'b"'"c arg2
1:<echo> 2:<a"b'c> 3:<arg2>
a"b'c arg2
$ bash -c "$(quote_args debug_args_and_run echo a'"'b"'"c arg2)"
1:<echo> 2:<a"b'c> 3:<arg2>
a"b'c arg2
$ bash -c "$(quote_args debug_args_and_run bash -c "$(quote_args debug_args_and_run echo a'"'b"'"c arg2)")"
1:<bash> 2:<-c> 3:<'debug_args_and_run' 'echo' 'a"b'"'"'c' 'arg2'>
1:<echo> 2:<a"b'c> 3:<arg2>
a"b'c arg2
$ bash -c "$(quote_args debug_args_and_run bash -c "$(quote_args debug_args_and_run bash -c "$(quote_args debug_args_and_run echo a'"'b"'"c arg2)")")"
1:<bash> 2:<-c> 3:<'debug_args_and_run' 'bash' '-c' ''"'"'debug_args_and_run'"'"' '"'"'echo'"'"' '"'"'a"b'"'"'"'"'"'"'"'"'c'"'"' '"'"'arg2'"'"''>
1:<bash> 2:<-c> 3:<'debug_args_and_run' 'echo' 'a"b'"'"'c' 'arg2'>
1:<echo> 2:<a"b'c> 3:<arg2>
a"b'c arg2
$ bash -c "$(quote_args debug_args_and_run bash -c "$(quote_args debug_args_and_run bash -c "$(quote_args debug_args_and_run bash -c "$(quote_args debug_args_and_run echo a'"'b"'"c arg2)")")")"
1:<bash> 2:<-c> 3:<'debug_args_and_run' 'bash' '-c' ''"'"'debug_args_and_run'"'"' '"'"'bash'"'"' '"'"'-c'"'"' '"'"''"'"'"'"'"'"'"'"'debug_args_and_run'"'"'"'"'"'"'"'"' '"'"'"'"'"'"'"'"'echo'"'"'"'"'"'"'"'"' '"'"'"'"'"'"'"'"'a"b'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'c'"'"'"'"'"'"'"'"' '"'"'"'"'"'"'"'"'arg2'"'"'"'"'"'"'"'"''"'"''>
1:<bash> 2:<-c> 3:<'debug_args_and_run' 'bash' '-c' ''"'"'debug_args_and_run'"'"' '"'"'echo'"'"' '"'"'a"b'"'"'"'"'"'"'"'"'c'"'"' '"'"'arg2'"'"''>
1:<bash> 2:<-c> 3:<'debug_args_and_run' 'echo' 'a"b'"'"'c' 'arg2'>
1:<echo> 2:<a"b'c> 3:<arg2>
a"b'c arg2
In the given example, simply used double quotes instead of single quotes as outer escape mechanism:
alias rxvt="urxvt -fg '#111111' -bg '#111111'"
This approach is suited for many cases where you just want to pass a fixed string to a command: Just check how the shell will interpret the double-quoted string through an echo, and escape characters with backslash if necessary.
In the example, you'd see that double quotes are sufficient to protect the string:
$ echo "urxvt -fg '#111111' -bg '#111111'"
urxvt -fg '#111111' -bg '#111111'
Here is an elaboration on The One True Answer referenced above:
Sometimes I will be downloading using rsync over ssh and have to escape a filename with a ' in it TWICE! (OMG!) Once for bash and once for ssh. The same principle of alternating quotation delimiters is at work here.
For example, let's say we want to get: Louis Theroux's LA Stories ...
First you enclose Louis Theroux in single quotes for bash and double quotes for ssh:
'"Louis Theroux"'
Then you use single quotes to escape a double quote '"'
The use double quotes to escape the apostrophe "'"
Then repeat #2, using single quotes to escape a double quote '"'
Then enclose LA Stories in single quotes for bash and double quotes for ssh: '"LA Stories"'
And behold! You wind up with this:
rsync -ave ssh '"Louis Theroux"''"'"'"'"''"s LA Stories"'
which is an awful lot of work for one little ' -- but there you go
Obviously, it would be easier simply to surround with double quotes, but where's the challenge in that? Here is the answer using only single quotes. I'm using a variable instead of alias so that's it's easier to print for proof, but it's the same as using alias.
$ rxvt='urxvt -fg '\''#111111'\'' -bg '\''#111111'\'
$ echo $rxvt
urxvt -fg '#111111' -bg '#111111'
Explanation
The key is that you can close the single quote and re-open it as many times as you want. For example foo='a''b' is the same as foo='ab'. So you can close the single quote, throw in a literal single quote \', then reopen the next single quote.
Breakdown diagram
This diagram makes it clear by using brackets to show where the single quotes are opened and closed. Quotes are not "nested" like parentheses can be. You can also pay attention to the color highlighting, which is correctly applied. The quoted strings are maroon, whereas the \' is black.
'urxvt -fg '\''#111111'\'' -bg '\''#111111'\' # original
[^^^^^^^^^^] ^[^^^^^^^] ^[^^^^^] ^[^^^^^^^] ^ # show open/close quotes
urxvt -fg ' #111111 ' -bg ' #111111 ' # literal characters remaining
(This is essentially the same answer as Adrian's, but I feel this explains it better. Also his answer has 2 superfluous single quotes at the end.)
in addition to #JasonWoof perfect answer i want to show how i solved related problem
in my case encoding single quotes with '\'' will not always be sufficient, for example if a string must quoted with single quotes, but the total count of quotes results in odd amount
#!/bin/bash
# no closing quote
string='alecxs\'solution'
# this works for string
string="alecxs'solution"
string=alecxs\'solution
string='alecxs'\''solution'
let's assume string is a file name and we need to save quoted file names in a list (like stat -c%N ./* > list)
echo "'$string'" > "$string"
cat "$string"
but processing this list will fail (depending on how many quotes the string does contain in total)
while read file
do
ls -l "$file"
eval ls -l "$file"
done < "$string"
workaround: encode quotes with string manipulation
string="${string//$'\047'/\'\$\'\\\\047\'\'}"
# result
echo "$string"
now it works because quotes are always balanced
echo "'$string'" > list
while read file
do
ls -l "$file"
eval ls -l "$file"
done < list
Hope this helps when facing similar problem
Another way to fix the problem of too many layers of nested quotation:
You are trying to cram too much into too tiny a space, so use a bash function.
The problem is you are trying to have too many levels of nesting, and the basic alias technology is not powerful enough to accommodate. Use a bash function like this to make it so the single, double quotes back ticks and passed in parameters are all handled normally as we would expect:
lets_do_some_stuff() {
tmp=$1 #keep a passed in parameter.
run_your_program $# #use all your passed parameters.
echo -e '\n-------------' #use your single quotes.
echo `date` #use your back ticks.
echo -e "\n-------------" #use your double quotes.
}
alias foobarbaz=lets_do_some_stuff
Then you can use your $1 and $2 variables and single, double quotes and back ticks without worrying about the alias function wrecking their integrity.
This program prints:
el#defiant ~/code $ foobarbaz alien Dyson ring detected #grid 10385
alien Dyson ring detected #grid 10385
-------------
Mon Oct 26 20:30:14 EDT 2015
-------------
shell_escape () {
printf '%s' "'${1//\'/\'\\\'\'}'"
}
Implementation explanation:
double quotes so we can easily output wrapping single quotes and use the ${...} syntax
bash's search and replace looks like: ${varname//search/replacement}
we're replacing ' with '\''
'\'' encodes a single ' like so:
' ends the single quoting
\' encodes a ' (the backslash is needed because we're not inside quotes)
' starts up single quoting again
bash automatically concatenates strings with no white space between
there's a \ before every \ and ' because that's the escaping rules for ${...//.../...} .
string="That's "'##$*&^`(##'
echo "original: $string"
echo "encoded: $(shell_escape "$string")"
echo "expanded: $(bash -c "echo $(shell_escape "$string")")"
P.S. Always encode to single quoted strings because they are way simpler than double quoted strings.
Here are my two cents -- in the case if one wants to be sh-portable, not just bash-specific ( the solution is not too efficient, though, as it starts an external program -- sed ):
put this in quote.sh ( or just quote ) somewhere on your PATH :
# this works with standard input (stdin)
quote() {
echo -n "'" ;
sed 's/\(['"'"']['"'"']*\)/'"'"'"\1"'"'"'/g' ;
echo -n "'"
}
case "$1" in
-) quote ;;
*) echo "usage: cat ... | quote - # single-quotes input for Bourne shell" 2>&1 ;;
esac
An example:
$ echo -n "G'day, mate!" | ./quote.sh -
'G'"'"'day, mate!'
And, of course, that converts back:
$ echo 'G'"'"'day, mate!'
G'day, mate!
Explanation: basically we have to enclose the input with quotes ', and then also replace any single quote within with this micro-monster: '"'"' ( end the opening quote with a pairing ', escape the found single quote by wrapping it with double quotes -- "'", and then finally issue a new opening single quote ', or in pseudo-notation : ' + "'" + ' == '"'"' )
One standard way to do that would be to use sed with the following substitution command:
s/\(['][']*\)/'"\1"'/g
One small problem, though, is that in order to use that in shell one needs to escape all these single quote characters in the sed expression itself -- what leads to something like
sed 's/\(['"'"']['"'"']*\)/'"'"'"\1"'"'"'/g'
( and one good way to build this result is to feed the original expression s/\(['][']*\)/'"\1"'/g to Kyle Rose' or George V. Reilly's scripts ).
Finally, it kind of makes sense to expect the input to come from stdin -- since passing it through command-line arguments could be already too much trouble.
( Oh, and may be we want to add a small help message so that the script does not hang when someone just runs it as ./quote.sh --help wondering what it does. )
If you're generating the shell string within Python 2 or Python 3, the following may help to quote the arguments:
#!/usr/bin/env python
from __future__ import print_function
try: # py3
from shlex import quote as shlex_quote
except ImportError: # py2
from pipes import quote as shlex_quote
s = """foo ain't "bad" so there!"""
print(s)
print(" ".join([shlex_quote(t) for t in s.split()]))
This will output:
foo ain't "bad" so there!
foo 'ain'"'"'t' '"bad"' so 'there!'
If you have GNU Parallel installed you can use its internal quoting:
$ parallel --shellquote
L's 12" record
<Ctrl-D>
'L'"'"'s 12" record'
$ echo 'L'"'"'s 12" record'
L's 12" record
From version 20190222 you can even --shellquote multiple times:
$ parallel --shellquote --shellquote --shellquote
L's 12" record
<Ctrl-D>
'"'"'"'"'"'"'L'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'s 12" record'"'"'"'"'"'"'
$ eval eval echo '"'"'"'"'"'"'L'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'s 12" record'"'"'"'"'"'"'
L's 12" record
It will quote the string in all supported shells (not only bash).
This function:
quote ()
{
local quoted=${1//\'/\'\\\'\'};
printf "'%s'" "$quoted"
}
allows quoting of ' inside '. Use as this:
$ quote "urxvt -fg '#111111' -bg '#111111'"
'urxvt -fg '\''#111111'\'' -bg '\''#111111'\'''
If the line to quote gets more complex, like double quotes mixed with single quotes, it may become quite tricky to get the string to quote inside a variable. When such cases show up, write the exact line that you need to quote inside an script (similar to this).
#!/bin/bash
quote ()
{
local quoted=${1//\'/\'\\\'\'};
printf "'%s'" "$quoted"
}
while read line; do
quote "$line"
done <<-\_lines_to_quote_
urxvt -fg '#111111' -bg '#111111'
Louis Theroux's LA Stories
'single quote phrase' "double quote phrase"
_lines_to_quote_
Will output:
'urxvt -fg '\''#111111'\'' -bg '\''#111111'\'''
'Louis Theroux'\''s LA Stories'
''\''single quote phrase'\'' "double quote phrase"'
All correctly quoted strings inside single quotes.
Here is another solution. This function will take a single argument and appropriately quote it using the single-quote character, just as the voted answer above explains:
single_quote() {
local quoted="'"
local i=0
while [ $i -lt ${#1} ]; do
local ch="${1:i:1}"
if [[ "$ch" != "'" ]]; then
quoted="$quoted$ch"
else
local single_quotes="'"
local j=1
while [ $j -lt ${#1} ] && [[ "${1:i+j:1}" == "'" ]]; do
single_quotes="$single_quotes'"
((j++))
done
quoted="$quoted'\"$single_quotes\"'"
((i+=j-1))
fi
((i++))
done
echo "$quoted'"
}
So, you can use it this way:
single_quote "1 2 '3'"
'1 2 '"'"'3'"'"''
x="this text is quoted: 'hello'"
eval "echo $(single_quote "$x")"
this text is quoted: 'hello'

Passing a Variable to SED Command

What is the syntax to pass a variable to sed command that updates the second column in a CSV file. The variable name is $tag
This is the command I have used but I don't know where to put the variable exactly.
basename "$dec" | sed 's/.*/&,A/' >> home/kelsabry/Downloads/Tests/results.csv
where $decis variable that returns to me a certain directory.
Output:
Downloads, A
Documents, A
etc.
My command to pass the variable into sed to update the second column was:
basename "$dec" | sed 's/.*/&,'$tag'/' >> home/kelsabry/Downloads/Tests/results.csv
but it gave me this output:
Downloads, '$tag'
Documents, '$tag'
etc.
So, where should I write the variable $tag in sed command?
Unfortunately, sed is neither aware of fields nor capable of accepting variables. For that, you'd use shell or awk or shell or some other language.
sed is a Stream EDitor and in your example is taking input from stdin, not a variable.
If you do want to embed shell variables inside a sed script, understand that you are basically creating your sed script on-the-fly, and it's important to make sure you do it safely.
For example, if there's the possibility that your $tag variable might contain something that will cause misinterpretation of the sed script (i.e. perhaps it came from user input),
you need protection. In POSIX shell, perhaps something like this:
if [ "$tag" != "${tag#*[!A-Z]}" ]; then
printf 'ERROR: invalid tag\n' >&2
exit 1
fi
or even:
case "$tag" in
[A-Z]) : ;;
*) printf 'ERROR: invalid tag\n' >&2; exit 1 ;;
esac
then
# Note the alternative to `basename`
echo "${dec##*/}" | sed 's/$/,'"$tag"'/' >> path/to/file.csv
Note that sed doesn't know anything about fields or CSV. sed is simply being used to append a string on to the end of the line.
Of course, in csh (which perhaps shouldn't be used for scripted automation), you are missing the more useful parameter expansion tools, but you can still protect yourself in other ways:
if ( $%tag == 1 ) then
switch ($tag)
case [A-Z]:
printf '%s,%s\n' `basename "$dec"` "$tag"
breaksw
default:
printf 'ERROR: invalid tag\n'
exit 1
breaksw
endsw
else
printf 'ERROR: invalid tag\n'
exit 1
endif
(Note: this is untested. Mileage varies based on multiple conditions. May contain nuts.)
The issue you listed in your question was a quoting problem. You said: sed 's/.*/&,'$tag'/' >.
An alternative might be to use awk:
echo "${dec##*/}" | awk -v tag="$tag" '{print $0 OFS tag}' OFS=, >> path/to/file.csv
Awk is a more complete programming language, and supports named variables, unlike sed. The -v option allows you to pre-load an awk variable with the contents of a shell variable.
CSH is considered harmful by some. I'd recommend doing this in a POSIX shell instead, if only to take advantage of the much larger pool of experts who can help with your scripting questions. :)

fish shell: Is it possible to conveniently strip extensions?

Is there any convenient way to strip an arbitrary extension from a file name, something à la bash ${i%%.*}? Do I stick to my friend sed?
If you know the extension (eg _bak, a common usecase) this is possibly more convenient:
for f in (ls *_bak)
mv $f (basename $f _bak)
end
Nope. fish has a much smaller feature set than bash, relying on external commands:
$ set filename foo.bar.baz
$ set rootname (echo $filename | sed 's/\.[^.]*$//')
$ echo $rootname
foo.bar
You can strip off the extension from a filename using the string command:
echo (string split -r -m1 . $filename)[1]
This will split filename at the right-most dot and print the first element of the resulting list. If there is no dot, that list will contain a single element with filename.
If you also need to strip off leading directories, combine it with basename:
echo (basename $filename | string split -r -m1 .)[1]
In this example, string reads its input from stdin rather than being passed the filename as a command line argument.
--- Update 2022-08-02 ---
As of fish 3.5+, there is a path command (docs) which was designed to handle stripping extensions:
$ touch test.txt.bak
$ path change-extension '' ./test.txt.bak
test.txt
You can also strip a set number of extensions:
set --local file ./test.txt.1.2.3
for i in (seq 3)
set file (path change-extension '' $file)
end
echo $file
# ./test.txt
Or strip all extensions:
set --local file ./test.txt.1.2.3
while path extension $file
set file (path change-extension '' $file)
end
echo $file
# ./test
--- Original answer ---
The fish string command is still the canonical way to handle this. It has some really nice sub commands that haven't been shown in other answers yet.
split lets you split from the right with a max of 1, so that you just get the last extension.
for f in *
echo (string split -m1 -r '.' "$f")[1]
end
replace lets you use a regex to lop off the extension, defined as the final dot to the end of the string
for f in *
string replace -r '\.[^\.]*$' '' "$f"
end
man string for more info and some great examples.
Update:
If your system has proper basename and dirname utilities, you can use something like this:
function stripext \
--description "strip file extension"
for arg in $argv
echo (dirname $arg)/(string replace -r '\.[^\.]+$' '' (basename $arg))
end
end
With the string match function built into fish you can do
set rootname (string match -r "(.*)\.[^\.]*\$" $filename)[2]
The string match returns a list of 2 items. The first is the whole string, and the second one is the first regexp match (the stuff inside the parentheses in the regex). So, we grab the second one with the [2].
I too need a function to split random files root and extension. Rather than re-implementing naively the feature at the risk of meeting caveats (ex: dot before separator), I am forwarding the task to Python's built-in POSIX path libraries and inherit from their expertise.
Here is an humble example of what one may prefer:
function splitext --description "Print filepath(s) root, stem or extension"
argparse 'e/ext' 's/stem' -- $argv
for arg in $argv
if set -q _flag_ext
set cmd 'import os' \
"_, ext = os.path.splitext('$arg')" \
'print(ext)'
else if set -q _flag_stem
set cmd 'from pathlib import Path' \
"p = Path('$arg')" \
'print(p.stem)'
else
set cmd 'import os' \
"root, _ = os.path.splitext('$arg')" \
'print(root)'
end
python3 -c (string join ';' $cmd)
end
end
Examples:
$ splitext /this/is.a/test.path
/this/is.a/test
$ splitext --ext /this/is.a/test.path
.path
$ splitext --stem /this/is.a/test.path
test
$ splitext /this/is.another/test
/this/is.another/test

perl -pe to manipulate filenames

I was trying to do some quick filename cleanup at the shell (zsh, if it matters). Renaming files. (I'm using cp instead of mv just to be safe)
foreach f (\#*.ogg)
cp $f `echo $f | perl -pe 's/\#\d+ (.+)$/"\1"/'`
end
Now, I know there are tools to do stuff like this, but for personal interest I'm wondering how I can do it this way. Right now, I get an error:
cp: target `When.ogg"' is not a directory
Where 'When.ogg' is the last part of the filename. I've tried adding quotes (see above) and escaping the spaces, but nonetheless this is what I get.
Is there a reason I can't use the output of s perl pmr=;omrt as the final argument to another command line tool?
It looks like you have a space in the file names being processed, so each of your cp command lines evaluates to something like
cp \#nnnn When.Ogg When.ogg
When the cp command sees more than two arguments, the last one must be a target directory name for all the files to be copied to - hence the error message. Because your source filename ($f) contains a space it is being treated as two arguments - cp sees three args, rather than the two you intend.
If you put double quotes around the first $f that should prevent the two 'halves' of the name from being treated as separate file names:
cp "$f" `echo ...
This is what you need in bash, hope it's good for zsh too.
cp "$f" "`echo $f | perl -pe 's/\#\d+ (.+)$/\1/'`"
If the filename contains spaces, you also have quote the second argument of cp.
I often use
dir /b ... | perl -nle"$o=$_; s/.../.../; $n=$_; rename $o,$n if !-e $n"
The -l chomps the input.
The -e check is to avoid accidentally renaming all the files to one name. I've done that a couple of times.
In bash (and I'm guessing zsh), that would be
foreach f (...)
echo "$f" | perl -nle'$o=$_; s/.../.../; $n=$_; rename $o,$n if !-e $n'
end
or
find -name '...' -maxdepth 1 \
| perl -nle'$o=$_; s/.../.../; $n=$_; rename $o,$n if !-e $n'
or
find -name '...' -maxdepth 1 -exec \
perl -e'for (#ARGV) {
$o=$_; s/.../.../; $n=$_;
rename $o,$n if !-e $n;
}' {} +
The last supports file names with newlines in them.

Perl's diamond operator: can it be done in bash?

Is there an idiomatic way to simulate Perl's diamond operator in bash? With the diamond operator,
script.sh | ...
reads stdin for its input and
script.sh file1 file2 | ...
reads file1 and file2 for its input.
One other constraint is that I want to use the stdin in script.sh for something else other than input to my own script. The below code does what I want for the file1 file2 ... case above, but not for data provided on stdin.
command - $# <<EOF
some_code_for_first_argument_of_command_here
EOF
I'd prefer a Bash solution but any Unix shell is OK.
Edit: for clarification, here is the content of script.sh:
#!/bin/bash
command - $# <<EOF
some_code_for_first_argument_of_command_here
EOF
I want this to work the way the diamond operator would work in Perl, but it only handles filenames-as-arguments right now.
Edit 2: I can't do anything that goes
cat XXX | command
because the stdin for command is not the user's data. The stdin for command is my data in the here-doc. I would like the user data to come in on the stdin of my script, but it can't be the stdin of the call to command inside my script.
Sure, this is totally doable:
#!/bin/bash
cat $# | some_command_goes_here
Users can then call your script with no arguments (or '-') to read from stdin, or multiple files, all of which will be read.
If you want to process the contents of those files (say, line-by-line), you could do something like this:
for line in $(cat $#); do
echo "I read: $line"
done
Edit: Changed $* to $# to handle spaces in filenames, thanks to a helpful comment.
Kind of cheezy, but how about
cat file1 file2 | script.sh
I am (like everyone else, it seems) a bit confused about exactly what the goal is here, so I'll give three possible answers that may cover what you actually want. First, the relatively simple goal of getting the script to read from either a list of files (supplied on the command line) or from its regular stdin:
if [ $# -gt 0 ]; then
exec < <(cat "$#")
fi
# From this point on, the script's stdin is redirected from the files
# (if any) supplied on the command line
Note: the double-quoted use of $# is the best way to avoid problems with funny characters (e.g. spaces) in filenames -- $* and unquoted $# both mess this up. The <() trick I'm using here is a bash-only feature; it fires off cat in the background to feed data from files supplied on the command line, and then we use exec to replace the script's stdin with the output from cat.
...but that doesn't seem to be what you actually want. What you seem to really want is to pass the supplied filenames or the script's stdin as arguments to a command inside the script. This requires sort of the opposite process: converting the script's stdin into a file (actually a named pipe) whose name can be passed to the command. Like this:
if [[ $# -gt 0 ]]; then
command "$#" <<EOF
here-doc goes here
EOF
else
command <(cat) <<EOF
here-doc goes here
EOF
fi
This uses <() to launder the script's stdin through cat to a named pipe, which is then passed to command as an argument. Meanwhile, command's stdin is taken from the here-doc.
Now, I think that's what you want to do, but it's not quite what you've asked for, which is to both redirect the script's stdin from the supplied files and pass stdin to the command inside the script. This can be done by combining the above techniques:
if [ $# -gt 0 ]; then
exec < <(cat "$#")
fi
command <(cat) <<EOF
here-doc goes here
EOF
...although I can't think why you'd actually want to do this.
The Perl diamond operator essentially loops across all the command line arguments, treating each as a filename. It opens each file and reads them line-by-line. Here's some bash code that will do approximately the same.
for f in "$#"
do
# Do something with $f, such as...
cat $f | command1 | command2
-or-
command1 < $f
-or-
# Read $f line-by-line
cat $f | while read line_from_f
do
# Do stuff with $line_from_f
done
done
You want to take the first argument and do something with it, and then either read from any files specified or stdin if no files?
Personally, I'd suggest using getopt to indicate arguments using the "-a value" syntax to help disambiguate, but that's just me. Here's how I'd do it in bash without getopts:
firstarg=${1?:usage: $0 arg [file1 .. fileN]}
shift
typeset -a files
if [[ ${##} -gt 0 ]]
then
files=( "$#" )
else
files=( "/dev/stdin" )
fi
for file in "${files[#]}"
do
whatever_you_want < "$file"
done
The ?: operator will die if there are no args specified, since you seem to want at least one arg either way. After grabbing that, shift the args over by one, and then either use the remaining args as your file list, or the bash special filehandle "/dev/stdin" if there were no other args.
I think that the "if no files are specified, use /dev/stdin - otherwise use the files on the command line" piece is probably what you're looking for, but the rest of the code is at least useful for context.
Also a little cheezy, but how about this:
if [[ $# -eq 0 ]]
then
# read from stdin
else
# read from $* (args)
fi
If you need to read and process line-by-line (which is likely) and don't want to copy/paste the same code twice (which is likely), define a function in your script and just pass the lines one-by-one to this function, and process them in said function.
Why not use ``cat #* in the script? For example:
x=`cat $*`
echo $x