DOS/Windows xmlstarlet usage with a string instead of an XML file - command-line

Can xmlstarlet be used with a string instead of an XML file?
e.g.:
xmlstarlet sel -t -v "/*" "<pathlist><path>C:\file.txt</path></pathlist>"
instead of
xmlstarlet sel -t -v "/*" pathlist.xml
Or how else could I do this with a string?
When I echo the string and pipe it to xmlstarlet, it does not work:
SET "_var=^<pathlist^>^<path^>C:\file.txt^</path^> ^</pathlist^>"
&
call echo %^_var% | xmlstarlet sel -t -v "//*"
gives error:
< was unexpected at this time.
-:1.1: Document is empty
^
-:1.1: Start tag expected, '<' not found
^
This should be a simple task, but I can't get it to work. I just want to echo a string to xmlstarlet within a one-liner.

cmd.exe syntax is weird; the following trick using set /p seems to work:
C:\tmp><nul (set /p ="<pathlist><path>C:\file.txt</path></pathlist>") | xmlstarlet sel -t -v /*
C:\file.txt
/* may get glob expanded (depending on what files you have). Unfortunately, there is no way to quote it from cmd.exe (the expansion is performed by libc on behalf of xmlstarlet), so you will have to rewrite the XPath in that case, e.g. /pathlist instead.
Source: https://groups.google.com/d/msg/alt.msdos.batch.nt/RNug94fXI5s/BdgYJfNmXysJ via http://www.netikka.net/tsneti/info/tscmd047.htm
I found no explanation of why escaping <> doesn't work with | redirection??
C:\tmp> echo ^<^>
<>
C:\tmp> echo ^<^> | more
> was unexpected at this time.

Related

How to perform UTF-8 encoding using xmlstarlet fo --encode option?

The synopsis for xmlstarlet fo says
XMLStarlet Toolkit: Format XML document
Usage: xmlstarlet fo [<options>] <xml-file>
where <options> are
-n or --noindent - do not indent
-t or --indent-tab - indent output with tabulation
-s or --indent-spaces <num> - indent output with <num> spaces
-o or --omit-decl - omit xml declaration <?xml version="1.0"?>
--net - allow network access
-R or --recover - try to recover what is parsable
-D or --dropdtd - remove the DOCTYPE of the input docs
-C or --nocdata - replace cdata section with text nodes
-N or --nsclean - remove redundant namespace declarations
-e or --encode <encoding> - output in the given encoding (utf-8, unicode...)
-H or --html - input is HTML
-h or --help - print help
When I run
cat unformatted.html | xmlstarlet fo -H -R --encode utf-8
I am returned the error message
failed to load external entity "utf-8"
In my limited experience, xmlstarlet fo in particular needs the stdin dash (-) to read piped input.
In your example, the contents of 'unformatted.html' are piped to xmlstarlet.
But xmlstarlet fo doesn't 'see' the piped input if you don't pass a - (dash) as the file argument.
It assumes that the last argument (utf-8) is the filename ("external entity") whose contents you're trying to format. Obviously, there's no such file. Just to be on the safe side, I'd also enclose the encoding argument with double quotes, like so: "utf-8".
Altering your statement to
xmlstarlet fo -H -R --encode "utf-8" unformatted.html
should do the trick.
The cat is unnecessary, I'd think.

Bash adding single quotes in variable [duplicate]

Let's say, you have a Bash alias like:
alias rxvt='urxvt'
which works fine.
However:
alias rxvt='urxvt -fg '#111111' -bg '#111111''
won't work, and neither will:
alias rxvt='urxvt -fg \'#111111\' -bg \'#111111\''
So how do you end up matching up opening and closing quotes inside a string once you have escaped quotes?
alias rxvt='urxvt -fg'\''#111111'\'' -bg '\''#111111'\''
seems ungainly although it would represent the same string if you're allowed to concatenate them like that.
If you really want to use single quotes in the outermost layer, remember that you can glue both kinds of quotation. Example:
alias rxvt='urxvt -fg '"'"'#111111'"'"' -bg '"'"'#111111'"'"
#                     ^^^^^       ^^^^^     ^^^^^       ^^^^
#                     12345       12345     12345       1234
Explanation of how '"'"' is interpreted as just ':
1. ' End first quotation which uses single quotes.
2. " Start second quotation, using double-quotes.
3. ' Quoted character.
4. " End second quotation, using double-quotes.
5. ' Start third quotation, using single quotes.
If you do not place any whitespace between (1) and (2), or between (4) and (5), the shell will interpret that string as one long word.
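As a quick check of the gluing rule, the following sketch builds the same string twice, once with a double-quoted "'" as the glue and once with a backslash-escaped \', and prints both. The variable names glued_dq and glued_bs are illustrative only.

```shell
# Glue a literal single quote with a double-quoted "'" ...
glued_dq='urxvt -fg '"'"'#111111'"'"' -bg '"'"'#111111'"'"
# ... or with a backslash-escaped \' placed outside any quotes.
glued_bs='urxvt -fg '\''#111111'\'' -bg '\''#111111'\'

# Both lines print the same text.
printf '%s\n' "$glued_dq"
printf '%s\n' "$glued_bs"
```

Either form can be used as the right-hand side of an alias definition, since the shell has already reduced it to a single word by the time alias sees it.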
I always just replace each embedded single quote with the sequence: '\'' (that is: quote backslash quote quote) which closes the string, appends an escaped single quote and reopens the string.
I often whip up a "quotify" function in my Perl scripts to do this for me. The steps would be:
s/'/'\\''/g # Handle each embedded quote
$_ = qq['$_']; # Surround result with single quotes.
This pretty much takes care of all cases.
Life gets more fun when you introduce eval into your shell-scripts. You essentially have to re-quotify everything again!
For example, create a Perl script called quotify containing the above statements:
#!/usr/bin/perl -pl
s/'/'\\''/g;
$_ = qq['$_'];
then use it to generate a correctly-quoted string:
$ quotify
urxvt -fg '#111111' -bg '#111111'
result:
'urxvt -fg '\''#111111'\'' -bg '\''#111111'\'''
which can then be copy/pasted into the alias command:
alias rxvt='urxvt -fg '\''#111111'\'' -bg '\''#111111'\'''
If you need to insert the command into an eval, run the quotify again:
$ quotify
alias rxvt='urxvt -fg '\''#111111'\'' -bg '\''#111111'\'''
result:
'alias rxvt='\''urxvt -fg '\''\'\'''\''#111111'\''\'\'''\'' -bg '\''\'\'''\''#111111'\''\'\'''\'''\'''
which can be copy/pasted into an eval:
eval 'alias rxvt='\''urxvt -fg '\''\'\'''\''#111111'\''\'\'''\'' -bg '\''\'\'''\''#111111'\''\'\'''\'''\'''
Since Bash 2.04, the $'string' syntax allows a limited set of escapes.
Since Bash 4.4, $'string' also allows the full set of C-style escapes, so its behavior differs slightly from that of $'string' in earlier versions. (Previously the $('string') form could be used.)
Simple example in Bash 2.04 and newer:
$> echo $'aa\'bb'
aa'bb
$> alias myvar=$'aa\'bb'
$> alias myvar
alias myvar='aa'\''bb'
In your case:
$> alias rxvt=$'urxvt -fg \'#111111\' -bg \'#111111\''
$> alias rxvt
alias rxvt='urxvt -fg '\''#111111'\'' -bg '\''#111111'\'''
Common escape sequences work as expected:
\' single quote
\" double quote
\\ backslash
\n new line
\t horizontal tab
\r carriage return
Below is copy+pasted related documentation from man bash (version 4.4):
Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded as follows:
\a alert (bell)
\b backspace
\e
\E an escape character
\f form feed
\n new line
\r carriage return
\t horizontal tab
\v vertical tab
\\ backslash
\' single quote
\" double quote
\? question mark
\nnn the eight-bit character whose value is the octal
value nnn (one to three digits)
\xHH the eight-bit character whose value is the hexadecimal
value HH (one or two hex digits)
\uHHHH the Unicode (ISO/IEC 10646) character whose value is
the hexadecimal value HHHH (one to four hex digits)
\UHHHHHHHH the Unicode (ISO/IEC 10646) character whose value
is the hexadecimal value HHHHHHHH (one to eight
hex digits)
\cx a control-x character
The expanded result is single-quoted, as if the dollar sign had not been present.
See Quotes and escaping: ANSI C like strings on bash-hackers.org wiki for more details. Also note that the "Bash Changes" file (overview here) mentions a lot of changes and bug fixes related to the $'string' quoting mechanism.
According to unix.stackexchange.com How to use a special character as a normal one? it should work (with some variations) in bash, zsh, mksh, ksh93 and FreeBSD and busybox sh.
I don't see the entry on his blog (link pls?) but according to the gnu reference manual:
Enclosing characters in single quotes
(‘'’) preserves the literal value of
each character within the quotes. A
single quote may not occur between
single quotes, even when preceded by a
backslash.
so bash won't understand:
alias x='y \'z '
however, you can do this if you surround with double quotes:
alias x="echo \'y "
> x
> 'y
I can confirm that using '\'' for a single quote inside a single-quoted string does work in Bash, and it can be explained in the same way as the "gluing" argument from earlier in the thread. Suppose we have a quoted string: 'A '\''B'\'' C' (all quotes here are single quotes). If it is passed to echo, it prints the following: A 'B' C.
In each '\'' the first quote closes the current single-quoted string, the following \' glues a single quote to the previous string (\' is a way to specify a single quote without starting a quoted string), and the last quote opens another single-quoted string.
Both versions are working, either with concatenation by using the escaped single quote character (\'), or with concatenation by enclosing the single quote character within double quotes ("'").
The author of the question did not notice that there was an extra single quote (') at the end of his last escaping attempt:
alias rxvt='urxvt -fg'\''#111111'\'' -bg '\''#111111'\''
           │         │┊┊|       │┊┊│     │┊┊│       │┊┊│
           └─STRING──┘┊┊└─STRIN─┘┊┊└─STR─┘┊┊└─STRIN─┘┊┊│
                      ┊┊         ┊┊       ┊┊         ┊┊│
                      ┊┊         ┊┊       ┊┊         ┊┊│
                      └┴─────────┴┴───┰───┴┴─────────┴┘│
                          All escaped single quotes    │
                                                       │
                                                       ?
As you can see in the previous nice piece of ASCII/Unicode art, the last escaped single quote (\') is followed by an unnecessary single quote ('). Using a syntax-highlighter like the one present in Notepad++ can prove very helpful.
The same is true for another example like the following one:
alias rc='sed '"'"':a;N;$!ba;s/\n/, /g'"'"
alias rc='sed '\'':a;N;$!ba;s/\n/, /g'\'
These two aliases show, in a rather intricate and obfuscated way, how to join the lines of a file. That is, from a file with many lines you get a single line in which the contents of the original lines are separated by commas and spaces. To make this concrete, here is an example:
$ cat Little_Commas.TXT
201737194
201802699
201835214
$ rc Little_Commas.TXT
201737194, 201802699, 201835214
Simple example of escaping quotes in shell:
$ echo 'abc'\''abc'
abc'abc
$ echo "abc"\""abc"
abc"abc
It's done by closing the already opened quote ('), placing an escaped one (\'), then opening another one ('). This syntax works for all commands. It's a very similar approach to the 1st answer.
I'm not specifically addressing the quoting issue because, well, sometimes, it's just reasonable to consider an alternative approach.
rxvt() { urxvt -fg "#${1:-000000}" -bg "#${2:-FFFFFF}"; }
which you can then call as:
rxvt 123456 654321
the idea being that you can now alias this without concern for quotes:
alias rxvt='rxvt 123456 654321'
or, if you need to include the # in all calls for some reason:
rxvt() { urxvt -fg "${1:-#000000}" -bg "${2:-#FFFFFF}"; }
which you can then call as:
rxvt '#123456' '#654321'
then, of course, an alias is:
alias rxvt="rxvt '#123456' '#654321'"
(oops, i guess i kind of did address the quoting :)
How to escape single quotes (') and double quotes (") with hex and octal chars
If using something like echo, I've had some really complicated and really weird and hard-to-escape (think: very nested) cases where the only thing I could get to work was using octal or hex codes!
Here are some basic examples just to demonstrate how it works:
1. Single quote example, where ' is escaped with hex \x27 or octal \047 (its corresponding ASCII code):
hex \x27
echo -e "Let\x27s get coding!"
# OR
echo -e 'Let\x27s get coding!'
Result:
Let's get coding!
octal \047
echo -e "Let\047s get coding!"
# OR
echo -e 'Let\047s get coding!'
Result:
Let's get coding!
2. Double quote example, where " is escaped with hex \x22 or octal \042 (its corresponding ASCII code).
Note: bash is nuts! Sometimes even the ! char has special meaning, and must either be removed from within the double quotes and then escaped "like this"\! or put entirely within single quotes 'like this!', rather than within double quotes.
# 1. hex; also escape `!` by removing it from within the double quotes
# and escaping it with `\!`
$ echo -e "She said, \x22Let\x27s get coding"\!"\x22"
She said, "Let's get coding!"
# OR put it all within single quotes:
$ echo -e 'She said, \x22Let\x27s get coding!\x22'
She said, "Let's get coding!"
# 2. octal; also escape `!` by removing it from within the double quotes
$ echo -e "She said, \042Let\047s get coding"\!"\042"
She said, "Let's get coding!"
# OR put it all within single quotes:
$ echo -e 'She said, \042Let\047s get coding!\042'
She said, "Let's get coding!"
# 3. mixed hex and octal, just for fun
# also escape `!` by removing it from within the double quotes when it is followed by
# another escape sequence
$ echo -e "She said, \x22Let\047s get coding! It\x27s waaay past time to begin"\!"\042"
She said, "Let's get coding! It's waaay past time to begin!"
# OR put it all within single quotes:
$ echo -e 'She said, \x22Let\047s get coding! It\x27s waaay past time to begin!\042'
She said, "Let's get coding! It's waaay past time to begin!"
Note that if you don't properly escape !, when needed, as I've shown two ways to do above, you'll get some weird errors, like this:
$ echo -e "She said, \x22Let\047s get coding! It\x27s waaay past time to begin!\042"
bash: !\042: event not found
OR:
$ echo -e "She said, \x22Let\x27s get coding!\x22"
bash: !\x22: event not found
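Where echo -e is unavailable or behaves differently, printf is a possible alternative. This is a sketch assuming a POSIX-style printf, which interprets octal escapes such as \047 and \042 in its format string (hex \x escapes are not guaranteed there). Because the format sits in single quotes, the ! needs no special handling. The variable name msg is illustrative.

```shell
# Octal \042 is " and \047 is '; the single-quoted format keeps ! literal.
msg=$(printf 'She said, \042Let\047s get coding!\042')
printf '%s\n' "$msg"
```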
One more alternative: this allows mixed expansion and non-expansion all within the same bash string
Here is another demo of an alternative escaping technique.
First, read the main answer by @liori to see how the 2nd form below works. Now, read these two alternative ways of escaping characters. Both examples below are identical in their output:
CMD="gs_set_title"
# 1. 1st technique: escape the $ symbol with a backslash (\) so it doesn't
# run and expand the command following it
echo "$CMD '\$(basename \"\$(pwd)\")'"
# 2. 2nd technique (does the same thing in a different way): escape the
# $ symbol using single quotes around it, and the single quote (') symbol
# using double quotes around it
echo "$CMD ""'"'$(basename "$(pwd)")'"'"
Sample output:
gs_set_title '$(basename "$(pwd)")'
gs_set_title '$(basename "$(pwd)")'
Note: for my gs_set_title bash function, which I have in my ~/.bash_aliases file somewhere around here, see my other answer here.
References:
https://en.wikipedia.org/wiki/ASCII#Printable_characters
https://serverfault.com/questions/208265/what-is-bash-event-not-found/208266#208266
See also my other answer here: How do I write non-ASCII characters using echo?.
I just use shell codes.. e.g. \x27 or \\x22 as applicable. No hassle, ever really.
Since one cannot put single quotes within single quoted strings, the simplest and most readable option is to use a HEREDOC string
command=$(cat <<'COMMAND'
urxvt -fg '#111111' -bg '#111111'
COMMAND
)
alias rxvt="$command"
In the code above, the HEREDOC is sent to the cat command, and its output is assigned to a variable via the command substitution notation $(..).
Quoting the HEREDOC delimiter ('COMMAND') keeps the shell from expanding anything inside the HEREDOC body, so the text is stored exactly as written.
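To see that a quoted HEREDOC preserves both kinds of quotes, here is a small sketch that stores a command line in a variable and then runs it with eval. The printf command stands in for urxvt so that the example has no external dependency; the names command and result are illustrative.

```shell
# The quoted delimiter ('COMMAND') keeps the body literal: nothing is expanded.
command=$(cat <<'COMMAND'
printf '<%s>' '#111111' "two words"
COMMAND
)

# Double-quote the variable when reusing it so the stored line stays intact.
result=$(eval "$command")
printf '%s\n' "$result"
```

The stored line is parsed only when eval runs it, which is why both the single-quoted and the double-quoted argument arrive as one argument each.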
IMHO the real answer is that you can't escape single-quotes within single-quoted strings.
It's impossible.
If we presume we are using bash.
From bash manual...
Enclosing characters in single quotes preserves the literal value of each
character within the quotes. A single quote may not occur
between single quotes, even when preceded by a backslash.
You need to use one of the other string escape mechanisms " or \
There is nothing magic about alias that demands it use single quotes.
Both the following work in bash.
alias rxvt="urxvt -fg '#111111' -bg '#111111'"
alias rxvt=urxvt\ -fg\ \'#111111\'\ -bg\ \'#111111\'
The latter is using \ to escape the space character.
There is also nothing magic about #111111 that requires single quotes.
The following options achieve the same result as the other two, in that the rxvt alias works as expected.
alias rxvt='urxvt -fg "#111111" -bg "#111111"'
alias rxvt="urxvt -fg \"#111111\" -bg \"#111111\""
You can also escape the troublesome # directly
alias rxvt="urxvt -fg \#111111 -bg \#111111"
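The reason # needs protecting at all is that an unquoted # at the start of a word begins a comment. The sketch below counts the arguments a command receives in each case. count_args is a hypothetical helper, and eval is used for the bare case only so that the comment does not swallow the rest of the script line.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

n_quoted=$(count_args urxvt -fg "#111111")     # double quotes protect the #
n_escaped=$(count_args urxvt -fg \#111111)     # so does a backslash
n_bare=$(eval 'count_args urxvt -fg #111111')  # a bare # starts a comment

echo "quoted=$n_quoted escaped=$n_escaped bare=$n_bare"
```

In the bare case the colour value never reaches the command, so the argument count is one lower than in the other two cases.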
A minimal answer is needed so that people can get going without spending a lot of time as I had to sifting through people waxing eloquent.
There is no way to escape single quotes or anything else within single quotes.
The following is, perhaps surprisingly, a complete command:
$ echo '\'
whose output is:
\
Backslashes, surprisingly to even long-time users of bash, have no meaning inside single quotes. Nor does anything else.
Most of these answers hit on the specific case you're asking about. There is a general approach that a friend and I have developed that allows for arbitrary quoting in case you need to quote bash commands through multiple layers of shell expansion, e.g., through ssh, su -c, bash -c, etc. There is one core primitive you need, here in native bash:
quote_args() {
local sq="'"
local dq='"'
local space=""
local arg
for arg; do
echo -n "$space'${arg//$sq/$sq$dq$sq$dq$sq}'"
space=" "
done
}
This does exactly what it says: it shell-quotes each argument individually (after bash expansion, of course):
$ quote_args foo bar
'foo' 'bar'
$ quote_args arg1 'arg2 arg2a' arg3
'arg1' 'arg2 arg2a' 'arg3'
$ quote_args dq'"'
'dq"'
$ quote_args dq'"' sq"'"
'dq"' 'sq'"'"''
$ quote_args "*"
'*'
$ quote_args /b*
'/bin' '/boot'
It does the obvious thing for one layer of expansion:
$ bash -c "$(quote_args echo a'"'b"'"c arg2)"
a"b'c arg2
(Note that the double quotes around $(quote_args ...) are necessary to make the result into a single argument to bash -c.) And it can be used more generally to quote properly through multiple layers of expansion:
$ bash -c "$(quote_args bash -c "$(quote_args echo a'"'b"'"c arg2)")"
a"b'c arg2
The above example:
1. shell-quotes each argument to the inner quote_args individually and then combines the resulting output into a single argument with the inner double quotes.
2. shell-quotes bash, -c, and the already once-quoted result from step 1, and then combines the result into a single argument with the outer double quotes.
3. sends that mess as the argument to the outer bash -c.
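The three steps can be traced one layer at a time. The sketch below is a portable variant, not the quote_args from this answer: quote_each is a hypothetical helper that uses sed instead of Bash's ${var//...} expansion, and sh -c stands in for bash -c. One known limitation of this variant is that command substitution drops trailing newlines from an argument.

```shell
# Hypothetical helper: single-quote each argument, turning ' into '\''.
quote_each() {
    sep=
    for arg in "$@"; do
        printf "%s'%s'" "$sep" "$(printf '%s' "$arg" | sed "s/'/'\\\\''/g")"
        sep=' '
    done
}

inner=$(quote_each echo "a\"b'c" arg2)   # step 1: quote the innermost command
outer=$(quote_each sh -c "$inner")       # step 2: quote sh, -c and the step-1 result
result=$(sh -c "$outer")                 # step 3: hand everything to the outer shell

printf '%s\n' "$inner" "$result"
```

Printing inner next to result makes it visible that each layer of sh -c strips exactly one layer of quoting.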
That's the idea in a nutshell. You can do some pretty complicated stuff with this, but you have to be careful about order of evaluation and about which substrings are quoted. For instance, the following do the wrong things (for some definition of "wrong"):
$ (cd /tmp; bash -c "$(quote_args cd /; pwd 1>&2)")
/tmp
$ (cd /tmp; bash -c "$(quote_args cd /; [ -e *sbin ] && echo success 1>&2 || echo failure 1>&2)")
failure
In the first example, bash immediately expands quote_args cd /; pwd 1>&2 into two separate commands, quote_args cd / and pwd 1>&2, so the CWD is still /tmp when the pwd command is executed. The second example illustrates a similar problem for globbing. Indeed, the same basic problem occurs with all bash expansions. The problem here is that a command substitution isn't a function call: it's literally evaluating one bash script and using its output as part of another bash script.
If you try to simply escape the shell operators, you'll fail because the resulting string passed to bash -c is just a sequence of individually-quoted strings that aren't then interpreted as operators, which is easy to see if you echo the string that would have been passed to bash:
$ (cd /tmp; echo "$(quote_args cd /\; pwd 1\>\&2)")
'cd' '/;' 'pwd' '1>&2'
$ (cd /tmp; echo "$(quote_args cd /\; \[ -e \*sbin \] \&\& echo success 1\>\&2 \|\| echo failure 1\>\&2)")
'cd' '/;' '[' '-e' '*sbin' ']' '&&' 'echo' 'success' '1>&2' '||' 'echo' 'failure' '1>&2'
The problem here is that you're over-quoting. What you need is for the operators to be unquoted as input to the enclosing bash -c, which means they need to be outside the $(quote_args ...) command substitution.
Consequently, what you need to do in the most general sense is to shell-quote each word of the command not intended to be expanded at the time of command substitution separately, and not apply any extra quoting to the shell operators:
$ (cd /tmp; echo "$(quote_args cd /); $(quote_args pwd) 1>&2")
'cd' '/'; 'pwd' 1>&2
$ (cd /tmp; bash -c "$(quote_args cd /); $(quote_args pwd) 1>&2")
/
$ (cd /tmp; echo "$(quote_args cd /); [ -e *$(quote_args sbin) ] && $(quote_args echo success) 1>&2 || $(quote_args echo failure) 1>&2")
'cd' '/'; [ -e *'sbin' ] && 'echo' 'success' 1>&2 || 'echo' 'failure' 1>&2
$ (cd /tmp; bash -c "$(quote_args cd /); [ -e *$(quote_args sbin) ] && $(quote_args echo success) 1>&2 || $(quote_args echo failure) 1>&2")
success
Once you've done this, the entire string is fair game for further quoting to arbitrary levels of evaluation:
$ bash -c "$(quote_args cd /tmp); $(quote_args bash -c "$(quote_args cd /); $(quote_args pwd) 1>&2")"
/
$ bash -c "$(quote_args bash -c "$(quote_args cd /tmp); $(quote_args bash -c "$(quote_args cd /); $(quote_args pwd) 1>&2")")"
/
$ bash -c "$(quote_args bash -c "$(quote_args bash -c "$(quote_args cd /tmp); $(quote_args bash -c "$(quote_args cd /); $(quote_args pwd) 1>&2")")")"
/
$ bash -c "$(quote_args cd /tmp); $(quote_args bash -c "$(quote_args cd /); [ -e *$(quote_args sbin) ] && $(quote_args echo success) 1>&2 || $(quote_args echo failure) 1>&2")"
success
$ bash -c "$(quote_args bash -c "$(quote_args cd /tmp); $(quote_args bash -c "$(quote_args cd /); [ -e *sbin ] && $(quote_args echo success) 1>&2 || $(quote_args echo failure) 1>&2")")"
success
$ bash -c "$(quote_args bash -c "$(quote_args bash -c "$(quote_args cd /tmp); $(quote_args bash -c "$(quote_args cd /); [ -e *$(quote_args sbin) ] && $(quote_args echo success) 1>&2 || $(quote_args echo failure) 1>&2")")")"
success
etc.
These examples may seem overwrought given that words like success, sbin, and pwd don't need to be shell-quoted, but the key point to remember when writing a script taking arbitrary input is that you want to quote everything you're not absolutely sure doesn't need quoting, because you never know when a user will throw in a Robert'; rm -rf /.
To better understand what is going on under the covers, you can play around with two small helper functions:
debug_args() {
for (( I=1; $I <= $#; I++ )); do
echo -n "$I:<${!I}> " 1>&2
done
echo 1>&2
}
debug_args_and_run() {
debug_args "$@"
"$@"
}
that will enumerate each argument to a command before executing it:
$ debug_args_and_run echo a'"'b"'"c arg2
1:<echo> 2:<a"b'c> 3:<arg2>
a"b'c arg2
$ bash -c "$(quote_args debug_args_and_run echo a'"'b"'"c arg2)"
1:<echo> 2:<a"b'c> 3:<arg2>
a"b'c arg2
$ bash -c "$(quote_args debug_args_and_run bash -c "$(quote_args debug_args_and_run echo a'"'b"'"c arg2)")"
1:<bash> 2:<-c> 3:<'debug_args_and_run' 'echo' 'a"b'"'"'c' 'arg2'>
1:<echo> 2:<a"b'c> 3:<arg2>
a"b'c arg2
$ bash -c "$(quote_args debug_args_and_run bash -c "$(quote_args debug_args_and_run bash -c "$(quote_args debug_args_and_run echo a'"'b"'"c arg2)")")"
1:<bash> 2:<-c> 3:<'debug_args_and_run' 'bash' '-c' ''"'"'debug_args_and_run'"'"' '"'"'echo'"'"' '"'"'a"b'"'"'"'"'"'"'"'"'c'"'"' '"'"'arg2'"'"''>
1:<bash> 2:<-c> 3:<'debug_args_and_run' 'echo' 'a"b'"'"'c' 'arg2'>
1:<echo> 2:<a"b'c> 3:<arg2>
a"b'c arg2
$ bash -c "$(quote_args debug_args_and_run bash -c "$(quote_args debug_args_and_run bash -c "$(quote_args debug_args_and_run bash -c "$(quote_args debug_args_and_run echo a'"'b"'"c arg2)")")")"
1:<bash> 2:<-c> 3:<'debug_args_and_run' 'bash' '-c' ''"'"'debug_args_and_run'"'"' '"'"'bash'"'"' '"'"'-c'"'"' '"'"''"'"'"'"'"'"'"'"'debug_args_and_run'"'"'"'"'"'"'"'"' '"'"'"'"'"'"'"'"'echo'"'"'"'"'"'"'"'"' '"'"'"'"'"'"'"'"'a"b'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'c'"'"'"'"'"'"'"'"' '"'"'"'"'"'"'"'"'arg2'"'"'"'"'"'"'"'"''"'"''>
1:<bash> 2:<-c> 3:<'debug_args_and_run' 'bash' '-c' ''"'"'debug_args_and_run'"'"' '"'"'echo'"'"' '"'"'a"b'"'"'"'"'"'"'"'"'c'"'"' '"'"'arg2'"'"''>
1:<bash> 2:<-c> 3:<'debug_args_and_run' 'echo' 'a"b'"'"'c' 'arg2'>
1:<echo> 2:<a"b'c> 3:<arg2>
a"b'c arg2
In the given example, simply use double quotes instead of single quotes as the outer escape mechanism:
alias rxvt="urxvt -fg '#111111' -bg '#111111'"
This approach is suited for many cases where you just want to pass a fixed string to a command: Just check how the shell will interpret the double-quoted string through an echo, and escape characters with backslash if necessary.
In the example, you'd see that double quotes are sufficient to protect the string:
$ echo "urxvt -fg '#111111' -bg '#111111'"
urxvt -fg '#111111' -bg '#111111'
Here is an elaboration on The One True Answer referenced above:
Sometimes I will be downloading using rsync over ssh and have to escape a filename with a ' in it TWICE! (OMG!) Once for bash and once for ssh. The same principle of alternating quotation delimiters is at work here.
For example, let's say we want to get: Louis Theroux's LA Stories ...
1. First you enclose Louis Theroux in single quotes for bash and double quotes for ssh: '"Louis Theroux"'
2. Then you use single quotes to escape a double quote: '"'
3. Then use double quotes to escape the apostrophe: "'"
4. Then repeat #2, using single quotes to escape a double quote: '"'
5. Then enclose LA Stories in single quotes for bash and double quotes for ssh: '"LA Stories"'
And behold! You wind up with this:
rsync -ave ssh '"Louis Theroux"''"'"'"'"''"s LA Stories"'
which is an awful lot of work for one little ' -- but there you go
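The double parse can be simulated locally. In this sketch eval stands in for the remote shell that ssh would start (an assumption made so that the example runs without a remote host); the variable names are illustrative.

```shell
# First parse (local shell): the single-quoted pieces are consumed, leaving
# double-quoted words for the second shell to parse.
after_local='"Louis Theroux"''"'"'"'"''"s LA Stories"'
printf '%s\n' "$after_local"

# Second parse (stand-in for the remote shell): the double quotes are consumed.
after_remote=$(eval "printf '%s' $after_local")
printf '%s\n' "$after_remote"
```

The first line printed still contains the double quotes; the second is the plain file name with its apostrophe.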
Obviously, it would be easier simply to surround with double quotes, but where's the challenge in that? Here is the answer using only single quotes. I'm using a variable instead of alias so that it's easier to print for proof, but it's the same as using alias.
$ rxvt='urxvt -fg '\''#111111'\'' -bg '\''#111111'\'
$ echo $rxvt
urxvt -fg '#111111' -bg '#111111'
Explanation
The key is that you can close the single quote and re-open it as many times as you want. For example foo='a''b' is the same as foo='ab'. So you can close the single quote, throw in a literal single quote \', then reopen the next single quote.
Breakdown diagram
This diagram makes it clear by using brackets to show where the single quotes are opened and closed. Quotes are not "nested" like parentheses can be. You can also pay attention to the color highlighting, which is correctly applied. The quoted strings are maroon, whereas the \' is black.
'urxvt -fg '\''#111111'\'' -bg '\''#111111'\'    # original
[^^^^^^^^^^] ^[^^^^^^^] ^[^^^^^] ^[^^^^^^^] ^    # show open/close quotes
 urxvt -fg   ' #111111  '  -bg   ' #111111  '    # literal characters remaining
(This is essentially the same answer as Adrian's, but I feel this explains it better. Also his answer has 2 superfluous single quotes at the end.)
In addition to @JasonWoof's perfect answer, I want to show how I solved a related problem.
In my case, encoding single quotes with '\'' is not always sufficient, for example if a string must be quoted with single quotes but the total number of quotes is odd.
#!/bin/bash
# no closing quote
string='alecxs\'solution'
# this works for string
string="alecxs'solution"
string=alecxs\'solution
string='alecxs'\''solution'
let's assume string is a file name and we need to save quoted file names in a list (like stat -c%N ./* > list)
echo "'$string'" > "$string"
cat "$string"
but processing this list will fail (depending on how many quotes the string does contain in total)
while read file
do
ls -l "$file"
eval ls -l "$file"
done < "$string"
workaround: encode quotes with string manipulation
string="${string//$'\047'/\'\$\'\\\\047\'\'}"
# result
echo "$string"
now it works because quotes are always balanced
echo "'$string'" > list
while read file
do
ls -l "$file"
eval ls -l "$file"
done < list
Hope this helps when facing a similar problem.
Another way to fix the problem of too many layers of nested quotation:
You are trying to cram too much into too tiny a space, so use a bash function.
The problem is you are trying to have too many levels of nesting, and the basic alias technology is not powerful enough to accommodate. Use a bash function like this to make it so the single, double quotes back ticks and passed in parameters are all handled normally as we would expect:
lets_do_some_stuff() {
tmp=$1 #keep a passed in parameter.
run_your_program "$@" #use all your passed parameters.
echo -e '\n-------------' #use your single quotes.
echo `date` #use your back ticks.
echo -e "\n-------------" #use your double quotes.
}
alias foobarbaz=lets_do_some_stuff
Then you can use your $1 and $2 variables and single, double quotes and back ticks without worrying about the alias function wrecking their integrity.
This program prints:
el#defiant ~/code $ foobarbaz alien Dyson ring detected #grid 10385
alien Dyson ring detected #grid 10385
-------------
Mon Oct 26 20:30:14 EDT 2015
-------------
shell_escape () {
printf '%s' "'${1//\'/\'\\\'\'}'"
}
Implementation explanation:
double quotes so we can easily output wrapping single quotes and use the ${...} syntax
bash's search and replace looks like: ${varname//search/replacement}
we're replacing ' with '\''
'\'' encodes a single ' like so:
' ends the single quoting
\' encodes a ' (the backslash is needed because we're not inside quotes)
' starts up single quoting again
bash automatically concatenates strings with no white space between
there's a \ before every \ and ' because those are the escaping rules for ${...//.../...} .
string="That's "'##$*&^`(##'
echo "original: $string"
echo "encoded: $(shell_escape "$string")"
echo "expanded: $(bash -c "echo $(shell_escape "$string")")"
P.S. Always encode to single quoted strings because they are way simpler than double quoted strings.
Here are my two cents -- in the case if one wants to be sh-portable, not just bash-specific ( the solution is not too efficient, though, as it starts an external program -- sed ):
put this in quote.sh ( or just quote ) somewhere on your PATH :
# this works with standard input (stdin)
quote() {
echo -n "'" ;
sed 's/\(['"'"']['"'"']*\)/'"'"'"\1"'"'"'/g' ;
echo -n "'"
}
case "$1" in
-) quote ;;
*) echo "usage: cat ... | quote - # single-quotes input for Bourne shell" >&2 ;;
esac
An example:
$ echo -n "G'day, mate!" | ./quote.sh -
'G'"'"'day, mate!'
And, of course, that converts back:
$ echo 'G'"'"'day, mate!'
G'day, mate!
Explanation: basically we have to enclose the input with quotes ', and then also replace any single quote within with this micro-monster: '"'"' ( end the opening quote with a pairing ', escape the found single quote by wrapping it with double quotes -- "'", and then finally issue a new opening single quote ', or in pseudo-notation : ' + "'" + ' == '"'"' )
One standard way to do that would be to use sed with the following substitution command:
s/\(['][']*\)/'"\1"'/g
One small problem, though, is that in order to use that in shell one needs to escape all these single quote characters in the sed expression itself, which leads to something like
sed 's/\(['"'"']['"'"']*\)/'"'"'"\1"'"'"'/g'
( and one good way to build this result is to feed the original expression s/\(['][']*\)/'"\1"'/g to Kyle Rose' or George V. Reilly's scripts ).
Finally, it kind of makes sense to expect the input to come from stdin -- since passing it through command-line arguments could be already too much trouble.
( Oh, and maybe we want to add a small help message so that the script does not hang when someone just runs it as ./quote.sh --help wondering what it does. )
If you're generating the shell string within Python 2 or Python 3, the following may help to quote the arguments:
#!/usr/bin/env python
from __future__ import print_function
try: # py3
from shlex import quote as shlex_quote
except ImportError: # py2
from pipes import quote as shlex_quote
s = """foo ain't "bad" so there!"""
print(s)
print(" ".join([shlex_quote(t) for t in s.split()]))
This will output:
foo ain't "bad" so there!
foo 'ain'"'"'t' '"bad"' so 'there!'
If you have GNU Parallel installed you can use its internal quoting:
$ parallel --shellquote
L's 12" record
<Ctrl-D>
'L'"'"'s 12" record'
$ echo 'L'"'"'s 12" record'
L's 12" record
From version 20190222 you can even --shellquote multiple times:
$ parallel --shellquote --shellquote --shellquote
L's 12" record
<Ctrl-D>
'"'"'"'"'"'"'L'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'s 12" record'"'"'"'"'"'"'
$ eval eval echo '"'"'"'"'"'"'L'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'s 12" record'"'"'"'"'"'"'
L's 12" record
It will quote the string in all supported shells (not only bash).
This function:
quote ()
{
local quoted=${1//\'/\'\\\'\'};
printf "'%s'" "$quoted"
}
allows quoting of ' inside '. Use it like this:
$ quote "urxvt -fg '#111111' -bg '#111111'"
'urxvt -fg '\''#111111'\'' -bg '\''#111111'\'''
If the line to quote gets more complex, for example double quotes mixed with single quotes, it can become quite tricky just to get the string into a variable in the first place. When such cases show up, write the exact lines that you need to quote inside a script, in a quoted here-document (similar to this).
#!/bin/bash
quote ()
{
local quoted=${1//\'/\'\\\'\'};
printf "'%s'" "$quoted"
}
while IFS= read -r line; do
quote "$line"
echo # quote itself prints no trailing newline
done <<-\_lines_to_quote_
urxvt -fg '#111111' -bg '#111111'
Louis Theroux's LA Stories
'single quote phrase' "double quote phrase"
_lines_to_quote_
Will output:
'urxvt -fg '\''#111111'\'' -bg '\''#111111'\'''
'Louis Theroux'\''s LA Stories'
''\''single quote phrase'\'' "double quote phrase"'
All correctly quoted strings inside single quotes.
Here is another solution. This function will take a single argument and appropriately quote it using the single-quote character, just as the voted answer above explains:
single_quote() {
    local quoted="'"
    local i=0
    while [ $i -lt ${#1} ]; do
        local ch="${1:i:1}"
        if [[ "$ch" != "'" ]]; then
            quoted="$quoted$ch"
        else
            local single_quotes="'"
            local j=1
            while [ $j -lt ${#1} ] && [[ "${1:i+j:1}" == "'" ]]; do
                single_quotes="$single_quotes'"
                ((j++))
            done
            quoted="$quoted'\"$single_quotes\"'"
            ((i+=j-1))
        fi
        ((i++))
    done
    echo "$quoted'"
}
So, you can use it this way:
single_quote "1 2 '3'"
'1 2 '"'"'3'"'"''
x="this text is quoted: 'hello'"
eval "echo $(single_quote "$x")"
this text is quoted: 'hello'

How to ignore empty selections in XMLStarlet?

I have a script that performs many XML edit operations with XMLStarlet.
For instance, it removes all foo nodes if any are present:
xmlstarlet ed -d '//foo'
(except that in my script, the name of the element is not foo).
When no foo node is present, the following message is printed:
None of the XPaths matched; to match a node in the default namespace
use '_' as the prefix (see section 5.1 in the manual).
For instance, use /_:node instead of /node
But there is nothing wrong if no foo nodes are present in the input document.
So for this particular operation, I do want to avoid this particular warning,
while I do not want to disable such warnings in general.
How can I achieve this?
At the moment, deleting a node that does not exist is categorized by xmlstarlet as an error. The return code will be 1, which you will have to interpret as "no nodes were removed".
The error message appears if you happen to use a document which has a default namespace:
No namespace:
echo '<foo />' | xmlstarlet sel -t -m foo -v joe
With a default namespace, xmlstarlet prints the error message:
echo '<foo xmlns="urn:foo" />' | xmlstarlet sel -t -m foo -v joe
None of the XPaths matched; to match a node in the default namespace
use '_' as the prefix (see section 5.1 in the manual).
For instance, use /_:node instead of /node
A prefixed namespace, but no default namespace:
echo '<ns:foo xmlns:ns="urn:foo" />' | xmlstarlet sel -t -m foo -v joe
In all of these cases, no nodes are found, so xmlstarlet exits with return code 1 (i.e. an error). The error message is meant to explain the error in the case where the user forgot that the document has a default namespace. I have discussed this with the author, and changes have since been introduced to reduce the chances of these messages appearing, along with a way to inhibit them.
xmlstarlet has (as yet undocumented) support for using the document's namespaces, without needing to declare them up-front:
Compare:
echo '<ns:foo xmlns:ns="urn:foo" />' | xmlstarlet sel -t -m ns:foo -v 'count(.)'
echo '<ns:foo xmlns:ns="urn:foo" />' | xmlstarlet sel -N xx=urn:foo -t -m xx:foo -v 'count(.)'
Technically, the commands are identical, except the first one depends on the document binding to the 'ns' prefix, while the second does not.
To inhibit the message, you have to redirect stderr to /dev/null:
echo '<foo xmlns="urn:xxx"/>' |
xmlstarlet sel -t -m foo -v joe 2> /dev/null
The downside of this is that it suppresses legitimate error messages, not just this bogus one which is caused by the fact that the source document uses namespaces.
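A middle ground is to capture stderr and drop only the lines of this particular hint, so that other error messages still get through. The following is a sketch under some assumptions: run_quiet_nomatch is a made-up helper name, mktemp is assumed to be available, and the filter assumes the hint consists of exactly the three lines quoted above.

```shell
# Sketch only: run_quiet_nomatch is a hypothetical helper name.
# Assumes mktemp exists and that the hint text matches the three
# lines quoted above.
run_quiet_nomatch() {
    _errfile=$(mktemp) || return 2
    "$@" 2>"$_errfile"          # run the command, capturing stderr only
    _status=$?
    # Pass everything through to stderr except the namespace hint.
    grep -v -F \
        -e "None of the XPaths matched" \
        -e "use '_' as the prefix" \
        -e "For instance, use /_:node" \
        "$_errfile" >&2
    rm -f "$_errfile"
    return "$_status"           # keep the command's own exit status
}
```

It would be used by prefixing the command, e.g. run_quiet_nomatch xmlstarlet sel -t -m foo -v joe. The exit status is passed through unchanged, so a caller can still tell "nothing matched" apart from success. Remaining messages are printed only after the command has finished.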
In recent builds, the --no-doc-namespace option has been added to inhibit this behaviour.
The change was introduced in this changeset and the changeset contains a long exchange regarding this error message, all caused by this StackOverflow question!

Xmlstarlet and sed to replace string in a file

I have a huge number of HTML files. I need to replace all the , and " with the HTML entities &nsbquo and &quot respectively.
I need to succeed in two steps for this:
1) Find all the text between tags. I need to replace only in this text between tags.
2) Replace all required strings using sed
My command for this is :
xmlstarlet sel -t -v "*//p" "index.html" | sed 's/,/\&nsbquo/'
This works, but now I don't know how to put the changes back into the index.html file.
sed has the -i option, but for that I need to give sed the filename. In my case, I have to use | to filter the required strings out of the HTML file.
Please help. I have been searching for this for two days, but with no luck.
Thank you,
Divya.
The main problem here is that in XML there is no difference between " and &quot;, so you can't use xmlstarlet to do this directly. You could replace " with a special placeholder string and then use sed to replace that with &quot;:
xmlstarlet ed -u "//p/text()" \
-x "str:replace(str:replace(., ',', '#NSBQUO#'), '\"', '#QUOT#')" \
quote.html | \
sed 's/#NSBQUO#/\&nsbquo\;/g; s/#QUOT#/\&quot\;/g' > quote-new.html
mv quote-new.html quote.html
NOTE: str:replace and other exslt functions were only added to xmlstarlet ed in version 1.3.0, so it was not available at the time this question was asked.

How do I push `sed` matches to the shell call in the replacement pattern?

I need to replace several URLs in a text file with some content dependent on the URL itself. Let's say for simplicity it's the first line of the document at the URL.
What I'm trying is this:
sed "s/^URL=\(.*\)/TITLE=$(curl -s \1 | head -n 1)/" file.txt
This doesn't work, since \1 is not set. However, the shell is getting called. Can I somehow push the sed match variables to that subprocess?
The accepted answer is just plain wrong. Proof:
Make an executable script foo.sh:
#! /bin/bash
echo $* 1>&2
Now run it:
$ echo foo | sed -e "s/\\(foo\\)/$(./foo.sh \\1)/"
\1
$
The $(...) is expanded before sed is run.
So you are trying to call an external command from inside the replacement pattern of a sed substitution. I don't think it can be done; the $... inside a pattern just allows you to use an already existing (constant) shell variable.
I'd go with Perl, see the /e option in the search-replace operator (s/.../.../e).
UPDATE: I was wrong, sed plays nicely with the shell, and it allows you to do that. But then the backslash in \1 should be escaped. Try instead:
sed "s/^URL=\(.*\)/TITLE=$(curl -s \\1 | head -n 1)/" file.txt
Try this:
sed "s/^URL=\(.*\)/\1/" file.txt | while read url; do sed "s#URL=\($url\)#TITLE=$(curl -s $url | head -n 1)#" file.txt; done
If there are duplicate URLs in the original file, then there will be n^2 of them in the output. The # as a delimiter depends on the URLs not including that character.
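A single pass over the file avoids both the duplication and the delimiter problem, because the URL never becomes part of a sed expression. The following is only a sketch: fetch_title and replace_urls are hypothetical names, and "first line of the document" is implemented as in the question, with curl -s and head -n 1.

```shell
# Sketch only: fetch_title and replace_urls are hypothetical names.

# Print the first line of the document at URL $1.
fetch_title() {
    # </dev/null keeps curl away from the loop's stdin
    curl -s "$1" </dev/null | head -n 1
}

# Copy stdin to stdout, turning each "URL=..." line into "TITLE=...".
replace_urls() {
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            URL=*) printf 'TITLE=%s\n' "$(fetch_title "${line#URL=}")" ;;
            *)     printf '%s\n' "$line" ;;
        esac
    done
}
```

It would be run as replace_urls < file.txt > file.new. Every URL line is fetched once, in order, and all other lines are copied unchanged.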
Late reply, but to make sure people don't get thrown off by the answers here: this can be done in GNU sed using the e flag of the s command, which executes the result of the substitution as a shell command. The following, for example, decrements a number at the beginning of a line:
echo "444 foo" | sed "s/\([0-9]*\)\(.*\)/expr \1 - 1 | tr -d '\n'; echo \"\2\";/e"
will produce:
443 foo
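Applied to the URL problem from the question, the same flag might be used as sketched below. The function name sed_titles is made up for illustration. The sketch assumes GNU sed, a fetch command without characters that are special in a sed replacement (/, & or \), and URLs without single quotes. Since every matching line is handed to a shell, it should only be run on trusted input.

```shell
# Sketch only: sed_titles is a hypothetical name; s///e needs GNU sed.
# $1 is the command used to fetch a URL, e.g. "curl -s".
# Every matching line is executed by a shell: trusted input only.
sed_titles() {
    sed "s/^URL=\(.*\)/printf 'TITLE='; $1 '\1' | head -n 1/e"
}
```

It would be called as sed_titles "curl -s" < file.txt. Lines that do not start with URL= are not executed and pass through unchanged.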