YAML, Docker Compose, Spaces & Quotes - docker-compose

Under what circumstances must one use quotes in a YAML file, specifically when using docker-compose?
For instance,
service:
  image: "my-registry/repo:tag1"
  environment:
    ENV1: abc
    ENV2: "abc"
    ENV3: "a b c"
If spaces are required, for example, must one use quotes around the environment variable, as depicted in ENV3?

After some googling I've found a blog post
that touches this problem as I understood it.
I'll cite the most important part here:
plain scalars:
- a string
- a string with a \ backslash that doesn't need to be escaped
- can also use " quotes ' and $ a % lot /&?+ of other {} [] stuff
single quoted:
- '& starts with a special character, needs quotes'
- 'this \ backslash also does not need to be escaped'
- 'just like the " double quote'
- 'to express one single quote, use '' two of them'
double quoted:
- "here we can use predefined escape sequences like \t \n \b"
- "or generic escape sequences \x0b \u0041 \U00000041"
- "the double quote \" needs to be escaped"
- "just like the \\ backslash"
- "the single quote ' and other characters must not be escaped"
literal block scalar: |
  a multiline text
  line 2
  line 3
folded block scalar: >
  a long line split into
  several short
  lines for readability
Also I have not seen such docker-compose syntax to set env variables. Documentation suggests using simple values like
environment:
- ENV1=abc
- "ENV2=abc"
Where quotes " or ' are optional in this particular example according to what I've said earlier.
To see how to include spaces in env variables you can check out this SO answer.

Whether or not you need quotes depends on the parser. Docker-compose, AFAIK, still relies on the PyYAML module, which implements most of YAML 1.1 and has a few quirks of its own.
In general you only need to quote what could otherwise be misinterpreted or clash with some YAML construct that is not a scalar string. You also need (double) quotes for things that cannot be represented in plain scalars, single quoted scalars or block style literal or folded scalars.
Misinterpretation
You need to quote strings that look like some of the other data structures:
booleans: "True", "False", but PyYAML also assumes alternative words like "Yes", "No", "On", "Off" represent boolean values (the all-lowercase and all-uppercase versions should be considered as well). Please note that the YAML 1.2 standard removed references to these alternatives.
integers: this includes strings consisting of digits only, but also hex (0x123) and octal (0123) numbers. Octals in YAML 1.2 are written as 0o123, which PyYAML doesn't support; it is nevertheless best to quote both forms.
A special kind of integer that PyYAML still supports, but which is no longer in the YAML 1.2 specification, is the sexagesimal: a base-60 number separated by colons (:). Time indications, but also MAC addresses, can be interpreted as such if the values between/after the colons are in the range 00-59.
floats: strings like 1E3 (with optional sign and mantissa) should be quoted. Of course 3.14 needs to be quoted as well if it is a string. And sexagesimal floats (with a mantissa after the number after the final colon) should be quoted as well.
timestamps: 2001-12-15T02:59:43.1Z but also iso-8601 like strings should be quoted to prevent them from being interpreted as timestamps
The null value is written as the empty string, as ~ or Null (in all casing types), so any strings matching those need to be quoted.
Quoting in the above can be done with either single or double quotes, or block style literal or folded scalars can be used. Please note that for the block style you should use |- or >- respectively, in order not to introduce a trailing newline that is not in the original string.
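As a rough illustration of the categories listed above, here is a small Python sketch that flags strings which might be read as something other than a string when left unquoted. The patterns and the function name may_need_quotes are my own simplifications, not PyYAML's actual resolver, and they are deliberately conservative, so they may flag values that PyYAML would in fact leave alone.

```python
import re

# Conservative, hand-written patterns (assumptions, not PyYAML's own regexes).
KEYWORDS = re.compile(r"(?:yes|no|on|off|true|false|null|~|)", re.IGNORECASE)
INTEGER = re.compile(r"[-+]?(?:0x[0-9a-fA-F]+|0o[0-7]+|[0-9]+)")
SEXAGESIMAL = re.compile(r"[-+]?[0-9]+(?::[0-5]?[0-9])+(?:\.[0-9]*)?")
TIMESTAMP = re.compile(r"[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}")


def may_need_quotes(text):
    """Return True if `text`, written unquoted, might be read as a non-string."""
    if KEYWORDS.fullmatch(text) or INTEGER.fullmatch(text):
        return True  # boolean/null words, the empty string, decimal/hex/octal
    if SEXAGESIMAL.fullmatch(text) or TIMESTAMP.match(text):
        return True  # colon-separated numbers, date-like prefixes
    try:
        float(text)  # catches 3.14, 1E3 and similar
        return True
    except ValueError:
        return False


for sample in ["abc", "a b c", "Yes", "0123", "0x123", "12:30:45", "1E3", "~", ""]:
    print(repr(sample), may_need_quotes(sample))
```

This only covers the "misinterpretation" cases; it says nothing about characters that clash with YAML syntax.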
Clashes
YAML assigns special meaning to certain characters or character combinations. Some of these only have special meaning at the beginning of a string, others only within a string.
characters from the set !&*?{[ normally indicate special YAML constructs. Some of these might be disambiguated depending on the following character, but I would not rely on that.
whitespace followed by # indicates an end of line comment
wherever a key is possible (and within block mode that is in many places) the combination of colon + space (: ) indicates a value will be following. If that combination is part of your scalar string, you have to quote.
As with the misinterpretation you can use single or double quoting or block-style literal or folding scalars. There can be no end-of-line comments beyond the first line of a block-style scalar.
PyYAML can additionally get confused by any colon + space within a plain scalar (even when this is in a value) so always quote those.
Representing special characters
You can insert special characters or unicode code-points in a YAML file, but if you want these to be clearly visible in all cases, you might want to use escape sequences. In that case you have to use double quotes; this is the only mode that allows backslash escapes, e.g. \u2029. A full list of such escapes can be taken from the standard, but note that PyYAML doesn't implement e.g. \/ (or at least did not when I forked that library).
One trick to find out what to quote or not is to use the library used to dump the strings that you have. My ruamel.yaml and PyYAML used by docker-compose, when potentially dumping a plain scalar, both try to read back (yes, by parsing the result) the plain scalar representation of a string and if that results in something different than a string, it is clear quotes need to be applied. You can do so too: when in doubt write a small program dumping the list of strings that you have using PyYAML's safe_dump() and apply quotes anywhere that PyYAML does.

Related

Output is not generating while running the bash script [duplicate]

In Bash, what are the differences between single quotes ('') and double quotes ("")?
Single quotes won't interpolate anything, but double quotes will. For example: variables, backticks, certain \ escapes, etc.
Example:
$ echo "$(echo "upg")"
upg
$ echo '$(echo "upg")'
$(echo "upg")
The Bash manual has this to say:
3.1.2.2 Single Quotes
Enclosing characters in single quotes (') preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash.
3.1.2.3 Double Quotes
Enclosing characters in double quotes (") preserves the literal value of all characters within the quotes, with the exception of $, `, \, and, when history expansion is enabled, !. The characters $ and ` retain their special meaning within double quotes (see Shell Expansions). The backslash retains its special meaning only when followed by one of the following characters: $, `, ", \, or newline. Within double quotes, backslashes that are followed by one of these characters are removed. Backslashes preceding characters without a special meaning are left unmodified. A double quote may be quoted within double quotes by preceding it with a backslash. If enabled, history expansion will be performed unless an ! appearing in double quotes is escaped using a backslash. The backslash preceding the ! is not removed.
The special parameters * and @ have special meaning when in double quotes (see Shell Parameter Expansion).
The accepted answer is great. I am making a table that helps in quick comprehension of the topic. The explanation involves a simple variable a as well as an indexed array arr.
If we set
a=apple # a simple variable
arr=(apple) # an indexed array with a single element
and then echo the expression in the second column, we would get the result / behavior shown in the third column. The fourth column explains the behavior.
# | Expression | Result | Comments
1 | "$a" | apple | variables are expanded inside ""
2 | '$a' | $a | variables are not expanded inside ''
3 | "'$a'" | 'apple' | '' has no special meaning inside ""
4 | '"$a"' | "$a" | "" is treated literally inside ''
5 | '\'' | invalid | can not escape a ' within ''; use "'" or $'\'' (ANSI-C quoting)
6 | "red$arocks" | red | $arocks does not expand $a; use ${a}rocks to preserve $a
7 | "redapple$" | redapple$ | $ followed by no variable name evaluates to $
8 | '\"' | \" | \ has no special meaning inside ''
9 | "\'" | \' | \' is interpreted inside "" but has no significance for '
10 | "\"" | " | \" is interpreted inside ""
11 | "*" | * | glob does not work inside "" or ''
12 | "\t\n" | \t\n | \t and \n have no special meaning inside "" or ''; use ANSI-C quoting
13 | "`echo hi`" | hi | `` and $() are evaluated inside "" (backquotes are retained in actual output)
14 | '`echo hi`' | `echo hi` | `` and $() are not evaluated inside '' (backquotes are retained in actual output)
15 | '${arr[0]}' | ${arr[0]} | array access not possible inside ''
16 | "${arr[0]}" | apple | array access works inside ""
17 | $'$a\'' | $a' | single quotes can be escaped inside ANSI-C quoting
18 | "$'\t'" | $'\t' | ANSI-C quoting is not interpreted inside ""
19 | '!cmd' | !cmd | history expansion character '!' is ignored inside ''
20 | "!cmd" | cmd args | expands to the most recent command matching "cmd"
21 | $'!cmd' | !cmd | history expansion character '!' is ignored inside ANSI-C quotes
See also:
ANSI-C quoting with $'' - GNU Bash Manual
Locale translation with $"" - GNU Bash Manual
A three-point formula for quotes
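The backslash rows of the table in this answer (8, 9, 10 and 12) can be cross-checked without a shell. The sketch below uses Python's shlex module, whose POSIX mode models the same single- and double-quote backslash rules. It is only an approximation of Bash: shlex does no variable, command or history expansion, and the helper name parse_word is mine.

```python
import shlex


def parse_word(source):
    """Parse one shell word using shlex's POSIX quoting rules."""
    (word,) = shlex.split(source)
    return word


row_8 = parse_word(r"""'\"'""")    # shell source: '\"'
row_9 = parse_word(r'''"\'"''')    # shell source: "\'"
row_10 = parse_word(r'''"\""''')   # shell source: "\""
row_12 = parse_word(r'"\t\n"')     # shell source: "\t\n"

for label, word in [("8", row_8), ("9", row_9), ("10", row_10), ("12", row_12)]:
    print(label, word)
```

The printed words match the Result column for those four rows.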
If you're referring to what happens when you echo something, the single quotes will literally echo what you have between them, while the double quotes will evaluate variables between them and output the value of the variable.
For example, this
#!/bin/sh
MYVAR=sometext
echo "double quotes gives you $MYVAR"
echo 'single quotes gives you $MYVAR'
will give this:
double quotes gives you sometext
single quotes gives you $MYVAR
Others explained it very well, and I just want to give something with simple examples.
Single quotes can be used around text to prevent the shell from interpreting any special characters. Dollar signs, spaces, ampersands, asterisks and other special characters are all ignored when enclosed within single quotes.
echo 'All sorts of things are ignored in single quotes, like $ & * ; |.'
It will give this:
All sorts of things are ignored in single quotes, like $ & * ; |.
The only thing that cannot be put within single quotes is a single quote.
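Because a single quote cannot appear inside single quotes, the usual workaround is to close the quoted string, add a double-quoted ', and reopen it. As a small illustration (in Python rather than shell), the standard library's shlex.quote produces exactly that form; the sample text is made up.

```python
import shlex

text = "It's 5 o'clock"

# shlex.quote wraps the text in single quotes and splices in "'" for each '.
quoted = shlex.quote(text)
print(quoted)

# Parsing the quoted form with POSIX shell rules gives the original text back.
roundtrip = shlex.split(quoted)
print(roundtrip)
```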
Double quotes act similarly to single quotes, except double quotes still allow the shell to interpret dollar signs, back quotes and backslashes. It is already known that backslashes prevent a single special character from being interpreted. This can be useful within double quotes if a dollar sign needs to be used as text instead of for a variable. It also allows double quotes to be escaped so they are not interpreted as the end of a quoted string.
echo "Here's how we can use single ' and double \" quotes within double quotes"
It will give this:
Here's how we can use single ' and double " quotes within double quotes
It may also be noticed that the apostrophe, which would otherwise be interpreted as the beginning of a quoted string, is ignored within double quotes. Variables, however, are interpreted and substituted with their values within double quotes.
echo "The current Oracle SID is $ORACLE_SID"
It will give this:
The current Oracle SID is test
Back quotes are wholly unlike single or double quotes. Instead of being used to prevent the interpretation of special characters, back quotes actually force the execution of the commands they enclose. After the enclosed commands are executed, their output is substituted in place of the back quotes in the original line. This will be clearer with an example.
today=`date '+%A, %B %d, %Y'`
echo $today
It will give this:
Monday, September 28, 2015
Since this is the de facto answer when dealing with quotes in Bash, I'll add one more point missed in the answers above, concerning the arithmetic operators in the shell.
The Bash shell supports two ways to do arithmetic operations, one defined by the built-in let command and the other by the $((..)) operator. The former evaluates an arithmetic expression while the latter is more of a compound statement.
It is important to understand that the arithmetic expression used with let undergoes word splitting and pathname expansion just like any other shell command. So proper quoting and escaping need to be done.
See this example when using let:
let 'foo = 2 + 1'
echo $foo
3
Using single quotes is absolutely fine here, as there isn't any need for variable expansion. Consider a case of
bar=1
let 'foo = $bar + 1'
It would fail miserably, as the $bar under single quotes would not expand and needs to be double-quoted as
let 'foo = '"$bar"' + 1'
This is one of the reasons $((..)) should always be preferred over let: inside it, the contents aren't subject to word splitting. The previous example using let can simply be written as
(( bar=1, foo = bar + 1 ))
Always remember to use $((..)) without single quotes
Though the $((..)) can be used with double quotes, there isn't any purpose to it as the result of it cannot contain content that would need the double quote. Just ensure it is not single quoted.
printf '%d\n' '$((1+1))'
-bash: printf: $((1+1)): invalid number
printf '%d\n' $((1+1))
2
printf '%d\n' "$((1+1))"
2
Maybe in some special cases of using the $((..)) operator inside a single-quoted string, you need to interleave quotes so that the operator is either left unquoted or placed under double quotes. E.g., consider a case where you are trying to use the operator inside a curl statement to pass a counter every time a request is made:
curl http://myurl.com --data-binary '{"requestCounter":'"$((reqcnt++))"'}'
Notice the use of nested double quotes inside, without which the literal string $((reqcnt++)) is passed to the requestCounter field.
There is a clear distinction between the usage of ' ' and " ".
When ' ' is used around anything, there is no "transformation or translation" done. It is printed as it is.
With " ", whatever it surrounds, is "translated or transformed" into its value.
By translation/ transformation I mean the following:
Anything within the single quotes will not be "translated" to their values. They will be taken as they are inside quotes. Example: a=23, then echo '$a' will produce $a on standard output. Whereas echo "$a" will produce 23 on standard output.
A minimal answer is needed for people to get going without spending a lot of time as I had to.
The following is, surprisingly (to those looking for an answer), a complete command:
$ echo '\'
whose output is:
\
Backslashes, surprisingly to even long-time users of bash, do not have any meaning inside single quotes. Nor does anything else.

Oddities in fail2ban regex

This appears to be a bug in fail2ban, with different behaviour between the fail2ban-regex tool and a failregex filter
I am attempting to develop a new regex rule for fail2ban, to match:
\"%20and%20\"x\"%3D\"x
When using fail2ban-regex, this appears to produce the desired result:
^<HOST>.*GET.*\\"%20and%20\\"x\\"%3D\\"x.* 200.*$
As does this:
^<HOST>.*GET.*\\\"%20and%20\\\"x\\\"%3D\\\"x.* 200.*$
However, when I put either of these into a filter, I get the following error:
Failed during configuration: '%' must be followed by '%' or '(', found:…
To have this work in a filter you have to double-up the ‘%’, ie ‘%%’:
^<HOST>.*GET.*\\\"%%20and%%20\\\"x\\\"%%3D\\\"x.* 200.*$
While this gets the required hits running as a filter, it gets none running through fail2ban-regex.
I tried the \\\\ as Andre suggested below, but this gets no results in fail2ban-regex.
So, as this appears to be differential behaviour, I am going to file it as a bug.
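The error text quoted above is the one Python's configparser raises during value interpolation. Assuming fail2ban loads filter files through configparser with interpolation enabled (which the message suggests, but which I have not verified in the fail2ban source), the difference can be reproduced with the standard library alone. The section name and the shortened regex below are illustrative only.

```python
import configparser


def read_failregex(value):
    """Store `value` in a filter-like config and read it back with interpolation."""
    parser = configparser.ConfigParser()
    parser.read_string("[Definition]\nfailregex = " + value + "\n")
    return parser.get("Definition", "failregex")


single = r'^<HOST>.*GET.*\\\"%20and%20\\\"x.* 200.*$'
doubled = single.replace("%", "%%")

try:
    read_failregex(single)
    single_ok = True
except configparser.InterpolationSyntaxError as exc:
    single_ok = False
    print("single % rejected:", exc)

# With %% the interpolation step collapses each pair back to one %.
restored = read_failregex(doubled)
print(restored == single)
```

A regex passed directly on a command line never goes through this interpolation step, which would explain why the doubled form matches nothing there.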
According to Python's own site a single backslash "\" has to be written as "\\\\" and there's no mention of %.
Regular expressions use the backslash character ('\') to indicate
special forms or to allow special characters to be used without
invoking their special meaning. This collides with Python’s usage of
the same character for the same purpose in string literals; for
example, to match a literal backslash, one might have to write '\\\\'
as the pattern string, because the regular expression must be \\, and
each backslash must be expressed as \\ inside a regular Python string
literal
I would just go with:
failregex = (?i)^<HOST> -.*"(GET|POST|HEAD|PUT).*20and.*3d.*$
The .* will match anything in between anyway, and (?i) makes the entire regex case-insensitive.

Not able to understand a command in perl

I need help to understand what below command is doing exactly
$abc{hier} =~ s#/tools.*/dfII/?.*##g;
and $abc{hier} contains a path "/home/test1/test2/test3"
Can someone please let me know what the above command is doing exactly. Thanks
s/PATTERN/REPLACEMENT/ is Perl's substitution operator. It searches a string for text that matches the regex PATTERN and replaces it with REPLACEMENT.
By default, the substitution operator works on $_. To tell it to work on a different variable, you use the binding operator - =~.
The default delimiter used by the substitution operator is a slash (/) but you can change that to any other character. This is useful if your PATTERN or your REPLACEMENT contains a slash. In this case, the programmer has used # as the delimiter.
To recap:
$abc{hier} =~ s#PATTERN#REPLACEMENT#;
means "look for text in $abc{hier} that matches PATTERN and replace it with REPLACEMENT".
The substitution operator also has various options that change its behaviour. They are added by putting letters after the final delimiter. In this case we have a g. That means "make the substitution global" - or match and change all occurrences of PATTERN.
In your case, the REPLACEMENT string is empty (we have two # characters next to each other). So we're replacing the PATTERN with nothing - effectively deleting whatever matches PATTERN.
So now we have:
$abc{hier} =~ s#PATTERN##g;
And we know it means, "in the variable $abc{hier}, look for any string that matches PATTERN and replace it with nothing".
The last thing to look at is the PATTERN (or regular expression - "regex"). You can get the full definition of regexes in perldoc perlre. But to explain what we're using here:
/tools : is the fixed string "/tools"
.* : is zero or more of any character
/dfII : is the fixed string "/dfII"
/? : is an optional slash character
.* : is (again) zero or more of any character
So, basically, we're removing bits of a file path from a value that's stored in a hash.
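For readers who want to try the pattern outside Perl, here is a rough Python equivalent using re.sub, which replaces every match by default, like the g flag. The second sample path is hypothetical, chosen only so that the pattern has something to match; the path from the question contains no /tools, so it comes back unchanged.

```python
import re

# Same regex as in the Perl statement; only the delimiters differ.
PATTERN = re.compile(r"/tools.*/dfII/?.*")


def strip_tools_suffix(path):
    """Delete everything from '/tools' onwards when '/dfII' follows later."""
    return PATTERN.sub("", path)


unchanged = strip_tools_suffix("/home/test1/test2/test3")
trimmed = strip_tools_suffix("/home/user/tools/v6/dfII/bin/run")  # hypothetical path

print(unchanged)
print(trimmed)
```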
This =~ means "Do a regex operation on that variable."
(Actually, as ikegami correctly reminds me, it is not necessarily only regex operations, because it could also be a transliteration.)
The operation in question is s#something#else#, which means replace the "something" with something "else".
The g at the end means "Do it for all occurrences of something."
Since the "else" is empty, the replacement has the effect of deleting.
The "something" is a definition according to regex syntax, roughly it means "Starting with '/tools' and later containing '/dfII', followed pretty much by anything until the end."
Note that the regex ends with /?.*. In detail, this means "a slash (/), or maybe not (?), and then absolutely anything (.) any number of times, including 0 times (*)". Strictly speaking, it is not necessary to specify "slash or not" if it is followed by "anything, any number of times", because "anything" includes a slash and "any number of times" includes zero or one time, whether or not more "anything" follows. I.e. the /? could be omitted without changing the behaviour.
(Thanks ikegami for confirming.)
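The claim that /? is redundant is easy to spot-check. The sketch below, in Python, compares the two patterns on a few made-up paths; agreement on samples is of course not a proof, just a sanity check.

```python
import re

with_slash = re.compile(r"/tools.*/dfII/?.*")
without_slash = re.compile(r"/tools.*/dfII.*")

# Hypothetical sample values, including ones that do not match at all.
samples = [
    "/home/test1/test2/test3",
    "/opt/tools/dfII",
    "/opt/tools/dfII/",
    "/opt/tools/x/dfII/bin/run",
    "/opt/tools/dfIIextra",
]

results = [(with_slash.sub("", s), without_slash.sub("", s)) for s in samples]
same = all(a == b for a, b in results)
print(same)
```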
$abc{hier} =~ s#/tools.*/dfII/?.*##g;
The above command uses a regular expression to strip a trailing /tools.*/dfII or
/tools.*/dfII/.* from the value of the hier member of the %abc hash.
It is pretty basic Perl, except for the non-standard regular expression delimiters (# instead of the standard /). They avoid having to escape / inside the regular expression (s/\/tools.*\/dfII\/?.*//g).
My personal preferred style-guide would make it s{/tools.*/dfII/?.*}{}g .

CSV specification - double quotes at the start and end of fields

Question (because I can't work it out), should ""hello world"" be a valid field value in a CSV file according to the specification?
i.e should:
1,""hello world"",9.5
be a valid CSV record?
(If so, then the Perl CSV-XS parser I'm using is mildly broken, but if not, then $line =~ s/\342\200\234/""/g; is a really bad idea ;) )
The weird thing is that this code has been running without issue for years, but we've only just hit a record that both started with a left double quote and contained no comma (the above is from a CSV pre-parser).
The canonical format definition of CSV is https://www.rfc-editor.org/rfc/rfc4180.txt. It says:
Each field may or may not be enclosed in double quotes (however
some programs, such as Microsoft Excel, do not use double quotes
at all). If fields are not enclosed with double quotes, then
double quotes may not appear inside the fields. For example:
"aaa","bbb","ccc" CRLF
zzz,yyy,xxx
Fields containing line breaks (CRLF), double quotes, and commas
should be enclosed in double-quotes. For example:
"aaa","b CRLF
bb","ccc" CRLF
zzz,yyy,xxx
If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
The last rule means your line should have been:
1,"""hello world""",9.5
But not all parsers/generators follow this standard perfectly, so you might need for interoperability reasons to relax some rules. It all depends on how much you control the CSV format writing and CSV format parsing parts.
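To see the doubling rule applied by a writer, here is a short sketch using Python's csv module (not the Perl parser from the question). Its default dialect quotes a field that contains a double quote and doubles the embedded quotes, and the reader reverses that.

```python
import csv
import io


def write_record(fields):
    """Serialise one record with the csv module's default dialect."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(fields)
    return buffer.getvalue()


line = write_record([1, '"hello world"', 9.5])
print(line, end="")

# Reading the line back recovers the field with its original quotes.
parsed = next(csv.reader(io.StringIO(line)))
print(parsed)
```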
That depends on the escape character you use. If your escape character is '"' (double quote) then your line should look like
1,"""hello world""",9.5
If your escape character is '\' (backslash) then your line should look like
1,"\"hello world\"",9.5
Check your parser/environment defaults or explicitly configure your parser with the escape character you need e.g. to use backslash do:
my $csv = Text::CSV_XS->new ({ quote_char => '"', escape_char => "\\" });

Why does my LIKE statement fail with '\\_' for matching?

I have a database table with entries that look like this:
id | name | code_set_id
I have this particular entry that I need to find:
674272310 | raphodo/qrc_resources.py | 782732
In my rails app (2.3.8), I have a statement that evaluates to this:
SELECT * from fyles WHERE code_set_id = 782732 AND name LIKE 'raphodo/qrc\\_resources.py%';
From reading up on escaping, the above query is correct. This is supposed to correctly double escape the underscore. However this query does not find the record in the database. These queries will:
SELECT * from fyles WHERE code_set_id = 782732 AND name LIKE 'raphodo/qrc\_resources.py%';
SELECT * from fyles WHERE code_set_id = 782732 AND name LIKE 'raphodo/qrc_resources.py%';
Am I missing something here? Why is the first SQL statement not finding the correct entry?
A single backslash in the RHS of a LIKE escapes the following character:
9.7.1. LIKE
[...]
To match a literal underscore or percent sign without matching other characters, the respective character in pattern must be preceded by the escape character. The default escape character is the backslash but a different one can be selected by using the ESCAPE clause. To match the escape character itself, write two escape characters.
So this is a literal underscore in a LIKE pattern:
\_
and this is a single backslash followed by an "any character" pattern:
\\_
You want LIKE to see this:
raphodo/qrc\_resources.py%
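The difference between the two patterns can be modelled outside the database. The following Python sketch translates a LIKE pattern into a regex using the escape rules quoted above. It is a simplified model of LIKE, not PostgreSQL's implementation, and it assumes the backslashes reach LIKE exactly as written; like_to_regex and like are names I made up.

```python
import re


def like_to_regex(pattern, escape="\\"):
    """Translate a SQL LIKE pattern into an anchored regex (simplified model)."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            parts.append(re.escape(pattern[i + 1]))  # escaped char is literal
            i += 2
            continue
        if ch == "%":
            parts.append(".*")   # any run of characters
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def like(value, pattern):
    return like_to_regex(pattern).fullmatch(value) is not None


name = "raphodo/qrc_resources.py"
single = like(name, r"raphodo/qrc\_resources.py%")    # literal underscore
double = like(name, r"raphodo/qrc\\_resources.py%")   # literal backslash, then any char
plain = like(name, "raphodo/qrc_resources.py%")       # underscore as wildcard

print(single, double, plain)
```

The doubled form only matches names that actually contain a backslash before one more character, which the stored name does not.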
PostgreSQL used to interpret C-style backslash escapes in strings by default, but no longer does; now you have to use E'...' to use backslash escapes in string literals (unless you've changed the configuration options). The String Constants with C-style Escapes section of the manual covers this, but the simple version is that these two:
name LIKE E'raphodo/qrc\\_resources.py%'
name LIKE 'raphodo/qrc\_resources.py%'
do the same thing as of PostgreSQL 9.1.
Presumably your Rails 2.3.8 app (or whatever is preparing your LIKE patterns) is assuming an older version of PostgreSQL than the one you're actually using. You'll need to adjust things to not double your backslashes (or prefix the pattern string literals with Es).