Why is the k8s container spec "command" field an array? - kubernetes

According to this official kubernetes documentation page, it is possible to provide "a command" and args to a container.
The page has 13 occurrences of the string "a command" and 10 occurrences of "the command" -- note the use of singular.
There are (besides file names) 3 occurrences of the plural "commands":
One leads to the page Get a Shell to a Running Container, which I am not interested in. I am interested in the start-up command of the container.
One mention is concerned with running several piped commands in a shell environment, however the provided example uses a single string: command: ["/bin/sh"].
The third occurrence is in the introductory sentence:
This page shows how to define commands and arguments when you run a container in a Pod.
All examples, including the explanation of how command and args interact when given or omitted, only ever show a single string in an array. The field even seems intended for a single command only, which would receive all the specified args, given its singular name.
The question is: Why is this field an array?
I assume the developers of kubernetes had a good reason for this, but I cannot think of one. What is going on here? Is it legacy? If so, how come? Is it future-readiness? If so, what for? Is it for compatibility? If so, to what?
Edit:
As I have written in a comment below, the only reason I can conceive of at this moment is this: the k8s developers wanted to achieve the interaction of command and args as documented AND allow a user to specify all parts of a command in a single parameter, instead of having a command span both command and args.
So essentially a compromise between a feature and readability.
Can anyone confirm this hypothesis?

Because the execve(2) system call takes an array of words. Everything at a higher level fundamentally reduces to this. As you note, a container only runs a single command, and then exits, so the array syntax is a native-Unix way of providing the command rather than a way to try to specify multiple commands.
For the sake of argument, consider a file named a file; with punctuation, where the spaces and semicolon are part of the filename. Maybe this is the input to some program, so in a shell you might write
some_program 'a file; with punctuation'
In C you could write this out as an array of strings and just run it
#include <unistd.h>

int main(void) {
    char *const argv[] = {
        "some_program",
        "a file; with punctuation", /* no escaping or quoting, an ordinary C string */
        NULL
    };
    execvp(argv[0], argv); /* replaces this process; does not return on success */
    return 1; /* reached only if execvp() fails */
}
and similarly in Kubernetes YAML you can write this out as a YAML array of bare words
command:
- some_program
- a file; with punctuation
Neither Docker nor Kubernetes will automatically run a shell for you (except in the case of the Dockerfile shell form of ENTRYPOINT or CMD). Part of the question is "which shell"; the natural answer would be a POSIX Bourne shell in the container's /bin/sh, but a very-lightweight container might not even have that, and sometimes Linux users expect /bin/sh to be GNU Bash, and confusion results. There are also potential lifecycle issues if the main container process is a shell rather than the thing it launches. If you do need a shell, you need to run it explicitly
command:
- /bin/sh
- -c
- some_program 'a file; with punctuation'
Note that sh -c's argument is a single word (in our C example, it would be a single entry in the argv array) and so it needs to be a single item in a command: or args: list. If you have the sh -c wrapper it can do anything you could type at a shell prompt, including running multiple commands in sequence. For a very long command it's not uncommon to see YAML block-scalar syntax here.
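For example, using a YAML block scalar (the script content here is purely illustrative):
command:
- /bin/sh
- -c
- |
  set -e
  echo "starting some_program"
  exec some_program 'a file; with punctuation'
Note that the entire block scalar is still a single list item, i.e. a single word handed to sh -c.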

I think the reason the command field is an array is that it directly overrides the ENTRYPOINT of the container image (and args overrides the CMD), both of which can be arrays, and should be arrays in order to use command and args together properly (see the documentation).

Related

Difference between '-- /bin/sh -c ls' vs 'ls' when setting a command in kubectl?

I am a bit confused with commands in kubectl. I am not sure when I can use commands directly like
command: ["command"] or -- some_command
vs
command: [/bin/sh, -c, "command"] or -- /bin/sh -c some_command
Thankfully the distinction is easy(?): every command: is fed into the exec system call (or its golang equivalent), so if your container contains a binary that the kernel can successfully execute, you are welcome to use it in command:; if it is a shell built-in, a shell alias, or otherwise requires sh (or python or whatever) to execute, then you must be explicit to the container runtime about that distinction.
If it helps any, the command: syntax of a kubernetes container: is the equivalent of the ENTRYPOINT ["",""] line of a Dockerfile, not CMD ["", ""], and for sure not the shell form ENTRYPOINT echo this, which is fed to /bin/sh for you.
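As a quick sketch of the difference (the image and commands are illustrative):
# sets command: ["ls", "/tmp"]; works because ls is a binary the kernel can exec
kubectl run t1 --image=busybox --restart=Never --command -- ls /tmp
# pipelines, built-ins, and expansions need an explicit shell as the executed binary
kubectl run t2 --image=busybox --restart=Never --command -- /bin/sh -c 'ls /tmp | wc -l'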
At a low level, every (Unix/Linux) command is invoked as a series of "words". If you type a command into your shell, the shell does some preprocessing and then creates the "words" and runs the command. In Kubernetes command: (and args:) there isn't a shell involved, unless you explicitly supply one.
I would default to using the list form unless you specifically need shell features.
command: # overrides Docker ENTRYPOINT
- the_command
- --an-argument
- --another
- value
If you use list form, you must explicitly list out each word. You may use either YAML block list syntax as above or flow list syntax [command, arg1, arg2]. If there are embedded spaces in a single item [command, --option value] then those spaces are included in a single command-line option as if you quoted it, which frequently confuses programs.
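For instance (program and option names are illustrative), these two flow lists produce different argv arrays:
command: [some_program, --option value]    # second argv word is "--option value", space included
command: [some_program, --option, value]   # --option and value are separate words, as most programs expect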
You can explicitly invoke a shell if you need to:
command:
- sh
- -c
- the_command --an-argument --another value
This command consists of exactly three words: sh, the option -c, and the shell command. The shell will process that command in the usual way and execute it.
You need the shell form only if you're doing something more complicated than running a simple command with fixed arguments. Running multiple sequential commands c1 && c2 or environment variable expansion c1 "$OPTION" are probably the most common ones, but any standard Bourne shell syntax would be acceptable here (redirects, pipelines, ...).
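A sketch combining those cases, reusing the c1/c2 names above ($OPTION is assumed to be set in the container's environment):
command:
- sh
- -c
# the whole script below is one list item, i.e. one word passed to sh -c
- c1 "$OPTION" && c2 > /tmp/out.log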

How to use expressions by name to select multiple pods to apply labels or annotations?

I want to understand how the following command works and what kind of expressions are supported:
kubectl label pod foo{1..3} fizz=buzz
The foo{1..3} selects:
foo1
foo2
foo3
I haven't found any documentation about this so far.
That syntax is the GNU Bash brace expansion extension syntax. Some other shells like zsh support it too, but it is not one of the Word Expansions in the POSIX shell spec; it won't work with some minimalist shells like the default dash shell in Debian GNU/Linux or the Busybox shell in an Alpine Docker image.
That means this is expanded by your local shell to construct the arguments to kubectl. Most of the expansion possibilities focus on filenames or environment variables. (foo* would match local files whose names begin with foo, not Kubernetes pods.) Potentially you could find $(command) substitution or $(( 1 + 2 )) arithmetic substitution useful. There's no broader Kubernetes name-matching syntax in play here; this is exclusively local-shell processing.
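You can check what kubectl actually receives by letting your shell expand the braces first:
$ echo foo{1..3}
foo1 foo2 foo3
$ kubectl label pod foo1 foo2 foo3 fizz=buzz    # what the original command becomes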

Can nmap run multiple nmap script with multiple arguments in one command?

I want to run multiple nmap scripts, each of which takes in one or multiple arguments.
For example, I want to run 3 scripts: sc1, sc2, sc3.
sc1 uses args: sc1.ag1, sc1.ag2, sc1.ag3
sc2 uses args: sc2.ag1, sc2.ag2
sc3 uses args: sc3.ag1
Is it possible to run a command like this?
nmap --script sc1,sc2,sc3 --script-args=sc1.ag1,sc1.ag2,sc1.ag3,sc2.ag1,sc2.ag2,sc3.ag1 192.168.111.111
Yes, that is allowed. You should be careful with quoting for your shell, since script args can contain spaces and quote characters.
You may also be interested in the --script-args-file option, which allows you to put each script argument on a separate line of a text file. The newline acts the same as the comma (",") in your example.
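A sketch of that approach (the file name args.txt is illustrative):
# args.txt -- one script argument per line, newlines standing in for the commas
sc1.ag1
sc1.ag2
sc1.ag3
sc2.ag1
sc2.ag2
sc3.ag1
nmap --script sc1,sc2,sc3 --script-args-file args.txt 192.168.111.111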
Script specification is covered in the online documentation.

supervisor program:x command expansion of environment variables $(ENV_VAR)s?

I would like to put configuration (in this case, site name) into supervisor
environment variables, for expansion in program:x command arguments. Is this supported? The documentation's wording would seem to indicate yes.
The following syntax is not working for me on supervisor-3.0 (excerpt of config file):
[supervisord]
environment = SITE="mysite"
[program:service_name]
command=/path/to/myprog/myservice /data/myprog/%(ENV_SITE)s/%(ENV_SITE)s.db %(program_name)s_%(process_num)03d
process_name=%(program_name)s_%(process_num)03d
numprocs=5
numprocs_start=1
Raises the following error:
sudo supervisord -c supervisord.conf
Error: Format string
'/path/to/myprog/myservice /data/myprog/%(ENV_SITE)s/%(ENV_SITE)s.db %(program_name)s_%(process_num)03d'
for 'command' contains names which cannot be expanded
Reading the documentation, I expected environment variables to be available for
expansion in program:x command as %(ENV_VAR)s:
http://supervisord.org/configuration.html#program-x-section-values
command:
"String expressions are evaluated against a dictionary containing the keys
group_name, host_node_name, process_num, program_name, here (the directory of
the supervisord config file), and all supervisord's environment variables
prefixed with ENV_."
Introduced: 3.0
Related:
There are open pull requests to enable expansion in additional section values:
https://github.com/Supervisor/supervisor/issues?labels=expansions&page=1&state=open
A search of Google (or SO) returns no examples of attempts to use %(ENV_VAR)s
expansion in the command section value:
https://www.google.com/search?q=supervisord+environment+expansion+in+command
I agree supervisor is not clear about this (to me at least).
I've found the easiest solution is to execute /bin/bash -c.
In your case it would be:
command=/bin/bash -c"/path/to/myprog/myservice /data/myprog/${SITE}/${SITE}.db ..."
What do you think?
I've found inspiration here: http://blog.trifork.com/2014/03/11/using-supervisor-with-docker-to-manage-processes-supporting-image-inheritance/
You are doing it right; however, the environment = setting in your [supervisord] section doesn't, for whatever reason, get made available during configuration loading. If you start supervisord like this:
SITE=mysite supervisord
It will run correctly and expand that variable. I don't know why supervisord has issues adding to the environment and making it available to the subprocesses' config expansion. I think the environment variable is available inside the subprocess, but not when expanding variables in the subprocess config declaration.
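So a workaround, based on that observation, is to set the variable in the shell environment at launch time instead of (or in addition to) the [supervisord] environment = setting:
SITE=mysite supervisord -c supervisord.conf
# %(ENV_SITE)s in command= now expands during configuration loading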

How can I find out what script, program, or shell executed my Perl script?

How would I determine what script, program, or shell executed my Perl script?
Example: I might want to have human readable output if executed from shell (customized for each type of shell), a different type of output if called as a script from another perl script, and a machine readable format if executed from a program such as a continuous integration server.
Motivation: I have a tool that changes its output based on which shell executes it. I'd normally implement this behavior as an option to the script, but this tool's design doesn't allow for options. Other shells have environment variables that indicate what shell is running. I'm working on a patch to support Powershell, which has no such special variable.
Edit: Many of these answers happen to be Linux-specific. Unfortunately, PowerShell is for Windows. getppid, the $ENV{SHELL} variable, and shelling out to ps won't help in this case. This script needs to run cross-platform.
You can use getppid(). Take this snippet in child.pl:
my $ppid = getppid();
system("ps --no-headers $ppid");
If you run it from the command line, system will show bash or similar (among other things). Execute it with system("perl child.pl"); in another script, e.g. parent.pl, and you will see that perl parent.pl executed it.
To capture just the name of the process with arguments (thanks to ikegami for the correct ps syntax):
my $ppid = getppid();
my $ps = `ps --no-headers -o cmd $ppid`;
chomp $ps;
EDIT: An alternative to this approach might be to create soft links to your script, have the different contexts use different links to access it, and inspect $0 to build logic around that.
I would suggest a different approach to accomplish your goal. Instead of guessing at the context, make it more explicit. Each use case is wholly separate, so have three different interfaces.
A function which can be called inside a Perl program. This would likely return a Perl data structure. This is far easier, faster and more reliable than parsing script output. It would also serve as the basis for the scripts.
A script which outputs for the current shell. It can look at $ENV{SHELL} to discover what shell is running. For bonus points, provide a switch to explicitly override.
A script which can be called inside a non-Perl program, such as your continuous integration server, and issue machine readable output. XML and/or JSON or whatever.
2 and 3 would be just thin wrappers to format the data coming out of 1.
Each is tailored to fit its specific need. Each will work without heuristics. Each will be far simpler than trying to guess the context and what the user wants.
If you can't separate 2 and 3, have the continuous integration server set an environment variable and look for it.
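For example (the variable and tool names are hypothetical):
# in the CI job's configuration
export MYTOOL_FORMAT=machine
# the script checks $ENV{MYTOOL_FORMAT} and falls back to the shell heuristics
mytool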
Depending on your environment, you may be able to pick it up from the environment variables. Consider the following code:
/usr/bin/perl -MData::Dumper -e 'print Dumper(\%ENV);' | grep sh
On my Ubuntu system, it gets me:
'SHELL' => '/bin/bash',
So I guess that says I'm running perl from a bash shell. If you use something else, the SHELL variable may give you a hint.
But let's say you know you're in bash, but perl is run from a subshell. Then try:
/bin/sh -c "/usr/bin/perl -MData::Dumper -e 'print Dumper(\%ENV);'" | grep sh
You will find:
'_' => '/bin/sh',
'SHELL' => '/bin/bash',
So the shell is still bash, but bash has a variable $_ which also shows the absolute filename of the shell or script being executed, which may also give a valuable hint. Similarly, for other environments there will most probably be clues left in Perl's %ENV hash that should give you valuable hints.
If you're running PowerShell 2.0 or above (most likely), you can infer that PowerShell is the parent process by examining the environment variable %psmodulepath%. By default, it points to the system modules under %windir%\system32\windowspowershell\v1.0\modules; this is what you would see if you examine the variable from cmd.exe.
However, when PowerShell starts up, it prepends the user's default module search path to this environment variable which looks like: %userprofile%\documents\windowspowershell\modules. This is inherited by child processes. So, your logic would be to test if %psmodulepath% starts with %userprofile% to detect powershell 2.0 or higher. This won't work in PowerShell 1.0 because it does not support modules.
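In Perl, that heuristic might look like this sketch (it assumes the environment variables behave as described above):
# PowerShell 2.0+ prepends the user's module path, so PSModulePath
# starting with the profile directory suggests a PowerShell parent
my $mp = $ENV{PSModulePath} // '';
my $up = $ENV{USERPROFILE}  // '';
my $from_powershell = $up ne '' && index($mp, $up) == 0;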
This is on Windows XP with PowerShell v2.0, so take it with a grain of salt.
In a cmd.exe shell, I get:
PSModulePath=C:\WINDOWS\system32\WindowsPowerShell\v1.0\Modules\
whereas in the PowerShell console window, I get:
PSModulePath=E:\Home\user\WindowsPowerShell\Modules;C:\WINDOWS\system32\WindowsPowerShell\v1.0\Modules\
where E:\Home\user is where my "My Documents" folder is. So, one heuristic may be to check if PSModulePath contains a user dependent path.
In addition, in a console window, I get:
!::=::\
in the environment. From the PowerShell ISE, I get:
!::=::\
!C:=C:\Documents and Settings\user