luigi: command-line parameters not becoming part of a task's signature? - command-line

In luigi, I know how to use its parameter mechanism to pass command-line parameters into a task. However, if I do so, the parameter becomes part of the task's signature.
But there are some cases -- for example, if I want to optionally pass a --debug or --verbose flag on the command line -- where I don't want the command-line parameter to become part of the task's signature.
I know I can do this outside of the luigi world, such as by running my tasks via a wrapper script which can optionally set environment variables to be read within my luigi code. However, is there a way I can accomplish this via luigi, directly?

Just declare them as insignificant parameters, ie instantiate the parameter class passing significant=False as keyword argument.
Example:
class MyTask(DateTask):
other = luigi.Parameter(significant=False)

Related

Data Fusion - Argument defined in Argument Setter Plugin Supersedes Runtime Arguments intermittently

Using Data Fusion Argument Setter, I've defined all parameters in it for a reusable pipeline. While executing it, I provide runtime arguments for some parameters which are different from default arguments provided in the JSON URL embedded in Argument Setter.
But a number of times, the pipeline ends up taking the default values from Argument Setter URL instead of Runtime Arguments causing failures.
This behavior is not consistent in every pipeline I create - which confirms that Runtime arguments are supposed to supersede any prior value defined for an argument.
The workarounds I use is by deleting the plugin and re-adding it for every new pipeline. But that defeats the purpose of creating a re-usable pipeline.
Has anyone experienced this issue ?
Current Runtime Options
This wiki https://cloud.google.com/data-fusion/docs/tutorials/reusable-pipeline provides the sample of how to create re-usable pipeline using Argument Setter. From there, it seems like the runtime arguments was used to notify the data fusion pipeline to use the macro from Argument Setter URL. Argument Setter is a type of Action plugin that allows one to create reusable pipelines by dynamically substituting the configurations that can be served by an HTTP Server. It looks like no matter how you change the runtime arguments, as long as long the same marco can be read when pipeline is running, the arguments will be override.

Using dot source function in params

I'm going to start by saying I'm still pretty much a rookie at PowerShell and hoping there is a way to do this.
We have a utils.ps1 script that contains just functions that we dot source with in other scripts. One of the functions returns back a default value if a value is not passed in. I know I could check $args and such but what I wanted was to use the function for the default value in the parameters.
param(
[string]$dbServer=$(Get-DefaultParam "dbServer"),
[string]$appServer=$(Get-DefaultParam "appServer")
)
This doesn't work since the Util script hasn't been sourced yet. I can't put the dot source first because then params doesn't work as it's not the top line. The utils isn't a module and I can't use the #require.
What I got working was this
param(
[ValidateScript({ return $false; })]
[bool]$loadScript=$(. ./Utils.ps1; $true),
[string]$dbServer=$(Get-DefaultParam "dbServer"),
[string]$appServer=$(Get-DefaultParam "appServer")
)
Create a parameter that loads the script and prevent passing a value into that parameter. This will load the script in the correct scope, if I load it in the ValidateScript it's not in the correct scope. Then the rest of the parameters have access to the functions in the Utils.ps1. This probably is not a supported side effect, aka hack, as if I move the loadScript below the other parameters fail since the script hasn't been loaded.
PowerShell guarantee parameters will always load sequential?
Instead should we put all the functions in Utils.ps1 in global scope? this would need to run Utils.ps1 before the other scripts - which seems ok in scripting but less than ideal when running the scripts by hand
Is there a more supported way of doing this besides modules and #require?
Better to not use default value of params and just code all the checks after sourcing and check $args if we need to run the function?
It would be beneficial to instead turn that script into a PowerShell Module, despite your statement that you desire to avoid one. This way, your functions are always available for use as long as the module is installed. Also, despite not wanting to use it, the #Require directive is how you put execution constraints on your script, such as PowerShell version or modules that must be installed for the script to function.
If you really don't want to put this into a module, you can dot-source utils.ps1 from the executing user's $profile. As long as you don't run powershell.exe with the -NoProfile parameter, the profile loads with each session and your functions will be available for use.

Access PowerShell script parameters from inside module

I have a PowerShell module that contains a number of common management and deployment functions. This is installed on all our client workstations. This module is called from a large number of scripts that get executed at login, via scheduled tasks or during deployments.
From within the module, it is possible to get the name of the calling script:
function Get-CallingScript {
return ($script:MyInvocation.ScriptName)
}
However, from within the module, I have not found any way of accessing the parameters originally passed to the calling script. For my purposes, I'd prefer to access them in the form of a dictionary object, but even the original command line would do. Unfortunately, given my use case, accessing the parameters from within the script and passing them to the module is not an option.
Any ideas? Thank you.
From about_Scopes:
Sessions, modules, and nested prompts are self-contained environments,
but they are not child scopes of the global scope in the session.
That being said, this worked for me from within a module:
$Global:MyInvocation.UnboundArguments
Note that I was calling my script with an unnamed parameter when the script was defined without parameters, so UnboundArguments makes sense. You might need this instead if you have defined parameters:
$Global:MyInvocation.BoundParameters
I can see how this in general would be a security concern. For instance, if there was a credential passed to the function up the stack from you, you would be able to access that credential.
The arguments passed to the current function can be accessed via $PSBoundParameters, but there isn't a mechanism to look at the call stack function's parameters.

How can I get the list of properties that MSBuild was invoked with?

Given this command:
MSBuild.exe build.xml /p:Configuration=Live /p:UseMerge=true /p:EnableUpdateable=false
how can I form a string like this in my build script:
UseMerge=true;EnableUpdateable=true
where I might not know which properties were used at the command line.
What are you going to do with the list?
There's no built in "properties that came via the commandline" thing a la splatting in PowerShell 2.0
Remember properties can come from environment variables and/or other scripts.
Also, you stripped on of the params out in your example.
In general, if one is trying to chain to another command, one uses defaulting (Conditions on elements in PropertyGroups) and validation (Messages Conditional on presence of options) and then either create a new property or embed the params you want to pass into a string.
Here's hoping someone has a nice neat example of a more general way to do this but I doubt it.
As covered in http://www.simple-talk.com/dotnet/.net-tools/extending-msbuild/ one can dump out the parameters passed by doing /v:diag on the commandline (but that's obviously not what you're after).
Have a look in the Common.targets files - you'll find lots of cases of chaininign involving manaully building up lists to pass onto subservient tasks.

How do I get around PowerShell not binding pipeline parameters until after BeginProcessing is called?

I'm writing a Cmdlet that can be called in the middle of a pipeline. With this Cmdlet, there are parameters that have the ValueFromPipelineByPropertyName attribute defined so that the Cmdlet can use parameters with the same names that are defined earlier in the pipeline.
The paradox that I've run into is in the overridden BeginProcessing() method, I utilize one of the parameters that can get its value bound from the pipeline. According to the Cmdlet Processing Lifecycle, the binding of pipeline parameters does not occur until after BeginProcessing() is called. Therefore, it seems that I'm unable to rely on pipeline bound parameters if they're attempting to be used in BeginProcessing().
I've thought about moving things to the ProcessRecord() method. Unfortunately, there is a one time, relatively expensive operation that needs to occur. The best place for this to happen seems to be in the BeginProcessing() method to help ensure that it only happens once in the pipeline.
A few questions question surrounding this:
Is there a good way around this?
These same parameters also have the Mandatory attribute set on them. How can I even get this far without PowerShell complaining about not having these required parameters?
Thanks in advance for your thoughts.
Update
I took out the second part of the question as I realized I just didn't understand pipeline bound parameters well enough. I mistakingly thought that pipeline bound parameters came from the previous Cmdlet that executed in the pipeline. The actually come from the object being passed through the pipeline! I referenced a post by Keith Hill to help understand this.
You could set an instance field bool (Init) to false in BeginProcessing. Then check to see if the parameter is set in BeginProcessing. If it is then call a method that does the one time init (InitMe). In ProcessRecord check the value of Init and if it is false, then call InitMe. InitMe should set Init to true before returning.
Regarding your second question, if you've marked the parameter as mandatory then it must be supplied either as a parameter or via the pipeline. Are you using multiple parameter sets? If so, then even if a parameter is marked as mandatory, it is only mandatory if the associated parameter set is the one that is determined by PowerShell to be in use for a particular invocation of the cmdlet.