Data Fusion - Argument defined in Argument Setter Plugin Supersedes Runtime Arguments intermittently - google-cloud-data-fusion

Using the Data Fusion Argument Setter, I've defined all the parameters for a reusable pipeline. When executing the pipeline, I provide runtime arguments for some parameters that differ from the default values served by the JSON URL embedded in the Argument Setter.
But a number of times, the pipeline ends up taking the default values from the Argument Setter URL instead of the runtime arguments, causing failures.
This behavior is not consistent across the pipelines I create, and the fact that the runtime arguments do win in some of them suggests they are supposed to supersede any prior value defined for an argument.
The workaround I use is to delete the plugin and re-add it for every new pipeline, but that defeats the purpose of creating a reusable pipeline.
Has anyone experienced this issue?
Current Runtime Options

This tutorial https://cloud.google.com/data-fusion/docs/tutorials/reusable-pipeline shows how to create a reusable pipeline using Argument Setter. From it, it appears the runtime arguments are only used to tell the Data Fusion pipeline which macros to resolve from the Argument Setter URL. Argument Setter is an Action plugin that lets you create reusable pipelines by dynamically substituting configuration served by an HTTP server. So no matter how you change the runtime arguments, as long as the same macro values can be read from the URL while the pipeline is running, those arguments will be overridden.
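For context, here is a minimal sketch (argument names and paths are placeholders, not from the question) of the kind of JSON document the Argument Setter URL is expected to serve, built as a Python dict. Every name/value pair the URL returns is written into the run's arguments when the plugin executes, which is why those values can shadow whatever was typed into the Runtime Arguments dialog.

```python
# Sketch of the payload the Argument Setter URL might serve (placeholder names/values).
# Each name/value pair is written into the pipeline's arguments when the plugin runs,
# which is why these values can shadow the runtime arguments supplied at start time.
import json

payload = {
    "arguments": [
        {"name": "source.path", "value": "gs://example-bucket/input/data.csv"},
        {"name": "sink.path", "value": "gs://example-bucket/output/"},
    ]
}

print(json.dumps(payload, indent=2))  # this JSON is what the HTTP endpoint should return
```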

Related

How do I make Cloudformation reprocess a template using a macro when parameters change?

I have a Cloudformation template that uses a custom Macro to generate part of the template. The lambda for the Macro uses template parameters (via the templateParameterValues field in the incoming event) to generate the template fragment.
When I change the Cloudformation Stack's parameters, I get an error:
The submitted information didn't contain changes. Submit different information to create a change set.
If I use the CLI I get a similar error:
An error occurred (ValidationError) when calling the UpdateStack operation: No updates are to be performed.
The parameters I am changing are only used by the Macro, not the rest of the template.
How can I make Cloudformation reprocess the template with the macro when I update these parameters?
After working with AWS Support I learned that you must supply the template again in order for the macro to be re-processed.
Even if it is the same exact template it will cause the macros to be reprocessed.
You can do this via the Console UI (by uploading the template file again) or the CLI (by passing the template / template URL again).
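As one way to script this, here is a hedged sketch using boto3 (the stack name, file name, and parameter names are assumptions, not from the question): re-reading the template file and passing it as TemplateBody, rather than relying on UsePreviousTemplate, is what triggers the macro to be reprocessed.

```python
# Sketch: resubmit the same template body so the macro runs again, even though
# the file itself has not changed. Stack/parameter names are placeholders.
import boto3

cfn = boto3.client("cloudformation")

with open("template.yaml") as f:
    template_body = f.read()

cfn.update_stack(
    StackName="my-macro-stack",
    TemplateBody=template_body,  # resend the template instead of UsePreviousTemplate=True
    Parameters=[
        # parameter consumed only by the macro's Lambda
        {"ParameterKey": "InstanceCount", "ParameterValue": "3"},
    ],
    Capabilities=["CAPABILITY_AUTO_EXPAND"],  # needed for stacks that use macros
)
```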
I recently came across this when using the count macro to create instances.
I found I was able to modify the parameters used only by the macro by moving that part of the template into a nested stack and passing the parameters through.
It does involve a bit more work to set up, since there are separate stacks, but it did allow me to modify just the parameters of the parent stack the way I wanted.

Why can't reference arguments be used inside fork-join_any/join_none in SystemVerilog?

Using reference arguments inside fork-join_any/join_none in SystemVerilog produces these errors:
** Error: ../tb/range_xform_driver.sv(28): (vlog-LRM-2295) Arguments passed by reference cannot be used within fork-join_any or fork_join_none blocks
** Error: ../tb/range_xform_driver.sv(29): (vlog-LRM-2295) Arguments passed by reference cannot be used within fork-join_any or fork_join_none blocks
This is an LRM restriction (see Section 9.3.2 "Parallel blocks" in the 1800-2017 LRM). The reason behind this restriction is that any variable referenced inside a fork/join_none/join_any block has to exist throughout the life of the fork block. Recall that there are these kinds of variable lifetimes:
Static - permanent and not an issue for this problem
Automatic - exists for the duration of a block activation
Dynamic - class object memory managed by active references.
Queues, dynamic arrays, and associative arrays add another dimension to the above for each element.
The problem is that when you pass a variable by reference, you have no information about what kind of storage class the variable belongs to, so there is no way to extend its lifetime. You only have a reference to a generic variable type that matches the type of the ref argument.
Suppose you have a task with a ref argument to an int, and you call that task passing it a class member that is an int. The code that calls the task only passes a reference to that int, not the handle to the class it belongs to. The same problem arises when passing an element of an array.
If the compiler in-lines the task (replacing the call to the task with the contents of the source code of the task), you can get around this restriction. But then you can't take advantage of separate compilation (compiling the task definition in a separate step from compiling the code that calls the task).
Update
Note that a newer revision of the IEEE 1800 SystemVerilog standard (1800-2023) adds a static ref argument form that gets around this restriction.

luigi: command-line parameters not becoming part of a task's signature?

In luigi, I know how to use its parameter mechanism to pass command-line parameters into a task. However, if I do so, the parameter becomes part of the task's signature.
But there are some cases -- for example, if I want to optionally pass a --debug or --verbose flag on the command line -- where I don't want the command-line parameter to become part of the task's signature.
I know I can do this outside of the luigi world, such as by running my tasks via a wrapper script which can optionally set environment variables to be read within my luigi code. However, is there a way I can accomplish this via luigi, directly?
Just declare them as insignificant parameters, i.e. instantiate the parameter class passing significant=False as a keyword argument.
Example:
import luigi

class MyTask(luigi.Task):  # or your own base class, e.g. DateTask
    other = luigi.Parameter(significant=False)
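To illustrate what "not part of the signature" means in practice, here is a hedged sketch (the argument values and assertion are hypothetical, not from the original answer): luigi computes a task's id from its significant parameters only, so two instances that differ only in an insignificant parameter resolve to the same task.

```python
# Hypothetical check: `other` is declared with significant=False, so it is
# excluded from the task id; these two instances are treated as the same task.
a = MyTask(other="debug")
b = MyTask(other="verbose")
assert a.task_id == b.task_id
```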

Long running workflow versioning: where and how to use OnActivityExecutionContextLoad?

We have a long-running workflow which uses the SQL tracking service (.NET WF 4.0). In the next update, we would like to introduce a public property in one of the arguments of the workflow. Since this is a breaking change, the persisted workflow instances throw the following error on re-loading:
System.Runtime.DurableInstancing.InstancePersistenceCommandException: The execution of the InstancePersistenceCommand named .. LoadWorkflow was interrupted by an error.
InnerException: System.Runtime.Serialization.SerializationException: 'Element' '_x003C_BookmarkName_x003E_k__BackingField' from namespace '...' is not expected. Expecting element '....'
I understand this is a typical versioning issue, and one of the recommendations I noticed on some sites is to override the OnActivityExecutionContextLoad method and fill in the missing values. But I am not sure where and how to do this! OnActivityExecutionContextLoad is declared in System.Workflow.ComponentModel.Activity (.NET 3.5?), whereas what we have is a code-based top-level custom activity derived from System.Activities.NativeActivity (which receives the argument in question). Can something be done in this class to initialize the missing property of the argument?
All suggestions are welcome :)

How do I get around PowerShell not binding pipeline parameters until after BeginProcessing is called?

I'm writing a Cmdlet that can be called in the middle of a pipeline. With this Cmdlet, there are parameters that have the ValueFromPipelineByPropertyName attribute defined so that the Cmdlet can use parameters with the same names that are defined earlier in the pipeline.
The paradox I've run into is that in the overridden BeginProcessing() method, I use one of the parameters that can get its value bound from the pipeline. According to the Cmdlet Processing Lifecycle, the binding of pipeline parameters does not occur until after BeginProcessing() is called. Therefore, it seems that I'm unable to rely on pipeline-bound parameters if they're used in BeginProcessing().
I've thought about moving things to the ProcessRecord() method. Unfortunately, there is a one time, relatively expensive operation that needs to occur. The best place for this to happen seems to be in the BeginProcessing() method to help ensure that it only happens once in the pipeline.
A few questions surrounding this:
Is there a good way around this?
These same parameters also have the Mandatory attribute set on them. How can I even get this far without PowerShell complaining about not having these required parameters?
Thanks in advance for your thoughts.
Update
I took out the second part of the question as I realized I just didn't understand pipeline-bound parameters well enough. I mistakenly thought that pipeline-bound parameters came from the previous Cmdlet that executed in the pipeline. They actually come from the object being passed through the pipeline! I referenced a post by Keith Hill to help understand this.
You could set an instance bool field (Init) to false in BeginProcessing. Then check whether the parameter is set in BeginProcessing; if it is, call a method that does the one-time init (InitMe). In ProcessRecord, check the value of Init and, if it is false, call InitMe. InitMe should set Init to true before returning.
Regarding your second question, if you've marked the parameter as mandatory then it must be supplied either as a parameter or via the pipeline. Are you using multiple parameter sets? If so, then even if a parameter is marked as mandatory, it is only mandatory if the associated parameter set is the one that is determined by PowerShell to be in use for a particular invocation of the cmdlet.