Powershell cmdlets development best practices - powershell

I'm currently putting together some Powershell cmdlets. Building them is easy enough but I don't know if I'm building them in an acceptable manner (so to speak).
Are there any guidelines/best practices that one should follow for passing data into the Powershell pipeline? At the moment I'm actually output a single object of type DataSet - if any cmdlet wanted to use it downstream then they would have to loop over the DataTables in that DataSet, then loop over the DataRows in each DataTable.
I guess the question is....am I going to p!ss anyone off by doing this? Or should I be outputting data that is inherently a bunch of rows?
Thanks all in advance
-JT

It's acceptable to output whatever type of object is best used to represent what you're writing out - a DataSet is absolutely fine. The only potential caution is that v2 of PowerShell may find itself running on a reduced version of the .NET Framework (such as on Server Core), so if that's a potential scenario for your cmdlets, you need to use some caution to make sure the object you're outputting exists on every system where your cmdlet might be used.
All that said, the pipeline works best when it contains collections of objects; a DataSet isn't a collection per se. In other words, you want downstream cmdlets to be able to receive one object at a time via the pipeline, so that those cmdlets don't have to manually enumerate through an object. I don't know a lot about exactly what you're doing - it could well be that a DataSet is entirely appropriate - but I'd generally prefer to see a cmdlet loop through the DataSet internally, create its own custom objects (so that each column in the table becomes a property), and output those objects to the pipeline. That simply increases the number of downstream cmdlets that can consume what you're putting out.
A simple test is to pipe your cmdlet's output to Export-CSV. If it works (and it probably wouldn't with a DataSet), then you're doing the right thing generally. Now, you may well need to create a cmdlet which outputs a DataSet and you only intend for certain other cmdlets you've written (which consume DataSets) to operate against that output. Nothing wrong with that. Max flexibility is single objects, though, since it enables all of PowerShell's core cmdlets to work on your output.
Hope that helps.

MSDN has an amazing set of Cmdlet Development Guidelines which I found extremely useful when developing my own. They are broken up into three different sections:
Required Development Guidelines
Strongly Encouraged Development Guidelines
Advisory Development Guidelines

Related

WebGPU best practices with setPipeline and optimization

I'm learning WebGPU for the first time and, in the tutorials I'm following, I see that setPipeline is called on each rendering pass. I'm wondering if there's a performance hit if the pipeline is changed between passes? Most of the tutorials I'm reading use the same pipeline for every pass and just change the data going to it via a writeBuffer, but I don't know if that's intentional. The only thing I've read about pipeline optimization is from this tutorial
The configuration of the components of this pipeline (e.g., the shaders, vertex state, render output state, etc.) are fixed, allowing the GPU to better optimize rendering for the pipeline.
That'd lead me to believe that the pipeline shouldn't be changed between passes, but I haven't seen anything stating that explicitly. Thanks in advance for any help!
It's fairly common for applications to use different shaders for different objects in a single render (eg: see this question). From an optimisation perspective, you'd want to set a pipeline and render all objects that use that pipeline, then set another pipeline and render all objects that use that pipeline, and so on. You'd probably want to look into instances to minimize the number of draw calls too.

How to find $plusargs with same string in different locations

Very general issue in large integration of verification environment.
Our verification development involves large group across different time zone.
Group has preference to use $plusargs instead factory mechanism.
Probably main reason it is hard to set factory from command line processor,
we have more layers of scripts to start simulation.
Recently i found that same string been used in different environment to control behavior of environment. In this case two different score board used same string to disable some checking and test pass. Both those environment some time created at run time. Also some time it is OK to re-use same string, and it will require owner to be involved.
Is there any way to find duplication like this from final elaborated model, and provide locations in code as a warning?
I thought create our own wrapper, but problem that we are integrating some code that we are not owners as in this case was.
Thanks,
This is a perfect example of how people think they can get things done quicker by not following the recommended UVM methodology and instead create time consuming complexity later on.
I see at least two possible options.
Write a script that searches the source code for $plusargs and hopefully they have used string literals for you to trace for duplicates.
You can override $plusargs with PLI code and have it trace duplicates.
The choice depends on wether you are better at writing Perl/Python or C code.

Getting Recursive Tasks in Asana with reasonable performance

I'm using the Asana REST API to iterate over workspaces, projects, and tasks. After I achieved the initial crawl over the data, I was surprised to see that I only retrieved the top-level tasks. Since I am required to provide the workspace and project information, I was hoping not to have to recurse any deeper. It appears that I can recurse on a single task with the \subtasks endpoint and re-query... wash/rinse/repeat... but that amounts to a potentially massive number of REST calls (one for each subtask to see if they, in turn, have subtasks to query - and so on).
I can partially mitigate this by adding to the opt_fields query parameter something like:
&opt_fields=subtasks,subtasks.subtasks
However, this doesn't scale well. It means I have to elongate the query for each layer of depth. I suppose I could say "don't put tasks deeper than x layers deep" - but that seems to fly in the face of Asana's functionality and design. Also, since I need lots of other properties, it requires me to make a secondary query for each node in the hierarchy to gather those. Ugh.
I can use the path method to try to mitigate this a bit:
&opt_fields=(this|subtasks).(id|name|etc...)
but again, I have to do this for every layer of depth. That's impractical.
There's documentation about this great REPEATER + operator. Supposedly it would work like this:
&opt_fields=this.subtasks+.name
That is supposed to apply to ALL subtasks anywhere in the hierarchy. In practice, this is completely broken, and the REST API chokes and returns only the ids of the top-level tasks. :( Apparently their documentation is just wrong here.
The only method that seems remotely functional (if not practical) is to iterate first on the top-level tasks, being sure to include opt_fields=subtasks. Whenever this is a non-empty array, I would need to recurse on that task, query for its subtasks, and continue in that manner, until I reach a null subtasks array. This could be of arbitrary depth. In practice, the first REST call yields me (hopefully) the largest number of tasks, so the individual recursion may be mitigated by real data... but it's a heck of an assumption.
I also noticed that the limit parameter applied ONLY to the top-level tasks. If I choose to expand the subtasks, say. I could get a thousand tasks back instead of 100. The call could timeout if the data is too large. The safest thing to do would be to only request the ids of subtasks until recursion, and as always, ask for all the desired top-level properties at that time.
All of this seems incredibly wasteful - what I really want is a flat list of tasks which include the parent.id and possibly a list of subtasks.id - but I don't want to query for them hierarchically. I also want to page my queries with rational data sizes in mind. I'd like to get 100 tasks at a time until Asana runs out - but that doesn't seem possible, since the limit only applies to top-level items.
Unfortunately the repeater didn't solve my problem, since it just doesn't work. What are other people doing to solve this problem? And, secondarily, can anyone with intimate Asana insight provide any hope of getting a better way to query?
While I'm at it, a suggested way to design this: the task endpoint should not require workspace or project predicate. I should be able to filter by them, but not be required to. I am limited to 100 objects already, why force me to filter unnecessarily? In the same vein - navigating the hierarchy of Asana seems an unnecessary tax for clients who are not Asana (and possibly even the Asana UI itself).
Any ideas or insights out there?
Have you ensured that the + you send is URL-encoded? Whatever library you are using should usually handle this (which language are you using, btw? We have some first-party client libraries available)
Try &opt_fields=this.subtasks%2B.name if you're creating the URL manually, or (better yet) use a library that correctly encodes URL query parameters.

Make A step in a SAS macro timeout after a set interval

I'm on SAS 9.1.3 (on a server) and have a macro looping over an array to feed a computationally intensive set of modelling steps which are appended out to a table. I'm wondering if it is possible to set a maximum time to run for each element of the array. This is so that any element which takes longer than 3 minutes to run is skipped and the next item fed in.
Say for example I'm using a proc nlin with a by statement to build separate models per class on a large data set, and one class is failing to converge; how do I skip over that class?
Bit of a niche requirement, hope someone can assist!
The only approach I can think of here would be to rewrite your code so that it runs each by group separately from the rest, in one or more SAS/CONNECT sessions, have the parent session kill each one after a set timeout, and then recombine the surviving output.
As Dom and Joe have pointed out, this is not a trivial task, but it's possible if you're sufficiently keen on learning about that aspect of SAS. A good place to get started for this sort of thing would be this page:
http://support.sas.com/rnd/scalability/tricks/connect.html
I was able to use the examples there and elsewhere as the basis of a simple parallel processing framework (in SAS 9.1.3, coincidentally!), but there are many details you will need to consider. To give you an idea of the sorts of adventures in store if you go down this route:
Learning how to sign on to your server via SAS/CONNECT within whatever infrastructure you're using (will the usual autoexec file work? What invocation options do you need to use?)
Explaining to your sysadmin/colleagues why you need to run multiple processes in parallel
Managing asynchronous sessions
Syncing macro variables, macro definitions, libraries and formats between sessions
Obscure bugs (I wasn't able to use the usual option for syncing libraries and had to roll my own via call execute...)
One could write a (lengthy) SUGI paper on this topic, and I'm sure there are plenty of them out there if you look around.
In general, SAS is running in a linear manner. So you cannot write a step to monitor another step in the same program. What you could do is run your code in a SAS/CONNECT session and monitor it with the process that started the session. That's not trivial and the how to is beyond the scope of Stack Overflow.
For a data step, use the datetime() function to get the current system date and time. This is measured in seconds. You can check the time inside your data step. Stop a data step with the stop; statement.
Now you specifically asked about breaking a specific step inside a PROC. That must be implemented in the PROC by the SAS developer. If it is possible, it will be documented in the procedure's documentation. View SAS documentation at http://support.sas.com/documentation/.
For PROC NLIN, I do not think there is a "break after X" parameter. You can use the trace parameters to track model execution to see what it hanging up. You can then work on changing the convergence parameters to attempt to speed up slow, badly converging, models.

In salesforce.com can you have multivalued attributes?

I am developing a Novell Identity Manager driver for Salesforce.com, and am trying to understand the Salesforce.com platform better.
I have had really good success to date. I can read pretty much arbitrary object classes out of SFDC, and create eDirectory objects for them, and what not. This is all done and working nicely. (Publisher Channel). Once I got Query events mapped out, most everything started working in the Publisher Channel.
I am now working on sending events back to SFDC (Subscriber channel) when changes occur in eDirectory.
I am using the upsert() function in the SOAP API, and with Novell Identity Manager, you basically build the SOAP doc, and can see the results as you build it. (You can do it in XSLT or you can use the various allowed tokens to build the document in DirXML Script. I am using DirXML Script which has been working well so far.).
The upshot of that comment is that I can build the SOAP document, see it, to be sure I get it right. Which is usually different than the Java/C++ approach that the sample code usually provides. Much more visual this way.
There are several things about upsert() that I do not entirely understand. I know how to blank a value, should I get that sort of event. Inside the <urn:sObjects> node, add a node like (assuming you get your namespaces declared already):
<urn1:fieldsToNull>FieldName</urn1:fieldsToNull>
I know how to add a value (AttrValue) to the attribute (FieldName), add a node like:
<FieldName>AttrValue</FieldName>
All this works and is pretty straight forward.
The question I have is, can a value in SFDC be multi-valued? In eDirectory, a multi valued attribute being changed, can happen two ways:
All values can be removed, and the new set re-added.
The single value removed can be sent as that sort of event (remove-value) or many values can be removed in one operation.
Looking at SFDC, I only ever see Multi-picklist attributes that seem to be stored in a single entry : or ; delimited. Is there another kind of multi valued attribute managed differently in SFDC? And if so, how would one manipulate it via the SOAP API?
I still have to decide if I want to map those multi-picklists to a single string, or a multi valued attribute of strings. First way is easier, second way is more useful... Hmmm... Choices...
Some references:
I have been using the page Sample SOAP messages to understand what the docs should look like.
Apex Explorer is a kicking tool for browsing the database and testing queries. Much like DBVisualizer does for JDBC connected databases. This would have been so much harder without it!
SoapUi is also required, and a lovely tool!
As far as I know there's no multi-value field other than multi-select picklists (and they map to semicolon-separated string). Generally platform encourages you to create a proper relationship with another (possibly new, custom) table if you're in need of having multiple values associated to your data.
Only other "unusual" thing I can think of is how the OwnerId field on certain objects (Case, Lead, maybe something else) can be used to point to User or Queue record. Looks weird when you are used to foreign key relationships from traditional databases. But this is not identical with what you're asking as there will be only one value at a time.
Of course you might be surpised sometimes with values you'll see in the database depending on the viewing user's locale (stuff like System Administrator profile becoming Systeembeheerder in Dutch). But this will be still a single value, translated on the fly just before the query results are sent back to you.
When I had to perform SOAP integration with SFDC, I've always used WSDL files and most of the time was fine with Java code generated out of them with Apache Axis. Hand-crafting the SOAP message yourself seems... wow, hardcore a bit. Are you sure you prefer visualisation of XML over the creation of classes, exceptions and all this stuff ready for use with one of several out-of-the-box integration methods? If they'll ever change the WSDL I need just to regenerate the classes from it; whereas changes to your SOAP message creation library might be painful...