Get References to Powershell's Stream Objects? - powershell

I am interested in getting .NET object references for the different streams that come with a PowerShell host (stdin, plus the five output streams: debug, info, error, etc.), so that I can pass these to custom .NET types which will NOT be cmdlets... just .NET types that expect to use five output streams and one input stream.
I have spent lots of time googling and msdning and I just can't seem to find information about these streams beyond the cmdlets that read/write them.
If this is not possible, then a link to some related documentation would make for an answer.
Update
Thanks for the feedback so far, and sorry for the delay in making it back to this question.
@CharlieJoynt the idea here is that I will be using PowerShell as an entry point for a number of custom .NET types. These are types that will also be imported into other class libraries and EXEs, so they cannot be PowerShell-specific. Anything that does host the types will, however, provide streams for info/log/error/etc. output (instead of choosing a specific logging framework like log4net).
@PetSerAl I am not sure what an XY question is? If my update doesn't add the clarity you are looking for, can you clarify ( :P ) what the gap is?
Thanks again for the feedback so far, folks.

I have been able to intercept data written to certain streams by using the Register-ObjectEvent cmdlet.
Register-ObjectEvent
https://technet.microsoft.com/en-us/library/hh849929.aspx
The Register-ObjectEvent cmdlet subscribes to events that are
generated by .NET Framework objects on the local computer or on a
remote computer. When the subscribed event is raised, it is added to
the event queue in your session. To get events in the event queue, use
the Get-Event cmdlet.
You can use the parameters of Register-ObjectEvent to specify
property values of the events that can help you to identify the event
in the queue. You can also use the Action parameter to specify actions
to take when a subscribed event is raised and the Forward parameter to
send remote events to the event queue in the local session.
In my case I had created a new System.Diagnostics.Process object as $Process, but before starting that process I registered some event handlers, which exist as jobs, e.g.
$StdOutJob = Register-ObjectEvent -InputObject $Process `
-EventName OutputDataReceived -Action $ScriptBlock
...where $ScriptBlock is a pre-determined script block that handles the events coming from that stream. Within that script block, the events are accessible via some built-in variables:
The value of the Action parameter can include the $Event,
$EventSubscriber, $Sender, $EventArgs, and $Args
automatic variables, which provide information about the event to
the Action script block.
So your ScriptBlock could take $EventArgs.Data and do something with it.
Disclaimer: I have not used this method to try to intercept all the streams you mention, just OutputDataReceived and ErrorDataReceived.
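To make the pattern above concrete, here is a minimal end-to-end sketch. The process (ping.exe), the variable names, and the script block body are all illustrative assumptions, not part of the original answer:

```powershell
# Illustrative sketch: capture a child process's stdout via events
$Process = New-Object System.Diagnostics.Process
$Process.StartInfo.FileName = 'ping.exe'
$Process.StartInfo.Arguments = 'localhost'
$Process.StartInfo.UseShellExecute = $false
$Process.StartInfo.RedirectStandardOutput = $true

$ScriptBlock = {
    # $EventArgs.Data holds one line of the process's stdout
    if ($EventArgs.Data) { Write-Host "OUT: $($EventArgs.Data)" }
}

$StdOutJob = Register-ObjectEvent -InputObject $Process `
    -EventName OutputDataReceived -Action $ScriptBlock

$null = $Process.Start()
$Process.BeginOutputReadLine()   # required before OutputDataReceived will fire
$Process.WaitForExit()

# Clean up the event subscription when done
Unregister-Event -SourceIdentifier $StdOutJob.Name
Remove-Job $StdOutJob -Force
```

Note that BeginOutputReadLine() must be called after Start(), or no OutputDataReceived events are raised at all.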

Related

Convert object of type PsObject to HtmlWebResponseObject

I am trying to get around Invoke-WebRequest's propensity to hang in memory and kill my entire script. So far I have written a script block using Start-Job which calls this from a foreach:
start-job -scriptblock {invoke-webrequest $using:varSiteVariable} -name jobTitle | out-null
I wait 10 seconds and then use Receive-Job to capture the output from the most recent job into a variable, which I then want to parse as a PowerShell HtmlWebResponseObject in the same manner as if I were using Invoke-WebRequest directly. The logic behind this is that I will then abort and return to square one if there is nothing to parse, as Invoke-WebRequest has clearly crashed again.
However, when I pull the data from jobTitle into a variable, it is always of type PSObject, meaning it lacks the crucial ParsedHtml property which I'm using to perform all of the further parsing of the HTML code; that property appears to belong specifically to objects of type HtmlWebResponseObject. There does not appear to be any way that I have found to force-cast the object to this type, nor any way to convert one into the other after the fact.
I cannot simply define the variable from within the job and then refer to it outside of the job, as the two commands happen in different contexts and share no working space. I cannot write the data to a file as I am unable to import it back as the right data-type for the processing I need to perform.
Does anyone know how I can convert my PsObject data into HtmlWebResponseObject data?
I ended up fixing this with the help of this article:
https://gallery.technet.microsoft.com/Powershell-Tip-Parsing-49eb8810
I couldn't re-cast the data as HtmlWebResponseObject, but I was able to make a new COM object of type HTMLFile and write the data from the variable grabbed from my job into that. The script needed to be slightly re-written, but the all-important methods I was using to parse the data work as before.
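A sketch of what that workaround can look like. The job name, the Content property, and the example DOM query are assumptions for illustration; the IHTMLDocument2_write fallback is a commonly needed variation on newer Windows PowerShell versions:

```powershell
# Assumption: the HTML text came back from the job as a string in .Content
$rawHtml = Receive-Job -Name jobTitle | Select-Object -ExpandProperty Content

$html = New-Object -ComObject 'HTMLFile'
try {
    # Works on older PowerShell/IE combinations
    $html.write([System.Text.Encoding]::Unicode.GetBytes($rawHtml))
} catch {
    # Newer combinations expose the write method under this name instead
    $html.IHTMLDocument2_write($rawHtml)
}

# DOM methods familiar from ParsedHtml work much the same way, e.g.:
$html.getElementsByTagName('a') | ForEach-Object { $_.href }
```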

Cannot remove variable because it has been optimized and is not removable - releasing a COM object

At the end of my script I use 'ie' | ForEach-Object {Remove-Variable $_ -Force}. It works fine in PS 2 (Windows 7) but PS 5 (Windows 10) throws an error:
Cannot remove variable ie because the variable has been optimized and
is not removable. Try using the Remove-Variable cmdlet (without any
aliases), or dot-sourcing the command that you are using to remove the
variable.
How can I make it play nice with PS 5; or should I just use Remove-Variable 'ie' -Force?
The recommended way to remove COM objects is to call the ReleaseComObject method, passing the object reference ($ie) to the instance of your COM object.
Here is more detailed explanation and sample code from a Windows PowerShell Tip of the Week that shows how to get rid of COM objects:
Whenever you call a COM object from the common language runtime (which
happens to be the very thing you do when you call a COM object from
Windows PowerShell), that COM object is wrapped in a “runtime callable
wrapper,” and a reference count is incremented; that reference count
helps the CLR (common language runtime) keep track of which COM
objects are running, as well as how many COM objects are running. When
you start Excel from within Windows PowerShell, Excel gets packaged up
in a runtime callable wrapper, and the reference count is incremented
to 1.
That’s fine, except for one thing: when you call the Quit method and
terminate Excel, the CLR’s reference count does not get decremented
(that is, it doesn’t get reset back to 0). And because the reference
count is not 0, the CLR maintains its hold on the COM object: among
other things, that means that our object reference ($x) is still valid
and that the Excel.exe process continues to run. And that’s definitely
not a good thing; after all, if we wanted Excel to keep running we
probably wouldn’t have called the Quit method in the first place. ...
... calling the ReleaseComObject method [with] our
instance of Excel ... decrements the reference count for the object in
question. In this case, that means it’s going to change the reference
count for our instance of Excel from 1 to 0. And that is a good thing:
once the reference count reaches 0 the CLR releases its hold on the
object and the process terminates. (And this time it really does
terminate.)
$x = New-Object -com Excel.Application
$x.Visible = $True
Start-Sleep 5
$x.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($x)
Remove-Variable x
The message "Cannot remove variable ie because the variable has been optimized and is not removable." most likely means you have tried to access (inspect, watch, or otherwise read) a variable that has already been removed by the optimizer.
wp78de's helpful answer explains what you need to do to effectively release a COM object instantiated in PowerShell code with New-Object -ComObject.
Releasing the underlying COM object (which means terminating the process of a COM automation server such as Internet Explorer) is what matters most, but it's worth pointing out that:
Even without calling [System.Runtime.Interopservices.Marshal]::ReleaseComObject($ie) first, there's NO reason why your Remove-Variable call should fail (even though, if successful, it wouldn't by itself release the COM object).
I have no explanation for the error you're seeing (I cannot recreate it, but it may be related to this bug).
There's usually no good reason to use ForEach-Object with Remove-Variable, because you can not only pass one variable name directly, but even an array of names to the (implied) -Name parameter - see Remove-Variable -?;
Remove-Variable ie -Force should work.
Generally, note that -Force is only needed to remove read-only variables; if you also want to guard against the case where no variable by the specified name(s) exists, additionally use -ErrorAction Ignore.
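Putting those two points together (the variable names here are hypothetical, for illustration only):

```powershell
# Hypothetical variables standing in for $ie etc.
$ie = 'some value'
$foo = 1

# One Remove-Variable call takes an array of names directly; -Force
# overrides read-only variables, and -ErrorAction Ignore suppresses the
# error for names that don't exist (like doesNotExist below)
Remove-Variable ie, foo, doesNotExist -Force -ErrorAction Ignore
```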

Sharing variables\data between Powershell processes

I would like to come up with a mechanism by which I can share 'data' between different PowerShell processes. This would be in order to implement a kind of job system, whereby a function can be run in one PowerShell process, complete, and then somehow communicate its status to a function run from another (distinct) PowerShell process...
I guess what I'd ideally like is for PSJob results to be shareable between sessions, but this does not seem to be possible.
I can think of a few dirty ways of achieving this (like O/S environment variables), but am I missing a semi-elegant way?
For example:
Function giveMeNumber
{
    $return_value = Get-Random -Minimum -100 -Maximum 100
    Return $return_value
}
What are some ways I could get this function to store its return value somewhere and then grab it from another PowerShell session (without using a database)?
Cheers.
The Q&A mentioned by Keith refers to using MSMQ, a message-queueing feature optionally available on desktop, mobile and server OSes from Microsoft.
It doesn't run by default on desktop OSes, so you would have to ensure that the appropriate service was started. Seems like serious overkill to me unless you wanted something pretty beefy.
Of course, the most common choice for this type of task would be a simple shared file.
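One minimal way to do the shared-file approach is Export-Clixml/Import-Clixml, which preserves .NET types across processes (the file path and function are illustrative assumptions):

```powershell
# Session A: serialize the function's return value to a file
Function giveMeNumber
{
    Get-Random -Minimum -100 -Maximum 100
}
giveMeNumber | Export-Clixml -Path "$env:TEMP\giveMeNumber.xml"

# Session B (a separate powershell.exe process): deserialize it, with
# the original type information largely intact
$value = Import-Clixml -Path "$env:TEMP\giveMeNumber.xml"
```

For anything richer than plain strings, Clixml beats writing raw text because the receiving session gets typed objects back rather than having to re-parse.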
Alternatively, you could create a TCP listener in each of the jobs that you want to have accept external info. I've not done this myself in PowerShell, though I know it is possible. Node.js or Python would be a more familiar environment for it. Seems like overkill if a shared file would do the job!
Another way would be to use the registry. Though you might consider that cheating since it is actually a database (of a very broken and simplistic sort).
I'm actually not sure that environment variables would work, since they can be picky about the parent environment scope (for example, setting an env variable in a cmd session doesn't make it available outside of that cmd scope by default).
UPDATE: Doh, missed a few! Some of them very obvious. Microsoft has a list:
Clipboard
COM
Data Copy
DDE
File Mapping
Mailslots
Pipes
RPC
Windows Sockets
Pipes was the one I was trying to remember. Windows sockets would be similar to a TCP listener.
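Named pipes are directly usable from PowerShell via the .NET System.IO.Pipes classes. A minimal sketch, assuming the two snippets run in two separate PowerShell processes (the pipe name and payload are made up for illustration):

```powershell
# Process 1 (server): create the pipe and block until a client connects
$server = New-Object System.IO.Pipes.NamedPipeServerStream('MyPipe')
$server.WaitForConnection()
$writer = New-Object System.IO.StreamWriter($server)
$writer.AutoFlush = $true
$writer.WriteLine('42')          # send the "result" to the other process
$server.Dispose()

# Process 2 (client): connect and read the result
$client = New-Object System.IO.Pipes.NamedPipeClientStream('.', 'MyPipe', 'In')
$client.Connect(5000)            # wait up to 5 seconds for the server
$reader = New-Object System.IO.StreamReader($client)
$result = $reader.ReadLine()
$client.Dispose()
```

WaitForConnection() blocks, so the server snippet must already be running before the client calls Connect(); running both in one session would deadlock.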

Hosting PowerShell: PowerShell vs. Runspace vs. RunspacePool vs. Pipeline

I am attempting to add some fairly limited PowerShell support in my application: I want the ability to periodically run a user-defined PowerShell script and show any output, and (eventually) be able to handle progress notification and user-prompt requests. I don't need command-line-style interactive support, or (I think) remote access or the ability to run multiple simultaneous scripts, unless the user script does that itself from within the shell I host. I'll eventually want to run the script asynchronously or on a background thread, and probably seed the shell with some initial variables and maybe a cmdlet, but that's as "fancy" as this feature is likely to get.
I've been reading the MSDN documentation about writing host application code, but while it happily explains how to create a PowerShell object, or Runspace, or RunspacePool, or Pipeline, there's no indication about why one would choose any of these approaches over another.
I think I'm down to one of these two, but I'd like some feedback about which approach is the better one to take:
PowerShell shell = PowerShell.Create();
shell.AddCommand(/* set initial state here? */);
shell.AddStatement();
shell.AddScript(myScript);
shell.Invoke(/* can set host! */);
or:
Runspace runspace = RunspaceFactory.CreateRunspace(/* can set host and initial state! */);
PowerShell shell = PowerShell.Create();
shell.Runspace = runspace;
shell.AddScript(myScript);
shell.Invoke(/* can set host here, too! */);
(One of the required PSHost-derived class methods is EnterNestedPrompt(), and I don't know whether the user-defined script I run could cause that to get called or not. If it can, then I'll be responsible for "starting a new nested input loop" (as per here)... if that impacts which path to take above, that would also be good to know.)
Thanks!
What are they?
Pipeline
A Pipeline is a way to concatenate commands inside a PowerShell script. Example: you "pipe" the output from Get-ChildItem to Where-Object with | to filter it:
Get-ChildItem | Where-Object { $_.Length -gt 0 }
PowerShell Object
The PowerShell object refers to a PowerShell session, like the one you would get when you start powershell.exe.
Runspace
Every PowerShell session has its own runspace (you'll always get output from Get-Runspace). It defines the state of the PowerShell session, hence the InitialSessionState object/property of a runspace. You may decide to create a new PowerShell session with its own runspace from within PowerShell, to enable a kind of multithreading.
RunspacePool
Last but not least, the RunspacePool. As the name says, it's a pool of runspaces (or PowerShell sessions) that can be used to process a lot of complicated tasks. As soon as one of the runspaces in the pool has finished its task, it may take the next task until everything is done. (100 things to do with 10 runspaces: on average they process 10 each, but one may process 8 while two others process 11...)
When to use what?
Pipeline
The pipeline is used inside of scripts. It makes it easier to build complex scripts and should be used as often as possible.
PowerShell Object
The PowerShell object is used whenever you need a new PowerShell session. You can create one inside of an existing script, be it C# or PowerShell. It's useful for easy multithreading. On its own it will create a default session.
Runspace
If you want to create a non-standard PowerShell session, you can manipulate the runspace object before you create a PowerShell session with it. It's useful when you want to share synchronized variables, functions or classes in the extra runspaces. Slightly more complex multithreading.
RunspacePool
As mentioned before, it's a heavy tool for heavy work: when one execution of a script takes hours and you need to do it very often. E.g. in combination with remoting you could simultaneously install something on every node of a big cluster, and the like.
You are overthinking it. The code you show in samples is a good start. Now you just need to read the result of Invoke() and check the error and warning streams.
A PowerShell host provides hooks that the runspace can use to communicate with the user: stream and format outputs, progress reporting, error reporting, etc. For what you want to do, you do not need a PowerShell host. You can read the results of script execution back using the PowerShell class, check for errors and warnings, read the output streams, and show notifications to the user using the facilities of your application. This is much more straightforward and effective than writing an entire PowerShell host just to show a message box when errors are detected.
Also, a PowerShell object HAS a runspace when it is created; you do not need to give it one. If you need to retain the runspace to preserve the environment, just keep the entire PowerShell object and clear Commands and all Streams each time after you call Invoke.
The next question you should ask is how to process the result of PowerShell::Invoke() and read PowerShell::Streams.
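Since the same System.Management.Automation.PowerShell class is available from PowerShell itself, the Invoke-then-check-streams pattern the answer describes can be sketched without a C# project (the script text and stream usage here are illustrative):

```powershell
$ps = [PowerShell]::Create()
$null = $ps.AddScript('Get-Date; Write-Warning "careful"; Write-Error "boom"')

$output = $ps.Invoke()     # success-stream objects come back from Invoke()

$ps.HadErrors              # $true if anything hit the error stream
$ps.Streams.Error          # ErrorRecord objects
$ps.Streams.Warning        # WarningRecord objects

# To reuse the runspace and preserve its environment between runs,
# clear state instead of disposing the object:
$ps.Commands.Clear()
$ps.Streams.ClearStreams()
```

The same member names (Invoke, Streams, Commands, HadErrors) are what a C# host would use against its PowerShell instance.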

How to instruct PowerShell to garbage collect .NET objects like XmlSchemaSet?

I created a PowerShell script which loops over a large number of XML Schema (.xsd) files, and for each creates a .NET XmlSchemaSet object, calls Add() and Compile() to add a schema to it, and prints out all validation errors.
This script works correctly, but there is a memory leak somewhere, causing it to consume gigabytes of memory if run on 100s of files.
What I essentially do in a loop is the following:
$schemaSet = new-object -typename System.Xml.Schema.XmlSchemaSet
register-objectevent $schemaSet ValidationEventHandler -Action {
...write-host the event details...
}
$reader = [System.Xml.XmlReader]::Create($schemaFileName)
[void] $schemaSet.Add($null_for_dotnet_string, $reader)
$reader.Close()
$schemaSet.Compile()
(A full script to reproduce this problem can be found in this gist: https://gist.github.com/3002649. Just run it, and watch the memory usage increase in Task Manager or Process Explorer.)
Inspired by some blog posts, I tried adding
remove-variable reader, schemaSet
I also tried picking up the $schema from Add() and doing
[void] $schemaSet.RemoveRecursive($schema)
These seem to have some effect, but still there is a leak. I'm presuming that older instances of XmlSchemaSet are still using memory without being garbage collected.
The question: How do I properly teach the garbage collector that it can reclaim all memory used in the code above? Or more generally: how can I achieve my goal with a bounded amount of memory?
Microsoft has confirmed that this is a bug in PowerShell 2.0, and they state that this has been resolved in PowerShell 3.0.
The problem is that an event handler registered using Register-ObjectEvent is not garbage collected. In response to a support call, Microsoft said that
"we’re dealing with a bug in PowerShell v.2. The issue is caused
actually by the fact that the .NET object instances are no longer
released due to the event handlers not being released themselves. The
issue is no longer reproducible with PowerShell v.3".
The best solution, as far as I can see, is to interface between PowerShell and .NET at a different level: do the validation completely in C# code (embedded in the PowerShell script), and just pass back a list of ValidationEventArgs objects. See the fixed reproduction script at https://gist.github.com/3697081: that script is functionally correct and leaks no memory.
(Thanks to Microsoft Support for helping me find this solution.)
Initially Microsoft offered another workaround, which is to use $xyzzy = Register-ObjectEvent -SourceIdentifier XYZZY, and then at the end do the following:
Unregister-Event XYZZY
Remove-Job $xyzzy -Force
However, this workaround is functionally incorrect. Any events that are still 'in flight' are lost at the time these two additional statements are executed. In my case, that means that I miss validation errors, so the output of my script is incomplete.
After the Remove-Variable you can try to force a GC collection:
[GC]::Collect()
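A slightly fuller variant of this idea, dropping the references first and waiting for finalizers (note that, per the accepted answer, this does not help against the PowerShell 2.0 event-handler bug itself; the variable names match the question's snippet):

```powershell
# Release the script's references, then collect
Remove-Variable reader, schemaSet -ErrorAction Ignore
[GC]::Collect()
[GC]::WaitForPendingFinalizers()   # let finalizers release native resources
[GC]::Collect()                    # second pass reclaims finalized objects
```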