How to implement a parallel jobs and queues system in PowerShell [duplicate]

I spent days trying to implement a parallel jobs and queues system, but I couldn't make it work. Here is the code without any parallelism implemented, plus an example of the CSV it reads from.
I'm sure this post can help other users in their projects.
Each user has their own PC, so the CSV file looks like:
pc1,user1
pc2,user2
pc800,user800
CODE:
#Source File:
$inputCSV = '~\desktop\report.csv'
$csv = import-csv $inputCSV -Header PCName, User
echo $csv #debug
#Output File:
$report = "~\desktop\output.csv"
#---------------------------------------------------------------
#Define search:
$findSize = 40GB
Write-Host "Lonking for $findSize GB sized Outlook files"
#count issues:
$issues = 0
#---------------------------------------------------------------
foreach($item in $csv){
    if (Test-Connection -Quiet -Count 1 -ComputerName $item.PCname){
        $w7Path = "\\$($item.PCname)\c$\users\$($item.User)\appdata\Local\microsoft\outlook"
        $xpPath = "\\$($item.PCname)\c$\Documents and Settings\$($item.User)\Local Settings\Application Data\Microsoft\Outlook"
        if(Test-Path $w7Path){
            if(Get-ChildItem $w7Path -Recurse -Force -Include *.ost -ErrorAction SilentlyContinue | Where-Object {$_.Length -gt $findSize}){
                $newLine = "{0},{1},{2}" -f $item.PCname, $item.User, $w7Path
                $newLine | Add-Content $report
                $issues++
                Write-Host "Issue detected" #debug
            }
        }
        elseif(Test-Path $xpPath){
            # search the XP path here (the original searched $w7Path again by mistake)
            if(Get-ChildItem $xpPath -Recurse -Force -Include *.ost -ErrorAction SilentlyContinue | Where-Object {$_.Length -gt $findSize}){
                $newLine = "{0},{1},{2}" -f $item.PCname, $item.User, $xpPath
                $newLine | Add-Content $report
                $issues++
                Write-Host "Issue detected" #debug
            }
        }
        else{
            Write-Host "Error! - bad path"
        }
    }
    else{
        Write-Host "Error! - no ping"
    }
}
Write-Host "All done! Detected $issues issues"

Parallel data processing in PowerShell is not quite simple, especially with
queueing. Try to use existing tools which have this already done.
You may take a look at the module
SplitPipeline. The cmdlet
Split-Pipeline is designed for parallel input data processing and supports
queueing of input (see the parameter Load). For example, for 4 parallel
pipelines with 10 input items each at a time, the code will look like this:
$csv | Split-Pipeline -Count 4 -Load 10, 10 {process{
<operate on input item $_>
}} | Out-File $outputReport
All you have to do is to implement the code <operate on input item $_>.
Parallel processing and queueing is done by this command.
UPDATE for the updated question code. Here is the prototype code with some
remarks. They are important: doing work in parallel is not the same as doing
it directly; there are some rules to follow.
$csv | Split-Pipeline -Count 4 -Load 10, 10 -Variable findSize {process{
    # Tips
    # - Operate on the input object $_, i.e. $_.PCname and $_.User
    # - Use the imported variable $findSize
    # - Do not use Write-Host; use (for now) Write-Warning
    # - Do not count issues (for now). This is possible, but make it work
    #   without this at first.
    # - Do not write data to a file; from several parallel pipelines this
    #   is not so trivial. Just output data, they will be piped further to
    #   the log file
    ...
}} | Set-Content $report
# output from all jobs is joined and written to the report file
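For illustration, here is a minimal sketch of what the worker body might look like, transplanting the W7-path scan from the question (the XP branch is omitted for brevity; treat this as a starting point, not a tested implementation):
$csv | Split-Pipeline -Count 4 -Load 10, 10 -Variable findSize {process{
    $item = $_  # keep a name for the input object; $_ is rebound in nested pipelines
    if (Test-Connection -Quiet -Count 1 -ComputerName $item.PCname) {
        $w7Path = "\\$($item.PCname)\c$\users\$($item.User)\appdata\Local\microsoft\outlook"
        if (Test-Path $w7Path) {
            $big = Get-ChildItem $w7Path -Recurse -Force -Include *.ost -ErrorAction SilentlyContinue |
                Where-Object { $_.Length -gt $findSize }
            if ($big) {
                # emit a CSV line; output from all jobs is joined and piped to the report
                '{0},{1},{2}' -f $item.PCname, $item.User, $w7Path
            }
        }
    }
    else {
        Write-Warning "no ping: $($item.PCname)"
    }
}} | Set-Content $report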
UPDATE: How to write progress information
SplitPipeline handled an 800-target CSV pretty well, amazing. Is there any way
to let the user know the script is alive...? Scanning a big CSV can take about
20 minutes. Something like "in progress 25%", "50%", "75%"...
There are several options. The simplest is just to invoke Split-Pipeline with
the switch -Verbose. You will then get verbose messages about the progress and
see that the script is alive.
Another simple option is to write and watch verbose messages from the jobs,
e.g. Write-Verbose ... -Verbose, which writes messages even if
Split-Pipeline is invoked without -Verbose.
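For example, a minimal sketch (the message text is just illustrative):
$csv | Split-Pipeline -Count 4 -Load 10, 10 -Variable findSize {process{
    # -Verbose on the cmdlet itself forces the message out
    # even when Split-Pipeline runs without -Verbose
    Write-Verbose "Processing $($_.PCname)" -Verbose
    ...
}} | Set-Content $report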
And another option is to use proper progress messages with Write-Progress.
See the scripts:
Test-ProgressJobs.ps1
Test-ProgressTotal.ps1
Test-ProgressTotal.ps1 also shows how to use a collector updated from jobs
concurrently. You can use a similar technique for counting issues (the
original question code does this). When all is done, show the total number of
issues to the user.
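For counting issues, here is a hedged sketch of that collector technique, assuming -Variable imports the collector into each pipeline (see Test-ProgressTotal.ps1 for the canonical version):
$stat = [hashtable]::Synchronized(@{ Issues = 0 })
$csv | Split-Pipeline -Count 4 -Load 10, 10 -Variable stat {process{
    # ... when an issue is detected for the current item ...
    [System.Threading.Monitor]::Enter($stat.SyncRoot)
    try { $stat.Issues++ }  # increment under the lock; ++ is not atomic
    finally { [System.Threading.Monitor]::Exit($stat.SyncRoot) }
}} | Set-Content $report
Write-Host "All done! Detected $($stat.Issues) issues."
A synchronized hashtable makes individual reads and writes thread safe, but the read-modify-write of the increment still needs the explicit lock.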

Related

Why is the other PowerShell script run twice?

I have a script that checks whether a disk has a certain amount of free space. If not, a pop-up appears asking for a yes or no. If yes, an alarm flag is set to 1 and another script that deletes files from a folder runs. My issue is that it seems to delete twice the number of files specified in the script.
Main script:
$limit_low = 0.1 # low limit, 10%
$DiskD = Get-PSDrive D | Select-Object Used,Free
$DiskD_use = [math]::Round(($DiskD.Free / ($DiskD.Used + $DiskD.Free)), 2)
if( $DiskD_use -le $limit_low ) {
    Write-Host "RDS server is low on disk D space: $DiskD_use < $limit_low" -ForegroundColor Red -BackgroundColor Yellow
    $ButtonType = 4
    $Timeout = 60
    $Confirmation = New-Object -ComObject wscript.shell
    $ConfirmationAnswer = $Confirmation.popup("Clear disk space?",$Timeout,"No space",$ButtonType)
    If( $ConfirmationAnswer -eq 6 ) {
        Write-Host "Running script Diskspace.ps1 under P:\backupscripts"
        & c:\dynamics\app\JDSend.exe "/UDP /LOG:c:\dynamics\app\Fixskick.log /TAG:Lunsc2:K_PROCESS_LARM_DISKUTRYMME ""1"""
        & P:\BackupScripts\Delete_archives_test.ps1 # here I call the other script
    } else {
        Write-Host "Doing nothing"
        & c:\dynamics\app\JDSend.exe "/UDP /LOG:c:\dynamics\app\Fixskick.log /TAG:Lunsc2:K_PROCESS_LARM_DISKUTRYMME ""0"""
    }
}
Other script:
# List all txt files in the directory, sort them and select the first 10, then delete
Get-ChildItem -Path c:\temp -File Archive*.txt | Sort-Object | Select-Object -First 10 | Remove-Item -Force
Cheers
EDIT
So it would be enough to enclose the first statement like this:
(Get-ChildItem -Path c:\temp -File Archive*.txt) | Sort-Object | Select-Object -First 10 | Remove-Item -Force
?? Funny thing is I tried to reproduce this today at home with no effect. It works as intended from here, even without the parens.
Further, I am painfully ignorant about how to use the foreach statement, as it doesn't pipe the bastard :)
foreach ($file in $filepath) {$file} | Sort-Object | Select-Object -First 10 | Remove-Item -Force
I tried to put the sort and select part in the {} too, but nothing good came of it, as I'm stuck in the pipe and don't understand the foreach logic.
your problem appears to be caused by how your pipeline works. [grin]
think about what it does ...
1. read ONE fileinfo item
2. send it to the pipeline
3. change/add a file
4. continue the pipeline
that 3rd step will cause the file list to change ... and may result in a file being read again OR some other change in the list of files to work on.
there are two solutions that come to mind ...
1. wrap the Get-ChildItem call in parens
   that will force one read of the list before sending anything to the pipeline ... and that will ignore any changes caused by later pipeline stages.
2. use a foreach loop
   that will read the whole list and then iterate thru the list one item at a time.
the 2nd solution also has the benefit of being easier to debug, since the value of your current item only changes when explicitly modified. the current pipeline item changes at every pipeline stage ... and that is easy to forget. [grin]
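for illustration, a minimal sketch of the 2nd solution using the names from the question (snapshot the list first, then act on it):
# take one snapshot of the candidates, then delete from the snapshot
$files = Get-ChildItem -Path c:\temp -File Archive*.txt |
    Sort-Object |
    Select-Object -First 10
foreach ($file in $files) {
    Remove-Item -LiteralPath $file.FullName -Force
}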

PowerShell Get-Content with the -Wait flag and IOErrors

I have a PowerShell script that spawns x number of other PowerShell scripts in a Fire-And-Forget way.
In order to keep track of the progress of all the scripts that I just start, I create a temp file, where I have all of them write log messages in json format to report progress.
In the parent script I then monitor that log file using Get-Content -Wait. Whenever I receive a line in the log file, I parse the json and update an array of objects that I then display using Format-Table. That way I can see how far the different scripts are in their process and if they fail at a specific step. That works well... almost.
I keep running into IOErrors because so many scripts are accessing the log file, and when that happens the script just aborts and I lose all information on what is going on.
I would be able to live with the spawned scripts running into an IOError because they just continue and then I just catch the next message. I can live with some messages getting lost as this is not an audit log, but just a progress log.
But when the script that tails the log crashes then I lose insight.
I have tried to wrap this in a Try/Catch but that doesn't help. I have tried setting -ErrorAction Stop inside the Try/Catch but that still doesn't catch the error.
My script that reads looks like this:
function WatchLogFile($statusFile)
{
    Write-Host "Tailing statusfile: $($statusFile)"
    Write-Host "Press CTRL-C to end."
    Write-Host ""
    Try {
        Get-Content $statusFile -Force -Wait |
            ForEach {
                $logMsg = $_ | ConvertFrom-Json
                # Update status on step for specific service
                $svc = $services | Where-Object {$_.Service -eq $logMsg.Service}
                $svc.psobject.properties[$logMsg.step].value = $logMsg.status
                Clear-Host
                $services | Format-Table -Property Service,Old,New,CleanRepo,NuGet,Analyzers,CleanImports,Build,Invoke,Done,LastFailure
            } -ErrorAction Stop
    } Catch {
        WatchLogFile $statusFile
    }
}
And updates are written like this in the spawned scripts
Add-Content $statusFile $jsonLogMessage
Is there an easy way to add retries or how can I make sure my script survives file locks?
As @ChiliYago pointed out, I should use jobs. So that is what I have done now. I had to figure out how to collect the output as it arrived from the many scripts.
So I added all my jobs to an array of jobs and monitored them like this. Beware that you can receive multiple lines if your script has produced multiple outputs since you last invoked Receive-Job. Be sure to use Write-Output from the scripts you execute as jobs.
$jobs = @()
foreach ($script in $scripts)
{
    $sb = [scriptblock]::create("$script $(&{$args} @jobArgs)")
    $jobs += Start-Job -ScriptBlock $sb
}
$hasRunningJobs = $jobs.Count  # initialize so the loop is entered
while ($hasRunningJobs -gt 0)
{
    $runningJobs = $jobs | Where-Object {$_.State -eq "Running"} | measure
    $hasRunningJobs = $runningJobs.Count
    foreach ($job in $jobs)
    {
        $outvar = Receive-Job -Job $job
        if ($outvar)
        {
            $outvar -split "`n" | %{ UpdateStatusTable $_ }
        }
    }
}
Write-Host "All scripts done."

How to check the users currently using a PowerShell program?

So I have a basic program (incredibly buggy, but we quite like it) that uses a shared folder that a couple of people at school have access to (paths have been changed for ease of use). It is designed to work as a messaging application, with each user writing into the same Notepad file to send a message to a PowerShell script using Get-Content with the -Wait parameter. I have added a couple of commands using "/", but I want one (i.e. /online) that a user can type to see all of the other people currently using the program.
I have tried to set up a different text file that is updated every x seconds by each individual user with their own user name, while wiping the previous record:
while (1){
    Clear-Content -Path C:\users\Freddie\Desktop\ConvoOnline.txt
    Start-Sleep -Milliseconds 5000
    Add-Content -Path C:\users\Freddie\Desktop\ConvoOnline.txt $env:UserName
}
So this can be called upon later:
elseif($_ -match "/online"){Get-Content -Path C:\users\Freddie\Desktop\ConvoOnline.txt}
But this doesn't work; it won't sync up between users, so one user will wipe the current users and only their own name will appear as active, until another user's cycle wipes THEIR name.
To avoid the XY Problem, I want a fairly simple way (still only using two files maximum) to determine which users are actively using (therefore updating) the Powershell script they are running.
Whole code:
Add-Type -AssemblyName System.speech
$speak = New-Object System.Speech.Synthesis.SpeechSynthesizer
$speak.Volume = 100
Write-Host "Type /helpp, save it, then hit backspace and save it again for a guide and list of commands!"
Get-Content -Path C:\users\Freddie\Desktop\Convo.txt -Wait |
    %{$_ -replace "^", "$env:UserName "} |
    %{if($_ -match "/cls"){cls} `
    elseif($_ -match "/online"){Get-Content -Path C:\users\Freddie\Desktop\ConvoOnline.txt} `
    elseif(($_ -match "/afk") -and ($env:UserName -eq "Freddie")){Write-Host "$env:UserName has gone afk"} `
    elseif(($_ -match "/say") -and ($env:UserName -eq "Freddie")){$speak.Speak($_.Substring(($_.length)-10))} `
    elseif($_ -match "/whisper"){
        $array = @($_ -split "\s+")
        if($array[2] -eq "$env:UserName"){
            Write-Host $array[2]
        }
    } `
    elseif($_ -match "/help"){
        Write-Host "Help: `
1. Press Ctrl+S in Notepad to send your message `
2. Make sure you delete it after it's been sent `
3. If your message doesn't send properly, just hit backspace and all but the last letter will be sent `
`
COMMANDS: `
`
/online - Lists all users currently in the chat `
/cls - Clears your screen of all current and previous messages `
/whisper [USERNAME] [MESSAGE] - This allows you to send a message privately to a user"
    }
    else{Write-Host "$_"}}
#
#
#
#
#Add a command: elseif($_ -match "/[COMMAND]"){[FUNCTION]}
#
#Make it user-specific: elseif($_ -match "/[COMMAND]" -and $env:UserName -eq "[USERNAME]"){[FUNCTION]}
You can add a timestamp with Add-Content and use another ps1 file for clearing data written more than 5 seconds ago (you can do this in the same ps1 file, but another ps1 file is better).
Modified online-user update part:
while ($true){
    Add-Content -Path d:\ConvoOnline.txt "$($env:UserName);$(get-date)"
    Start-Sleep -Milliseconds 5000
}
Another script watches and clears content older than 5 seconds, so the online file is always up to date:
while($true){
    Start-Sleep -Milliseconds 5000
    $data = get-content -Path D:\ConvoOnline.txt
    clear-content -Path D:\ConvoOnline.txt
    if($data){
        # keep only entries whose timestamp is newer than 4.5 seconds ago
        $data | %{ if([datetime]($_.split(";")[1]) -ge (get-date).AddMilliseconds(-4500)){ Add-Content -Path d:\ConvoOnline.txt $_ } }
    }
}

How to use the PowerShell Pipeline to Avoid Large Objects?

I'm using a custom function to essentially do a DIR command (recursive file listing) on an 8TB drive (thousands of files).
My first iteration was:
$results = $PATHS | % {Get-FolderItem -Path "$($_)" } | Select Name,DirectoryName,Length,LastWriteTime
$results | Export-Csv -Path $csvfile -Force -Encoding UTF8 -NoTypeInformation -Delimiter "|"
This resulted in a HUGE $results variable and slowed the system down to a crawl by spiking the powershell process to use 99%-100% of the CPU as the processing went on.
I decided to use the power of the pipeline to WRITE to the CSV file directly (presumably freeing up the memory) instead of saving to an intermediate variable, and came up with this:
$PATHS | % {Get-FolderItem -Path "$($_)" } | Select Name,DirectoryName,Length,LastWriteTime | ConvertTo-CSV -NoTypeInformation -Delimiter "|" | Out-File -FilePath $csvfile -Force -Encoding UTF8
This seemed to be working fine (the CSV file was growing..and CPU seemed to be stable) but then abruptly stopped when the CSV file size hit ~200MB, and the error to the console was "The pipeline has been stopped".
I'm not sure the CSV file size had anything to do with the error message, but I'm unable to process this large directory with either method! Any suggestions on how to allow this process to complete successfully?
Get-FolderItem runs robocopy to list the files and converts its output into a PSObject array. This is a slow operation which, strictly speaking, isn't required for the actual task. Pipelining also adds big overhead compared to the foreach statement. In the case of thousands or hundreds of thousands of repetitions, that becomes noticeable.
We can speed up the process beyond anything pipelining and standard PowerShell cmdlets can offer, writing the info for 400,000 files on an SSD drive in 10 seconds, by using:
- .NET Framework 4 or newer (included since Win8, installable on Win7/XP): IO.DirectoryInfo's EnumerateFileSystemInfos to enumerate the files in a non-blocking, pipeline-like fashion;
- PowerShell 3 or newer, as it's faster than PS2 overall;
- the foreach statement, which doesn't need to create a ScriptBlock context for each item and thus is much faster than the ForEach cmdlet;
- IO.StreamWriter to write each file's info immediately, in a non-blocking, pipeline-like fashion;
- the \\?\ prefix trick to lift the 260-character path length restriction;
- manual queuing of directories to process, to get past "access denied" errors which would otherwise stop naive IO.DirectoryInfo enumeration;
- progress reporting.
function List-PathsInCsv([string[]]$PATHS, [string]$destination) {
    $prefix = '\\?\' # UNC prefix lifts the 260 character path length restriction
    $writer = [IO.StreamWriter]::new($destination, $false, [Text.Encoding]::UTF8, 1MB)
    $writer.WriteLine('Name|Directory|Length|LastWriteTime')
    $queue = [Collections.Generic.Queue[string]]($PATHS -replace '^', $prefix)
    $numFiles = 0
    while ($queue.Count) {
        $dirInfo = [IO.DirectoryInfo]$queue.Dequeue()
        try {
            $dirEnumerator = $dirInfo.EnumerateFileSystemInfos()
        } catch {
            Write-Warning ("$_".replace($prefix, '') -replace '^.+?: "(.+?)"$', '$1')
            continue
        }
        $dirName = $dirInfo.FullName.replace($prefix, '')
        foreach ($entry in $dirEnumerator) {
            if ($entry -is [IO.FileInfo]) {
                $writer.WriteLine([string]::Join('|', @(
                    $entry.Name
                    $dirName
                    $entry.Length
                    $entry.LastWriteTime
                )))
            } else {
                $queue.Enqueue($entry.FullName)
            }
            if (++$numFiles % 1000 -eq 0) {
                Write-Progress -Activity Digging -Status "$numFiles files, $dirName"
            }
        }
    }
    $writer.Close()
    Write-Progress -Activity Digging -Completed
}
Usage:
List-PathsInCsv 'c:\windows', 'd:\foo\bar' 'r:\output.csv'
Don't use robocopy; use native PowerShell commands, like this:
$PATHS = 'c:\temp', 'c:\temp2'
$csvfile='c:\temp\listresult.csv'
$PATHS | % {Get-ChildItem $_ -file -recurse } | Select Name,DirectoryName,Length,LastWriteTime | export-csv $csvfile -Delimiter '|' -Encoding UTF8 -NoType
Short version, for non-purists:
$PATHS | % {gci $_ -file -rec } | Select Name,DirectoryName,Length,LastWriteTime | epcsv $csvfile -D '|' -E UTF8 -NoT

Local Groups and Members

I have a requirement to report the local groups and their members from a specific list of servers. I have the following script, which I pieced together from other scripts. When I run the script it writes the name of the server it is querying, the server's local group names, and the members of those groups. I would like to output the text to a file, but wherever I add the | Out-File command I get the error "An empty pipe element is not allowed". My secondary concern is whether the method I've chosen to report the server being queried will work when outputting to a file. Will you please help correct this newbie's script errors?
$server = Get-Content "C:\Powershell\Local Groups\Test.txt"
Foreach ($server in $server)
{
    $computer = [ADSI]"WinNT://$server,computer"
    "`n"
    write-host "==========================="
    write-host "Server: $server"
    write-host "==========================="
    "`n"
    $computer.psbase.children | where { $_.psbase.schemaClassName -eq 'group' } | foreach {
        write-host $_.name
        write-host "------"
        $group = [ADSI]$_.psbase.Path
        $group.psbase.Invoke("Members") | foreach { $_.GetType().InvokeMember("Name", 'GetProperty', $null, $_, $null) }
        write-host **
        write-host
    }
}
Thanks,
Kevin
You say that you are using Out-File and getting that error, but you don't show _where_ in your code it is being called from.
Given the code you have, my best guess is that you were trying something like this:
Foreach ($server in $server){
    # All the code in this block
} | Out-File c:\pathto.txt
I wish I had a technical reference for this interpretation, but alas I have not found one (I think it has to do with older PowerShell versions). In my experience, no standard output is passed from that construct. As an aside, ($server in $server) is misleading even if it works. Might I suggest this small change; let me know if that works.
$servers = Get-Content "C:\Powershell\Local Groups\Test.txt"
$servers | ForEach-Object{
    $server = $_
    # Rest of code inside block stays the same
} | Out-File c:\pathto.txt
If that is not your speed, then I would also consider building an empty array outside the block and populating it on each loop pass.
# Declare empty array to hold results
$results = @()
Foreach ($server in $server){
    # Code before this line
    $results += $group.psbase.Invoke("Members") | foreach { $_.GetType().InvokeMember("Name", 'GetProperty', $null, $_, $null) }
    # Code after this line
}
$results | Set-Content c:\pathto.txt
Worthy Note
You are mixing console output with standard output. Depending on what you want to do with the script, you will not get the output you expect. If you want lines like write-host "Server: $server" to be in the output file, then you need to use Write-Output.
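For example, a minimal sketch of the header portion rewritten so it lands in the file:
$servers = Get-Content "C:\Powershell\Local Groups\Test.txt"
$servers | ForEach-Object {
    # Write-Output goes to the pipeline, so Out-File receives it
    Write-Output "==========================="
    Write-Output "Server: $_"
    Write-Output "==========================="
    # ... emit group and member names with Write-Output too ...
} | Out-File c:\pathto.txt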