foreach loops: How to update collection variable within loop? - powershell

Is there a way to change the behaviour so that the collection variable of a foreach loop can be updated from within the loop, with the new values used in the next iteration?
For example:
$items = @(1,1,1,2)
$counter = 0
foreach ($item in $items) {
$counter += 1
Write-Host "Iteration:" $counter " | collection variable:" $items
$item
$items = $items | Where-Object {$_ -ne $item}
}
$counter
If you run this code the loop will execute four times.
However, since with the first iteration $items is changed from 1,1,1,2 to only contain 2, the loop should only run once more.
I suspect this is because the collection variable $items is not updated in the foreach part.
Is there a way to fix this?

You cannot use a foreach loop with a collection that is being modified in the loop body.
Attempting to do so will actually result in an error (Collection was modified; enumeration operation may not execute.)
The reason you're not seeing an error is that you're not actually modifying the original collection itself; you're assigning a new collection instance to the same variable, but that has no bearing on the original collection instance being enumerated.
You should use a while loop instead, in whose condition the $items variable reference is re-evaluated in every iteration:
$items = 1, 1, 1, 2
$counter = 0
while ($items) { # Loop as long as the collection has at least 1 item.
$counter += 1
Write-Host "Iteration: $counter | collection variable: $items"
$item = $items[0] # access the 1st element
$item # output it
$items = $items | Where-Object {$_ -ne $item} # filter out all elements with the same val.
}
Now you get just 2 iterations:
Iteration: 1 | collection variable: 1 1 1 2
1
Iteration: 2 | collection variable: 2
2
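Incidentally, the error quoted above can be demonstrated by enumerating a collection that really is modified in place, such as a generic List, rather than reassigning the variable; this is a minimal sketch:

```powershell
# Demonstration: mutating the collection being enumerated (rather than
# reassigning the variable) aborts the enumeration with an error.
$list = [System.Collections.Generic.List[int]] (1, 1, 1, 2)

try {
    foreach ($item in $list) {
        [void] $list.Remove($item)  # modifies the live collection
    }
}
catch {
    $_.Exception.Message  # reports the "Collection was modified" error
}
```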

Related

How do I add an atomic counter to a powershell ForEach -Parallel loop

In this question, it was explained how to add to a concurrent, thread-safe collection: Powershell: How to add Result to an Array (ForEach-Object -Parallel)
I have a simpler use case, where I would just like to increment a single value (an integer).
Is it possible to do in Powershell using some sort of Atomic Integer data type?
$myAtomicCounter = 0
$myItems | ForEach-Object -Parallel {
#...other work
$myAtomicCounter.ThreadSafeAdd(2)
# .. some more work using counter
}
Write-Host($myAtomicCounter)
In PowerShell, when updating a single value from multiple threads, you must use a locking mechanism (for example Mutex, SemaphoreSlim or even Monitor.Enter); otherwise the updating operation will not be thread safe.
For example, supposing you have an array of arrays:
$toProcess = 0..10 | ForEach-Object {
, (Get-Random -Count (Get-Random -Minimum 5 -Maximum 10))
}
And you wanted to keep track of the processed items in each array, here is how you could do it using Mutex:
$processedItems = [hashtable]::Synchronized(@{
Lock = [System.Threading.Mutex]::new()
Counter = 0
})
$toProcess | ForEach-Object -Parallel {
# using sleep as to emulate doing something here
Start-Sleep (Get-Random -Maximum 5)
# bring the local variable to this scope
$ref = $using:processedItems
# lock this thread until I can write
if($ref['Lock'].WaitOne()) {
# when I can write, update the value
$ref['Counter'] += $_.Count
# and release this lock so other threads can write
$ref['Lock'].ReleaseMutex()
}
}
$processedCount = ($toProcess | Write-Output | Measure-Object).Count
# Should be True:
$processedItems['Counter'] -eq $processedCount
A much simpler approach in PowerShell would be to output from your parallel loop into a linear loop where you can safely update the counter without having to care about thread safety:
$counter = 0
$toProcess | ForEach-Object -Parallel {
# using sleep as to emulate doing something here
Start-Sleep (Get-Random -Maximum 5)
# when this thread is done,
# output this array of processed items
$_
} | ForEach-Object {
# then the output from the parallel loop is received in this linear
# thread safe loop where we can update the counter
$counter += $_.Count
}
$processedCount = ($toProcess | Write-Output | Measure-Object).Count
# Should be True:
$counter -eq $processedCount
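The answer above also mentions Monitor.Enter as a locking option; for the simple shared-counter case it can be sketched like this (assumes PowerShell 7+ for ForEach-Object -Parallel; the counter name and the added value are illustrative):

```powershell
# Sketch: a shared counter guarded by Monitor.Enter / Monitor.Exit.
$shared = [hashtable]::Synchronized(@{ Counter = 0 })

1..10 | ForEach-Object -Parallel {
    $ref = $using:shared
    [System.Threading.Monitor]::Enter($ref.SyncRoot)
    try {
        $ref['Counter'] += 2  # safe: only one thread at a time runs this section
    }
    finally {
        [System.Threading.Monitor]::Exit($ref.SyncRoot)
    }
}

$shared['Counter']  # 10 iterations x 2 = 20
```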

Powershell - Matching text from array and string?

I have an array that simply pulls a list of numbers in one long column. I am trying to match it with some data in a string and if it matches, I am wanting it to state Down otherwise it will state Up in the output CSV.
Here is my code:
IF($RESULTS -like $TEST)
{$Output = "DOWN"
}ELSE{
$OUtput = "UP"
}
$RESULTS is the array, and $TEST is the string. If I do -match it works, but -match matches substrings, so it gives false positives. For example, if there is a 3 in the list as well as 638 it will mark them both as down. None of the other operators seem to work, like -eq, -like, etc.
What am I missing please?
Thanks much for any assistance!
EDIT:
Sample of Data in $TEST
2
3
5
Sample of Output of $RESULTS
5
628
Since 5 exists in both, my expected output would be DOWN and everything else would be UP.
It sounds like you have two arrays of numbers, and you want to test if the input array contains at least one of the values in the test array.
You can use Compare-Object, which compares two arrays and indicates which elements are different and, with -IncludeEqual, also which ones are the same:
if (
(Compare-Object -IncludeEqual $RESULTS $TEST).
Where({ $_.SideIndicator -eq '==' }).
Count -gt 0
) {
$Output = "DOWN"
}
else {
$Output = "UP"
}
As an aside:
You can use an if statement as an expression, which means you only need to specify $Output once:
$Output =
IF (<# conditional #>) {
"DOWN"
}
ELSE {
"UP"
}
In PowerShell (Core) 7+, you can use ?:, the ternary conditional operator for a more concise solution:
$Output = <# conditional #> ? 'DOWN' : 'UP'
I would do it using foreach. Hope this might be helpful:
foreach ($result in $RESULTS){
    if ($result -like $Test){
        $OUTPUT = "Down"
        break # stop at the first match so a later non-match cannot overwrite it
    }
    else{
        $OUTPUT = "UP"
    }
}
In your edited question you show that variable $TEST is also an array, so in that case you can do
$TEST = 2,3,5
$RESULTS = 5,628
# compare both arrays for equal items
$Output = if ($TEST | Where-Object { $RESULTS -contains $_ }) {"DOWN"} else {"UP"}
$Output
In this case, $Output will be DOWN because both arrays have the number 5
If however variable $TEST contains a multiline string, then first create an array out of that like
$TEST = @'
2
3
5
'@
# convert the string to array by splitting at the newline
$TEST = $TEST -split '\r?\n' -ne ''
$RESULTS = 5,628
# compare both arrays for equal items
$Output = if ($TEST | Where-Object { $RESULTS -contains $_ }) {"DOWN"} else {"UP"}
$Output
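To illustrate why -match produced the false positives described in the question: with an array as its left-hand operand, -match returns the elements that match the right-hand operand as a regex, i.e. by substring, whereas -contains performs an exact element comparison:

```powershell
$RESULTS = '5', '638'

$RESULTS -match '3'      # returns 638 - regex (substring) match, a false positive
$RESULTS -contains '3'   # False - exact element comparison
$RESULTS -contains '638' # True
```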

how to find unique line in a txt file?

I have a LARGE list of hashes. I need to find out which ones only appear once as most are duplicates.
EX: the last line 238db2..... only appears once
ac6b51055fdac5b92934699d5b07db78
ac6b51055fdac5b92934699d5b07db78
7f5417a85a63967d8bba72496faa997a
7f5417a85a63967d8bba72496faa997a
1e78ba685a4919b7cf60a5c60b22ebc2
1e78ba685a4919b7cf60a5c60b22ebc2
238db202693284f7e8838959ba3c80e8
I tried the following, but it just listed one of each of the duplicates instead of identifying the line that only appeared once:
foreach ($line in (Get-Content "C:\hashes.txt" | Select-Object -Unique)) {
Write-Host "Line '$line' appears $(($line | Where-Object {$_ -eq $line}).count) time(s)."
}
You could use a Hashtable and a StreamReader.
The StreamReader reads the file line-by-line, and the Hashtable stores each line as Key, with as Value $true (if the line is a duplicate) or $false (if it is unique).
$reader = [System.IO.StreamReader]::new('D:\Test\hashes.txt')
$hash = @{}
while($null -ne ($line = $reader.ReadLine())) {
$hash[$line] = $hash.ContainsKey($line)
}
# clean-up the StreamReader
$reader.Dispose()
# get the unique line(s) by filtering for value $false
$result = $hash.Keys | Where-Object {-not $hash[$_]}
Given your example, $result will contain 238db202693284f7e8838959ba3c80e8
Given that you're dealing with a large file, Get-Content is best avoided.
A switch statement with the -File parameter allows efficient line-by-line processing, and given that duplicates appear to be grouped together already, they can be detected by keeping a running count of identical lines.
$count = 0 # keeps track of the count of identical lines occurring in sequence
switch -File 'C:\hashes.txt' {
default {
if ($prevLine -eq $_ -or $count -eq 0) { # duplicate or first line.
if ($count -eq 0) { $prevLine = $_ }
++$count
}
else { # current line differs from the previous one.
if ($count -eq 1) { $prevLine } # non-duplicate -> output
$prevLine = $_
$count = 1
}
}
}
if ($count -eq 1) { $prevLine } # output the last line, if a non-duplicate.
$values = Get-Content .\hashes.txt # Read the values from the hashes.txt file
$groups = $values | Group-Object | Where-Object { $_.Count -eq 1 } # Group the values by their distinct values and filter for groups with a single value
foreach ($group in $groups) {
foreach ($value in $group.Values) {
Write-Host "$value" # Output the value of each group
}
}
To handle very large files you could try this.
$chunkSize = 1000 # Set the chunk size to 1000 lines
$lineNumber = 0 # Initialize a line number counter
# Use a do-while loop to read the file in chunks
do {
# Read the next chunk of lines from the file
$values = Get-Content .\hashes.txt | Select-Object -Skip $lineNumber -First $chunkSize
# Group the values by their distinct values and filter for groups with a single value
$groups = $values | Group-Object | Where-Object { $_.Count -eq 1 }
foreach ($group in $groups) {
foreach ($value in $group.Values) {
Write-Host "$value" # Output the value of each group
}
}
# Increment the line number counter by the chunk size
$lineNumber += $chunkSize
} while ($values.Count -eq $chunkSize)
Or this
# Create an empty dictionary
$dict = New-Object System.Collections.Hashtable
# Read the file line by line
foreach ($line in Get-Content .\hashes.txt) {
# Check if the line is already in the dictionary
if ($dict.ContainsKey($line)) {
# Increment the value of the line in the dictionary
$dict.Item($line) += 1
} else {
# Add the line to the dictionary with a count of 1
$dict.Add($line, 1)
}
}
# Filter the dictionary for values with a count of 1
$singles = $dict.GetEnumerator() | Where-Object { $_.Value -eq 1 }
# Output the values of the single items
foreach ($single in $singles) {
Write-Host $single.Key
}

A better way to slice an array (or a list) in powershell

How can I export mail addresses to CSV files in batches of 30 users each?
I have already tried this:
$users = Get-ADUser -Filter * -Properties Mail
$nbCsv = [int][Math]::Ceiling($users.Count/30)
For($i=0; $i -le $nbCsv; $i++){
$arr=@()
For($j=(0*$i);$j -le ($i + 30);$j++){
$arr+=$users[$j]
}
$arr|Export-Csv -Path ($PSScriptRoot + "\ASSFAM" + ("{0:d2}" -f ([int]$i)) + ".csv") -Delimiter ";" -Encoding UTF8 -NoTypeInformation
}
It works, but I think there is a better way to achieve this task.
Have you got some ideas?
Thank you.
If you want a subset of an array, you can just use .., the range operator. The first 30 elements of an array would be:
$users[0..29]
Typically, you don't have to worry about going past the end of the array, either (see mklement0's comment below). If there are 100 items and you're calling $array[90..119], you'll get the last 10 items in the array and no error. You can use variables and expressions there, too:
$users[$i..($i + 29)]
That's the $ith value and the next 29 values after the $ith value (if they exist).
Also, this pattern should be avoided in PowerShell:
$array = #()
loop-construct {
$array += $value
}
Arrays are immutable in .Net, and therefore immutable in PowerShell. That means that adding an element to an array with += means "create a brand new array, copy every element over, and then put this one new item on it, and then delete the old array." It generates tremendous memory pressure, and if you're working with more than a couple hundred items it will be significantly slower.
Instead, just do this:
$array = loop-construct {
$value
}
Strings are similarly immutable and have the same problem with the += operator. If you need to build a string via concatenation, you should use the StringBuilder class.
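For instance, a minimal sketch of building a string with StringBuilder instead of repeated += concatenation:

```powershell
# Append to a mutable buffer; only one string is materialized, at the end.
$sb = [System.Text.StringBuilder]::new()
foreach ($i in 1..5) {
    [void] $sb.Append("item$i;")  # Append returns the builder; discard that output
}
$sb.ToString()  # item1;item2;item3;item4;item5;
```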
Ultimately, however, here is how I would write this:
$users = Get-ADUser -Filter * -Properties Mail
$exportFileTemplate = Join-Path -Path $PSScriptRoot -ChildPath 'ASSFAM{0:d2}.csv'
$batchSize = 30
$batchNum = 0
$row = 0
while ($row -lt $users.Count) {
$users[$row..($row + $batchSize - 1)] | Export-Csv ($exportFileTemplate -f $batchNum) -Encoding UTF8 -NoTypeInformation
$row += $batchSize
$batchNum++
}
$row and $batchNum could be rolled into one variable, technically, but this is a bit more readable, IMO.
I'm sure you could also write this with Select-Object and Group-Object, but that's going to be fairly complicated compared to the above, and Group-Object isn't entirely known for its performance prior to PowerShell 6.
If you are using PowerShell's strict mode, which may be required in certain configurations or scenarios, then you'll need to check that you don't enumerate past the end of the array. You could do that with:
while ($row -lt $users.Count) {
$users[$row..([System.Math]::Min(($row + $batchSize - 1),($users.Count - 1)))] | Export-Csv ($exportFileTemplate -f $batchNum) -Encoding UTF8 -NoTypeInformation
$row += $batchSize
$batchNum++
}
I believe that's correct, but I may have an off-by-one error in that logic. I have not fully tested it.
Bacon Bits' helpful answer shows how to simplify your code with the help of .., the range operator, but it would be nice to have a general-purpose chunking (partitioning, batching) mechanism; however, as of PowerShell 7.0, there is no built-in feature.
GitHub feature suggestion #8270 proposes adding a -ReadCount <int> parameter to Select-Object, analogous to the parameter of the same name already defined for Get-Content.
If you'd like to see this feature implemented, show your support for the linked issue there.
With that feature in place, you could do the following:
$i = 0
Get-ADUser -Filter * -Properties Mail |
Select-Object -ReadCount 30 | # WISHFUL THINKING: output 30-element arrays
ForEach-Object {
$_ | Export-Csv -Path ($PSScriptRoot + "\ASSFAM" + ("{0:d2}" -f ++$i) + ".csv") -Delimiter ";" -Encoding UTF8 -NoTypeInformation
}
In the interim, you could use custom function Select-Chunk (source code below): replace Select-Object -ReadCount 30 with Select-Chunk -ReadCount 30 in the snippet above.
Here's a simpler demonstration of how it works:
PS> 1..7 | Select-Chunk -ReadCount 3 | ForEach-Object { "$_" }
1 2 3
4 5 6
7
The above shows that the ForEach-Object script block receives the following
three arrays, via $_, in sequence:
(1, 2, 3), (4, 5, 6), and (, 7)
(When you stringify an array, by default you get a space-separated list of its elements; e.g., "$(1, 2, 3)" yields 1 2 3).
Select-Chunk source code:
The implementation uses a [System.Collections.Generic.Queue[object]] instance to collect the inputs in batches of fixed size.
function Select-Chunk {
<#
.SYNOPSIS
Chunks pipeline input.
.DESCRIPTION
Chunks (partitions) pipeline input into arrays of a given size.
By design, each such array is output as a *single* object to the pipeline,
so that the next command in the pipeline can process it as a whole.
That is, for the next command in the pipeline $_ contains an *array* of
(at most) as many elements as specified via -ReadCount.
.PARAMETER InputObject
The pipeline input objects bind to this parameter one by one.
Do not use it directly.
.PARAMETER ReadCount
The desired size of the chunks, i.e., how many input objects to collect
in an array before sending that array to the pipeline.
0 effectively means: collect *all* inputs and output a single array overall.
.EXAMPLE
1..7 | Select-Chunk 3 | ForEach-Object { "$_" }
1 2 3
4 5 6
7
The above shows that the ForEach-Object script block receives the following
three arrays: (1, 2, 3), (4, 5, 6), and (, 7)
#>
[CmdletBinding(PositionalBinding = $false)]
[OutputType([object[]])]
param (
[Parameter(ValueFromPipeline)]
$InputObject
,
[Parameter(Mandatory, Position = 0)]
[ValidateRange(0, [int]::MaxValue)]
[int] $ReadCount
)
begin {
$q = [System.Collections.Generic.Queue[object]]::new($ReadCount)
}
process {
$q.Enqueue($InputObject)
if ($q.Count -eq $ReadCount) {
, $q.ToArray()
$q.Clear()
}
}
end {
if ($q.Count) {
, $q.ToArray()
}
}
}

What's the best way to iterate over an array and modify the elements in PowerShell?

I know Ruby has a map! method which does this. In PowerShell, currently what I did is:
$new_array = @();
$array | % {
    $new_array += <do something with the current element $_>;
}
$array = $new_array;
I want to know the best way to do this. Thanks!
Simplest I can think of is
$array | %{ $_ + 1 }
$_ + 1 being the transformation I want.
So you can do:
$new = $array | %{$_ + 1}
or
$array = $array | %{ $_ + 1}
PS:
You can define a map function if you want:
function map ([scriptblock]$script, [array] $a) {
$a | %{ & $script $_ }
}
map {param($i) $i + 1} 1,2,3
Or write an extension method on System.Array: http://blogs.msdn.com/b/powershell/archive/2008/09/06/hate-add-member-powershell-s-adaptive-type-system.aspx
You can also use a filter:
$arr = 1,2,3
filter addone {$_ +1}
$arr | addone
Or a filtering function:
function addone {process{$_ +1}}
$arr | addone
Or an anonymous filter:
$addone = {$_ +1}
$addone.isfilter = $true
$arr | &$addone
Edit: The days of that anonymous filter working may be numbered. In the V3 beta that doesn't work any more, and I suspect that's not going to change. It appears the script blocks are compiled as soon as they are created, and the .isfilter property can't be changed after they're compiled.