I'm enumerating through all of the datastores in our VMware environment to get names and used space.
When I run the foreach loop, it's both enumerating through the array, and not enumerating through the array.
Here's my script:
$list = @()
$row = '' | select Name, UsedSpace
$datastores = Get-Datastore
foreach ($store in $datastores) {
$row.name = $store.name;
$row.usedspace = [math]::Round(($store.extensiondata.summary.capacity - $store.extensiondata.summary.freespace)/1gb)
Write-Host $row; #To Verify that each row is different, and that enumeration is working#
$list += $row;
}
Console Output:
@{name=datastore1; usedspace=929}
@{name=datastore2; usedspace=300}
@{name=datastore3; usedspace=400}
$list variable output:
Name Usedspace
Datastore3 400
Datastore3 400
Datastore3 400
So it is enumerating through and getting all the correct data. But for some reason the line $list += $row seems to wait until the last object in the array, grab only that data, and, knowing that there are 3 objects in the array, populate each index with that object's data.
The only thing I've done to troubleshoot is bounced my PowerShell console.
The reason for this is that $row is a single object. You created it once, and then you keep changing the values of its properties. When you add it to the array, you're adding a reference to it, not a copy. So the values seen will always be those that were most recently set.
Recreate your $row on every iteration of the loop.
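A minimal sketch of that fix, assuming stand-in data in place of Get-Datastore (the $datastores array and its UsedGB property are made up for illustration, so the snippet runs without VMware PowerCLI):

```powershell
# Hypothetical stand-in data; in the real script this comes from Get-Datastore.
$datastores = @(
    [PSCustomObject]@{ Name = 'datastore1'; UsedGB = 929 }
    [PSCustomObject]@{ Name = 'datastore2'; UsedGB = 300 }
    [PSCustomObject]@{ Name = 'datastore3'; UsedGB = 400 }
)

$list = @()
foreach ($store in $datastores) {
    # A fresh object is created on every iteration,
    # so each array slot refers to its own object.
    $row = '' | Select-Object Name, UsedSpace
    $row.Name = $store.Name
    $row.UsedSpace = $store.UsedGB
    $list += $row
}

$list
```

With $row created inside the loop, the three entries in $list keep their own values instead of all showing the last datastore.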
Alternatively you could create a PSCustomObject
$list = foreach ($store in Get-Datastore) {
[PSCustomObject]@{
Name = $store.name
UsedSpace = [math]::Round(($store.extensiondata.summary.capacity -
$store.extensiondata.summary.freespace)/1gb)
}
}
$list
As mentioned in the comments, with:
$row = '' | select Name, UsedSpace; $row.GetType()
you implicitly also create an (empty) PSCustomObject,
but since this has to be created in every iteration of the foreach and then (inefficiently) appended to $list by rebuilding the array, directly building the PSCustomObject is IMO clearer and more straightforward.
I have a TXT file with 1300 megabytes (huge thing). I want to build code that does two things:
Every line contains a unique ID at the beginning. For all lines with the same unique ID, I want to check whether the conditions are met for that "group" of IDs. (This tells me: for how many lines with the unique ID X have all conditions been met?)
Once the script is finished, I want to remove all lines from the TXT where the condition was met (see 2), so I can rerun the script with another condition set to "narrow down" the whole document.
After a few cycles I finally have a set of conditions that applies to all lines in the document.
It seems that my current approach is very slow (one cycle takes hours). My final result is a set of conditions that apply to all lines of code.
If you find an easier way to do that, feel free to recommend.
Help is welcome :)
Code so far (does not fulfill everything from 1&2)
foreach ($item in $liste)
{
# Check Conditions
if ( ($item -like "*XXX*") -and ($item -like "*YYY*") -and ($item -notlike "*ZZZ*")) {
# Add a line to a document to see which lines match condition
Add-Content "C:\Desktop\it_seems_to_match.txt" "$item"
# Retrieve the unique ID from the line and feed array.
$array += $item.Split("/")[1]
# Remove the line from final document
$liste = $liste -replace $item, ""
}
}
# Pipe the "new cleaned" list somewhere
$liste | Set-Content -Path "C:\NewListToWorkWith.txt"
# Show me the counts
$array | group | % { $h = @{} } { $h[$_.Name] = $_.Count } { $h } | Out-File "C:\Desktop\count.txt"
Demo Lines:
images/STRINGA/2XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGA/3XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGB/4XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGB/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGC/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
performance considerations:
Add-Content "C:\Desktop\it_seems_to_match.txt" "$item"
try to avoid wrapping cmdlet pipelines
See also: Mastering the (steppable) pipeline
$array += $item.Split("/")[1]
Try to avoid using the increase assignment operator (+=) to create a collection
See also: Why should I avoid using the increase assignment operator (+=) to create a collection
$liste = $liste -replace $item, ""
This is a very expensive operation considering that you are reassigning (copying) a long list ($liste) with each iteration.
Besides it is a bad practice to change an array that you are currently iterating.
$array | group | ...
Group-Object is a rather slow cmdlet, you better collect (or count) the items on-the-fly (where you do $array += $item.Split("/")[1]) using a hashtable, something like:
$Name = $item.Split("/")[1]
if (!$HashTable.Contains($Name)) { $HashTable[$Name] = [Collections.Generic.List[String]]::new() }
$HashTable[$Name].Add($Item)
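Putting these points together, a rough single-pass sketch might look like the following. It assumes shortened, made-up sample lines and only counts matches per ID rather than storing them; the names $lines, $counts and $remaining are illustrative and not from the original script.

```powershell
# Hypothetical sample lines; the real input would be read from the large file.
$lines = @(
    'images/STRINGA/2XXX_YYY.jpg'
    'images/STRINGA/3XXX_YYY.jpg'
    'images/STRINGB/4XXX_YYY_ZZZ.jpg'
    'images/STRINGC/5XXX_YYY.jpg'
)

$counts = @{}   # unique ID -> number of lines that matched the conditions

# The loop's output (the non-matching lines) is collected in one assignment,
# so there is no += and the list being iterated is never modified.
$remaining = foreach ($item in $lines) {
    if ($item -like '*XXX*' -and $item -like '*YYY*' -and $item -notlike '*ZZZ*') {
        $id = $item.Split('/')[1]
        # [int]$null is 0, so a missing key starts counting at 1.
        $counts[$id] = [int]$counts[$id] + 1
    }
    else {
        $item   # keep this line for the next cycle
    }
}

$counts
```

$remaining can then be written out once with Set-Content, and $counts replaces the Group-Object step.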
To minimize memory usage, it may be better to read one line at a time and check whether it already exists. The code below uses StringReader, which you can replace with StreamReader to read from a file. I'm checking whether the entire string exists, but you may want to split the line. Notice that the input contains duplicates but the dictionary does not. See the code below:
$rows= #"
images/STRINGA/2XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGA/3XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGB/4XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGB/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGC/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGA/2XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGA/3XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGB/4XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGB/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
images/STRINGC/5XXXXXXXX_rTTTTw_GGGG1_Top_MMM1_YY02_ZZZ30_AAAA5.jpg
"#
$dict = [System.Collections.Generic.Dictionary[int, System.Collections.Generic.List[string]]]::new();
$reader = [System.IO.StringReader]::new($rows)
while(($row = $reader.ReadLine()) -ne $null)
{
$hash = $row.GetHashCode()
if($dict.ContainsKey($hash))
{
#check if list contains the string
if($dict[$hash].Contains($row))
{
#string is a duplicate
}
else
{
#add string to dictionary value if it is not in list
$list = $dict[$hash]
$list.Add($row)
}
}
else
{
#add new hash value to dictionary
$list = [System.Collections.Generic.List[string]]::new();
$list.Add($row)
$dict.Add($hash, $list)
}
}
$dict
Apologies if this is irrelevant but I'm new to powershell and I've been scratching my head on this for a few days on and off now. I'm trying to write a script that will output two columns of data to a html document. I've achieved most of it by learning through forums and testing different combinations.
The problem is although it gives me the result I need within powershell itself; it will not properly display the second column results for Net Log Level.
So the script looks at some folders and pulls the * value which is always three digits (this is the Site array). It then looks within each of these folders to the Output folder and grabs a Net Log Level node from a file inside there. The script is correctly listing the Sites but is only showing the last value for Net Log Level which is 2. You can see this in the screenshot above. I need this to take every value for each Site and display as appropriate. The image of the incorrect result is below. I need the result to be 1,4,2,2,2. Any help would be greatly appreciated!
function getSite {
Get-ChildItem C:\Scripts\ServiceInstalls\*\Output\'Config.exe.config' | foreach {
$Site = $_.fullname.substring(27, 3)
[xml]$xmlRead = Get-Content $_
$NetLogLevel = $xmlRead.SelectSingleNode("//add[@key='Net Log Level']")
$NetLogLevel = $NetLogLevel.value
New-Object -TypeName System.Collections.ArrayList
$List1 += @([System.Collections.ArrayList]@($Site))
New-Object -TypeName System.Collections.ArrayList
$List2 += @([System.Collections.ArrayList]@($NetLogLevel))
}
$Results = @()
ForEach($Site in $List1){
$Results += [pscustomobject]@{
"Site ID" = $Site
"Net Log Level" = $NetLogLevel
}
}
$Results | ConvertTo-HTML -Property 'Site','Net Log Level' | Set-Content Output.html
Invoke-Item "Output.html"
}
getSite
Restructure your code as follows:
Get-ChildItem 'C:\Scripts\ServiceInstalls\*\Output\Config.exe.config' |
ForEach-Object {
$site = $_.fullname.substring(27, 3)
[xml]$xmlRead = Get-Content -Raw $_.FullName
$netLogLevel = $xmlRead.SelectSingleNode("//add[@key='Net Log Level']").InnerText
# Construct *and output* a custom object for the file at hand.
[pscustomobject] @{
'Site ID' = $site
'Net Log Level' = $netLogLevel
}
} | # Pipe the stream of custom objects directly to ConvertTo-Html
ConvertTo-Html | # No need to specify -Property if you want to use all properties.
Set-Content Output.html
As for what you tried:
New-Object -TypeName System.Collections.ArrayList in effect does nothing: it creates an array-list instance but doesn't save it in a variable, causing it to be enumerated to the pipeline, and since there is nothing to enumerate, nothing happens.
There is no point in wrapping a [System.Collections.ArrayList] instance in @(...): its elements are enumerated and then collected in a regular [object[]] array - just use @(...) by itself.
Using += to "grow" an array is quite inefficient, because a new array must be allocated behind the scenes every time; often there is no need to explicitly create an array - e.g. if you can simply stream objects to another command via the pipeline, as shown above, or you can let PowerShell itself implicitly create an array for you by assigning the result of a pipeline or foreach loop as a whole to a variable - see this answer.
Also note that when you use +=, the result is invariably a regular [object[]] array, even if the RHS is a different collection type such as ArrayList.
There are still cases where iteratively creating an array-like collection is necessary, but you then need to use the .Add() method of such a collection type in order to grow the collection efficiently - see this answer.
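For the iterative case, a small sketch using a generic list could look like this. The choice of System.Collections.Generic.List[object] is an assumption for illustration; the text above does not name a specific collection type.

```powershell
# A resizable list; .Add() appends in place instead of copying the whole collection.
$list = [System.Collections.Generic.List[object]]::new()

foreach ($i in 1..5) {
    # List[T].Add() returns nothing, so no stray output ends up on the pipeline
    # (ArrayList.Add(), by contrast, returns the new index).
    $list.Add([pscustomobject]@{ Index = $i; Name = "Name$i" })
}

$list.Count
```

The ::new() syntax needs PowerShell 5 or later; on older versions New-Object would be used instead.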
Instead of populating two separate lists, simply create the resulting objects in the first loop:
function getSite {
$Results = Get-ChildItem C:\Scripts\ServiceInstalls\*\Output\'Config.exe.config' | ForEach-Object {
$Site = $_.fullname.substring(27, 3)
[xml]$xmlRead = Get-Content $_
$NetLogLevel = $xmlRead.SelectSingleNode("//add[@key='Net Log Level']")
$NetLogLevel = $NetLogLevel.value
[pscustomobject]@{
"Site ID" = $Site
"Net Log Level" = $NetLogLevel
}
}
$Results | ConvertTo-HTML -Property 'Site ID', 'Net Log Level' | Set-Content Output.html
Invoke-Item "Output.html"
}
getSite
The increase assignment operator (+=) is often used in [PowerShell] questions and answers on the Stack Overflow site to construct a collection of objects, e.g.:
$Collection = @()
1..$Size | ForEach-Object {
$Collection += [PSCustomObject]@{Index = $_; Name = "Name$_"}
}
Yet it appears to be a very inefficient operation.
Is it Ok to generally state that the increase assignment operator (+=) should be avoided for building an object collection in PowerShell?
Yes, the increase assignment operator (+=) should be avoided for building an object collection, see also: PowerShell scripting performance considerations.
Apart from the fact that using the += operator usually requires more statements (because of the array initialization = @()) and encourages storing the whole collection in memory rather than pushing it immediately into the pipeline, it is inefficient.
The reason it is inefficient is because every time you use the += operator, it will just do:
$Collection = $Collection + $NewObject
Because arrays are immutable in terms of element count, the whole collection will be recreated with every iteration.
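This can be checked directly. The following is only an illustrative sketch (the variable $before is introduced here to hold a second reference to the original array):

```powershell
$collection = @(1, 2, 3)
$before = $collection   # second reference to the same array instance

$collection += 4        # builds a new 4-element array and reassigns the variable

$collection.IsFixedSize                            # True: arrays cannot grow in place
[object]::ReferenceEquals($before, $collection)    # False: += produced a different array
$before.Count                                      # 3: the original array is unchanged
```

Every += therefore copies all existing elements, which is why the cost grows with the size of the collection.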
The correct PowerShell syntax is:
$Collection = 1..$Size | ForEach-Object {
[PSCustomObject]@{Index = $_; Name = "Name$_"}
}
Note: as with other cmdlets, if there is just one item (iteration), the output will be a scalar and not an array. To force it to an array, you might either use the [Array] type: [Array]$Collection = 1..$Size | ForEach-Object { ... } or use the array subexpression operator @( ): $Collection = @(1..$Size | ForEach-Object { ... })
It is even recommended not to store the results in a variable ($a = ...) at all, but to pass them immediately into the pipeline to save memory, e.g.:
1..$Size | ForEach-Object {
[PSCustomObject]@{Index = $_; Name = "Name$_"}
} | Export-Csv .\Outfile.csv
Note: Using the System.Collections.ArrayList class could also be considered, this is generally almost as fast as the PowerShell pipeline but the disadvantage is that it consumes a lot more memory than (properly) using the PowerShell pipeline.
see also: Fastest Way to get a uniquely index item from the property of an array and Array causing 'system.outofmemoryexception'
Performance measurement
To show the relation with the collection size and the decrease of performance you might check the following test results:
1..20 | ForEach-Object {
$size = 1000 * $_
$Performance = @{Size = $Size}
$Performance.Pipeline = (Measure-Command {
$Collection = 1..$Size | ForEach-Object {
[PSCustomObject]@{Index = $_; Name = "Name$_"}
}
}).Ticks
$Performance.Increase = (Measure-Command {
$Collection = @()
1..$Size | ForEach-Object {
$Collection += [PSCustomObject]@{Index = $_; Name = "Name$_"}
}
}).Ticks
[pscustomobject]$Performance
} | Format-Table *,@{n='Factor'; e={$_.Increase / $_.Pipeline}; f='0.00'} -AutoSize
Size Increase Pipeline Factor
---- -------- -------- ------
1000 1554066 780590 1.99
2000 4673757 1084784 4.31
3000 10419550 1381980 7.54
4000 14475594 1904888 7.60
5000 23334748 2752994 8.48
6000 39117141 4202091 9.31
7000 52893014 3683966 14.36
8000 64109493 6253385 10.25
9000 88694413 4604167 19.26
10000 104747469 5158362 20.31
11000 126997771 6232390 20.38
12000 148529243 6317454 23.51
13000 190501251 6929375 27.49
14000 209396947 9121921 22.96
15000 244751222 8598125 28.47
16000 286846454 8936873 32.10
17000 323833173 9278078 34.90
18000 376521440 12602889 29.88
19000 422228695 16610650 25.42
20000 475496288 11516165 41.29
Meaning that with a collection size of 20,000 objects using the += operator is about 40x slower than using the PowerShell pipeline for this.
Steps to correct a script
Apparently some people struggle with correcting a script that already uses the increase assignment operator (+=). Therefore, I have created a little instruction to do so:
Remove all the <variable> += assignments from the concerned iteration, just leave only the object item. By not assigning the object, the object will simply be put on the pipeline.
It doesn't matter if there are multiple increase assignments in the iteration or if there are embedded iterations or function, the end result will be the same.
Meaning, this:
ForEach ( ... ) {
$Array += $Object1
$Array += $Object2
ForEach ( ... ) {
$Array += $Object3
$Array += Get-Object
}
}
Is essentially the same as:
ForEach ( ... ) {
$Object1
$Object2
ForEach ( ... ) {
$Object3
Get-Object
}
}
Note: if there is no iteration, there is probably no reason to change your script, as it likely concerns only a few additions.
Assign the output of the iteration (everything that is put on the pipeline) to the concerned variable. This is usually at the same level as where the array was initialized ($Array = @()), e.g.:
$Array = ForEach ( ... ) { ...
Note 1: Again, if you want a single object to act as an array, you probably want to use the array subexpression operator @( ), but you might also consider doing this at the moment you use the array, like: @($Array).Count or ForEach ($Item in @($Array))
Note 2: Again, you're better off not assigning the output at all. Instead, pass the pipeline output directly to the next cmdlet to free up memory: ... | ForEach-Object {...} | Export-Csv .\File.csv.
Remove the array initialization <Variable> = @()
For a full example, see: Comparing Arrays within Powershell
Note that the same applies to using += to build strings (see: Is there a string concatenation shortcut in PowerShell?) and also to building hashtables like:
$HashTable += @{ $NewName = $Value }
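As a hedged sketch of the alternatives for these two cases: a hashtable can be filled by index assignment, and a string can be assembled with -join once all parts are available. The key names and the separator below are made up for illustration.

```powershell
# Hashtable: assign by key instead of $hashTable += @{ ... },
# which would build a new hashtable on every addition.
$hashTable = @{}
foreach ($i in 1..3) {
    $hashTable["Name$i"] = $i
}

# String: collect the parts from the pipeline and join them once
# instead of repeatedly doing $text += ...
$text = (1..3 | ForEach-Object { "Line$_" }) -join ', '

$hashTable
$text
```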
I need help with loop processing an array of arrays. I have finally figured out how to do it, and I am doing it as such...
$serverList = $1Servers,$2Servers,$3Servers,$4Servers,$5Servers
$serverList | % {
% {
Write-Host $_
}
}
I can't get it to process correctly. What I'd like to do is create a CSV from each array, and title the lists accordingly. So 1Servers.csv, 2Servers.csv, etc... The thing I can not figure out is how to get the original array name into the filename. Is there a variable that holds the list object name that can be accessed within the loop? Do I need to just do a separate single loop for each list?
You can try :
$1Servers = "Mach1","Mach2"
$2Servers = "Mach3","Mach4"
$serverList = $1Servers,$2Servers
$serverList | % {$i=0}{$i+=1;$_ | % {New-Object -Property @{"Name"=$_} -TypeName PsCustomObject} |Export-Csv "c:\temp\$($i)Servers.csv" -NoTypeInformation }
I take each list, and create new objects that I export in a CSV file. The way I create the file name is not so nice, I don't take the var name I just recreate it, so if your list is not sorted it will not work.
It would perhaps be more efficient if you store your servers in a hash table :
$1Servers = @{Name="1Servers"; Computers="Mach1","Mach2"}
$2Servers = @{Name="2Servers"; Computers="Mach3","Mach4"}
$serverList = $1Servers,$2Servers
$serverList | % {$name=$_.name;$_.computers | % {New-Object -Property @{"Name"=$_} -TypeName PsCustomObject} |Export-Csv "c:\temp\$($name).csv" -NoTypeInformation }
Much like JPBlanc's answer, I kinda have to kludge the filename... (FWIW, I can't see how you can get that out of the array itself).
I did this example w/ foreach instead of foreach-object (%). Since you have actual variable names you can address w/ foreach, it seems a little cleaner, if nothing else, and hopefully a little easier to read/maintain:
$1Servers = "apple.contoso.com","orange.contoso.com"
$2Servers = "peach.contoso.com","cherry.contoso.com"
$serverList = $1Servers,$2Servers
$counter = 1
foreach ( $list in $serverList ) {
$fileName = "{0}Servers.csv" -f $counter++
"FileName: $fileName"
foreach ( $server in $list ) {
"-- ServerName: $server"
}
}
I was able to resolve this issue myself. Because I wasn't able to get the object name through, I just changed the nature of the object. So now my server lists consist of two columns, one of which is the name of the list itself.
So...
$1Servers += [pscustomobject] @{
Servername = $entry.Servername
Domain = $entry.Domain
}
Then...
$serverList = $usaServers,$devsubServers,$wtencServers,$wtenclvServers,$pcidevServers
Then I am able to use that second column to name the lists within my foreach loop.
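Since a variable's name is not available from inside the loop, another way to keep the name with the data is a single ordered dictionary keyed by list name. This is only a sketch: the server names are the sample values from the answers above, and writing to the temp folder ($outDir) is an assumption made so the example is self-contained.

```powershell
# Each key is the list name, each value the servers in that list.
$serverLists = [ordered]@{
    '1Servers' = 'Mach1', 'Mach2'
    '2Servers' = 'Mach3', 'Mach4'
}

# Assumed output location for the sketch; use whatever folder suits you.
$outDir = [System.IO.Path]::GetTempPath()

foreach ($entry in $serverLists.GetEnumerator()) {
    # The key supplies the file name, so no counter or name guessing is needed.
    $path = Join-Path $outDir "$($entry.Key).csv"
    $entry.Value |
        ForEach-Object { [pscustomobject]@{ Name = $_ } } |
        Export-Csv -Path $path -NoTypeInformation
}
```

This produces 1Servers.csv and 2Servers.csv, each with a single Name column.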
Forgive the title, I'm not really sure how to explain what I'm seeing.
Sample Code:
$SampleValues = 1..5
$Result = "" | Select ID
$Results = @()
$SampleValues | %{
$Result.ID = $_
$Results += $Result
}
$Results
This is fairly straightforward:
Create an array with 5 numbers to be used in a loop
Create a temp variable with a NoteProperty called ID
Create an empty array to store results
Iterate through each of the 5 numbers assigning them to a temp variable then appending that to an array.
The expected result is 1,2,3,4,5 but when run this returns 5,5,5,5,5
This is a barebones example taken from a much more complex script and I'm trying to figure out why the result is what it is. In each iteration all elements that have already been added to $Results have their values updated to the most recent value. I've tested forcing everything to $Script: or $Global: scope and get the same results.
The only solution I've found is the following, which moves the $Result declaration into the loop.
$SampleValues = 1..5
$Results = @()
$SampleValues | %{
$Result = "" | Select ID
$Result.ID = $_
$Results += $Result
}
This works (you get 1,2,3,4,5 as your results). It looks like $Results is just holding multiple references to a singular $Result object but why does moving this into the loop fix the problem? In this example $Result is a string so perhaps it is creating a new object each iteration but even when I forced $Result to be an integer (which shouldn't recreate a new object since an integer isn't immutable like a string) it still fixed the problem and I got the result I expected.
If anybody has any insight into exactly why this fixes the problem, I'd be very curious. There are plenty of alternatives for me to implement but not understanding specifically why this works this way is bugging me.
Moving it into the loop fixes the problem because you are then creating a new $Result object each time rather than changing a value on the same one (referenced 5 times in the array).
It doesn't have anything to do with whether you use "" | Select ID or 123 | Select ID because that just becomes a sort of property on the PSCustomObject, which is still a reference type rather than a value type.
Remember, Powershell is all .NET on the inside. Here is some C# that's analogous to what Powershell is doing in your first example that resulted in all 5s (hopefully you know C#):
var SampleValues = new []{1,2,3,4,5};
var Result = new CustomObject(){ ID = "" };
var Results = new List<Object>();
foreach (var _ in SampleValues) {
Result.ID = _;
Results.Add(Result);
}
Hopefully you can see how moving var Result = new CustomObject(){ ID = "" } inside the foreach loop would make it work better, and the same concept holds true in Powershell.
In this example $Result is a string so perhaps it is creating a new object each iteration but even when I forced $Result to be an integer (which shouldn't recreate a new object since an integer isn't immutable like a string) it still fixed the problem and I got the result I expected.
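Actually, in your example $Result is not a string. It is a generic object with one property (ID) that is a string.
.NET types, and therefore PowerShell values, come in 2 kinds: value types and reference types. An integer is a value type and is copied on assignment; a string is technically a reference type, but because it is immutable it behaves like a value. A custom object is a reference type, so assigning it or adding it to an array copies only the reference. By adding $result to $results 5 times you got 5 references to the one $result object, not 5 different objects.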
If we add the $result.id property (an integer / value type) to $results instead of the $result object we get:
$SampleValues = 1..5
$Result = "" | Select ID
$Results = @()
$SampleValues | %{
$Result.ID = $_
$Results += $Result.ID
}
1
2
3
4
5
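The difference between copying a reference and copying a value can be seen in a few lines. This is an illustrative sketch; the variable names are made up, and PSObject.Copy() is shown as one way to get an independent (shallow) copy of a custom object.

```powershell
# Reference type: both variables point at the same object.
$obj = [pscustomobject]@{ ID = 1 }
$alias = $obj
$alias.ID = 2
$obj.ID        # 2, because the change is visible through both variables

# Value type: the integer itself is copied.
$n = 1
$m = $n
$m = 2
$n             # 1, because the original is unaffected

# An independent shallow copy of the custom object.
$copy = $obj.PSObject.Copy()
$copy.ID = 3
$obj.ID        # still 2
```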