Alphanumerical sorting not working in Array - powershell

We have a directory that contains many subdirectories (one per day), each with several files in it. Unfortunately, files can be resent, so a file from 2020-01-01 can be resent (with a slightly different filename, since a timestamp is added to the filename) on 2020-02-03. The structure looks something like this:
AFile_20200801_20200801150000 (Timestamped 2020-08-01 15:00:00)
AFile_20200801_20200802150000 (Timestamped 2020-08-02 15:00:00)
So the AFile of 2020-08-01 has been resent on 2020-08-02 at 3 PM.
I am now trying to retrieve a list with the most recent file per day, so I built an array and populated it with all files below TopDir (recursively). So far so good, all files are found:
$path = "Y:\";
$FileArray = @()
$FileNameArray = @()
$FileArrayCounter = 0
foreach ($item in Get-ChildItem $path -Recurse) {
    if ($item.Extension -ne "") {
        $StringPart1, $StringPart2, $StringPart3, $StringPart4 = $item.Name.Split('_');
        $FileNameShort = "{0}_{1}_{2}" -f $StringPart1.Trim(), $StringPart2.Trim(), $StringPart3.Trim();
        $FileNameShort = $FileNameShort.Trim().ToUpper();
        $FileArray += @{FileID = $FileArrayCounter; FileNameShort = $FileNameShort; FileName = $item.Name; FullName = $item.FullName; LastWriteTime = $item.LastWriteTime};
        $FileArrayCounter ++;
    }
}
$FileArray = $FileArray | sort FileNameShort; #@{Expression={"FileNameShort"}; Ascending=$True} #, @{Expression={"LastWriteTime"}; Descending=$True}
foreach($f in $FileArray) {
    Write-host($f.FileNameShort, $f.LastWriteTime)
}
Write-host($FileArrayCounter.ToString() + " files found");
The newly added column "FileNameShort" includes a substring of the filename. With this done, I receive two Rows for AFile_20200801:
AFile_20200801, AFile_20200801_20200801150000, ...
AFile_20200801, AFile_20200801_20200802150000, ...
However, when I try to sort my array (see above code), the output is NOT sorted by name. Instead I receive something like the following:
What I want to achieve is a sorting by FileNameShort ASCENDING and LastWriteTime DESCENDING.
What am I missing here?

Your sort does not work because $FileArray is an array of hash tables. The syntax Sort FileNameShort binds FileNameShort to the -Property parameter of Sort-Object. However, a hash table does not have a property called FileNameShort; it only has a key with that name. You can see this if you run $FileArray[0] | Get-Member.
If you create them as custom objects, the simple sort syntax works.
$FileArray += [pscustomobject]@{FileID = $FileArrayCounter; FileNameShort = $FileNameShort; FileName = $item.Name; FullName = $item.FullName; LastWriteTime = $item.LastWriteTime}
$FileArray | Sort FileNameShort # This will sort correctly
As an aside, I do not recommend using += to add elements to an array. It is better to either output the results inside your loop and capture the loop's output, or to create a list that has an .Add() method. The problem with += is that an array has a fixed size, so each += creates a new array and copies the existing contents plus the new item into it. As the array grows, this becomes increasingly slow. See below for a more efficient example.
$FileArray = foreach ($item in Get-ChildItem $path -Recurse) {
    if ($item.Extension -ne "") {
        $StringPart1, $StringPart2, $StringPart3, $StringPart4 = $item.Name.Split('_');
        $FileNameShort = "{0}_{1}_{2}" -f $StringPart1.Trim(), $StringPart2.Trim(), $StringPart3.Trim();
        $FileNameShort = $FileNameShort.Trim().ToUpper();
        # Outputting custom object here
        [pscustomobject]@{FileID = $FileArrayCounter; FileNameShort = $FileNameShort; FileName = $item.Name; FullName = $item.FullName; LastWriteTime = $item.LastWriteTime};
        $FileArrayCounter ++;
    }
}
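The other option mentioned above, a list with an .Add() method, could look roughly like this. It is only a sketch: the two file names are made-up stand-ins for the Get-ChildItem results, and the short name is built from the first two parts only.

```powershell
# Hypothetical sample names standing in for the Get-ChildItem results.
$names = 'AFile_20200801_20200802150000.txt', 'BFile_20200801_20200801150000.txt'

# A generic list grows in place, so adding an element does not copy the whole collection.
$FileList = [System.Collections.Generic.List[object]]::new()
foreach ($name in $names) {
    $part1, $part2, $rest = $name.Split('_')
    $FileList.Add([pscustomobject]@{
        FileNameShort = ('{0}_{1}' -f $part1, $part2).ToUpper()
        FileName      = $name
    })
}

# The list holds custom objects, so the simple sort syntax works on it as well.
$FileList | Sort-Object FileNameShort
```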

I just found the solution:
$FileArray = $FileArray | sort @{Expression={[string]$_.FileNameShort}; Ascending=$True}, @{Expression={[datetime]$_.LastWriteTime}; Descending=$True}
I still don't know why the first sort did not work as expected.
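For the stated goal (the most recent file per short name), a minimal sketch could sort on both keys and then keep the newest record of each group. The three sample records are invented for illustration; in the real script they would come from the Get-ChildItem loop as custom objects.

```powershell
# Made-up sample records (assumption): two versions of AFILE and one BFILE.
$FileArray = @(
    [pscustomobject]@{ FileNameShort = 'BFILE_20200801'; LastWriteTime = [datetime]'2020-08-01T15:00:00' }
    [pscustomobject]@{ FileNameShort = 'AFILE_20200801'; LastWriteTime = [datetime]'2020-08-01T15:00:00' }
    [pscustomobject]@{ FileNameShort = 'AFILE_20200801'; LastWriteTime = [datetime]'2020-08-02T15:00:00' }
)

# Name ascending, then newest first within each name.
$sorted = $FileArray | Sort-Object -Property @{ Expression = 'FileNameShort'; Descending = $false },
                                             @{ Expression = 'LastWriteTime'; Descending = $true }

# Keep only the newest record per FileNameShort.
$latest = $FileArray | Group-Object -Property FileNameShort | ForEach-Object {
    $_.Group | Sort-Object -Property LastWriteTime -Descending | Select-Object -First 1
}
```

Grouping first and sorting inside each group keeps the "newest per name" step independent of the overall output order.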


Check if a condition is met by a line within a TXT but "in an advanced way"

I have a TXT file with 1300 megabytes (huge thing). I want to build code that does two things:
Every line contains a unique ID at the beginning. I want to check, for all lines with the same unique ID, whether the condition is met for that "group" of IDs. (This tells me for how many lines with unique ID X all conditions have been met.)
When the script has finished, I want to remove all lines from the TXT where the condition was met (see 2), so that I can rerun the script with another condition set to "narrow down" the whole document.
After a few cycles I will finally have a set of conditions that applies to all lines in the document.
It seems that my current approach is very slow (one cycle takes hours). My final result should be a set of conditions that apply to all lines.
If you know an easier way to do that, feel free to recommend it.
Code so far (does not fulfill everything from 1 & 2):
foreach ($item in $liste) {
    # Check Conditions
    if ( ($item -like "*XXX*") -and ($item -like "*YYY*") -and ($item -notlike "*ZZZ*")) {
        # Add a line to a document to see which lines match condition
        Add-Content "C:\Desktop\it_seems_to_match.txt" "$item"
        # Retrieve the unique ID from the line and feed array.
        $array += $item.Split("/")[1]
        # Remove the line from final document
        $liste = $liste -replace $item, ""
    }
}
# Pipe the "new cleaned" list somewhere
$liste | Set-Content -Path "C:\NewListToWorkWith.txt"
# Show me the counts
$array | group | % { $h = @{} } { $h[$_.Name] = $_.Count } { $h } | Out-File "C:\Desktop\count.txt"
Demo Lines:
Performance considerations:
Add-Content "C:\Desktop\it_seems_to_match.txt" "$item"
Try to avoid wrapping cmdlet pipelines in a loop: Add-Content opens and closes the file on every call.
See also: Mastering the (steppable) pipeline
$array += $item.Split("/")[1]
Try to avoid using the increase assignment operator (+=) to create a collection
See also: Why should I avoid using the increase assignment operator (+=) to create a collection
$liste = $liste -replace $item, ""
This is a very expensive operation, considering that you are reassigning (copying) a long list ($liste) with each iteration.
Besides, it is bad practice to change an array that you are currently iterating over.
$array | group | ...
Group-Object is a rather slow cmdlet; it is better to collect (or count) the items on the fly (where you do $array += $item.Split("/")[1]) using a hashtable, something like:
$Name = $item.Split("/")[1]
if (!$HashTable.Contains($Name)) { $HashTable[$Name] = [Collections.Generic.List[String]]::new() }
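Putting these points together, a single pass over the lines might look like the sketch below. The sample lines are invented, and the ID is assumed to be the second "/"-separated field, as in the question's Split("/")[1]. Lines that do not match are collected in a separate list instead of being removed from $liste while it is being iterated.

```powershell
# Hypothetical sample lines (assumption); the ID is the second "/"-separated field.
$liste = 'a/ID1/XXX YYY', 'a/ID1/XXX YYY again', 'a/ID2/XXX YYY', 'a/ID3/XXX YYY ZZZ'

$HashTable = @{}
# Non-matching lines go to a new list rather than being stripped from $liste mid-loop.
$remaining = [System.Collections.Generic.List[string]]::new()

foreach ($item in $liste) {
    if (($item -like '*XXX*') -and ($item -like '*YYY*') -and ($item -notlike '*ZZZ*')) {
        $Name = $item.Split('/')[1]
        if (!$HashTable.Contains($Name)) { $HashTable[$Name] = [Collections.Generic.List[String]]::new() }
        $HashTable[$Name].Add($item)
    }
    else {
        $remaining.Add($item)
    }
}

# Number of matching lines per ID, without Group-Object.
$counts = @{}
foreach ($key in $HashTable.Keys) { $counts[$key] = $HashTable[$key].Count }
```

$remaining can then be written out once with Set-Content as the input for the next cycle.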
To minimize memory usage, it may be better to read one line at a time and check whether it already exists. In the code below I used StringReader; you can replace it with StreamReader to read from a file. I'm checking whether the entire string exists, but you may want to split the line. Notice that I have duplicates in the input but not in the dictionary. See the code below:
# Placeholder input (the original sample rows were not preserved); note the duplicate line.
$rows = @"
row one
row two
row one
"@
$dict = [System.Collections.Generic.Dictionary[int, System.Collections.Generic.List[string]]]::new();
$reader = [System.IO.StringReader]::new($rows)
while ($null -ne ($row = $reader.ReadLine())) {
    $hash = $row.GetHashCode()
    if ($dict.ContainsKey($hash)) {
        # check if list contains the string; if it does, the string is a duplicate
        if (-not $dict[$hash].Contains($row)) {
            # add string to dictionary value if it is not in list
            $dict[$hash].Add($row)
        }
    } else {
        # add new hash value to dictionary
        $list = [System.Collections.Generic.List[string]]::new();
        $list.Add($row)
        $dict.Add($hash, $list)
    }
}
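If only whole-line duplicates matter, a HashSet[string] is a simpler alternative to the dictionary of lists, because it does the hash-and-compare bookkeeping itself. This is a different technique from the answer's code, shown only as a sketch with made-up input.

```powershell
# Hypothetical input with one repeated line (assumption).
$rows = "line one`nline two`nline one"

$seen   = [System.Collections.Generic.HashSet[string]]::new()
$reader = [System.IO.StringReader]::new($rows)
while ($null -ne ($row = $reader.ReadLine())) {
    # Add() returns $false when the string is already in the set.
    if (-not $seen.Add($row)) {
        Write-Verbose "duplicate: $row"
    }
}
$reader.Dispose()
```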

How can I subtract characters from a CSV column using PowerShell

I'm trying to insert my CSV into my SQL Server database, and I'm wondering how I can remove the last three characters from the CSV's GID column and then assign the result to my $CSVHold1 variable.
My CSV file looks like this:
GID Source Type Message Time
KLEMOE Od Hello 12/22/2022
EEINGJ Od hey 12/22/2022
Basically, I'm trying to get only the first three characters of GID and pass that value to my $CSVHold1 variable.
$CSVImport = Import-CSV $Global:ErrorReport
ForEach ($CSVLine1 in $CSVImport) {
    $CSVHold1 = $CSVLine1.GID | ForEach-Object { $_.$GID = $_.$GID.subString(0, $_.$GID.Length - 3); $_ }
    $CSVSource1 = $CSVLine1.Source
    $CSVMessage1 = $CSVLine1.Message
}
I'm trying to do it like above, but for some reason I'm getting an error:
You cannot call a method on a null-valued expression.
Your original line 3 was/is not valid syntax as Santiago pointed out.
$CSVHold1 = $CSVLine1.GID | ForEach-Object { $_.$GID = $_.$GID.subString(0, $_.$GID.Length - 3); $_ }
You are calling $_.$GID but you're wanting $_.GID
You also don't need to pipe the object into a loop to achieve what it seems you are asking.
#!/usr/bin/env powershell
$csvimport = Import-Csv -Path $env:HOMEDRIVE\Powershell\TestCSVs\test1.csv
##$CSVImport = Import-CSV $Global:ErrorReport
ForEach ($CSVLine1 in $CSVImport) {
    $CSVHold1 = $CSVLine1.GID.SubString(0, $CSVLine1.GID.Length - 3)
    $CSVSource1 = $CSVLine1.Source
    $CSVMessage1 = $CSVLine1.Message
    Write-Output -InputObject ('Changing {0} to {1}' -f $CSVLine1.gid, $CSVHold1)
}
Using your sample data, the above outputs:
C:> . 'C:\Powershell\Scripts\dchero.ps1'
Changing KLEMOE to KLE
Changing EEINGJ to EEI
Lastly, be aware that the SubString method will throw an exception if the length of $CSVLine1.GID is less than 3.
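One way to guard against short values is a small helper that checks the length first. Remove-TrailingChars is a hypothetical name, and returning a too-short value unchanged is just one possible policy; skipping or logging such rows may suit the import better.

```powershell
# Hypothetical helper: drop the last $Count characters,
# or return the value unchanged if it is shorter than $Count.
function Remove-TrailingChars {
    param(
        [string]$Value,
        [int]$Count = 3
    )
    if ($Value.Length -lt $Count) { return $Value }
    return $Value.Substring(0, $Value.Length - $Count)
}

Remove-TrailingChars -Value 'KLEMOE'   # KLE
Remove-TrailingChars -Value 'AB'       # AB (too short, returned as-is)
```

Inside the loop this would be used as $CSVHold1 = Remove-TrailingChars -Value $CSVLine1.GID.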

Fast compare of two large CSVs (both rows and columns) in PowerShell

I have two large CSVs to compare. Both CSVs are basically data from the same system, one day apart. There are around 12k rows and 30 columns.
The aim is to identify which column data has changed for each primary key (#ID).
My idea was to loop through the CSVs to identify which rows have changed and dump these into a separate CSV. Once done, I loop through the changed rows again and identify the exact change in each column.
$NewCSV = Import-Csv -Path ".\Data_A.csv"
$OldCSV = Import-Csv -Path ".\Data_B.csv"
foreach ($LineNew in $NewCSV) {
    ForEach ($LineOld in $OldCSV) {
        If ($LineNew -eq $LineOld) {
            Write-Host $LineNew, " Match"
        } else {
            Write-Host $LineNew, " Not Match"
        }
    }
}
But as soon as I run the loop, it takes forever to run for 12k rows. I was hoping there would be a more efficient way to compare large files in PowerShell, something quicker.
Well, you can give this a try. I'm not claiming it will be fast, for the reasons vonPryz has already pointed out, but it should give you a good side-by-side view of what has changed from OldCsv to NewCsv.
Note: Those cells that have the same value on both CSVs will be ignored.
$NewCSV = Import-Csv -Path ".\Data_A.csv"
$OldCSV = Import-Csv -Path ".\Data_B.csv" | Group-Object ID -AsHashTable -AsString
$properties = $newCsv[0].PSObject.Properties.Name
$result = foreach($line in $NewCSV) {
    if($ref = $OldCSV[$line.ID]) {
        foreach($prop in $properties) {
            if($line.$prop -ne $ref.$prop) {
                [PSCustomObject]@{
                    ID = $line.ID
                    Property = $prop
                    OldValue = $ref.$prop
                    NewValue = $line.$prop
                }
            }
        }
    } else {
        Write-Warning "ID $($line.ID) could not be found on Old Csv!!"
    }
}
As vonPryz hints in the comments, you've written an algorithm with quadratic time complexity (O(n²) in Big-O notation): every time the input size doubles, the number of computations performed increases 4-fold.
To avoid this, I'd suggest using a hashtable or other dictionary type to hold each data set, and use the primary key from the input as the dictionary key. This way you get constant-time lookup of corresponding records, and the time complexity of your algorithm becomes near-linear (O(2n + k)):
$NewCSV = @{}
Import-Csv -Path ".\Data_A.csv" |ForEach-Object {
    $NewCSV[$_.ID] = $_
}
$OldCSV = @{}
Import-Csv -Path ".\Data_B.csv" |ForEach-Object {
    $OldCSV[$_.ID] = $_
}
Now that we can efficiently resolve each row by its ID, we can inspect the whole of the data sets with an independent loop over each:
foreach($entry in $NewCSV.GetEnumerator()){
    if(-not $OldCSV.ContainsKey($entry.Key)){
        # $entry.Value is a new row, not seen in the old data set
        continue # nothing to compare against, so skip the comparison below
    }
    $newRow = $entry.Value
    $oldRow = $OldCSV[$entry.Key]
    # do the individual comparison of the rows here
}
Do another loop like the one above, but enumerate $OldCSV and test $NewCSV.ContainsKey($entry.Key), to detect deletions.
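Filled in, the two loops could look like the following sketch. The sample rows are made up and piped through ConvertFrom-Csv so the example runs without files; the '(row)' / 'added' marker used for new rows is an arbitrary choice for illustration.

```powershell
# Made-up sample data standing in for Data_B.csv (old) and Data_A.csv (new).
$oldRows = 'ID,Name,City', '1,Ann,Oslo', '2,Bob,Rome', '3,Cy,Bern' | ConvertFrom-Csv
$newRows = 'ID,Name,City', '1,Ann,Oslo', '2,Bob,Paris', '4,Di,Kyiv' | ConvertFrom-Csv

# Index both data sets by primary key for constant-time lookups.
$OldCSV = @{}; foreach ($row in $oldRows) { $OldCSV[$row.ID] = $row }
$NewCSV = @{}; foreach ($row in $newRows) { $NewCSV[$row.ID] = $row }

$properties = $newRows[0].PSObject.Properties.Name

$changes = foreach ($entry in $NewCSV.GetEnumerator()) {
    if (-not $OldCSV.ContainsKey($entry.Key)) {
        # Row exists only in the new data set.
        [pscustomobject]@{ ID = $entry.Key; Property = '(row)'; OldValue = $null; NewValue = 'added' }
        continue
    }
    $oldRow = $OldCSV[$entry.Key]
    foreach ($prop in $properties) {
        if ($entry.Value.$prop -ne $oldRow.$prop) {
            [pscustomobject]@{ ID = $entry.Key; Property = $prop; OldValue = $oldRow.$prop; NewValue = $entry.Value.$prop }
        }
    }
}

# Deletions: keys present in the old data but missing from the new data.
$deleted = foreach ($key in $OldCSV.Keys) {
    if (-not $NewCSV.ContainsKey($key)) { $key }
}
```

Each row is touched once per loop, so the work grows roughly linearly with the number of rows instead of quadratically.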

Powershell array of arrays loop process

I need help with loop processing an array of arrays. I have finally figured out how to do it, and I am doing it as such...
$serverList = $1Servers,$2Servers,$3Servers,$4Servers,$5Servers
$serverList | % {
    $_ | % {
        Write-Host $_
    }
}
I can't get it to process correctly. What I'd like to do is create a CSV from each array, and title the lists accordingly. So 1Servers.csv, 2Servers.csv, etc... The thing I can not figure out is how to get the original array name into the filename. Is there a variable that holds the list object name that can be accessed within the loop? Do I need to just do a separate single loop for each list?
You can try :
$1Servers = "Mach1","Mach2"
$2Servers = "Mach3","Mach4"
$serverList = $1Servers,$2Servers
$serverList | % {$i=0}{$i+=1;$_ | % {New-Object -Property @{"Name"=$_} -TypeName PsCustomObject} |Export-Csv "c:\temp\$($i)Servers.csv" -NoTypeInformation }
I take each list, and create new objects that I export in a CSV file. The way I create the file name is not so nice, I don't take the var name I just recreate it, so if your list is not sorted it will not work.
It would perhaps be more efficient if you store your servers in a hash table :
$1Servers = @{Name="1Servers"; Computers="Mach1","Mach2"}
$2Servers = @{Name="2Servers"; Computers="Mach3","Mach4"}
$serverList = $1Servers,$2Servers
$serverList | % {$name=$_.Name;$_.computers | % {New-Object -Property @{"Name"=$_} -TypeName PsCustomObject} |Export-Csv "c:\temp\$($name).csv" -NoTypeInformation }
Much like JPBlanc's answer, I kinda have to kludge the filename... (FWIW, I can't see how you can get that out of the array itself).
I did this example w/ foreach instead of foreach-object (%). Since you have actual variable names you can address w/ foreach, it seems a little cleaner, if nothing else, and hopefully a little easier to read/maintain:
$1Servers = "",""
$2Servers = "",""
$serverList = $1Servers,$2Servers
$counter = 1
foreach ( $list in $serverList ) {
    $fileName = "{0}Servers.csv" -f $counter++
    "FileName: $fileName"
    foreach ( $server in $list ) {
        "-- ServerName: $server"
    }
}
I was able to resolve this issue myself. Because I wasn't able to get the object name through, I just changed the nature of the object. So now my server lists consist of two columns, one of which is the name of the list itself.
$1Servers += [pscustomobject] @{
    Servername = $entry.Servername
    Domain = $entry.Domain
}
$serverList = $usaServers,$devsubServers,$wtencServers,$wtenclvServers,$pcidevServers
Then I am able to use that second column to name the lists within my foreach loop.
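Another way to keep the list name next to its contents is an ordered hashtable keyed by list name. This is only a sketch: the list names, machine names, and the temp output folder are assumptions for illustration.

```powershell
# Hypothetical list names and machines (assumption); the key is the list name.
$serverLists = [ordered]@{
    '1Servers' = 'Mach1', 'Mach2'
    '2Servers' = 'Mach3', 'Mach4'
}

# Assumed output folder under the user's temp directory.
$outDir = Join-Path ([System.IO.Path]::GetTempPath()) 'serverlists'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

foreach ($entry in $serverLists.GetEnumerator()) {
    # The key doubles as the file name, so no counter has to stay in sync with the list order.
    $csvPath = Join-Path $outDir ("{0}.csv" -f $entry.Key)
    $entry.Value |
        ForEach-Object { [pscustomobject]@{ Name = $_ } } |
        Export-Csv -Path $csvPath -NoTypeInformation
}
```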

Powershell: how to fetch a single column from a multi-dimensional array?

Is there a function, method, or language construct that retrieves a single column from a multi-dimensional array in PowerShell?
$my_array = @()
$my_array += ,@(1,2,3)
$my_array += ,@(4,5,6)
$my_array += ,@(7,8,9)
# I currently use that, and I want to find a better way:
foreach ($line in $my_array) {
    [array]$single_column += $line[1] # fetch column 1
}
# now $single_column contains only 2 and 5 and 8
My final goal is to find non-duplicated values from one column.
Sorry, I don't think anything like that exists. I would go with:
@($my_array | foreach { $_[1] })
To quickly find unique values I tend to use the hashtable keys hack:
$UniqueArray = @($my_array | foreach -Begin {
    $unique = @{}
} -Process {
    $unique.($_[1]) = $null
} -End {
    $unique.Keys
})
Obviously it has its limitations...
To extract one column:
$single_column = $my_array | foreach { $_[1] }
To extract any columns:
$some_columns = $my_array | foreach { ,@($_[2],$_[1]) } # any order
To find non-duplicated values from one column:
$unique_array = $my_array | foreach {$_[1]} | sort-object -unique
# caveat: the resulting array is sorted,
# so BartekB has a better solution if sort is a problem
I tried @BartekB's solution and it worked for me. But for the unique part I did the following.
@($my_array | foreach { $_[1] } | select -Unique)
I am not very familiar with powershell but I am posting this hoping it helps others since it worked for me.
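If the first-seen order of the unique values matters, a HashSet can be used as a filter while walking the rows. This is a sketch with sample data that repeats one value on purpose; the element type [int] is an assumption that fits this sample only.

```powershell
$my_array = @()
$my_array += ,@(1,2,3)
$my_array += ,@(4,5,6)
$my_array += ,@(7,2,9)   # column 1 repeats the value 2 on purpose

$seen = [System.Collections.Generic.HashSet[int]]::new()
$unique_column = foreach ($line in $my_array) {
    # Add() returns $true only the first time a value is seen, which preserves first-seen order.
    if ($seen.Add($line[1])) { $line[1] }
}
# $unique_column now holds 2 and 5
```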