PowerShell - filtering for unique values

I have an input CSV file with a column containing information similar to the sample below:
805265
995874
805674
984654
332574
339852
I'd like to extract unique values into an array based on the leading two characters, so using the above sample my result would be:
80, 99, 98, 33
How might I achieve this using PowerShell?

Use Select-Object with the -Unique parameter:
$values =
'805265',
'995874',
'805674',
'984654',
'332574',
'339852'
$values |
Foreach-Object { $_.Substring(0,2) } |
Select-Object -unique
If conversion to int is needed, then just cast it to [int]:
$ints =
$values |
Foreach-Object { [int]$_.Substring(0,2) } |
Select-Object -unique

I'd use the Group-Object cmdlet (alias group) for this:
Import-Csv foo.csv | group {$_.ColumnName.Substring(0, 2)}
Count Name Group
----- ---- -----
2 80 {805265, 805674}
1 99 {995874}
1 98 {984654}
2 33 {332574, 339852}
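If only the array of prefixes is wanted, the group names can be pulled out of that result. This is a minimal sketch: the inline sample stands in for foo.csv, and ColumnName is a placeholder header rather than anything from the real file. The order in which groups come back may differ between PowerShell versions, so I would not rely on it.

```powershell
# Inline sample standing in for Import-Csv foo.csv; 'ColumnName' is a placeholder header.
$rows = @'
ColumnName
805265
995874
805674
984654
332574
339852
'@ | ConvertFrom-Csv

# -NoElement skips collecting the group members, since only the keys are needed.
$prefixes = @($rows |
    Group-Object { $_.ColumnName.Substring(0, 2) } -NoElement |
    Select-Object -ExpandProperty Name)

$prefixes
```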

You might use a hash table:
$values = @(805265, 995874, 805674, 984654, 332574, 339852)
$ht = @{}
$values | foreach {$ht[$_ -replace '^(..).+','$1']++}
$ht.keys
99
98
33
80
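Note that a plain hash table returns its keys in no particular order, as the output above shows. If first-seen order matters, an ordered dictionary is one way to keep it. A small sketch under that assumption (the variable names are only illustrative):

```powershell
$values = @(805265, 995874, 805674, 984654, 332574, 339852)

# [ordered] keeps keys in insertion order (PowerShell 3.0 and later).
$seen = [ordered]@{}
foreach ($value in $values) {
    $prefix = "$value".Substring(0, 2)
    if (-not $seen.Contains($prefix)) { $seen[$prefix] = 0 }
    $seen[$prefix] += 1   # the value doubles as an occurrence count
}

$prefixes = @($seen.Keys)
$prefixes
```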

You could make a new array with items containing the first two characters and then use Select-Object to give you the unique items like this:
$newArray = @()
$csv = Import-Csv -Path C:\your.csv
$csv | % {
$newArray += $_.YourColumn.Substring(0, 2)
}
$newArray | Select-Object -Unique
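Growing an array with += copies the whole array on every iteration, which gets slow for large files. A possible variation is to assign the pipeline output directly. This sketch uses inline sample data in place of C:\your.csv, and YourColumn is the same placeholder header as above:

```powershell
# Inline sample standing in for Import-Csv -Path C:\your.csv
$csv = @'
YourColumn
805265
995874
805674
984654
332574
339852
'@ | ConvertFrom-Csv

# Collect the pipeline output in one assignment instead of appending in a loop.
$newArray = @($csv |
    ForEach-Object { $_.YourColumn.Substring(0, 2) } |
    Select-Object -Unique)

$newArray
```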

Another option, instead of Select-Object -Unique, is the Get-Unique cmdlet (alias gu). Get-Unique only compares each item with the one before it, so the input must be sorted first; otherwise non-adjacent duplicates, such as the two 80s here, survive:
$values = @(805265, 995874, 805674, 984654, 332574, 339852)
$values | % { $_.ToString().Substring(0,2) } | Sort-Object | Get-Unique
# Or the same using the alias
$values | % { $_.ToString().Substring(0,2) } | Sort-Object | gu

Related

Group csv column data and display count using powershell

I have the data below in my CSV:
"Path_Name","Lun_Number","status"
"vmhba0:C2:T0:L1","1","active"
"vmhba0:C1:T0:L1","1","active"
"vmhba1:C0:T7:L230","230","active"
"vmhba1:C0:T7:L231","231","active"
"vmhba1:C0:T7:L232","230","active"
"vmhba1:C0:T7:L235","231","active"
"vmhba1:C0:T7:L236","230","active"
I need to group the data by Lun_Number and add a column with the count of rows for each Lun_Number.
Expected output:
"Path_Name","Lun_Number","status","Count"
"vmhba0:C2:T0:L1","1","active", 2
"vmhba0:C1:T0:L1","1","active",
"vmhba1:C0:T7:L230","230","active",3
"vmhba1:C0:T7:L231","230","active",
"vmhba1:C0:T7:L232","230","active",
"vmhba1:C0:T7:L235","231","active",2
"vmhba1:C0:T7:L236","231","active",
Please let me know how I can do that. I tried Group-Object and Sort-Object, but it doesn't seem to work.
Below is the code that generates the above CSV:
$status_csv = Import-Csv -Path E:\pathstate.csv
$path_csv = Import-Csv -Path E:\PathInfo.csv
foreach($row in $path_csv)
{
$path_1 = $row.Path_Name
$path_2 = $status_csv | where{$_.Name -match "^$path_1$" }
[PsCustomObject]@{
Path_Name = $path_1
Lun_Number = $row.Lun_Number
status = $path_2.PathState
} | Export-Csv -Path E:\FinalReport.csv -NoTypeInformation -Append | Group-Object Lun_Number
}
I can see the approach you're trying to take, and I think something like the following could be useful:
$path_csv = Import-Csv 'sample.csv'
$Unique_Counts = $path_csv.Lun_Number | Group-Object | Select-Object Name,Count
This gives you a table of counts per Lun_Number that you can match against each row later, for example in a loop.
Each entry in $Unique_Counts holds a Lun_Number (listed as Name) and its Count, so $Unique_Counts[0].Name is the Lun_Number and $Unique_Counts[0].Count is the number of rows that have it.
Name Count
---- -----
1 2
230 3
231 2
If you're okay with having the count for each row, you can then use something like what you have:
foreach($row in $path_csv)
{
$path_1 = $row.Path_Name
[PsCustomObject]@{
Path_Name = $path_1
Lun_Number = $row.Lun_Number
status = $row.status
count = $Unique_Counts | Where-Object {$_.name -eq $row.Lun_Number} | Select-Object -ExpandProperty Count
} | Export-Csv $finalReport -NoTypeInformation -Append
}
This then provides me with the following outcome:
"Path_Name","Lun_Number","status","count"
"vmhba0:C2:T0:L1","1","active","2"
"vmhba0:C1:T0:L1","1","active","2"
"vmhba1:C0:T7:L230","230","active","3"
"vmhba1:C0:T7:L231","231","active","2"
"vmhba1:C0:T7:L232","230","active","3"
"vmhba1:C0:T7:L235","231","active","2"
"vmhba1:C0:T7:L236","230","active","3"
Hope this helps. It would be useful to understand more about the use case, but at least this gets you the count for every Lun_Number; it just puts the count on every row rather than only on the first row of each group.
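If the count should appear only on the first row of each Lun_Number, as in the expected output in the question, one possible approach is to walk the groups themselves rather than the original rows. This is only a sketch: the inline data stands in for the real CSV, and the output is shown with ConvertTo-Csv so that no file path has to be assumed.

```powershell
# Inline sample standing in for the generated report CSV.
$path_csv = @'
"Path_Name","Lun_Number","status"
"vmhba0:C2:T0:L1","1","active"
"vmhba0:C1:T0:L1","1","active"
"vmhba1:C0:T7:L230","230","active"
"vmhba1:C0:T7:L231","231","active"
"vmhba1:C0:T7:L232","230","active"
"vmhba1:C0:T7:L235","231","active"
"vmhba1:C0:T7:L236","230","active"
'@ | ConvertFrom-Csv

$report = $path_csv | Group-Object Lun_Number | ForEach-Object {
    $group = $_
    $first = $true
    foreach ($row in $group.Group) {
        [pscustomobject]@{
            Path_Name  = $row.Path_Name
            Lun_Number = $row.Lun_Number
            status     = $row.status
            # Only the first row of each group carries the count.
            Count      = if ($first) { $group.Count } else { $null }
        }
        $first = $false
    }
}

$report | ConvertTo-Csv -NoTypeInformation
```

Swapping ConvertTo-Csv for Export-Csv with a path of your choice would write the same rows to a file.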

Get results of For-Each arrays and display in a table with column headers one line per results

I am trying to get a list of files and a count of the number of rows in each file displayed in a table consisting of two columns, Name and Lines.
I have tried using format table but I don't think the problem is with the format of the table and more to do with my results being separate results. See below
#Get a list of files in the filepath location
$files = Get-ChildItem $filepath
$files | ForEach-Object { $_ ; $_ | Get-Content | Measure-Object -Line} | Format-Table Name,Lines
Expected results
Name Lines
File A 9
File B 89
Actual Results
Name Lines
File A
9
File B
89
Another approach to producing one object per file is to use PowerShell's calculated properties:
$files | Select-Object -Property @{ N = 'Name' ; E = { $_.Name} },
@{ N = 'Lines'; E = { ($_ | Get-Content | Measure-Object -Line).Lines } }
Name Lines
---- -----
dotNetEnumClass.ps1 232
DotNetVersions.ps1 9
dotNETversionTable.ps1 64
Typically you would make a custom object like this, instead of outputting two different kinds of objects.
$files | ForEach-Object {
$lines = $_ | Get-Content | Measure-Object -Line
[pscustomobject]@{name = $_.name
lines = $lines.lines}
}
name lines
---- -----
rof.ps1 11
rof.ps1~ 7
wai.ps1 2
wai.ps1~ 1

Powershell Compare-object IF different then ONLY list items from one file, not both

I have deleted my original question because I believe I have a more efficient way to run my script, thus I'm changing my question.
$scrubFileOneDelim = "|"
$scrubFileTwoDelim = "|"
$scrubFileOneBal = 2
$scrubFileTwoBal = 56
$scrubFileOneAcctNum = 0
$scrubFileTwoAcctNum = 0
$ColumnsF1 = Get-Content $scrubFileOne | ForEach-Object{($_.split($scrubFileOneDelim)).Count} | Measure-Object -Maximum | Select-Object -ExpandProperty Maximum
$ColumnsF2 = Get-Content $scrubFileTwo | ForEach-Object{($_.split($scrubFileTwoDelim)).Count} | Measure-Object -Maximum | Select-Object -ExpandProperty Maximum
$useColumnsF1 = $ColumnsF1-1;
$useColumnsF2 = $ColumnsF2-1;
$fileOne = import-csv "$scrubFileOne" -Delimiter "$scrubFileOneDelim" -Header (0..$useColumnsF1) | select -Property @{label="BALANCE";expression={$($_.$scrubFileOneBal)}},@{label="ACCTNUM";expression={$($_.$scrubFileOneAcctNum)}}
$fileTwo = import-csv "$scrubFileTwo" -Delimiter "$scrubFileTwoDelim" -Header (0..$useColumnsF2) | select -Property @{label="BALANCE";expression={$($_.$scrubFileTwoBal)}},@{label="ACCTNUM";expression={$($_.$scrubFileTwoAcctNum)}}
$hash = @{}
$hashTwo = @{}
$fileOne | foreach { $hash.add($_.ACCTNUM, $_.BALANCE) }
$fileTwo | foreach { $hashTwo.add($_.ACCTNUM, $_.BALANCE) }
In this script I'm doing the following: counting the columns and using that count in a range operator to insert headers dynamically for later manipulation. Then I'm importing two CSV files and pushing each of them into its own hashtable.
Just for an idea of what I'm trying to do from here...
CSV1 (as a hashtable) looks like this:
Name Value
---- -----
000000000001 000000285+
000000000002 000031000+
000000000003 000004685+
000000000004 000025877+
000000000005 000000001+
000000000006 000031000+
000000000007 000018137+
000000000008 000000000+
CSV2 (as a hashtable) looks like this:
Name Value
---- -----
000000000001 000008411+
000000000003 000018137+
000000000007 000042865+
000000000008 000009761+
I would like to create a third hash table. It will have all the "NAME" items from CSV2, but I don't want the "VALUE" from CSV2, I want it to have the "VALUE"s that CSV1 has. So in the end result would look like this.
Name Value
---- -----
000000000001 000000285+
000000000003 000004685+
000000000007 000018137+
000000000008 000000000+
Ultimately I want this to be exported as a csv.
I have tried this with just doing a compare-object, not doing the hashtables with the following code, but I abandoned trying to do it this way because file 1 may have 100,000 "accounts" where file 2 only has 200, and the result I was getting listed close to the 100,000 accounts that I didn't want to be in the result. They had the right balances but I want a file that only has those balances for the accounts listed in file 2. This code below isn't really a part of my question, just showing something I've tried. I just think this is much easier and faster with a hash table now so I would like to go that route.
#Find and Rename the BALANCE and ACCOUNT NUMBER columns in both files.
$fileOne = import-csv "$scrubFileOne" -Delimiter "$scrubFileOneDelim" -Header (0..$useColumnsF1) | select -Property @{label="BALANCE";expression={$($_.$scrubFileOneBal)}},@{label="ACCT-NUM";expression={$($_.$scrubFileOneAcctNum)}}
$fileTwo = import-csv "$scrubFileTwo" -Delimiter "$scrubFileTwoDelim" -Header (0..$useColumnsF2) | select -Property @{label="BALANCE";expression={$($_.$scrubFileTwoBal)}},@{label="ACCT-NUM";expression={$($_.$scrubFileTwoAcctNum)}}
Compare-Object $fileOne $fileTwo -Property 'BALANCE','ACCTNUM' -IncludeEqual -PassThru | Where-Object{$_.sideIndicator -eq "<="} | select * -Exclude SideIndicator | export-csv -notype "C:\test\f1.txt"
What you are after is filtering the Compare-Object output. This will show only one side of the result. You will need to place this before you exclude that property for it to work.
| Where-Object{$_.sideIndicator -eq "<="} |
Assuming that you have the following hash tables:
$hash = @{
'000000000001' = '000000285+';
'000000000002' = '000031000+';
'000000000003' = '000004685+';
'000000000004' = '000025877+';
'000000000005' = '000000001+';
'000000000006' = '000031000+';
'000000000007' = '000018137+';
'000000000008' = '000000000+';
}
$hashTwo = @{
'000000000001' = '000008411+';
'000000000003' = '000018137+';
'000000000007' = '000042865+';
'000000000008' = '000009761+';
}
you can create the third hash table by iterating over the keys from the second hash table and then assigning those keys to the value from the first hash table.
$hashThree = @{}
ForEach ($key In $hashTwo.Keys) {
$hashThree["$key"] = $hash["$key"]
}
$hashThree
The output of $hashThree is:
Name Value
---- -----
000000000007 000018137+
000000000001 000000285+
000000000008 000000000+
000000000003 000004685+
If you want the order of the data maintained, you can use [ordered]@{} when creating the hash tables (available in PowerShell 3.0 and later).
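Since the goal is a CSV in the end, here is a minimal sketch of one way to get there from the two hash tables. The column names ACCTNUM and BALANCE are borrowed from the question; skipping accounts that are missing from the first table is my assumption about the desired behaviour, and the sample data is trimmed.

```powershell
$hash = @{
    '000000000001' = '000000285+'
    '000000000002' = '000031000+'
    '000000000003' = '000004685+'
    '000000000007' = '000018137+'
    '000000000008' = '000000000+'
}
$hashTwo = @{
    '000000000001' = '000008411+'
    '000000000003' = '000018137+'
    '000000000007' = '000042865+'
    '000000000008' = '000009761+'
}

# One object per account in the second table, with the balance from the first.
$rows = foreach ($key in ($hashTwo.Keys | Sort-Object)) {
    if ($hash.ContainsKey($key)) {   # assumption: skip accounts absent from the first table
        [pscustomobject]@{ ACCTNUM = $key; BALANCE = $hash[$key] }
    }
}

$rows | ConvertTo-Csv -NoTypeInformation
```

Replacing ConvertTo-Csv with Export-Csv and a path of your choice writes the file instead of printing it.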

Searching for key in ConvertFrom-StringData hash

I have a hash table that I got from a file, command used:
[array]$hash= Get-Content -raw '../../file.txt' | ConvertFrom-StringData
file.txt looks like this:
key=value
It works perfectly; the problem is when I try to search using $hash.GetEnumerator().
I am trying to do something like this:
$hash.GetEnumerator() | where {$_.value -match 'value'} # or with key
It always returns an empty value. I got the approach from a link, tried creating a local hash using $hash=@{} and then adding entries, and it worked (as it did for the poster in the link).
Note! $hash.GetEnumerator() | Sort-Object Name works for me too, and returning the right table.
Do you have any idea how can I search(-eq or -match) in the hash table that I have created?
try this
$hash.GetEnumerator() | where {$($_.Value) -match 'value'}
or like this
$hash.Keys | % {$($hash[$_]) -eq 'value'} | %{$hash[$_]}
or better, you can transform your hashtable into objects list
$hash.GetEnumerator() |
% {New-Object psobject -Property @{Name=$($_.Name); Value=$($_.Value)} } |
where Value -match 'value'
Your response to my comment asking for clarification didn't clarify too much, but I'm going to assume that you want to find the value for a particular key in the hashtable. That kind of lookup is the core functionality of hashtables.
$ht = @'
foo=23
bar=42
baz=5
'@ | ConvertFrom-StringData
$key = 'bar'
$ht[$key] # returns 42
If you actually want a fuzzy match on the keys you could replace the direct lookup with a wildcard match like this:
$partialKey = 'ba*'
$ht.Keys | Where-Object {
$_ -like $partialKey
} | ForEach-Object {
$ht[$_] # returns 42 and 5
}
or a regular expression match like this:
$partialKey = '^ba'
$ht.Keys | Where-Object {
$_ -match $partialKey
} | ForEach-Object {
$ht[$_] # returns 42 and 5
}
As an alternative to looking up multiple keys in a loop you could also build a list of keys and use that list in a single lookup:
$partialKey = 'ba*'
$keys = $ht.Keys | Where-Object { $_ -like $partialKey }
$ht[$keys] # returns 42 and 5
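One possible explanation for the empty result in the question, assuming the snippet there is exactly what was run: the [array] constraint wraps the single hashtable in a one-element array, so GetEnumerator() walks the array and hands the whole hashtable to the filter instead of its entries. The sketch below uses an inline string in place of the file.

```powershell
# Stand-in for Get-Content -Raw '../../file.txt'
$text = "key=value"

# The [array] constraint wraps the single hashtable in a one-element array.
[array]$hash = $text | ConvertFrom-StringData
$hash.GetType().Name      # Object[]
$hash[0].GetType().Name   # Hashtable

# Enumerating the array yields the hashtable itself, not its entries,
# so $_.Value is empty and nothing matches.
$viaArray = @($hash.GetEnumerator() | Where-Object { $_.Value -match 'value' })

# Enumerating the hashtable inside the array yields key/value entries.
$viaTable = @($hash[0].GetEnumerator() | Where-Object { $_.Value -match 'value' })
$viaTable[0].Key
```

Dropping the [array] constraint, or indexing with $hash[0] as above, should make the original filter behave as expected.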

Compare-Object - Separate side columns

Is it possible to display the results of a PowerShell Compare-Object in two columns showing the differences of reference vs difference objects?
For example using my current cmdline:
Compare-Object $Base $Test
Gives:
InputObject SideIndicator
987654 =>
555555 <=
123456 <=
In reality the list is rather long. For easier data reading is it possible to format the data like so:
Base Test
555555 987654
123456
So each column shows which elements exist in that object vs the other.
For bonus points it would be fantastic to have a count in the column header like so:
Base(2) Test(1)
555555 987654
123456
Possible? Sure. Feasible? Not so much. PowerShell wasn't really built for creating this kind of tabular output. What you can do is collect the differences in a hashtable as nested arrays by input file:
$ht = @{}
Compare-Object $Base $Test | ForEach-Object {
$value = $_.InputObject
switch ($_.SideIndicator) {
'=>' { $ht['Test'] += @($value) }
'<=' { $ht['Base'] += @($value) }
}
}
then transpose the hashtable:
$cnt = $ht.Values |
ForEach-Object { $_.Count } |
Sort-Object |
Select-Object -Last 1
$keys = $ht.Keys | Sort-Object
0..($cnt-1) | ForEach-Object {
$props = [ordered]@{}
foreach ($key in $keys) {
$props[$key] = $ht[$key][$_]
}
New-Object -Type PSObject -Property $props
} | Format-Table -AutoSize
To include the item count in the header name change $props[$key] to $props["$key($($ht[$key].Count))"].
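Putting the pieces together with the counts in the headers, a self-contained sketch could look like the following. The sample values for $Base and $Test are made up for illustration, and the order of items within a column follows whatever order Compare-Object reports them in.

```powershell
# Illustrative sample data; 111111 is present on both sides and so is not reported.
$Base = 555555, 123456, 111111
$Test = 987654, 111111

# Collect the differences per side, keeping the column order Base, Test.
$columns = [ordered]@{ Base = @(); Test = @() }
Compare-Object $Base $Test | ForEach-Object {
    if ($_.SideIndicator -eq '<=') { $columns['Base'] += $_.InputObject }
    else                           { $columns['Test'] += $_.InputObject }
}

# The longest column decides how many rows the table needs.
$rowCount = ($columns.Values | ForEach-Object { $_.Count } | Measure-Object -Maximum).Maximum

# Transpose: one object per row, one property per column, count in the header.
$rows = for ($i = 0; $i -lt $rowCount; $i++) {
    $props = [ordered]@{}
    foreach ($name in $columns.Keys) {
        $items = $columns[$name]
        $props["$name($($items.Count))"] = if ($i -lt $items.Count) { $items[$i] } else { $null }
    }
    [pscustomobject]$props
}

$rows | Format-Table -AutoSize
```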