PowerShell make list with object grouping - powershell

Assuming a CSV File:
Name,group_name,group_id
foo,Best,1
bar,Worst,2
baz,Best,1
bob,Worst,2
What's the simplest way to group these rows in PowerShell to get output like this:
Count group_id group_name Names
----- -------- ---------- -----
    2        1 Best       ["foo", "baz"]
    2        2 Worst      ["bar", "bob"]

Use the Group-Object cmdlet to group the rows together by name and id, then use Select-Object to extract the appropriate details from each group as individual properties:
# replace with `$Data = Import-Csv path\to\file.csv`
$Data = @'
Name,group_name,group_id
foo,Best,1
bar,Worst,2
baz,Best,1
bob,Worst,2
'@ |ConvertFrom-Csv
# Group rows, then construct output record with `Select-Object`
$Data |Group-Object group_name,group_id |Select-Object Count,
    @{Name='group_id';Expression={$_.Group[0].group_id}},
    @{Name='group_name';Expression={$_.Group[0].group_name}},
    @{Name='Names';Expression={$_.Group.Name}}
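The `Names` property above holds an array, which PowerShell displays as `{foo, baz}` rather than the bracketed, quoted list shown in the question. A minimal sketch of one way to build that string form, assuming a plain text rendering is all that is needed (the `$Report` and `$quoted` variable names are made up for illustration):

```powershell
# Sample rows from the question, parsed from an inline CSV string.
$Data = @'
Name,group_name,group_id
foo,Best,1
bar,Worst,2
baz,Best,1
bob,Worst,2
'@ | ConvertFrom-Csv

$Report = $Data | Group-Object group_name, group_id | ForEach-Object {
    # Wrap each name in double quotes before joining.
    $quoted = $_.Group.Name | ForEach-Object { '"{0}"' -f $_ }
    [pscustomobject]@{
        Count      = $_.Count
        group_id   = $_.Group[0].group_id
        group_name = $_.Group[0].group_name
        Names      = '[' + ($quoted -join ', ') + ']'
    }
}
$Report
```

Keeping `Names` as an array is usually better if the result is processed further; the string form is only for display.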

Related

Powershell - group array objects by properties and sum

I am working on getting some data out of a CSV file with a script and have no idea how to solve the most important part. I have an array with a few hundred lines; there are about 50 Ids in those lines, and each Id has a few different services attached to it. Each line has a price attached.
I want to group the lines by Id and Service, and I want each of those groups in some sort of variable so I can sum the prices. I filter out the unique Ids and Services earlier in the script because they are different every time.
Some example data:
$data = @(
[pscustomobject]@{Id='1';Service='Service1';Propertyx=1;Price='5'}
[pscustomobject]@{Id='1';Service='Service2';Propertyx=1;Price='4'}
[pscustomobject]@{Id='2';Service='Service1';Propertyx=1;Price='17'}
[pscustomobject]@{Id='3';Service='Service1';Propertyx=1;Price='3'}
[pscustomobject]@{Id='2';Service='Service2';Propertyx=1;Price='11'}
[pscustomobject]@{Id='4';Service='Service1';Propertyx=1;Price='7'}
[pscustomobject]@{Id='2';Service='Service3';Propertyx=1;Price='5'}
[pscustomobject]@{Id='3';Service='Service2';Propertyx=1;Price='4'}
[pscustomobject]@{Id='4';Service='Service2';Propertyx=1;Price='12'}
[pscustomobject]@{Id='1';Service='Service3';Propertyx=1;Price='8'})
$ident = $data.Id | select -unique | sort
$Serv = $data.Service | select -unique | sort
All help will be appreciated!
Use Group-Object to group objects by common values across one or more properties.
For example, to calculate the sum per Id, do:
$data |Group-Object Id |ForEach-Object {
    [pscustomobject]@{
        Id  = $_.Name
        Sum = $_.Group |Measure-Object Price -Sum |ForEach-Object Sum
    }
}
Which should yield output like:
Id Sum
-- ---
1 17
2 33
3 7
4 19
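The question also asks for sums per Id and Service combination. A minimal sketch of the same approach with two grouping properties, assuming a trimmed, made-up sample in which one Id/Service pair occurs twice so the sum is visible (the `$sums` variable is illustrative):

```powershell
# Hypothetical sample: Id 1 / Service1 appears twice.
$data = @(
    [pscustomobject]@{Id='1';Service='Service1';Price='5'}
    [pscustomobject]@{Id='1';Service='Service1';Price='4'}
    [pscustomobject]@{Id='2';Service='Service1';Price='17'}
    [pscustomobject]@{Id='2';Service='Service2';Price='11'}
)

$sums = $data | Group-Object Id, Service | ForEach-Object {
    [pscustomobject]@{
        # Every row in a group shares the same Id and Service,
        # so the first row is enough to read them back.
        Id      = $_.Group[0].Id
        Service = $_.Group[0].Service
        Sum     = ($_.Group | Measure-Object Price -Sum).Sum
    }
}
$sums
```

With two grouping properties the group's `Name` becomes a combined string such as `1, Service1`, which is why the sketch reads the individual values from the first grouped row instead.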

remove duplicates in powershell when there are two parameters

I have data as follows:
Percent SiteName
-------------------------- --------
95.15 Walnu
88.15 Tucson
99.14 Tarrace
99.39 Tampa
94.73 walnu
92.85 Tarrace
I want to remove the duplicates in SiteName so that the data looks like this:
Percent SiteName
-------------------------- --------
95.15 Walnu
88.15 Tucson
99.14 Tarrace
99.39 Tampa
Select-Object -Unique only works when I want a single property in the output. Is there a way to do this?
Please help
Thanks in advance
If you want to filter out extra objects that contain an already output SiteName value, you can use Group-Object.
# $data contains your collection of objects
$data | Group-Object SiteName | Foreach-Object { $_.Group[0] }
Group-Object groups items based on property values. Grouping by SiteName collects all objects that share a SiteName value into one GroupInfo object; the comparison is case-insensitive by default, so Walnu and walnu end up in the same group. The Group property contains the collection of all grouped objects. Using the index [0], the first one will be output, which works even for SiteName values that have no duplicates.
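If the original row order has to be preserved, a different technique can be used: tracking already-seen names in a .NET `HashSet` while filtering with `Where-Object`. This is a sketch rather than part of the answer above; it assumes PowerShell 5 or later (for `::new`), and the `$seen` and `$unique` names are made up:

```powershell
# Sample rows matching the question's data.
$data = @(
    [pscustomobject]@{Percent=95.15; SiteName='Walnu'}
    [pscustomobject]@{Percent=88.15; SiteName='Tucson'}
    [pscustomobject]@{Percent=99.14; SiteName='Tarrace'}
    [pscustomobject]@{Percent=99.39; SiteName='Tampa'}
    [pscustomobject]@{Percent=94.73; SiteName='walnu'}
    [pscustomobject]@{Percent=92.85; SiteName='Tarrace'}
)

# Case-insensitive set of the SiteName values already output.
$seen = [System.Collections.Generic.HashSet[string]]::new(
    [System.StringComparer]::OrdinalIgnoreCase)

# HashSet.Add() returns $true only the first time a value is added,
# so only the first row per SiteName passes the filter.
$unique = $data | Where-Object { $seen.Add($_.SiteName) }
$unique
```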

How to filter rows in the powershell which are taken from console program?

I have an .exe console program which put the result into the console in the following format:
------------------ ----------- ----------------
  CompanyName       CompanyId    CompanyType
------------------ ----------- ----------------
  test1                 1          Root
  test2                 2          Center
  test3                 3          Company
------------------ ----------- ----------------
I would like to pick up this in a PowerShell script and filter by the CompanyName.
I tried it with:
MyTool.exe companies | where {$_.CompanyName -eq 'test1'}
but it seems that this doesn't work.
Here is one way to convert the output of an EXE to a powershell collection of objects. what it does ...
creates a fake version of the output of your exe file
filters out the lines with repeated hyphens
replaces leading spaces with nothing
replaces 2-or-more spaces with a comma
converts that CSV-like string array into a collection of powershell objects
here's the code [grin] ...
# fake getting string output from an EXE
$InStuff = @'
------------------ ----------- ----------------
  CompanyName       CompanyId    CompanyType
------------------ ----------- ----------------
  test1                 1          Root
  test2                 2          Center
  test3                 3          Company
------------------ ----------- ----------------
'@ -split '\r?\n'   # split on LF or CRLF, whichever the script file uses
$CompanyInfo = $InStuff -notmatch '--{2,}' -replace '^ {1,}' -replace ' {2,}', ',' |
    ConvertFrom-Csv
$CompanyInfo
'=' * 30
$CompanyInfo -match 'Test1'
output ...
CompanyName CompanyId CompanyType
----------- --------- -----------
test1       1         Root
test2       2         Center
test3       3         Company
==============================
test1       1         Root
PowerShell reports an external program's output as an array of lines (strings).
To filter such output using string parsing, use the -match operator:
# Extract the line of interest with -match and a regex
PS> @(MyTool.exe companies) -match '^\s+test1\s'
  test1                 1          Root
Note:
@(...), while not strictly necessary here, ensures that MyTool.exe's output becomes an array even if it happens to output just one line, so that -match performs filtering on that array (with a scalar LHS, -match returns a Boolean).
Regex ^\s+test1\s matches one or more (+) whitespace characters (\s) at the start of each line (^), followed by literal test1, followed by a whitespace character - thereby limiting matching to the CompanyName column.
If you want to parse the result into individual fields:
# Extract the line of interest with -match and a regex,
# then split that line into whitespace-separated tokens and store
# them in individual variables.
PS> $name, $id, $type = -split (@(MyTool.exe companies) -match '^\s+test1\s')
PS> $name, $id, $type
test1
1
Root
Lee Dailey's answer:
shows you how to instead parse your external program's output into custom objects whose properties you can query, by first transforming your program's output into CSV text and then parsing that into custom objects via ConvertFrom-Csv.
While this is very much in the spirit of PowerShell, you inevitably pay a performance penalty, and for extracting simple substrings it may not be worth it.
then, regrettably, forgoes the advantages of having parsed the input into objects by reverting to string matching that negates the benefits of having property-individual matching at one's disposal:
applying -match - a string operator - to a custom object LHS results in a hashtable-like representation for display that is not suited to programmatic processing; e.g.: @{CompanyName=test1; CompanyId=1; CompanyType=Root}
therefore - speaking in the abstract - using -match can result in false positives - because the matching isn't limited to the property of interest.
In short: If you went to the trouble of parsing the input into objects - if necessary at all - use their properties for robust filtering, as you attempted in your question:
$CompanyInfo | where {$_.CompanyName -eq 'test1'}
or, more succinctly, using PSv3+ syntax:
$CompanyInfo | where CompanyName -eq test1
or, more efficiently, in PSv4+, using the .Where() array method:
$CompanyInfo.Where({ $_.CompanyName -eq 'test1'})
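The two approaches can also be combined: split each data line on whitespace and build objects directly, without going through CSV text. The following is a minimal sketch under stated assumptions: `$lines` is a stand-in for what `MyTool.exe companies` would emit, the column layout is assumed, and no field value contains whitespace.

```powershell
# Stand-in for the tool's output lines (assumed layout).
$lines = @(
    '------------------ ----------- ----------------'
    '  CompanyName       CompanyId    CompanyType'
    '------------------ ----------- ----------------'
    '  test1                 1          Root'
    '  test2                 2          Center'
    '  test3                 3          Company'
    '------------------ ----------- ----------------'
)

$companies = $lines |
    # Skip blank lines, separator lines, and the header line.
    Where-Object { $_ -match '\S' -and $_ -notmatch '^\s*-+' -and $_ -notmatch '^\s*CompanyName\b' } |
    ForEach-Object {
        # Unary -split breaks on runs of whitespace and ignores leading/trailing whitespace.
        $name, $id, $type = -split $_
        [pscustomobject]@{ CompanyName = $name; CompanyId = [int]$id; CompanyType = $type }
    }

# Filter on a property rather than on raw text.
$companies | Where-Object CompanyName -eq 'test1'
```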

Count and group by multiple columns

I have a CSV file with the following columns:
Error_ID
Date
hh (hour in two digit)
Error description
It looks like this:
In SQL it was very easy:
SELECT X,Y,Count(1)
FROM #Table
GROUP BY X,Y
In PowerShell it's a bit different.
The Group-Object cmdlet allows grouping by multiple properties:
Import-Csv 'C:\path\to\your.csv' | Group-Object ErrorID, Date
which will give you a result like this:
Count Name          Group
----- ----          -----
    3 1, 15/07/2016 {@{ErrorID=1; Date=15/07/2016; Hour=16}, @{ErrorID=1; Da...
    1 2, 16/07/2016 {@{ErrorID=2; Date=16/07/2016; Hour=9}}
However, to display grouped values in tabular form like an SQL query would do you need to extract them from the groups with calculated properties:
Import-Csv 'C:\path\to\your.csv' | Group-Object ErrorID, Date |
    Select-Object @{n='ErrorID';e={$_.Group[0].ErrorID}},
                  @{n='Date';e={$_.Group[0].Date}}, Count
which will produce output like this:
ErrorID Date Count
------- ---- -----
1 15/07/2016 3
2 16/07/2016 1
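Each group object also exposes the grouping values themselves through its `Values` property, in the order the properties were passed to Group-Object, so the calculated properties need not reach into the grouped rows. A minimal sketch, assuming inline sample data shaped like the output above (the Hour values 17 and 18 are invented for illustration):

```powershell
# Hypothetical rows mirroring the sample output above.
$rows = @'
ErrorID,Date,Hour
1,15/07/2016,16
1,15/07/2016,17
1,15/07/2016,18
2,16/07/2016,9
'@ | ConvertFrom-Csv

$counts = $rows | Group-Object ErrorID, Date |
    Select-Object @{n='ErrorID';e={$_.Values[0]}},
                  @{n='Date';e={$_.Values[1]}},
                  Count
$counts
```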
You can use the following:
$csv = import-csv path/to/csv.csv
$csv | group-object errorid
Count Name Group
----- ---- -----
    2 1    {@{errorID=1; time=15/7/2016; description=bad}, @{errorID=1; time=15/8/2016; description=wow}}
    1 3    {@{errorID=3; time=15/7/2016; description=worse}}
    1 5    {@{errorID=5; time=15/8/2016; description=the worst}}
$csv | where {$_.errorid -eq "1"}
errorID time description
------- ---- -----------
1 15/7/2016 bad
1 15/8/2016 wow
You can combine the first and second examples in a pipeline to get the desired result.

PS: Filter selected rows with only max values as output?

I have a variable results ($result) of several rows of data or object like this:
PS> $result | ft -auto;
name value
---- -----
a 1
a 2
b 30
b 20
....
What I need is, for each name, the row with max(value), like this filtered output:
PS> $result | ? |ft -auto
name value
---- -----
a 2
b 30
....
I'm not sure which command or filter to use (in place of ? above) to get each name with only its max value.
$result | group name | select name,@{n='value';e={ ($_.group | measure value -max).maximum}}
This should do the trick:
PS> $result | Foreach {$ht=@{}} `
    {if ($_.Value -gt $ht[$_.name].Value) {$ht[$_.Name]=$_}} `
    {$ht.Values}
This is essentially using the Begin/Process/End scriptblock parameters of the Foreach-Object cmdlet to stash input objects with a max value based on a key into a hashtable.
Note: watch out for extra spaces after the line continuation character (`) - there shouldn't be any.
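A third variant, sketched here as an alternative rather than as either answer's method, keeps the whole original row per name by sorting each group and taking its first element. It assumes `value` is numeric; if the values come from a CSV they are strings and would need a cast such as `{ [double]$_.value }` as the sort key. The sample data and the `$max` name are illustrative:

```powershell
# Sample rows matching the question's data.
$result = @(
    [pscustomobject]@{name='a'; value=1}
    [pscustomobject]@{name='a'; value=2}
    [pscustomobject]@{name='b'; value=30}
    [pscustomobject]@{name='b'; value=20}
)

$max = $result | Group-Object name | ForEach-Object {
    # Highest value first, then keep only that row.
    $_.Group | Sort-Object value -Descending | Select-Object -First 1
}
$max
```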