Powershell Sort and transpose data based on particular column - powershell

I am new to PowerShell. I have data in a columnar format that needs to be converted into rows, sorted numerically.
Input data:
Fruit month amount
1apple jan 10
3blueberry jan 20
4mango jan 30
2banana jan 50
1apple feb 20
3blueberry feb 50
4mango feb 80
2banana feb 95
Desired Output:
Fruit JanAmount FebAmount
1apple 10 20
2banana 50 95
3blueberry 20 50
4mango 30 80
Can anyone pls help me with this?

As long as there is no whitespace in the fruit names, you can read the file as CSV with a space as the delimiter. Then merge the rows using Group-Object and Add-Member to dynamically add one column per month. Example:
Import-Csv -Path $InSort -Delimiter " " |
#Group all records per fruit
Group-Object Fruit |
#Sort by fruitname
Sort-Object Name |
#Process each group (fruit) to merge the rows
ForEach-Object {
#Create object (row) per fruit
$obj = New-Object -TypeName psobject -Property @{
Fruit = $_.Name
}
#For each record (month), add amount column
$_.Group | ForEach-Object {
#Turn month-value into TitleCase (first letter uppercase)
$month = (Get-Culture).TextInfo.ToTitleCase($_.Month)
#Add amount-column per record (month)
Add-Member -InputObject $obj -MemberType NoteProperty -Name "$($Month)Amount" -Value $_.Amount
}
#Output new objects
$obj
} | Export-CSV -Path newfile.csv -NoTypeInformation -Delimiter " "
Output:
Fruit JanAmount FebAmount
----- --------- ---------
1apple 10 20
2banana 50 95
3blueberry 20 50
4mango 30 80
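One caveat with Sort-Object Name: the group names are strings, so the sort is lexical, and with ten or more fruits a name like 10kiwi would sort before 2banana. A minimal sketch of a numeric sort, assuming every name starts with its number (the 10kiwi entry is made up for illustration):

```powershell
# Hypothetical names; '10kiwi' is not in the original data.
$names = '10kiwi', '2banana', '1apple'

# Strip everything from the first non-digit onward and compare as integers.
$sorted = $names | Sort-Object { [int]($_ -replace '\D.*$') }
$sorted
```

In the pipeline above, this would mean replacing Sort-Object Name with Sort-Object { [int]($_.Name -replace '\D.*$') }.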

If the delimiter in your file is a space, try this:
import-csv "C:\Temp\yourfile.txt" -Delimiter ' ' |
group fruit |
select Name, @{N="JanAmount";E={($_.Group | where month -eq 'jan').amount}} , @{N="FebAmount";E={($_.Group | where month -eq 'feb').amount}}

Related

Select Fields From Duplicate Records for an Array

I've got a CSV file that is imported that looks like this:
Customer ID Contract Start Contract End Region Customer
2-213456 2/20/2018 1/1/2030 NA Acme
2-213456 6/18/2019 6/17/2020 NA Acme
2-213456 6/18/2020 6/30/2021 NA Acme
3-213458 6/27/2019 6/26/2020 CAN Acme Shipping
2-123456 6/27/2020 6/27/2021 AUS Acme Manufacturing
5-123576 6/29/2019 6/28/2020 AUS Acme Storage
From this I'm trying to build an array that has only one entry per unique Customer ID, but includes the earliest Contract Start date and the latest Contract End date, to get a result like:
Customer ID Contract Start Contract End Region Customer
2-213456 2/20/2018 6/30/2021 NA Acme
3-213458 6/27/2019 6/26/2020 CAN Acme Shipping
2-123456 6/27/2020 6/27/2021 AUS Acme Manufacturing
5-123576 6/29/2019 6/28/2020 AUS Acme Storage
This is what I have, but I keep getting a System.Object[] for the dates:
$Data = import-csv -path "C:\Customers.csv"
$Final = @()
$N = 0
$count = $Data.count
foreach ($record in $Data)
{
Write-Host "Record " $N " of " $Count
$Rec = New-Object System.Object
$Rec | Add-Member -type NoteProperty -name "Customer ID" -value $record.'Customer ID'
$StartDate = $Data | Foreach-Object {$_.'Contract Start' = [DateTime]$_.'Contract Start'; $_} | Group-Object 'Customer ID' | Foreach-Object {$_.Group | Sort-Object 'Contract Start' | Select-Object -Property $record.'Contract Start' -first 1}
$Rec | Add-Member -type NoteProperty -name "Contract Start" -value $StartDate
$EndDate = $Data | Foreach-Object {$_.'Contract End' = [DateTime]$_.'Contract End'; $_} | Group-Object 'Customer ID' | Foreach-Object {$_.Group | Sort-Object 'Contract End' | Select-Object -Property $record.'Contract End' -Last 1}
$Rec | Add-Member -type NoteProperty -name "Contract End" -value $EndDate
$Rec | Add-Member -type NoteProperty -name "Region" -value $record.'Region'
$Rec | Add-Member -type NoteProperty -name "Customer" -value $record.'Customer'
$Final += $Rec
$N++
}
I got a lot of DateTime errors trying to replicate what you posted above. You've tried to do a lot in one place when setting and sorting the start and end dates, so the first task is to simplify that. Since you could potentially have a lot of customer data, I thought it best to group the customers by their ID in a hashtable. That way we can look up a customer ID and immediately get that customer's records. A PowerShell class lets us define one method that imports the data into the hashtable and parses the dates as part of the import, and a final method that exports the data, picking the earliest start date and the latest end date. Fully tested solution below.
class Customers {
[hashtable]$Accounts
# Constructor
Customers() {
$this.Accounts = @{}
}
# Methods
[void]AddCustomerData([psobject[]]$Records) {
foreach ($Record in $Records) {
# Convert the dates to datetime objects so we can sort them later
$Record = $this.ParseDates($Record)
$ID = $Record."Customer ID"
# If the hashtable already contains the customer ID, we need to add the new record to their existing ones.
if ($this.Accounts.Contains($ID)) {
$CustomerRecords = $this.Accounts[$ID]
$CustomerRecords += $Record
$this.Accounts[$ID] = $CustomerRecords
}
# If it doesn't we create a new record with the value as an array.
else {
$this.Accounts[$ID] = @(,$Record)
}
}
}
[psobject]ParseDates([psobject]$Row) {
# Your dates appear to be US format, so I've kept them that way, change the culture from 'en-US' if you need to.
$Row."Contract Start" = [Datetime]::Parse($Row."Contract Start",[cultureinfo]::new("en-US",$false))
$Row."Contract End" = [Datetime]::Parse($Row."Contract End",[cultureinfo]::new("en-US",$false))
return $Row
}
[psobject[]]PrintCustomerData() {
$CustomerData = @()
# Loop through the hashtable
$this.Accounts.GetEnumerator() | ForEach-Object {
$Contracts = $_.Value
# Find the earliest start date for the current customer by sorting in ascending order
$StartDate = $Contracts."Contract Start" | Sort-Object | Select-Object -First 1
# Find the latest end date for the current customer by sorting in descending order
$EndDate = $Contracts."Contract End" | Sort-Object -Descending | Select-Object -First 1
# Create a new PSObject for each customer, selecting a Unique value for Region and Customer as it should be the same across records
$CustomerData += [PSCustomObject] @{
"Customer ID" = $_.Key
"Contract Start" = $StartDate
"Contract End" = $EndDate
Region = $($Contracts | Select-Object -Unique -ExpandProperty Region)
Customer = $($Contracts | Select-Object -Unique -ExpandProperty Customer)
}
}
return $CustomerData
}
}
Usage:
$csv = Import-Csv -Path .\Desktop\test.csv
# Create a new instance of the class
$customers = [Customers]::new()
# Add the CSV data to the Accounts hashtable
$customers.AddCustomerData($csv)
# Print out the data from the hashtable in the desired format.
$customers.PrintCustomerData() | Format-Table -AutoSize
Customer ID Contract Start Contract End Region Customer
----------- -------------- ------------ ------ --------
2-213456 20/02/2018 00:00:00 01/01/2030 00:00:00 NA Acme
2-123456 27/06/2020 00:00:00 27/06/2021 00:00:00 AUS Acme Manufacturing
3-213458 27/06/2019 00:00:00 26/06/2020 00:00:00 CAN Acme Shipping
5-123576 29/06/2019 00:00:00 28/06/2020 00:00:00 AUS Acme Storage
And now you have your records in a hashtable, you can do other awesome stuff like look up the records for a particular customer.
$customers.Accounts['2-213456'] | Format-Table -AutoSize
Customer ID Contract Start Contract End Region Customer
----------- -------------- ------------ ------ --------
2-213456 20/02/2018 00:00:00 01/01/2030 00:00:00 NA Acme
2-213456 18/06/2019 00:00:00 17/06/2020 00:00:00 NA Acme
2-213456 18/06/2020 00:00:00 30/06/2021 00:00:00 NA Acme
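The comment in ParseDates mentions the en-US culture. If the file always uses the M/d/yyyy layout, one alternative is to parse with an explicit format, so the result does not depend on any culture's rules. A small sketch, assuming that layout holds for every row (the variable names are illustrative):

```powershell
# Assumption: every date in the file is written as month/day/year.
$invariant = [cultureinfo]::InvariantCulture
$start = [datetime]::ParseExact('2/20/2018', 'M/d/yyyy', $invariant)
$end   = [datetime]::ParseExact('6/30/2021', 'M/d/yyyy', $invariant)

# Real DateTime values compare chronologically, unlike the original strings.
$start -lt $end
```

ParseExact throws on a value that does not fit the format, which surfaces malformed rows early instead of letting them sort incorrectly.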
Using this data.csv as an example input:
Customer ID,Contract Start,Contract End,Region,Customer
2-213456,2/20/2018,1/1/2030,NA,Acme
2-213456,6/18/2019,6/17/2020,NA,Acme
2-213456,6/18/2020,6/30/2021,NA,Acme
3-213458,6/27/2019,6/26/2020,CAN,Acme Shipping
2-123456,6/27/2020,6/27/2021,AUS,Acme Manufacturing
5-123576,6/29/2019,6/28/2020,AUS,Acme Storage
We can use Group-Object to group by Customer ID and use Sort-Object to sort by datetime versions of Contract Start and Contract End. Then we can construct a new System.Management.Automation.PSCustomObject for each compressed record, and format the System.Object[] array with Format-Table.
$array = Import-Csv -Path .\data.csv | Group-Object -Property "Customer ID" | ForEach-Object {
$contractStart = $_.Group |
Sort-Object -Property @{Expression = {[datetime]$_."Contract Start"}} |
Select-Object -First 1
$contractEnd = $_.Group |
Sort-Object -Property @{Expression = {[datetime]$_."Contract End"}} |
Select-Object -Last 1
[PSCustomObject]@{
"Customer ID" = $_.Name
"Contract Start" = $contractStart."Contract Start"
"Contract End" = $contractEnd."Contract End"
"Region" = $contractStart.Region
"Customer" = $contractStart.Customer
}
}
$array.GetType().FullName
$array | Format-Table -AutoSize
Which results in the following table result:
System.Object[]
Customer ID Contract Start Contract End Region Customer
----------- -------------- ------------ ------ --------
2-123456 6/27/2020 6/27/2021 AUS Acme Manufacturing
2-213456 2/20/2018 1/1/2030 NA Acme
3-213458 6/27/2019 6/26/2020 CAN Acme Shipping
5-123576 6/29/2019 6/28/2020 AUS Acme Storage

How to convert powershell array to table

I found a similar post regarding the problem in the link below.
How to fetch first column from given powershell array?
I am not able to convert it directly to a table and then operate on it, because some fields are missing.
Customer ID Client Name Computer Name Computer Brand Duration Connection Time Lang
123 first last 127.0.0.1 lenovo 10:00 8/18/2019 6:00 PM Eng
1 lastname 127.0.0.2 apple 2:30:00 8/18/2019 1:00 AM Chn
86 user3 127.0.0.1 dell 8/18/2019 2:00 PM
21 user4 127.0.0.4 apple 30:00 8/17/2019 1:00 PM Eng
I want to first filter for the users who have been connected for 30 minutes or more and then list their IDs.
Update
The result should be
1
21
because they are connected for 30min and over.
If the data you show is indeed the output of a Fixed-Width file, you need to try and get the widths for each field in order to parse it. A handicap here is that the original header names contain a space character and we need to replace that by an underscore.
For that, you can use the below function:
function ConvertFrom-FixedWith {
[CmdletBinding()]
Param(
[Parameter(Mandatory = $true, Position = 0)]
[string[]]$Content
)
$splitter = '§¤¶' # some unlikely string: Alt-21, [char]164, Alt-20
$needQuotes = '^\s+|[",]|\s+$' # quote the fields if needed
function _FWClean ([string]$field) {
# internal helper function to clean a field value with regards to quoted fields
$field = $field.Trim() -replace '(?<!\\)\\"|""', '§DQUOTE¶'
if ($field -match '^"(.*)"$') { $field = $matches[1] }
if ($field -match $needQuotes) { $field = '"{0}"' -f $field }
return $field -replace '§DQUOTE¶', '""'
}
# try and calculate the field widths using the first header line
# this only works if none of the header names have spaces in them
# and where the headers are separated by at least one space character.
Write-Verbose "Calculating column widths using first row"
$row = ($Content[0] -replace '\s+', ' ').Trim()
$fields = @($row -split ' ' ) # | ForEach-Object { _FWClean $_ })
$ColumnBreaks = for ($i = 1; $i -lt $fields.Length; $i++) {
$Content[0].IndexOf($fields[$i])
}
$ColumnBreaks = $ColumnBreaks | Sort-Object -Descending
Write-Verbose "Splitting fields and generating output"
$Content | ForEach-Object {
if ($null -ne $_ -and $_ -match '\S') {
$line = $_
# make sure lines that are too short get padded on the right
if ($line.Length -le $ColumnBreaks[0]) { $line = $line.PadRight(($ColumnBreaks[0] + 1), ' ') }
# add the splitter string on every column break point
$ColumnBreaks | ForEach-Object {
$line = $line.Insert($_, $splitter)
}
# split on the splitter string, trim, and dedupe possible quotes
# then join using the delimiter character
@($line -split $splitter | ForEach-Object { _FWClean $_ }) -join ','
}
} | ConvertFrom-Csv # the result is an array of PSCustomObjects
}
With that function in place, parsing the text can be done like so:
$text = @"
Customer_ID Client_Name Computer_Name Computer_Brand Duration Connection_Time Lang
123 first last 127.0.0.1 lenovo 10:00 8/18/2019 6:00 PM Eng
1 lastname 127.0.0.2 apple 2:30:00 8/18/2019 1:00 AM Chn
86 user3 127.0.0.1 dell 8/18/2019 2:00 PM
21 user4 127.0.0.4 apple 30:00 8/17/2019 1:00 PM Eng
"@ -split '\r?\n'
# replace the single space characters in the header names by underscore
$text[0] = $text[0] -replace '(\w+) (\w+)', '$1_$2'
# the 'ConvertFrom-FixedWith' function takes a string array as input
$table = ConvertFrom-FixedWith -Content $text
#output on screen
$table | Format-Table -AutoSize
# export to CSV file
$table | Export-Csv -Path 'D:\test.csv' -NoTypeInformation
Output (on screen)
Customer ID Client Name Computer Name Computer Brand Duration Connection Time Lang
----------- ----------- ------------- -------------- -------- --------------- ----
123 first last 127.0.0.1 lenovo 10:00 8/18/2019 6:00 PM Eng
1 lastname 127.0.0.2 apple 2:30:00 8/18/2019 1:00 AM Chn
86 user3 127.0.0.1 dell 8/18/2019 2:00 PM
21 user4 127.0.0.4 apple 30:00 8/17/2019 1:00 PM Eng
If your input $text is already a string array storing all the lines as we see them in your question, then leave out the -split '\r?\n'.
Having parsed the input to a table of PsCustomObjects, you can get the customers that are connected for 30 minutes or more with the help of another small helper function:
function Get-DurationInMinutes ([string]$Duration) {
$h, $m, $s = (('0:{0}' -f $Duration) -split ':' | Select-Object -Last 3)
return [int]$h * 60 + [int]$m
}
($table | Where-Object { (Get-DurationInMinutes $_.Duration) -ge 30 }).Customer_ID
This will output
1
21
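To see why the helper prepends 0: before splitting, here is how it treats the four Duration values from the sample. The prefix guarantees at least two fields, so mm:ss, h:mm:ss and an empty string all go through the same last-three-fields logic. This only restates the function above with the sample values; the $minutes variable is illustrative:

```powershell
function Get-DurationInMinutes ([string]$Duration) {
    # '30:00' becomes '0:30:00'; '2:30:00' becomes '0:2:30:00' and the extra leading 0 is dropped
    $h, $m, $s = (('0:{0}' -f $Duration) -split ':' | Select-Object -Last 3)
    return [int]$h * 60 + [int]$m
}

$minutes = '10:00', '2:30:00', '', '30:00' | ForEach-Object { Get-DurationInMinutes $_ }
$minutes -join ', '   # 10, 150, 0, 30
```

The empty Duration of customer 86 ends up as 0 minutes because casting an empty string to [int] gives 0.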
Update
Now that we finally know the data is from a TAB delimited CSV file, you don't need the ConvertFrom-FixedWith function.
Simply import the data using Import-Csv if it comes from a file:
$table = Import-Csv -Path 'D:\customers.csv' -Delimiter "`t"
Or, if it comes from the output of another command as string or string array:
$table = $original_output | ConvertFrom-Csv -Delimiter "`t"
Then, use the Get-DurationInMinutes helper function just like above to get the Customer IDs that are connected for 30 minutes or more:
function Get-DurationInMinutes ([string]$Duration) {
$h, $m, $s = (('0:{0}' -f $Duration) -split ':' | Select-Object -Last 3)
return [int]$h * 60 + [int]$m
}
($table | Where-Object { (Get-DurationInMinutes $_.Duration) -ge 30 }).'Customer ID'
Uhh. I'm surprised there's not a canonical way to do this. Based on https://www.reddit.com/r/PowerShell/comments/211ewa/how_to_convert_fixedwidth_to_pipedelimited_or/.
# 0 19 38 59 81 97 120 123
# Customer ID Client Name Computer Name Computer Brand Duration Connection Time Lang
# 123 first last 127.0.0.1 lenovo 10:00 8/18/2019 6:00 PM Eng
# 1 lastname 127.0.0.2 apple 2:30:00 8/18/2019 1:00 AM Chn
# 86 user3 127.0.0.1 dell 8/18/2019 2:00 PM
# 21 user4 127.0.0.4 apple 30:00 8/17/2019 1:00 PM Eng
$cols = 0,19,38,59,81,97,120,123 # fake extra column at the end, assumes all rows are that wide
$firstline = get-content columns.txt | select -first 1
$headers = for ($i = 0; $i -lt $cols.count - 1; $i++) {
$firstline.substring($cols[$i], $cols[$i+1]-$cols[$i]).trim()
}
# string Substring(int startIndex, int length)
$lines = Get-Content columns.txt | select -skip 1
$lines | ForEach {
$hash = [ordered]@{}
for ($i = 0; $i -lt $headers.length; $i++) {
$hash += @{$headers[$i] = $_.substring($cols[$i], $cols[$i+1]-$cols[$i]).trim()}
}
[pscustomobject]$hash
}
Output:
PS /Users/js/foo> ./columns | ft
Customer ID Client Name Computer Name Computer Brand Duration Connection Time Lan
----------- ----------- ------------- -------------- -------- --------------- ---
123 first last 127.0.0.1 lenovo 10:00 8/18/2019 6:00 PM Eng
1 lastname 127.0.0.2 apple 2:30:00 8/18/2019 1:00 AM Chn
86 user3 127.0.0.1 dell 8/18/2019 2:00 PM
21 user4 127.0.0.4 apple 30:00 8/17/2019 1:00 PM Eng
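The column offsets in $cols were worked out by hand. As a rough sketch, they could also be derived from the header line, assuming header names contain only single spaces and neighbouring columns are separated by at least two spaces (the short header below is a made-up example, not the original file):

```powershell
# Hypothetical three-column header with two spaces between columns.
$header = 'Customer ID  Client Name  Lang'

# A header name is a run of non-space tokens joined by single spaces.
$starts = [regex]::Matches($header, '\S+(?: \S+)*') | ForEach-Object { $_.Index }

# Append the line length as the closing boundary, like the extra entry in $cols.
$cols = @($starts) + $header.Length
$cols -join ','   # 0,13,26,30
```

Data rows that are longer than the header would still need the last boundary widened, which is the same limitation the hand-written list has.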
I think you have a couple of requirements here. I'm going to describe one way to do it using a generic 'for loop' and regular expression - something you can play with and tweak to your needs. There are better ways of doing this (Powershell shortcuts), but based on the way you asked I'm going to assume that understanding is your goal, so this code should serve well if you have a background in any programming language. Hope this helps!
# Here is your data formatted in an array. Missing values are just empty fields.
# You could have fewer or more fields, but I've broken up your data into nine fields
# (0-8 when counting elements in an array)
# Customer ID, FName, LName, ComputerHostName, Brand, Duration, ConnectionDate, ConnectionTime, Lang
$myarray = @(
('123', 'firstname', 'lastname', '127.0.0.1', 'lenovo', '10:00', '8/18/2019', '6:00 PM', 'Eng'),
('1', 'lastnam', '', '127.0.0.2', 'apple', '2:30:00', '8/18/2019', '1:00 AM', 'Chn'),
('86', 'user3', '', '127.0.0.1', 'dell', '04:33', '8/18/2019', '2:00 PM', ''),
('21', 'user4', '', '127.0.0.4', 'apple', '30:00', '8/17/2019', '1:00 PM', 'Eng')
)
# This is a generic for loop that prints the ComputerHostName, which is the 4th column.
# The 4th column is column #3 if counting from zero (0,1,2,3)
# I'm using a regular expression to match duration above 30 minutes with the '-match' operator
for ( $i = 0; $i -lt $myarray.Length; $i++ ) {
if ( $myarray[$i][5] -match "[3-5][0-9]:[0-9][0-9]$" ){
"$($myarray[$i][5]) - $($myarray[$i][3])"
}
}
Printed Result:
2:30:00 - 127.0.0.2
30:00 - 127.0.0.4
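Note that the pattern above only looks at the last two fields, so a duration such as 1:05:00 (over an hour, but with fewer than 30 in the minutes field) is not matched. A possible tweak, as a sketch: also accept any value with a non-zero hours field. The $pattern name and the extra 1:05:00 sample are illustrative and not part of the original data.

```powershell
# Either a non-zero hours field (h:mm:ss), or 30-59 in the minutes field.
$pattern = '^(?:[1-9]\d*:\d{2}|(?:0+:)?[3-5]\d):\d{2}$'

$durations = '10:00', '2:30:00', '04:33', '30:00', '1:05:00'
$matched = $durations | Where-Object { $_ -match $pattern }
$matched -join ', '   # 2:30:00, 30:00, 1:05:00
```

Converting the duration to a number of minutes and comparing with -ge 30 is usually easier to maintain than a pattern like this.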

Group and Sum CSV with unknown number of columns

Wondering if someone would be able to help me. I'm trying to import, group, sum, and then export a CSV. The problem is that my CSV has an unknown number of columns in the following format.
GroupA,GroupB,GroupC,ValueA,ValueB,ValueC,ValueD...
GroupA, B and C are constant and the fields I want to group by - I know the names of these fields in advance. The problem is there are an unknown number of Value columns - all of which I want to Sum (and don't know the names of in advance.)
I'm comfortable getting this code working if I know the name of the Value fields and have a fixed number of Value Fields. But I'm struggling to get code for unknown names and number of columns.
$csvImport = import-csv 'C:\input.csv'
$csvGrouped = $csvImport | Group-Object -property GroupA,GroupB,GroupC
$csvGroupedFinal = $csvGrouped | Select-Object @{Name = 'GroupA';Expression={$_.Values[0]}},
@{Name = 'GroupB';Expression={$_.Values[1]}},
@{Name = 'GroupC';Expression={$_.Values[2]}},
@{Name = 'ValueA' ;Expression={
($_.Group|Measure-Object 'ValueA' -Sum).Sum
}}
$csvGroupedFinal | Export-Csv 'C:\output.csv' -NoTypeInformation
Example Input Data -
GroupA, GroupB, Value A
Sam, Apple, 10
Sam, Apple, 20
Sam, Orange, 50
Ian, Apple, 15
Output Data -
GroupA, GroupB, Value A
Sam, Apple, 30
Sam, Orange, 50
Ian, Apple, 15
The following script should work. Pay attention to the $FixedNames variable:
$csvImport = @"
Group A,Group B,Value A
sam,apple,10
sam,apple,20
sam,orange,50
ian,apple,15
"@ | ConvertFrom-Csv
$FixedNames = @('Group A', 'Group B', 'Group C')
# $aux = ($csvImport|Get-Member -MemberType NoteProperty).Name ### sorted (wrong)
$aux = ($csvImport[0].psobject.Properties).Name ### not sorted
$auxGrpNames = @( $aux | Where-Object {$_ -in $FixedNames})
$auxValNames = @( $aux | Where-Object {$_ -notin $FixedNames})
$csvGrouped = $csvImport | Group-Object -property $auxGrpNames
$csvGroupedFinal = $csvGrouped |
ForEach-Object {
($_.Name.Replace(', ',','), (($_.Group |
Measure-Object -Property $auxValNames -Sum
).Sum -join ',')) -join ','
} | ConvertFrom-Csv -Header $aux
$csvGroupedFinal
Tested likewise for
$csvImport = @"
Group A,Group B,Value A,Value B
sam,apple,10,1
sam,apple,20,
sam,orange,50,5
ian,apple,15,51
"@ | ConvertFrom-Csv
as well as for more complex data of Group A,Group B,Group C,Value A,Value B header.
Edit: updated according to LotPings' helpful comment.
After importing, this script splits the properties (columns) into groups and values.
It groups dynamically and sums only the value fields, independent of their number.
The input column order is restored with a final Select-Object.
## Q:\Test\2019\01\17\SO_54237887.ps1
$csvImport = Import-Csv '.\input.csv'
$Cols = ($csvImport[0].psobject.Properties).Name
# get list of group columns by name and wildcard
$GroupCols = $Cols | Where-Object {$_ -like 'Group*'}
# a different approach would be to select a number of leading columns
# $GroupCols = $Cols[0..1]
$ValueCols = $Cols | Where-Object {$_ -notin $GroupCols}
$OutCols = ,'Groups' + $ValueCols
$csvGrouped = $csvImport | Group-Object $GroupCols | ForEach-Object{
$Props = @{Groups=$_.Name}
ForEach ($ValCol in $ValueCols){
$Props.Add($ValCol,($_.Group|Measure-Object $ValCol -Sum).Sum)
}
[PSCustomObject]$Props
}
$csvGrouped | Select-Object $OutCols
With this sample input file
GroupA GroupB ValueA ValueB
------ ------ ------ ------
Sam Apple 10 15
Sam Apple 20 25
Sam Orange 50 75
Ian Apple 15 20
Sample output for any number of Groups and values
Groups ValueA ValueB
------ ------ ------
Sam, Apple 30 40
Sam, Orange 50 75
Ian, Apple 15 20
Without any change in code it processes the data from Hassan's answer too:
Groups ValueA ValueB ValueC
------ ------ ------ ------
Sam, Apple 30 4 20
Sam, Orange 50 4 5
Ian, Apple 15 3 3
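If the group columns should stay separate properties instead of one combined Groups string, the first row of each group can supply their values. A sketch along the lines of the script above, assuming the group column names are known in advance (inline sample data stands in for input.csv, and the variable names are illustrative):

```powershell
$rows = @'
GroupA,GroupB,ValueA,ValueB
Sam,Apple,10,15
Sam,Apple,20,25
Sam,Orange,50,75
Ian,Apple,15,20
'@ | ConvertFrom-Csv

$groupCols = 'GroupA', 'GroupB'
# Every column that is not a group column is treated as a value column.
$valueCols = $rows[0].psobject.Properties.Name | Where-Object { $_ -notin $groupCols }

$result = $rows | Group-Object $groupCols | ForEach-Object {
    $props = [ordered]@{}
    # All rows in a group share the same group values, so take them from the first row.
    foreach ($g in $groupCols) { $props[$g] = $_.Group[0].$g }
    # Sum each value column over the rows of the group.
    foreach ($v in $valueCols) { $props[$v] = ($_.Group | Measure-Object $v -Sum).Sum }
    [pscustomobject]$props
}
$result | Format-Table -AutoSize
```

Because $props is an ordered dictionary, the output columns keep the input order without a final Select-Object.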
script1.ps1
Import-Csv 'input.csv' | `
Group-Object -Property GroupA,GroupB | `
% {$b=$_.name -split ', ';$c=($_.group | `
Measure-Object -Property Value* -Sum).Sum;
[PScustomobject]@{GroupA=$b[0];
GroupB=$b[1];
Sum=($c | Measure-Object -Sum).Sum }}
input.csv
GroupA, GroupB, ValueA, ValueB, ValueC
Sam, Apple, 10, 1, 10
Sam, Apple, 20, 3, 10
Sam, Orange, 50, 4, 5
Ian, Apple, 15, 3, 3
OUTPUT
PS D:\coding> .\script1.ps1
GroupA GroupB Sum
------ ------ ---
Sam Apple 54
Sam Orange 59
Ian Apple 21

re-arrange and combine powershell custom objects

I have a system that currently reads data from a CSV file produced by a separate system that is going to be replaced.
The imported CSV file looks like this
PS> Import-Csv .\SalesValues.csv
Sale Values AA BB
----------- -- --
10 6 5
5 3 4
3 1 9
To replace this process I hope to produce an object that looks identical to the CSV above, but I do not want to continue to use a CSV file.
I already have a script that reads data in from our database and extracts the data that I need to use. I'll not detail the fairly long script that precedes this point but in effect it looks like this:
$SQLData = Custom-SQLFunction "SELECT * FROM SALES_DATA WHERE LIST_ID = $LISTID"
$SQLData will contain ~5000+ DataRow objects that I need to query.
One of those DataRow objects looks something like this:
lead_id : 123456789
entry_date : 26/10/2018 16:51:16
modify_date : 01/11/2018 01:00:02
status : WRONG
user : mrexample
vendor_lead_code : TH1S15L0NGC0D3
source_id : A543212
list_id : 333004
list_name : AA Some Text
gmt_offset_now : 0.00
SaleValue : 10
list_name is going to be prefixed with AA or BB.
SaleValue can be any integer 3 and up, however realistically extremely unlikely to be higher than 100 (as this is a monthly donation) and will be one of 3,5,10 in the vast majority of occurrences.
I already have script that takes the content of list_name, creates and populates the data I need to use into two separate psobjects ($AASalesValues and $BBSalesValues) that collates the total numbers of 'SaleValue' across the data set.
Because I cannot reliably anticipate the value of any SaleValue I have to dynamically create the psobjects properties like this
foreach ($record in $SQLData) {
if ($record.list_name -match "BB") {
if ($record.SaleValue -gt 0) {
if ($BBSalesValues | Get-Member -Name $($record.SaleValue) -MemberType Properties) {
$BBSalesValues.$($record.SaleValue) = $BBSalesValues.$($record.SaleValue)+1
} else {
$BBSalesValues | Add-Member -Name $($record.SaleValue) -MemberType NoteProperty -Value 1
}
}
}
}
The two resultant objects look like this:
PS> $AASalesValues
10 5 3 50
-- - - --
17 14 3 1
PS> $BBSalesvalues
3 10 5 4
- -- - -
36 12 11 1
I now have the data that I need, however I need to format it in a way that replicates the format of the CSV so I can pass it directly to another existing powershell script that is configured to expect the data in the format that the CSV is in, but I do not want to write the data to a file.
I'd prefer to pass this directly to the next part of the script.
Ultimately what I want to do is to produce a new object/some output that looks like the output from Import-Csv command at the top of this post.
I'd like a new object, say $OverallSalesValues, to look like this:
PS>$overallSalesValues
Sale Values AA BB
50 1 0
10 17 12
5 14 11
4 0 1
3 3 36
In the above example the values from $AASalesValues is listed under the AA column, the values from $BBSalesValues is listed under the BB column, with the rows matching the headers of the two original objects.
I did try this with hashtables but I was unable to work out how to both create them from dynamic values and format them to how I needed them to look.
Finally got there.
$TotalList = @()
foreach($n in 3..200){
if($AASalesValues.$n -or $BBSalesValues.$n){
$AACount = $AASalesValues.$n
$BBcount = $BBSalesValues.$n
$values = [PSCustomObject]@{
'Sale Value'= $n
AA = $AACount
BB = $BBcount
}
$TotalList += $values
}
}
$TotalList
produces an output of
Sale Value AA BB
---------- -- --
3 3 36
4 2
5 14 11
10 18 12
50 1
Just need to add a bit to include '0' values instead of $null.
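One way to add that bit: casting $null to [int] yields 0, so the missing counts can be filled in with a cast. A sketch with stand-in objects built from the counts shown in the question (the real $AASalesValues and $BBSalesValues come from the database loop):

```powershell
# Stand-ins for the two objects from the question.
$AASalesValues = [pscustomobject]@{ '10' = 17; '5' = 14; '3' = 3;  '50' = 1 }
$BBSalesValues = [pscustomobject]@{ '3'  = 36; '10' = 12; '5' = 11; '4'  = 1 }

$TotalList = foreach ($n in 3..200) {
    $key = "$n"   # property names are strings
    if ($AASalesValues.$key -or $BBSalesValues.$key) {
        [pscustomobject]@{
            'Sale Value' = $n
            AA           = [int]$AASalesValues.$key   # [int]$null is 0
            BB           = [int]$BBSalesValues.$key
        }
    }
}
$TotalList | Sort-Object 'Sale Value' -Descending
```

Collecting the loop output directly into $TotalList also avoids rebuilding the array with += on every iteration.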
I'm going to assume that $record contains a list of the database results for either $AASalesValues or $BBSalesValues, not both, otherwise you'd need some kind of selector to avoid counting records of one group with the other group.
Group the records by their SaleValue property as LotPings suggested:
$BBSalesValues = $record | Group-Object SaleValue -NoElement
That will give you a list of the SaleValue values with their respective count.
PS> $BBSalesValues
Count Name
----- ----
36 3
12 10
11 5
1 4
You can then update your CSV data with these values like this:
$file = 'C:\path\to\data.csv'
# read CSV into a hashtable mapping the sale value to the complete record
# (so that we can lookup the record by sale value)
$csv = @{}
Import-Csv $file | ForEach-Object {
$csv[$_.'Sale Values'] = $_
}
# Add records for missing sale values
$($AASalesValues; $BBSalesValues) | Select-Object -Expand Name -Unique | ForEach-Object {
if (-not $csv.ContainsKey($_)) {
$csv[$_] = New-Object -Type PSObject -Property @{
'Sale Values' = $_
'AA' = 0
'BB' = 0
}
}
}
# update records with values from $AASalesValues
$AASalesValues | ForEach-Object {
[int]$csv[$_.Name].AA += $_.Count
}
# update records with values from $BBSalesValues
$BBSalesValues | ForEach-Object {
[int]$csv[$_.Name].BB += $_.Count
}
# write updated records back to file
$csv.Values | Export-Csv $file -NoType
Even with your updated question the approach would be pretty much the same, you'd just add another level of grouping for collecting the sales numbers:
$sales = @{}
$record | Group-Object {$_.list_name.Split()[0]} | ForEach-Object {
$sales[$_.Name] = $_.Group | Group-Object SaleValue -NoElement
}
and then adjust the merging to something like this:
$file = 'C:\path\to\data.csv'
# read CSV into a hashtable mapping the sale value to the complete record
# (so that we can lookup the record by sale value)
$csv = @{}
Import-Csv $file | ForEach-Object {
$csv[$_.'Sale Values'] = $_
}
# Add records for missing sale values
$sales.Values | Select-Object -Expand Name -Unique | ForEach-Object {
if (-not $csv.ContainsKey($_)) {
$prop = @{'Sale Values' = $_}
$sales.Keys | ForEach-Object {
$prop[$_] = 0
}
$csv[$_] = New-Object -Type PSObject -Property $prop
}
}
# update records with values from $sales
$sales.GetEnumerator() | ForEach-Object {
$name = $_.Key
$_.Value | ForEach-Object {
[int]$csv[$_.Name].$name += $_.Count
}
}
# write updated records back to file
$csv.Values | Export-Csv $file -NoType

Aggregating tasks by duration for each day in a week/month? (PoSH)

I am parsing JSON from a web service to get my tasks (using a TimeFlip). Right now, I get back each task, when it occurred, and duration, so the data looks like this:
(taskname, start, durationinSec)
TaskA,"6/5/2018 12:16:36 PM",312
TaskB,"6/5/2018 12:30:36 PM",200
TaskA,"6/6/2018 08:00:00 AM",150
TaskA,"6/6/2018 03:00:00 PM",150
(etc etc)
I would like to generate a rollup report, showing by day which tasks had how much time.
While the data will span weeks, I'm just trying to do a weekly report that I can easily transcribe into our time app (since they won't give me an API key). So I'll do something like ? {$_.start -gt (Get-Date -Hour 0 -Minute 00 -Second 00).AddDays(-7)} first.
6/5/2018 6/6/2018
TaskA 312 300
TaskB 200
How can I do that? I assume group-object, but unclear how you'd do either the pivot or even the grouping.
The following doesn't output a pivot table, but performs the desired grouping and aggregation:
$rows = @'
taskname,start,durationinSec
TaskA,"6/5/2018 12:16:36 PM",312
TaskB,"6/5/2018 12:30:36 PM",200
TaskA,"6/6/2018 08:00:00 AM",150
TaskA,"6/6/2018 03:00:00 PM",150
'@ | ConvertFrom-Csv
$rows | Group-Object { (-split $_.start)[0] }, taskname | ForEach-Object {
$_ | Select-Object @{ n='Date'; e={$_.Values[0]} },
@{ n='Task'; e={$_.Values[1]} },
@{ n='Duration'; e={ ($_.Group | Measure-Object durationInSec -Sum).Sum } }
}
(-split $_.start)[0] splits each start value by whitespace and returns the first token ([0]), which is the date portion of the time stamp; e.g., 6/5/2018 is returned for 6/5/2018 12:16:36 PM; passing this operation as a script block ({ ... }) to Group-Object means that grouping happens by date only, not also time (in addition to grouping by taskname).
This yields:
Date Task Duration
---- ---- --------
6/5/2018 TaskA 312
6/5/2018 TaskB 200
6/6/2018 TaskA 300
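The unary -split works on the text of the timestamp, so the group key is a string. If the dates need to sort chronologically later (for example 6/10/2018 after 6/9/2018), one option is to parse the value and group by its .Date instead. A small sketch, assuming the timestamps use the US-style layout shown; the variable names are illustrative:

```powershell
$stamp = '6/5/2018 12:16:36 PM'

# Unary -split: break on whitespace, keep the first token (a string).
$token = (-split $stamp)[0]

# Alternative: parse to DateTime and keep only the date part.
$day = [datetime]::Parse($stamp, [cultureinfo]::InvariantCulture).Date

$token                       # 6/5/2018
$day.ToString('yyyy-MM-dd')  # 2018-06-05
```

With this approach the script block passed to Group-Object would return the parsed .Date rather than the first token.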
To construct pivot-table-like output requires substantially more effort, and it won't be fast:
Assume that $objs contains the objects created above ($objs = $rows | Group-Object ...).
# Get all distinct dates.
$dates = $objs | Select-Object -Unique -ExpandProperty Date
# Get all distinct tasks.
$tasks = $objs | Select-Object -Unique -ExpandProperty Task
# Create an ordered hashtable that contains an entry for each task that
# holds a nested hashtable with (empty-for-now) entries for all dates.
$ohtPivot = [ordered] @{}
$tasks | ForEach-Object {
$ohtDates = [ordered] @{}
$dates | ForEach-Object { $ohtDates[$_] = $null }
$ohtPivot[$_] = $ohtDates
}
# Fill the hashtable from the grouped objects with the task- and
# date-specific durations.
$objs | ForEach-Object { $ohtPivot[$_.Task][$_.Date] = $_.Duration }
# Output the resulting hashtable in pivot-table-like form by transforming
# each entry into a custom object
$ohtPivot.GetEnumerator() | ForEach-Object {
[pscustomobject] @{ Task = $_.Key } | Add-Member -PassThru -NotePropertyMembers $_.Value
}
The above yields:
Task 6/5/2018 6/6/2018
---- -------- --------
TaskA 312 300
TaskB 200
Googling for PowerShell and Pivot I found this gist.github.com with a more universal way to create the PivotTable.
To transpose (swap x,y) you simply change the variables $rotate, $keep
It has the additional benefit of calculating a row Total
## Q:\Test\2018\06\09\PivotTable.ps1
## Source https://gist.github.com/andyoakley/1651859
# #############################################################################
# Rotates a vertical set similar to an Excel PivotTable
# #############################################################################
$OutputFile = "MyPivot.csv"
$data = @'
taskname,start,duration
TaskA,"6/5/2018 12:16:36 PM",312
TaskB,"6/5/2018 12:30:36 PM",200
TaskA,"6/6/2018 08:00:00 AM",150
TaskA,"6/6/2018 03:00:00 PM",150
'@ | ConvertFrom-Csv |Select-Object taskname, duration, @{n='start';e={($_.start -split ' ')[0]}}
# Fields of interest
$rotate = "taskname" # Bits along the top
$keep = "start" # Those along the side
$value = "duration" # What to total
#-------------------- No need to change anything below ------------------------
# Create variable to store the output
$rows = @()
# Find the unique "Rotate" [top row of the pivot] values and sort ascending
$pivots = $data | select -unique $rotate | foreach { $_.$rotate} | Sort-Object
# Step through the original data...
# for each of the "Keep" [left hand side] find the Sum of the "Value" for each "Rotate"
$data |
group $keep |
foreach {
$group = $_.Group
# Create the data row and name it as per the "Keep"
$row = new-object psobject
$row | add-member NoteProperty $keep $_.Name
# Cycle through the unique "Rotate" values and get the sum
foreach ($pivot in $pivots) {
$row | add-member NoteProperty $pivot ($group | where { $_.$rotate -eq $pivot } | measure -sum $value).Sum
}
# Add the total to the row
$row | add-member NoteProperty Total ($group | measure -sum $value).Sum
# Add the row to the collection
$rows += $row
}
# Do something with the pivot rows
$rows | Format-Table
$rows | Export-Csv $OutputFile -NoTypeInformation
Sample output:
start TaskA TaskB Total
----- ----- ----- -----
6/5/2018 312 200 512
6/6/2018 300 300
Or x/y swapped
taskname 6/5/2018 6/6/2018 Total
-------- -------- -------- -----
TaskA 312 300 612
TaskB 200 200