Powershell - Group and count from CSV file - powershell

In my CSV file, I have two columns with header Start_date and Status. I am trying to find out the success percentage for each Start_date
Start_date Status
------------------------------------------
02-03-2022 Completed
02-03-2022 Completed
03-03-2022 Failed
03-03-2022 Completed
I am looking for a final output like the one below, which I will export to CSV:
Start_date Total Completed Failed Success %
02-03-2022 2 2 0 100
03-03-2022 2 1 1 50
As a first step, I am trying to get the job count for each day using the code below.
$data = Import-Csv "C:\file.csv"
$data | group {$_.Start_date} | Sort-Object {$_.Start_date} | Select-Object {$_.Status}, Count
Above code will give me output like
$_.Status Count
--------- -----
1
1
It is not showing the date value. What is the correct approach for this issue?

You can use Group-Object to group the objects by the date column, then it's just math:
$csv = @'
Start_date,Status
02-03-2022,Completed
02-03-2022,Completed
03-03-2022,Failed
03-03-2022,Completed
'@ | ConvertFrom-Csv
# This from your side should be:
# Import-Csv path/to/csv.csv | Group-Object ....
$csv | Group-Object Start_date | ForEach-Object {
    # 'Split' returns two collections: the matches and the non-matches
    $completed, $failed = $_.Group.Status.where({ $_ -eq 'Completed' }, 'Split')
    $totalc = $_.Group.Count
    $complc = $completed.Count
    $failc = $failed.Count
    $success = $complc / $totalc
    [pscustomobject]@{
        Start_Date = $_.Name
        Total = $totalc
        Completed = $complc
        Failed = $failc
        Success = $success.ToString('P0')
    }
}
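If you want the exact column layout from the question (including a numeric "Success %" column) written to a CSV file, something along these lines might work. This is only a sketch: the inline sample stands in for your Import-Csv call, the output path is a made-up temp file, and Failed is computed as Total minus Completed, which assumes Completed and Failed are the only two statuses.

```powershell
# Sample data standing in for Import-Csv "C:\file.csv" (assumption: same two columns).
$csv = @'
Start_date,Status
02-03-2022,Completed
02-03-2022,Completed
03-03-2022,Failed
03-03-2022,Completed
'@ | ConvertFrom-Csv

$summary = $csv | Group-Object Start_date | ForEach-Object {
    # Count the rows of this date whose Status is 'Completed'.
    $completed = @($_.Group | Where-Object Status -eq 'Completed').Count
    [pscustomobject]@{
        Start_date  = $_.Name
        Total       = $_.Count
        Completed   = $completed
        # Assumption: every row that is not Completed is Failed.
        Failed      = $_.Count - $completed
        'Success %' = [math]::Round($completed / $_.Count * 100, 2)
    }
}

# Hypothetical output location; replace with your own path.
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'summary.csv'
$summary | Export-Csv -Path $outFile -NoTypeInformation
```

Keeping "Success %" as a number rather than a formatted string makes the exported column easier to sort or chart later.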

Here's another one:
$csv = Import-Csv C:\Temp\tmp.csv
$Results = @()
foreach ($group in $csv | Group Start_date)
{
    $Completed = ($group.Group | group status | ? Name -eq Completed).Count
    $Failed = ($group.Group | group status | ? Name -eq Failed).Count
    $row = "" | Select Start_date,Total,Completed,Failed,"Success %"
    $row.Start_date = $group.Name
    $row.Total = $group.Count
    $row.Completed = $Completed
    $row.Failed = $Failed
    # Divide by this date's row count rather than a hard-coded 2
    $row."Success %" = $Completed / $row.Total * 100
    $Results += $row
}
$results
Start_date Total Completed Failed Success %
---------- ----- --------- ------ ---------
02-03-2022 2 2 0 100
03-03-2022 2 1 1 50

Related

Powershell - Group and count unique values from CSV file based on a column

I am trying to get the total completed and failed jobs for each day. If a job fails for a specific VM (Name field) on a specific day, it retries the operation. If it completes on the second attempt, I want to ignore the failed count for that VM and reduce the total count accordingly.
Example:
My code
$csv = ConvertFrom-Csv @"
Name,Start_time,Status
vm1,20-03-2022,Completed
vm2,20-03-2022,Completed
vm1,21-03-2022,Failed
vm1,21-03-2022,Completed
vm2,21-03-2022,Completed
vm1,22-03-2022,Completed
vm2,22-03-2022,Failed
vm2,22-03-2022,Failed
"@
$Results = @()
foreach ($group in $csv | Group Start_time)
{
$Completed = ($group.Group | group status | ? Name -eq Completed).Count
$Failed = ($group.Group | group status | ? Name -eq Failed).Count
$row = "" | Select Date,Total,Completed,Failed,"Success %"
$row.Date = $group.Name
$row.Total = $group.Count
$row.Completed = $Completed
$row.Failed = $Failed
$row."Success %" = [math]::Round($Completed / $row.Total * 100,2)
$Results += $row
}
Above code will give me output as :
Date Total Completed Failed Success %
20-03-2022 2 2 0 100
21-03-2022 3 2 1 66.67
22-03-2022 3 1 2 33.33
But I am looking for one entry per VM per day: if any retry completes, the earlier failures should be ignored and the VM counted as completed.
Date Total Completed Failed Success %
20-03-2022 2 2 0 100 -> Job completed for vm1 and vm2
21-03-2022 2 2 0 100 -> job failed for vm1 first, but in second try it completed. same day 2 entries for vm1(Failed and Completed. Ignore failure and take only completed)
22-03-2022 2 1 1 50 -> vm2 failed on both attempt. so it has to take as 1 entry. ignore the duplicate run.
This seems to work. Needless to say, you're displaying the information in a quite unorthodox way; I believe the code you currently have is how the information should be displayed.
Using this CSV for demonstration:
$csv = ConvertFrom-Csv @"
Name,Start_time,Status
vm1,20-03-2022,Completed
vm2,20-03-2022,Completed
vm1,21-03-2022,Failed
vm1,21-03-2022,Completed
vm2,21-03-2022,Completed
vm1,22-03-2022,Completed
vm2,22-03-2022,Failed
vm2,22-03-2022,Failed
vm1,23-03-2022,Failed
vm1,23-03-2022,Failed
vm2,23-03-2022,Failed
vm2,23-03-2022,Failed
"@
Code:
$csv | Group-Object Start_Time | ForEach-Object {
$completed = 0; $failed = 0
$thisGroup = $_.Group | Group-Object Name
foreach($group in $thisGroup) {
if('Completed' -in $group.Group.Status) {
$completed++
continue
}
$failed++
}
$total = $completed + $failed
[pscustomobject]@{
Start_Date = $_.Name
Total = $total
Completed = $completed
Failed = $failed
Success = ($completed / $total).ToString('P0')
}
} | Format-Table
Result:
Start_Date Total Completed Failed Success
---------- ----- --------- ------ -------
20-03-2022 2 2 0 100%
21-03-2022 2 2 0 100%
22-03-2022 2 1 1 50%
23-03-2022 2 0 2 0%
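Another way to look at the same rule is as a preprocessing step: collapse the data to one row per VM per day first, then run any ordinary counting code over the collapsed rows. The following is a rough sketch of that idea, not a drop-in replacement; the variable names $deduped and $summary are made up for illustration, and it assumes Completed and Failed are the only statuses.

```powershell
$csv = ConvertFrom-Csv @"
Name,Start_time,Status
vm1,20-03-2022,Completed
vm2,20-03-2022,Completed
vm1,21-03-2022,Failed
vm1,21-03-2022,Completed
vm2,21-03-2022,Completed
vm1,22-03-2022,Completed
vm2,22-03-2022,Failed
vm2,22-03-2022,Failed
"@

# One row per (day, VM): Completed if any attempt that day completed, else Failed.
$deduped = $csv | Group-Object Start_time, Name | ForEach-Object {
    [pscustomobject]@{
        Name       = $_.Group[0].Name
        Start_time = $_.Group[0].Start_time
        Status     = if ('Completed' -in $_.Group.Status) { 'Completed' } else { 'Failed' }
    }
}

# Plain per-day counting over the collapsed rows.
$summary = $deduped | Group-Object Start_time | ForEach-Object {
    $completed = @($_.Group | Where-Object Status -eq 'Completed').Count
    [pscustomobject]@{
        Date      = $_.Name
        Total     = $_.Count
        Completed = $completed
        Failed    = $_.Count - $completed
    }
}
$summary | Format-Table
```

The upside of splitting it this way is that $deduped can be fed to the original loop from the question unchanged.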

Input validation/data filtering with Powershell

I've been working on this for a little while, basically all I'm doing is grabbing the licensing from our clients' Office 365 tenant and displaying it in a more readable manner since Office 365 Powershell doesn't output subscription names in a common name. This works perfectly when every subscription is in my CSV of subscription names, however on occasions where the SKU is not in my list (brand new or legacy offerings) the licensing table doesn't populate correctly because it can't find the friendlyname in my CSV (licensing quantities don't match the subscription name because of a blank record when it failed to find the SKU).
What I'm trying to have it do is display the skupartnumber in place of the friendlyname when the subscription is not in my CSV, instead of breaking the output. The first snippet below is my current script, which only works if the SKU is in my CSV; the one below it is my best attempt at some input validation, but I just can't get it to work. Everything displays correctly except the Subscription column, which is blank (I also notice that it takes about 5x as long to run as normal). I would greatly appreciate any assistance offered; thanks!
Works as long as subscription is in my CSV:
$sku = Get-MsolAccountSku | select-object skupartnumber,ActiveUnits,suspendedUnits,ConsumedUnits | sort-object -property skupartnumber
$skudata = import-csv -Header friendlyname,skupartnumber "C:\PShell\cspcatalogalphabet.csv" | where-object {$sku.skupartnumber -eq $_.skupartnumber} | sort-object -property skupartnumber
$result = for ($n = 0; $n -lt @($skudata).Count; $n++) {
[PsCustomObject]@{
Subscription = @($skudata.friendlyname)[$n]
Active = $sku.ActiveUnits[$n]
Suspended = $sku.SuspendedUnits[$n]
Assigned = $sku.ConsumedUnits[$n]
}
}
$result | Format-Table -AutoSize
# Output:
Subscription Active Suspended Assigned
------------ ------ --------- --------
Microsoft Flow Free 10000 0 1
Power Bi (Free) 1000000 0 1
Microsoft Teams Exploratory 100 0 6
My best attempt at input validation which results in no data being read into $skulist.Subscription:
$sku = Get-MsolAccountSku | select-object skupartnumber,ActiveUnits,suspendedUnits,ConsumedUnits
$skulist = import-csv -Header friendlyname,skupartnumber "C:\PShell\cspcatalogalphabet.csv"
$skuname = for ($c = 0; $c -lt @($sku).count; $c++) {
if ($sku.skupartnumber[$c] -in $skulist.skupartnumber) {
[PsCustomObject]@{
Subscription = $skulist.friendlyname | where-object {$sku.skupartnumber[$c] -eq $skulist.skupartnumber}
}
}
else {
[PSCustomObject]@{
Subscription = @($sku.skupartnumber)[$c]
}
}
}
$table = for ($n = 0; $n -lt @($sku).Count; $n++) {
[PsCustomObject]@{
Subscription = @($skuname.Subscription)[$n]
Active = $sku.ActiveUnits[$n]
Suspended = $sku.SuspendedUnits[$n]
Assigned = $sku.ConsumedUnits[$n]
}
}
$table | format-table -AutoSize
# Output:
Subscription Active Suspended Assigned
------------ ------ --------- --------
10000 0 1
1000000 0 1
100 0 6
# An example of data I am grabbing from our clients' accounts:
$sku = Get-MsolAccountSku | select-object skupartnumber,ActiveUnits,suspendedUnits,ConsumedUnits
$sku
# Output:
SkuPartNumber ActiveUnits SuspendedUnits ConsumedUnits
------------- ----------- -------------- -------------
FLOW_FREE 10000 0 1
POWER_BI_STANDARD 1000000 0 1
TEAMS_EXPLORATORY 100 0 6
Instead of relying on the two arrays being aligned (that is, index $n in $sku must correspond to the item at index $n in $skulist for your code to work), you'll want to resolve a value in $skulist based on the actual $sku.SkuPartNumber value instead.
So how does one do that?!
Feed your $skulist into a [hashtable] instead:
$skulist = @{}
Import-Csv -Header friendlyname,skupartnumber "C:\PShell\cspcatalogalphabet.csv" |ForEach-Object {
$skulist[$_.skupartnumber] = $_.friendlyname
}
And then iterate over $sku like this (notice there's no need for a for(;;) loop anymore, we don't need to care about array alignment!):
foreach($skuEntry in $sku){
    [pscustomobject]@{
        # Fall back to the raw SKU part number when no friendly name is known
        Subscription = if($skuList.ContainsKey($skuEntry.SKUPartNumber)){$skuList[$skuEntry.SKUPartNumber]}else{$skuEntry.SKUPartNumber}
        Active = $skuEntry.ActiveUnits
        Suspended = $skuEntry.SuspendedUnits
        Assigned = $skuEntry.ConsumedUnits
    }
}
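To try the lookup-with-fallback idea without a tenant or the CSV at hand, a small self-contained sketch could look like this. The hashtable contents and the two $sku objects are hand-made stand-ins (assumptions) for the CSV catalog and for the Get-MsolAccountSku output shown in the question; TEAMS_EXPLORATORY is deliberately left out of the catalog to exercise the fallback.

```powershell
# Hypothetical catalog: SKU part number -> friendly name (normally built from the CSV).
$skuList = @{
    'FLOW_FREE'         = 'Microsoft Flow Free'
    'POWER_BI_STANDARD' = 'Power Bi (Free)'
}

# Hypothetical stand-in for the Get-MsolAccountSku | Select-Object ... output.
$sku = @(
    [pscustomobject]@{ SkuPartNumber = 'FLOW_FREE'; ActiveUnits = 10000; SuspendedUnits = 0; ConsumedUnits = 1 }
    [pscustomobject]@{ SkuPartNumber = 'TEAMS_EXPLORATORY'; ActiveUnits = 100; SuspendedUnits = 0; ConsumedUnits = 6 }
)

$table = foreach ($skuEntry in $sku) {
    $part = $skuEntry.SkuPartNumber
    [pscustomobject]@{
        # Use the friendly name when the catalog has one, otherwise the raw part number.
        Subscription = if ($skuList.ContainsKey($part)) { $skuList[$part] } else { $part }
        Active       = $skuEntry.ActiveUnits
        Suspended    = $skuEntry.SuspendedUnits
        Assigned     = $skuEntry.ConsumedUnits
    }
}
$table | Format-Table -AutoSize
```

Because each row is built from a single $skuEntry, the quantities can no longer drift out of step with the subscription name when a SKU is missing from the catalog.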

Group by and add or subtract depending on the indicator

I have a file which has transaction_date, transaction_amount and debit_credit_indicator. I want to write a program which shows for each date total count and total amount.
Total amount is calculated as follows -
if debit_credit_indicator is 'C' add else if 'D' subtract.
I got as far as grouping by indicator but don't know how to proceed afterwards.
My output looks like this:
TRANSACTION_DATE DEBIT_CREDIT_INDICA TotalAmount Count
TOR
---------------- ------------------- ----------- -----
2019-02-26 C 1478
2019-02-25 D 100
2019-02-26 D 200
param([string]$inputFileName=30)
(Get-Content $inputFileName) -replace '\|', ',' | Set-Content c:\learnpowershell\test.csv
$transactionData = Import-csv c:\learnpowershell\test.csv | Group-Object -Property TRANSACTION_DATE, DEBIT_CREDIT_INDICATOR
[Array] $newsbData += foreach($gitem in $transactionData)
{
$gitem.group | Select -Unique TRANSACTION_DATE, DEBIT_CREDIT_INDICATOR, `
@{Name = 'TotalAmount';Expression = {(($gitem.group) | measure -Property TRANSACTION_AMOUNT -sum).sum}},
@{Name = 'Count';Expression = {(($gitem.group) | Measure-Object -count).count}}
};
write-output $newsbData
I suppose you replace '|' with ',' because you don't know about the -Delimiter option of Import-Csv; otherwise, keep your replace code. Here is my code for your problem:
# import and group by date
import-csv "c:\learnpowershell\test.csv" -Delimiter '|' | group TRANSACTION_DATE | %{
$TotalCredit=0
$TotalDebit=0
$CountRowCredit=0
$CountRowDebit=0
$HasProblem=$false
#calculation by date for every group
$_.Group | %{
if ($_.DEBIT_CREDIT_INDICATOR -EQ 'C')
{
$TotalCredit+=$_.transaction_amount
$CountRowCredit++
}
elseif ($_.DEBIT_CREDIT_INDICATOR -EQ 'D')
{
$TotalDebit+=$_.transaction_amount
$CountRowDebit++
}
else
{
$HasProblem=$true
}
}
#output result
[pscustomobject]@{
TRANSACTION_DATE=$_.Name
CountRow=$_.Count
Credit_Total=$TotalCredit
Credit_CountRow=$CountRowCredit
Debit_Total=-$TotalDebit
Debit_CountRow=$CountRowDebit
Total_DebitCredit=$TotalCredit - $TotalDebit
HasProblem=$HasProblem
}
}
You can add ' | Format-Table ' if you want to print the result formatted as a table.
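If all you need is the count and the net amount per date (credits added, debits subtracted), a shorter sketch might be enough. The three sample rows below are made up for illustration, and the cast to [decimal] is my own choice to avoid treating CSV strings as text; rows with any other indicator are simply skipped here, which you may or may not want.

```powershell
# Made-up sample rows; in practice: Import-Csv <file> -Delimiter '|'
$rows = @'
TRANSACTION_DATE|TRANSACTION_AMOUNT|DEBIT_CREDIT_INDICATOR
2019-02-25|100|D
2019-02-26|1478|C
2019-02-26|200|D
'@ | ConvertFrom-Csv -Delimiter '|'

$totals = $rows | Group-Object TRANSACTION_DATE | ForEach-Object {
    $net = 0
    foreach ($row in $_.Group) {
        # CSV fields are strings, so convert before doing arithmetic.
        $amount = [decimal]$row.TRANSACTION_AMOUNT
        if ($row.DEBIT_CREDIT_INDICATOR -eq 'C') { $net += $amount }
        elseif ($row.DEBIT_CREDIT_INDICATOR -eq 'D') { $net -= $amount }
    }
    [pscustomobject]@{
        TRANSACTION_DATE = $_.Name
        Count            = $_.Count
        TotalAmount      = $net
    }
}
$totals | Format-Table
```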

Conditional criteria in powershell group measure-object?

I have data in this shape:
externalName,day,workingHours,hoursAndMinutes
PRJF,1,11,11:00
PRJF,2,11,11:00
PRJF,3,0,0:00
PRJF,4,0,0:00
CFAW,1,11,11:00
CFAW,2,11,11:00
CFAW,3,11,11:00
CFAW,4,11,11:00
CFAW,5,0,0:00
CFAW,6,0,0:00
and so far code is
$gdata = Import-csv $filepath\$filename | Group-Object -Property Externalname;
$test = @()
$test += foreach($rostername in $gdata) {
$rostername.Group | Select -Unique externalName,
@{Name = 'AllDays';Expression = {(($rostername.Group) | measure -Property day).count}}
}
$test;
What I can't work out is how to do a conditional count of the lines where workingHours is non-zero.
The aim is to produce two lines:
PRJF, 4, 2, 11
CFAW, 6, 4, 11
i.e. Roster name, roster length, days on, average hours worked per day on.
You need a Where-Object to filter for non-zero workingHours.
I'd use a [PSCustomObject] to generate a new table.
EDIT: a bit more efficient with only one Measure-Object.
## Q:\Test\2018\08\06\SO_51700660.ps1
$filepath = 'Q:\Test\2018\08\06'
$filename = 'SO_S1700660.csv'
$gdata = Import-Csv (Join-Path $filepath $filename) | Group-Object -Property Externalname
$test = ForEach($Roster in $gdata) {
    # Keep only the non-zero working hours, then measure them once
    $WH = ($Roster.Group.Workinghours|Where-Object {$_ -ne 0}|Measure-Object -Average -Sum)
    [PSCustomObject]@{
        RosterName = $Roster.Name
        RosterLength = $Roster.Count
        DaysOn = $WH.count
        AvgHours = $WH.Average
        TotalHours = $WH.Sum
    }
}
$test | Format-Table
Sample output:
> .\SO_51700660.ps1
RosterName RosterLength DaysOn AvgHours TotalHours
---------- ------------ ------ -------- ----------
PRJF 4 2 11 22
CFAW 6 4 11 44
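One thing to keep in mind: Import-Csv yields strings, so a filter like {$_ -ne 0} compares text, and a value written as 0.0 would not be treated as zero. If your data might contain such values, converting to a number before filtering is a possible variation. The sketch below uses the sample rows from the question; the [double] cast and the variable names are my own additions rather than part of the answer above.

```powershell
$data = @'
externalName,day,workingHours,hoursAndMinutes
PRJF,1,11,11:00
PRJF,2,11,11:00
PRJF,3,0,0:00
PRJF,4,0,0:00
CFAW,1,11,11:00
CFAW,2,11,11:00
CFAW,3,11,11:00
CFAW,4,11,11:00
CFAW,5,0,0:00
CFAW,6,0,0:00
'@ | ConvertFrom-Csv

$report = $data | Group-Object externalName | ForEach-Object {
    # Convert each value to a number first, then keep the non-zero ones.
    $worked = @($_.Group | ForEach-Object { [double]$_.workingHours } | Where-Object { $_ -ne 0 })
    $stats = $worked | Measure-Object -Average -Sum
    [pscustomobject]@{
        RosterName   = $_.Name
        RosterLength = $_.Count
        DaysOn       = $worked.Count
        AvgHours     = $stats.Average
        TotalHours   = $stats.Sum
    }
}
$report | Format-Table
```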

Aggregating tasks by duration for each day in a week/month? (PoSH)

I am parsing JSON from a web service to get my tasks (using a TimeFlip). Right now, I get back each task, when it occurred, and duration, so the data looks like this:
(taskname, start, durationinSec)
TaskA,"6/5/2018 12:16:36 PM",312
TaskB,"6/5/2018 12:30:36 PM",200
TaskA,"6/6/2018 08:00:00 AM",150
TaskA,"6/6/2018 03:00:00 PM",150
(etc etc)
I would like to generate a rollup report, showing by day which tasks had how much time.
While the data will span weeks, I'm just trying to do a weekly report that I can easily transcribe into our time app (since they won't give me an API key). So I'll filter first with something like ? {$_.start -gt (get-date -Hour 0 -Minute 00 -Second 00).adddays(-7)}.
       6/5/2018  6/6/2018
TaskA  312       300
TaskB  200
How can I do that? I assume group-object, but unclear how you'd do either the pivot or even the grouping.
The following doesn't output a pivot table, but performs the desired grouping and aggregation:
$rows = @'
taskname,start,durationinSec
TaskA,"6/5/2018 12:16:36 PM",312
TaskB,"6/5/2018 12:30:36 PM",200
TaskA,"6/6/2018 08:00:00 AM",150
TaskA,"6/6/2018 03:00:00 PM",150
'@ | ConvertFrom-Csv
$rows | Group-Object { (-split $_.start)[0] }, taskname | ForEach-Object {
    $_ | Select-Object @{ n='Date'; e={$_.Values[0]} },
                       @{ n='Task'; e={$_.Values[1]} },
                       @{ n='Duration'; e={ ($_.Group | Measure-Object durationInSec -Sum).Sum } }
}
(-split $_.start)[0] splits each start value by whitespace and returns the first token ([0]), which is the date portion of the time stamp; e.g., 6/5/2018 is returned for 6/5/2018 12:16:36 PM; passing this operation as a script block ({ ... }) to Group-Object means that grouping happens by date only, not also time (in addition to grouping by taskname).
This yields:
Date Task Duration
---- ---- --------
6/5/2018 TaskA 312
6/5/2018 TaskB 200
6/6/2018 TaskA 300
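If you would rather group on a real date than on the first whitespace-separated token, one possible variation is to cast start to [datetime] and format it as yyyy-MM-dd, which also sorts correctly as text. This is just a sketch and assumes the timestamps are in US month/day order, as in the sample; $byDay is a made-up variable name.

```powershell
$rows = @'
taskname,start,durationinSec
TaskA,"6/5/2018 12:16:36 PM",312
TaskB,"6/5/2018 12:30:36 PM",200
TaskA,"6/6/2018 08:00:00 AM",150
TaskA,"6/6/2018 03:00:00 PM",150
'@ | ConvertFrom-Csv

# Group by a sortable date key (time of day dropped) and by task name.
$byDay = $rows |
    Group-Object { ([datetime]$_.start).ToString('yyyy-MM-dd', [cultureinfo]::InvariantCulture) }, taskname |
    ForEach-Object {
        [pscustomobject]@{
            Date     = $_.Values[0]
            Task     = $_.Values[1]
            Duration = ($_.Group | Measure-Object durationInSec -Sum).Sum
        }
    }
$byDay | Format-Table
```

The grouping and the summing are the same as above; only the key expression differs.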
To construct pivot-table-like output requires substantially more effort, and it won't be fast:
Assume that $objs contains the objects created above ($objs = $rows | Group-Object ...).
# Get all distinct dates.
$dates = $objs | Select-Object -Unique -ExpandProperty Date
# Get all distinct tasks.
$tasks = $objs | Select-Object -Unique -ExpandProperty Task
# Create an ordered hashtable that contains an entry for each task that
# holds a nested hashtable with (empty-for-now) entries for all dates.
$ohtPivot = [ordered] @{}
$tasks | ForEach-Object {
    $ohtDates = [ordered] @{}
    $dates | ForEach-Object { $ohtDates[$_] = $null }
    $ohtPivot[$_] = $ohtDates
}
# Fill the hashtable from the grouped objects with the task- and
# date-specific durations.
$objs | ForEach-Object { $ohtPivot[$_.Task][$_.Date] = $_.Duration }
# Output the resulting hashtable in pivot-table-like form by transforming
# each entry into a custom object
$ohtPivot.GetEnumerator() | ForEach-Object {
[pscustomobject] @{ Task = $_.Key } | Add-Member -PassThru -NotePropertyMembers $_.Value
}
The above yields:
Task 6/5/2018 6/6/2018
---- -------- --------
TaskA 312 300
TaskB 200
Googling for PowerShell and Pivot, I found this gist on gist.github.com with a more universal way to create the PivotTable.
To transpose (swap x and y), you simply swap the values of the variables $rotate and $keep.
It has the additional benefit of calculating a row total.
## Q:\Test\2018\06\09\PivotTable.ps1
## Source https://gist.github.com/andyoakley/1651859
# #############################################################################
# Rotates a vertical set similar to an Excel PivotTable
# #############################################################################
$OutputFile = "MyPivot.csv"
$data = @'
taskname,start,duration
TaskA,"6/5/2018 12:16:36 PM",312
TaskB,"6/5/2018 12:30:36 PM",200
TaskA,"6/6/2018 08:00:00 AM",150
TaskA,"6/6/2018 03:00:00 PM",150
'@ | ConvertFrom-Csv |Select-Object taskname, duration, @{n='start';e={($_.start -split ' ')[0]}}
# Fields of interest
$rotate = "taskname" # Bits along the top
$keep = "start" # Those along the side
$value = "duration" # What to total
#-------------------- No need to change anything below ------------------------
# Create variable to store the output
$rows = @()
# Find the unique "Rotate" [top row of the pivot] values and sort ascending
$pivots = $data | select -unique $rotate | foreach { $_.$rotate} | Sort-Object
# Step through the original data...
# for each of the "Keep" [left hand side] find the Sum of the "Value" for each "Rotate"
$data |
group $keep |
foreach {
$group = $_.Group
# Create the data row and name it as per the "Keep"
$row = new-object psobject
$row | add-member NoteProperty $keep $_.Name
# Cycle through the unique "Rotate" values and get the sum
foreach ($pivot in $pivots) {
$row | add-member NoteProperty $pivot ($group | where { $_.$rotate -eq $pivot } | measure -sum $value).Sum
}
# Add the total to the row
$row | add-member NoteProperty Total ($group | measure -sum $value).Sum
# Add the row to the collection
$rows += $row
}
# Do something with the pivot rows
$rows | Format-Table
$rows | Export-Csv $OutputFile -NoTypeInformation
Sample output:
start TaskA TaskB Total
----- ----- ----- -----
6/5/2018 312 200 512
6/6/2018 300 300
Or x/y swapped
taskname 6/5/2018 6/6/2018 Total
-------- -------- -------- -----
TaskA 312 300 612
TaskB 200 200