PowerShell - Group and count unique values from CSV file based on a column

I am trying to get the total number of completed and failed jobs for each day. If a job fails for a specific VM (Name field) on a specific day, the operation is retried. If it completes on the second attempt, I want to ignore the failed count for it and reduce the total count accordingly.
Example:
My code
$csv = ConvertFrom-Csv @"
Name,Start_time,Status
vm1,20-03-2022,Completed
vm2,20-03-2022,Completed
vm1,21-03-2022,Failed
vm1,21-03-2022,Completed
vm2,21-03-2022,Completed
vm1,22-03-2022,Completed
vm2,22-03-2022,Failed
vm2,22-03-2022,Failed
"@
$Results = @()
foreach ($group in $csv | Group Start_time)
{
$Completed = ($group.Group | group status | ? Name -eq Completed).Count
$Failed = ($group.Group | group status | ? Name -eq Failed).Count
$row = "" | Select Date,Total,Completed,Failed,"Success %"
$row.Date = $group.Name
$row.Total = $group.Count
$row.Completed = $Completed
$row.Failed = $Failed
$row."Success %" = [math]::Round($Completed / $row.Total * 100,2)
$Results += $row
}
The above code gives me this output:
Date        Total  Completed  Failed  Success %
20-03-2022  2      2          0       100
21-03-2022  3      2          1       66.67
22-03-2022  3      1          2       33.33
But I am looking for one entry per VM per day, where a failure is ignored whenever a retry completed:
Date        Total  Completed  Failed  Success %
20-03-2022  2      2          0       100    -> Job completed for vm1 and vm2.
21-03-2022  2      2          0       100    -> Job failed for vm1 first, but completed on the second try. Two entries for vm1 on the same day (Failed and Completed): ignore the failure and count only Completed.
22-03-2022  2      1          1       50     -> vm2 failed on both attempts, so it has to count as 1 entry. Ignore the duplicate run.

This seems to work. That said, you're displaying the information in quite an unorthodox way; I believe the output of the code you currently have is how the information should be displayed.
Using this CSV for demonstration:
$csv = ConvertFrom-Csv @"
Name,Start_time,Status
vm1,20-03-2022,Completed
vm2,20-03-2022,Completed
vm1,21-03-2022,Failed
vm1,21-03-2022,Completed
vm2,21-03-2022,Completed
vm1,22-03-2022,Completed
vm2,22-03-2022,Failed
vm2,22-03-2022,Failed
vm1,23-03-2022,Failed
vm1,23-03-2022,Failed
vm2,23-03-2022,Failed
vm2,23-03-2022,Failed
"@
Code:
$csv | Group-Object Start_Time | ForEach-Object {
$completed = 0; $failed = 0
$thisGroup = $_.Group | Group-Object Name
foreach($group in $thisGroup) {
if('Completed' -in $group.Group.Status) {
$completed++
continue
}
$failed++
}
$total = $completed + $failed
[pscustomobject]@{
Start_Date = $_.Name
Total = $total
Completed = $completed
Failed = $failed
Success = ($completed / $total).ToString('P0')
}
} | Format-Table
Result:
Start_Date  Total  Completed  Failed  Success
----------  -----  ---------  ------  -------
20-03-2022  2      2          0       100%
21-03-2022  2      2          0       100%
22-03-2022  2      1          1       50%
23-03-2022  2      0          2       0%
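The "any Completed attempt wins" rule can also be expressed by first collapsing the retries into one row per day and VM. The following is only a sketch of that idea, not the answer's code: it assumes PowerShell 3 or later, uses a trimmed copy of the sample data, and the variable name $perVm is made up for illustration.

```powershell
$csv = ConvertFrom-Csv @"
Name,Start_time,Status
vm1,21-03-2022,Failed
vm1,21-03-2022,Completed
vm2,21-03-2022,Completed
vm1,22-03-2022,Completed
vm2,22-03-2022,Failed
vm2,22-03-2022,Failed
"@

# Group on both columns so each group holds every attempt of one VM on one day.
# $perVm is a hypothetical name; a group counts as Completed if any attempt succeeded.
$perVm = $csv | Group-Object Start_time, Name | ForEach-Object {
    [pscustomobject]@{
        Start_time = $_.Group[0].Start_time
        Name       = $_.Group[0].Name
        Status     = if ($_.Group.Status -contains 'Completed') { 'Completed' } else { 'Failed' }
    }
}
$perVm
```

The collapsed rows can then be grouped by Start_time and counted with the same kind of code the question already uses.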

Related

Powershell - Group and count from CSV file

In my CSV file, I have two columns with the headers Start_date and Status. I am trying to find out the success percentage for each Start_date.
Start_date  Status
----------  ---------
02-03-2022  Completed
02-03-2022  Completed
03-03-2022  Failed
03-03-2022  Completed
I am looking for a final output like the one below, which I will export to CSV:
Start_date  Total  Completed  Failed  Success %
02-03-2022  2      2          0       100
03-03-2022  2      1          1       50
As a first step, I am trying to get the count of each day's jobs using the code below.
$data = Import-Csv "C:\file.csv"
$data | group {$_.Start_date} | Sort-Object {$_.Start_date} | Select-Object {$_.Status}, Count
The above code gives me output like this:
$_.Status  Count
---------  -----
           1
           1
It is not showing the date value. What would be the correct approach for this issue?
You can use Group-Object to group the objects by the date column, then it's just math:
$csv = @'
Start_date,Status
02-03-2022,Completed
02-03-2022,Completed
03-03-2022,Failed
03-03-2022,Completed
'@ | ConvertFrom-Csv
# This from your side should be:
# Import-Csv path/to/csv.csv | Group-Object ....
$csv | Group-Object Start_date | ForEach-Object {
$completed, $failed = $_.Group.Status.where({ $_ -eq 'Completed' }, 'Split')
$totalc = $_.Group.Count
$complc = $completed.Count
$failc = $failed.Count
$success = $complc / $totalc
[pscustomobject]@{
Start_Date = $_.Name
Total = $totalc
Completed = $complc
Failed = $failc
Success = $success.ToString('P0')
}
}
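The 'Split' mode of the .Where() method used above returns two collections: the items that match the filter, followed by the items that don't. A small stand-alone sketch with made-up status values (the method needs PowerShell 4 or later):

```powershell
# Hypothetical status list, standing in for $_.Group.Status of one date group.
$statuses = 'Completed', 'Failed', 'Completed', 'Completed'

# First variable receives the matches, second receives everything else.
$completed, $failed = $statuses.Where({ $_ -eq 'Completed' }, 'Split')

"Completed: $($completed.Count), Failed: $($failed.Count)"
# Completed: 3, Failed: 1
```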
Here's another one:
$csv = Import-Csv C:\Temp\tmp.csv
$Results = @()
foreach ($group in $csv | Group Start_date)
{
$Completed = ($group.Group | group status | ? Name -eq Completed).Count
$Failed = ($group.Group | group status | ? Name -eq Failed).Count
$row = "" | Select Start_date,Total,Completed,Failed,"Success %"
$row.Start_date = $group.Name
$row.Total = $group.Count
$row.Completed = $Completed
$row.Failed = $Failed
$row."Success %" = $Completed / $row.Total * 100
$Results += $row
}
$results
Start_date  Total  Completed  Failed  Success %
----------  -----  ---------  ------  ---------
02-03-2022  2      2          0       100
03-03-2022  2      1          1       50
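As for why the question's first attempt showed no date: the objects that Group-Object emits carry the properties Name, Count and Group rather than the original columns, so $_.Status and $_.Start_date are empty at that point in the pipeline. A minimal sketch of just the counting step, using an inline copy of the sample data in place of Import-Csv:

```powershell
$data = ConvertFrom-Csv @'
Start_date,Status
02-03-2022,Completed
02-03-2022,Completed
03-03-2022,Failed
03-03-2022,Completed
'@

# Name holds the grouped value (the date), Count the number of rows in the group.
$counts = $data | Group-Object Start_date | Sort-Object Name | Select-Object Name, Count
$counts
```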

Input validation/data filtering with Powershell

I've been working on this for a little while, basically all I'm doing is grabbing the licensing from our clients' Office 365 tenant and displaying it in a more readable manner since Office 365 Powershell doesn't output subscription names in a common name. This works perfectly when every subscription is in my CSV of subscription names, however on occasions where the SKU is not in my list (brand new or legacy offerings) the licensing table doesn't populate correctly because it can't find the friendlyname in my CSV (licensing quantities don't match the subscription name because of a blank record when it failed to find the SKU).
What I'm trying to have it do is display the skupartnumber in place of the friendlyname when the subscription is not in my CSV, instead of breaking the output. The first snippet below is my current script, which only works if the SKU is in my CSV; the one below it is my best attempt at some input validation, but I just can't get it to work. Everything displays correctly except the Subscription column, which is blank (I also notice that it takes about 5x as long to run as normal). I would greatly appreciate any assistance offered; thanks!
Works as long as subscription is in my CSV:
$sku = Get-MsolAccountSku | select-object skupartnumber,ActiveUnits,suspendedUnits,ConsumedUnits | sort-object -property skupartnumber
$skudata = import-csv -Header friendlyname,skupartnumber "C:\PShell\cspcatalogalphabet.csv" | where-object {$sku.skupartnumber -eq $_.skupartnumber} | sort-object -property skupartnumber
$result = for ($n = 0; $n -lt @($skudata).Count; $n++) {
[PsCustomObject]@{
Subscription = @($skudata.friendlyname)[$n]
Active = $sku.ActiveUnits[$n]
Suspended = $sku.SuspendedUnits[$n]
Assigned = $sku.ConsumedUnits[$n]
}
}
$result | Format-Table -AutoSize
# Output:
Subscription                 Active   Suspended  Assigned
------------                 ------   ---------  --------
Microsoft Flow Free          10000    0          1
Power Bi (Free)              1000000  0          1
Microsoft Teams Exploratory  100      0          6
My best attempt at input validation which results in no data being read into $skulist.Subscription:
$sku = Get-MsolAccountSku | select-object skupartnumber,ActiveUnits,suspendedUnits,ConsumedUnits
$skulist = import-csv -Header friendlyname,skupartnumber "C:\PShell\cspcatalogalphabet.csv"
$skuname = for ($c = 0; $c -lt @($sku).count; $c++) {
if ($sku.skupartnumber[$c] -in $skulist.skupartnumber) {
[PsCustomObject]@{
Subscription = $skulist.friendlyname | where-object {$sku.skupartnumber[$c] -eq $skulist.skupartnumber}
}
}
else {
[PSCustomObject]@{
Subscription = @($sku.skupartnumber)[$c]
}
}
}
$table = for ($n = 0; $n -lt @($sku).Count; $n++) {
[PsCustomObject]@{
Subscription = @($skuname.Subscription)[$n]
Active = $sku.ActiveUnits[$n]
Suspended = $sku.SuspendedUnits[$n]
Assigned = $sku.ConsumedUnits[$n]
}
}
$table | format-table -AutoSize
# Output:
Subscription  Active   Suspended  Assigned
------------  ------   ---------  --------
              10000    0          1
              1000000  0          1
              100      0          6
# An example of data I am grabbing from our clients' accounts:
$sku = Get-MsolAccountSku | select-object skupartnumber,ActiveUnits,suspendedUnits,ConsumedUnits
$sku
# Output:
SkuPartNumber      ActiveUnits  SuspendedUnits  ConsumedUnits
-------------      -----------  --------------  -------------
FLOW_FREE          10000        0               1
POWER_BI_STANDARD  1000000      0               1
TEAMS_EXPLORATORY  100          0               6
Instead of relying on the two arrays being aligned - that is, index $n in $sku must correspond to the item at index $n in $skulist for your code to work - you'll want to be able to resolve a value in $skulist based on the actual SkuPartNumber value of each $sku entry instead.
So how does one do that?!
Feed your $skulist into a [hashtable] instead:
$skulist = @{}
Import-Csv -Header friendlyname,skupartnumber "C:\PShell\cspcatalogalphabet.csv" | ForEach-Object {
$skulist[$_.skupartnumber] = $_.friendlyname
}
And then iterate over $sku like this (notice there's no need for a for(;;) loop anymore, we don't need to care about array alignment!):
foreach($skuEntry in $sku){
    [pscustomobject]@{
        # Fall back to the raw SKU part number when the CSV has no friendly name for it
        Subscription = if($skuList.ContainsKey($skuEntry.SKUPartNumber)){$skuList[$skuEntry.SKUPartNumber]}else{$skuEntry.SKUPartNumber}
        Active = $skuEntry.ActiveUnits
        Suspended = $skuEntry.SuspendedUnits
        Assigned = $skuEntry.ConsumedUnits
    }
}
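To try the lookup-with-fallback idea without a tenant or the CSV file, here is a self-contained sketch. The hashtable and the $sku array are hand-built stand-ins (values copied from the question's sample output), and TEAMS_EXPLORATORY is deliberately left out of the lookup to play the role of a SKU that is missing from the CSV.

```powershell
# Stand-in for the CSV lookup; friendly names are taken from the question's sample output.
$skuList = @{
    FLOW_FREE         = 'Microsoft Flow Free'
    POWER_BI_STANDARD = 'Power Bi (Free)'
}

# Stand-in for Get-MsolAccountSku output (assumed shape: one object per SKU).
$sku = @(
    [pscustomobject]@{ SkuPartNumber = 'FLOW_FREE';         ActiveUnits = 10000;   SuspendedUnits = 0; ConsumedUnits = 1 }
    [pscustomobject]@{ SkuPartNumber = 'POWER_BI_STANDARD'; ActiveUnits = 1000000; SuspendedUnits = 0; ConsumedUnits = 1 }
    [pscustomobject]@{ SkuPartNumber = 'TEAMS_EXPLORATORY'; ActiveUnits = 100;     SuspendedUnits = 0; ConsumedUnits = 6 }
)

$table = foreach ($skuEntry in $sku) {
    $friendly = $skuList[$skuEntry.SkuPartNumber]   # $null when the key is absent
    [pscustomobject]@{
        Subscription = if ($friendly) { $friendly } else { $skuEntry.SkuPartNumber }
        Active       = $skuEntry.ActiveUnits
        Suspended    = $skuEntry.SuspendedUnits
        Assigned     = $skuEntry.ConsumedUnits
    }
}
$table | Format-Table -AutoSize
```

Because each row is built from a single $skuEntry, the quantities can no longer drift away from the name they belong to, even when a lookup misses.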

Powershell Get-Random with Constraints

I'm currently using the Get-Random function of Powershell to randomly pull a set number of rows from a csv. I need to create a constraint that says if one id is pulled, find the other ids that match it and pull their value.
Here is what I currently have:
$chosenOnes = Import-CSV C:\Temp\pk2.csv | sort{Get-Random} | Select -first 6
$i = 1
$count = $chosenOnes | Group-Object householdID
foreach ($row in $count)
{
if ($row.count -gt 1)
{
$students = $row.Group.Student
foreach ($student in $students)
{
$name = $student.tostring()
#...do something
$i = $i + 1
}
}
else
{
$name = $row.Group.Student
if($i -le 5)
{
#...do something
}
else
{
#...do something
}
$i = $i + 1
}
}
Example dataset
ID,name
165,Ernest Hemingway
1204,Mark Twain
1578,Stephen King
1634,Charles Dickens
1726,George Orwell
7751,John Doe
7751,Tim Doe
In this example, there are 7 rows but I'm randomly selecting 6 in my code. What needs to happen is that when a row with ID=7751 is selected, I must return both rows where ID=7751. The IDs cannot be statically set in the code.
Use Get-Random directly, with -Count, to extract a given number of random elements from a collection.
$allRows = Import-CSV C:\Temp\pk2.csv
$chosenHouseholdIDs = ($allRows | Get-Random -Count 6).householdID
Then filter all rows by whether their householdID column contains one of the 6 randomly selected rows' householdID values (PSv3+ syntax), using the -in array-containment operator:
$allRows | Where-Object householdID -in $chosenHouseholdIDs
Optional reading: performance considerations:
$allRows | Get-Random -Count 6 is not only conceptually simpler, but also much faster than $allRows | Sort-Object { Get-Random } | Select-Object -First 6
Using the Time-Command function to compare the performance of two approaches, using a 1000-row test file with 10 columns yields the following sample timings on my Windows 10 VM in Windows PowerShell - note that the Sort-Object { Get-Random }-based solution is more than 15(!) times slower:
Factor  Secs (100-run avg.)  Command                                                         TimeSpan
------  -------------------  -------                                                         --------
1.00    0.007                $allRows | Get-Random -Count 6                                  00:00:00.0072520
15.65   0.113                $allRows | Sort-Object { Get-Random } | Select-Object -First 6  00:00:00.1134909
Similarly, a single pass through all rows to find matching IDs via array-containment operator -in performs much better than looping over the randomly selected IDs and searching all rows for each.
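Put together against the question's example dataset, the two steps might look like the sketch below. It assumes the column is called ID, as in the sample data (the question's own code calls it householdID), and it reads the rows from an inline here-string instead of the CSV file.

```powershell
$allRows = ConvertFrom-Csv @"
ID,name
165,Ernest Hemingway
1204,Mark Twain
1578,Stephen King
1634,Charles Dickens
1726,George Orwell
7751,John Doe
7751,Tim Doe
"@

# Pick 6 random rows, then keep only the distinct IDs they carry.
$chosenIDs = ($allRows | Get-Random -Count 6).ID | Select-Object -Unique

# Return every row whose ID was picked, so rows sharing an ID come back together.
$result = $allRows | Where-Object ID -in $chosenIDs
$result
```

Note that the result can contain more rows than were drawn: if only one of the two 7751 rows is picked, its sibling is pulled in as well.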
I tried sticking with your beginning and came up with this.
$Array = Import-CSV C:\test\StudtentTest.csv
$Array | Sort{Get-Random} | select -first 2 | %{
$id = $_.id
$Array | ?{$_.id -eq $id} | %{
$_
}
}
$Array will be your parsed CSV.
We pipe it in, sort it randomly, and select the first 2 rows (in this case).
We save the ID of each selected object into $id, then search the array for that ID and display every row that matches.
If a selected ID is shared by several rows, you end up with something like:
ID    name
--    ----
7751  John Doe
7751  Tim Doe
1634  Charles Dickens

Extract columns from text based table output

qfarm /load command shows me the load from my servers.
Output:
PS> qfarm /load
Server Name           Server Load  Load Throttling Load  Logon Mode
--------------------  -----------  --------------------  ------------------
SERVER-01             400          0                     AllowLogons
SERVER-02             1364         OFF                   AllowLogons
SERVER-03             1364         OFF                   AllowLogons
SERVER-04             1000         0                     AllowLogons
SERVER-05             700          0                     AllowLogons
SERVER-06             1200         0                     AllowLogons
I need to display only the first column (Server Name) and the second one (Server Load) and loop through them in order to apply some logic later, but it seems PowerShell doesn't see the output as objects with properties:
PS> qfarm /load | Select -ExpandProperty "Server Name"
Select-Object : Property "Server Name" cannot be found.
Is there any other possibility, like a table or something?
One way to do this is to build objects out of the command's output. Tested the following:
#requires -version 3
# sample data output from command
$sampleData = @"
Server Name           Server Load  Load Throttling Load  Logon Mode
--------------------  -----------  --------------------  ------------------
SERVER-01             400          0                     AllowLogons
SERVER-02             1364         OFF                   AllowLogons
SERVER-03             1364         OFF                   AllowLogons
SERVER-04             1000         0                     AllowLogons
SERVER-05             700          0                     AllowLogons
SERVER-06             1200         0                     AllowLogons
"@ -split "`n"
$sampleData | Select-Object -Skip 2 | ForEach-Object {
$len = $_.Length
[PSCustomObject] @{
"ServerName" = $_.Substring(0, 22).Trim()
"ServerLoad" = $_.Substring(22, 13).Trim() -as [Int]
"LoadThrottlingLoad" = $_.Substring(35, 22).Trim()
"LogonMode" = $_.Substring(57, $len - 57).Trim()
}
}
In your case, you should be able to replace $sampleData with your qfarm load command; e.g.:
qfarm /load | Select-Object -Skip 2 | ForEach-Object {
...
Of course, this assumes there are no blank lines in the output and that my column positions for the start of each item are correct.
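Once the lines have become objects, the "loop through them" part of the question is ordinary object handling. A minimal sketch, using a few hand-built stand-in objects in place of the parsed qfarm output and a made-up threshold of 1000 (neither the threshold nor the $busy variable comes from the question):

```powershell
# Stand-ins for the objects produced by the parsing code (values from the sample output).
$servers = @(
    [pscustomobject]@{ ServerName = 'SERVER-01'; ServerLoad = 400 }
    [pscustomobject]@{ ServerName = 'SERVER-02'; ServerLoad = 1364 }
    [pscustomobject]@{ ServerName = 'SERVER-04'; ServerLoad = 1000 }
)

$threshold = 1000   # hypothetical limit, purely for illustration

# Collect the names of servers whose load exceeds the threshold.
$busy = foreach ($server in $servers) {
    if ($server.ServerLoad -gt $threshold) {
        $server.ServerName
    }
}
$busy
```

The numeric comparison works because the parsing code casts ServerLoad with -as [Int]; left as a string, '400' -gt 1000 would not compare the way you expect.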
PowerShell version 2 equivalent:
#requires -version 2
function Out-Object {
param(
[Collections.Hashtable[]] $hashData
)
$order = @()
$result = @{}
$hashData | ForEach-Object {
$order += ($_.Keys -as [Array])[0]
$result += $_
}
New-Object PSObject -Property $result | Select-Object $order
}
# sample data output from command
$sampleData = @"
Server Name           Server Load  Load Throttling Load  Logon Mode
--------------------  -----------  --------------------  ------------------
SERVER-01             400          0                     AllowLogons
SERVER-02             1364         OFF                   AllowLogons
SERVER-03             1364         OFF                   AllowLogons
SERVER-04             1000         0                     AllowLogons
SERVER-05             700          0                     AllowLogons
SERVER-06             1200         0                     AllowLogons
"@ -split "`n"
$sampleData | Select-Object -Skip 2 | ForEach-Object {
$len = $_.Length
Out-Object `
@{"ServerName" = $_.Substring(0, 22).Trim()},
@{"ServerLoad" = $_.Substring(22, 13).Trim() -as [Int]},
@{"LoadThrottlingLoad" = $_.Substring(35, 22).Trim()},
@{"LogonMode" = $_.Substring(57, $len - 57).Trim()}
}
You can easily convert your table to PowerShell objects using the ConvertFrom-SourceTable cmdlet from the PowerShell Gallery:
$sampleData = ConvertFrom-SourceTable @"
Server Name           Server Load  Load Throttling Load  Logon Mode
--------------------  -----------  --------------------  ------------------
SERVER-01             400          0                     AllowLogons
SERVER-02             1364         OFF                   AllowLogons
SERVER-03             1364         OFF                   AllowLogons
SERVER-04             1000         0                     AllowLogons
SERVER-05             700          0                     AllowLogons
SERVER-06             1200         0                     AllowLogons
"@
And then select your columns like:
PS C:\> $SampleData | Select-Object "Server Name", "Server Load"
Server Name  Server Load
-----------  -----------
SERVER-01    400
SERVER-02    1364
SERVER-03    1364
SERVER-04    1000
SERVER-05    700
SERVER-06    1200
For details see: ConvertFrom-SourceTable -?
The ConvertFrom-SourceTable cmdlet is available for download at the PowerShell Gallery and the source code from the GitHub iRon7/ConvertFrom-SourceTable repository.
Command-line utilities return their outputs as a string array. This should work:
qfarm /load | ForEach-Object { $_.Substring(0,33) }
I have answered something very similar to this in the past. I have a larger function for this, but a simplified one works on a left-aligned string table just like the one shown in your example. See the linked answer for more explanation.
function ConvertFrom-LeftAlignedStringData{
param (
[string[]]$data
)
$headerString = $data[0]
$headerElements = $headerString -split "\s{2,}" | Where-Object{$_}
$headerIndexes = $headerElements | ForEach-Object{$headerString.IndexOf($_)}
$results = $data | Select-Object -Skip 2 | ForEach-Object{
$props = @{}
$line = $_
For($indexStep = 0; $indexStep -le $headerIndexes.Count - 1; $indexStep++){
$value = $null # Assume a null value
$valueLength = $headerIndexes[$indexStep + 1] - $headerIndexes[$indexStep]
$valueStart = $headerIndexes[$indexStep]
If(($valueLength -gt 0) -and (($valueStart + $valueLength) -lt $line.Length)){
$value = ($line.Substring($valueStart,$valueLength)).Trim()
} ElseIf ($valueStart -lt $line.Length){
$value = ($line.Substring($valueStart)).Trim()
}
$props.($headerElements[$indexStep]) = $value
}
New-Object -TypeName PsCustomObject -Property $props
}
return $results
}
$qfarmOutput = qfarm /load
ConvertFrom-LeftAlignedStringData $qfarmOutput | select "Server Name","Server Load"
This approach is based on the position of the header fields. Nothing is hardcoded; everything is built from those indexes and field names. Using $headerIndexes, we carve up every line and place the results, if present, into their respective columns. There is logic to ensure that we don't try to grab any part of the string that might not exist, and to treat the last field specially.
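To see what the header positions look like for this table, the first three lines of the function can be run on their own. In this sketch the header string is rebuilt with PadRight so the column widths (22, 13 and 22 characters, matching the dashed line) are explicit rather than dependent on whitespace surviving a copy and paste:

```powershell
# Rebuild the header with explicit column widths (assumed from the dashed separator line).
$headerString = 'Server Name'.PadRight(22) + 'Server Load'.PadRight(13) +
                'Load Throttling Load'.PadRight(22) + 'Logon Mode'

# Same logic as in the function: split on runs of 2+ spaces, then locate each name.
$headerElements = $headerString -split '\s{2,}' | Where-Object { $_ }
$headerIndexes  = $headerElements | ForEach-Object { $headerString.IndexOf($_) }

$headerIndexes -join ','
# 0,22,35,57
```

One thing to keep in mind: IndexOf returns the first occurrence, so a header whose full text also appears inside an earlier header would get the wrong start position. That does not happen with this table.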
Results
Server Name  Server Load
-----------  -----------
SERVER-01    400
SERVER-02    1364
SERVER-03    1364
SERVER-04    1000
SERVER-05    700
SERVER-06    1200

PowerShell: calculate delta between successive values of the rawvalue property for each counter

'evening,
I'm brand new at PowerShell scripting (and to this site) and need to write a script that:
gathers performance counters (in this case, TCP) several times
calculates the difference between successive values of the rawvalue property of each counter
Stipulating that every interval T the script iterates through the performance counters N times, and said performance counters are "A" and "B", and we're counting from 0, it needs to perform the following calculations:
A[1st] - A[0th],
A[2nd] - A[1st],
A[3rd] - A[2nd]
...
At present, the script only iterates through the counters twice (i.e. N = 2 in this case). The goal is to be able to iterate through these counters "many" (e.g. a couple hundred) times.
Currently the script reads the raw value of each counter into a single array. Here it is:
$cntr = (get-counter -listset tcpv4).paths
$arry = @()
for ($i=0; $i -lt 2; $i++) {
write-host "`nThis is iteration $i`n"
foreach ($elmt in $cntr) {
$z = (get-counter -counter $elmt).countersamples[0].rawvalue
$arry = $arry + $z
write-host "$elmt is: $z`n"
}
}
When I run this script, I get output just like the following:
This is iteration 0
\TCPv4\Segments/sec is: 24723
\TCPv4\Connections Established is: 27
\TCPv4\Connections Active is: 796
\TCPv4\Connections Passive is: 47
\TCPv4\Connection Failures is: 158
\TCPv4\Connections Reset is: 412
\TCPv4\Segments Received/sec is: 14902
\TCPv4\Segments Sent/sec is: 9822
\TCPv4\Segments Retransmitted/sec is: 199
This is iteration 1
\TCPv4\Segments/sec is: 24727
\TCPv4\Connections Established is: 27
\TCPv4\Connections Active is: 798
\TCPv4\Connections Passive is: 47
\TCPv4\Connection Failures is: 159
\TCPv4\Connections Reset is: 412
\TCPv4\Segments Received/sec is: 14903
\TCPv4\Segments Sent/sec is: 9824
\TCPv4\Segments Retransmitted/sec is: 200
e.g. The two values for the rawvalue property for the "\TCPv4\Segments Retransmitted/sec" counter are $arry[8] and $arry[17] respectively. To derive the difference between the two I'm using:
write-host "The difference between the successive counters for $($cntr[-1]) is $($($arry[17]) - $($arry[8]))."
Any help would be greatly appreciated.
I poked at it some, and this fell out:
$cntr = (get-counter -listset tcpv4).paths
$LastValue = @{}
Get-Counter $cntr -SampleInterval 2 -MaxSamples 5 |
foreach {
foreach ($Sample in $_.CounterSamples)
{
$ht = [Ordered]@{
Counter = $Sample.path.split('\')[-1]
TimeStamp = $_.TimeStamp
RawValue = $Sample.RawValue
LastValue = $LastValue[$Sample.Path]
Change = $Sample.RawValue - $LastValue[$Sample.Path]
}
if ($LastValue.ContainsKey($Sample.path))
{ [PSCustomObject]$ht }
$LastValue[$Sample.Path] = $Sample.RawValue
}
}
Edit:
This should work on V2:
$cntr = (get-counter -listset tcpv4).paths
$LastValue = @{}
Get-Counter $cntr -SampleInterval 10 -MaxSamples 3 |
foreach {
foreach ($Sample in $_.CounterSamples)
{
$Object = '' |
Select-Object Counter,TimeStamp,RawValue,LastValue,Change
$Object.Counter = $Sample.path.split('\')[-1]
$Object.TimeStamp = $_.TimeStamp
$Object.RawValue = $Sample.RawValue
$Object.LastValue = $LastValue[$Sample.Path]
$Object.Change = $Sample.RawValue - $LastValue[$Sample.Path]
if ($LastValue.ContainsKey($Sample.path))
{ $Object }
$LastValue[$Sample.Path] = $Sample.RawValue
}
}
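The core of both versions is a hashtable that remembers the previous raw value per counter path. Stripped of Get-Counter, the pattern can be sketched with a few hand-written samples (values borrowed from the question's output; the $samples, $last and $deltas names are made up for illustration):

```powershell
# Hand-written stand-ins for counter samples arriving in order, two per counter.
$samples = @(
    [pscustomobject]@{ Path = '\TCPv4\Segments/sec';       RawValue = 24723 }
    [pscustomobject]@{ Path = '\TCPv4\Connections Active'; RawValue = 796 }
    [pscustomobject]@{ Path = '\TCPv4\Segments/sec';       RawValue = 24727 }
    [pscustomobject]@{ Path = '\TCPv4\Connections Active'; RawValue = 798 }
)

$last = @{}
$deltas = foreach ($s in $samples) {
    # Only emit a delta once a previous value exists for this counter.
    if ($last.ContainsKey($s.Path)) {
        [pscustomobject]@{
            Counter = $s.Path
            Change  = $s.RawValue - $last[$s.Path]
        }
    }
    # Remember the current value for the next round.
    $last[$s.Path] = $s.RawValue
}
$deltas
```

Because only one previous value per counter is kept, memory use stays flat no matter how many samples are taken.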
OK. Well, let's work with this then:
$cntr = (get-counter -listset tcpv4).paths
$arry = @()
$maximumIterations = 2 # Variable based since you intended to change this.
# Cycle the counters while recording the values. Once completed we will calculate change.
for ($i=1; $i -le $maximumIterations; $i++) {
foreach ($elmt in $cntr) {
$arry += New-Object -TypeName PsCustomObject -Property @{
Iteration = $i
Counter = $elmt
RawValue = (get-counter -counter $elmt).countersamples[0].rawvalue
Change = $null
}
}
}
# Now that we have all the values lets calculate the rate of change over each iteration.
$arry | Where-Object{$_.Iteration -gt 1} | ForEach-Object{
$previousIteration = $_.Iteration - 1
$thisCounter = $_.Counter
$thisValue = $_.RawValue
$previousValue = ($arry | Where-Object{$_.Counter -eq $thisCounter -and $_.Iteration -eq $previousIteration}).RawValue
$_.Change = $thisValue - $previousValue
}
$arry | Select Iteration,Counter,RawValue,Change
Not that we have to, but I collected all of the counter data along with the iteration number, much like what you were printing with Write-Host. You will notice that I create a placeholder for Change but do not populate it. Before the calculations, $arry would hold data like the following (output truncated):
Iteration  Change  Counter                         RawValue
---------  ------  -------                         --------
1                  \TCPv4\Segments/sec             28324837
1                  \TCPv4\Connections Established  120
.                  ..............................  .....
2                  \TCPv4\Segments/sec             28325441
2                  \TCPv4\Connections Established  125
Once all of that data is collected into $arry, we take every entry that is not from the first iteration and process each one individually. Using the data of the current item in the pipeline, we match it up with the previous iteration's value. Using the same values as above, we get the changes you were hoping to monitor:
Iteration  Counter                         RawValue  Change
---------  -------                         --------  ------
1          \TCPv4\Segments/sec             28324837
1          \TCPv4\Connections Established  120
.          ..............................  .....
2          \TCPv4\Segments/sec             28325441  604
2          \TCPv4\Connections Established  125       5
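With a couple hundred iterations, searching the whole array for each previous value gets slow. One possible variation, sketched here rather than taken from the answer, is to group the collected rows by counter and subtract neighbouring iterations inside each group. The rows below are hand-built from the values shown above, and $changes is a made-up name.

```powershell
# Hand-built stand-in for the collected $arry (values from the tables above).
$arry = @(
    [pscustomobject]@{ Iteration = 1; Counter = '\TCPv4\Segments/sec';            RawValue = 28324837 }
    [pscustomobject]@{ Iteration = 1; Counter = '\TCPv4\Connections Established'; RawValue = 120 }
    [pscustomobject]@{ Iteration = 2; Counter = '\TCPv4\Segments/sec';            RawValue = 28325441 }
    [pscustomobject]@{ Iteration = 2; Counter = '\TCPv4\Connections Established'; RawValue = 125 }
)

$changes = $arry | Group-Object Counter | ForEach-Object {
    # Order one counter's rows by iteration, then diff each row against its predecessor.
    $rows = @($_.Group | Sort-Object Iteration)
    for ($i = 1; $i -lt $rows.Count; $i++) {
        [pscustomobject]@{
            Counter   = $rows[$i].Counter
            Iteration = $rows[$i].Iteration
            Change    = $rows[$i].RawValue - $rows[$i - 1].RawValue
        }
    }
}
$changes
```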