PowerShell ForEach removes leading zeros

I am kind of new with PowerShell and programming in general, so I hope you have some patience while reading this. Before I explain my problem, I feel like I have to first tell you some background information:
I have all my transactions saved in $Transactions. Each transaction has Receiver, Date and Amount.
I have grouped the yearly transactions into $TransactionsPerYear the following way:
$TransactionsPerYear = $Transactions | Group-Object { [int]($_.date -replace '.*\.') }
(Btw. Could someone explain the regex in the end for me, what each character does?)
Next thing I am doing is grouping yearly income and expenses into separate variables. After this I am trying to extract the months from each year and save them into $Months. The date is in the following format dd.MM.yyyy
Question 1:
Here's how I can get all the dates, but how do I extract just the months?
$TransactionsPerYear | Select -ExpandProperty Group | Select -ExpandProperty date | Select -Unique
Question 2:
Because I don't know how to extract the months, I've tried it the following way:
[String[]]$Months = "01","02","03","04","05","06","07","08","09","10","11","12"
When I have each month in $Months I am trying to get monthly transactions and save them into new variables:
ForEach($Month in $Months){
New-Variable -Name "Transactions_$Month$Year" -Value ($Transactions | Where {$_.Date -like "*.$Month.$Year"} | Group-Object 'Receiver' | Select-Object Count, Name, @{L="Total";E={$_ | Select -ExpandProperty Group | Measure-Object Amount -Sum | Select -ExpandProperty Sum}} | Sort-Object {[double]$_.Total})
}
The problem that I am facing here is that ForEach removes the leading zero from each month, and when this happens, this part in ForEach doesn't match with anything, and the new variable is null:
Where {$_.Date -like "*.$Month.$Year"}
Let me know if you need more info. I'd be really thankful if anyone could help me.
The date looks like: 25.02.2016

From your post, it looks like you've jumped further down the rabbit hole than necessary.
Instead of trying to do string manipulation every time you need to interact with the Date property, simply turn it into a DateTime object!
$Transactions = $Transactions | Select-Object *, @{Name='DateParsed';Expression={[datetime]::ParseExact($_.Date, 'dd.MM.yyyy', $null)}}
The DateTime.ParseExact() method allows us to specify the format (e.g. dd.MM.yyyy) and parse a string representation of a date.
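As a quick sanity check (a throwaway sketch using the sample date from your post), you can see what the parsed object gives you:
# Parse one sample date and read its components
$d = [datetime]::ParseExact('25.02.2016', 'dd.MM.yyyy', $null)
$d.Year   # 2016
$d.Month  # 2
$d.Day    # 25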
Now you can group on year simply by:
$TransactionsPerYear = $Transactions |Group-Object { $_.DateParsed.Year }
To group by both Year and then Month, I'd create a nested hashtable, like so:
# Create a hashtable, containing one key per year
$MonthlyTransactions = @{}
foreach($Year in $Transactions | Group {$_.DateParsed.Year})
{
    # Create another hashtable, containing a key for each month in that year
    $MonthlyTransactions[$Year.Name] = @{}
    foreach($Month in $Year.Group | Group {$_.DateParsed.Month})
    {
        # Add the transactions to the Monthly hashtable
        $MonthlyTransactions[$Year.Name][$Month.Name] = $Month.Group
    }
}
Now you can calculate the transaction value for a specific month (note that Group-Object names are strings, so the year and month keys are '2010' and '5' rather than integers) by doing:
$TotalValueMay2010 = ($MonthlyTransactions['2010']['5'] | Measure-Object Amount -Sum).Sum
(Btw. Could someone explain the regex in the end for me, what each character does?)
Sure:
. # match any character
* # zero or more times
\. # match a literal . (dot)
Taking your own example input string 25.02.2016, the first group (.*) will match on 25.02, and \. will match on the . right after, so the only thing left is 2016.
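If you want to see it in action, you can test the replace on a single string in the console (a throwaway sketch):
# '.*\.' greedily matches everything up to and including the last dot, leaving only the year
'25.02.2016' -replace '.*\.'          # 2016
[int]('25.02.2016' -replace '.*\.')   # 2016 as an integer, as used in the Group-Object call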

Do you mean this?
$dates = ([DateTime] "1/1/2016"),([DateTime] "1/2/2016"),
([DateTime] "2/1/2016"),([DateTime] "3/1/2016")
$uniqueMonths = $dates | ForEach-Object { $_.Month } | Sort-Object -Unique
# $uniqueMonths contains 1,2,3

Related

How to summarize value rows of one column with reference to another column in PowerShell object

I'm learning to work with the import-excel module and have successfully imported the data from a sample.xlsx file. I need to extract the total amount based on the values of another column. Basically, I want to create a grouped data view where I can store the sum of the values next to each type. Here's the sample data view.
Type Amount
level 1 $1.00
level 1 $2.00
level 2 $3.00
level 3 $4.00
level 3 $5.00
Now to import I'm just using the simple code
$fileName = "C:\SampleData.xlsx"
$data = Import-Excel -Path $fileName
#extracting distinct type values
$distinctTypes = $importedExcelRows | Select-Object -ExpandProperty "Type" -Unique
#looping through distinct types and storing it in the output
$output = foreach ($type in $distinctTypes)
{
$data | Group-Object $type | %{
New-Object psobject -Property @{
Type = $_.Name
Amt = ($_.Group | Measure-Object 'Amount' -Sum).Sum
}
}
}
$output
The output I'm looking for looks somewhat like:
Type Amount
level 1 $3.00
level 2 $3.00
level 3 $9.00
However, I'm getting nothing in the output. It's $null I think. Any help is appreciated I think I'm missing something in the looping.
You're halfway there by using Group-Object for this scenario, kudos on that part. Luckily, you can group by the type at your import and then measure the sum:
$fileName = "C:\SampleData.xlsx"
Import-Excel -Path $fileName | Group-Object -Property Type | % {
    $group = $_.Group | % {
        $_.Amount = $_.Amount -replace '[^0-9.]'
        $_
    } | Measure-Object -Property Amount -Sum
    [pscustomobject]@{
        Type = $_.Name
        Amount = "{0:C2}" -f $group.Sum
    }
}
Since you can't measure the amount while it's in currency format, you can remove the dollar sign with the regex [^0-9.], which strips everything that is not a digit or a dot (you could also use ^\$ to remove just a leading dollar sign). This allows Measure-Object to sum the amounts, and you can format the result back into currency format using the string format operator '{0:C2}' -f ....
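As a quick illustration of both pieces (the values here are made up):
# Strip anything that isn't a digit or a dot, then format a number back as currency
'$4.00' -replace '[^0-9.]'   # 4.00
"{0:C2}" -f 9                # $9.00 (the exact symbol and separators depend on your culture)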
I don't know what your issue is, but when the dollar signs are not part of the data you pull from the Excel sheet, it should work as expected ...
$InputCsvData = @'
Type,Amount
level 1,1.00
level 1,2.00
level 2,3.00
level 3,4.00
level 3,5.00
'@ |
ConvertFrom-Csv
$InputCsvData |
    Group-Object -Property Type |
    ForEach-Object {
        [PSCustomObject]@{
            Type = $_.Name
            Amt  = '${0:n2}' -f ($_.Group | Measure-Object -Property Amount -Sum).Sum
        }
    }
The output looks like this:
Type Amt
---- ---
level 1 $3,00
level 2 $3,00
level 3 $9,00
Otherwise you may remove the dollar signs before you try to summarize the numbers.
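For example, something along these lines would normalize the column first (just a sketch, assuming the Amount values really do come in as strings like '$1.00'):
# Sketch: strip the dollar signs before grouping and summing
$CleanedData = $InputCsvData | ForEach-Object {
    $_.Amount = $_.Amount -replace '[^0-9.]'   # '$3.00' -> '3.00'
    $_
}
You could then pipe $CleanedData into the same Group-Object / Measure-Object pipeline shown above.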

Powershell - Import-CSV Group-Object SUM a number from grouped objects and then combine all grouped objects to single rows

I have a question similar to this one but with a twist:
Powershell Group Object in CSV and exporting it
My file has 42 existing headers. The delimiter is a standard comma, and there are no quotation marks in this file.
master_account_number,sub,txn,cur,last,first,address,address2,city,state,zip,ssn,credit,email,phone,cell,workphn,dob,chrgnum,cred,max,allow,neg,plan,downpayment,pmt2,min,clid,cliname,owner,merch,legal,is_active,apply,ag,offer,settle_perc,min_pay,plan2,lstpmt,orig,placedate
The file's data (the first 6 columns) looks like this:
master_account_number,sub,txn,cur,last,first
001,12,35,50.25,BIRD, BIG
001,34,47,100.10,BIRD, BIG
002,56,9,10.50,BUNNY, BUGS
002,78,3,20,BUNNY, BUGS
003,54,7,250,DUCK, DAFFY
004,44,88,25,MOUSE, JERRY
I am only working with the first column master_account_number and the 4th column cur.
I want to check for duplicates in the "master_account_number" column. If duplicates are found, I want to add up the totals from the 4th column "cur" for only those duplicate rows and then combine them into a single row. The summed value from the dupes should replace the cur value in the combined row.
With that said, our output should look like so:
master_account_number,sub,txn,cur,last,first
001,12,35,150.35,BIRD, BIG
002,56,9,30.50,BUNNY, BUGS
003,54,7,250,DUCK, DAFFY
004,44,88,25,MOUSE, JERRY
Now that we have that out of the way, here is how this question differs. I want to keep all 42 columns intact in the output file. In the other question I referenced above, the input was 5 columns and the output was 4 columns, and this is not what I'm trying to achieve. I have so many more headers, and I'd hate to have to specify all 42 columns individually. That seems inefficient anyhow.
As for what I have so far for code... not much.
$revNB = "\\server\path\example.csv"
$global:revCSV = import-csv -Path $revNB | ? {$_.is_active -eq "Y"}
$dupesGrouped = $revCSV | Group-Object master_account_number | Select-Object @{Expression={ ($_.Group|Measure-Object cur -Sum).Sum }}
Ultimately I want the output to look identical to the input, only the output should merge duplicate account numbers rows, and add all the "cur" values, where the merged row contains the sum of the grouped cur values, in the cur field.
Last Update: Tried Rich's solution and got an error. Modified what he had to this $dupesGrouped = $revCSV | Group-Object master_account_number | Select-Object Name, @{Name='curSum'; Expression={ ($_.Group | Measure-Object cur -Sum).Sum}}
And this gets me exactly what my own code got me so I am still looking for a solution. I need to output this CSV with all 42 headers. Even for items with no duplicates.
Other things I've tried:
This doesn't give me the data I need in the columns; the columns are there, but they are blank.
$dupesGrouped = $revCSV | Group-Object master_account_number | Select-Object @{ expression={$_.Name}; label='master_account_number' },
sub_account_number,
charge_txn,
@{Name='current_balance'; Expression={ ($_.Group | Measure-Object current_balance -Sum).Sum },
last,
}
You're pretty close, but you used current_balance where you probably meant cur.
Here's a start:
$dupesGrouped = $revCSV | Group-Object master_account_number |
    Select-Object Name, @{N='curSum'; E={ ($_.Group | Measure-Object cur -Sum).Sum }},
        @{N='last'; E={ ($_.Group | Select-Object last -First 1).last }}
You can add the other fields by adding Name/Expression hashtables for each of the fields you want to carry through, as in the sketch below. I assumed you would want to select the first occurrence of the last name repeated for the same master_account_number. The output will be incorrect if the last name differs for the same master_account_number.
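For example, extending the same pattern to a couple more of your columns (only a sketch; you would repeat the hashtable for each of the remaining headers you want to keep):
$dupesGrouped = $revCSV | Group-Object master_account_number |
    Select-Object @{N='master_account_number'; E={ $_.Name }},
        @{N='sub';  E={ $_.Group[0].sub }},
        @{N='cur';  E={ ($_.Group | Measure-Object cur -Sum).Sum }},
        @{N='last'; E={ $_.Group[0].last }}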
In the case of changing only part of the data, there is also the following way.
$dupesGrouped = $revCSV | Group-Object master_account_number | ForEach-Object {
    # copy the first data in order not to change original data
    $new = $_.Group[0].psobject.Copy()
    # update the value of cur property
    $new.cur = ($_.Group | Measure-Object cur -Sum).Sum
    # output
    $new
}
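Since the copied objects keep every property of the original rows, writing the result back out should preserve all 42 headers; for example (the output path here is made up):
# Export the merged rows; -NoTypeInformation keeps the CSV header clean
$dupesGrouped | Export-Csv "\\server\path\example_merged.csv" -NoTypeInformation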

Guidance with developing PowerShell script

I am not an advanced scripter by any means, but I have a task which I need to accomplish for work. The task is to create a script which looks at two pieces of information (date and capacity utilized in bytes) from each report file that is contained in a directory. These two pieces of information are located in the same place in each report. Then, using the date value, the script can report which was the highest capacity utilized value for each month. I am thinking of having the final output be in HTML format.
There are two options for acquiring the date value. The report contains the date in the format mm/dd/yyyy in the 3rd line of text and the time is included in the file name as the Epoch time.
So far, I have put together a PowerShell script that parses the date and the capacity utilized from the body of the report. This information is then added to an array.
I am looking for guidance on which date value would be better to use (Epoch time from file name or date from body of report) and what method would be best to utilize for looking at the data for each month and reporting the highest capacity utilization per month.
Here is my script so far:
#Construct an array to use for data export
$fileDirectory = "c:\Temp"
$Array1 = @()
foreach ($file in Get-ChildItem $fileDirectory)
{
    #Obtain path to each file in directory
    $filePath = $fileDirectory + "\" + $file
    #Get content of each file during the loop
    $data = Get-Content $filePath
    #Create object to enter data into Array1
    $myobj = "" | Select "Date","Capacity"
    $dateStr = ($data[2].Split(" "))[3]
    [long]$capacityStr = ($data[19].Split(","))[2]
    [single]$CapacityConv = $capacityStr
    $capacityConv = ($capacityConv /= 1099511627776)
    #Fill the object myobj
    $myobj.date = $dateStr
    $myobj.capacity = $capacityConv
    #Add the object to Array1
    $Array1 += $myobj
    #Wipe the object
    $myobj = $null
}
#After the loop, export the array to CSV file
$Array1 | export-csv "c:\Scripts\test-output.csv"
$Array1
pause
For the date, it's really up to you. If they're equally accurate then it's a matter of opinion.
For the capacity, I'm assuming these are daily reports given the date format, and you want the highest capacity for a given month.
Since you're creating an object containing a Date and a Capacity property for each report, using the returned array of all those values, you should be able to get the information you need like this:
$Array1 | Group-Object {([DateTime]$_.Date).ToString('MMMM')} | Select-Object Name, @{Name='MaxCap';Expression={ $_.Group.Capacity | Measure-Object -Maximum | Select-Object -ExpandProperty Maximum }}
Now this is kind of a lot so let's break it down.
Group-Object groups your array based on a property. In this case, we want to group by month, but you don't have a month property, so instead of a property name we're passing in a script block to calculate the property on the fly:
([DateTime]$_.Date).ToString('MMMM')
This casts your Date property (which is a [String]) into a [DateTime] object. Then we use .ToString('MMMM') to format it into a month name.
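For instance, a quick check in the console (the date here is made up):
# Cast a date string, then format it as a month name
([DateTime]'05/14/2016').ToString('MMMM')   # May (month name in the current culture)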
The result will be an array of group objects, where the Name property is the name of the group (in this case it will be the month name) and the Group property will contain all of the original objects that belonged to that group.
Piping this into Select-Object, we want the Name (the month), and then we're creating a new property on the fly named MaxCap:
$_.Group.Capacity | Measure-Object -Maximum | Select-Object -ExpandProperty Maximum
So we take the current Group (the array of all the objects for that month), then expand its Capacity property so now we have an array of all the capacities for the group.
Pipe that into Measure-Object -Maximum to get the max value, then Select-Object -ExpandProperty Maximum (because Measure-Object returns an object with a Maximum property and we just want the value).
The end result is an object with the month and the maximum capacity for that month.

Powershell counting same values from csv

Using PowerShell, I can import the CSV file and count how many objects are equal to "a". For example,
@(Import-csv location | where-Object{$_.id -eq "a"}).Count
Is there a way to go through every column and row looking for the same String "a" and adding onto count? Or do I have to do the same command over and over for every column, just with a different keyword?
So I made a dummy file that contains 5 columns of people's names. To demonstrate how the process works, I will show how often the text "Ann" appears in any field.
$file = "C:\temp\MOCK_DATA (3).csv"
gc $file | %{$_ -split ","} | Group-Object | Where-Object{$_.Name -like "Ann*"}
Don't focus on the code but the output below.
Count Name Group
----- ---- -----
5 Ann {Ann, Ann, Ann, Ann...}
9 Anne {Anne, Anne, Anne, Anne...}
12 Annie {Annie, Annie, Annie, Annie...}
19 Anna {Anna, Anna, Anna, Anna...}
"Ann" appears 5 times on it's own. However it is a part of other names as well. Lets use a simple regex to find all the values that are only "Ann".
(select-string -Path 'C:\temp\MOCK_DATA (3).csv' -Pattern "\bAnn\b" -AllMatches | Select-Object -ExpandProperty Matches).Count
That will return 5, since \b is a word boundary. In essence, it only looks at what sits between commas, or at the beginning or end of each line. This omits results like "Anna" and "Annie" that you might have. Select-Object -ExpandProperty Matches is important to have if you have more than one match on a single line.
Small Caveat
It should not matter, but in trying to keep the code simple, it is possible that your header could match the value you are looking for. That's not likely, which is why I don't account for it. If it is a possibility, we could use Get-Content with a Select -Skip 1 instead, as sketched below.
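That variant might look something like this (a sketch following the same pattern as above):
# Skip the header row, then count whole-word matches of "Ann" in the remaining lines
(Get-Content 'C:\temp\MOCK_DATA (3).csv' |
    Select-Object -Skip 1 |
    Select-String -Pattern '\bAnn\b' -AllMatches |
    Select-Object -ExpandProperty Matches).Count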
Try cycling through properties like this:
(Import-Csv location | %{$record = $_; $record | Get-Member -MemberType Properties |
?{$record.$($_.Name) -eq 'a';}}).Count

How can I extract the latest rows from a log file based on latest date using Powershell

I'm a relatively new Powershell user, and have what I thought was a simple question. I have spent a bit of time looking for similar scenarios and surprisingly haven't found any. I would post my failed attempts, but I can't even get close!
I have a log file with repetitive data, and I want to extract the latest event for each "unique" entry. The problem lies in the fact that each entry is unique due to the individual date stamp. The "unique" criteria is in Column 1.
Example:
AE0440,1,2,3,30/08/2012,12:00:01,XXX
AE0441,1,2,4,30/08/2012,12:02:01,XXX
AE0442,1,2,4,30/08/2012,12:03:01,XXX
AE0440,1,2,4,30/08/2012,12:04:01,YYY
AE0441,1,2,4,30/08/2012,12:06:01,XXX
AE0442,1,2,4,30/08/2012,12:08:01,XXX
AE0441,1,2,5,30/08/2012,12:10:01,ZZZ
Therefore the output I want would be (order not relevant):
AE0440,1,2,4,30/08/2012,12:04:01,YYY
AE0442,1,2,4,30/08/2012,12:08:01,XXX
AE0441,1,2,5,30/08/2012,12:10:01,ZZZ
How can I get this data/discard old data?
Try this; it may look a bit cryptic for a first-time user. It reads the content of the file and groups the lines by the unique value (so we now have 3 groups); each group is then sorted descending by parsing the date-time value (again obtained by splitting), and the first entry is returned.
Get-Content .\log.txt | Group-Object { $_.Split(',')[0] } | ForEach-Object {
$_.Group | Sort-Object -Descending { [DateTime]::ParseExact(($_.Split(',')[-3,-2] -join ' '),'dd/MM/yyyy HH:mm:ss',$null) } | Select-Object -First 1
}
AE0440,1,2,4,30/08/2012,12:04:01,YYY
AE0441,1,2,5,30/08/2012,12:10:01,ZZZ
AE0442,1,2,4,30/08/2012,12:08:01,XXX
Assuming your data looks exactly like your example:
# you can give more meaningful names to the columns if you want. just make sure the number of columns matches
$data = import-csv .\data.txt -Header Col1,Col2,Col3,Col4,Col5,Col6,Col7
# sort all data by the timestamp, then group by the label in column 1
$grouped = $data | sort {[DateTime]::ParseExact("$($_.Col6) $($_.Col5)", 'HH:mm:ss dd/MM/yyyy', $Null)} -Desc | group Col1
# read off the first element of each group (element with latest timestamp)
$grouped |%{ $_.Group[0] }
This also assumes your timestamps are on a 24-hr clock, i.e. all of your sample data is close to 12 noon, not 12 midnight. One second after midnight would need to be specified as '00:00:01'.