ForEach loop, trying to calculate average - powershell

My objective is to write a script that examines log files for the duration of an event, calculates the duration based on log entries (start/finish), and then calculates the average of those durations over the last 24 hours and determines whether it is greater than a certain value (let's use 2 hours for an example). So far, I have the first two portions completed: the script examines the logs properly and calculates the duration for each applicable log. I just don't know where to begin with the last step, the averaging of the durations from all of the logs. Below is my code thus far.
$imagesuccess = Get-ChildItem '\\server\osd_logs\success' -Directory |
Where-Object {
($_.Name -like "P0*") -or (($_.Name -like "MININT*") -and
(Test-Path "$($_.FullName)\SCCM_C\Logs\SMSTSLog\Get-PSPName.log")) -and
($_.LastWriteTime -gt (Get-Date).AddHours(-24))
}
$sccmlogpaths = "\\s0319p60\osd_logs\success\$($imagesuccess)\SCCM_C\Logs\SMSTSLog\smsts.log"
foreach ($sccmlogpath in $sccmlogpaths) {
$imagestartline = Select-String -Pattern "<![LOG[New time:" -Path $sccmlogpath -SimpleMatch
$imagestarttime = $imagestartline.ToString().Substring(75, 8)
$imagefinishline = Select-String -Pattern "<![LOG[ Directory: M:\$($imagesuccess)" -Path $sccmlogpath -SimpleMatch
$imagefinishtime = $imagefinishline.ToString().Substring(71, 8)
$imageduration = New-TimeSpan $imagestarttime $imagefinishtime
$imagedurationstring = $imageduration.ToString()
}

Roughly you'd do this:
$durations = foreach ($sccmlogpath in $sccmlogpaths) {
# [snip]
$imageduration = New-TimeSpan $imagestarttime $imagefinishtime
$imageduration # the 'output' of the foreach () {}
}
# $durations is now an array of timespans
$measurements = $durations | Measure-Object -Average -Property TotalHours
$averageHours = $measurements.Average
if (2.5 -lt $averageHours) {
# code here
}
This does sum(n)/count(n) averaging.
NB. If you are querying the last 24 hours, New-TimeSpan won't work nicely when a duration crosses midnight: given only clock times, it will see 23:01 -> 00:01 as -23 hours.
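One possible way around the midnight problem, sketched under the assumption that the log only yields HH:mm:ss clock strings and that no single run lasts 24 hours or longer: treat a negative difference as having wrapped past midnight and add one day. The function name Get-ClockDuration and the sample times are made up for illustration.

```powershell
# Hypothetical helper: duration between two clock times, allowing one midnight wrap.
function Get-ClockDuration {
    param([string]$Start, [string]$Finish)
    $span = [timespan]::Parse($Finish) - [timespan]::Parse($Start)
    # A negative span means the finish time belongs to the next day.
    if ($span -lt [timespan]::Zero) { $span += [timespan]::FromDays(1) }
    $span
}

# Sample clock times (invented): one run crosses midnight, one does not.
$durations = @(
    Get-ClockDuration '23:01:00' '00:01:00'
    Get-ClockDuration '10:00:00' '13:00:00'
)

$averageHours = ($durations | Measure-Object -Property TotalHours -Average).Average
$averageHours
```

With these two sample runs the durations are 1 hour and 3 hours, so $averageHours is 2.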


PowerShell script - Loop list of folders to get file count and sum of files for each folder listed

I want to get the file count & the sum of files for each individual folder listed in DGFoldersTEST.txt.
However, I’m currently getting the sum of all 3 folders.
And now I'm getting 'Index was outside the bounds of the array' error message.
$DGfolderlist = Get-Content -Path C:\DiskGroupsFolders\DGFoldersTEST.txt
$FolderSize = @()
$int=0
Foreach ($DGfolder in $DGfolderlist)
{
$FolderSize[$int] =
Get-ChildItem -Path $DGfolderlist -File -Recurse -Force -ErrorAction SilentlyContinue |
Measure-Object -Property Length -Sum |
Select-Object -Property Count, @{Name='Size(MB)'; Expression={('{0:N2}' -f($_.Sum/1mb))}}
Write-Host $DGfolder
Write-Host $FolderSize[$int]
$int++
}
To explain the error: you're trying to assign a value at index $int of your $FolderSize array, but an array initialized with the array subexpression operator @(..) has a Length of 0, so any index is out of bounds. That differs from initializing an array with a specific Length:
$arr = @()
$arr.Length # 0
$arr[0] = 'hello' # Error
$arr = [array]::CreateInstance([object], 10)
$arr.Length # 10
$arr[0] = 'hello' # all good
As for how to approach your code: since you don't know in advance how many items your loop will output, you can't initialize the array with a fixed Length. PowerShell offers the += operator for adding elements to an array, but it is expensive: arrays have a fixed size, so every append creates a new array and copies the existing elements into it. See this answer for more information and better approaches.
You can simply let PowerShell capture the output of your loop by assigning the variable to the loop itself:
$FolderSize = foreach ($DGfolder in $DGfolderlist) {
Get-ChildItem -Path $DGfolder -File -Recurse -Force -ErrorAction SilentlyContinue |
Measure-Object -Property Length -Sum |
Select-Object @(
@{ Name = 'Folder'; Expression = { $DGfolder }}
'Count'
@{ Name = 'Size(MB)'; Expression = { ($_.Sum / 1mb).ToString('N2') }}
)
}
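If the results have to be built up step by step rather than captured from the loop, a resizable list avoids the array copying that += causes. A minimal sketch, assuming PowerShell 5 or later for ::new(); the folder names and counts are placeholders, not real data:

```powershell
# A generic list grows in place, unlike a fixed-size array.
$results = [System.Collections.Generic.List[object]]::new()

foreach ($n in 1..3) {
    # Placeholder objects standing in for the per-folder measurements.
    $results.Add([pscustomobject]@{ Folder = "Folder$n"; Count = $n })
}

$results.Count
$results | Format-Table -AutoSize
```

Capturing the loop output directly, as shown above, is usually still the simpler choice when every iteration just emits one object.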

How can I get the average lastwritetime from multiple files?

I have a PowerShell script that is modifying multiple files. I would like to verify that they were modified by checking the last write time property and comparing it to the current time minus 30 minutes.
Is there any way to get the average time from multiple different files?
For example:
$Var = Get-Childitem -path "C:\users\User\Documents\*.txt"
$lastwt = $var.Lastwritetime
If($lastwt -ge (Get-Date).addminutes(-30)){
Do stuff
}
The above won't work because multiple dates are returned all around the same time give or take a few milliseconds.
I want to just get the average of all the times and use that as time comparison instead. Any way to do this?
About
You should probably use New-Timespan to do time comparisons. So your updated code:
Code
$Files = Get-Childitem -path "C:\users\User\Documents\*.txt"
$Files | ? {
# Filter by a timespan criterion (last write time on this file is later than 30 minutes before now)
$Mins = New-Timespan $_.LastWriteTime (Get-Date) | % TotalMinutes
return $Mins -le 30
} | % {
# Work only on files that matched the top criteria
}
Does that help? If you still want the averaging solution, lmk, I'll add it in :)
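To get a representative LastWriteTime [DateTime] object for a series of files, this may be what you want. Note that it computes the midpoint between the oldest and the newest write time (the midrange), not the arithmetic mean or the median:
$files = Get-Childitem -Path 'C:\users\User\Documents' -Filter '*.txt' -File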
# get an array of the LastWriteTime properties and sort that
$dates = $files | Select-Object -ExpandProperty LastWriteTime | Sort-Object
$oldest = $dates[0]
$newest = $dates[-1]
# create a new DateTime object that holds the middle of the oldest and newest file time
$average = [datetime]::new((($oldest.Ticks + $newest.Ticks) / 2), 'Local')
# show what you've got:
Write-Host "Oldest LastWriteTime: $oldest"
Write-Host "Average LastWriteTime: $average" -ForegroundColor Green
Write-Host "Newest LastWriteTime: $newest"
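If an arithmetic mean of the timestamps is wanted instead, one option is to average each timestamp's offset from a reference date and add that back. This is only a sketch: the three dates are invented sample values standing in for real LastWriteTime values, and offsets are averaged in seconds to keep the numbers small.

```powershell
# Invented sample timestamps; in practice these would come from
# $files | Select-Object -ExpandProperty LastWriteTime
$dates = @(
    [datetime]'2024-01-01 10:00:00'
    [datetime]'2024-01-01 10:10:00'
    [datetime]'2024-01-01 10:40:00'
)

# Average the offsets (in seconds) from the first timestamp.
$base = $dates[0]
$meanOffset = ($dates |
    ForEach-Object { ($_ - $base).TotalSeconds } |
    Measure-Object -Average).Average

$mean = $base.AddSeconds($meanOffset)
$mean
```

For these sample values the offsets are 0, 600 and 2400 seconds, so the mean lands 1000 seconds after the first timestamp, at 10:16:40.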

Count number of comments over multiple files, including multi-line comments

I'm trying to write a script that counts all comments in multiple files, including both single line (//) and multi-line (/* */) comments and prints out the total. So, the following file would return 4
// Foo
var text = "hello world";
/*
Bar
*/
alert(text);
There's a requirement to include specific file types and exclude certain file types and folders, which I already have working in my code.
My current code is:
( gci -include *.cs,*.aspx,*.js,*.css,*.master,*.html -exclude *.designer.cs,jquery* -recurse `
| ? { $_.FullName -inotmatch '\\obj' } `
| ? { $_.FullName -inotmatch '\\packages' } `
| ? { $_.FullName -inotmatch '\\release' } `
| ? { $_.FullName -inotmatch '\\debug' } `
| ? { $_.FullName -inotmatch '\\plugin-.*' } `
| select-string "^\s*//" `
).Count
How do I change this to get multi-line comments as well?
UPDATE: My final solution (slightly more robust than what I was asking for) is as follows:
$CodeFiles = Get-ChildItem -include *.cs,*.aspx,*.js,*.css,*.master,*.html -exclude *.designer.cs,jquery* -recurse |
Where-Object { $_.FullName -notmatch '\\(obj|packages|release|debug|plugin-.*)\\' }
$TotalFiles = $CodeFiles.Count
$IndividualResults = @()
$CommentLines = ($CodeFiles | ForEach-Object{
#Get the comments via regex
$Comments = ([regex]::matches(
[IO.File]::ReadAllText($_.FullName),
'(?sm)^[ \t]*(//[^\n]*|/[*].*?[*]/)'
).Value -split '\r?\n') | Where-Object { $_.length -gt 0 }
#Get the total lines
$Total = ($_ | select-string .).Count
#Add to the results table
$IndividualResults += @{
File = $_.FullName | Resolve-Path -Relative;
Comments = $Comments.Count;
Code = ($Total - $Comments.Count)
Total = $Total
}
Write-Output $Comments
}).Count
$TotalLines = ($CodeFiles | select-string .).Count
$TotalResults = New-Object PSObject -Property @{
Files = $TotalFiles
Code = $TotalLines - $CommentLines
Comments = $CommentLines
Total = $TotalLines
}
Write-Output (Get-Location)
Write-Output $IndividualResults | % { new-object PSObject -Property $_} | Format-Table File,Code,Comments,Total
Write-Output $TotalResults | Format-Table Files,Code,Comments,Total
To be clear: Using string matching / regular expressions is not a fully robust way to detect comments in JavaScript / C# code, because there can be false positives (e.g., var s = "/* hi */";); for robust parsing you'd need a language parser.
If that is not a concern, and it is sufficient to detect comments (that start) on their own line, optionally preceded by whitespace, here's a concise solution (PSv3+):
(Get-ChildItem -include *.cs,*.aspx,*.js,*.css,*.master,*.html -exclude *.designer.cs,jquery* -recurse |
Where-Object { $_.FullName -notmatch '\\(obj|packages|release|debug|plugin-.*)' } |
ForEach-Object {
[regex]::matches(
[IO.File]::ReadAllText($_.FullName),
'(?sm)^[ \t]*(//[^\n]*|/[*].*?[*]/)'
).Value -split '\r?\n'
}
).Count
With the sample input, the ForEach-Object command yields 4.
Remove the ^[ \t]* part to match comments starting anywhere on a line.
The solution reads each input file as a single string with [IO.File]::ReadAllText() and then uses the [regex]::Matches() method to extract all (potentially line-spanning) comments.
Note: You could use Get-Content -Raw instead to read the file as a single string, but that is much slower, especially when processing multiple files.
The regex uses in-line options s and m ((?sm)) to respectively make . match newlines too and to make anchors ^ and $ match line-individually.
^[ \t]* matches any mix of spaces and tabs, if any, at the start of a line.
//[^\n]* matches a string that starts with // and runs through the end of the line.
/[*].*?[*]/ matches a block comment across multiple lines; note the lazy quantifier, *?, which ensures that the very next instance of the closing */ delimiter is matched.
The matched comments (.Value) are then split into individual lines (-split '\r?\n'), which are output.
The resulting lines across all files are then counted (.Count).
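To see the steps above in isolation, here is a small sketch that applies the same regex to the sample file from the question, held in an in-memory string instead of being read from disk (the variable names are illustrative only):

```powershell
# The sample file from the question, joined with LF newlines.
$sample = @(
    '// Foo'
    'var text = "hello world";'
    '/*'
    'Bar'
    '*/'
    'alert(text);'
) -join "`n"

$pattern = '(?sm)^[ \t]*(//[^\n]*|/[*].*?[*]/)'

# Two matches are found: the // line and the whole /* ... */ block.
$matchedComments = [regex]::Matches($sample, $pattern)

# Splitting the matched text on newlines gives one entry per comment line.
$commentLines = @($matchedComments.Value -split '\r?\n')
$commentLines.Count
```

The block comment contributes three lines and the single-line comment one, which gives the count of 4 mentioned above.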
As for what you tried:
The fundamental problem with your approach is that Select-String with file-info object input (such as provided by Get-ChildItem) invariably processes the input files line by line.
While this could be remedied by calling Select-String inside a ForEach-Object script block in which you pass each file's content as a single string to Select-String, direct use of the underlying regex .NET types, as shown above, is more efficient.
An IMO better approach is to count net code lines by removing single-line and multi-line comments.
For a start, here is a script that handles a single file and, for your sample.cs above, returns the result 5:
((Get-Content sample.cs -raw) -replace "(?sm)^\s*\/\/.*?$" `
-replace "(?sm)\/\*.*?\*\/.*`n" | Measure-Object -Line).Lines
EDIT: without removing empty lines, build the difference from total lines
## Q:\Test\2018\10\31\SO_53092258.ps1
$Data = Get-ChildItem *.cs | ForEach-Object {
$Content = Get-Content $_.FullName -Raw
$TotalLines = (Measure-Object -Input $Content -Line).Lines
$CodeLines = ($Content -replace "(?sm)^\s*\/\/.*?$" `
-replace "(?sm)\/\*.*?\*\/.*`n" | Measure-Object -Line).Lines
$Comments = $TotalLines - $CodeLines
[PSCustomObject]@{
File = $_.FullName
Lines = $TotalLines
Comments= $Comments
}
}
$Data
"="*40
"TotalLines={0} TotalCommentLines={1}" -f (
$Data | Measure-Object -Property Lines,Comments -Sum).Sum
Sample output:
> Q:\Test\2018\10\31\SO_53092258.ps1
File Lines Comments
---- ----- --------
Q:\Test\2018\10\31\example.cs 10 5
Q:\Test\2018\10\31\sample.cs 9 4
============================================
TotalLines=19 TotalCommentLines=9

CSV file - count distinct, group by, sum

I have a file that looks like the following;
- Visitor ID,Revenue,Channel,Flight
- 1234,100,Email,BA123
- 2345,200,PPC,BA112
- 456,150,Email,BA456
I need to produce a file that contains;
The count of distinct Visitor IDs (3)
The total revenue (450)
The count of each Channel
Email 2
PPC 1
The count of each Flight
BA123 1
BA112 1
BA456 1
So far I have the following code, however when executing this on the 350MB file, it takes too long and in some cases breaks the memory limit. As I have to run this function on multiple columns, it is going through the file many times. I ideally need to do this in one file pass.
$file = 'log.txt'
function GroupBy($columnName)
{
$objects = Import-Csv -Delimiter "`t" $file | Group-Object $columnName |
Select-Object @{n=$columnName;e={$_.Group[0].$columnName}}, Count
for($i=0;$i -lt $objects.count;$I++) {
$line += $columnName +"|"+$objects[$I]."$columnName" +"|Count|"+ $objects[$I].'Count' + $OFS
}
return $line
}
$finalOutput += GroupBy "Channel"
$finalOutput += GroupBy "Flight"
Write-Host $finalOutput
Any help would be much appreciated.
Thanks,
Craig
The fact that you are importing the CSV again for each column is what is killing your script. Try to do the loading once, then reuse the data. For example:
$data = Import-Csv .\data.csv
$flights = $data | Group-Object Flight -NoElement | ForEach-Object {[PsCustomObject]@{Flight=$_.Name;Count=$_.Count}}
$visitors = ($data | Group-Object "Visitor ID" | Measure-Object).Count
$revenue = ($data | Measure-Object Revenue -Sum).Sum
$channel = $data | Group-Object Channel -NoElement | ForEach-Object {[PsCustomObject]@{Channel=$_.Name;Count=$_.Count}}
You can display the data like this:
"Revenue : $revenue"
"Visitors: $visitors"
$flights | Format-Table -AutoSize
$channel | Format-Table -AutoSize
This will probably work, using hashtables.
Pros: It will be faster and use less memory.
Cons: It is far less readable than Group-Object, and requires more code.
To make it even less memory-hungry, pipe Import-Csv straight into ForEach-Object so the file is processed row by row instead of being stored in $data first.
$data = Import-CSV -Path "C:\temp\data.csv" -Delimiter ","
$DistinctVisitors = @{}
$TotalRevenue = 0
$ChannelCount = @{}
$FlightCount = @{}
$data | ForEach-Object {
$DistinctVisitors[$_.'Visitor ID'] = $true
$TotalRevenue += $_.Revenue
if (-not $ChannelCount.ContainsKey($_.Channel)) {
$ChannelCount[$_.Channel] = 0
}
$ChannelCount[$_.Channel] += 1
if (-not $FlightCount.ContainsKey($_.Flight)) {
$FlightCount[$_.Flight] = 0
}
$FlightCount[$_.Flight] += 1
}
$DistinctVisitorsCount = $DistinctVisitors.Keys | Measure-Object | Select-Object -ExpandProperty Count
Write-Output "The count of distinct Visitor IDs $DistinctVisitorsCount"
Write-Output "The total revenue $TotalRevenue"
Write-Output "The Count of each Channel"
$ChannelCount.Keys | ForEach-Object {
Write-Output "$_ $($ChannelCount[$_])"
}
Write-Output "The count of each Flight"
$FlightCount.Keys | ForEach-Object {
Write-Output "$_ $($FlightCount[$_])"
}
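A compact sketch of the row-by-row variant, in which the rows are never collected in a variable. To keep it self-contained it first writes the question's sample rows to a temporary file; the file name visitors-sample.csv and the variable names are assumptions made for this example.

```powershell
# Write the sample data from the question to a temp file (illustration only).
$path = Join-Path ([IO.Path]::GetTempPath()) 'visitors-sample.csv'
@'
Visitor ID,Revenue,Channel,Flight
1234,100,Email,BA123
2345,200,PPC,BA112
456,150,Email,BA456
'@ | Set-Content -Path $path

$visitors = @{}
$channels = @{}
$flights  = @{}
$revenue  = 0

# Import-Csv streams rows down the pipeline; only the counters stay in memory.
Import-Csv -Path $path | ForEach-Object {
    $visitors[$_.'Visitor ID'] = $true
    $revenue += [int]$_.Revenue
    # A missing key yields $null, and 1 + $null is 1.
    $channels[$_.Channel] = 1 + $channels[$_.Channel]
    $flights[$_.Flight]   = 1 + $flights[$_.Flight]
}

"Distinct visitors: $($visitors.Count)"
"Total revenue: $revenue"
$channels
$flights
```

Memory use then grows with the number of distinct visitors, channels and flights rather than with the number of rows.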

Conditional Logic and Casting in Powershell

I'm creating a script that imports 2 columns of CSV data, sorts by one column cast as type int, and shows only the values between 0 and 10,000. I've been able to get up to the sorting part, and I am able to show only greater than 0. When I try to add "-and -lt 10000" various ways, I am unable to get any useful data. One attempt gave me the data as if it were string again, though.
This only gives me > 0 but sorts as type int. Half way there!:
PS C:\> $_ = Import-Csv .\vc2.csv | Select -Property User_Name, Minutes; $_ | Sort {[int] $_.Minutes} | Where {($_.Minutes -gt 0)}
This gives me 10000 > x > 0 but sorts as string:
PS C:\> $_ = Import-Csv .\vc.csv | Select -Property User_Name, Minutes; $_ | Sort {[int] $_.Minutes} | Where {($_.Minutes -gt 0) -and ($_.Minutes -lt 10)}
Here and here are where I tried recasting as int and it gave me many errors:
PS C:\> $_ = Import-Csv .\vc.csv | Select -Property User_Name, Minutes; $_ | Sort {[int] $_.Minutes} | Where {[int]{($_.Minutes -gt 0) -and ($_.Minutes -lt 10000)}}
PS C:\> $_ = Import-Csv .\vc.csv | Select -Property User_Name, Minutes; $_ | Sort {[int] $_.Minutes} | Where { ({[int]$_.Minutes} -gt 0) -and ({[int]$_.Minutes} -lt 10000) }
Error: Cannot convert the "($_.Minutes -gt 0) -and ($_.Minutes -lt 10000)" value of type "System.Management.Automation.ScriptBlock" to type "System.Int32".
What is the proper syntax for this?
PowerShell usually coerces arguments of binary operators to the type of the left operand. This means when doing $_.Minutes -gt 10 the 10 gets converted to a string, because the fields in a parsed CSV are always strings. You can either switch the operands around: 10 -lt $_.Minutes or add a cast: [int]$_.Minutes -gt 10 or +$_.Minutes -gt 10.
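The effect of the left operand's type can be seen with a single made-up row, where Minutes is the string '9' just as it would be after Import-Csv (the object and variable names are illustrative):

```powershell
# Stand-in for one imported CSV row: the field value is a string.
$row = [pscustomobject]@{ Minutes = '9' }

# Left operand is a string, so 10 becomes '10' and the comparison is textual.
$asString = $row.Minutes -gt 10

# Casting the left operand makes the comparison numeric.
$castLeft = [int]$row.Minutes -gt 10

# With a number on the left, the string '9' is converted to a number.
$numberOnLeft = 10 -lt $row.Minutes

$asString, $castLeft, $numberOnLeft
```

The textual comparison reports True because '9' sorts after '10', while both numeric comparisons report False.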
Usually, when dealing with CSVs that contain non-string data that I want to use as such, I tend to just add a post-processing step, e.g.:
Import-Csv ... | ForEach-Object {
$_.Minutes = [int]$_.Minutes
$_.Date = [datetime]$_.Date
...
}
Afterwards the data is much nicer to handle, without excessive casts and conversions.
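As a fuller sketch of that post-processing idea, the following converts the column once and then filters and sorts without further casts. The inline CSV text is invented sample data using the question's column names, and the block emits $_ again after converting so that the rows continue down the pipeline.

```powershell
# Invented sample data with the question's column names.
$rows = @'
User_Name,Minutes
alice,9
bob,250
carol,12000
'@ | ConvertFrom-Csv | ForEach-Object {
    $_.Minutes = [int]$_.Minutes   # convert once, up front
    $_                             # pass the modified row on
}

# No casts needed from here on: Minutes is already an integer.
$inRange = $rows |
    Where-Object { $_.Minutes -gt 0 -and $_.Minutes -lt 10000 } |
    Sort-Object Minutes

$inRange
```

Here only the rows for alice and bob fall between 0 and 10000, and they sort numerically as 9 then 250.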
The problem is the use of the { and } braces inside the Where statement. Those are being interpreted as script blocks.
Where { ({[int]$_.Minutes} -gt 0) -and ({[int]$_.Minutes} -lt 10000) }
Try using ( and ) or excluding them altogether.
Where { (([int]$_.Minutes) -gt 0) -and (([int]$_.Minutes) -lt 10000) }
The way you're assigning values to $_ is also weird.
$_ represents the current value in the pipeline.
$list = @(1,2,3)
$list | foreach { $_ }
1
2
3
By assigning "$_" a value yourself, you lose that value as soon as you start a pipeline, because the pipeline overwrites it.
Try something like:
$mycsv = import-csv .\vc.csv; $mycsv | select ...etc