Deleting records from a log/text file - powershell

I have several laptops that are generating daily activity logs for a process into txt files. I've figured out how to write a script to append the logs into one master file on a daily basis, but now I'm concerned about file size. I'd like to keep a rolling 60 days of data in my master file.
Here is my data format:
2016-06-23T04:02:33,JE5030UA,88011702312014569339,0000000034626,01451560610600980
Using (Get-Date).AddDays(-60) I can get the cutoff date, but it's in MM/dd/yyyy format.
If I set up a variable to get the date in the same format as my file (Get-Date -Format 'yyyyMMdd'), I can't use the .AddDays() method with it to get the cutoff date.
That's how far I've got so far. I'd include code, but there's not much there. The script to append the files was so easy. I can't believe it's difficult to purge old records.
My questions:
What am I missing on the date issue?
What is the best cmdlet to purge records > 60 days? There doesn't appear to be a 'delete' cmdlet for records in a file. I was expecting a 'if date > 60 days, then delete record' kind of function.
Do I need to add a header to the text file?

Take a look at the following code, which reads from your combined log and keeps only the rows that fall within your date range. You get a DateTime object from (Get-Date).AddDays() and another DateTime object from the time stamp in the file, and then you can compare them. This is one way of doing it, anyway.
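$cutoffDate = (Get-Date).AddDays(-60)
$trimmedLog = 'C:\your\path\trimmedLog.txt'   # hypothetical output path; adjust as needed
$fileContents = Get-Content C:\your\path\combinedLog.txt
foreach ($line in $fileContents)
{
    Write-Host "Current line = $line"
    $words = $line.Split(',')
    $date = Get-Date $words[0]   # parse the timestamp field into a DateTime
    Write-Host "Date of line = $date"
    if ($date -gt $cutoffDate)
    {
        # Line is newer than the cutoff: append it to the trimmed log
        Add-Content -Path $trimmedLog -Value $line
    }
}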
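On the date question itself: Get-Date -Format returns a string, and strings have no AddDays() method, which is why the formatted variable cannot be shifted. A minimal sketch of the usual order of operations (do the arithmetic on the DateTime first, format last), using the sample line from the question. The variable names are illustrative only, and the invariant culture is an assumption made so that the separators are not localized:

```powershell
$culture = [Globalization.CultureInfo]::InvariantCulture
$format  = 'yyyy-MM-dd\THH:mm:ss'

# Do date arithmetic on the DateTime object, then format the result.
$cutoff     = (Get-Date).AddDays(-60)              # System.DateTime
$cutoffText = $cutoff.ToString($format, $culture)  # System.String, same layout as the log

# Parse the timestamp field of a sample line back into a DateTime.
$sample = '2016-06-23T04:02:33,JE5030UA,88011702312014569339,0000000034626,01451560610600980'
$stamp  = [datetime]::ParseExact($sample.Split(',')[0], $format, $culture)

# DateTime-to-DateTime comparison: is the sample line older than the cutoff?
$isOld = $stamp -lt $cutoff
```

Either representation can be compared, as long as both sides are the same type: DateTime against DateTime, or formatted string against formatted string.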

Since you're using an ISO date format you can remove records older than a given cutoff date by formatting the cutoff date accordingly and comparing the first field of each line to it:
$file = 'C:\path\to\your.log'
$cutoff = (Get-Date).AddDays(-60).ToString('yyyy-MM-dd\THH:mm:ss')
(Get-Content $file) |
Where-Object { $_.Split(',')[0] -ge $cutoff } |
Set-Content $file
However, rotating logs is usually a better approach than clearing out a single file. Write your logs to a different file each day, e.g. like this:
... | Set-Content "C:\path\to\master_$(Get-Date -f 'yyyyMMdd').log"
so you can simply remove logs by their last modification date:
$cutoff = (Get-Date).AddDays(-60)
Get-ChildItem 'C:\log\folder\master_*.log' |
Where-Object { $_.LastWriteTime -lt $cutoff } |
Remove-Item
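The single-file trim can be tried on a throwaway file before pointing it at the real master log. The following is a sketch under stated assumptions: the two sample records are invented for illustration, the file is a temporary one, and the filtered lines are passed through -Value so that the file is still rewritten if no line survives the filter.

```powershell
$culture = [Globalization.CultureInfo]::InvariantCulture
$format  = 'yyyy-MM-dd\THH:mm:ss'

# Hypothetical sample data: one record 90 days old, one record 5 days old.
$file = [System.IO.Path]::GetTempFileName()
$old  = (Get-Date).AddDays(-90).ToString($format, $culture)
$new  = (Get-Date).AddDays(-5).ToString($format, $culture)
Set-Content -Path $file -Value @("$old,JE5030UA,1", "$new,JE5030UA,2")

# Keep only records whose first field sorts at or after the cutoff string.
$cutoff = (Get-Date).AddDays(-60).ToString($format, $culture)
$kept   = @(Get-Content -Path $file | Where-Object { $_.Split(',')[0] -ge $cutoff })
Set-Content -Path $file -Value $kept

$remaining = @(Get-Content -Path $file)
```

The string comparison is only safe because every timestamp has the same fixed-width, most-significant-field-first layout.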

Related

Removing lines with past dates from a text file

I have a text file called HelplineSpecialRoster.txt that looks like this
01/01/2019,6AM,0400012345,Kurt,kurt#outlook.com
02/01/2019,6AM,0412345676,Bill,bill#outlook.com
03/01/2019,6AM,0400012345,Sam,Sam#outlook.com
04/01/2019,6AM,0412345676,Barry,barry#outlook.com
05/01/2019,6AM,0400012345,Kurt,kurt#outlook.com
I'm in Australia so the dates are day/month/year.
I have some code that creates a listbox that displays the lines from the text file, but I want to edit the text file before it is displayed so that older dates are removed and only current and future shifts are shown. A helpful person gave me this code and it worked once, but now it has stopped working for some reason. When I deleted the whole text file and recreated it, it started working again, but only once.
If there is a future shift in the file say
05/02/2019,6AM,0400012345,Kurt,kurt#outlook.com
and today's date is 29/01/2019, the older shifts are deleted. If there are only old shifts in the file, as above, they are not deleted. When I add a date that is in the future, the older ones are deleted and only the future one is kept.
$SpecialRosterPath = "C:\Helpline Dialer\HelplineSpecialRoster.txt"
$CurrentDate2 = (Get-Date).Date # to have a datetime starting today at 00:00:00
Function DeleteOlderShifts {
$CurrentAndFutureShifts = Get-Content $SpecialRosterPath | Where-Object {
$_ -match "^(?<day>\d{2})\/(?<mon>\d{2})\/(?<year>\d{4})" -and
(Get-Date -Year $Matches.year -Month $Matches.mon -Day $Matches.day) -ge $CurrentDate2
}
$CurrentAndFutureShifts
$CurrentAndFutureShifts | Set-Content $SpecialRosterPath
}
DeleteOlderShifts;
Any ideas?
When there are only older dates in your input file the result in $CurrentAndFutureShifts will be empty. Empty values in a pipeline are skipped over, meaning that nothing is written to the output file, so the output file remains unchanged.
You can avoid this issue by passing the variable to the parameter -Value. Change
$CurrentAndFutureShifts | Set-Content $SpecialRosterPath
into
Set-Content -Value $CurrentAndFutureShifts -Path $SpecialRosterPath
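The difference between the two forms can be observed on a temporary file. This is a small sketch, not part of the original script: the file and its contents are made up for the demonstration.

```powershell
# Throwaway file with one line in it (illustrative data).
$path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $path -Value 'old line'

# A filter that matches nothing produces an empty result.
$empty = 'old line' | Where-Object { $false }

# Piping the empty result sends no objects, so Set-Content never touches the file.
$empty | Set-Content -Path $path
$afterPipe = @(Get-Content -Path $path)

# Passing it through -Value always rewrites the file, here with no content.
Set-Content -Path $path -Value $empty
$afterValue = @(Get-Content -Path $path)
```

After the piped call the file still holds its original line; after the -Value call it is empty.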
Rather than using a text file, use a CSV file with headers. This is essentially just a text file saved with a .csv file extension that includes headers for each column:
Note I have added an additional row at the bottom with a date later than today's, so that the effect of the filter is visible when testing.
HelpineSpecialRoster.csv content:
Date,Time,Number,Name,Email
01/01/2019,6AM,400012345,Kurt,kurt#outlook.com
02/01/2019,6AM,412345676,Bill,bill#outlook.com
03/01/2019,6AM,400012345,Sam,Sam#outlook.com
04/01/2019,6AM,412345676,Barry,barry#outlook.com
05/01/2019,6AM,400012345,Kurt,kurt#outlook.com
01/02/2019,6AM,400012345,Dan,dan#outlook.com
Set the path of the CSV:
$csvPath = "C:\HelpineSpecialRoster.csv"
Import CSV from file:
$csvData = Import-CSV $csvPath
Get today's date at 00:00:
$date = (Get-Date).Date
Filter the CSV data to keep only rows whose date is later than today's date:
$csvData = $csvData | ? { $date -lt (Get-Date $_.Date) }
Export the CSV data back over the original CSV (-NoTypeInformation stops Windows PowerShell from writing a #TYPE line at the top of the file):
$csvData | Export-CSV $csvPath -Force -NoTypeInformation
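Because the roster dates are day/month/year, it can be safer to state the format explicitly rather than rely on how the current system interprets the string. A hedged sketch using [datetime]::ParseExact on inline data: the rows, the fixed reference date and the variable names are assumptions for illustration, and -ge is used so that a shift dated today would be kept, as in the question.

```powershell
# Inline sample data standing in for the CSV file (illustrative rows only).
$csvText = @'
Date,Time,Number,Name
01/01/2019,6AM,400012345,Kurt
05/02/2019,6AM,400012345,Dan
'@

$culture = [Globalization.CultureInfo]::InvariantCulture
$today   = [datetime]'2019-01-29'   # fixed reference date so the result is repeatable

# Parse each Date value as dd/MM/yyyy and keep today's and future shifts.
$upcoming = @($csvText | ConvertFrom-Csv | Where-Object {
    [datetime]::ParseExact($_.Date, 'dd/MM/yyyy', $culture) -ge $today
})
```

In a real script $today would be (Get-Date).Date and the rows would come from Import-Csv.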

How to compare to string in PowerShell foreach loop to files

How can I compare the current month with the month in which a file was modified, using a PowerShell script? My code is working fine, but it is reading all the CSV files in the given directory. I just want to read the files modified in the current month, i.e. October 2018.
Please help me out. I have a total of 6 files in my directory: 3 files were modified in October 2018 and the remaining 3 were modified in September 2018.
I want my script to check the current month and then read all CSV files from the current month, i.e. October 2018.
Code:
$files = Get-ChildItem 'C:\Users\212515181\Desktop\Logsheet\*.csv'
$targetPath = 'C:\Users\212515181\Desktop\Logsheet'
$result = "$targetPath\Final.csv"
$curr_month = (Get-Date).Month
$curr_year = (Get-Date).Year
# Adding header to output file
[System.IO.File]::WriteAllLines($result,[System.IO.File]::ReadAllLines($files[1])[1])
foreach ($file in $files)
{
$month = $file.LastWriteTime.ToString()
$curr_month=(Get-Date).Month
if ($month= $curr_month)
{
$firstLine = [System.IO.File]::ReadAllLines($file) | Select-Object -first 1
[System.IO.File]::AppendAllText($result, ($firstLine | Out-String))
$lines = [System.IO.File]::ReadAllLines($file)
[System.IO.File]::AppendAllText($result, ($lines[2..$lines.Length] | Out-String))
}
}
# Change output file name to reflect month and year in MMYYYY format
Rename-Item $result "Final_$curr_month$curr_year.csv"
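Your comparison is wrong. A single = is the assignment operator, so if ($month= $curr_month) assigns the current month to $month and then evaluates that non-zero value as $true, causing all files to be read. Use the comparison operator -eq instead, and compare the month number rather than the full date string returned by LastWriteTime.ToString():
$month = $file.LastWriteTime.Month
if ($month -eq $curr_month)
To avoid matching October of another year, also compare $file.LastWriteTime.Year with $curr_year.
I would also remove the second
$curr_month = (Get-Date).Month
inside the loop. It only adds overhead, because you already set it before the loop.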
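A minimal sketch of the corrected check, run against a temporary folder so that it is self-contained. The folder, the file names and the backdated timestamp are invented for the demonstration.

```powershell
# Temporary folder with two CSV files; one is backdated by two months.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'this_month.csv') -Value 'a,b'
Set-Content -Path (Join-Path $dir 'older.csv') -Value 'a,b'
(Get-Item (Join-Path $dir 'older.csv')).LastWriteTime = (Get-Date).AddMonths(-2)

# Compare the numeric Month and Year properties with -eq.
$now = Get-Date
$currentFiles = @(Get-ChildItem -Path $dir -Filter '*.csv' | Where-Object {
    $_.LastWriteTime.Month -eq $now.Month -and $_.LastWriteTime.Year -eq $now.Year
})
```

Filtering the file list up front also means the loop body no longer needs an if statement.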

Importing csv rows to powershell, with date specified as greater than

I have two csv files. One with a report from AD, containing accounts created during last month, second is manually kept database, that should theoretically contain the same information, but from all history of our company, with some additional data needed for accounting. I imported the AD report to powershell, now I need to import specific rows of the database. The rows I need are defined by a value in column "Date added". I need to import only rows, where the date exceeds specific value. I have this code:
$Report = Read-Host "File name" #AD report, last ten chars are date of report creation, in format yyyy-MM-dd
$Date_text = $Report.Substring($Report.get_Length()-10)
$Date = Get-Date -Date $Date_text
$Date_limit = (($Date).AddDays(-$Date.Day)).Date
$Date_start = $Date_limit.AddMonths(-1)
$CSVlicence = Import-Csv $Database -Encoding UTF8 |
where {(Where-Object {![string]::IsNullOrWhiteSpace($_.'Date added')} |
ForEach-Object{$_.'Date added' = $_.'datum Pridani' -as [datetime] $_}) -gt $Date_start}
When run like this, nothing is imported. Without the condition the database is imported successfully, but it's extremely large and the rest of the script takes for ever. So I need to work only with relevant data. I don't care, that when Date_limit is 30th Sep, the Date_start would be 30th Aug instead of 31st Aug. That's just few more rows, but all those 10 years or so really takes for ever, if everything is imported.
Based on your current logic, you are filtering the imported PSCustomObjects, which is the wrong place to do it: the condition as written rejects every row, and the whole database still has to be imported first. You want to filter the source instead.
$Report = Read-Host -Prompt 'Filename'
## Grabs the datestamp at the end
$Date = Get-Date -Date $Report.Substring($Report.Length - 10)
## Grabs last day of previous month
$Limit = $Date.AddDays(-$Date.Day)
## Grabs last day of two months ago, inaccuracy of one day
$Start = $Limit.AddMonths(-1)
## Copies the header line into a temporary file
Get-Content -Path $Database -TotalCount 1 | Set-Content -Path 'tempfile'
Get-Content -Path $Database -Encoding 'UTF8' |
    ## Checks that the entry has a valid date entry within limits
    ForEach-Object {
        ## For m/d/yy or d/m/yy variants, try
        ## '\d{1,2}\/\d{1,2}\/\d{2,4}'
        If ($_ -match '\d{4}-\d{2}-\d{2}')
        {
            [DateTime]$Date = $Matches[0]
            ## -le so that rows dated on the last day of the month are kept
            If ($Date -gt $Start -and $Date -le $Limit) { $_ }
        }
    } |
    Add-Content -Path 'tempfile'
$CSV = Import-Csv -Path 'tempfile' -Encoding 'UTF8'
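For comparison, if importing the objects first turns out to be acceptable, the filter from the question can be written as a single Where-Object with no nested pipeline. This is only a sketch: the inline rows are invented, the column is assumed to be called 'Date added', and its values are assumed to be in yyyy-MM-dd form. The -as operator yields $null for blank or unparsable values, so those rows are dropped.

```powershell
# Inline stand-in for the database file (illustrative rows only).
$rows = @'
Name,Date added
alice,2018-08-15
bob,
carol,2018-06-01
'@ | ConvertFrom-Csv

$Date_start = [datetime]'2018-07-31'

# Convert the column value once per row and compare it with the start date.
$recent = @($rows | Where-Object { ($_.'Date added' -as [datetime]) -gt $Date_start })
```

This still reads every row, so on a very large database the line-level filter above is likely to remain the faster option.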

Finding entries in a log file greater than a defined time

My company needs to analyse log files of a Tomcat to check for specific errors and uses Powershell. Those errors will be stored in an array and checked against 1:1. This happens every 30 minutes by using Windows Task Scheduler. In case such an error is found in the log file, a generated text file will be sent to the administrators.
However it is only of interest to check for errors during the last 30 minutes, not beforehand.
So I have defined first a variable for the time:
$logTimeStart = (Get-Date).AddMinutes(-30).ToString("yyyy-MM-dd HH:mm:ss")
Later on I check for the existence of such an error:
$request = Get-Content ($logFile) | select -last 100 | where { $_ -match $errorName -and $_ -gt $logTimeStart }
Unfortunately this does not work; it also sends errors that happened before this 30-minute interval.
Here is an extract of the Tomcat log:
2016-05-25 14:21:30,669 FATAL [http-apr-8080-exec-4] admins#company.de de.abc.def.business.service.ExplorerService GH00000476:
de.abc.def.business.VisjBusinessException: Invalid InstanceId
at de.abc.def.business.service.ExplorerService$ExplorerServiceStatic.getExplorer(ExplorerService.java:721)
at de.abc.def.business.service.ExplorerService$ExplorerServiceStatic.getTreeItemList(ExplorerService.java:823)
at sun.reflect.GeneratedMethodAccessor141.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at de.abc.def.business.provider.ServiceProvider.callServiceMethod(ServiceProvider.java:258)
at de.abc.def.business.communication.web.client.ServiceDirectWrapperDelegate.callServiceMethod(ServiceDirectWrapperDelegate.java:119)
at de.abc.def.business.communication.web.client.ServiceWrapperBase.callServiceMethod(ServiceWrapperBase.java:196)
at de.abc.def.business.communication.web.client.ServiceDirectWrapper.callServiceMethod(ServiceDirectWrapper.java:24)
at de.abc.def.web.app.service.stub.AbstractBaseStub.callServiceMethodDirect(AbstractBaseStub.java:72)
at de.abc.def.web.app.service.stub.AbstractBaseStub.callServiceMethod(AbstractBaseStub.java:183)
at de.abc.def.web.app.service.stub.StubSrvExplorer.getTreeItemList(StubSrvExplorer.java:135)
at de.abc.def.web.app.resource.servlet.ExplorerServlet.createXml(ExplorerServlet.java:350)
at de.abc.def.web.app.resource.servlet.ExplorerServlet.callExplorerServlet(ExplorerServlet.java:101)
at de.abc.def.web.app.resource.servlet.VisServlet.handleServlets(VisServlet.java:244)
at de.abc.def.web.app.FlowControlAction.isPing(FlowControlAction.java:148)
at de.abc.def.web.app.FlowControlAction.execute(FlowControlAction.java:101)
at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:484)
at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:274)
Unfortunately one cannot say how many lines of such an error will show up. Therefore 100 is just an estimate (which works well).
So how to change the related line
$request = Get-Content ($logFile) | select -last 100 |
where { $_ -match $errorName -and $_ -gt $logTimeStart }
to a correct one?
Use Select-String and a regular expression to parse your log file. Basically a log entry consists of a timestamp, the severity, and a message (which may span several lines). A regular expression for that might look like this:
(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(\w+)\s+(.*(?:\n\D.*)*)
(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) matches the timestamp.
(\w+) matches the severity.
(.*(?:\n\D.*)*) matches the log message (the current line followed by zero or more lines not beginning with a number).
The parentheses around each subexpression capture the submatch as a group that can then be used for populating the properties of custom objects.
$datefmt = 'yyyy-MM-dd HH:mm:ss,FFF'
$culture = [Globalization.CultureInfo]::InvariantCulture
# \r?\n accepts both Windows (CRLF) and Unix (LF) line endings
$pattern = '(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(\w+)\s+(.*(?:\r?\n\D.*)*)'
$file = 'C:\path\to\your.log'
Get-Content $file -Raw | Select-String $pattern -AllMatches | ForEach-Object {
    $_.Matches | ForEach-Object {
        New-Object -Type PSObject -Property @{
            Timestamp = [DateTime]::ParseExact($_.Groups[1].Value, $datefmt, $culture)
            Severity = $_.Groups[2].Value
            Message = $_.Groups[3].Value
        }
    }
}
Parsing the date substring into a DateTime value isn't actually required (since date strings in ISO format can be sorted properly even with string comparisons), but it's nice to have so you don't have to convert your reference timestamp to a formatted string.
Note that you need to read the entire log file as a single string for this to work. In PowerShell v3 and newer this can be achieved by calling Get-Content with the parameter -Raw. On earlier versions you can pipe the output of Get-Content through the Out-String cmdlet to get the same result:
Get-Content $file | Out-String | ...
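Once the entries are objects, restricting them to the last 30 minutes is an ordinary DateTime comparison. A sketch under stated assumptions: the log text is generated in memory with made-up entries, [regex]::Matches is used in place of Select-String so that no file is needed, and the format uses fff because every sample timestamp carries exactly three millisecond digits.

```powershell
$datefmt = 'yyyy-MM-dd HH:mm:ss,fff'
$culture = [Globalization.CultureInfo]::InvariantCulture
$pattern = '(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(\w+)\s+(.*(?:\r?\n\D.*)*)'

# Two invented entries with the same error: one 2 hours old, one 5 minutes old.
$stale  = (Get-Date).AddHours(-2).ToString($datefmt, $culture)
$recent = (Get-Date).AddMinutes(-5).ToString($datefmt, $culture)
$log = @(
    "$stale FATAL Invalid InstanceId",
    "    at Foo.bar(Foo.java:1)",
    "$recent FATAL Invalid InstanceId",
    "    at Foo.baz(Foo.java:2)"
) -join "`n"

# Turn every match into an object with a real DateTime timestamp.
$entries = [regex]::Matches($log, $pattern) | ForEach-Object {
    [pscustomobject]@{
        Timestamp = [DateTime]::ParseExact($_.Groups[1].Value, $datefmt, $culture)
        Severity  = $_.Groups[2].Value
        Message   = $_.Groups[3].Value
    }
}

# Keep entries from the last 30 minutes whose message mentions the error.
$errorName = 'Invalid InstanceId'
$since = (Get-Date).AddMinutes(-30)
$hits = @($entries | Where-Object { $_.Timestamp -ge $since -and $_.Message -match $errorName })
```

Because the whole stack trace is part of each entry's Message, there is no longer any need to guess how many trailing lines to read.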

Using PowerShell, how would I create/split a file in 1-week chunks?

This is in reference to the post here: How to delete date-based lines from files using PowerShell
Using the below code (contributed by 'mjolinor') I can take a monolithic (pipe "|" delimited) CSV file and create a trimmed CSV file with only lines containing dates less than $date:
$date = '09/29/2011'
foreach ($file in gci *.csv) {
(gc $file) |
? {[datetime]$_.split('|')[1] -lt $date
} | set-content $file
}
The above code works great! What I need to do now is create additional CSV files from the monolithic CSV file with lines containing dates >= $date, and each file needs to be in 1-week chunks going forward from $date.
For example, I need the following 'trimmed' CSV files (all created from original CSV):
All dates less than 09/30/2011 (already done with code above)
File with date range 09/30 - 10/6
File with date range 10/7 - 10/14
Etc, etc, until I reach the most recent date
You can use the GetWeekOfYear method of Calendar like this
$date = (Get-Date)
$di = [Globalization.DateTimeFormatInfo]::CurrentInfo
$week = $di.Calendar.GetWeekOfYear($date, $di.CalendarWeekRule, $di.FirstDayOfWeek)
to determine the week number of a given date.
This is NOT tested with your input (it was adapted from some script I use for time-slicing and counting Windows log events), but should be close to working. It can create files on any arbitrary time span you designate in $span:
$StartString = '09/29/2011'
$inputfile = 'c:\somedir\somefile.csv'
$Span = New-TimeSpan -Days 7
$lines = @{}
$TimeBase = [DateTime]::MinValue
$StartTicks = ([datetime]$StartString).Ticks
$SpanTicks = $Span.Ticks
# Bucket every line by the number of whole spans between its date and the start date
Get-Content $inputfile |
    foreach {
        $dt = [datetime]$_.split('|')[1]
        $Time_Slice = [int][math]::truncate(($dt.Ticks - $StartTicks) / $SpanTicks)
        $lines[$Time_Slice] += @($_)
    }
# Write each bucket to a file named after the first day of its span
$lines.GetEnumerator() |
    foreach {
        $filename = ([datetime]$StartString + [TimeSpan]::FromTicks($SpanTicks * $_.Name)).tostring("yyyyMMdd") + '.csv'
        # The buckets hold raw text lines, so write them back as text
        $_.value | Set-Content $filename
    }
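The bucketing step can also be expressed with Group-Object, which may be easier to check on a handful of lines before running it against the full file. This is a sketch with invented pipe-delimited rows and a start date of 09/30/2011. It swaps in [math]::Floor on whole days for the tick arithmetic above, and it only computes the target file names rather than writing any files.

```powershell
$start = [datetime]'09/30/2011'

# Illustrative rows: the date is the second pipe-delimited field.
$rows = '1|09/30/2011|a', '2|10/06/2011|b', '3|10/07/2011|c'

# Group rows by the number of whole weeks between their date and the start date.
$groups = $rows | Group-Object {
    [math]::Floor(([datetime]($_.Split('|')[1]) - $start).TotalDays / 7)
}

# Map each week index to a file name based on the first day of that week.
$plan = @{}
foreach ($g in $groups) {
    $weekStart = $start.AddDays(7 * [int]$g.Name)
    $plan[$weekStart.ToString('yyyyMMdd') + '.csv'] = $g.Group
}
```

Each value in $plan could then be passed to Set-Content with the matching key as the path.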