Determine latest file based on datestamp in filename - powershell

I have a series of files in a directory with the following format:
file_ddMMyyyyhhttss.csv
eg:
myfile_151220171038.csv
myfile_301120171445.csv
myfile_121020161114.csv
I know how to select the latest by LastWriteTime:
gci "$pathtofile" | Sort LastWriteTime | Select -Last 1
but I'm unsure how to split out the "datestamp" in the file name and then sort by year, month and day in order to determine the latest file. Any suggestions?

Well, the specific format in your file names prevents any useful sorting without parsing it, but you can do just that:
Get-ChildItem $pathtofile |
ForEach-Object {
# isolate the timestamp
$time = $_ -replace '.*(\d{12}).*','$1'
# parse
$timestamp = [DateTime]::ParseExact($time, 'ddMMyyyyHHmm', $null)
# add to the objects so we can sort
$_ | Add-Member -PassThru NoteProperty Timestamp $timestamp
} |
Sort-Object Timestamp
Adjust to fit your exact date/time format, because the one you specified in your question does not match the one on your files.

My proposition ;)
Only take files, only files whose name matches the requested format, and only output files whose datestamp parses as a real date.
[System.DateTime]$parsedDate=get-date
Get-ChildItem "c:\temp\" -file -filter "*.csv" | where BaseName -match ".*(\d{12}).*" | %{
$DtString=$_.BaseName.substring($_.BaseName.Length - 12)
if ([DateTime]::TryParseExact($DtString, "ddMMyyyyHHmm",$null,[System.Globalization.DateTimeStyles]::None,[ref]$parseddate))
{
$_ | Add-Member -Name TimeInName -Value $parseddate -MemberType NoteProperty -PassThru
}
} | Sort TimeInName -Descending | Select -First 1
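One detail worth checking in any ParseExact/TryParseExact format string is hh (12-hour clock) versus HH (24-hour clock). The following is a minimal sketch using one of the sample stamps from the question; the variable names are made up for illustration.

```powershell
$culture = [Globalization.CultureInfo]::InvariantCulture
$none    = [Globalization.DateTimeStyles]::None
[DateTime]$parsed = [DateTime]::MinValue

# 30 Nov 2017 14:45 -- hour 14 is not a valid value for the 12-hour 'hh' specifier
$with12h = [DateTime]::TryParseExact('301120171445', 'ddMMyyyyhhmm', $culture, $none, [ref]$parsed)

# the 24-hour 'HH' specifier accepts it
$with24h = [DateTime]::TryParseExact('301120171445', 'ddMMyyyyHHmm', $culture, $none, [ref]$parsed)

"hh: $with12h, HH: $with24h"
```

With hh, afternoon files would be silently skipped by a TryParseExact filter, so the "latest" file could be wrong without any error being shown.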

The Sort-Object cmdlet can work not only with regular but also with calculated properties. That allows you to sort the files without passing them through a ForEach-Object loop first. Like this:
$pattern = '.*_(\d{14}).*'
$datefmt = 'ddMMyyyyHHmmss'
$culture = [Globalization.CultureInfo]::InvariantCulture
Get-ChildItem $pathtofile | Sort-Object @{e={
$datestr = $_.Basename -replace $pattern, '$1'
[DateTime]::ParseExact($datestr, $datefmt, $culture)
}} | Select-Object -Last 1
I usually recommend using the InvariantCulture constant rather than $null as the third argument for ParseExact(), because using $null gave me errors in some cases.
If your source directory contains files that don't match your filename pattern you may want to exclude them with a Where-Object filter before sorting the rest:
Get-ChildItem $pathtofile | Where-Object {
$_.Basename -match $pattern
} | Sort-Object @{e={
$datestr = $_.Basename -replace $pattern, '$1'
[DateTime]::ParseExact($datestr, $datefmt, $culture)
}} | Select-Object -Last 1
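To try the idea without touching a real directory, here is a minimal sketch that sorts the three sample names from the question by their parsed stamp. It passes a bare script block to Sort-Object, and it assumes the 12-digit ddMMyyyyHHmm stamps the sample files actually carry; the $names array is made up for illustration.

```powershell
# Sample names from the question (no file system access needed)
$names   = 'myfile_151220171038.csv', 'myfile_301120171445.csv', 'myfile_121020161114.csv'
$culture = [Globalization.CultureInfo]::InvariantCulture

$latest = $names |
    Sort-Object -Property {
        # pull out the 12-digit stamp and turn it into a DateTime
        $stamp = $_ -replace '.*_(\d{12})\.csv$', '$1'
        [DateTime]::ParseExact($stamp, 'ddMMyyyyHHmm', $culture)
    } |
    Select-Object -Last 1

$latest
```

A plain string sort would put myfile_301120171445.csv last because the day comes first in the stamp; sorting on the parsed DateTime picks the file from 15 December 2017 instead.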

Related

How to strip out leading time stamp?

I have some log files.
Some of the UPDATE SQL statements are getting errors, but not all.
I need to know all the statements that are getting errors so I can find the pattern of failure.
I can sort all the log files and get the unique lines, like this:
$In = "C:\temp\data"
$Out1 = "C:\temp\output1"
$Out2 = "C:\temp\output2"
Remove-Item $Out1\*.*
Remove-Item $Out2\*.*
# Get the log files from the last 90 days
Get-ChildItem $In -Filter *.log | Where-Object {$_.LastWriteTime -gt (Get-Date).AddDays(-90)} |
Foreach-Object {
$content = Get-Content $_.FullName
#filter and save content to a file
$content | Where-Object {$_ -match 'STATEMENT'} | Sort-Object -Unique | Set-Content $Out1\$_
}
# merge all the files, sort unique, write to output
Get-Content $Out1\* | Sort-Object -Unique | Set-Content $Out2\output.txt
Works great.
But some of the logs have a date-time stamp in the first 24 characters of each line. I need to strip that out; otherwise all those lines are unique.
If it helps, all the files either have the leading timestamp or they don't. The lines are not mixed within a single file.
Here is what I have so far:
# Get the log files from the last 90 days
Get-ChildItem $In -Filter *.log | Where-Object {$_.LastWriteTime -gt (Get-Date).AddDays(-90)} |
Foreach-Object {
$content = Get-Content $_.FullName
#filter and save content to a file
$s = $content | Where-Object {$_ -match 'STATEMENT'}
# strip datetime from front if exists
If (Where-Object {$s.Substring(0,1) -Match '/d'}) { $s = $s.Substring(24) }
$s | Sort-Object -Unique | Set-Content $Out1\$_
}
# merge all the files, sort unique, write to output
Get-Content $Out1\* | Sort-Object -Unique | Set-Content $Out2\output.txt
But it just writes the lines out without stripping the leading characters.
The regex /d should be \d (\ is the escape character in general, and character-class shortcuts such as d for a digit must be prefixed with it).
Use a single pipeline that passes the Where-Object output to a ForEach-Object call where you can perform the conditional removal of the numeric prefix.
$content |
Where-Object { $_ -match 'STATEMENT' } |
ForEach-Object { if ($_[0] -match '\d') { $_.Substring(24) } else { $_ } } |
Set-Content $Out1\$_
Note: Strictly speaking, \d matches everything that the Unicode standard considers a digit, not just the ASCII-range digits 0 to 9; to limit matching to the latter, use [0-9].
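The difference between \d and [0-9] can be seen with a non-ASCII digit. This is a small sketch; the variable name is made up, and the character is built from its code point to avoid file-encoding issues.

```powershell
# U+0663 is ARABIC-INDIC DIGIT THREE, a Unicode decimal digit outside the ASCII range
$arabicThree = [string][char]0x0663

$matchesBackslashD = $arabicThree -match '\d'      # Unicode-aware digit class
$matchesAsciiRange = $arabicThree -match '[0-9]'   # ASCII digits only
$asciiDigit        = '3' -match '[0-9]'

"\d: $matchesBackslashD, [0-9]: $matchesAsciiRange, ASCII 3 with [0-9]: $asciiDigit"
```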

Select multiple files by file name and get content

I run this script for my monitoring system, but I want to extend the range of available dates.
Is there any way to get the content of multiple files with different names? Currently I'm only looking for one specific name pattern, for example all files whose name includes a specific date, like 2022-08-08*.log.
What I want to do is collect all files from 7 days ago up to 1 day ago in one go and get their content.
$backuppath = "random-name-in-logfile"
$yest = (get-date (get-date).addDays(-1) -UFormat "%Y-%m-%d")
# check for pattern in files
$path1 = Get-ChildItem `
-Path "C:\path\to\log" -Filter "$yest*.log" -recurse | `
Select-String -pattern ([regex]::escape($backuppath)) | `
Select-Object -Property Path
# transform string to usable path
$path2 = $path1 -replace ('@{Path=','') -replace ('}','')
# check for more details
$analyze = Get-Content $path2 | Select-String -pattern "Pattern" -SimpleMatch
OK, I think I got it.
It's easier to filter on the creation time with a limit such as 1 day (or, for my example, 7 days):
$path1 = Get-ChildItem `
-Path "C:\path\to\log" -Filter "*.log" -recurse | `
Where-Object { $_.CreationTime -gt (Get-Date).AddDays(-7) } | `
Select-String -pattern ([regex]::escape($backuppath)) | `
Select-Object -Property Path
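If the selection should be driven by the date in the file name rather than by CreationTime, one option is to build the list of date prefixes first. This is only a sketch under the assumption that the log names start with yyyy-MM-dd as in the question; $logRoot, $prefixes and $files are illustrative names.

```powershell
$logRoot = 'C:\path\to\log'   # same placeholder path as in the question

# Date prefixes for yesterday back to 7 days ago, e.g. 2022-08-08
$prefixes = 1..7 | ForEach-Object { (Get-Date).AddDays(-$_).ToString('yyyy-MM-dd') }

# Keep only files whose name starts with one of those prefixes
$files = Get-ChildItem -Path $logRoot -Filter '*.log' -Recurse -File -ErrorAction SilentlyContinue |
    Where-Object {
        $name = $_.Name
        $prefixes | Where-Object { $name.StartsWith($_) }
    }

$prefixes
```

Unlike the CreationTime filter, this leaves out today's file and does not depend on when a file was copied or created.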

powershell: copy date from XML and create a new file

I have XML files whose names match the mask XXXXX-sell-XXXXX.xml. The content looks like this:
<?xml version="1.0" encoding="utf-8"?>
<document xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<id>document-20210301-sell</id>
<number>1</number>
<type>a1</type>
<id>document-20210301-sell</id>
<number>2</number>
<type>a1</type>
.....
</document>
I want to copy the newest XML to another folder, putting the date taken from the XML into the file name, using the mask:
yyyymmdd-sell.xml
My code doesn't do the job:
$path1="C:\sell\"
$path2="c:\sell\in\"
Get-ChildItem -Path $path1 -Filter "*-sell-*" |
Where-Object { -not $_.PSIsContainer } |
Sort-Object -Property CreationTime |
Select-Object -Last 1 |
[xml]$xml = Get-Content
$date = $xml.document | Where-Object {$_.id -like "*20210301*"}
Copy-Item -Destination $path2 -PassThru |
Rename-Item -NewName {$Date + "-sell".xml"}
Welcome to StackOverflow, mcq. This might work, sometimes pipes make things complicated:
$path1="C:\sell\"
$path2="c:\sell\in\"
$f = Get-ChildItem -Path $path1 -Filter "*-sell-*" |
Where-Object { -not $_.PSIsContainer } |
Sort-Object -Property CreationTime |
Select-Object -Last 1
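# use FullName so the file is found regardless of the current directory
[xml]$xml = Get-Content $f.FullName
# @() keeps the indexing correct even if the document has only one <id>
$date = @($xml.document.id)[0]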
#
# get date like yyyymmdd:
#
if($date -match "\d+") {
$date = $matches[0];
}
#
# copy and rename:
#
Copy-Item $f.FullName -Destination $path2 -PassThru |
Rename-Item -NewName ($date + "-sell.xml")
With $date -match "\d+", \d is the regex (regular expression) symbol for a digit and "+" means one or more of them, so the pattern captures the first run of digits in the .id value. In our case that is the 8-digit yyyymmdd date, such as 20210301, 20190925, ...
Please also consult https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_regular_expressions?view=powershell-7.1 to start.
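A slightly stricter variant is to require exactly eight digits and check that they form a real date before building the new name. This is a minimal sketch on the sample id from the question; the variable names are illustrative and no file is touched.

```powershell
$id = 'document-20210301-sell'   # sample id value from the question's XML

if ($id -match '\d{8}') {
    $date = $matches[0]
    # throws if the eight digits are not a valid yyyyMMdd date
    $parsed  = [DateTime]::ParseExact($date, 'yyyyMMdd', [Globalization.CultureInfo]::InvariantCulture)
    $newName = $date + '-sell.xml'
}

$newName
```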

Powershell - Sorting and selecting files in foreach loop

I'm trying to create a script that will find the most recent build_info files from multiple install locations in a server's directory, select the "version: " text from each file, and compare them to see if they're all the same (which is what we hope for), or if certain install locations have different versions. As a bonus, it would also be nice to have each path's install version have its own variable so that if I have to output any differences, I can say which specific paths have which versions. For example, if something is installed in Path1, Path2, and Path3, I want to be able to say, "all paths are on version 3.5," or "Path1 is version 1.2, Path2 is version 3.5, Path3 is version 4.8."
Here's a neater list of what I'm trying to do:
Loop through folders in a directory.
For each folder, sort the txt files with a specific name in that path by Creation Date descending and select the most recent.
Once it has the most recent files from each path, Select-String a specific phrase from each of them. Specifically, "version: ".
Compare the version from each path and see if all are the same or there are differences, then output the result.
This is what I've been able to write so far:
$Directory = dir D:\Directory\Path* | ?{$_.PSISContainer};
$Version = @();
foreach ($d in $Directory) {
$Version = (Select-String -Path D:\Directory\Path*\build_info_v12.txt -Pattern "Version: " | Select-Object -ExpandProperty Line) -replace "Version: ";
}
if (@($Version | Select -Unique).Count -eq 1) {
Write-Host 'The middle tiers are all on version' ($Version | Select -Unique);
}
else {
Write-Host 'One or more middle tiers has a different version.';
}
I've had to hard code in the most recent build_info files because I'm not sure how to incorporate the sorting aspect into this. I'm also not sure how to effectively assign each path's result to a variable and output them if there are differences. This is what I've been messing around with as far as the sorting aspect, but I don't know how to incorporate it and I'm not even sure if it's the right way to approach this:
$Recent = Get-ChildItem -Path D:\Directory\Path*\build_info*.txt | Sort-Object CreationTime -Descending | Select-Object -Index 0;
You can use Sort-Object and Select-Object to determine the most recent file. Here is a function that you can give a collection of files to and it will return the most recent one:
function Get-MostRecentFile{
param(
$fileList
)
$mostRecent = $fileList | Sort-Object LastWriteTime | Select-Object -Last 1
$mostRecent
}
Here is one possible solution:
Get-ChildItem "D:\Directory\Path" -Include "build_info*.txt" -File -Recurse |
Group-Object -Property DirectoryName |
ForEach-Object {
$_.Group |
Sort-Object LastWriteTime -Descending |
Select-Object -First 1 |
ForEach-Object {
New-Object -TypeName PsCustomObject |
Add-Member -MemberType NoteProperty -Name Directory -Value $_.DirectoryName -PassThru |
Add-Member -MemberType NoteProperty -Name FileName -Value $_.Name -PassThru |
Add-Member -MemberType NoteProperty -Name MaxVersion -Value ((Select-String -Path $_.FullName -Pattern "Version: ").Line.Replace("Version: ","")) -PassThru
}
}
This will produce a collection of objects, one for each directory in the tree, with properties for the directory name, most recent version and the file we found the version number in. You can pipe these to further cmdlets for filtering, etc.
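For the comparison step the question asks about, the resulting objects could be summarized roughly as follows. This is a sketch on made-up sample data: $results stands in for the output of the pipeline above, and the paths and versions are taken from the example wording in the question.

```powershell
# Hypothetical stand-in for the objects produced by the pipeline above
$results = @(
    [pscustomobject]@{ Directory = 'D:\Directory\Path1'; MaxVersion = '1.2' }
    [pscustomobject]@{ Directory = 'D:\Directory\Path2'; MaxVersion = '3.5' }
    [pscustomobject]@{ Directory = 'D:\Directory\Path3'; MaxVersion = '4.8' }
)

# Distinct versions across all directories
$distinct = @($results | Select-Object -ExpandProperty MaxVersion -Unique)

if ($distinct.Count -eq 1) {
    $summary = 'All paths are on version {0}' -f $distinct[0]
}
else {
    # list every directory with its own version
    $summary = ($results | ForEach-Object { '{0} is version {1}' -f $_.Directory, $_.MaxVersion }) -join ', '
}

$summary
```

Because each object keeps its directory next to its version, no per-path variables are needed to report which path is on which version.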

How to parse filenames to determine the newest file in each of multiple folders

I have logs that are getting written from various Linux servers to a central windows NAS server. They're in E:\log in the format:
E:\log\process1\log20140901.txt,
E:\log\process2\20140901.txt,
E:\log\process3\log-process-20140901.txt,
etc.
Multiple files get copied on a weekly basis at the same time, so created date isn't a good way to determine what the newest file is. Therefore I wrote a powershell function to parse the date out, and I'm attempting to iterate through and get the newest file in each folder, using the output of my function as the "date". I'm definitely doing something wrong.
Here's the Powershell I've written so far:
Function ReturnDate ($file)
{
$f = $file
$f = [RegEx]::Matches($f,"(\d{8})") | Select-Object -ExpandProperty Value
$sqlDate = $f.Substring(0,4) + "-" + $f.substring(4,2) + "-" + $f.substring(6,2)
return $sqlDate
}
Get-ChildItem E:\log\* |
Where {$_.PsIsContainer} |
foreach-object { Get-ChildItem $_ -Recurse |
Where {!$_.PsIsContainer} |
ForEach-Object { ReturnDate $_}|
Sort-Object ReturnDate -Descending |
Select-Object -First 1 | Select Name,ReturnDate
}
I seem to be confounding properties and causing "You cannot call a method on a null-valued expression" errors, but I'm uncertain what to do from here.
I suspect your $f variable is null and you're trying to invoke a method (Substring) on a null value. Try this instead:
Get-ChildItem E:\Log -File -Recurse | Where Name -Match '(\d{8})\.' |
Foreach {Add-Member -Inp $_ NoteProperty ReturnDate ($matches[1]) -PassThru} |
Group DirectoryName |
Foreach {$_.Group | Sort ReturnDate -Desc | Select -First 1}
This does require V3 or higher. If you're on V1 or V2 change it to this:
Get-ChildItem E:\Log -Recurse |
Where {!$_.PSIsContainer -and $_.Name -Match '(\d{8})\.'} |
Foreach {Add-Member -Inp $_ NoteProperty ReturnDate ($matches[1]) -PassThru} |
Group DirectoryName |
Foreach {$_.Group | Sort ReturnDate -Desc | Select -First 1}
Your code worked for me when I tried it, up until the final select: you were requesting Name and ReturnDate, but those properties did not exist. Creating a custom object with those values would make your code work. I also removed some of the logic from your pipes. The end result should still work, though (I just made some dummy files like your examples to test with).
Working from your original code, you could have something like this. It only works on v3 or higher, mostly because of [pscustomobject]; simple changes could make it work on lower versions if need be.
Function ReturnDate ($file)
{
$f = $file
$f = [RegEx]::Matches($f,"(\d{8})") | Select-Object -ExpandProperty Value
$sqlDate = $f.Substring(0,4) + "-" + $f.substring(4,2) + "-" + $f.substring(6,2)
[pscustomobject] @{
'Name' = $file.FullName
'ReturnDate' = $sqlDate
}
}
Get-ChildItem C:\temp\E\* -Recurse |
Where-Object {!$_.PSIsContainer} |
ForEach-Object{ReturnDate $_} |
Sort-Object ReturnDate -Descending |
Select-Object -First 1
The Sort-Object cmdlet supports sorting by a custom script block and will sort by whatever the script block returns. So, use a regular expression to grab the timestamp and return it.
Get-ChildItem E:\log\* -Directory |
ForEach-Object {
Get-ChildItem $_ -Recurse -File |
Sort-Object -Property {
if( $_.Name -match '(\d{8})' )
{
return $Matches[1]
}
Write-Error ('File ''{0}'' doesn''t contain a timestamp in its name.' -f $_.FullName)
} |
Select-Object -Last 1 |
Select Name,ReturnDate
}
Note that Select-Object -First 1 was changed to Select-Object -Last 1, since dates would be sorted from oldest to newest.
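The per-folder selection can be tried on plain path strings before running it against E:\log. This is a minimal sketch; the sample paths are made up in the style of the question, and a yyyyMMdd stamp sorts correctly as a plain string, so no date parsing is needed here.

```powershell
# Hypothetical sample paths in the style of the question
$paths = 'E:\log\process1\log20140901.txt',
         'E:\log\process1\log20140908.txt',
         'E:\log\process2\20140901.txt',
         'E:\log\process2\20140825.txt'

$newest = $paths |
    ForEach-Object {
        if ($_ -match '(\d{8})\.txt$') {
            [pscustomobject]@{
                Folder     = $_.Substring(0, $_.LastIndexOf('\'))
                Path       = $_
                ReturnDate = $matches[1]
            }
        }
    } |
    Group-Object Folder |
    ForEach-Object { $_.Group | Sort-Object ReturnDate -Descending | Select-Object -First 1 }

$newest | Select-Object Path, ReturnDate
```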