Parse out date from filename and sort by date - powershell

I have a series of files named as such in a folder:
- myFile201801010703.file
I'm trying to parse out the yyyymmdd portion of each filename in the folder and sort them based on the date into an array.
So if I had the following files:
myFile201801200000.file (01/20/2018)
myFile201801010000.file (01/01/2018)
myFile201801100000.file (01/10/2018)
It would sort them into an array as such:
myFile201801010000.file (01/01/2018)
myFile201801100000.file (01/10/2018)
myFile201801200000.file (01/20/2018)
I have a process that works for files with timestamps included in the name, but I have been unable to tweak it to work with only a date:
# RegEx pattern to parse the timestamps
$Pattern = '(\d{4})(\d{2})(\d{2})*\' + ".fileExtension"
$FilesList = New-Object System.Collections.ArrayList
$Temp = New-Object System.Collections.ArrayList
Get-ChildItem $SourceFolder | ForEach {
if ($_.Name -match $Pattern) {
Write-Verbose "Add $($_.Name)" -Verbose
$Date = $Matches[2],$Matches[3],$Matches[1] -join '/'
$Time = $Matches[4..6] -join ':'
[void]$Temp.Add(
(New-Object PSObject -Property @{
Date = [datetime]"$($Date) $($Time)" #If I comment out $($Time)it doesn't work.
File = $_
}
))
}
}
} catch {
Write-Host "`n*** $Error ***`n"
}
# Sort the files by the parsed timestamp and add to $FilesList
$FilesList.AddRange(@($Temp | Sort Date | Select -Expand File))
# Clear out the temp collection
$Temp.Clear()
The two lines in particular that I think might be culprit are:
$Time = $Matches[4..6] -join ':' Since I'm not parsing any time
Date = [datetime]"$($Date) $($Time)" Again, no time is parsed. Can't change the type to date either it seems?

With this format:
myFileYYYYMMddHHmm.file
the individual parts of the date and time are already arranged from largest (the year) to smallest (the minute), which makes the string sortable as-is!
The only thing we need to do is grab the last 12 digits of the file name before the extension:
$SortedArray = Get-ChildItem *.file |Sort-Object {$_.BaseName -replace '^.*(\d{12})$','$1'}
The regex pattern used:
^.*(\d{12})$
Can be broken down as follows:
^ # start of string
.* # any character, 0 or more times
( # capture group
\d{12} # any digit, 12 times
) # end of capture group
$ # end of string
The regex engine will expand $1 in the substitution string to "capture group #1", i.e. the 12 digits we picked up at the end.
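Since the question asks about the yyyyMMdd portion only, the same idea can be narrowed to the date. The following is a minimal sketch, assuming every name ends in 12 digits before the extension; the sample names are hypothetical and stand in for Get-ChildItem output. If a name does not match, -replace returns it unchanged and ParseExact throws, which at least makes the bad name visible.

```powershell
# Hypothetical sample names standing in for (Get-ChildItem *.file).Name
$names = 'myFile201801200000.file',
         'myFile201801010000.file',
         'myFile201801100000.file'

$sortedNames = $names | Sort-Object {
    # strip the extension, then keep the first 8 of the 12 trailing digits
    $base     = [System.IO.Path]::GetFileNameWithoutExtension($_)
    $datePart = $base -replace '^.*(\d{8})\d{4}$', '$1'
    # parse as a real date so an invalid value fails instead of sorting silently
    [datetime]::ParseExact($datePart, 'yyyyMMdd', [cultureinfo]::InvariantCulture)
}

$sortedNames
```

Sorting on a parsed [datetime] rather than on the raw digits is a design choice: it costs a little more per file but rejects impossible dates such as month 00.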

Related

Powershell Files fetch

I am looking for some help creating a PowerShell script.
I have a folder with lots of files, and I need only those files whose content meets both of the following conditions:
- It must contain a string matching any of the patterns in file1 (the content of file1 is -IND 23042528525 or INDE 573626236 or DSE3523623; there can be more strings like this).
- It must also contain a date between 03152022 and 03312022, in the format mmddyyyy.
The files could be old, so the creation time is irrelevant.
The result should then be saved to a CSV containing the paths of the files that fulfil both conditions.
Currently I am using the command below, which only gives me the files that fulfil the first condition.
$table = Get-Content C:\Users\username\Downloads\ISIN.txt
Get-ChildItem `
-Path E:\data\PROD\server\InOut\Backup\*.txt `
-Recurse |
Select-String -Pattern ($table)|
Export-Csv C:\Users\username\Downloads\File_Name.csv -NoTypeInformation
To test if a file contains a certain keyword from a range of keywords, you can use regex for that. If you also want to find at least one valid date in format 'MMddyyyy' in that file, you need to do some extra work.
Try below:
# read the keywords from the file. Ensure special characters are escaped and join them with '|' (regex 'OR')
$keywords = (Get-Content -Path 'C:\Users\username\Downloads\ISIN.txt' | ForEach-Object {[regex]::Escape($_)}) -join '|'
# create a regex to capture the date pattern (8 consecutive digits)
$dateRegex = [regex]'\b(\d{8})\b' # \b means word boundary
# and a datetime variable to test if a found date is valid
$testDate = Get-Date
# set two variables to the start and end date of your range (dates only, times set to 00:00:00)
$rangeStart = (Get-Date).AddDays(1).Date # tomorrow
$rangeEnd = [DateTime]::new($rangeStart.Year, $rangeStart.Month, 1).AddMonths(1).AddDays(-1) # end of the month
# find all .txt files and loop through. Capture the output in variable $result
$result = Get-ChildItem -Path 'E:\data\PROD\server\InOut\Backup' -Filter '*.txt' -File -Recurse |
ForEach-Object {
$content = Get-Content -Path $_.FullName -Raw
# first check if any of the keywords can be found
if ($content -match $keywords) {
# now check if a valid date pattern 'MMddyyyy' can be found as well
$dateFound = $false
$match = $dateRegex.Match($content)
while ($match.Success -and !$dateFound) {
# we found a matching pattern. Test if this is a valid date and if so
# set the $dateFound flag to $true and exit the while loop
if ([datetime]::TryParseExact($match.Groups[1].Value,
'MMddyyyy',[CultureInfo]::InvariantCulture,
[System.Globalization.DateTimeStyles]::None,
[ref]$testDate)) {
# check if the found date is in the set range
# this tests INCLUDING the start and end dates
$dateFound = ($testDate -ge $rangeStart -and $testDate -le $rangeEnd)
}
$match = $match.NextMatch()
}
# finally, if we also successfully found a date pattern, output the file
if ($dateFound) { $_.FullName }
elseif ($content -match '\bUNKNOWN\b') {
# here you output again, because unknown was found instead of a valid date in range
$_.FullName
}
}
}
# result is now either empty or a list of file fullnames
$result | set-content -Path 'C:\Users\username\Downloads\MatchedFiles.txt'
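The range in the script above is computed relative to today. For the fixed window named in the question (03152022 to 03312022), the bounds could instead be parsed from the same 'MMddyyyy' format. The following is a minimal sketch; Test-DateInRange is a hypothetical helper name that is not part of the original answer, and the sample text is made up.

```powershell
# Hypothetical helper: returns $true if $Text contains at least one valid
# MMddyyyy date between $Start and $End (both inclusive)
function Test-DateInRange {
    param(
        [string]$Text,
        [datetime]$Start,
        [datetime]$End
    )
    $parsed = [datetime]::MinValue
    foreach ($m in [regex]::Matches($Text, '\b\d{8}\b')) {
        # only accept 8-digit runs that are real calendar dates
        $ok = [datetime]::TryParseExact($m.Value, 'MMddyyyy',
                [cultureinfo]::InvariantCulture,
                [System.Globalization.DateTimeStyles]::None,
                [ref]$parsed)
        if ($ok -and $parsed -ge $Start -and $parsed -le $End) { return $true }
    }
    return $false
}

$culture    = [cultureinfo]::InvariantCulture
$rangeStart = [datetime]::ParseExact('03152022', 'MMddyyyy', $culture)
$rangeEnd   = [datetime]::ParseExact('03312022', 'MMddyyyy', $culture)

# Made-up file content for illustration
Test-DateInRange -Text 'INDE 573626236 settled 03202022' -Start $rangeStart -End $rangeEnd
```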

Rename files in a specific way. Target nth string between symbols

Apologies in advance for a bit vague question (no coding progress).
I have files (they could be .csv files but lack the .csv extension, which I can easily add via script). The file names look something like this:
TRD_123456789_ABC123456789_YYMMDD_HHMMSS_12345678_12345_blabla_blabla_blabla_blabla
Now I need a script that renames each file so that it keeps the original name, except that it:
1. Cuts off the ending (blabla_blabla_blabla_blabla) part.
2. Changes the 12345 before blabla to 5 random characters (they can be numbers too).
3. Changes the HHMMSS timestamp to the current hours, minutes, and seconds.
Regarding point 3, I think I can insert an arbitrary PowerShell expression into any string in " " quotes. So when renaming the files, I was thinking I could just add
Rename-Item -NewName {... + $(get-date -f hhmmss) + ...}
However, I am lost as to how to write a renaming script that replaces the part between the 4th and 5th _ symbols and removes everything after the 7th _ symbol.
Can somebody help me with the script, or show me how to target the string between the Nth symbols in PowerShell?
Kind Regards,
Kamil.
Split the string on _:
$string = 'TRD_123456789_ABC123456789_YYMMDD_HHMMSS_12345678_12345_blabla_blabla_blabla_blabla'
$parts = $string -split '_'
Then discard all but the first 6 substrings (i.e. drop the 12345 part and anything thereafter):
$parts = $parts[0..5]
Now add your random 5-digit number:
$parts = @($parts; '{0:D5}' -f $(Get-Random -Maximum 100000))
Update the string at index 4 (the HHMMSS string):
$parts[4] = Get-Date -Format 'HHmmss'
And finally join all the substrings together with _ again:
$newString = $parts -join '_'
Putting it all together, you could write a nice little helper function:
function Get-NewName {
param(
[string]$Name
)
# split and discard
$parts = $Name -split '_' |Select -First 6
# add random number
$parts = @($parts; '{0:D5}' -f $(Get-Random -Maximum 100000))
# update timestamp
$parts[4] = Get-Date -Format 'HHmmss'
# return new string
return $parts -join '_'
}
And then do:
Get-ChildItem -File -Filter TRD_* |Rename-Item -NewName { Get-NewName $_.Name }
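For the more general part of the question (targeting the string between the Nth delimiters), the split/assign/join steps can be wrapped in a small reusable function. This is only a sketch; Set-DelimitedField and its parameter names are hypothetical, and the index is zero-based, as in the answer above.

```powershell
# Hypothetical helper: replace the zero-based field $Index in a delimited string
function Set-DelimitedField {
    param(
        [string]$InputString,
        [int]$Index,
        [string]$Value,
        [string]$Delimiter = '_'
    )
    # -split takes a regex, so escape the delimiter to treat it literally
    $fields = @($InputString -split [regex]::Escape($Delimiter))
    if ($Index -lt 0 -or $Index -ge $fields.Count) {
        throw "Index $Index is out of range (0..$($fields.Count - 1))"
    }
    $fields[$Index] = $Value
    $fields -join $Delimiter
}

Set-DelimitedField -InputString 'TRD_123456789_ABC123456789_YYMMDD_HHMMSS_12345678' -Index 4 -Value '101500'
# TRD_123456789_ABC123456789_YYMMDD_101500_12345678
```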

How to reformat the date on files in bulk using powershell

I need to format the file name from ...
2639423_3_30_56 PM_9_4_2020.txt
... to ...
2639423-15-30-56-09-04-2020.txt
i.e. I need to convert the time to 24-hour (military) format, replace '_' with '-', and zero-pad single-digit months and days.
Please advise. I need to do this in PowerShell, and in bulk.
Start by splitting the file name into two parts - the prefix, which remains the same, and the timestamp, which you want to re-format:
$basename = '2639423_3_30_56 PM_9_4_2020'
$prefix,$timestamp = $basename -split '_',2
Next, parse the timestamp according to its specific format:
$inputFormat = 'h_mm_ss tt_d_M_yyyy'
$parsedDateTime = [datetime]::ParseExact($timestamp,$inputFormat,$null)
Finally convert the parsed [datetime] object back to a string with the desired output format, and then join the prefix and (updated) timestamp together again:
$outputFormat = 'HH-mm-ss-dd-MM-yyyy'
$timestamp = $parsedDateTime.ToString($outputFormat)
# or
$timestamp = Get-Date $parsedDateTime -Format $outputFormat
$newFileName = $prefix,$timestamp -join '-'
# 2639423-15-30-56-09-04-2020
To rename the files in bulk, pipe the files to Rename-Item and use the parameter binder to generate the new name of each file based on the existing name:
Get-ChildItem -Path .\folder\with\files -Filter *.txt |Rename-Item -NewName {
$prefix,$timestamp = $_.BaseName -split '_',2
$parsedDateTime = [datetime]::ParseExact($timestamp, 'h_mm_ss tt_d_M_yyyy', $null)
$timestamp = $parsedDateTime.ToString('HH-mm-ss-dd-MM-yyyy')
$newBaseName = $prefix,$timestamp -join '-'
$newBaseName + $_.Extension
}
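ParseExact throws when a name does not match the expected format, so one stray file in the folder produces an error. A more defensive conversion could return nothing for such names so that they can be skipped. This is a sketch only: Convert-TimestampName is a hypothetical helper name, and using the invariant culture for the AM/PM designator is an assumption about how the names were produced.

```powershell
# Hypothetical helper: returns the reformatted base name, or $null if the
# name does not follow the '<prefix>_h_mm_ss tt_d_M_yyyy' pattern
function Convert-TimestampName {
    param([string]$BaseName)

    $prefix, $timestamp = $BaseName -split '_', 2
    $parsed = [datetime]::MinValue
    $ok = [datetime]::TryParseExact($timestamp, 'h_mm_ss tt_d_M_yyyy',
            [cultureinfo]::InvariantCulture,
            [System.Globalization.DateTimeStyles]::None,
            [ref]$parsed)
    if (-not $ok) { return $null }   # name does not match the expected pattern

    $newStamp = $parsed.ToString('HH-mm-ss-dd-MM-yyyy')
    $prefix, $newStamp -join '-'
}

Convert-TimestampName '2639423_3_30_56 PM_9_4_2020'   # 2639423-15-30-56-09-04-2020
Convert-TimestampName 'notes'                          # no match, so $null
```

Inside the Rename-Item script block, a $null result could be replaced with the original name so that non-matching files are left untouched.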

PowerShell script efficiency advice

I have a telephony .csv with compiled data from January 2020 and some days of February. Each row has the date and the time spent in each status; since someone uses different statuses over the day, the file has one row per status. My script is supposed to go through the file, find the minimum date, and then save all the data for each day into a new file, so I end up with one file for 01-01-2020, one for 02-01-2020, and so on. However, it has been running for 15 hours and is still at 1/22.
The column I'm using for the dates is called "DateFull", and this is the script:
write-host "opening file"
$AT= import-csv "C:\Users\xxxxxx\Desktop\SignOnOff_20200101_20200204.csv"
write-host "parsing and sorting file"
$go= $AT| ForEach-Object {
$_.DateFull= (Get-Date $_.DateFull).ToString("M/d/yyyy")
$_
}
Write-Host "prep day"
$min = $AT | Measure-Object -Property Datefull -Minimum
Write-Host $min
$dateString = [datetime] $min.Minimum
Write-host $datestring
write-host "Setup dates"
$start = $DateString - $today
$start = $start.Days
For ($i=$start; $i -lt 0; $i++) {
$date = get-date
$loaddate = $date.AddDays($i)
$DateStr = $loadDate.ToString("M/d/yyyy")
$now = Get-Date -Format HH:mm:ss
write-host $datestr " " $now
#Install-Module ImportExcel #optional import if you dont have the module already
$Check = $at | where {$_.'DateFull' -eq $datestr}
write-host $check.count
if ($check.count -eq 0 ){}
else {$AT | where {$_.'DateFull' -eq $datestr} | Export-Csv "C:\Users\xxxxx\Desktop\signonoff\SignOnOff_$(get-date (get-date).addDays($i) -f yyyyMMdd).csv" -NoTypeInformation}
}
$at = ''
The first loop doesn't make much sense. It loops through the CSV contents and converts each row's date into a different format. Afterwards, $go is never used.
$go= $AT| ForEach-Object {
$_.DateFull= (Get-Date $_.DateFull).ToString("M/d/yyyy")
$_
}
Later, there is an attempt to calculate a value from an uninitialized variable. $today is never defined.
$start = $DateString - $today
It looks, however, like you'd like to calculate, in days, how old the oldest record is.
Then there's a loop that counts from negative days to zero. During each iteration, the whole CSV is searched:
$Check = $at | where {$_.'DateFull' -eq $datestr}
If there are 30 days and 15 000 rows, that is 30 * 15 000 = 450 000 comparisons. The work is proportional to days times rows, which behaves like O(n^2) when the number of days grows along with the data, so runtime will go sky high even for a relatively small number of days and rows.
The next part is that the same array is processed again:
else {$AT | where {$_.'DateFull' -eq $datestr
The search condition is exactly the same, but now the results are sent to a file. This has the side effect of doubling your work. Doubling is only a constant factor, though (O(2n^2) is still O(n^2)), so at least the runtime isn't growing cubically or worse.
As for how to fix this, there are a few things. If you sort the CSV based on date, it can be processed afterwards in just a single run.
$at = $at | sort -Property { [datetime]$_.DateFull } # sort as dates, so that 1/2/2020 comes before 1/10/2020
Then, iterate over the rows. Since the rows are in ascending date order, the first is the oldest. For each row, check whether the date has changed. If not, add the row to the buffer. If it has, save the old buffer and create a new one.
The sample doesn't convert file names to yyyyMMdd format, and it assumes there are only two columns, foo and datefull:
$sb = new-object text.stringbuilder
# What's the first date?
$current = $at[0]
# Loop through sorted data
for($i = 0; $i -lt $at.Count; ++$i) {
# Are we on next date? (the strings differ as soon as the day changes)
if ($at[$i].DateFull -ne $current.datefull) {
# Save the buffer
$file = $("c:\temp\OnOff_{0}.csv" -f ($current.datefull -replace '/', '.') )
set-content $file $sb.tostring()
# Pick the current date
$current = $at[$i]
# Create new buffer and save data there
$sb = new-object text.stringbuilder
[void]$sb.AppendLine(("{0},{1}" -f $at[$i].foo, $at[$i].datefull))
} else {
[void]$sb.AppendLine(("{0},{1}" -f $at[$i].foo, $at[$i].datefull))
}
}
# Save the final buffer
$file = $("c:\temp\OnOff_{0}.csv" -f ($current.datefull -replace '/', '.') )
set-content $file $sb.tostring()
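Another option in the same single-pass spirit is to let Group-Object bucket the rows by day and export each bucket, which keeps all columns and the CSV header. This is a minimal sketch under stated assumptions: the rows below are hypothetical sample data, DateFull is assumed to be parseable by a [datetime] cast, and the output folder is a temporary directory chosen purely for illustration.

```powershell
# Hypothetical sample rows standing in for Import-Csv output
$rows = @(
    [pscustomobject]@{ Agent = 'a'; Status = 'Ready'; DateFull = '1/2/2020 08:00' }
    [pscustomobject]@{ Agent = 'b'; Status = 'Break'; DateFull = '1/10/2020 09:30' }
    [pscustomobject]@{ Agent = 'a'; Status = 'Lunch'; DateFull = '1/2/2020 12:00' }
)

# Assumed output location for this sketch
$outDir = Join-Path ([System.IO.Path]::GetTempPath()) 'signonoff_sketch'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

# Group once by calendar day (yyyyMMdd key), then write one file per group
$groups = $rows | Group-Object { ([datetime]$_.DateFull).ToString('yyyyMMdd') }
foreach ($g in $groups) {
    $file = Join-Path $outDir ("SignOnOff_{0}.csv" -f $g.Name)
    $g.Group | Export-Csv -Path $file -NoTypeInformation
}
```

Each row is touched once for grouping and once for export, so the cost grows roughly linearly with the number of rows rather than with days times rows.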

Compare current date to date string in a file using powershell

I am writing some PS scripts to log times into a text file, login.txt, using the following code:
$logdir = "C:\FOLDER"
$logfile = "$logdir\LastLogin.txt"
$user = $env:USERNAME
$date = Get-Date -Format "dd-MM-yyyy"
if (!(Test-Path $logdir)){New-Item -ItemType Directory $logdir}else{}
if (!(Test-Path $logfile)){New-Item $logfile}else{}
if (Get-Content $logfile | Select-String $user -Quiet){write-host "exists"}else{"$user - $date" | Add-Content -path $logfile}
(Get-Content $logfile) | Foreach-Object {$_ -replace "$user.+$", "$user - $date"; } | Set-Content $logfile
This creates an entry in the text file like:
UserName - 01-01-1999
Using Powershell, I want to read the text file, compare the date, 01-01-1999, in the text file to the current date and if more than 30 days difference, extract the UserName to a variable to be used later in the script.
I would really appreciate any hints as to how I could do the following:
Compare the date in the text file to the current date.
If difference is more than 30 days, pick up UserName as a variable.
I would really appreciate any advice.
Checking all dates in the file with the help of a RegEx with named capture groups.
$logdir = "C:\FOLDER"
$logfile = Join-Path $logdir "LastLogin.txt"
$Days = -30
$Expires = (Get-Date).AddDays($Days)
Get-Content $logfile | ForEach-Object {
if ($_ -match "(?<User>[^ ]+) - (?<LastLogin>[0-9\-]+)") {
$LastLogin = [datetime]::ParseExact($Matches.LastLogin,"dd-MM-yyyy",$Null)
if ( $Expires -gt $LastLogin ) {
"{0} last login {1} is {2:0} days ago" -F $Matches.User, $Matches.LastLogin,
(New-TimeSpan -Start $LastLogin -End (Get-Date) ).TotalDays
}
}
}
Sample output
username last login 31-12-1999 is 6690 days ago
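To keep the matching user names for later use, as the question asks, the same loop can collect them into a variable instead of printing a message. The following is a sketch with hypothetical in-memory log lines standing in for Get-Content $logfile; $staleUsers is an assumed variable name.

```powershell
# Hypothetical log lines; the second one is built to be only 5 days old
$recent = (Get-Date).AddDays(-5).ToString('dd-MM-yyyy', [cultureinfo]::InvariantCulture)
$lines = @(
    'alice - 01-01-1999'
    "bob - $recent"
)

$cutoff = (Get-Date).AddDays(-30)
$staleUsers = @(
    foreach ($line in $lines) {
        if ($line -match '^(?<User>\S+) - (?<LastLogin>\d{2}-\d{2}-\d{4})$') {
            $lastLogin = [datetime]::ParseExact($Matches.LastLogin, 'dd-MM-yyyy',
                            [cultureinfo]::InvariantCulture)
            # emit the user name only when the last login is older than the cutoff
            if ($lastLogin -lt $cutoff) { $Matches.User }
        }
    }
)

$staleUsers   # alice
```

Wrapping the loop in @( ) means $staleUsers is always an array, even when zero or one user matches.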
There is a way of doing that using regex (Regular Expressions). I will assume that the username which you get in your text file is .(dot) separated. For example, username looks like john.doe or jason.smith etc. And the entry in your text file looks like john.doe - 01-01-1999 or jason.smith - 02-02-1999. Keeping these things in mind our approach would be -
Using a regex we would get the username and date entry into a single variable.
Next up, we will split the pattern we have got in step 1 into two parts i.e. the username part and the date part.
Next we take the date part and if the difference is more than 30 days, we would take the other part (username) and store it in a variable.
So the code would look something like this -
$arr = @() #defining an array to store the username with date
$pattern = "[a-z]*[.][a-z]*\s[-]\s[\d]{2}[-][\d]{2}[-][\d]{4}" #Regex pattern to match entries like "john.doe - 01-01-1999"
Get-Content $logfile | Foreach {if ([Regex]::IsMatch($_, $pattern)) {
$arr += [Regex]::Match($_, $pattern)
}
}
$arr | Foreach {$_.Value} #Storing the matched pattern in $arr
$UserNamewithDate = $arr.value -split ('\s[-]\s') #step 2 - Storing the username and date into a variable.
$array = @() #Defining the array that would store the final usernames based on the time difference.
for($i = 1; $i -lt $UserNamewithDate.Length;)
{
$datepart = [datetime]::ParseExact($UserNamewithDate[$i], 'dd-MM-yyyy', $null) #Parsing the date part as dd-MM-yyyy; a plain [datetime] cast would read it as MM-dd-yyyy
$CurrentDate = Get-Date
$diff = $CurrentDate - $datepart
if ($diff.Days -gt 30)
{
$array += $UserNamewithDate[$i -1] #If the difference between current date and the date received from the log is greater than 30 days, then store the corresponding username in $array
}
$i = $i + 2
}
Now you can access the usernames like $array[0], $array[1] and so on. Hope that helps!
NOTE - The regex pattern will change as per the format your usernames are defined. Here is a regex library which might turn out to be helpful.