Check file names and delete files with lower number in suffix - powershell

I have a following task on PowerShell:
I need to check files on remote machines:
For instance:
Get-ChildItem \\ServerName\data\
In this folder I have following files:
standard_file.0.tst
standard_file.1.tst
standard_file.2.tst
standard_file.3.tst
So, I need to delete the files with the lower number suffix (based on the file name).
In the end, the folder should contain only one file, the one with the highest number.
For instance:
standard_file.3.tst
I've racked my brain and have no idea how to do this.
Could you please push me to the right direction?
Thanks in advance.

You could use a regex to extract the number and cast it to an int. Then sort the files by that number using the Sort-Object cmdlet, so the file with the highest number comes last. Finally, select all but the last one using Select-Object -SkipLast 1 (available in PowerShell 5.0 and later) and remove them using Remove-Item:
Get-ChildItem '\\ServerName\data\' |
Sort-Object { [int][regex]::Match($_.Name, '.*?(\d+)\.[^.]+$').Groups[1].Value } |
Select-Object -SkipLast 1 |
Remove-Item
Regex:
.*?(\d+)\.[^.]+$
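To see why the cast to [int] matters, here is a small sketch that runs the same pattern against plain strings instead of real files. The sample names (including the .10. one) are made up for illustration and are not from the question.

```powershell
# Hypothetical file names; no files are touched.
$names = 'standard_file.2.tst', 'standard_file.10.tst', 'standard_file.9.tst'

# Same pattern as above: group 1 captures the digits before the extension.
$pattern = '.*?(\d+)\.[^.]+$'

# Sorting the captured text as a string puts '10' before '2' and '9'.
$asText = $names | Sort-Object { [regex]::Match($_, $pattern).Groups[1].Value }

# Casting to [int] sorts numerically, so the highest number ends up last.
$asNumber = $names | Sort-Object { [int][regex]::Match($_, $pattern).Groups[1].Value }

$asText[-1]    # standard_file.9.tst
$asNumber[-1]  # standard_file.10.tst
```

Without the cast, a folder with ten or more versions would keep the wrong file.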

This gathers all the files in the path that have a numerical suffix in their base name, using a regex that matches the digits at the end of the BaseName. Sorting on that number in descending order puts the wanted file at the top of the list. We then remove the remaining files by skipping that first result.
$path = "c:\temp"
Get-ChildItem $path | Where-Object{$_.BaseName -match "\.\d+$"} |
Sort-Object -Property {$null = $_.BaseName -match "\.(\d+)$"; [int]$Matches[1]} -Descending |
Select-Object -Skip 1 |
Remove-Item -Confirm:$false -WhatIf
Remove -WhatIf when you are sure that it will remove the files you want.

Related

Foreach/copy-item based on name contains

I'm trying to create a list of file name criteria (MS Hotfixes) then find each file name containing that criteria in a directory and copy it to another directory. I think I'm close here but missing something simple.
Here is my current attempt:
#Create a list of the current Hotfixes.
Get-HotFix | Select-Object HotFixID | Out-File "C:\Scripts\CurrentHotfixList.txt"
#
#Read the list into an Array (dropping the first 3 lines).
$HotfixList = Get-Content "C:\Scripts\CurrentHotfixList.txt" | Select-Object -Skip 3
#
#Use the Hotfix names and copy the individual hotfixes to a folder
ForEach ($Hotfix in $HotfixList) {
Copy-Item -Path "C:\KBtest\*" -Include *$hotfix* -Destination "C:\KBtarget"
}
If I do a Write-Host $Hotfix and comment out my Copy-Item line I get the list of hotfixes as expected.
If I run just the copy command and input the file name I am looking for - it works.
Copy-Item -Path "C:\KBtest\*" -Include *kb5016693* -Destination "C:\KBtarget"
But when I run my script it copies all the files in the folder and not just the one file I am looking for. I have several hotfixes in that KBtest folder but only one that is correct for testing.
What am I messing up here?
The simplest solution to your problem, taking advantage of the fact that -Include can accept an array of patterns:
# Construct an array of include patterns by enclosing each hotfix ID
# in *...*
$includePatterns = (Get-HotFix).HotfixID.ForEach({ "*$_*" })
# Pass all patterns to a single Copy-Item call.
Copy-Item -Path C:\KBtest\* -Include $includePatterns -Destination C:\KBtarget
As for what you tried:
To save just the hotfix IDs to a plain-text file, each on its own line, don't use Select-Object -Property HotfixId (-Property is implied if you omit it); use Select-Object -ExpandProperty HotfixId instead:
Get-HotFix | Select-Object -ExpandProperty HotFixID | Out-File "C:\Scripts\CurrentHotfixList.txt"
Or, more simply, using member-access enumeration:
(Get-HotFix).HotFixID > C:\Scripts\CurrentHotfixList.txt
Using Select-Object -ExpandProperty HotfixID or (...).HotfixID returns only the values of the .HotfixID properties, whereas Select-Object -Property HotfixId - despite only asking for one property - returns an object that has a .HotfixID property - this is a common pitfall; see this answer for more information.
Then you can use a Get-Content call alone to read the IDs (as strings) back into an array (no need for Select-Object -Skip 3):
$HotfixList = Get-Content "C:\Scripts\CurrentHotfixList.txt"
(Note that, as the solution at the top demonstrates, for use in the same script you don't need to save the IDs to a file in order to capture them.)
This will likely fix your primary problem, which stems from how Out-File creates for-display string representations of the objects ([pscustomobject] instances) that Select-Object -Property HotfixID created:
Not only is there an empty line followed by a table header at the start of the output (which your Select-Object -Skip 3 call skips), there are also two empty lines at the end.
When these empty lines were read into $hotfix in your foreach loop, -Include *$hotfix* effectively became -Include **, which therefore copied all files.
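Both pitfalls can be reproduced without touching any files. The following is only a sketch: the two objects stand in for Get-HotFix output, and the .msu file name is a made-up example.

```powershell
# Hypothetical stand-ins for Get-HotFix output.
$sample = [pscustomobject]@{ HotFixID = 'KB5016693' },
          [pscustomobject]@{ HotFixID = 'KB5012170' }

# -Property keeps wrapper objects; -ExpandProperty returns the bare strings.
$wrapped = @($sample | Select-Object -Property HotFixID)
$plain   = @($sample | Select-Object -ExpandProperty HotFixID)

$wrapped[0] -is [string]   # False
$plain[0] -is [string]     # True

# An empty line read from the file turns "*$hotfix*" into "**",
# which matches every file name.
$hotfix = ''
$matchesEverything = 'windows10.0-kb0000000-x64.msu' -like "*$hotfix*"
$matchesEverything         # True
```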
First, you do not need to create and import those text files:
get-hotfix | ?{$_.hotfixid -notmatch 'KB5016594|KB5004567|KB5012170'} | %{
copy-item -path "C:\kbtest\$($_.HotfixId).exe" -Destination "C:\kbTarget"
}
The Where-Object filter (?) excludes the hotfixes you do not want. If you do not need it, remove:
?{$_.hotfixid -notmatch 'KB5016594|KB5004567|KB5012170'}
I assume that those files are exe files in my example.

Powershell - Selective moving files into subfolder (keeping the newest of each FIRST13 letter grouping)

extreme powershell newbie here so please be gentle...
I have a filing system where files in folders are generated semi-automatically, with multiple versions being kept as redundancy (we really do revert regularly).
Files within the folder are named with the first 13 characters as the identifier, with various dates or initials afterwards.
12345-A-01-01_XYZ_20191026.pdf
i.e. the file is 12345-A-01-01 and everything past the first 13 characters is "unpredictable"
FILE000000001xxxxxxx.pdf
FILE000000001yyyy.pdf
FILE000000001zzzzzz.pdf
FILE000000002xxxx.pdf
FILE000000002yyy.pdf
FILE000000002zz.pdf
FILE000000003xx.pdf
FILE000000003yyy.pdf
FILE000000003zzzz.pdf
I'm trying to write a script that can determine the newest version (by date modified file property) of each file "group"
i.e. the newest FILE000000001*.pdf etc
and slide all the others into the .\Superseded subfolder
All I've managed to get so far is a "list" sorting to show the newest at the top of "each" group... now I need to know how to keep that file, and move the others... any direction or help would be great thanks :)....
$_SourcePath = "C:\testfiles"
$_DestinationPath = "C:\testfiles\Superseded"
Get-ChildItem $_SourcePath |
Where-Object {-not $_.PSIsContainer} |
Group-Object { $_.Basename.Substring(0,12) } |
foreach {
$_.Group |
sort LastWriteTime -Descending
} | Move-Item -Destination $_DestinationPath
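I think you are pretty close. Since you sort in descending order, you just need to skip the first (newest) file of each group. Also note that Substring(0,12) takes only 12 characters, while your identifier is 13 characters long, so use Substring(0,13):
$SourcePath = "C:\testfiles"
$DestinationPath = "C:\testfiles\Superseded"
Get-ChildItem $SourcePath -File |
Group-Object { $_.BaseName.Substring(0,13) } |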
ForEach-Object {
$_.Group |
Sort-Object LastWriteTime -Descending |
Select-Object -skip 1 |
Move-Item -Destination $DestinationPath -WhatIf
# Note: Above, the move has to be in each iteration of the loop
# so we skip the first (newest) of each file.
}
You don't need Where-Object {-not $_.PSIsContainer}; use the -File parameter instead.
Also, I wouldn't name your variables $_***. They are bound to get confused with $_, the pipeline variable.
I added -WhatIf to the move command so you can test without causing any damage.
I didn't test it, but it looks about right.
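The group-then-skip logic can be checked in memory before pointing it at real files. This is a minimal sketch: the objects below are hypothetical stand-ins for FileInfo objects, with invented dates, grouped on the first 13 characters as the question describes.

```powershell
# Hypothetical stand-ins for FileInfo objects; no files are touched.
$files = @(
    [pscustomobject]@{ BaseName = 'FILE000000001xxxxxxx'; LastWriteTime = [datetime]'2019-10-01' }
    [pscustomobject]@{ BaseName = 'FILE000000001yyyy';    LastWriteTime = [datetime]'2019-10-26' }
    [pscustomobject]@{ BaseName = 'FILE000000002xxxx';    LastWriteTime = [datetime]'2019-09-15' }
    [pscustomobject]@{ BaseName = 'FILE000000002yyy';     LastWriteTime = [datetime]'2019-09-01' }
)

# Group on the first 13 characters, then drop the newest item of each group.
$superseded = $files |
    Group-Object { $_.BaseName.Substring(0, 13) } |
    ForEach-Object {
        $_.Group |
            Sort-Object LastWriteTime -Descending |
            Select-Object -Skip 1
    }

$superseded.BaseName   # FILE000000001xxxxxxx, FILE000000002yyy
```

Whatever ends up in $superseded is what would be moved; the newest file of each group stays behind.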

Powershell Delete all files apart from the latest file per day

I have a folder that contains a lot of files, multiple files per day.
I would like to script something that deletes all but the latest file per day.
I have seen a lot of scripts that delete files over X days old but this is slightly different and having written no powershell before yesterday (I'm exclusively tsql), I'm not really sure how to go about it.
I'm not asking anyone to write the code for me, but a description of the methods for achieving this would be good, and I can go off and research how to put it into practice.
All files are in a single directory, no subfolders. There are files I don't want to delete; the files I want to delete have file names in the format constant_filename_prefix_YYYYMMDDHHMMSS.zip
Is PowerShell the right tool? Should I instead be looking at Python (which I also don't know)? PowerShell is more convenient since the other code we have is written in PS.
PowerShell has easy to use cmdlets for this kind of thing.
The question to me is whether you want to use the dates in the file names or the actual LastWriteTime dates of the files themselves (as shown in File Explorer).
Below are two ways of handling this. I've put in a lot of code comments to help you get the picture.
If you want to remove the files based on their actual last write times:
$sourceFolder = 'D:\test' # put the path to the folder where your files are here
$filePrefix = 'constant_filename_prefix'
Get-ChildItem -Path $sourceFolder -Filter "$filePrefix*.zip" -File | # get files that start with the prefix and have the extension '.zip'
Where-Object { $_.BaseName -match '_\d{14}$' } | # that end with an underscore followed by 14 digits
Sort-Object -Property LastWriteTime -Descending | # sort on the LastWriteTime property
Select-Object -Skip 1 | # select them all except for the first (most recent) one
Remove-Item -Force -WhatIf # delete these files
OR
If you want to remove the files based on the dates in the file names:
Because the date formats you used are sortable, you can safely sort on the last 14 digits of the file BaseName:
$sourceFolder = 'D:\test'
$filePrefix = 'constant_filename_prefix'
Get-ChildItem -Path $sourceFolder -Filter "$filePrefix*.zip" -File | # get files that start with the prefix and have the extension '.zip'
Where-Object { $_.BaseName -match '_\d{14}$' } | # that end with an underscore followed by 14 digits
Sort-Object -Property @{Expression = {$_.BaseName.Substring($_.BaseName.Length - 14)}} -Descending | # sort on the last 14 digits descending
Select-Object -Skip 1 | # select them all except for the first (most recent) one
Remove-Item -Force -WhatIf # delete these files
In both alternatives there is a -WhatIf switch at the end of the Remove-Item call. This is for testing the code: no files will actually be deleted. Instead, the console shows what would happen.
Once you are satisfied with this output, you can remove or comment out the -WhatIf switch to have the code delete the files.
Update
As I now understand, there are multiple files for several days in that folder and you want to keep the newest file for each day, deleting the others.
In that case, we have to create 'day' groups of the files and, within every group, sort by date and delete the older files.
This is where the Group-Object comes in.
Method 1) using the LastWriteTime property of the files
$sourceFolder = 'D:\test' # put the path to the folder where your files are here
$filePrefix = 'constant_filename_prefix'
Get-ChildItem -Path $sourceFolder -Filter "$filePrefix*.zip" -File | # get files that start with the prefix and have the extension '.zip'
Where-Object { $_.BaseName -match '_\d{14}$' } | # that end with an underscore followed by 14 digits
Group-Object -Property @{Expression = { $_.LastWriteTime.Date }} | # create groups based on the date part without time part
ForEach-Object {
$_.Group |
Sort-Object -Property LastWriteTime -Descending | # sort on the LastWriteTime property
Select-Object -Skip 1 | # select them all except for the first (most recent) one
Remove-Item -Force -WhatIf # delete these files
}
Method 2) using the date taken from the file names:
$sourceFolder = 'D:\test' # put the path to the folder where your files are here
$filePrefix = 'constant_filename_prefix'
Get-ChildItem -Path $sourceFolder -Filter "$filePrefix*.zip" -File | # get files that start with the prefix and have the extension '.zip'
Where-Object { $_.BaseName -match '_\d{14}$' } | # that end with an underscore followed by 14 digits
Group-Object -Property @{Expression = { ($_.BaseName -split '_')[-1].Substring(0,8)}} | # create groups based on the date part without time part
ForEach-Object {
$_.Group |
Sort-Object -Property @{Expression = {$_.BaseName.Substring($_.BaseName.Length - 14)}} -Descending | # sort on the last 14 digits descending
Select-Object -Skip 1 | # select them all except for the first (most recent) one
Remove-Item -Force -WhatIf # delete these files
}
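The per-day grouping on the file name can be tried on plain strings first. This is a sketch under the assumption that the timestamp really is the last 14 characters of the base name; the three names are invented, and parsing them with [datetime]::ParseExact is my own variation rather than part of the answer above.

```powershell
# Hypothetical base names in the format prefix_yyyyMMddHHmmss.
$names = @(
    'constant_filename_prefix_20200101080000'
    'constant_filename_prefix_20200101173000'
    'constant_filename_prefix_20200102090000'
)

$toDelete = $names |
    ForEach-Object {
        # The timestamp is the last 14 characters of the base name.
        $stamp = $_.Substring($_.Length - 14)
        [pscustomobject]@{
            Name = $_
            Time = [datetime]::ParseExact($stamp, 'yyyyMMddHHmmss', [cultureinfo]::InvariantCulture)
        }
    } |
    Group-Object { $_.Time.Date } |
    ForEach-Object { $_.Group | Sort-Object Time -Descending | Select-Object -Skip 1 }

$toDelete.Name   # constant_filename_prefix_20200101080000
```

Parsing into a real date also rejects names whose digits are not a valid timestamp, which a plain string sort would silently accept.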

PS Script to print directory names if file type in it

I am looking for a PS script that checks for a certain file type (.err) in a folder's sub-folders (depth 1) and, if it finds at least one file of that type, prints only the sub-folder's name, without file path or file name, e.g.:
[root folder]
[subfolder1]-has .err in it
[subfolder2]-doesn't have .err in it
[subfolder3]-doesn't have .err in it
[subfolder4]-has .err in it
[subfolder5]-has .err in it
Output:
[subfolder1]
[subfolder4]
[subfolder5]
I'm not good at PowerShell, so I only found how to list the names of subfolders that have .err files in them, but each name is printed as many times as the subfolder has such files.
(Get-ChildItem -Path C:\root -Depth 1 -recurse -filter *.err).DirectoryName | echo
Okay, after direction from @mklement0, my suggestion would be:
(Get-ChildItem ('C:\root' + '\*\*') -Filter "*.err").Directory.Name | select -Unique
If I understand the question properly, this should do what you want:
Get-ChildItem -Path 'C:\Root' -Depth 1 -Recurse -Filter *.err -File |
Group-Object -Property DirectoryName |
ForEach-Object { ($_.Name -split '\\')[-1] }
It searches 1 level deep through the subfolders and if it finds files with extension .err (no matter how many files are in that folder), it outputs the subfolder name only once.
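If you are on a PowerShell version below 3.0, which lacks the -File switch, change the top line into the following (note that -Depth itself requires PowerShell 5.0 or later):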
Get-ChildItem -Path 'C:\Root' -Depth 1 -Recurse -Filter *.err |
Where-Object { !$_.PSIsContainer } |
Update: Karthick Ganesan's answer is the simplest approach.
Try the following:
(Get-ChildItem -Depth 1 -Filter *.err).Directory.Name | Get-Unique | Select -Skip 1
For brevity, I've omitted the -Path argument and also the -File switch that limits matching to files, as it's fair to assume that you won't have any directories named *.err.
The use of -Depth implies the use of -Recurse, so the latter needn't be specified.
.Directory.Name outputs the matching files' directory names as an array (via member-access enumeration, PSv3+).
Get-Unique weeds out duplicates, which is necessary, because a given directory will be output multiple times if it contains multiple *.err files.
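Select -Skip 1 (Select is a built-in alias for Select-Object) skips the first output object, because it represents the input directory itself (depth 0). Note that this holds only if the input directory itself contains at least one *.err file; otherwise the first matching subfolder is skipped instead.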
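A variation that does not depend on skipping the first result is to enumerate the immediate subfolders and test each one for a matching file. This is a different technique from the answers above, shown only as a sketch; the temp folder and all file names are hypothetical.

```powershell
# Build a throwaway folder tree under the temp directory (hypothetical names).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("errdemo_" + [guid]::NewGuid().ToString('N'))
foreach ($sub in 'subfolder1', 'subfolder2', 'subfolder3') {
    $null = New-Item -ItemType Directory -Path (Join-Path $root $sub) -Force
}
$null = New-Item -ItemType File -Path (Join-Path $root 'subfolder1/a.err')
$null = New-Item -ItemType File -Path (Join-Path $root 'subfolder1/b.err')
$null = New-Item -ItemType File -Path (Join-Path $root 'subfolder2/notes.txt')
$null = New-Item -ItemType File -Path (Join-Path $root 'subfolder3/c.err')

# Keep each immediate subfolder that contains at least one .err file.
$withErr = Get-ChildItem -Path $root -Directory |
    Where-Object { Get-ChildItem -Path $_.FullName -Filter *.err -File | Select-Object -First 1 } |
    Select-Object -ExpandProperty Name

$withErr   # subfolder1 and subfolder3, each listed once

Remove-Item -Path $root -Recurse -Force   # clean up the demo tree
```

Because each subfolder is tested once, no de-duplication step is needed, and .err files in the root folder itself do not affect the result.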

Delete duplicate files with Powershell except the file specified

I am using the following code to delete duplicate files in one folder:
ls *.wav -recurse | get-filehash | group -property hash | where { $_.count -gt 1 } | % { $_.group | select -skip 1 } | del
I have two issues. I want to limit this to only one filehash at a time and I need to specify the file name I want to keep.
As an example, I have a folder named Recordings. The first five files listed all have the same filehash but only one has the filename that has already been entered in my database.
[Screenshot: file listing of the Recordings folder]
It would be great if I could use the -Exclude parameter for the del cmdlet but that parameter does not accept pipeline input.
I also considered using the code above and then renaming the remaining file afterward but the code is not limited to one filehash.
It all depends on how you want it to work. For example, if you know the file name you want to keep in advance, you could do it this way:
$fileName = 'file1.txt'
$fileHash = Get-FileHash .\$filename
$duplicates = ls -Recurse | Get-FileHash | Where-Object {$_.Hash -eq $fileHash.Hash -and ($_.Path | Split-Path -Leaf) -ne $fileName }
$duplicates | del
This sequence sets the file name and gets the hash of that file; the main command then finds other files with the same hash that don't have that file name.
Note: Test first to make sure this will do what you expect before you execute the del command.
Update: It appears that Get-FileHash puts some sort of lock on the files being hashed so you cannot immediately pipe to the del (Remove-Item) command. I modified the results to save the array of duplicates to a variable and then pass that to the delete command which works.
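The approach can be rehearsed safely in a throwaway folder. The sketch below assumes made-up file names and contents, restricts the search to *.wav as in the question, and leaves -WhatIf on the delete.

```powershell
# Throwaway demo folder (hypothetical file names and contents).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("dupdemo_" + [guid]::NewGuid().ToString('N'))
$null = New-Item -ItemType Directory -Path $dir
Set-Content -Path (Join-Path $dir 'keep.wav')  -Value 'same bytes'
Set-Content -Path (Join-Path $dir 'copy1.wav') -Value 'same bytes'
Set-Content -Path (Join-Path $dir 'copy2.wav') -Value 'same bytes'
Set-Content -Path (Join-Path $dir 'other.wav') -Value 'different bytes'

$keepName = 'keep.wav'
$keepHash = (Get-FileHash -Path (Join-Path $dir $keepName)).Hash

# Collect the duplicates first, then delete, so hashing has finished.
$duplicates = @(Get-ChildItem -Path $dir -Filter *.wav -File |
    Get-FileHash |
    Where-Object { $_.Hash -eq $keepHash -and (Split-Path $_.Path -Leaf) -ne $keepName })

$duplicates.Path | Remove-Item -WhatIf   # drop -WhatIf to really delete

Remove-Item -Path $dir -Recurse -Force   # clean up the demo folder
```

Only files whose hash equals that of the file to keep are selected, which limits the operation to a single hash at a time.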