Find files with partial name match and remove desired file - powershell

I have a little over 12000 files that I need to sort through.
18-100-00000-LOD-H.pdf
18-100-00000-LOD-H-1C.pdf
21-200-21197-LOD-H.pdf
21-200-21197-LOD-H-1C.pdf
21-200-21198-LOD-H.pdf
21-200-21198-LOD-H-1C.pdf
I need a way to go through all the files and delete the LOD-H version of the files.
EX:
21-200-21198-LOD-H.pdf
21-200-21198-LOD-H-1C.pdf
With the partial match being the 5 digit code I need a script that would delete the LOD-H case of the partial match.
So far this is what I have, but it won't work: Select-String needs a pattern, and since there isn't one fixed pattern but many, I don't know what to supply.
$source = "\\Summerhall\GLUONPREP\Market Centers\~Pen Project\Logos\ALL Office Logos"
$destination = "C:\Users\joshh\Documents\EmptySpace"
$toDelete = "C:\Users\joshh\Documents\toDelete"
$allFiles = @(Get-ChildItem $source -File | Select-Object -ExpandProperty FullName)
foreach($file in $allFiles) {
$content = Get-Content -Path $file
if($content | Select-String -SimpleMatch -Quiet){
$dest = $destination
}
else{
$dest = $toDelete
}
}
Any help would be super appreciated, even links to something similar or even links to documentation so I can start piecing a script of my own would be super helpful.
Thank you!

This should work for what you need:
# Get a list of the files with -1C preceding the extension
$1cFiles = @( ( Get-ChildItem -File "${source}/*-LOD-H-1C.pdf" ).Name )
# Retrieve files that match the same pattern without 1C, and iterate over them
Get-ChildItem -File "${source}/*-LOD-H.pdf" | ForEach-Object {
# Get the name the file would have with the -1C suffix preceding the extension
$useName = $_.Name.Insert($_.Name.LastIndexOf('.pdf'), '-1C')
# If the -1C version of the file exists, remove the current (non-1C) file
if( $1cFiles -contains $useName ) {
Remove-Item -Force -LiteralPath $_.FullName
}
}
Basically, look for the 1C files in $source, then iterate over the non-1C files in $source, removing a non-1C file if adding -1C before its file extension yields the name of an existing 1C file.
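The name check can be tried on plain strings before any file is touched. This is only a minimal sketch: the list is a subset of the sample names from the question, altered so that one file has no -1C counterpart, and `$names`, `$withSuffix` and `$toRemove` are illustrative names rather than part of the answer above.

```powershell
# Sample names (assumption: real input would come from Get-ChildItem).
$names = @(
    '18-100-00000-LOD-H.pdf',
    '18-100-00000-LOD-H-1C.pdf',
    '21-200-21197-LOD-H.pdf',      # no -1C counterpart, so it must be kept
    '21-200-21198-LOD-H-1C.pdf'
)

# Names that already carry the -1C suffix.
$withSuffix = @($names | Where-Object { $_ -like '*-LOD-H-1C.pdf' })

# Plain LOD-H names whose -1C counterpart exists: the deletion candidates.
$toRemove = @(
    $names |
        Where-Object { $_ -like '*-LOD-H.pdf' } |
        Where-Object { $withSuffix -contains ($_ -replace '\.pdf$', '-1C.pdf') }
)

$toRemove
```

On the real folder, running the removal with the -WhatIf switch first gives a preview of what would be deleted.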

Related

Powershell dropping characters while creating folder names

I am having a strange problem in PowerShell (Version 2021.8.0) while creating folders and naming them. I start with a number of individual e-book files in a folder that I set using Set-Location. I use the file name minus the extension to create a new folder with the same name as the e-book file. The code works fine the majority of the time with the various file extensions I have stored in an array at the beginning of the code.
What's happening is that the code creates the proper folder name the majority of the time and moves the source file into the folder after it's created.
The problem is that if the source file name (for files with the extension ".epub") ends in an "e", then the "e" is missing from the end of the created folder name. I thought I also saw it drop "r" and "p", but I have been unable to replicate that recently.
Below is my code. It is set up to run against file extensions for e-books and audiobooks. Please ignore the error messages that are being generated when files of a specific type don't exist in the working folder. I am just using the array for testing and it will be filled automatically later by reading the folder contents.
This Code Creates a Folder for Each File and moves the file into that Folder:
Clear-Host
$SourceFileFolder = 'N:\- Books\- - BMS\- Books Needing Folders'
Set-Location $SourceFileFolder
$MyArray = ( "*.azw3", "*.cbz", "*.doc", "*.docx", "*.djvu", "*.epub", "*.mobi", "*.mp3", "*.pdf", "*.txt" )
Foreach ($FileExtension in $MyArray) {
Get-ChildItem -Include $FileExtension -Name -Recurse | Sort-Object | ForEach-Object { $SourceFileName = $_
$NewDirectoryName = $SourceFileName.TrimEnd($FileExtension)
New-Item -Name $NewDirectoryName -ItemType "directory"
$OriginalFileName = Join-Path -Path $SourceFileFolder -ChildPath $SourceFileName
$DestinationFilename = Join-Path -Path $NewDirectoryName -ChildPath $SourceFileName
$DestinationFilename = Join-Path -Path $SourceFileFolder -ChildPath $DestinationFilename
Move-Item $OriginalFileName -Destination $DestinationFilename
}
}
Thanks for any help you can give. Driving me nuts and I am pretty sure it's something that I am doing wrong, like always.
String.TrimEnd()
Removes all the trailing occurrences of a set of characters specified in an array from the current string.
The TrimEnd method removes every trailing character that appears in the character array you provide. It does not check whether .epub is at the end of the string; it trims any of the supplied characters from the end of the string. In your case, every trailing ., e, p, u and b is removed until a character outside that set is reached, so you end up removing more than you intended.
I'd suggest using EndsWith to match your extensions and taking a substring instead, as below. If you deal only with single extensions (i.e. not .tar.gz or other double extensions), you can also use the .NET [System.IO.Path]::GetFileNameWithoutExtension($MyFileName) method.
$MyFileName = "Teste.epub"
$FileExt = '.epub'
# Wrong approach
$output = $MyFileName.TrimEnd($FileExt)
write-host $output -ForegroundColor Yellow
#Output returns Test
# Proper method
if ($MyFileName.EndsWith($FileExt)) {
$output = $MyFileName.Substring(0,$MyFileName.Length - $FileExt.Length)
Write-Host $output -ForegroundColor Cyan
}
# Returns Teste
#Alternative method. Won't work if you want to trim out double extensions (eg. tar.gz)
if ($MyFileName.EndsWith($FileExt)) {
$Output = [System.IO.Path]::GetFileNameWithoutExtension($MyFileName)
Write-Host $output -ForegroundColor Cyan
}
You're making this too hard on yourself. Use the .BaseName to get the filename without extension.
Your code simplified:
$SourceFileFolder = 'N:\- Books\- - BMS\- Books Needing Folders'
$MyArray = "*.azw3", "*.cbz", "*.doc", "*.docx", "*.djvu", "*.epub", "*.mobi", "*.mp3", "*.pdf", "*.txt"
(Get-ChildItem -Path $SourceFileFolder -Include $MyArray -File -Recurse) | Sort-Object Name | ForEach-Object {
# BaseName is the filename without extension
$NewDirectory = Join-Path -Path $SourceFileFolder -ChildPath $_.BaseName
$null = New-Item -Path $NewDirectory -ItemType Directory -Force
$_ | Move-Item -Destination $NewDirectory
}

Compress File per file, same name

I hope you are all safe in this time of COVID-19.
I'm trying to generate a script that goes to the directory and compresses each file to .zip with the same name as the file, for example:
sample.txt -> sample.zip
sample2.txt -> sample2.zip
but I'm having difficulties. I'm not that used to PowerShell; I'm learning and improving this script as I go. In the end it will be a script that deletes files older than X days, compresses files, and uploads them over FTP. I've already managed the part that deletes files older than X days; now I'm stuck on this one.
Last try at moment.
param
(
#Future accept input
[string] $InputFolder,
[string] $OutputFolder
)
#test folder
$InputFolder= "C:\Temp\teste"
$OutputFolder="C:\Temp\teste"
$Name2 = Get-ChildItem $InputFolder -Filter '*.csv'| select Name
Set-Variable SET_SIZE -option Constant -value 1
$i = 0
$zipSet = 0
Get-ChildItem $InputFolder | ForEach-Object {
$zipSetName = ($Name2[1]) + ".zip "
Compress-Archive -Path $_.FullName -DestinationPath "$OutputFolder\$zipSetName"
$i++;
$Name2++
if ($i -eq $SET_SIZE) {
$i = 0;
$zipSet++;
}
}
You can simplify things a bit. Most of the issues seem to arise because, in your script, $Name2 contains a different set of items than Get-ChildItem $InputFolder returns in the loop (the loop may see objects other than .csv files).
The best way to deal with this is to keep the full file objects in a variable (i.e. you don't need | select Name). So I get all the CSV file objects right away and store them in the variable $CsvFiles.
We can additionally use the automatic variable $_ inside ForEach-Object, which represents the current object. We can also use $_.BaseName to get the name without the extension (assuming that's what you want; otherwise use $_.Name to get a zip named like xyz.csv.zip).
So a simplified version of the code can be:
$InputFolder= "C:\Temp\teste"
$OutputFolder="C:\Temp\teste"
#Get files to process
$CsvFiles = Get-ChildItem $InputFolder -Filter '*.csv'
#loop through all files to zip
$CsvFiles | ForEach-Object {
$zipSetName = $_.BaseName + ".zip"
Compress-Archive -Path $_.FullName -DestinationPath "$OutputFolder\$zipSetName"
}

Match string with specific numbers from array

I want to create a script that searches through a directory for specific ".txt" files with the Get-ChildItem cmdlet and after that it copies the ".txt" to a location I want. The hard part for me is to extract specific .txt files string from the array. So basically I need help matching specific files names in the array. Here is an example of the array I'm getting back with the following cmdlet:
$arrayObject = (Get-ChildItem -recurse | Where-Object { $_.Name -eq "*.txt"}).Name
The arrayobject variable is something like this:
$arrayobject = "test.2.5.0.txt", "test.1.0.0.txt", "test.1.0.1.txt",
"test.0.1.0.txt", "test.0.1.1.txt", "test.txt"
I want to match my array so it returns the following:
test.2.5.0.txt, test.1.0.0.txt, test.1.0.1.txt
Can someone help me with Regex to match the above file names from the $arrayObject?
As you already add the -Recurse parameter to Get-ChildItem, you can also use the -Include parameter like this:
$findThese = "test.2.5.0.txt", "test.1.0.0.txt", "test.1.0.1.txt"
$filesFound = (Get-ChildItem -Path 'YOUR ROOTPATH HERE' -Recurse -File -Include $findThese).Name
P.S. without the -Recurse parameter you need to add \* to the end of the rootfolder path to be able to use -Include
Maybe something like:
$FileList = Get-ChildItem -path C:\TEMP -Include *.txt -Recurse
$TxtFiles = 'test1.txt', 'test3.txt', 'test9.txt'
Foreach ($txt in $TxtFiles) {
if ($FileList.name -contains $txt) {Write-Host File: $Txt is present}
}
A general rule: filter as far left as possible. Fewer objects to process means fewer resources used and faster processing.
Hope it helps!
Please try to clarify what the regex should match.
I created a wildcard pattern (not a true regex) which matches, out of the given file names, only the files you wanted to retrieve:
"*.[1-9].[0-9].[0-9].txt"
You can try out the small check I wrote.
ForEach($file in $arrayobject){
if($file -LIKE "*.[1-9].[0-9].[0-9].txt"){
Write-Host $file
}}
I think the -like operator is the better fit here: it checks a string against a wildcard pattern, whereas -match is the operator for regular expressions.
Let me know if this helps.
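To make the difference between the two operators concrete, here is a small sketch run against the sample array from the question. The regex on the second filter is my own illustration, not something from the answers above; like the wildcard pattern, it assumes each version part is a single digit.

```powershell
$arrayObject = 'test.2.5.0.txt', 'test.1.0.0.txt', 'test.1.0.1.txt',
               'test.0.1.0.txt', 'test.0.1.1.txt', 'test.txt'

# Wildcard match: [1-9] is a character range, * matches any run of characters,
# and the dots are literal.
$byWildcard = @($arrayObject | Where-Object { $_ -like '*.[1-9].[0-9].[0-9].txt' })

# Regex match: dots must be escaped, and $ anchors the pattern to the end of the name.
$byRegex = @($arrayObject | Where-Object { $_ -match '\.[1-9]\.\d\.\d\.txt$' })

$byWildcard
$byRegex
```

Both filters select the same three names here; a version part with two or more digits (e.g. 10) would need a different pattern.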
Sorry for the late reply. Just got back in the office today. My question has been misinterpreted but that's my fault. I wasn't clear what I really want to do.
What I want to do is search through a directory and extract the (major) version from a file name. So in my case the file "test.2.5.0.txt" would be version 2.5.0. After that I get the major version, which is 2. Then in an if statement I check whether it is greater than or equal to 1 and, if so, copy the file to a specific destination. To add some context: they are nupkg files, not txt. But I figured it out. This is the code:
$sourceShare = "\\server1name\Share\txtfilesFolder"
$destinationShare = "\\server2name\Share\txtfilesFolder"
Get-ChildItem -Path $sourceShare `
-Recurse `
-Include "*.txt" `
-Exclude @("*.nuspec", "*.sha512") `
| Foreach-Object {
$fileName = [System.IO.Path]::GetFileName($_)
[Int]$majorVersion = (([regex]::Match($fileName,"(\d+(\.\d+){1,})" )).Value).Split(".")[0]
if ($majorVersion -ge 1)
{
Copy-Item -Path $_.FullName `
-Destination $destinationShare `
-Force
}
}
If you have any more advice, let me know. It would be great to extract the major version without using the .Split method.
Grtz
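On the closing question about avoiding .Split, two possible approaches are sketched below. This is only an illustration using the sample file name from the thread: the regex with a capture group is my own variation, and the cast to [version] assumes the matched text has between two and four numeric parts, which System.Version requires.

```powershell
$fileName = 'test.2.5.0.txt'

# Match a dotted version; group 1 holds only the first number.
$m = [regex]::Match($fileName, '(\d+)(\.\d+)+')

if ($m.Success) {
    # Option 1: read the major version straight from the capture group.
    $majorFromRegex = [int]$m.Groups[1].Value

    # Option 2: let System.Version parse the whole match and expose Major.
    $majorFromVersion = ([version]$m.Value).Major
}

$majorFromRegex
$majorFromVersion
```

The [version] route also gives Minor and Build if those are needed later, while the capture group stays closer to the original regex.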

It sounds complex but it's not: Comparing Files in PowerShell

I have a folder with a huge amount of files in it with lots of different extensions (abc, abc_trg, def, def_trg, ghi, ghi_trg, jkl, mno).
You will see that there are some files that have a matching 'trigger' file, but not all files in this folder need to have a trigger file, it is just the following extensions that must have a trigger file: abc, def, ghi.
filename1.abc
filename1.abc_trg
filename2.def
filename2.def_trg
filename3.abc
filename4.def_trg
filename5.ghi
filename6.jkl
filename7.mno
filename8.ghi
filename8.ghi_trg
filename9.jkl
i.e. The extension types that do have Trigger files (abc, abc_trg, def, def_trg, ghi, ghi_trg) must have a matching filename.
I need a PowerShell script that will analyse and compare the files that are meant to exist with a trigger file type (abc, abc_trg, def, def_trg, ghi, ghi_trg). If a match is found (e.g. filename1, filename2, filename8), or if files have extensions not in this list, e.g. jkl and mno (filename6.jkl, filename7.mno, filename9.jkl), then those files are left untouched.
If there are files that are meant to have a matching extension & trigger file, but do not, i.e. they have become orphaned, then these need to be deleted (e.g. filename3.abc, filename4.def_trg, filename5.ghi)
So the resultant file list should look like this:
filename1.abc
filename1.abc_trg
filename2.def
filename2.def_trg
filename6.jkl
filename7.mno
filename8.ghi
filename8.ghi_trg
filename9.jkl
Here is my code so far:
$strDir = "D:\Temp\FileCompareTest\"
$strFileTypesToIgnore = ".jkl",".mno"
$strExtABC = ".abc"
$strExtABC_Trigger = ".abc_trg"
$strExtDEF = ".def"
$strExtDEF_Trigger = ".def_trg"
$strExtGHI = ".ghi"
$strExtGHI_Trigger = ".ghi_trg"
$arrFiles = Get-ChildItem $strDir -exclude $strFileTypesToIgnore
ForEach ($objFile in $arrFiles) {
$strFilename = $objFile.BaseName
$strExtension = $objFile.Extension
If ($strExtension -eq ".abc") {
$arrFiles2 = Get-ChildItem $strDir -exclude $strFileTypesToIgnore
ForEach ($objFile2 in $arrFiles2) {
$strFilename2 = $objFile2.BaseName
$strExtension2 = $objFile2.Extension
If ($strExtension2 -eq ".abc_trg") {
If (Compare-Object $strFilename $strFilename2) {
Write-Host "match is: $strFilename$strExtension and $strFilename2$strExtension2"
} Else {
Write-Host "Not a match: $strFilename$strExtension and $strFilename2$strExtension2"
}
}
}
}
}
Can you help please?
Regards
Darren
Thanks for including your code. You have the right idea of comparing names and extensions, but let's try a different approach.
We loop over each file and check its extension. Depending on whether the file is a trg file or not, we use a similar means to check for its partner. Since we use a Where-Object clause, the files it outputs are passed down the pipeline to Remove-Item for deletion. Test with the -WhatIf switch to verify it is working.
I tried to use simple cmdlets and methods for string manipulation.
$path = "C:\temp\test"
$extension = "_trg"
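# Skip the file types that never need a trigger file (.jkl and .mno in the question)
$files = Get-ChildItem -Path $path | Where-Object { @('.jkl', '.mno') -notcontains $_.Extension }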
$fileNames = $files | Select-Object -ExpandProperty Name
$files | Where-Object{
# Check what extension this file is so we can find the appropriate partner
if($_.Extension.Contains($extension)){
# Attempt to find a matching non trg file
$fileNames -notcontains $_.Name.Substring(0, $_.Name.LastIndexOf($extension))
} else {
# Attempt to find a matching trg file
$fileNames -notcontains "$($_.Name)$extension"
}
} | Remove-Item -Force -Confirm:$false -WhatIf
We save all the file names in $fileNames and use -notcontains to see whether the file in the loop has its partner in the list. If it does not, it is passed down the pipeline.
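The pairing logic can also be checked on the question's sample list without touching the file system. The sketch below is an assumption-laden variation, not the answer's exact code: it works on plain name strings and restricts the check to the extensions the question lists as requiring a trigger file; `$needsTrigger` and `$orphans` are illustrative names.

```powershell
$names = 'filename1.abc', 'filename1.abc_trg', 'filename2.def', 'filename2.def_trg',
         'filename3.abc', 'filename4.def_trg', 'filename5.ghi', 'filename6.jkl',
         'filename7.mno', 'filename8.ghi', 'filename8.ghi_trg', 'filename9.jkl'
$needsTrigger = '.abc', '.def', '.ghi'   # extensions that must be paired

$orphans = @($names | Where-Object {
    $ext = [System.IO.Path]::GetExtension($_)
    if ($ext -like '*_trg') {
        # Trigger file: orphaned when the file without the _trg suffix is missing.
        $names -notcontains ($_ -replace '_trg$', '')
    } elseif ($needsTrigger -contains $ext) {
        # Data file: orphaned when its trigger file is missing.
        $names -notcontains ($_ + '_trg')
    } else {
        $false   # other extensions (.jkl, .mno) are never touched
    }
})

$orphans
```

For this list the orphans are filename3.abc, filename4.def_trg and filename5.ghi, which matches the files the question expects to be deleted.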

Renaming a new folder file to the next incremental number with powershell script

I would really appreciate your help with this
I should first mention that I have been unable to find any specific solutions and I am very new to programming with powershell, hence my request
I wish to write (and later schedule) a script in PowerShell that looks for a file with a specific name, RFUNNEL, and then renames it to R0000001. There will only be one such RFUNNEL file in the folder at any time. However, the next time the script runs and finds a new RFUNNEL file, I want it to be renamed to R0000002, and so on.
I have struggled with this for some weeks now and the seemingly similar solutions that I have come across have not been of much help - perhaps because of my admittedly limited experience with powershell.
Others might be able to do this with less syntax, but try this:
$rootpath = "C:\derp"
if (Test-Path "$rootpath\RFUNNEL.txt")
{ $maxfile = Get-ChildItem $rootpath | ?{$_.BaseName -like "R[0-9][0-9][0-9][0-9][0-9][0-9][0-9]"} | Sort BaseName -Descending | Select -First 1 -Expand BaseName;
if (!$maxfile) { $maxfile = "R0000000" }
[int32]$filenumberint = $maxfile.substring(1); $filenumberint++
[string]$filenumberstring = ($filenumberint).ToString("0000000");
[string]$newName = ("R" + $filenumberstring + ".txt");
Rename-Item "$rootpath\RFUNNEL.txt" $newName;
}
Here's an alternative using regex:
[cmdletbinding()]
param()
$triggerFile = "RFUNNEL.txt"
$searchPattern = "R*.txt"
$nextAvailable = 0
# If the trigger file exists
if (Test-Path -Path $triggerFile)
{
# Get a list of files matching search pattern
$files = Get-ChildItem "$searchPattern" -exclude "$triggerFile"
if ($files)
{
# store the filenames in a simple array
$files = $files | select -expandProperty Name
$files | Write-Verbose
# Get next available file by carrying out a
# regex replace to extract the numeric part of the file and get the maximum number
$nextAvailable = ($files -replace '([a-z])(.*)\.txt', '$2' | measure-object -max).Maximum
}
# Add one to either the max or zero
$nextAvailable++
# Format the resulting string with leading zeros
$nextAvailableFileName = 'R{0:000000#}.txt' -f $nextAvailable
Write-Verbose "Next Available File: $nextAvailableFileName"
# rename the file
Rename-Item -Path $triggerFile -NewName $nextAvailableFileName
}
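The numbering step on its own can be tried against a list of names. This is a minimal sketch under stated assumptions: `$existing` is a made-up list standing in for the folder contents, and the names are assumed to follow the R plus seven digits plus .txt shape used in the answers above.

```powershell
# Hypothetical existing files (assumption: real input would come from Get-ChildItem).
$existing = 'R0000001.txt', 'R0000002.txt', 'R0000010.txt', 'RFUNNEL.txt'

# Collect the numeric part of every name shaped like R<7 digits>.txt.
$numbers = @($existing | ForEach-Object {
    if ($_ -match '^R(\d{7})\.txt$') { [int]$Matches[1] }
})

# Largest number used so far, or 0 when no numbered file exists yet.
$max = if ($numbers.Count) { ($numbers | Measure-Object -Maximum).Maximum } else { 0 }

# D7 pads the number with leading zeros to seven digits.
$nextName = 'R{0:D7}.txt' -f ([int]$max + 1)
$nextName
```

Anchoring the pattern with ^ and $ keeps RFUNNEL.txt and any other file starting with R out of the count.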