Using Regex to match a path in Powershell - powershell

I wanted to use Powershell's Regex to match a specific string in a path. I then want to pipe that into Get-FileHash and get the MD5 hashes of all the files.
The path can change, depending on where the user has these files. So for instance, it can be
C:\Program Files\StackOverflow\Powershell\Regex
or
C:\StackOverflow\Powershell\Regex
I want to make it so that only the Regex portion is selected which I can then -Recurse and pipe into Get-FileHash. Also please note that there can be subfolders inside of Regex (for instance: /Regex/Folder1 and /Regex/Folder2)
I can't quite get how to go about it. Please help me with this, thank you.

A regex is not needed. There's Split-Path that parses paths and returns desired sections.
For example (this is on macOS; it should work the same on Windows):
PS >pwd
/Users/myHomeDir
PS >pwd | split-path -leaf
myHomeDir
A string that contains a path can be processed the same way:
$p = "C:\Program Files\StackOverflow\Powershell\Regex"
split-path -leaf $p
Regex

Try the following pattern using named capture group 'path'
(?i)^(?<path>.+?(?:\\|\/)Regex)(?:(?:\\|\/)+.*)?$
If you need case-sensitive matches, just remove the (?i) at the very beginning.
The pattern accepts both / and \ as path separators, and the folder may also be the last element of the path. Whatever follows the folder is matched by a non-capturing group (?:...) and is therefore not captured.
For full explanation and samples see: https://regex101.com/r/xxaJzY/1/
Then use it the following way:
$folderName = 'Regex'
$regexPattern = "(?i)^(?<path>.+?(?:\\|\/)$folderName)(?:(?:\\|\/)+.*)?$"
<some other code>
$path = [regex]::Match($item, $regexPattern).Groups['path'].Value
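To tie this back to the question, here is a minimal sketch of extracting the folder and then hashing everything below it. The sample path in $item is hypothetical, and wrapping the folder name in [regex]::Escape is my addition (it only matters if the name contains regex metacharacters). Get-FileHash requires PowerShell 4 or later.

```powershell
# Hypothetical input; in practice this would be the path supplied by the user.
$item = 'C:\Program Files\StackOverflow\Powershell\Regex\Folder1\file.txt'
$folderName = 'Regex'

# Same idea as the pattern above, with the folder name escaped for safety.
$regexPattern = '(?i)^(?<path>.+?[\\/]' + [regex]::Escape($folderName) + ')(?:[\\/]+.*)?$'
$path = [regex]::Match($item, $regexPattern).Groups['path'].Value
$path   # C:\Program Files\StackOverflow\Powershell\Regex

# Hash every file below the folder, but only if the folder exists on this machine.
if ($path -and (Test-Path -LiteralPath $path)) {
    Get-ChildItem -LiteralPath $path -Recurse -File |
        Get-FileHash -Algorithm MD5
}
```

If the pattern does not match, $path is an empty string and the hashing step is skipped.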

Related

Removing the front part of a string based on a specific character. (\)

I first create my array with a list of files in a directory (and subdirectories) using the Cmdlet Get-ChildItem, and store them in a variable
$PSVariable = (Get-ChildItem -Path "F:\SQL_Backups" -Recurse *.bak).FullName
I echo the variable ($PSVariable), this is my output (as desired):
F:\SQL_Backups\INTRAPORTAL\StoreDevelopment\StoreDevelopment_backup_2021_02_11_003002_3930170.bak
F:\SQL_Backups\INTRAPORTAL\StoreDevelopment\StoreDevelopment_backup_2021_02_12_003002_4780885.bak
F:\SQL_Backups\JDASQL\DEVMOD\DEVMOD_backup_2021_02_10_190002_5130923.bak
F:\SQL_Backups\JDASQL\DEVMOD\DEVMOD_backup_2021_02_11_190003_7621021.bak
Goal:
I need to remove the directory path from each array entry so that it contains only the file name, which will be stored in a temporary variable within a foreach loop:
StoreDevelopment_backup_2021_02_11_003002_3930170.bak
StoreDevelopment_backup_2021_02_12_003002_4780885.bak
DEVMOD_backup_2021_02_10_190002_5130923.bak
DEVMOD_backup_2021_02_11_190003_7621021.bak
Some will recommend simply using (.Name) in the Get-ChildItem command, but I need the array to have both the path and filename (FullName) as the array's contents are being used for other parts of the function. I'm a novice when it comes to regular expressions and I can't seem to get the results in the goal section. I've even tried using trim() methods, but no luck. Any recommendations would greatly be appreciated. Thank you.
Expanding on what @AdminOfThings recommended: you are making more work for yourself than you need to. PowerShell is an object-based scripting language, so to succeed you should use its full POWER.
The approach you're taking now is to keep only one property of this useful object, and then you find you need to start slicing and dicing it to make it work.
There's an easier way. We love easy here, and the easy way to do this is to take the full object and then pick and choose its properties where it makes sense, like this:
$i = 0
#changed to remove the .FullName at the end
$PSVariable = (Get-ChildItem -Path "F:\SQL_Backups" -Recurse *.bak)
ForEach ($item in $psVariable){
$i++
Write-host "Processing [$($item.Name)], item number $i of $($psVariable.Count)"
Copy-item -Path $item.FullName -Destination C:\temp -WhatIf
}
It gives you meaningful output and then you have the full selection of properties to work with.
The one that makes the most sense to use is just .Name as you reference above. But then you still have .FullName, which includes the qualified path as well.
If you want to see the full selection of properties, try this:
$PsVariable[0] | Format-list *
Offered only as an inferior option to FoxDeploy's: you can also use Split-Path to get the file name from a path.
$PSVariable = (Get-ChildItem -Path "F:\SQL_Backups" -Recurse *.bak).FullName
$PSVariable | Split-Path -Leaf
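If you still want the string-only route the question asked about, a rough sketch with the -replace operator is below. The array is filled with two of the sample paths from the question instead of a real Get-ChildItem call, and $fileNames is a name I made up for illustration.

```powershell
# Sample paths from the question, standing in for the Get-ChildItem output.
$PSVariable = @(
    'F:\SQL_Backups\INTRAPORTAL\StoreDevelopment\StoreDevelopment_backup_2021_02_11_003002_3930170.bak'
    'F:\SQL_Backups\JDASQL\DEVMOD\DEVMOD_backup_2021_02_10_190002_5130923.bak'
)

$fileNames = foreach ($fullName in $PSVariable) {
    # The greedy ^.*\\ consumes everything up to and including the last backslash.
    $fullName -replace '^.*\\', ''
}
$fileNames
```

$PSVariable itself is left untouched, so the full paths remain available to the rest of the function.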

Read a file and then based on file content names, move the file from one folder to another using Powershell script

I need to read a file (e.g. file.txt) which has file names as its content. File names are separated by unique character (e.g. '#'). So my file.txt looks something like:
ABC.txt#
CDE.csv#
XYZ.txt#
I need to read its content line by line based on its extension. I have 1 source folder and 1 destination folder. Below is my scenario that I need to achieve:
If extension = txt then
check if that file name exists in destination_folder1 or destination_folder2
if that file exists then
copy that file from source_folder1 to destination_folder1
else delete that file from destination_folder1
Else display msg as "Invalid file"
I am new to powershell scripting. can someone pls help? Thanks in advance.
It will make my job easier if we assume the following pseudocode. Then you can take the elements I demonstrate and change them to fit your needs.
If the string from "file.txt" contains the file extension "txt" then continue.
If the file does not exist in the destination folder then copy the file from the source folder to the destination folder.
Use Get-Content to read a text file.
Get-Content .\file.txt
Get-Content processes files line by line. This has a few consequences:
Each line in our input text file will trigger our code.
Each time our code triggers, it will have input that looks like this: ABC.txt#
We can focus on solving the problem for one line.
If we need to evaluate strings, I suggest using regular expressions.
Remember, we are operating on a single line from the text file:
ABC.txt#
We need to detect the file extension.
A good place to start would be the end of the string.
In regular expressions, the end of a string is represented by $
So let's start there.
Here is our regular expression so far:
$
The next thing that would be useful is if we accounted for that # symbol. We can do that by adding it before $
#$
If there was a different character, we would add that instead: ;$ Keep in mind that there are reserved characters in regular expressions. So we might need to escape certain characters with a backslash: \$$
Now we have to account for the file extension.
We have three letters, we don't know what they are.
Regular expressions have a shorthand character class that matches any word character (a letter, a digit, or an underscore): \w
Let's add three of those.
\w\w\w#$
Now, while crafting regular expressions, it is a good idea to limit the text we're looking for.
As humans, we know we're looking for .txt# But, so far, the computer only knows about txt# with no dot. So it would accept .txt#, .xlsx#, and anythingGoes# as matches. We limited the right side of our string. Now let's limit the left side.
We're only interested in three characters. And the left side is bounded by a . So let's add that to our regular expression. I'll also mention that a period is a reserved character in regular expressions. So, we will have to escape it.
\.\w\w\w#$
So if we're looking at text like this
ABC.txt#
then our regular expression will output text like this
.txt#
Now, .txt# is a pretty good result. But we can make our job a little easier by limiting the result to just the file extension.
There are several ways of doing this. But I suggest using regular expression groups.
We create a group by surrounding our target with parentheses.
\.(\w\w\w)#$
This now produces output like:
txt
From here, we can just make intuitive comparisons like if txt = txt.
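As a quick check of the steps above, here is a small sketch that runs the finished expression against one sample line with the -match operator. The variable names $line, $wholeMatch and $extension are only for illustration.

```powershell
$line = 'ABC.txt#'

if ($line -match '\.(\w\w\w)#$') {
    # $matches[0] is the whole match, $matches[1] is the first group.
    $wholeMatch = $matches[0]   # .txt#
    $extension  = $matches[1]   # txt
}

# The "intuitive comparison" mentioned above.
if ($extension -eq 'txt') { Write-Output 'Text file found' }
```

Note that the whole match is still .txt#; it is the group that gives you the bare extension.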
Another piece of the puzzle is testing whether a file already exists. For this we can use the Test-Path and Join-Path cmdlets.
$destination = ".\destination 01"
$exampleFile = "ABC.txt"
$destinationFilePath = Join-Path -Path $destination -ChildPath $exampleFile
Test-Path -Path $destinationFilePath
With these concepts, it is possible to write a working example.
# Folder locations.
$source = ".\source"
$destination = ".\destination 01"
# Load input file.
Get-Content .\file.txt |
Where-Object {
# Enter our regular expression.
# I've added an extra group to capture the file name.
# The $matches automatic variable is created when the -match comparison operator is used.
if ($_ -match '([\w ]+\.(\w\w\w))#$')
{
# Which file extensions are we interested in processing?
# Here $matches[2] represents the file extension: ex "txt".
# We use a switch statement to handle each type of file extension.
# Accept new file types by creating new switch cases.
switch ($matches[2])
{
"txt" {$true; Break}
#"csv" {$true; Break}
#"pdf" {$true; Break}
default {$false}
}
}
else { $false }
} |
ForEach-Object {
# Here $matches[1] is the file name captured from the input file.
$sourceFilePath = Join-Path -Path $source -ChildPath $matches[1]
$destinationFilePath = Join-Path -Path $destination -ChildPath $matches[1]
$fileExists = Test-Path -Path $destinationFilePath
# Copy the source file to the destination if the destination doesn't exist.
if (!$fileExists)
{ Copy-Item -Path $sourceFilePath -Destination $destinationFilePath }
}
Note on Copy-Item
Copy-Item has known issues.
Issue #10458 | PowerShell | GitHub
Issue #2581 | PowerShell | GitHub
You can substitute robocopy which is more reliable.
Robocopy - Wikipedia
The robocopy syntax is:
robocopy <source> <destination> [<file>[ ...]] [<options>]
where <source> and <destination> can be folders only.
So, if you want to copy a file, you have to write it like this:
robocopy .\source ".\destination 01" ABC.txt
We can invoke robocopy using Start-Process and the variables we already have.
# Copy the source file to the destination if the destination doesn't exist.
if (!$fileExists)
{
Start-Process -FilePath "robocopy.exe" -ArgumentList "`"$source`" `"$destination`" `"$($matches[1])`" /unilog+:.\robolog.txt" -WorkingDirectory (Get-Location) -NoNewWindow
}
Using Get-ChildItem
You use file.txt as input. If you wanted to gather a list of files on disc, you can use Get-ChildItem.
Multiple Conditions
You wrote "destination_folder1 or destination_folder2". If you need multiple conditions, you can construct them from three things.
Use the if statement. Inside its test condition, combine conditions with the logical -or operator, and group them with parentheses to make them easier to read.
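A minimal sketch of such a combined test is below. The second destination folder and the variable names are assumptions on my part, since the question does not show the real ones.

```powershell
# Hypothetical folder and file names, for illustration only.
$destination1 = '.\destination 01'
$destination2 = '.\destination 02'
$fileName     = 'ABC.txt'

$inFirst  = Test-Path -LiteralPath (Join-Path -Path $destination1 -ChildPath $fileName)
$inSecond = Test-Path -LiteralPath (Join-Path -Path $destination2 -ChildPath $fileName)

# Parentheses group each condition; -or makes the test pass if either one is true.
if (($inFirst) -or ($inSecond)) {
    $message = "$fileName found"
}
else {
    $message = 'Invalid file'
}
Write-Output $message
```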
Functions
If you need to move a piece of code around, you can use a function. Just remember to create parameters for the inputs to the function. Then call a PowerShell function without parentheses or commas:
# Calling a PowerShell function.
myFunction parameterOne parameterTwo parameterThree
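To make that concrete, here is a small sketch of a function with declared parameters. Test-FileInFolders is a hypothetical helper of my own, not something from the script above.

```powershell
# Hypothetical helper: returns $true if the file name exists in any of the given folders.
function Test-FileInFolders {
    param(
        [string]$FileName,
        [string[]]$Folders
    )
    foreach ($folder in $Folders) {
        $candidate = Join-Path -Path $folder -ChildPath $FileName
        if (Test-Path -LiteralPath $candidate) { return $true }
    }
    return $false
}

# Called with named parameters and no parentheses around the argument list.
# The comma here builds the array for -Folders; it does not separate arguments.
Test-FileInFolders -FileName 'ABC.txt' -Folders '.\destination 01', '.\destination 02'
```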
Writing Output
You can use Write-Output to send text to the console.
Write-Output "Invalid File"
Further Reading
Here are some references which you might find useful.
about_Comparison_Operators - PowerShell | Microsoft Docs
about_Pipelines - PowerShell | Microsoft Docs
about_Switch - PowerShell | Microsoft Docs
Regular-Expressions.info - Regex Tutorial, Examples and Reference - Regexp Patterns
Where-Object (Microsoft.PowerShell.Core) - PowerShell | Microsoft Docs

Test-Path -isValid with wildcards

I am trying to validate paths so I can provide meaningful error logging, and I am running into an issue with wildcards.
This returns False unless there is a folder and something in it, but it should return True.
Test-Path -isValid -path:"C:\Somefolder\*"
And like this doesn't work because -literalPath doesn't interpret wildcards.
Test-Path -isValid -literalPath:"C:\Somefolder\*"
My sense is that I am going to have to test for wildcards, and if found Test-Path -isValid on the parent folder. But then I run into issues with -like because I can't really test for a condition like *.EXT. Which has me thinking the only real answer is a RegEx, but this feels like something so basic I shouldn't really need to resort to a RegEx and I am probably missing something.
Note that for a variety of reasons I am limited to PS v2.
EDIT: To clarify, the actual path is variable. Users provide a path in an XML file, I then validate the path and do something with it. So, it might be that the user wants to delete all TXT files in a certain path. Or all files. Or even all files and subfolders. Thus C:\Somefolder\* needs to be supported. If they had C:\\Somefolder\* or C:Somefolder\* I would want to flag that as an invalid path. But C:\Somefolder\* when Somefolder doesn't exist is not an invalid path, it's a missing folder and I want to flag that as a different error.
Indeed you need a regular expression for validating a path specification. Something like this should work:
$re = '^[a-z]:[/\\][^{0}]*$' -f [regex]::Escape(([IO.Path]::InvalidPathChars -join ''))
'C:\something\*' -match $re # returns $true
The expression will match any string starting with a letter followed by a colon, a forward or backslash, and any number of valid path characters.
Note that consecutive path separators are valid in a path, so C:\\something\* -match $re will evaluate to $true as well, as it should.
If you want to validate actual (existing) paths instead of path specs you can use Get-ChildItem:
function Test-WildcardPath($Path) {
Get-ChildItem $Path -ErrorAction SilentlyContinue >$null
return $?
}
Again, C:\\something\* will evaluate to $true, since consecutive path separators are allowed in a path.
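Since the question wants to tell an invalid path apart from a missing folder, the two checks could be combined roughly like this. It is only a sketch: Get-PathStatus and its three return values are names I made up, it calls the [IO.Path]::GetInvalidPathChars() method where the snippet above reads the InvalidPathChars field, and it follows the regex above in treating consecutive separators as valid. It sticks to syntax that should be available in PS v2.

```powershell
function Get-PathStatus($Path) {
    # Same expression as above, built from the invalid path characters.
    $invalid = [regex]::Escape(([IO.Path]::GetInvalidPathChars() -join ''))
    $re = '^[a-z]:[/\\][^{0}]*$' -f $invalid
    if ($Path -notmatch $re) { return 'InvalidPath' }

    # Drop a trailing wildcard segment such as \* or \*.txt before testing the folder.
    $folder = $Path -replace '[/\\][^/\\]*[*?][^/\\]*$', ''
    if (-not (Test-Path -LiteralPath $folder)) { return 'MissingFolder' }
    return 'OK'
}

Get-PathStatus 'C:Somefolder\*'    # InvalidPath (no separator after the colon)
Get-PathStatus 'C:\Somefolder\*'   # MissingFolder or OK, depending on the machine
```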
Can you try it like this:
Test-Path -IsValid C:\Somefolder
Edit: then why don't you leave out the wildcards for Test-Path?
$test.Substring(0,($test.length-($test.Split("\")[-1]).length-1))

In function repeat an action for each entered parameter

My main script runs gci once on a drive specified via the -path parameter, then builds several different tables from that output. Below is the part of my script that builds a specific table from a directory specified via the -folder parameter, for example:
my-globalfunction -path d:\ -folder d:\folder
It works fine, but only for one folder path. The goal is that the user can enter multiple folder paths and get a table for each -folder value, like this:
This clause in your Where-Object would be the issue:
$_.FullName.StartsWith($folder, [System.StringComparison]::OrdinalIgnoreCase)
The array of folders passed are most likely being cast as one long string which would never match. I had a regex solution posted but remembered a simpler way after looking at what your logic was trying to do.
Simpler Way
An even easier way is to put this information right into Get-ChildItem, since it accepts string arrays for -Path. This way I don't think you even need two parameters, since you never use the results from $fol again anyway. This assumes that you are looking for all subfolders of $folder:
$gdfolders = Get-ChildItem -Path $folder -Recurse -Force | Where-Object{$_.psiscontainer}
That would return all subfolders of the paths provided. If you have PowerShell 3.0 or higher this would even be easier.
$gdfolders = Get-ChildItem -Path $folder -Recurse -Force -Directory
Update from comments
The code you have displayed is incomplete, which is what led me to the solution you see above. If you do use the variable $fol somewhere else that you do not show, let's go back to my earlier regex solution, which would fit better with what you already have.
$regex = "^($(($folder | ForEach-Object{[regex]::Escape($_)}) -join "|")).+"
....
$gdfolders = $fol | Where-Object{($_.Attributes -eq "Directory") -and ($_.FullName -match $regex)}
This builds a regex that, I assume, matches your logic: locate folders whose path begins with any of the paths passed.
Using your example input of "d:\folder1", "d:\folder2", the variable $regex works out to ^(d:\\folder1|d:\\folder2).+. Special characters, like \, are escaped automatically by the static method [regex]::Escape, which is applied to each element. We then use -join to place a pipe between the elements, which inside the regex group means "match what's on the left OR on the right". For completeness' sake, the caret ^ states that the match has to occur at the beginning of the path, although this is most likely redundant. The pattern matches paths that start with either "d:\folder1" or "d:\folder2". The .+ at the end means "match one or more characters", which should ensure we don't match the folder "d:\folder1" itself but merely its children.
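Here is a small sketch that builds the same expression step by step and tries it on a few strings. The sample paths are invented, and plain strings stand in for the FullName values of real Get-ChildItem objects.

```powershell
$folder = 'd:\folder1', 'd:\folder2'

# Escape each path, then join the results with the regex alternation character.
$escaped = $folder | ForEach-Object { [regex]::Escape($_) }
$regex = '^(' + ($escaped -join '|') + ').+'
$regex   # ^(d:\\folder1|d:\\folder2).+

# Made-up sample paths to show what is and is not matched.
$samplePaths = 'd:\folder1', 'd:\folder1\sub', 'd:\folder2\a\b', 'd:\other\sub'
$matched = $samplePaths | Where-Object { $_ -match $regex }
$matched   # d:\folder1\sub and d:\folder2\a\b
```

One thing to keep in mind: a sibling such as d:\folder10 would also match, because .+ accepts any characters after the prefix, not just a path separator.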
Side Note
The quotes in the line with ’Size (MB)’ are not the proper ones which are '. If you have issues around that code consider changing the quotes.

Powershell concatenating text to a variable

My source files all reside in one folder whose path is contained in a variable named $template.
I need to specify the exact filename as each file goes to a different destination.
My goal is to merely concatenate the filename to the variable.
Example:
$template = "D:\source\templatefiles\"
Filename1 is: "graphic-183.jpg"
I have tried:
Join-Path $template graphic-183.jpg
Issuing this at the cli appears to do what I want.
But now, how do I reference this concatenated file path short of creating a new variable for each file? It isn't as simple as for-nexting my way through a list as depending on the filename that determines where the file goes.
I am toying with case else, elseIf, but surely it isn't this hard.
The bottom line is, I just want to prefix the folder path to each filename and hard code the destination as it will always be the same each time the script is run.
edit
I just edited this as I forgot to mention how I am trying to use this.
In my script I intend to have lines like:
Copy-Item -Path $template filename.ext -Destination $destfolder
It's the highlighted part above that I am trying to join $template to the filename.
Thanks for any advice.
-= Bruce D. Meyer
maybe this is what you want?
you can call cmdlets in place, using parentheses, like so:
Copy-Item -Path (Join-Path $template filename.ext) -Destination $destfolder
this causes PowerShell to go from "argument mode" to "expression mode" - i.e., it returns the output of the Join-Path cmdlet as an expression.
and yes, David's and Ansgar's suggestions are also helpful - try this to get full paths only:
(get-childitem $template) | select fullname
You could build the path like this:
$template = "D:\source\templatefiles\"
Copy-Item -Path "${template}filename.ext" ...
However, I think David's suggestion might be a better solution for your problem. You could map filenames to destination folders with a hash table and do something like this:
$locations = @{
"foo" = "C:\some"
"bar" = "C:\other"
...
}
Get-ChildItem $template | % { Copy-Item $_.FullName $locations[$_.Name] }
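To see the mapping idea without touching any files, a sketch like the following builds the source and destination pairs first. The second file name and both destination folders are placeholders I invented, and [pscustomobject] needs PowerShell 3 or later.

```powershell
$template = 'D:\source\templatefiles\'

# Hypothetical mapping of file names to destination folders.
$locations = @{
    'graphic-183.jpg' = 'C:\some'
    'logo.png'        = 'C:\other'
}

# Build the list of planned copies; nothing is copied at this point.
$plan = foreach ($name in $locations.Keys) {
    [pscustomobject]@{
        Source      = "${template}$name"
        Destination = $locations[$name]
    }
}
$plan | Format-Table -AutoSize
```

Once the list looks right, each entry can be handed to Copy-Item, for example with $plan | ForEach-Object { Copy-Item -Path $_.Source -Destination $_.Destination }.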