PowerShell -match Unexpected Results - powershell

I've written a simple PowerShell script that is designed to take a file name and then move the file into a particular folder.
The files in question are forms scanned in as PDF documents.
Within the file name, I have defined a unique string of characters used to help identify which form it is and then my script will sort the file into the correct folder.
I've captured the file name as a string variable.
I am using -match to evaluate the file name string variable and my issue is that match is acting like...well -like.
For example, my script looks for (F1) in the string and if this returns true my script will move the file into a folder named IT Account Requests.
This all works well until my script finds a file with (F10) in the name, as 'match' will evaluate the string and find a match for F1 also.
How can I use 'match' to return true for an exact string block?
I know this may sound like a fairly basic newbie question to ask, but how do I use -match to tell the difference between the two file types?
I've scoured the web looking to learn how to force -match to do what I would like but maybe I need a re-think here and use something other than 'match' to gain the result I need?
I appreciate your time reading this.
Code Example:
$Master_File_name = "Hardware Request (F10).pdf"
if ($Master_File_name -match "(F1)"){Write-Output "yes"}
if ($Master_File_name -match "(F10)"){Write-Output "yes"}
Both if statements return 'yes'

-match does a regular expression based match against your string, meaning that the right-hand side argument is a regex pattern, not a literal string.
In regex, (F1) means "match on F and 1, and capture the substring as a separate group".
To match on the literal string (F1), escape the pattern either manually:
if($Master_File_Name -match '\(F1\)'){Write-Output 'yes'}
or have it done for you automatically using the Regex.Escape() method:
if($Master_File_Name -match [regex]::Escape('(F1)')){Write-Output 'yes'}
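A minimal sketch of the difference between the unescaped and the escaped pattern. The first file name is made up for illustration; only the second one comes from the question:

```powershell
# Two sample names; 'IT Account Request (F1).pdf' is a hypothetical example.
$names = 'IT Account Request (F1).pdf', 'Hardware Request (F10).pdf'

# Escape once, so the parentheses are treated as literal characters.
$pattern = [regex]::Escape('(F1)')   # yields \(F1\)

$results = foreach ($name in $names) {
    [pscustomobject]@{
        Name      = $name
        Unescaped = $name -match '(F1)'     # capture group: finds F1 anywhere, including inside F10
        Escaped   = $name -match $pattern   # literal "(F1)" only
    }
}
$results
```

Both names should report True in the Unescaped column, while only the (F1) name should report True in the Escaped column.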

Related

Having problem with split method using powershell

I have an XML file that contains a line like this:
<!--<__AMAZONSITE id="-123456780" instance ="CATZ00124"__/>-->
I need the id and instance values from that line, stored in two separate variables: -123456780 in one and CATZ00124 in the other.
Below is the sample code I have tried:
$xmlfile = 'D:\Test\sample.xml'
$find_string = '__AMAZONSITE'
$array = @((Get-Content $xmlfile) | select-string $find_string)
Write-Host $array.Length
foreach ($commentedline in $array)
{
Write-Host $commentedline.Line.Split('id=')
}
I am getting below result:
<!--<__AMAZONSITE
"-123456780"
nstance
"CATZ00124"__/>
The preferred way still is to use XML tools for XML files.
As long as the line with AMAZONSITE and instance is unique in the file, this could do:
## Q:\Test\2019\09\13\SO_57923292.ps1
$xmlfile = 'D:\Test\sample.xml' # '.\sample.xml' #
## see following RegEx live and with explanation on https://regex101.com/r/w34ieh/1
$RE = '(?<=AMAZONSITE id=")(?<id>[\d-]+)" instance ="(?<instance>[^"]+)"'
if((Get-Content $xmlfile -raw) -match $RE){
$AmazonSiteID = $Matches.id
$Instance = $Matches.instance
}
LotPings' answer sensibly recommends using a regular expression with capture groups to extract the substrings of interest from each matching line.
You can incorporate that into your Select-String call for a single-pipeline solution (the assumption is that the XML comments of interest are all on a single line each):
# Define the regex to use with Select-String, which both
# matches the lines of interest and captures the substrings of interest
# ('id' and 'instance' attributes) via capture groups, (...)
$regex = '<!--<__AMAZONSITE id="(.+?)" instance ="(.+?)"__/>-->'
Select-String -LiteralPath $xmlfile -Pattern $regex | ForEach-Object {
# Output a custom object with properties reflecting
# the substrings of interest reported by the capture groups.
[pscustomobject] @{
id = $_.Matches.Groups[1].Value
instance = $_.Matches.Groups[2].Value
}
}
The result is an array of custom objects that each have an .id and .instance property with the values of interest (which is preferable to setting individual variables); in the console, the output would look something like this:
id instance
-- --------
-123456780 CATZ00124
-123456781 CATZ00125
-123456782 CATZ00126
As for what you tried:
Note: I'm discussing your use of .Split(), though for extracting a substring, as is your intent, .Split() is not the best tool, given that it is only the first step toward isolating the substring of interest.
As LotPings notes in a comment, in Windows PowerShell, $commentedline.Line.Split('id=') causes the String.Split() method to split the input string by any of the individual characters in split string 'id=', because the method overload that Windows PowerShell selects takes a char[] value, i.e. an array of characters, which is not your intent.
You could rectify this as follows, by forcing use of the overload that accepts string[] (even though you're only passing one string), which also requires passing an options argument:
$commentedline.Line.Split([string[]] 'id=', 'None') # OK, splits by whole string
Note that in PowerShell Core the logic is reversed, because .NET Core introduced a new overload with just [string] (with an optional options argument), which PowerShell Core selects by default. Conversely, this means that if you do want by-any-character splitting in PowerShell Core, you must cast the split string to [char[]].
On a general note, PowerShell has the -split operator, which is regex-based and offers much more flexibility than String.Split() - see this answer.
Applied to your case:
$commentedline.Line -split 'id='
While id= is interpreted as a regex by -split, that makes no difference here, given that the string contains no regex metacharacters (characters with special meaning); if you do want to safely split by a literal substring, use [regex]::Escape('...') as the RHS.
Note that -split is case-insensitive by default, as PowerShell generally is; however, you can use the -csplit variant for case-sensitive matching.
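The splitting variants discussed above can be compared side by side. This is only a sketch; the sample line is the one from the question, and the 'aXbxc' string is made up to show the case-sensitivity difference:

```powershell
$line = '<!--<__AMAZONSITE id="-123456780" instance ="CATZ00124"__/>-->'

# String.Split() with an explicit string[] separator splits by the whole string,
# regardless of which overload the PowerShell edition would pick by default.
$viaMethod = $line.Split([string[]] 'id=', [StringSplitOptions]::None)

# -split treats its right-hand side as a regex; escaping makes it a literal.
$viaOperator = $line -split [regex]::Escape('id=')

# -split ignores case, -csplit does not.
$insensitive = 'aXbxc' -split 'x'    # splits at both X and x
$sensitive   = 'aXbxc' -csplit 'x'   # splits at the lowercase x only

$viaMethod.Count, $viaOperator.Count, $insensitive.Count, $sensitive.Count
```

Both whole-string approaches should give two parts for the sample line, while the case-insensitive split of 'aXbxc' gives three parts and the case-sensitive one gives two.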

Get Substring of value when using import-Csv in PowerShell

I have a PowerShell script that imports a CSV file, filters out rows from two columns and then concatenates a string and exports to a new CSV file.
Import-Csv "redirect_and_canonical_chains.csv" |
Where { $_."Number of Redirects" -gt 1} |
Select {"Redirect 301 ",$_.Address, $_."Final Address"} |
Export-Csv "testing-export.csv" -NoTypeInformation
This all works fine however for the $_.Address value I want to strip the domain, sub-domain and protocol etc using the following regex
^(?:https?:\/\/)?(?:[^@\/\n]+@)?(?:www\.)?([^:\/\n]+)
This individually works and matches as I want but I am not sure of the best way to implement when selecting the data (should I use $match, -replace etc) or whether I should do it after importing?
Any advice greatly appreciated!
Many thanks
Mike
The best place to do it would be in the select clause, as in:
select Property1,Property2,@{name='NewProperty';expression={$_.Property3 -replace '<regex>',''}}
That's what a calculated property is: you give the name, and the way to create it. Your regex might need revision to work with PowerShell, though.
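As a rough sketch of a calculated property in action: the object below is a hypothetical stand-in for one row from Import-Csv, the column name Path is made up, and the pattern is a simplified assumption rather than the regex from the question:

```powershell
# Hypothetical stand-in for one row returned by Import-Csv.
$row = [pscustomobject]@{
    Address         = 'http://www.example.org/more/stuff'
    'Final Address' = 'more/stuff2'
}

# Simplified pattern (an assumption): optional protocol, optional www., then the host.
$hostPattern = '^(?:https?://)?(?:www\.)?[^:/\n]+'

$result = $row | Select-Object @{
    Name       = 'Path'   # hypothetical column name
    Expression = { $_.Address -replace $hostPattern, '' }
}, 'Final Address'
$result
```

The output object keeps 'Final Address' as-is and gains a Path property holding the address with protocol and host removed, so it can be piped on to Export-Csv.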
I've realized now that I can just use .Replace in the following way :)
Select {"Redirect 301 ",$_.Address.Replace('http://', 'testing'), $_."Final Address"}
Based on follow-up comments, the intent behind your Select[-Object] call was to create a single string with space-separated entries from each input object.
Note that use of Export-Csv then makes no sense, because it will create a single Length column with the input strings' length rather than output the strings themselves.
In a follow-up comment you posted a solution that used Write-Host to produce the output string, but Write-Host is generally the wrong tool to use, unless the intent is explicitly to write to the display only, thereby bypassing PowerShell's output streams and thus the ability to send the output to other commands, capture it in a variable or redirect it to a file.
Here's a fixed version of your command, which uses the -join operator to join the elements of a string array to output a single, space-separated string:
$sampleCsvInput = [pscustomobject] @{
  Address = 'http://www.example.org/more/stuff';
  'Final Address' = 'more/stuff2'
}
$sampleCsvInput | ForEach-Object {
  "Redirect 301",
  ($_.Address -replace '^(?:https?://)?(?:[^@/\n]+@)?(?:www\.)?([^:/\n]+)', ''),
  $_.'Final Address' -join ' '
}
Note that , - PowerShell's array-construction operator - has higher precedence than the -join operator, so the -join operation indeed joins all 3 preceding array elements.
The above yields the following string:
Redirect 301 /more/stuff more/stuff2

Splitting a string and selecting a substring in PowerShell

I am attempting to isolate and return a small variable string from a larger string.
I am struggling because the larger string I am extracting from is in list format. I can split this into substrings successfully, but I do not know how to select one of these substrings without returning the entire string. The string is generated by a command line process.
$StringList
AppTitle1.1.1221.aaa111
AppSubTitle
AnotherAppTitle1.1.1221.aaa111
AnotherAppSubTitle
...and so on
I can split the list string into substrings by line using regular expressions to split at whitespace (there is no whitespace within any given line).
$StringList -split "\s"
Once I have split the string into the desired substrings, however, I am not sure how to select the desired substring. The length of the list (i.e. the number of apps present in it) and the location of the app I need to retrieve the title of within that list are entirely variable, so I cannot simply use substring reference numbers. I've tried several approaches to selecting the substring, but each has simply returned the entire string, or nothing at all.
Here are two approaches I've attempted. The first returns the entire string list and the second returns nothing.
$DesiredAppTitle = Select-String -InputObject $StringList -Pattern "AnotherAppTitle"
or
$DesiredAppTitle = foreach ($_.substring in $StringList)
{
if ($_.substring -contains "AnotherAppTitle")
{
return $_.name
}
}
What I'd like for it to return is:
AnotherAppTitle1.1.1221.aaa111
I'm sure there are a million ways to do this, so if neither of my approaches seems like a good fit, I'm open to other suggestions. Any assistance would be greatly appreciated. Thanks in advance!
# Multi-line input string.
$StringList = @'
AppTitle1.1.1221.aaa111
AppSubTitle
AnotherAppTitle1.1.1221.aaa111
AnotherAppSubTitle
'@
# Split it into whitespace-separated tokens.
$tokens = -split $StringList
# Match the token of interest.
$tokens -match '^AnotherAppTitle'
The above yields:
AnotherAppTitle1.1.1221.aaa111
Note the use of regex-matching operator with anchor ^ to ensure that the search term matches at the start of a token, and the use of the unary form of the -split operator, which splits the input by any nonempty whitespace runs.
As for what you tried:
If you pass a multi-line string to Select-String, it is considered a single "line" and, in case of a match, that whole "line" is output.
foreach ($_.substring in $StringList) won't even run, because $_.substring is not a valid iteration variable (you shouldn't use $_, which is an automatic variable, as an enumeration variable at all, and the .substring access breaks the syntax).
If you used $_ instead of $_.substring, the loop would technically work (even though, again, $_ shouldn't be used as an iteration variable), but the loop would only execute once, for the entire multi-line string.
Even if $_.substring did refer to a line (it doesn't), -contains is the wrong operator to use, because it tests if a LHS collection contains the RHS value in full.
Also, use break to exit a loop, not return.
Using the -match approach as demonstrated at the top is the better approach, but if you did want to solve this with a foreach loop:
$DesiredAppTitle = foreach ($token in -split $StringList) {
if ($token -match '^AnotherAppTitle') { $token; break }
}
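The points above about Select-String and -contains can be checked with a small sketch. It rebuilds the sample list with -join instead of a here-string purely for brevity:

```powershell
$StringList = 'AppTitle1.1.1221.aaa111', 'AppSubTitle',
              'AnotherAppTitle1.1.1221.aaa111', 'AnotherAppSubTitle' -join "`n"

# Passed as one string, Select-String sees a single "line" and reports all of it.
$whole = $StringList | Select-String -Pattern 'AnotherAppTitle'

# Split into lines first, and each line is matched on its own.
$lines  = $StringList -split '\r?\n'
$single = $lines | Select-String -Pattern '^AnotherAppTitle'
$single.Line

# -contains tests for an element that equals the RHS in full, not for a substring.
$partial = $lines -contains 'AnotherAppTitle'   # no element equals this value
$full    = $lines -contains 'AppSubTitle'       # one element equals this value
```

Here $whole.Line is the entire multi-line string, whereas $single.Line is just the one matching line.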

How can I test file existence using regular expressions

I want to test if the file C:\workspace\test_YYYYMMDD.txt, where YYYYMMDD means year, month, and date, exists on my disk.
How can I do this in PowerShell?
I know that test-path test_*.txt command returns true.
But test_*.txt also returns true when the file name is something like test_20170120asdf.txt, or test_2015cc1119aabb.txt.
I don't want file names like test_20170120asdf.txt being marked as true in test-path.
I'd like to apply regular expression test_\d{8}\.txt in test-path. How can I do this in PowerShell?
Something like:
gci C:\workspace\test_*.txt | ? {$_.Name -match '^test_\d{8}\.txt$'}
Wildcard expressions are much more limited in their matching abilities than regular expressions - see Get-Help about_Wildcards - but in this particular case they're enough:
Test-Path test_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].txt
If more sophisticated matching is needed, see LotPings' answer, which shows how to use regular expressions.
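If a True/False result like the one from Test-Path is wanted from the regex approach, one option is to cast the filtered output to [bool]. The sketch below assumes a throwaway directory under the temp folder rather than C:\workspace, and creates two sample files named as in the question:

```powershell
# Scratch directory with two sample files (names taken from the question).
$dir = Join-Path ([IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
'test_20170120.txt', 'test_20170120asdf.txt' | ForEach-Object {
    New-Item -ItemType File -Path (Join-Path $dir $_) | Out-Null
}

# Wildcard prefilter, then an anchored regex; [bool] turns "any match?" into True/False.
$exists = [bool] (Get-ChildItem -Path $dir -Filter 'test_*.txt' |
    Where-Object { $_.Name -match '^test_\d{8}\.txt$' })

# The same check with a digit count that none of the sample files has.
$none = [bool] (Get-ChildItem -Path $dir -Filter 'test_*.txt' |
    Where-Object { $_.Name -match '^test_\d{6}\.txt$' })

# Clean up the scratch directory.
Remove-Item -Path $dir -Recurse -Force
$exists, $none
```

The anchors ^ and $ are what keep test_20170120asdf.txt from counting as a match.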

PowerShell Script - Report specific string from filename

I am currently trying to build a simple script that would report back a string from a filename
I have a directory full of files named as follows:
C:\[20141002134259308][302de103-6dc8-4e29-b303-5fdbd39c60c3][U0447744][10.208.15.40_54343][ABC_01].txt
C:\[20141002134239815][302de103-6dc8-4e29-b303-5fdbd39c60c3][U0011042][10.168.40.34_57350][ABC_01].txt
C:\[20141002134206386][302de103-6dc8-4e29-b303-5fdbd39c60c3][u1603381][10.132.171.132_54385][ABC_01].txt
C:\[2014100212260259][302de103-6dc8-4e29-b303-5fdbd39c60c3][U0010217][10.173.0.132_49921][ABC_01].txt
So, I'd like to extract from each filename the user IDs, which are identified by the letter U followed by seven digits, then create a new txt or csv file with all these IDs listed. That's it.
As Patrice pointed out, you really should try and do it yourself and come to us with the relevant piece of code you tried and the error that you are getting. That said, I'm bored, and this is really easy. I'd use a regex match against the name of the file, and then for each one that matched I'd output the captured string:
Get-ChildItem 'C:\Path\To\Files\*.txt' | Where{$_.Name -match "\[(U\d{7})\]"} | ForEach{$Matches[1]}
That will return:
U0447744
U0011042
u1603381
U0010217
If you want to output it to a file, just pipe that to Out-File, and specify the full path and name of the file you want to save that in.
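A small sketch of the extraction plus the file output, working on two of the file names from the question as plain strings. The output path under the temp folder is an assumption for illustration. Since -match is case-insensitive, the lowercase u1603381 is picked up as well; -cmatch would restrict the result to an uppercase U:

```powershell
# Two of the sample names from the question, as plain strings.
$names = '[20141002134259308][302de103-6dc8-4e29-b303-5fdbd39c60c3][U0447744][10.208.15.40_54343][ABC_01].txt',
         '[20141002134206386][302de103-6dc8-4e29-b303-5fdbd39c60c3][u1603381][10.132.171.132_54385][ABC_01].txt'

# Case-insensitive: both U... and u... IDs are captured.
$ids = foreach ($name in $names) {
    if ($name -match '\[(U\d{7})\]') { $Matches[1] }
}

# Case-sensitive variant: only IDs starting with an uppercase U.
$upperOnly = foreach ($name in $names) {
    if ($name -cmatch '\[(U\d{7})\]') { $Matches[1] }
}

# Hypothetical output location; adjust to wherever the list should be saved.
$outFile = Join-Path ([IO.Path]::GetTempPath()) 'user-ids.txt'
$ids | Out-File -FilePath $outFile
```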