Reading the official docs, it's obvious that PowerShell's -match operator is more powerful than -like (thanks to regular expressions). It also seems to be about 10 times faster according to this article: https://www.pluralsight.com/blog/software-development/powershell-operators-like-match.
Are there specific cases where I should prefer -like over -match? If not, why should I use -like at all? Does it exist only for historical reasons?
I've never seen -match test that much faster than -like, if at all. Normally I see -like at about the same speed or better.
But I never rely on a single test run; I usually run through about 10K reps of each.
If you're looking for performance, always prefer string methods if they meet the requirements:
$string = '123abc'
(measure-command {
for ($i=0;$i -lt 1e5;$i++)
{$string.contains('3ab')}
}).totalmilliseconds
(measure-command {
for ($i=0;$i -lt 1e5;$i++)
{$string -like '*3ab*'}
}).totalmilliseconds
(measure-command {
for ($i=0;$i -lt 1e5;$i++)
{$string -match '3ab'}
}).totalmilliseconds
265.3494
586.424
646.4878
See this explanation from Differences Between -Like and -Match
In a nutshell, if you are thinking, 'I am probably going to need a wildcard to find this item', then start with -Like. However, if you are pretty sure of most of the letters in the word that you are looking for, then you are better off experimenting with -Match.
Here is a more technical distinction: -Match is a regular expression, whereas -Like is just a wildcard comparison, a subset of -Match.
So, whenever you are not sure which character classes (digits, letters, punctuation, etc.) may appear, and you just want to match any character, use -Like with its wildcards.
When you know there must be a digit at the start, followed by one or more sequences of a colon and alphanumeric characters up to the end of the string, use -Match with its powerful regular expressions.
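As a rough illustration of the two cases just described (the sample strings are made up for this sketch, not taken from the quoted article):

```powershell
# Hypothetical sample inputs, for illustration only.
$samples = '1:abc:def', '1:abc', 'x:abc', '1abc'

# -like: "starts with 1, then anything" is about all a wildcard can express here.
$likeHits = $samples -like '1*'
$likeHits     # 1:abc:def, 1:abc, 1abc

# -match: a digit at the start, then one or more ":alphanumerics" groups, up to the end.
$matchHits = $samples -match '^\d(:[a-zA-Z0-9]+)+$'
$matchHits    # 1:abc:def, 1:abc
```

The wildcard cannot rule out 1abc, whereas the regex can, which is the kind of case where -match earns its extra complexity.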
You should prefer -like when your comparator string is a DOS-style filename wildcard. If you have a cmdlet that is designed to look like a "standard" Windows command-line application, then you can expect file name parameters to include DOS-style wildcards.
You might have a grep-like cmdlet that takes a regex and a list of files. I can imagine it being used like this:
> yourMagicGrepper "^Pa(tt).*rn" *.txt file.*
You would use -match when working with the first parameter, and -like for all other parameters.
In other words: it depends on your functional requirements.
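A minimal sketch of such a grep-like command might look as follows. The function name Select-MagicGrep and its parameters are invented for this illustration; they are not an existing cmdlet.

```powershell
# Hypothetical helper: the name and parameters are assumptions made for this sketch.
function Select-MagicGrep {
    param(
        [Parameter(Mandatory)] [string]   $Pattern,       # regex, evaluated with -match
        [Parameter(Mandatory)] [string[]] $FileWildcard,  # DOS-style wildcards, evaluated with -like
        [string] $Directory = '.'
    )
    Get-ChildItem -File -LiteralPath $Directory |
        Where-Object {
            $name = $_.Name
            # Keep the file if its name matches any of the wildcards.
            @($FileWildcard | Where-Object { $name -like $_ }).Count -gt 0
        } |
        ForEach-Object {
            $file = $_
            # Emit only the lines that match the regex, prefixed with the file name.
            Get-Content -LiteralPath $file.FullName |
                Where-Object { $_ -match $Pattern } |
                ForEach-Object { '{0}: {1}' -f $file.Name, $_ }
        }
}

# Usage, mirroring the command line above:
# Select-MagicGrep -Pattern '^Pa(tt).*rn' -FileWildcard '*.txt', 'file.*'
```

The first parameter goes through -match and the file-name parameters go through -like, so each argument is interpreted the way a command-line user would expect.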
Related
I'm learning PowerShell, and one of the tasks I did was to filter the records of a CSV file.
Based on this link: https://4sysops.com/archives/create-sort-and-filter-csv-files-in-powershell/ I tried something similar to:
Import-Csv -Path '.\sample.csv' | Select-Object EmailAddress,UniqueName,LastLoginDate | ? EmailAddress -like *gmail.com -Or ? EmailAddress -like *outlook.com | Export-Csv -Path $fileOut -NoTypeInformation
But the above gives me the error mentioned in the title.
Based on this link: https://www.computerperformance.co.uk/powershell/match/ I addressed the error by using Where-Object instead after the Select-Object line as follows:
Where-Object {$_.EmailAddress -Like "*gmail.com" -Or $_.EmailAddress -Like "*outlook.com"}
Why does the first example give me an error but not the second example?
tl;dr
Both your commands use the Where-Object cmdlet; ? is simply a built-in alias for it.
However, your commands use different syntax forms: your first command uses the simpler and more concise, but feature-limited individual argument-based simplified syntax, whereas your second one uses the verbose, but fully featured script-block syntax - see next section.
Because you need to combine multiple -like operations, you must use script-block syntax - simplified syntax limits you to a single operation.
Regular, script block-based syntax:
Example:
# You're free to add additional expressions inside { ... }
Where-Object { $_.EmailAddress -like '*gmail.com' }
uses a single argument that is a script block ({ ... }), inside of which the condition to test is formulated based on the automatic $_ variable that represents the input object at hand.
This syntax:
Places no constraints on the complexity of the expression - the whole PowerShell language is at your disposal inside a script block.
However, it is somewhat verbose.
Simplified, multi-argument syntax:
Example:
# Equivalent of the above.
# Note the absence of { ... }, $_, and "..."
Where-Object EmailAddress -like *gmail.com
Simplified syntax is an alternative syntax that may be used with Where-Object as well as ForEach-Object, which:
as the name implies, is simpler and less verbose.
but is limited to a single conditional / operation based on a single property, or, in the case of method calls with ForEach-Object, the input object itself.
With simplified syntax the parts that make up a conditional / method call are passed as separate arguments, which therefore bind to distinct parameters that are specifically designed to work with this syntax:
Because separate arguments are used, there is no { ... } enclosure (no script block is used).
$_ need not be referenced, because its use is implied; e.g. EmailAddress is the equivalent of $_.EmailAddress in the script block-syntax.
A notable limitation as of PowerShell 7.2.x is that with Where-Object you cannot operate on the input object itself - you must specify a property. GitHub issue #8357 discusses overcoming this limitation in the future, but there hasn't been any activity in a long time.
As usual in argument-mode parsing, quoting around string values is optional, assuming they don't contain metacharacters such as spaces; e.g., *gmail.com - without "..." or '...' - works with simplified syntax, whereas the expression-mode parsing inside the equivalent script block requires quoting, e.g. '*gmail.com'
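To see the two syntax forms side by side, here is a small sketch that uses in-memory objects instead of Import-Csv. The sample records are made up for illustration.

```powershell
# Made-up sample records; in the question they come from Import-Csv.
$rows = @(
    [pscustomobject]@{ EmailAddress = 'a@gmail.com';   UniqueName = 'a' }
    [pscustomobject]@{ EmailAddress = 'b@outlook.com'; UniqueName = 'b' }
    [pscustomobject]@{ EmailAddress = 'c@example.org'; UniqueName = 'c' }
)

# Simplified syntax: one property, one operator, one operand.
$simple = $rows | Where-Object EmailAddress -like *gmail.com

# Script-block syntax: conditions can be combined freely.
$combined = $rows | Where-Object {
    $_.EmailAddress -like '*gmail.com' -or $_.EmailAddress -like '*outlook.com'
}

$simple.UniqueName      # a
$combined.UniqueName    # a, b
```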
I'm trying to work out if a string exists in an array, even if it's a substring of a value in the array.
I've tried a few methods and just can't get it to work, not sure where I'm going wrong.
I have the below code, you can see that $val2 exists within $val1, but I always get a FALSE when I run it.
$val1 = "folder1\folder2\folder3"
$val2 = "folder1\folder2"
$val3 = "folder9"
$val_array = @()
$val_array += $val1
$val_array += $val3
$null -ne ($val_array | ? { $val2 -match $_ }) # Expected $true, but returns $false
I also tried:
foreach ($item in $val_array) {
if ($item -match $val2) {
Write-Host "yes"
}
}
The -Match operator does a regular expression comparison, in which the backslash character (\) has a special meaning (it escapes the following character).
Instead you might use the -Like operator:
$val_array -Like "*$val2*"
Yields:
folder1\folder2\folder3
iRon's helpful answer offers the best solution to your problem, using wildcard matching via the -like operator.
Note:
The need to escape select characters in a search pattern in order for the pattern to be taken verbatim applies, in principle, to the wildcard-based -like operator too, not just to the regex-based -match operator. However, since wildcard expressions have far fewer metacharacters than regexes - namely just *, ?, and [ - the need for such escaping doesn't often arise in practice. Whereas regexes require \ as the escape character, wildcards use `, and programmatic escaping can be achieved with [WildcardPattern]::Escape()
Unfortunately, as of PowerShell 7.2, there is no dedicated operator for verbatim substring matching:
A workaround for this limitation is to call the [string] .NET type's .Contains() method (on a single input string only), however, this performs case-sensitive matching, whereas PowerShell operators are case-insensitive by default, but offer case-sensitive variants simply by prefixing the operator name with c (e.g., -clike, -cmatch).
In Windows PowerShell, .Contains() is invariably case-sensitive, but in PowerShell (Core) 7+ an additional overload is available that offers case-insensitive matching:
'Foo'.Contains('fo') # -> $false, due to case difference
# PowerShell (Core) 7+ *only*:
'Foo'.Contains('fo', 'InvariantCultureIgnoreCase') # -> $true
Caveat: Despite the name similarity, PowerShell's -contains operator does not perform substring matching; instead, it tests whether a collection contains a given element (in full).
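Two points from the note above, wildcard escaping and the -contains caveat, can be sketched as follows. The sample values are made up for illustration.

```powershell
# A search term that happens to contain wildcard metacharacters (made-up data).
$term  = 'log[1]'
$names = 'app.log[1]', 'app.log1'

# Unescaped, [1] is a character set that matches the single character "1".
$unescaped = $names -like "*$term*"
$unescaped    # app.log1

# Escaped, the brackets are matched literally.
$escaped = $names -like ('*' + [WildcardPattern]::Escape($term) + '*')
$escaped      # app.log[1]

# -contains tests for a whole element, not for a substring.
$names -contains 'app'         # False
$names -contains 'app.log1'    # True
```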
As for what you tried:
Your primary problem is that you've accidentally swapped the -match operator's operands: the search pattern - which is invariably interpreted as a regex (regular expression) - must be on the RHS.
As iRon points out, in order for your search pattern to be taken verbatim (literally), you need to escape regex metacharacters with \, and the robust, programmatic way to do this is with [regex]::Escape().
Therefore, the immediate fix would have been (? is a built-in alias of the Where-Object cmdlet):
# OK, but SLOW.
$val_array | ? { $_ -match [regex]::Escape($val2) }
However, this solution is inefficient (it involves the pipeline and a cmdlet).
Fortunately, PowerShell's comparison operators can be applied to arrays (collections) directly, in which case they act as filters, i.e. they return the sub-array of matching elements - see the docs.
iRon's answer uses this technique with -like, but it equally works with -match, so that your expression can be simplified to the following, much more efficient form:
# MUCH FASTER.
$val_array -match [regex]::Escape($val2)
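Put together with the values from the question, the operator-as-filter form can be tried like this. It is a sketch; the @(...).Count check is just one possible way to turn the filtered result into a Boolean.

```powershell
$val_array = 'folder1\folder2\folder3', 'folder9'
$val2      = 'folder1\folder2'

# Escape() doubles the backslash so that the regex engine treats it literally.
$pattern = [regex]::Escape($val2)
$pattern                      # folder1\\folder2

# Pattern on the RHS: the operator filters the array.
$hits = $val_array -match $pattern
$hits                         # folder1\folder2\folder3

# Wrap in @() and check the count to get a Boolean answer.
@($hits).Count -gt 0          # True
```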
Try the string method Contains:
$null -ne ($val_array | ? { $_.Contains($val2) })
I am very new to PowerShell and I am trying to run some code if a string does not start with a certain character; however, I cannot get this to work with multiple characters.
This is the code that works fine.
if (-Not $recdata.StartsWith("1"))
{
# my code
}
But what i want is multiple checks like this
if (-Not $recdata.StartsWith("1") -Or -Not $recdata.StartsWith("2"))
{
# my code
}
But this does not work; it breaks the whole function even though PowerShell does not throw any errors. I have tried multiple things but I can't find any solution.
MundoPeter has pointed out the logic flaw in your approach - -or should be -and - and Santiago Squarzon has offered an alternative based on the regex-based -match operator.
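The logic flaw can be seen by evaluating both variants against a few made-up inputs (a small sketch, not part of the original answers):

```powershell
$results = foreach ($recdata in '1abc', '2abc', '3abc') {
    # Flawed: no string starts with both "1" and "2", so one side is always true.
    $withOr  = -not $recdata.StartsWith('1') -or  -not $recdata.StartsWith('2')
    # Intended: true only if the string starts with neither "1" nor "2".
    $withAnd = -not $recdata.StartsWith('1') -and -not $recdata.StartsWith('2')
    '{0}: -or gives {1}, -and gives {2}' -f $recdata, $withOr, $withAnd
}
$results
# 1abc: -or gives True, -and gives False
# 2abc: -or gives True, -and gives False
# 3abc: -or gives True, -and gives True
```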
Let me offer the following PowerShell-idiomatic solutions, taking advantage of the fact that PowerShell's operators offer negated variants simply by prepending not to their names:
$recdata[0] -notin '1', '2' # check 1st char of LHS against RHS array
$recdata -notlike '[12]*' # check LHS against wildcard expression
$recdata -notmatch '^[12]' # check LHS against regex
See also:
-in, the is-the-LHS-contained-in-the-RHS-collection operator
-like, the wildcard matching operator
-match, the regular-expression matching operator
EDIT
I think I now know what the issue is - The copy numbers are not REALLY part of the filename. Therefore, when the array pulls it and then is used to get the match info, the file as it is in the array does not exist, only the file name with no copy number.
I tried writing a rename script but the same issue exists... only the few files I manually renamed (so they don't contain copy numbers) were renamed (successfully) by the script. All others are shown not to exist.
How can I get around this? I really do not want to manually work with 23000+ files. I am drawing a blank..
HELP PLEASE
I am trying to narrow down a folder full of emails (copies) with the same name "SCADA Alert.eml", "SCADA Alert[1].eml"...[23110], based on contents. And delete the emails from the folder that meet specific content criteria.
When I run it I keep getting the error in the subject line above. It only sees the first file and the rest it says do not exist...
The script reads through the folder and creates an array of names (it does this correctly).
Then, for each $filename in the array, it creates a variable, $email, and assigns it the content of that file.
(this is where it breaks)
Then it should match the specific string I am looking for against the content of the $email variable and return true or false. If true, I want it to remove the email, $filename, from the folder,
thus narrowing down the emails I have to review.
Any help here would be greatly appreciated.
This is what I have so far... (Folder is in the root of C:)
$array = Get-ChildItem -name -Path $FolderToRead #| Get-Content | Tee C:\Users\baudet\desktop\TargetFile.txt
Foreach ($FileName in $array){
$FileName # Check File
$email = Get-Content $FolderToRead\$FileName
$email # Check Content
$ContainsString = "False" # Set Var
$ContainsString # Verify Var
$ContainsString = %{$email -match "SYS$,ROC"} # Look for String
$ContainsString # Verify result of match
#if ($ContainsString -eq "True") {
#Remove-Item $FolderToRead\$element
#}
}
Here's a PowerShell-idiomatic solution that also resolves your original problems:
Get-ChildItem -File -LiteralPath $FolderToRead | Where-Object {
(Get-Content -Raw -LiteralPath $_.FullName) -match 'SYS\$,ROC'
} | Remove-Item -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
Note how the $ character in the RHS regex of the -match operator is \-escaped in order to use it verbatim (rather than as metacharacter $, the end-of-input anchor).
Also, given that $ is also used in PowerShell's string interpolation, it's better to use '...' strings (single-quoted, verbatim strings) to represent regexes, assuming no actual up-front string expansion is needed before the regex engine sees the resulting string - see this answer for more information.
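The effect of the unescaped $ can be checked against a made-up sample string (the content of $text is an assumption for this sketch):

```powershell
$text = 'Alarm from SYS$,ROC at 10:42'   # made-up sample content

# Unescaped, $ anchors the end of the input, so ",ROC" can never follow it.
$unescaped = $text -match 'SYS$,ROC'
$unescaped      # False

# Escaped by hand.
$escaped = $text -match 'SYS\$,ROC'
$escaped        # True

# Escaped programmatically; Escape() produces SYS\$,ROC.
$viaEscape = $text -match [regex]::Escape('SYS$,ROC')
$viaEscape      # True
```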
As for what you tried:
The error message stemmed from the fact that Get-Content $FolderToRead\$FileName binds the file-name argument, $FolderToRead\$FileName, implicitly (positionally) to Get-Content's -Path parameter, which expects PowerShell wildcard patterns.
Since your file names literally contain [ and ] characters, they are misinterpreted by the (implied) -Path parameter, which can be avoided by using the -LiteralPath parameter instead (which must be specified explicitly, as a named argument).
%{$email -match "SYS$,ROC"} is unnecessarily wrapped in a ForEach-Object call (% is a built-in alias); while that doesn't do any harm in this case, it adds unnecessary overhead;
$email -match "SYS$,ROC" is enough, though it needs to be corrected to
$email -match 'SYS\$,ROC', as explained above.
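The difference between -Path and -LiteralPath for file names that contain [ and ] can be reproduced in a scratch folder. The folder name below is made up; the file name follows the pattern from the question.

```powershell
# Create a scratch folder with a file whose name contains [ and ].
$dir  = Join-Path ([System.IO.Path]::GetTempPath()) ('literalpath-demo-' + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path $dir
$file = Join-Path $dir 'SCADA Alert[1].eml'
[System.IO.File]::WriteAllText($file, 'SYS$,ROC')

# -Path treats [1] as a wildcard set that matches the character "1".
$asPath = Test-Path -Path $file
$asPath       # False

# -LiteralPath takes the name verbatim.
$asLiteral = Test-Path -LiteralPath $file
$asLiteral    # True

$content = Get-Content -Raw -LiteralPath $file
$content      # SYS$,ROC

Remove-Item -LiteralPath $dir -Recurse   # clean up the scratch folder
```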
[System.IO.Directory]::EnumerateFiles($Folder) |
Where-Object {$true -eq [System.IO.File]::ReadAllText($_, [System.Text.Encoding]::UTF8).Contains('SYS$,ROC') } |
ForEach-Object {
Write-Host "Removing $($_)"
#[System.IO.File]::Delete($_)
}
Your mistakes:
%{$email -match "SYS$,ROC"} - What is % intended to do here? It is an alias for ForEach-Object.
%{$email -match "SYS$,ROC"} - Why use -match? This is much slower than -like or String.Contains()
%{$email -match "SYS$,ROC"} - When using $ inside double quotes, you should escape it with a single backtick (I have `$100). Otherwise, whatever follows $ is treated as a variable name or subexpression: Hello, $username; It's $($weather.ToString()) today!
Write debug output the right way: use Write-Debug, Write-Verbose, Write-Host, Write-Warning, Write-Error, Write-Information.
Can be better:
Avoid using Get-ChildItem, because it returns file objects with attributes (such as mtime, atime, ctime, etc.). Retrieving this additional information costs an extra request per file. When you only need a list of files, use the native .NET EnumerateFiles method from System.IO.Directory. This gives a significant performance boost with large numbers of files.
Use ReadAllText, ReadAllLines, or ReadAllBytes from the System.IO.File static class to be more specific, instead of the general-purpose Get-Content.
Use pipelines ;-)
I currently have a series of Dynamic Distribution Groups whose recipient filter I want to edit. Our company is organized by location number, which is a 4-digit number. This number is part of the display name of the dynamic distribution group, for example webcontact_1234_DG, where 1234 is the 4-digit center number. I want to replace the recipient filter so that it has office -eq (1234), but have it pull the number from the display name. All display names have the same number of characters before the 4-digit center number, for example, webcontact_1234_DG, webcontact_2345_DG, webcontact_3456_DG, etc.
I have replace code, but it changed the office location to null.
Here is the code that I am using:
$groups=Get-DynamicDistributionGroup -filter {alias -like "webcontact*"}
foreach ($group in $groups) {
$locationcode=$($group.alias).tostring.replace("\\D", "")
set-dynamicdistributiongroup $group -recipientfilter {((((office -eq $locationcode) -and
(have the recipent filter here but can't display due to confidential information) -and
(RecipientType -eq 'UserMailbox') -and
(-not(RecipientTypeDetails -eq 'RoomMailbox')) -and
(-not(RecipientTypeDetails -eq 'SharedMailbox'))))}
}
This is the best answer I can come up with based on the given information. I assume you are having a problem getting that number out of your webcontact_1234_DG, I would use regex to get those numbers out and into another variable.
$locationcode = [regex]::Match($group.alias,'^[^_]+_([^_]+)_[^_]+$').Groups[1].Value
The above code will grab everything in between the two underscores.
Try that and let me know.
It's easiest to use the -split operator to extract the number (text) from your values[1]:
$locationcode = ($group.Alias -split '_')[1]
-split '_' returns the array of tokens that result when you split the input string by _ chars., and [1] returns the 2nd token, which is the desired location number.
Simple example:
PS> ('webcontact_3456_DG' -split '_')[1]
3456
Alternatively, a corrected version of your own attempt (see below) would use the -replace operator:
PS> 'webcontact_3456_DG' -replace '\D' # remove all non-digit chars.
3456
As for what you tried:
$($group.alias).tostring.replace("\\D", "")
The [string] type's .Replace() searches by literal strings, so the search for \\D in your names will fail, and no replacement will occur.
Additionally, note that PowerShell's escape character is ` (backtick), not \, and that \ therefore doesn't require escaping: in PowerShell "\\D" literally becomes \\D.
As an aside: There is generally no need to put $(...) around expressions.
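The points above can be verified with one of the sample aliases from the question (a short sketch):

```powershell
$alias = 'webcontact_3456_DG'

# .Replace() searches for the literal text \D, which does not occur in the input.
$viaMethod = $alias.Replace('\D', '')
$viaMethod      # webcontact_3456_DG

# -replace treats \D as a regex and removes every non-digit character.
$viaOperator = $alias -replace '\D'
$viaOperator    # 3456

# -split picks the token between the underscores.
$viaSplit = ($alias -split '_')[1]
$viaSplit       # 3456

# \ is not an escape character in PowerShell strings, so "\\D" has three characters.
"\\D".Length    # 3
```

Note that -replace '\D' keeps every digit in the input, so it is only equivalent to the -split approach while the location number is the only run of digits in the alias. The missing parentheses after tostring in the original line are probably also part of the reason for the null result, since the expression then no longer operates on the string itself.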
[1] You could also use the [string] type's .Split() .NET method in this simple case, but I suggest using the far more flexible -split PS operator as a matter of habit.