Powershell escaping quotation marks in a variable that is used in another variable - powershell

I've found plenty of posts explaining how to literally escape both single and double quotation marks using either """" for one double quotation mark, or '''' for a single quotation mark (or just doing `"). I find myself in a situation where I need to search through a list of names that is supplied by a different query:
Foreach($Username in $AllMyUsers.Username){
$Filter = "User = '$Username'"
# do some more code here using $Filter
}
The problem occurs when I reach a username like O'toole or O'brian which contains quotation marks. If the string is literal I could escape it with
O`'toole
O''brian
etc.
But, since it's in a loop I need to escape the quotation mark for each user.
I tried to use [regex]::Escape() but that doesn't escape quotation marks.
I could probably do something like $Username.Replace("'","''") but it feels like there should be a more generic solution than having to manually escape the quotation marks. In other circumstances I might need to escape both single and double, and just tacking on .Replace like so $VariableName.Replace("'","''").Replace('"','""') doesn't feel like it's the most efficient way to code.
Any help is appreciated!
EDIT: This feels exactly like a "how can I avoid SQL injection?" question but for handling strings in Powershell. I guess I'm looking for something like mysqli_real_escape_string but for Powershell.

I could probably do something like $Username.Replace("'","''") but it feels like there should be a more generic solution than having to manually escape the quotation marks
As implied by Mathias R. Jessen's comment, there is no generic solution, given that you're dealing with an embedded '...' string, and the escaping requirements entirely depend on the ultimate target API or utility - which is unlikely to be PowerShell itself (where ' inside a '...' string must be doubled, i.e. represented as '').
In the case at hand, where you're passing the filter string to a System.Data.DataTable's .DataView's .RowFilter property, '' is required as well.
The conceptually cleanest way to handle the required escaping is to use -f, the format operator, combined with a separate string-replacement operation:
$Filter = "User = '{0}'" -f ($UserName -replace "'", "''")
Note how PowerShell's -replace operator - rather than the .NET [string] type's .Replace() method - is used to perform the replacement.
Aside from being more PowerShell-idiomatic (with respect to: syntax, being case-insensitive by default, accepting arrays as input, converting to strings on demand), -replace is regex-based, which also makes performing multiple replacements easier.
To demonstrate with your hypothetical .Replace("'","''").Replace('"','""') example:
@'
I'm "fine".
'@ -replace '[''"]', '$0$0'
Output:
I''m ""fine"".
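Applied to the loop from the question, the escaping could be sketched as follows. The table, its single User column, and the sample names are hypothetical stand-ins for the asker's data; only the -f / -replace line is taken from the answer above.

```powershell
# Hypothetical sample table standing in for the asker's data.
$table = New-Object System.Data.DataTable
[void]$table.Columns.Add('User', [string])
foreach ($name in "O'Toole", "O'Brian", 'Smith') {
    $row = $table.NewRow()
    $row['User'] = $name
    $table.Rows.Add($row)
}

$view = $table.DefaultView
$matchCounts = foreach ($userName in "O'Toole", "O'Brian", 'Smith') {
    # Double every embedded ' before placing the name inside '...'.
    $filter = "User = '{0}'" -f ($userName -replace "'", "''")
    $view.RowFilter = $filter
    $view.Count   # number of rows matching this filter
}
$matchCounts
```

Each name should match exactly one row; without the -replace step, assigning the filter for O'Toole would be expected to fail with a syntax error from the filter parser.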

Related

Why does backtick in Set-Content not create multiple lines?

Neither of these commands creates multi-line text:
Set-Content .\test.md 'Hello`r`nWorld'
Set-Content .\test.md 'Hello\r\nWorld'
Only this one does:
Set-Content .\test.md @("Hello`nWorld")
Do you know why that is?
Escape sequences such as `r`n only work inside "...", i.e, expandable (interpolating) strings.
By contrast, '...' strings are verbatim strings that do not interpret their contents - even ` characters are treated verbatim (literally).
Only ` (the so-called backtick) serves as the escape character in PowerShell, not \.
That is, in both "..." and '...' strings a \ is a literal.
(However, \ is the escape character in the context of regexes (regular expressions), but it is then the .NET regex engine that interprets them, not PowerShell; e.g.,
"`r" -match '\r' is $true: the (interpolated) literal CR char. matched its escaped regex representation).
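The difference between the three string forms can be checked by comparing lengths. This is a minimal sketch; the variable names are illustrative only.

```powershell
$verbatim     = 'Hello`r`nWorld'   # '...': backticks stay literal
$backslashes  = "Hello\r\nWorld"   # "...": \ is not an escape character
$interpolated = "Hello`r`nWorld"   # "...": `r`n become CR and LF

$verbatim.Length       # 14
$backslashes.Length    # 14
$interpolated.Length   # 12

# Only the interpolated string actually contains a line break.
$lineCount = ($interpolated -split "`r`n").Count
$lineCount             # 2
```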
As for what you tried:
It is the fact that "Hello`nWorld" in your last command is a "..." string that made it work.
By contrast, enclosing the string in @(...), the array-subexpression operator, is incidental to the solution. (Set-Content's (positionally implied) -Value parameter is array-valued anyway (System.Object[]), so even a single string getting passed is coerced to an array).
Finally, note that Set-Content by default adds a trailing, platform-native newline to the output file; use -NoNewLine to suppress that, but note that doing so also places no newline between the (string representations of) multiple input objects, if applicable (in your case there's only one).
Therefore (note the -NoNewLine and the trailing `n):
Set-Content -NoNewLine .\test.md "Hello`nWorld`n"
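The effect of -NoNewLine can be verified by reading the file back in full. The sketch below assumes PowerShell 5 or later and uses a hypothetical scratch file in the temp directory.

```powershell
# Hypothetical scratch file in the temp directory.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'set-content-demo.md'

# Single string with explicit newlines; nothing is appended.
Set-Content -NoNewline -LiteralPath $path -Value "Hello`nWorld`n"
$single = Get-Content -Raw -LiteralPath $path

# With several input objects, -NoNewline also omits the separators.
Set-Content -NoNewline -LiteralPath $path -Value 'Hello', 'World'
$joined = Get-Content -Raw -LiteralPath $path

Remove-Item -LiteralPath $path

$single -ceq "Hello`nWorld`n"   # True
$joined -ceq 'HelloWorld'       # True
```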
Optional reading: design rationale for PowerShell's behavior:
Why doesn't PowerShell use the backslash (\) as the escape character, like other languages?
Because PowerShell must (also) function on Windows (it started out as Windows-only), use of \ as the escape character - as known from Unix (POSIX-compatible) shells such as Bash - is not an option, given that \ is used as the path separator on Windows.
If \ were the escape character, you'd have to use Get-ChildItem C:\\Windows\\System32 instead of Get-ChildItem C:\Windows\System32, which is obviously impractical in a shell, where dealing with file-system paths is very common.
Thus, a different character had to be chosen, which turned out to be `, the so-called backtick: At least on US keyboards, it is easy to type (just like \), and it has the benefit of occurring rarely (as itself) in real-world strings, so that the need to escape it rarely arises.
Note that the much older legacy shell on Windows, cmd.exe, too had to pick a different character: it chose ^, the so-called caret.
Why doesn't it use single quote and double quote interchangeably, like other languages?
Different languages made different design choices, but in making "..." strings interpolating and '...' strings verbatim, PowerShell followed an existing model, namely that of POSIX-compatible shells such as Bash.
As an improvement on the latter, PowerShell also supports embedding a verbatim ' inside '...', escaped as '' (e.g., '6'' tall').
Given PowerShell's commitment to backward compatibility, this behavior won't change, especially given how fundamental it is to the language.
Conceptually speaking, you could argue that the aspect of what quoting character a string uses should be separate from whether it is interpolating, so that you'd be free to situationally choose one or the other quoting style for syntactic convenience, while separately controlling whether interpolation should occur.
Thus, hypothetically, PowerShell could have used a separate sigil to make a string interpolating, say $"..." and $'...' (similar to what C# now offers, though it notably only has one string-quoting style).
(As an aside: Bash and Ksh do have this syntax form, but it serves a different purpose (localization of strings) and is rarely used in practice).
In practice, however, once you know how "..." and '...' work in PowerShell, it isn't hard to make them work as intended.
See this answer for a juxtaposition of PowerShell, cmd.exe, and POSIX-compatible shells with respect to fundamental features.

How does PowerShell decide which mode to use when parsing?

I can't quite understand how PowerShell parses commands and need your help.
I read the following explanation by Microsoft's about_parsing documentation:
When processing a command, the PowerShell parser operates in expression mode or in argument mode:
In expression mode, character string values must be contained in quotation marks. Numbers not enclosed in quotation marks are treated as numerical values (rather than as a series of characters).
In argument mode, each value is treated as an expandable string unless it begins with one of the following special characters: dollar sign ($), at sign (@), single quotation mark ('), double quotation mark ("), or an opening parenthesis (().
If preceded by one of these characters, the value is treated as a value expression.
I can understand when parsing a command, PowerShell uses either expression mode or argument mode, but I can't quite understand the following examples.
$a = 2+2
Write-Output $a #4(int), expression mode
Write-Output $a/H #4/H(str), argument mode
I suspect that PowerShell expands the variable first and then decides which mode to use when parsing. Is that right?
If so, I have another question, about data types.
It seems reasonable to me that the former command produces an integer, but the latter one puzzles me: why can the integer 4 be placed next to the string /H?
I tried this example and it worked. It seems that variables are converted to strings when expanded, whatever their data type. Is that right?
$b = 100
Add-Content C:\Users\Owner\Desktop\$b\test.txt 'test'
I appreciate your help.
Edited to clarify the point after receiving a comment
A comment pointed out that both Write-Output examples use argument mode, so can the examples be interpreted like this?
Write-Output "$a"
Write-Output "$a/H"
I'm terribly sorry for the ambiguous question, but I want to know:
In argument mode, are the double quotation marks simply omitted?
The Write-Output examples are quoted from the Microsoft documentation I linked, which says the first example produces an integer. Is that wrong?
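The behavior asked about can be observed directly by inspecting the type of what Write-Output emits. This is only a sketch for experimenting; the result variable names are made up for illustration.

```powershell
$a = 2 + 2

# The argument is a complete variable reference: the integer passes through.
$plain = Write-Output $a
# $a followed by more characters behaves like the string "$a/H".
$mixed = Write-Output $a/H
# A bare token such as 2+2 is an expandable string, not arithmetic.
$bare = Write-Output 2+2
# Parentheses force the contents to be evaluated as an expression.
$grouped = Write-Output (2+2)

$plain.GetType().Name     # Int32
$mixed.GetType().Name     # String
$mixed                    # 4/H
$bare                     # 2+2
$grouped                  # 4
```

In other words, a lone $a keeps its type, whereas $a/H is handled as if it had been written "$a/H", which is why the integer ends up as part of a string.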

CSV specification - double quotes at the start and end of fields

Question (because I can't work it out), should ""hello world"" be a valid field value in a CSV file according to the specification?
i.e should:
1,""hello world"",9.5
be a valid CSV record?
(If so, then the Perl CSV-XS parser I'm using is mildly broken, but if not, then $line =~ s/\342\200\234/""/g; is a really bad idea ;) )
The weird thing is that this code has been running without issue for years, but we've only just hit a record that both started with a left double quote and contained no comma (the above is from a CSV pre-parser).
The canonical format definition of CSV is https://www.rfc-editor.org/rfc/rfc4180.txt. It says:
Each field may or may not be enclosed in double quotes (however
some programs, such as Microsoft Excel, do not use double quotes
at all). If fields are not enclosed with double quotes, then
double quotes may not appear inside the fields. For example:
"aaa","bbb","ccc" CRLF
zzz,yyy,xxx
Fields containing line breaks (CRLF), double quotes, and commas
should be enclosed in double-quotes. For example:
"aaa","b CRLF
bb","ccc" CRLF
zzz,yyy,xxx
If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
The last rule means your line should have been:
1,"""hello world""",9.5
But not all parsers/generators follow this standard perfectly, so you might need for interoperability reasons to relax some rules. It all depends on how much you control the CSV format writing and CSV format parsing parts.
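The doubling rule can be checked with any parser that follows RFC 4180 for quoted fields. The sketch below uses PowerShell's ConvertFrom-Csv purely as a convenient stand-in (the question itself concerns Perl's Text::CSV_XS), and the header names are hypothetical.

```powershell
# The record as it should look under the RFC 4180 quoting rule.
$line = '1,"""hello world""",9.5'

# Hypothetical column names, supplied because the record has no header row.
$record = $line | ConvertFrom-Csv -Header 'Id', 'Text', 'Score'
$record.Text    # "hello world"

# Writing the record back out doubles the embedded quotes again.
$roundTrip = ($record | ConvertTo-Csv -NoTypeInformation)[-1]
$roundTrip
```

The parsed field keeps its surrounding double quotes as data, which is what the original ""hello world"" input was presumably meant to express.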
That depends on the escape character you use. If your escape character is '"' (double quote) then your line should look like
1,"""hello world""",9.5
If your escape character is '\' (backslash) then your line should look like
1,"\"hello world\"",9.5
Check your parser/environment defaults or explicitly configure your parser with the escape character you need e.g. to use backslash do:
my $csv = Text::CSV_XS->new ({ quote_char => '"', escape_char => "\\" });

Powershell - Replacing substrings with wildcards

I am writing a function in powershell, and part of it needs to replace occurrences of substrings with a wildcard. Strings will look something like this:
"something-#{reference}-somethingElse-#{anotherReference}-01"
I want it to end up looking like this:
"something-*-somethingElse-*-01"
The problem I have here is that I don't know what "#{something}" will be, just that there will be multiple substrings enclosed inside a hashtag followed by curly braces. I've tried the Replace method like so:
$newString = $originalString.Replace('#{*}', '*')
I was hoping that would replace everything from the hashtag to the ending curly brace, but it doesn't work like that. I'm trying to avoid cumbersome code that is based on finding the indices of '#' and '}' and then replacing, and hoping there is a simpler and more elegant solution.
Your replace has at least one problem, possibly two;
the method $string.Replace() is from the .Net framework string class - it's PowerShell, but it's exactly what you'd get in C#, minimal PowerShell script-y convenience added on top - and it's for literal text replacements - it doesn't support wildcards or regular expressions.
The 'wildcard' support in PowerShell is quite limited, to the -like operator only, as far as I know. That can't do text replacing, and it's a convenience for people who don't know regular expression; behind the scenes it converts to a regular expression anyway. So the dream of a a*b replace won't work either.
As @PetSerAl comments, regular expressions and the PowerShell -replace operator are the PowerShell way to do every string pattern replace quickly and without .indexOf().
Their pattern #{[^}]*} expands to:
#{} on the outside, as literal characters
[^}] as a character class saying "not a } character, but anything else"
[^}]* - as many non-} characters as there are.
So, match the hash and open brace, then everything that isn't a closing brace (to avoid overrunning past the closing brace), then the closing brace. Replace it all with a literal *.
Implicitly, do that search/replace as many times as possible in the input string.
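As a minimal sketch, the pattern described above can be applied to the sample string from the question. The braces are escaped with \ here only for readability; the variable $literalAttempt is added for comparison and is not part of the original code.

```powershell
$originalString = 'something-#{reference}-somethingElse-#{anotherReference}-01'

# .Replace() is literal: the text '#{*}' never occurs, so nothing changes.
$literalAttempt = $originalString.Replace('#{*}', '*')

# -replace is regex-based and replaces every match in the input.
$newString = $originalString -replace '#\{[^}]*\}', '*'

$literalAttempt -eq $originalString   # True
$newString                            # something-*-somethingElse-*-01
```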

escaping single quote sign in PowerShell

I have a replace statement in my code whereby Band's is being replaced by Cig's. However, when I use single quotes, the string ends at the apostrophe, so only the first part is taken, for example 'Band'.
I tried to use double quotes but that did not work. Do you know how to escape the single quote sign?
-replace 'Band's', 'Cig's'
See Escape characters, Delimiters and Quotes and Get-Help about_Quoting_Rules from the built-in help (as pointed out by Nacimota).
To include a ' inside a single-quoted string, simply double it up as ''. (Single-quote literals don't support any of the other escape characters.)
> "Band's Toothpaste" -replace 'Band''s', 'Cig''s'
Or, simply switch to double-quotes. (Double-quote literals are required when wishing to use interpolation or escape characters.)
> "Band's Toothpaste" -replace "Band's", "Cig's"
(Don't forget that -replace uses a regular expression)
Escape a single quote with two single quotes:
"Band's Toothpaste" -replace 'Band''s', 'Cig''s'
Also, this is a duplicate of
Can I use a single quote in a Powershell 'string'?
For trivial cases, you can use embedded escape characters. For more complex cases, you can use here-strings.
$Find = [regex]::Escape(@'
Band's
'@)
$Replace = @'
Cig's
'@
"Band's Toothpaste" -replace $Find,$Replace
Then put the literal text you want to search for and replace in the here-strings.
Normal quoting rules don't apply within the here-string @' - '@ delimiters, so you can put whatever kind of quotes you want, wherever you want them without needing any escape characters.
The [regex]::Escape() on $Find will take care of doing the backslash escapes on any regex reserved characters that might be in the search pattern.
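To see why the escaping step matters, consider a search text that contains regex metacharacters. The sample text below is hypothetical and extends the Band's example with parentheses.

```powershell
# Hypothetical sample text containing regex metacharacters.
$text = "Band's (Mint) Toothpaste"
$find = "Band's (Mint)"

# Unescaped, ( and ) form a regex group, so the literal parentheses never match.
$unescaped = $text -replace $find, "Cig's (Mint)"
# Escaped, the search text is matched character for character.
$escaped = $text -replace [regex]::Escape($find), "Cig's (Mint)"

$unescaped   # Band's (Mint) Toothpaste
$escaped     # Cig's (Mint) Toothpaste
```

Note that [regex]::Escape() only protects the search pattern; a replacement string containing $ would need separate handling, since $ has special meaning there.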