Extract all capitalized words from a document using PowerShell

Using PowerShell, I want to extract all capitalized words from a document. Everything works until the last line of code, as far as I can tell. Is something wrong with my regex, or is my approach all wrong?
#Extract content of Microsoft Word Document to text
$word = New-Object -comobject Word.Application
$word.Visible = $True
$doc = $word.Documents.Open("D:\Deleteme\test.docx")
$sel = $word.Selection
$paras = $doc.Paragraphs
$path = "D:\deleteme\words.txt"
foreach ($para in $paras)
{
$para.Range.Text | Out-File -FilePath $path -Append
}
#Find all capitalized words :( Everything works except this. I want to extract all Capitalized words
$capwords = Get-Content $path | Select-string -pattern "/\b[A-Z]+\b/g"

PowerShell uses strings to store regexes and has no syntax for regex literals such as /.../ - nor for post-positional matching options such as g.
PowerShell is case-insensitive by default and requires opt-in for case-sensitivity (-CaseSensitive in the case of Select-String).
Without that, [A-Z] is effectively the same as [A-Za-z] and therefore matches both upper- and lowercase (English) letters.
The equivalent of the g option is Select-String's -AllMatches switch, which looks for all matches on each input line (by default, it only looks for the first).
What Select-String outputs aren't strings, i.e. not the matching lines directly, but wrapper objects of type [Microsoft.PowerShell.Commands.MatchInfo] with metadata about each match.
Instances of that type have a .Matches property that contains an array of [System.Text.RegularExpressions.Match] instances, whose .Value property contains the text of each match (whereas the .Line property contains the matching line in full).
To put it all together:
$capwords = Get-Content -Raw $path |
Select-String -CaseSensitive -AllMatches -Pattern '\b[A-Z]+\b' |
ForEach-Object { $_.Matches.Value }
Note the use of -Raw with Get-Content, which greatly speeds up processing, because the entire file content is read as a single, multi-line string - essentially, Select-String then sees the entire content as a single "line". This optimization is possible, because you're not interested in line-by-line processing and only care about what the regex captured, across all lines.
As an aside:
$_.Matches.Value takes advantage of PowerShell's member-access enumeration, which you can similarly leverage to avoid having to loop over the paragraphs in $paras explicitly:
# Use member-access enumeration on collection $paras to get the .Range
# property values of all collection elements and access their .Text
# property value.
$paras.Range.Text | Out-File -FilePath $path
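As a quick, generic illustration of member-access enumeration (the folder path here is just an arbitrary example, not part of the original script):
# .Name is resolved against *each element* of the collection that Get-ChildItem returns,
# yielding an array of all the names in a single expression.
(Get-ChildItem C:\Windows).Name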
.NET API alternative:
The [regex]::Matches() .NET method allows for a more concise - and better-performing - alternative:
$capwords = [regex]::Matches((Get-Content -Raw $path), '\b[A-Z]+\b').Value
Note that, in contrast with PowerShell, the .NET regex APIs are case-sensitive by default, so no opt-in is required.
.Value again utilizes member-access enumeration in order to extract the matching text from all returned match-information objects.
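If you ever did want case-insensitive matching with the .NET API, you could opt in via a RegexOptions argument - shown here only to illustrate the case-sensitive default; it is not needed for this task, and the sample strings are made up:
[regex]::Matches('Foo BAR', '\b[A-Z]+\b').Value                # -> BAR (case-sensitive default)
[regex]::Matches('Foo BAR', '\b[A-Z]+\b', 'IgnoreCase').Value  # -> Foo, BAR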

I modified your script and was able to get all the upper-case words in my test doc.
$word = New-Object -comobject Word.Application
$word.Visible = $True
$doc = $word.Documents.Open("D:\WordTest\test.docx")
$sel = $word.Selection
$paras = $doc.Paragraphs
$path = "D:\WordTest\words.txt"
foreach ($para in $paras)
{
$para.Range.Text | Out-File -FilePath $path -Append
}
# Get all words in the content
$AllWords = (Get-Content $path)
# Split all words into an array
$WordArray = ($AllWords).split(' ')
# Create array for capitalized words to capture them during ForEach loop
$CapWords = @()
# ForEach loop for each word in the array
foreach($SingleWord in $WordArray){
# Perform a check to see if the word is fully capitalized
$Check = $SingleWord -cmatch '\b[A-Z]+\b'
# If check is true, remove special characters and put it into the $CapWords array
if($Check -eq $True){
$SingleWord = $SingleWord -replace '[\W]', ''
$CapWords += $SingleWord
}
}
I had it come out as an array of capitalized words, but you could always join it back if you wanted it to be a string:
$CapString = $CapWords -join " "

PowerShell: Replace string in all .txt files within directory

I am trying to replace every instance of a string within a directory. However my code is not replacing anything.
What I have so far:
Test Folder contains multiple files and folders containing content that I need to change.
The folders contain .txt documents, and the .txt documents contain strings like this: Content reference="../../../PartOfPath/EN/EndofPath/Caution.txt" that I need to change into this: Content reference="../../../PartOfPath/FR/EndofPath/Caution.txt"
Before this question comes up, yes it has to be done this way, as there are other similar strings that I don't want to edit. So I cannot just replace all instances of EN with FR.
$DirectoryPath = "C:\TestFolder"
$Parts = @(
@{PartOne="/PartOfPath";PartTwo="EndofPath/Caution.txt"},
@{PartOne="/OtherPartOfPath";PartTwo="EndofPath/Note.txt"},
@{PartOne="/ThirdPartOfPath";PartTwo="OtherEndofPath/Warning.txt"}) | % { New-Object object | Add-Member -NotePropertyMembers $_ -PassThru }
Get-ChildItem $DirectoryPath | ForEach {
foreach($n in $Parts){
[string]$PartOne = $n.PartOne
[string]$PartTwo = $n.PartTwo
$ReplaceThis = "$PartOne/EN/$PartTwo"
$WithThis = "$PartOne/FR/$PartTwo"
(Get-Content $_) | ForEach {$_ -Replace $ReplaceThis, $WithThis} | Set-Content $_
}
}
The code will run and overwrite files, however no edits will have been made.
While troubleshooting I came across this potential cause:
This test worked:
$FilePath = "C:\TestFolder\Test.txt"
$ReplaceThis ="/PartOfPath/EN/Notes/Note.txt"
$WithThis = "/PartOfPath/FR/Notes/Note.txt"
(Get-Content -Path $FilePath) -replace $ReplaceThis, $WithThis | Set-Content $FilePath
But this test did not
$FilePath = "C:\TestFolder\Test.txt"
foreach($n in $Parts){
[string]$PartOne = $n.PartOne
[string]$PartTwo = $n.PartTwo
[string]$ReplaceThis = "$PartOne/EN/$PartTwo"
[string]$WithThis = "$PartOne/FR/$PartTwo"
(Get-Content -Path $FilePath) -replace $ReplaceThis, $WithThis | Set-Content $FilePath
}
If you can help me understand what is wrong here I would greatly appreciate it.
Thanks to @TessellatingHeckler's comments I revised my code and found this solution:
$DirectoryPath = "C:\TestFolder"
$Parts = @(
@{PartOne="/PartOfPath";PartTwo="EndofPath/Caution.txt"},
@{PartOne="/OtherPartOfPath";PartTwo="EndofPath/Note.txt"},
@{PartOne="/ThirdPartOfPath";PartTwo="OtherEndofPath/Warning.txt"}) | % { New-Object object | Add-Member -NotePropertyMembers $_ -PassThru }
Get-ChildItem $DirectoryPath -Filter "*.txt" -Recurse | ForEach {
foreach($n in $Parts){
[string]$PartOne = $n.PartOne
[string]$PartTwo = $n.PartTwo
$ReplaceThis = "$PartOne/EN/$PartTwo"
$WithThis = "$PartOne/FR/$PartTwo"
(Get-Content $_) | ForEach {$_.Replace($ReplaceThis, $WithThis)} | Set-Content $_
}
}
There were two problems:
-replace was not working as I intended, so I had to use .Replace() instead
The original Get-ChildItem was not returning any values and had to be replaced with the above version.
PowerShell's -replace operator is regex-based and case-insensitive by default:
To perform literal replacements, \-escape metacharacters in the pattern or call [regex]::Escape().
By contrast, the [string] type's .Replace() method performs literal replacement and is case-sensitive, invariably in Windows PowerShell, by default in PowerShell (Core) 7+ (see this answer for more information).
Therefore:
As TessellatingHeckler points out, given that your search strings seem to contain no regex metacharacters (such as . or \) that would require escaping, there is no obvious reason why your original approach didn't work.
Given that you're looking for literal substring replacements, the [string] type's .Replace() is generally the simpler and faster option if case-SENSITIVITY is desired / acceptable (invariably so in Windows PowerShell; as noted, in PowerShell (Core) 7+, you have the option of making .Replace() case-insensitive too).
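As a quick, hypothetical illustration of that difference (the sample strings are made up):
# -replace is regex-based and case-INsensitive by default:
'Path/EN/file.txt' -replace 'en', 'FR'      # -> Path/FR/file.txt  (lowercase pattern still matches 'EN')
# .Replace() is literal and case-sensitive (invariably so in Windows PowerShell):
'Path/EN/file.txt'.Replace('en', 'FR')      # -> Path/EN/file.txt  (no change)
'Path/EN/file.txt'.Replace('EN', 'FR')      # -> Path/FR/file.txt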
However, since you need to perform multiple replacements, a more concise, single-pass -replace solution is possible (though whether it actually performs better would have to be tested; if you need case-sensitivity, use -creplace in lieu of -replace):
$oldLang = 'EN'
$newLang = 'FR'
$regex = @(
"(?<prefix>/PartOfPath/)$oldLang(?<suffix>/EndofPath/Caution.txt)",
"(?<prefix>/OtherPartOfPath/)$oldLang(?<suffix>/EndofPath/Note.txt)",
"(?<prefix>/ThirdPartOfPath/)$oldLang(?<suffix>/OtherEndofPath/Warning.txt)"
) -join '|'
Get-ChildItem C:\TestFolder -Filter *.txt -Recurse | ForEach-Object {
($_ | Get-Content -Raw) -replace $regex, "`${prefix}$newLang`${suffix}" |
Set-Content -LiteralPath $_.FullName
}
See this regex101.com page for an explanation of the regex and the ability to experiment with it.
The expression used as the replacement operand, "`${prefix}$newLang`${suffix}", mixes PowerShell's up-front string interpolation ($newLang, which could also be written as ${newLang}) with placeholders that refer to the named capture groups (e.g. (?<prefix>...)) in the regex. These placeholders only coincidentally use the same notation as PowerShell variables (though enclosing the name in {...} is required here, and their $ characters must be `-escaped to prevent PowerShell's string interpolation from interpreting them); see this answer for background information, and the small standalone illustration after this list.
Note the use of -Raw with Get-Content, which reads a text file as a whole into memory, as a single, multi-line string. Given that you don't need line-by-line processing in this case, this greatly speeds up the processing of a given file.
As a general tip: you may need to use the -Encoding parameter with Set-Content to ensure the desired character encoding, given that PowerShell never preserves a file's original encoding when reading it. By default, you'll get ANSI-encoded files in Windows PowerShell, and BOM-less UTF-8 files in PowerShell (Core) 7+.
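To make the replacement-operand syntax discussed above more concrete, here is a minimal, standalone sketch using a made-up input string:
$newLang = 'FR'
'/PartOfPath/EN/EndofPath/Caution.txt' -replace '(?<prefix>/PartOfPath/)EN(?<suffix>/.+)', "`${prefix}$newLang`${suffix}"
# -> /PartOfPath/FR/EndofPath/Caution.txt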

Sort-Object -Unique

I'm making a script that collects all the subkeys from a specific location and converts the REG_BINARY keys to text, but for some reason I can't remove the duplicate results or sort them alphabetically.
PS: Unfortunately I need the solution to be executable from the command line.
Code:
$List = ForEach ($i In (Get-ChildItem -Path 'HKCU:SOFTWARE\000' -Recurse)) {$i.Property | ForEach-Object {([System.Text.Encoding]::Unicode.GetString($i.GetValue($_)))} | Select-String -Pattern ':'}; ForEach ($i In [char[]]'ABCDEFGHIJKLMNOPQRSTUVWXYZ') {$List = $($List -Replace("$i`:", "`n$i`:")).Trim()}; $List | Sort-Object -Unique
Test.reg:
Windows Registry Editor Version 5.00
[HKEY_CURRENT_USER\SOFTWARE\000\Test1]
"HistorySZ1"="Test1"
"HistoryBIN1"=hex:43,00,3a,00,5c,00,54,00,65,00,73,00,74,00,5c,00,44,00,2e,00,\
7a,00,69,00,70,00,5c,00,00,00,43,00,3a,00,5c,00,54,00,65,00,73,00,74,00,5c,\
00,43,00,2e,00,7a,00,69,00,70,00,5c,00,00,00,43,00,3a,00,5c,00,54,00,65,00,\
73,00,74,00,5c,00,42,00,2e,00,7a,00,69,00,70,00,5c,00,00,00,43,00,3a,00,5c,\
00,54,00,65,00,73,00,74,00,5c,00,41,00,2e,00,7a,00,69,00,70,00,5c,00,00,00
[HKEY_CURRENT_USER\SOFTWARE\000\Test2]
"HistorySZ2"="Test2"
"HistoryBIN2"=hex:4f,00,3a,00,5c,00,54,00,65,00,73,00,74,00,5c,00,44,00,2e,00,\
7a,00,69,00,70,00,5c,00,00,00,43,00,3a,00,5c,00,54,00,65,00,73,00,74,00,5c,\
00,43,00,2e,00,7a,00,69,00,70,00,5c,00,00,00,44,00,3a,00,5c,00,54,00,65,00,\
73,00,74,00,5c,00,42,00,2e,00,7a,00,69,00,70,00,5c,00,00,00,41,00,3a,00,5c,\
00,54,00,65,00,73,00,74,00,5c,00,41,00,2e,00,7a,00,69,00,70,00,5c,00,00,00
The path strings that are encoded in your array of bytes are separated with NUL characters (code point 0x0).
Therefore, you need to split your string by this character into an array of individual paths, on which you can then perform operations such as Sort-Object:
You can represent a NUL character as "`0" in an expandable PowerShell string, or - inside a regex to pass to the -split operator - \0:
# Convert the byte array stored in the registry to a string.
$text = [System.Text.Encoding]::Unicode.GetString($i.GetValue($_))
# Split the string into an *array* of strings by NUL.
# Note: -ne '' filters out empty elements (the one at the end, in your case).
$list = $text -split '\0' -ne ''
# Sort the list.
$list | Sort-Object -Unique
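Putting the pieces together with your original loop might look something like the following sketch; it assumes the same HKCU:\SOFTWARE\000 location from your question and adds a type check so that only REG_BINARY values are decoded:
$List = foreach ($i in (Get-ChildItem -Path 'HKCU:\SOFTWARE\000' -Recurse)) {
    foreach ($name in $i.Property) {
        $value = $i.GetValue($name)
        # Only REG_BINARY values arrive as byte arrays; skip the REG_SZ ones.
        if ($value -is [byte[]]) {
            # Decode as UTF-16, split on embedded NULs, drop empty elements,
            # and keep only entries containing ':' (your original filter).
            [System.Text.Encoding]::Unicode.GetString($value) -split '\0' -ne '' -match ':'
        }
    }
}
$List | Sort-Object -Unique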
After many attempts I discovered that it is necessary to use the Split command to make the lines break and thus be able to organize the result.
{$List = ($List -Replace("$i`:", "`n$i`:")) -Split("`n")}

Finding and changing a string inside a text file using Powershell

I am trying to change the domains of emails inside a text file, for example "john@me.com to john@gmail.com". The emails are stored in an array and I am currently using a for loop with the replace method, but I cannot get it to work. Here is the code that I have so far.
$folders = @('Folder1','Folder2','Folder3','Folder4','Folder5')
$names = @('John','Mary','Luis','Gary', 'Gil')
$emails = @("John@domain.com", "Mary@domain.com", "Luis@domain.com", "Gary@domain.com", "Gil@domain.com")
$emails2 = @("John@company.com", "Mary@company.com", "Luis@company.com", "Gary@company.com", "Gil@company.com")
$content = "C:\Users\john\Desktop\contact.txt"
#create 10 new local users
foreach ($user in $users){ New-LocalUser -Name $user -Description "This is a test account." -NoPassword }
#Create 5 folders on desktop
$folders.foreach({New-Item -Path "C:\Users\John\Desktop\$_" -ItemType directory})
#create 5 folders on in documents
$folders.foreach({New-Item -Path "C:\users\john\Documents\$_" -ItemType directory})
#create contact.tct
New-Item -Path "C:\Users\John\Desktop" -Name "contact.txt"
#add 5 names to file
ForEach($name in $names){$name | Add-Content -Path "C:\Users\John\Desktop\contact.txt"}
#add 5 emails to file
ForEach($email in $emails){$email | Add-Content -Path "C:\Users\John\Desktop\contact.txt"}
#change emails to @company.com
for($i = 0; $i -lt 5; $i++){$emails -replace "$emails[$i]", $emails2[$i]}
In your particular example, you want to replace one string with another in each of your array elements. You can do that without looping:
$emails = $emails -replace '@domain\.com$','@company.com'
Since -replace uses regex matching, the . metacharacter must be escaped to be matched literally. In your case it probably does not matter since . matches any character, but for completeness, you should escape it.
Using the .NET Regex class method Escape(), you can programmatically escape metacharacters.
$emails -replace [regex]::Escape('@domain.com'),'@company.com'
With your code, in order to update $emails, you need to interpolate your array strings properly and update your variable on each loop iteration:
for($i = 0; $i -lt 5; $i++) {
$emails = $emails -replace $emails[$i], $emails2[$i]
}
$emails # displays the updates
If $emails contains other regex metacharacters besides just the single ., it could be another reason why you are having matching issues. It would then just be easiest to escape the metacharacters:
for($i = 0; $i -lt 5; $i++) {
$emails = $emails -replace [regex]::Escape($emails[$i]), $emails2[$i]
}
$emails # displays the updates
Explanation:
When double quotes are parsed (if not inside a verbatim string), the parser will do string expansion. When this happens to variable references that include operators, only the variables are expanded and the rest of the quoted expression, including the operator characters, is treated as a verbatim string. You can see this with a trivial example:
$str = 'my string 1','my string 2'
"$str[0]"
Output:
my string 1 my string 2[0]
To get around this behavior, you either need to not use quotes around the expression or use the sub-expression operator $():
$str[0]
"$($str[0])"
Note that a quoted array reference will convert the array into a string. Each element of that array will be separated based on the $OFS value (single space by default) of your environment.
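For example, continuing the hypothetical $str array from above:
$str = 'my string 1','my string 2'
"$str"                # -> my string 1 my string 2   ($OFS defaults to a single space)
$OFS = ';'
"$str"                # -> my string 1;my string 2
Remove-Variable OFS   # restore the default separator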

Using regex in a key/value lookup table in powershell?

I am creating the below script to search through and replace data in a set of files. The problem I'm running into is I need to ONLY match if it's the beginning of the line, and I'm not sure how/where would I use regex in the below example (e.g. ^A, ^B) when doing the comparison? I tried putting the caret in front of the name values in the table, but that didn't work...
$lookupTable = @{
'A'='1';
'B'='2'
#etc
}
Get-ChildItem 'c:\windows\system32\dns' -Filter *.dns |
Foreach-Object {
$file = $_
Write-Host "$file"
(Get-Content -Path $file -Raw) | ForEach-Object {
$line = $_
$lookupTable.GetEnumerator() | ForEach-Object {
$line = $line -replace $_.Name, $_.Value
}
$line
} | Set-Content -Path $file
}
The -replace operator accepts regex. Just use $line = $line -replace "^$($_.Name)", $_.Value.
The way that regex works makes getting a proper "start of line" marker into the regex pattern along with the $VarName a tad iffy, so I broke it out into its own line and used the -f string format operator to build the regex pattern.
Then I used the way that -replace works on an array of strings - the kind one usually gets from Get-Content - to work on the whole array in each pass.
Note that the strings have lowercase items where they ought to be replaced, and uppercase items where the item should NOT be replaced. [grin]
$LookUpTable = @{
A = 'Wizbang Shadooby'
Z = '666 is the number of the beast'
}
$LineList = @(
'a sdfq A er Z xcv'
'qwertyuiop A'
'z xcvbnm'
'z A xcvbnm'
'qwertyuiop Z'
)
$LookUpTable.GetEnumerator() |
ForEach-Object {
$Target = '^{0}' -f $_.Name
$LineList = $LineList -replace $Target, $_.Value
}
$LineList
output ...
Wizbang Shadooby sdfq A er Z xcv
qwertyuiop A
666 is the number of the beast xcvbnm
666 is the number of the beast A xcvbnm
qwertyuiop Z
# Here is a complete, working script that beginners can read.
# This thread
# Using regex in a key/value lookup table in powershell?
# https://stackoverflow.com/questions/57277282/using-regex-in-a-key-value-lookup-table-in-powershell
# User-modifiable variables.
# substitutions
# We need to specify what we're looking for (keys).
# We need to specify our substitutions (values).
# Example: Looking for A and substituting 1 in its place.
# Add as many pairs as you like.
# Here I use an array of objects instead of a Hashtable so that I can specify upper- and lowercase matches.
# Use the regular expression caret (^) to match the beginning of a line.
$substitutions = @(
[PSCustomObject]@{ Key = '^A'; Value = '1' },
[PSCustomObject]@{ Key = '^B'; Value = '2' },
[PSCustomObject]@{ Key = '^Sit'; Value = '[Replaced Text]' }, # Example for my Latin placeholder text.
[PSCustomObject]@{ Key = 'nihil'; Value = '[replaced text 2]' }, # Lowercase example.
[PSCustomObject]@{ Key = 'Nihil'; Value = '[Replaced Text 3]' } # Omit comma for the last array item.
)
# Folder where we are looking for files.
$inputFolder = 'C:\Users\Michael\PowerShell\Using regex in a key value lookup table in powershell\input'
# Here I've created some sample files using Latin placeholder text from
# https://lipsum.com/
# Folder where we are saving the modified files.
# This can be the same as the input folder.
# I'm creating this so we can test without corrupting the original files.
$outputFolder = 'C:\Users\Michael\PowerShell\Using regex in a key value lookup table in powershell\output'
#$outputFolder = $inputFolder
# We are only interested in files ending with .dns
$filterString = '*.dns'
# Here is an example for text files.
#$filterString = '*.txt'
# For all files.
#$filterString = '*.*'
# More info.
# https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.management/get-childitem?view=powershell-6#parameters
# Search on the page for -Filter
# You won't need to update any variables after this line.
# ===================================================================
# Generate a list of files to look at.
$fileList = Get-ChildItem $inputFolder -Filter $filterString
# Simple example.
# get-content .\apple.dns | % { $_ -replace "sit", "michael" } | set-content "C:\output\apple.dns"
# input file substitutions output
# Set up loops.
# For each file.
#{
# For each key-value pair.
#}
# "For each key-value pair."
# Create a function.
# Pipe in a string.
# Specify a list of substitutions.
# Make the substitutions.
# Output a modified string.
filter find_and_replace ([object[]] $substitutions)
{
# The automatic variable $_ will be a line from the file.
# This comes from the pipeline.
# Copy the input string.
# This avoids modifying a pipeline object.
$myString = $_
# Look at each key-value pair passed to the function.
# In practice, these are the ones we defined at the top of the script.
foreach ($pair in $substitutions)
{
# Modify the strings.
# Update the string after each search.
# case-sensitive -creplace instead of -replace
$myString = $myString -creplace $pair.Key, $pair.Value
}
# Output the final, modified string.
$myString
}
# "For each file."
# main
# Do something with each file.
foreach ($file in $fileList)
{
# Where are we saving the output?
$outputFile = Join-Path -Path $outputFolder -ChildPath $file.Name
# Create a pipeline.
# Pipe strings to our function.
# Let the function modify the strings.
# Save the output to the output folder.
# This mirrors our simple example but with dynamic files and substitutions.
# find_and_replace receives strings from the pipeline and we pass $substitutions into it.
Get-Content $file | find_and_replace $substitutions | Set-Content $outputFile
# The problem with piping files into a pipeline is that
# by the time the pipeline gets to Set-Content,
# we only have modified strings
# and we have no information to create the path for an output file.
# ex [System.IO.FileInfo[]] | [String[]] | [String] | Set-Content ?
#
# Instead, we're in a loop that preserves context.
# And we have the opportunity to create and use the variable $outputFile
# ex foreach ($file in [System.IO.FileInfo[]])
# ex $outputFile = ... $file ...
# ex [String[]] | [String] | Set-Content $outputFile
# Quote
# (Get-Content -Path $file -Raw)
# By omitting -Raw, we get: one string for each line.
# This is instead of getting: one string for the whole file.
# This keeps us from having to use
# the .NET regular expression multiline option (and the subexpression \r?$)
# while matching.
#
# What it is.
# Multiline Mode
# https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-options#Multiline
#
# How you would get started.
# Miscellaneous Constructs in Regular Expressions
# https://learn.microsoft.com/en-us/dotnet/standard/base-types/miscellaneous-constructs-in-regular-expressions
}

Remove comment blocks in Powershell

I'm using the following bit of code to process a SQL Script and split it up using the GO command:
[string]$batchDelimiter = "[gG][oO]"
$scriptContent = Get-Content $sqlScript | Out-String
$batches = $scriptContent -split "\s*$batchDelimiter\s*\r?\n"
foreach($batch in $batches)
{
if(![string]::IsNullOrEmpty($batch.Trim()))
{
$SqlCmd.CommandText = $batch
$reader = $SqlCmd.ExecuteNonQuery()
}
}
The problem I have is when a GO command appears in the middle of a comment block:
/*
IF OBJECT_ID('AmyTempMapRetroDateFK') IS NOT NULL
DROP FUNCTION AmyTempMapRetroDateFK
GO
*/
Is there a way of removing all of the comment blocks before processing the script? I've seen a few examples in c# but nothing for Powershell.
Assuming that there are no nested comments (PSv3+ syntax):
(Get-Content -Raw $sqlScript) -split '(?s)/\*.*?\*/' -split '\r?\ngo\r?\n' -notmatch '^\s*$' |
ForEach-Object { $SqlCmd.CommandText = $_.Trim(); $reader = $SqlCmd.ExecuteNonQuery() }
Note: If there's a chance that the final line doesn't end in a line break,
use '\r?\ngo(\r?\n|$)' instead of
'\r?\ngo\r?\n'
Get-Content -Raw, available since PSv3, reads the entire file into a single string - it is the simpler and more efficient equivalent of Get-Content $sqlScript | Out-String
-split '(?s)/\*.*?\*/' splits the input string by /* ... */ spans; note the inline option, (?s), which is required to make . match newlines too; non-greedy quantifier .*? is needed to only match up to the next */ instance; the result is an array of line blocks with the comment blocks excluded.
-split '\r?\ngo\r?\n' then further splits that array by the word go preceded and followed by a newline.
Note that -split is case-insensitive by default, so you needn't worry about case variations such as GO. (You could use the alias -isplit to make the case-insensitive behavior more explicit; similarly, -csplit can be used for case-sensitive matching - see the short illustration after this list.)
-notmatch '^\s*$' filters out blank / empty elements from the resulting array, and sends the filtered array through the pipeline (|).
The ForEach-Object cmdlet then operates on each array element - now containing an individual SQL command - via automatic variable $_, which always represents the input object at hand.
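A quick, hypothetical illustration of -split's case handling (the input string is made up):
'a GO b go c' -split '\s*go\s*'    # -> a, b, c     (both GO and go act as separators)
'a GO b go c' -csplit '\s*go\s*'   # -> a GO b, c   (only the lowercase go acts as a separator)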
A simplified version of the solution marked as the best answer, adapted here to remove PS comment blocks:
(get-content .\myscript.ps1 -raw) -replace "(?s)<#.+?#>",'' > myscript_Clean.ps1
Not sure why GO statements should interfere here. Applied to SQL comment blocks, this should do it:
(get-content .\myscript.sql -raw) -replace "(?s)/\*.+?\*/",'' > myscript_Clean.sql
Perhaps you could split on /* and join the resulting array.
Then split that on */ and join the resulting array.
The joins would be easier to read with a `r`n (carriage return, newline) delimiter.