Issue with PowerShell update string in each file within a folder

I have a set of SQL files stored in a folder. These files contain the schema name in the format dbo_xxxxxx, where xxxxxx is the year & month, e.g. dbo_202001, dbo_202002 etc. I want the PowerShell script to replace the xxxxxx number with a new one in each of these SQL files. I'm using the below script to achieve that. However, the issue is that it seems to partially match on the old string (instead of matching on the full string) and puts the new string in place, e.g. instead of replacing [dbo_202001] with [dbo_201902], anywhere it finds d, b, o etc. it replaces it with [dbo_201902]. Is there any way to fix this?
$sourceDir = "C:\SQL_Scripts"
$SQLScripts = Get-ChildItem $sourceDir *.sql -rec
foreach ($file in $SQLScripts)
{
(Get-Content $file.PSPath) |
Foreach-Object { $_ -replace "[dbo_202001]", "[dbo_201902]" } |
Set-Content $file.PSPath -NoNewline
}

vonPryz and marsze have provided the crucial pointers: since the -replace operator operates on regexes (regular expressions), you must \-escape special characters such as [ and ] in order to treat them verbatim (as literals):
$_ -replace '\[dbo_202001\]', '[dbo_201902]'
While use of the -replace operator is generally preferable, the [string] type's .Replace() method directly offers verbatim (literal) string replacements and is therefore also faster than -replace.
Typically, this won't matter, but in situations similar to yours, where many iterations are involved, it may (note that the replacement is case-sensitive):
$_.Replace('[dbo_202001]', '[dbo_201902]')
See the bottom section for guidance on when to use -replace vs. .Replace().
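To see why the unescaped pattern misbehaves, here is a minimal sketch (the sample string is made up for illustration). Inside a regex, [dbo_202001] is a character class that matches any single one of the characters d, b, o, _, 2, 0 and 1, and -replace matches case-insensitively:

```powershell
# Made-up sample line; not taken from the original SQL files.
$line = 'FROM [dbo_202001].t'

# Unescaped: [...] is a character class, so each individual character from
# the set d, b, o, _, 2, 0, 1 (case-insensitively) is replaced on its own.
$broken = $line -replace '[dbo_202001]', 'X'
$broken   # FRXM [XXXXXXXXXX].t

# Escaped: \[ and \] are literal brackets, so only the full token matches.
$fixed = $line -replace '\[dbo_202001\]', '[dbo_201902]'
$fixed    # FROM [dbo_201902].t
```

Note that even the O in FROM is replaced in the unescaped case, which is the "anywhere it finds d, b, o" symptom from the question.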
The performance of your code can be greatly improved:
$sourceDir = 'C:\SQL_Scripts'
foreach ($file in Get-ChildItem -File $sourceDir -Filter *.sql -Recurse)
{
# CAVEAT: This overwrites the files in-place.
Set-Content -NoNewLine $file.PSPath -Value `
(Get-Content -Raw $file.PSPath).Replace('[dbo_202001]', '[dbo_201902]')
}
Since you're reading the whole file into memory anyway, using Get-Content's -Raw switch to read it as a single, multi-line string (rather than an array of lines) on which you can perform a single .Replace() operation is much faster.
Set-Content's -NoNewLine switch is needed to prevent an additional newline from getting appended on writing back to the file.
Note the use of the -Value parameter rather than the pipeline to provide the file content. Since there's only a single string object to write here, it makes little difference, but in general, with many objects to write that are already collected in memory, Set-Content ... -Value $array is much faster than $array | Set-Content ....
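If you'd rather not rewrite files that contain no match at all, a variant along the following lines may help. Treat it as a sketch under assumptions: the scratch folder and the two sample files are hypothetical stand-ins for your C:\SQL_Scripts folder, and -NoNewline requires PSv5+.

```powershell
# Hypothetical scratch folder, so the sketch doesn't touch real scripts.
$sourceDir = Join-Path ([IO.Path]::GetTempPath()) ('sql_demo_' + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path $sourceDir
Set-Content -NoNewline -LiteralPath (Join-Path $sourceDir 'a.sql') -Value 'SELECT 1 FROM [dbo_202001].t'
Set-Content -NoNewline -LiteralPath (Join-Path $sourceDir 'b.sql') -Value 'SELECT 2 FROM [dbo_201912].t'

# Collect the names of the files that were actually rewritten.
$changed = foreach ($file in Get-ChildItem -File -LiteralPath $sourceDir -Filter *.sql -Recurse) {
  $old = Get-Content -Raw -LiteralPath $file.FullName
  # Skip empty files: there is nothing to replace in them.
  if ([string]::IsNullOrEmpty($old)) { continue }
  $new = $old.Replace('[dbo_202001]', '[dbo_201902]')
  # Only write back if the content really changed (case-sensitive comparison).
  if ($new -cne $old) {
    Set-Content -NoNewline -LiteralPath $file.FullName -Value $new
    $file.Name
  }
}
$changed   # a.sql

# Read back the result, then clean up the scratch folder.
$after = Get-Content -Raw -LiteralPath (Join-Path $sourceDir 'a.sql')
Remove-Item -Recurse -Force -LiteralPath $sourceDir
```

Skipping unchanged files also leaves their last-write timestamps alone, which can matter if other tooling looks at them.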
Guidance on use of the -replace operator vs. the .Replace() method:
-replace is the PowerShell-specific regex (regular-expression)-based string replacement operator,
whereas .Replace() is a method of the .NET [string] type (System.String), which performs verbatim (literal) string replacements.
Note that both features invariably replace all matches they find, and, conversely, return the original string if none are found.
Generally, PowerShell's -replace operator is a more natural fit in PowerShell code - both syntactically and due to its case-insensitivity - and offers more functionality, thanks to being regex-based.
The .Replace() method is limited to verbatim replacements and in Windows PowerShell to case-sensitive ones, but has the advantage of being faster and not having to worry about escaping special characters in its arguments:
Only use the [string] type's .Replace() method:
for invariably verbatim string replacements
with the following case-sensitivity:
PowerShell [Core] v6+: case-sensitive by default, optionally case-insensitive via an additional argument; e.g.:
'FOO'.Replace('o', '#', 'InvariantCultureIgnoreCase')
Windows PowerShell: invariably(!) case-sensitive
if functionally feasible, when performance matters
Otherwise, use PowerShell's -replace operator (covered in more detail here):
for regex-based replacements:
enables sophisticated, pattern-based matching and dynamic construction of replacement strings
To escape metacharacters (characters with special syntactic meaning) in order to treat them verbatim:
in the pattern (regex) argument: \-escape them (e.g., \. or \[)
in the replacement argument: only $ is special, escape it as $$.
To escape an entire operand in order to treat its value verbatim (to effectively perform literal replacement):
in the pattern argument: call [regex]::Escape($pattern)
in the replacement argument: call $replacement.Replace('$', '$$')
with the following case-sensitivity:
case-insensitive by default
optionally case-sensitive via its c-prefixed variant, -creplace
Note: -replace is a PowerShell-friendly wrapper around the [regex]::Replace() method that doesn't expose all of the latter's functionality, notably not its ability to limit the number of replacements; see this answer for how to use it.
Note that -replace can directly operate on arrays (collections) of strings as the LHS, in which case the replacement is performed on each element, which is stringified on demand.
Thanks to PowerShell's fundamental member-access enumeration feature, .Replace() too can operate on arrays, but only if all elements are already strings. Also, unlike -replace, which always also returns an array if the LHS is one, member-access enumeration returns a single string if the input object happens to be a single-element array.
As an aside: similar considerations apply to the use of PowerShell's -split operator vs. the [string] type's .Split() method - see this answer.
Examples:
-replace - see this answer for syntax details:
# Case-insensitive replacement.
# Pattern operand (regex) happens to be a verbatim string.
PS> 'foo' -replace 'O', '#'
f##
# Case-sensitive replacement, with -creplace
PS> 'fOo' -creplace 'O', '#'
f#o
# Regex-based replacement with verbatim replacement:
# metacharacter '$' constrains the matching to the *end*
PS> 'foo' -replace 'o$', '#'
fo#
# Regex-based replacement with dynamic replacement:
# '$&' refers to what the regex matched
PS> 'foo' -replace 'o$', '>>$&<<'
fo>>o<<
# PowerShell [Core] only:
# Dynamic replacement based on a script block.
PS> 'A1' -replace '\d', { [int] $_.Value + 1 }
A2
# Array operation, with elements stringified on demand:
PS> 1..3 -replace '^', '0'
01
02
03
# Escape a regex metachar. to be treated verbatim.
PS> 'You owe me $20' -replace '\$20', '20 dollars'
You owe me 20 dollars
# Ditto, via a variable and [regex]::Escape()
PS> $var = '$20'; 'You owe me $20' -replace [regex]::Escape($var), '20 dollars'
You owe me 20 dollars
# Escape a '$' in the replacement operand so that it is always treated verbatim:
PS> 'You owe me 20 dollars' -replace '20 dollars', '$$20'
You owe me $20
# Ditto, via a variable and .Replace('$', '$$')
PS> $var = '$20'; 'You owe me 20 dollars' -replace '20 dollars', $var.Replace('$', '$$')
You owe me $20
.Replace():
# Verbatim, case-sensitive replacement.
PS> 'foo'.Replace('o', '#')
f##
# No effect, because matching is case-sensitive.
PS> 'foo'.Replace('O', '#')
foo
# PowerShell [Core] v6+ only: opt-in to case-INsensitivity:
PS> 'FOO'.Replace('o', '#', 'InvariantCultureIgnoreCase')
F##
# Operation on an array, thanks to member-access enumeration:
# Returns a 2-element array in this case.
PS> ('foo', 'goo').Replace('o', '#')
f##
g##
# !! Fails, because not all array elements are *strings*:
PS> ('foo', 42).Replace('o', '#')
... Method invocation failed because [System.Int32] does not contain a method named 'Replace'. ...
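Two points from the guidance above are not covered by the examples, so here is a short sketch of each: limiting the number of replacements by calling the .NET [regex] type directly, and the difference in return type with a single-element array. The sample strings are arbitrary.

```powershell
# [regex] instance method Replace(input, replacement, count):
# replace only the first match.
# Note: unlike -replace, .NET regex matching is case-sensitive by default.
$firstOnly = ([regex] 'o').Replace('foo', '#', 1)
$firstOnly   # f#o

# -replace with a single-element array as the LHS still returns an array ...
$viaOperator = @('foo') -replace 'o', '#'
$viaOperator -is [array]   # True

# ... whereas member-access enumeration unwraps a single-element result.
$viaMethod = @('foo').Replace('o', '#')
$viaMethod -is [string]    # True
```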

Related

Check if a string exists in an array even as a substring in PowerShell

I'm trying to work out if a string exists in an array, even if it's a substring of a value in the array.
I've tried a few methods and just can't get it to work, not sure where I'm going wrong.
I have the below code, you can see that $val2 exists within $val1, but I always get a FALSE when I run it.
$val1 = "folder1\folder2\folder3"
$val2 = "folder1\folder2"
$val3 = "folder9"
$val_array = @()
$val_array += $val1
$val_array += $val3
$null -ne ($val_array | ? { $val2 -match $_ }) # Returns $true
I also tried:
foreach ($item in $val_array) {
if ($item -match $val2) {
Write-Host "yes"
}
}
The -Match operator does a regular expression comparison, in which the backslash character (\) has a special meaning (it escapes the following character).
Instead you might use the -Like operator:
$val_array -Like "*$val2*"
Yields:
folder1\folder2\folder3
iRon's helpful answer offers the best solution to your problem, using wildcard matching via the -like operator.
Note:
The need to escape select characters in a search pattern in order for the pattern to be taken verbatim in principle also applies to the wildcard-based -like operator, not just to the regex-based -match operator. However, since wildcard expressions have far fewer metacharacters than regexes (namely just *, ?, and [), the need for such escaping doesn't often arise in practice. Whereas regexes require \ as the escape character, wildcards use `, and programmatic escaping can be achieved with [WildcardPattern]::Escape().
Unfortunately, as of PowerShell 7.2, there is no dedicated operator for verbatim substring matching:
A workaround for this limitation is to call the [string] .NET type's .Contains() method (on a single input string only). However, this performs case-sensitive matching, whereas PowerShell operators are case-insensitive by default, but offer case-sensitive variants simply by prefixing the operator name with c (e.g., -clike, -cmatch).
In Windows PowerShell, .Contains() is invariably case-sensitive, but in PowerShell (Core) 7+ an additional overload is available that offers case-insensitive matching:
'Foo'.Contains('fo') # -> $false, due to case difference
# PowerShell (Core) 7+ *only*:
'Foo'.Contains('fo', 'InvariantCultureIgnoreCase') # -> $true
Caveat: Despite the name similarity, PowerShell's -contains operator does not perform substring matching; instead, it tests whether a collection contains a given element (in full).
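If you need verbatim and case-insensitive substring matching in both PowerShell editions, one option is the [string] type's .IndexOf() method with a StringComparison argument, which is also available in the .NET Framework. A minimal sketch, reusing the question's values (with the case of $val2 changed on purpose):

```powershell
$val_array = 'folder1\folder2\folder3', 'folder9'
$val2 = 'FOLDER1\folder2'   # deliberately different case

# Verbatim, case-insensitive substring test that works in both editions:
# IndexOf() returns -1 if the substring isn't found.
$hits = @(foreach ($item in $val_array) {
  if ($item.IndexOf($val2, [StringComparison]::OrdinalIgnoreCase) -ge 0) { $item }
})
$hits   # folder1\folder2\folder3

# By contrast, -contains tests whole elements, not substrings.
$val_array -contains 'folder9'   # True
$val_array -contains 'folder'    # False
```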
As for what you tried:
Your primary problem is that you've accidentally swapped the -match operator's operands: the search pattern - which is invariably interpreted as a regex (regular expression) - must be on the RHS.
As iRon points out, in order for your search pattern to be taken verbatim (literally), you need to escape regex metacharacters with \, and the robust, programmatic way to do this is with [regex]::Escape().
Therefore, the immediate fix would have been (? is a built-in alias of the Where-Object cmdlet):
# OK, but SLOW.
$val_array | ? { $_ -match [regex]::Escape($val2) }
However, this solution is inefficient (it involves the pipeline and a cmdlet).
Fortunately, PowerShell's comparison operators can be applied to arrays (collections) directly, in which case they act as filters, i.e. they return the sub-array of matching elements - see the docs.
iRon's answer uses this technique with -like, but it equally works with -match, so that your expression can be simplified to the following, much more efficient form:
# MUCH FASTER.
$val_array -match [regex]::Escape($val2)
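Putting the pieces together with the question's values, the following sketch shows the unescaped pattern failing and the two array-filter forms succeeding. The @(...) wrappers are only there so that .Count is reliable in the demonstration.

```powershell
$val_array = 'folder1\folder2\folder3', 'folder9'
$val2 = 'folder1\folder2'

# Unescaped: in a regex, \f is the escape sequence for a form feed,
# so the pattern doesn't match the literal backslash + 'f'.
$unescaped = @($val_array -match $val2)
$unescaped.Count   # 0

# Escaped regex, applied to the array as a filter.
$viaMatch = @($val_array -match [regex]::Escape($val2))
$viaMatch          # folder1\folder2\folder3

# Wildcard equivalent; escaping guards against *, ? and [ in the value.
$viaLike = @($val_array -like ('*' + [WildcardPattern]::Escape($val2) + '*'))
$viaLike           # folder1\folder2\folder3

# Boolean "does any element contain it?" test.
$found = $viaMatch.Count -gt 0
$found             # True
```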
Try the string method Contains:
$null -ne ($val_array | ? { $_.Contains($val2) })

Replacing text that includes backslashes in a txt file with Powershell v1 [duplicate]

Literal Find and replace exact match. Ignore regex [duplicate]

I'm writing a powershell program to replace strings using
-replace "$in", "$out"
It doesn't work for strings containing a backslash. How can I escape it?
The -replace operator uses regular expressions, which treat backslash as a special character. You can use double backslash to get a literal single backslash.
In your case, since you're using variables, I assume that you won't know the contents at design time. In this case, you should run it through [RegEx]::Escape():
-replace [RegEx]::Escape($in), "$out"
That method escapes any characters that are special to regex with whatever is needed to make them a literal match (other special characters include ., $, ^, (), [], and more).
You'll need to either escape the backslash in the pattern with another backslash or use the .Replace() method instead of the -replace operator (but be advised they may perform differently):
PS C:\> 'asdf' -replace 'as', 'b'
bdf
PS C:\> 'a\sdf' -replace 'a\s', 'b'
a\sdf
PS C:\> 'a\sdf' -replace 'a\\s', 'b'
bdf
PS C:\> 'a\sdf' -replace ('a\s' -replace '\\','\\'), 'b'
bdf
Note that only the search pattern string needs to be escaped. The code -replace '\\','\\' says, "replace the escaped pattern string '\\', which is a single backslash, with the unescaped literal string '\\' which is two backslashes."
So, you should be able to use:
-replace ("$in" -replace '\\','\\'), "$out"
[Note: briantist's solution is better.]
However, if your pattern has consecutive backslashes, you'll need to test it.
Or, you can use the .Replace() string method, but as I said above, it may not perfectly match the behavior of the -replace operator:
PS C:\> 'a\sdf'.replace('a\\s', 'b')
a\sdf
PS C:\> 'a\sdf'.replace( 'a\s', 'b')
bdf
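When neither variable's content is known in advance, it may be worth escaping the replacement operand too, since $ is special there. The following sketch uses made-up values for $in and $out to show the difference:

```powershell
# Made-up values for illustration.
$in = 'C:\temp'
$out = 'D:\$0'
$text = 'dir C:\temp'

# Only the pattern escaped: $0 in the replacement refers to the whole match.
$naive = $text -replace [regex]::Escape($in), $out
$naive     # dir D:\C:\temp

# Both operands escaped: a fully literal replacement.
$literal = $text -replace [regex]::Escape($in), $out.Replace('$', '$$')
$literal   # dir D:\$0

# The .Replace() method needs no escaping at all (but is case-sensitive).
$viaMethod = $text.Replace($in, $out)
$viaMethod # dir D:\$0
```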

How do I strip part of a file name?

Suppose I have a file database_partial.xml.
I am trying to strip "_partial" as well as the extension (xml) from the file name and then capitalize the name so that it becomes DATABASE.
Param($xmlfile)
$xml = Get-ChildItem "C:\Files" -Filter "$xmlfile"
$db = [IO.Path]::GetFileNameWithoutExtension($xml).ToUpper()
That returns DATABASE_PARTIAL, but I don't know how to strip the _PARTIAL part.
You don't need GetFileNameWithoutExtension() for removing the extension. The FileInfo objects returned by Get-ChildItem have a property BaseName that gives you the filename without extension. Uppercase that, then remove the "_PARTIAL" suffix. I would also recommend processing the output of Get-ChildItem in a loop, just in case it doesn't return exactly one result.
Get-ChildItem "C:\Files" -Filter "$xmlfile" | ForEach-Object {
$_.BaseName.ToUpper().Replace('_PARTIAL', '')
}
If the substring after the underscore can vary, use a regular expression replacement instead of a string replacement, e.g. like this:
Get-ChildItem "C:\Files" -Filter "$xmlfile" | ForEach-Object {
$_.BaseName.ToUpper() -replace '_[^_]*$'
}
Ansgar Wiechers's helpful answer provides an effective solution.
To focus on the more general question of how to strip (remove) part of a file name (string):
Use PowerShell's -replace operator, whose syntax is: <stringOrStrings> -replace <regex>, <replacement>:
<regex> is a regex (regular expression) that matches the part to replace,
<replacement> is the replacement operand (the string to replace what the regex matched).
In order to effectively remove what the regex matched, specify '' (the empty string) or simply omit the operand altogether - in either case, the matched part is effectively removed from the input string.
For more information about -replace, see this answer.
Applied to your case:
$db = 'DATABASE_PARTIAL' # sample input value
PS> $db -replace '_PARTIAL$', '' # removes suffix '_PARTIAL' from the end ($)
DATABASE
PS> $db -replace '_PARTIAL$' # ditto, with '' implied as the replacement string.
DATABASE
Note:
-replace is case-insensitive by default, as are all PowerShell operators. To explicitly perform case-sensitive matching, use the -creplace variant.
By contrast, the [string] type's .Replace() method (e.g., $db.Replace('_PARTIAL', '')):
matches by string literals only, and therefore offers less flexibility; in this case, you couldn't stipulate that _PARTIAL should only be matched at the end of the string, for instance.
is invariably case-sensitive in the .NET Framework (though .NET Core offers a case-insensitive overload).
Building on Ansgar's answer, your script can therefore be streamlined as follows:
Param($xmlfile)
$db = ((Get-ChildItem C:\Files -Filter $xmlfile).BaseName -replace '_PARTIAL$').ToUpper()
Note that in PSv3+ this works even if $xmlfile should match multiple files: due to member-access enumeration and the ability of -replace to accept an array of strings as input, the desired substring removal would be performed on the base names of all files, as would the subsequent uppercasing. $db would then receive an array of stripped base names.
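To try the array behavior without touching the file system, here is a sketch that uses a hard-coded list of hypothetical base names in place of (Get-ChildItem ...).BaseName:

```powershell
# Hypothetical base names standing in for (Get-ChildItem ...).BaseName
$baseNames = 'database_partial', 'orders_partial', 'partial_view'

# -replace processes each element; '$' anchors the match to the end,
# so 'partial_view' is left alone. .ToUpper() is then applied to
# each element via member-access enumeration.
$db = ($baseNames -replace '_PARTIAL$').ToUpper()
$db
# DATABASE
# ORDERS
# PARTIAL_VIEW
```

The match succeeds despite the lowercase input because -replace is case-insensitive.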

how to sort a txt file in specific order in Powershell

I have this first text for example
today is sunny in the LA
and the temperature is 21C
today is cloudy in the NY
and the temperature is 18C
today is sunny in the DC
and the temperature is 25C
and this is the order I want:
18C
25C
21C
I want to change the first file to be the same order as the second one but without deleting anything:
today is cloudy in the NY
and the temperature is 18C
today is sunny in the DC
and the temperature is 25C
today is sunny in the LA
and the temperature is 21C
Note: The PSv3+ solution below answers a different question: it sorts the paragraphs numerically by the temperature values contained in them, not in an externally prescribed order.
As such, it may still be of interest, given the question's generic title.
For an answer to the question as asked, see my other post.
Here's a concise solution, but note that it requires reading the input file into memory as a whole (in any event, Sort-Object collects its input objects all in memory as well, since it does not use temporary files to ease potential memory pressure):
((Get-Content -Raw file.txt) -split '\r?\n\r?\n' -replace '\r?\n$' |
Sort-Object { [int] ($_ -replace '(?s).+ (\d+)C$', '$1') }) -join
[Environment]::NewLine * 2
(Get-Content -Raw file.txt) reads the input file into memory as a whole, as a single, multi-line string.
-split '\r?\n\r?\n' breaks the multi-line string into an array of paragraphs (blocks of lines separated by an empty line), and -replace '\r?\n$' removes a trailing newline, if any, from the paragraph at the very end of the file.
Regex \r?\n matches both Windows-style CRLF and Unix-style LF-only newlines.
Sort-Object { [int] ($_ -replace '(?s).+ (\d+)C$', '$1') }) numerically sorts the paragraphs by the temperature number at the end of each paragraph (e.g. 18).
$_ represents the input paragraph at hand.
-replace '...', '...' performs string replacement based on a regex, which in this case extracts the temperature number string from the end of the paragraph.
See Get-Help about_Regular_Expressions for information about regexes (regular expressions) and Get-Help about_Comparison_Operators for information about the -replace operator.
Cast [int] converts the number string to an integer for proper numerical sorting.
-join [Environment]::NewLine * 2 reassembles the sorted paragraphs into a single multi-line string, with paragraphs separated by an empty line.
[Environment]::NewLine is the platform-appropriate newline sequence; you can alternatively hard-code newlines as "`r`n" (CRLF) or "`n" (LF).
You can send the output to a new file by appending something like
... | Set-Content sortedFile.txt (which makes the file "ANSI"-encoded in Windows PowerShell, and UTF-8-encoded in PowerShell Core by default; use -Encoding as needed).
Since the entire input file is read into memory up front, it is possible to write the results directly back to the input file (... | Set-Content file.txt), but doing so bears the slight risk of data loss, namely if writing is interrupted before completion.
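To experiment with the command without a file, the following sketch applies the same steps to an in-memory string. It assumes, as the solution above does, that the paragraphs are separated by an empty line; the here-string stands in for (Get-Content -Raw file.txt).

```powershell
# In-memory stand-in for (Get-Content -Raw file.txt)
$text = @'
today is sunny in the LA
and the temperature is 21C

today is cloudy in the NY
and the temperature is 18C

today is sunny in the DC
and the temperature is 25C
'@

# Split into paragraphs and strip a trailing newline, if any.
$paragraphs = $text -split '\r?\n\r?\n' -replace '\r?\n$'

# Sort numerically by the temperature at the end of each paragraph.
$sorted = $paragraphs | Sort-Object { [int] ($_ -replace '(?s).+ (\d+)C$', '$1') }

# For verification only: extract the temperature tokens in sorted order.
$temps = $sorted -replace '(?s).+ (\d+C)$', '$1'
$temps -join ', '   # 18C, 21C, 25C

# Reassemble into a single multi-line string.
$result = $sorted -join ([Environment]::NewLine * 2)
```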
Nas' helpful answer works, but it is an O(m*n) operation; that is, with m paragraphs to output in prescribed order and n input paragraphs, m * n operations are required; if all input paragraphs are to be output (in the prescribed order), i.e., if m equals n, the effort is quadratic.
The following PSv4+ solution will scale better, as it only requires linear rather than quadratic effort:
# The tokens prescribing the sort order, which may come from
# another file read with Get-Content, for instance.
$tokensToSortBy = '18C', '25C', '21C'
# Create a hashtable that indexes the input file's paragraphs by the sort
# token embedded in each.
# The hashtable must be initialized before entries can be assigned to it.
$htParagraphsBySortToken = @{}
((Get-Content -Raw file.txt) -split '\r?\n\r?\n' -replace '\r?\n$').ForEach({
$htParagraphsBySortToken[$_ -replace '(?s).* (\d+C)$(?:\r?\n)?', '$1'] = $_
})
# Loop over the tokens prescribing the sort order, and retrieve the
# corresponding paragraph, then reassemble the paragraphs into a single,
# multi-line string with -join
$tokensToSortBy.ForEach({ $htParagraphsBySortToken[$_] }) -join [Environment]::NewLine * 2
(Get-Content -Raw file.txt) reads the input file into memory as a whole, as a single, multi-line string.
-split '\r?\n\r?\n' breaks the multi-line string into an array of paragraphs (blocks of lines separated by an empty line), and -replace '\r?\n$' removes a trailing newline, if any, from the paragraph at the very end of the file.
Regex \r?\n matches both Windows-style CRLF and Unix-style LF-only newlines.
$_ -replace '(?s).* (\d+C)$(?:\r?\n)?', '$1' extracts the sort token (e.g., 25C) from each paragraph, which becomes the hashtable's key.
$_ represents the input paragraph at hand.
-replace '...', '...' performs string replacement based on a regex.
See Get-Help about_Regular_Expressions for information about regexes (regular expressions) and Get-Help about_Comparison_Operators for information about the -replace operator.
-join [Environment]::NewLine * 2 reassembles the sorted paragraphs into a single multi-line string, with paragraphs separated by an empty line.
[Environment]::NewLine is the platform-appropriate newline sequence; you can alternatively hard-code newlines as "`r`n" (CRLF) or "`n" (LF).
You can send the output to a new file by appending something like
... | Set-Content sortedFile.txt to the last statement (which makes the file "ANSI"-encoded in Windows PowerShell, and UTF-8-encoded in PowerShell Core by default; use -Encoding as needed).
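The hashtable-based approach can likewise be tried on an in-memory string. This is a sketch under the same assumption that paragraphs are separated by an empty line; the variable names $text and $byToken are made up for the example.

```powershell
# In-memory stand-in for (Get-Content -Raw file.txt)
$text = @'
today is sunny in the LA
and the temperature is 21C

today is cloudy in the NY
and the temperature is 18C

today is sunny in the DC
and the temperature is 25C
'@

# The prescribed order.
$tokensToSortBy = '18C', '25C', '21C'

# Index the paragraphs by their trailing token: one pass over the input.
$byToken = @{}
foreach ($p in ($text -split '\r?\n\r?\n' -replace '\r?\n$')) {
  $byToken[$p -replace '(?s).* (\d+C)$', '$1'] = $p
}

# Look up each token in turn: one pass over the tokens.
$orderedParagraphs = $tokensToSortBy.ForEach({ $byToken[$_] })
$ordered = $orderedParagraphs -join ([Environment]::NewLine * 2)

# For verification only: the tokens in output order.
($orderedParagraphs -replace '(?s).* (\d+C)$', '$1') -join ', '   # 18C, 25C, 21C
```

A token that has no matching paragraph simply yields $null from the lookup, so you may want to check for that if the order list comes from an external file.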
$text = Get-Content -path C:\text.txt
$order = '18C','25C','21C'
foreach ($item in $order)
{
$text | ForEach-Object {
if ($_ -match "$item`$") { # `$ to match string at the end of the line
Write-Output $text[($_.ReadCount-2)..($_.ReadCount)] # output lines before and after match
}
}
}