How to prepend to a file in PowerShell?

I'm generating two files, userscript.meta.js and userscript.user.js. I need the output of userscript.meta.js to be placed at the very beginning of userscript.user.js.
Add-Content doesn't seem to accept a parameter to prepend and Get-Content | Set-Content will fail because userscript.user.js is being used by Get-Content.
I'd rather not create an intermediate file if it's physically possible to have a clean solution.
How to achieve this?

The Subexpression operator $( ) can evaluate both Get-Content statements which are then enumerated and passed through the pipeline to Set-Content:
$(
Get-Content userscript.meta.js -Raw
Get-Content userscript.user.js -Raw
) | Set-Content userscript.user.js
Consider using absolute paths if the files are not in your current directory.
An even simpler approach is to pass the paths in the desired order, since both the -Path and -LiteralPath parameters accept multiple values:
(Get-Content userscript.meta.js, userscript.user.js -Raw) |
Set-Content userscript.user.js
And in case you want to get rid of excess leading or trailing white-space, you can include the String.Trim Method:
(Get-Content userscript.meta.js, userscript.user.js -Raw).Trim() |
Set-Content userscript.user.js
Note that in the above examples the grouping operator ( ) is mandatory, because all output from Get-Content must be consumed before it is passed through the pipeline to Set-Content. See Piping grouped expressions for more details.
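As a minimal, self-contained sketch of why the grouping matters, the following creates two scratch files in the temp directory (the directory and file names are hypothetical placeholders, not the asker's real files) and prepends one to the other in place:

```powershell
# Hypothetical scratch files; names are placeholders for illustration only.
$dir  = Join-Path ([IO.Path]::GetTempPath()) "prepend-demo-$([guid]::NewGuid())"
$null = New-Item -ItemType Directory -Path $dir
$meta = Join-Path $dir 'demo.meta.js'
$user = Join-Path $dir 'demo.user.js'
Set-Content -LiteralPath $meta -Value '// ==UserScript=='
Set-Content -LiteralPath $user -Value 'console.log("hi");'

# (...) makes Get-Content read both files to completion and close them
# before Set-Content opens demo.user.js for writing.
(Get-Content -LiteralPath $meta, $user) | Set-Content -LiteralPath $user

$result = Get-Content -LiteralPath $user
$result
```

After this runs, demo.user.js should start with the line from demo.meta.js, followed by its own original line.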

For future folks, here's a snippet if you need to prepend the same thing to multiple files:
example: prepending an #include directive to a bunch of auto-generated C++ files so it works with my Windows environment.
Get-ChildItem -Path . -Filter *.cpp | ForEach-Object {
$file = $_.FullName
# the -Raw param was important for me as it didn't read the entire
# file properly without it. I even tried [System.IO.File]::ReadAllText
# and got the same thing, so there must have been some character that
# caused the file read to return prematurely
$content = Get-Content $file -Raw
$prepend = '#include "stdafx.h"' + "`r`n"
#this could also be from a file: aka
# $prepend = Get-Content 'path_to_my_file_used_for_prepending'
$content = $prepend + $content
Set-Content $file $content
}

Related

PowerShell: Replace string in all .txt files within directory

I am trying to replace every instance of a string within a directory. However my code is not replacing anything.
What I have so far:
Test Folder contains multiple files and folders containing content that I need to change.
The folders contain .txt documents, the .txt documents contain strings like this: Content reference="../../../PartOfPath/EN/EndofPath/Caution.txt" that i need to change into this: Content reference="../../../PartOfPath/FR/EndofPath/Caution.txt"
Before this question comes up, yes it has to be done this way, as there are other similar strings that I don't want to edit. So I cannot just replace all instances of EN with FR.
$DirectoryPath = "C:\TestFolder"
$Parts = @(
@{PartOne="/PartOfPath";PartTwo="EndofPath/Caution.txt"},
@{PartOne="/OtherPartOfPath";PartTwo="EndofPath/Note.txt"},
@{PartOne="/ThirdPartOfPath";PartTwo="OtherEndofPath/Warning.txt"}) | % { New-Object object | Add-Member -NotePropertyMembers $_ -PassThru }
Get-ChildItem $DirectoryPath | ForEach {
foreach($n in $Parts){
[string]$PartOne = $n.PartOne
[string]$PartTwo = $n.PartTwo
$ReplaceThis = "$PartOne/EN/$PartTwo"
$WithThis = "$PartOne/FR/$PartTwo"
(Get-Content $_) | ForEach {$_ -Replace $ReplaceThis, $WithThis} | Set-Content $_
}
}
The code will run and overwrite files, however no edits will have been made.
While troubleshooting I came across this potential cause:
This test worked:
$FilePath = "C:\TestFolder\Test.txt"
$ReplaceThis ="/PartOfPath/EN/Notes/Note.txt"
$WithThis = "/PartOfPath/FR/Notes/Note.txt"
(Get-Content -Path $FilePath) -replace $ReplaceThis, $WithThis | Set-Content $FilePath
But this test did not
$FilePath = "C:\TestFolder\Test.txt"
foreach($n in $Parts){
[string]$PartOne = $n.PartOne
[string]$PartTwo = $n.PartTwo
[string]$ReplaceThis = "$PartOne/EN/$PartTwo"
[string]$WithThis = "$PartOne/FR/$PartTwo"
(Get-Content -Path $FilePath) -replace $ReplaceThis, $WithThis | Set-Content $FilePath
}
If you can help me understand what is wrong here I would greatly appreciate it.
Thanks to @TessellatingHeckler's comments I revised my code and found this solution:
$DirectoryPath = "C:\TestFolder"
$Parts = @(
@{PartOne="/PartOfPath";PartTwo="EndofPath/Caution.txt"},
@{PartOne="/OtherPartOfPath";PartTwo="EndofPath/Note.txt"},
@{PartOne="/ThirdPartOfPath";PartTwo="OtherEndofPath/Warning.txt"}) | % { New-Object object | Add-Member -NotePropertyMembers $_ -PassThru }
Get-ChildItem $DirectoryPath -Filter "*.txt" -Recurse | ForEach {
foreach($n in $Parts){
[string]$PartOne = $n.PartOne
[string]$PartTwo = $n.PartTwo
$ReplaceThis = "$PartOne/EN/$PartTwo"
$WithThis = "$PartOne/FR/$PartTwo"
(Get-Content $_) | ForEach {$_.Replace($ReplaceThis, $WithThis)} | Set-Content $_
}
}
There were two problems:
-replace was not working as I intended, so I had to use the .Replace() method instead.
The original Get-ChildItem call was not returning any files and had to be replaced with the above version.
PowerShell's -replace operator is regex-based and case-insensitive by default:
To perform literal replacements, \-escape metacharacters in the pattern or call [regex]::Escape().
By contrast, the [string] type's .Replace() method performs literal replacement and is case-sensitive, invariably in Windows PowerShell, by default in PowerShell (Core) 7+ (see this answer for more information).
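To make the difference concrete, here is a small sketch using a made-up test string (the string and variable names are assumptions for illustration). It contrasts an unescaped -replace, an escaped -replace, and the .Replace() method:

```powershell
# Illustrative input: A is the exact target, B differs only in case,
# C has an 'X' where the literal '.' should be.
$text        = 'A: /EN/Caution.txt  B: /en/Caution.txt  C: /EN/CautionXtxt'
$search      = '/EN/Caution.txt'
$replacement = '/FR/Caution.txt'

# Regex, case-insensitive; '.' matches any character, so A, B and C all change.
$asRegex = $text -replace $search, $replacement

# Escaped pattern: '.' is now literal, but matching is still case-insensitive (A and B change).
$escaped = $text -replace [regex]::Escape($search), $replacement

# Literal and case-sensitive: only A changes.
$literal = $text.Replace($search, $replacement)

$asRegex
$escaped
$literal
```

If case-sensitive regex matching is wanted, -creplace can be used in place of -replace.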
Therefore:
As TessellatingHeckler points out, given that the only regex metacharacter in your search strings is . (which also matches a literal .), and that nothing in them would need escaping in order to match (such as \), there is no obvious reason why your original approach didn't work.
Given that you're looking for literal substring replacements, the [string] type's .Replace() is generally the simpler and faster option if case-SENSITIVITY is desired / acceptable (invariably so in Windows PowerShell; as noted, in PowerShell (Core) 7+, you have the option of making .Replace() case-insensitive too).
However, since you need to perform multiple replacements, a more concise, single-pass -replace solution is possible (though whether it actually performs better would have to be tested; if you need case-sensitivity, use -creplace in lieu of -replace):
$oldLang = 'EN'
$newLang = 'FR'
$regex = @(
"(?<prefix>/PartOfPath/)$oldLang(?<suffix>/EndofPath/Caution.txt)",
"(?<prefix>/OtherPartOfPath/)$oldLang(?<suffix>/EndofPath/Note.txt)",
"(?<prefix>/ThirdPartOfPath/)$oldLang(?<suffix>/OtherEndofPath/Warning.txt)"
) -join '|'
Get-ChildItem C:\TestFolder -Filter *.txt -Recurse | ForEach-Object {
($_ |Get-Content -Raw) -replace $regex, "`${prefix}$newLang`${suffix}" |
Set-Content -LiteralPath $_.FullName
}
See this regex101.com page for an explanation of the regex and the ability to experiment with it.
The expression used as the replacement operand, "`${prefix}$newLang`${suffix}", mixes PowerShell's up-front string interpolation ($newLang, which could also be written as ${newLang}) with placeholders referring to the named capture groups (e.g. (?<prefix>...)) in the regex, which only coincidentally use the same notation as PowerShell variables (though enclosing the name in {...} is required; also, here the $ chars. must be `-escaped to prevent PowerShell's string interpolation from interpreting them); see this answer for background information.
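The quoting rules for the replacement operand can be tried in isolation. The sketch below uses one sample line taken from the question and a simplified single-alternative pattern (the variable names are illustrative); it shows two equivalent ways to build the operand:

```powershell
$newLang = 'FR'
$line    = 'Content reference="../../../PartOfPath/EN/EndofPath/Caution.txt"'
$pattern = '(?<prefix>/PartOfPath/)EN(?<suffix>/EndofPath/Caution\.txt)'

# Expandable string: `$ keeps ${prefix} / ${suffix} for the regex engine,
# while $newLang is interpolated by PowerShell up front.
$expandable = $line -replace $pattern, "`${prefix}$newLang`${suffix}"

# Same operand built from verbatim (single-quoted) pieces; no escaping needed.
$concatenated = $line -replace $pattern, ('${prefix}' + $newLang + '${suffix}')

$expandable
$concatenated
```

Both statements should yield the same line with EN replaced by FR.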
Note the use of -Raw with Get-Content, which reads a text file as a whole into memory, as a single, multi-line string. Given that you don't need line-by-line processing in this case, this greatly speeds up the processing of a given file.
As a general tip: you may need to use the -Encoding parameter with Set-Content to ensure the desired character encoding, given that PowerShell never preserves a file's original encoding when reading it. By default, you'll get ANSI-encoded files in Windows PowerShell, and BOM-less UTF-8 files in PowerShell (Core) 7+.
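A minimal sketch of specifying the encoding explicitly on both the write and the read, assuming a hypothetical scratch file in the temp directory:

```powershell
# Hypothetical scratch file; the name is a placeholder.
$path = Join-Path ([IO.Path]::GetTempPath()) 'encoding-demo.txt'

# Build a string with a non-ASCII character without depending on the
# encoding of the script file itself.
$text = 'caf' + [char]0xE9

# Write and read back as UTF-8 explicitly rather than relying on defaults.
Set-Content -LiteralPath $path -Value $text -Encoding utf8
$roundTrip = Get-Content -LiteralPath $path -Encoding utf8

$roundTrip -eq $text
```

Note that -Encoding utf8 writes a BOM in Windows PowerShell but not in PowerShell (Core) 7+, so the bytes on disk can differ between editions even though the round trip succeeds in both.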

How to add a counter to PowerShell's ForEach-Object

So I have a Pipe that will search a file for a specific stream and if found will replace it with a masked value, I am trying to have a counter for all of the times the oldValue is replaced with the newValue. It doesn't necessarily need to be a one liner just curious how you guys would go about this. TIA!
Get-Content -Path $filePath |
ForEach-Object {
$_ -replace "$oldValue", "$newValue"
} |
Set-Content $filePath
I suggest:
Reading the entire input file as a single string with Get-Content's -Raw switch.
Using -replace / [regex]::Replace() with a script block to determine the substitution text, which allows you to increment a counter variable every time a replacement is made.
Note: Since you're replacing the input file with the results, be sure to make a backup copy first, to be safe.
In PowerShell (Core) 7+, the -replace operator now directly accepts a script block that allows you to determine the substitution text dynamically:
$count = 0
(Get-Content -Raw $filePath) -replace $oldValue, { $newValue; ++$count } |
Set-Content -NoNewLine $filePath
$count now contains the number of replacements, across all lines (including multiple matches on the same line), that were performed.
In Windows PowerShell, direct use of the underlying .NET API, [regex]::Replace(), is required:
$count = 0
[regex]::Replace(
'' + (Get-Content -Raw $filePath),
$oldValue,
{ $newValue; ++(Get-Variable count).Value }
) | Set-Content -NoNewLine $filePath
Note:
'' + ensures that the call succeeds even if file $filePath has no content at all; without it, [regex]::Replace() would complain about the argument being null.
++(Get-Variable count).Value must be used in order to increment the $count variable in the caller's scope (Get-Variable can retrieve variables defined in ancestral scopes; -Scope 1 is implied here, thanks to PowerShell's dynamic scoping). Unlike with -replace in PowerShell 7+, the script block runs in a child scope.
As an aside:
For this use case, the only reason a script block is used is so that the counter variable can be incremented - the substitution text itself is static. See this answer for an example where the substitution text truly needs to be determined dynamically, by deriving it from the match at hand, as passed to the script block.
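For contrast, here is a small sketch where the substitution text really is derived from each match. It uses [regex]::Replace() with a script block, which should work in both PowerShell editions; the input string and the doubling logic are made up for illustration:

```powershell
# The script block is converted to a MatchEvaluator delegate and receives
# each [System.Text.RegularExpressions.Match] as its first argument.
$result = [regex]::Replace(
  'a1 b22',
  '\d+',
  { param($m) [string]([int]$m.Value * 2) }   # double every number found
)

$result
```

In PowerShell (Core) 7+ the same idea can be expressed with the -replace operator, where the match at hand is available as $_ inside the script block.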
Changing my answer due to more clarifications in comments. The best way I can think of is to get the count of the $Oldvalue ahead of time. Then replace!
$content = Get-Content -Path $filePath
$toBeReplaced = Select-String -InputObject $content -Pattern $oldValue -AllMatches
$replacedTotal = $toBeReplaced.Matches.Count
$content | ForEach-Object {$_ -replace "$oldValue", "$newValue"} | Set-Content $filePath
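A variation on counting ahead of time, sketched on an in-memory string rather than a file: [regex]::Matches() returns every match, so its Count gives the number of replacements. The sample text and values are assumptions. Since -replace is case-insensitive by default, the count is taken with the IgnoreCase option so that both agree:

```powershell
# Illustrative input with three case-insensitive occurrences of 'foo'.
$text     = "foo bar`nFOO baz foo"
$oldValue = 'foo'
$newValue = 'qux'

# Count first, using the same case-insensitive matching that -replace applies.
$count = [regex]::Matches($text, $oldValue, 'IgnoreCase').Count

# Then perform the replacement.
$updated = $text -replace $oldValue, $newValue

$count
$updated
```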

How can I (efficiently) match content (lines) of many small files with content (lines) of a single large file and update/recreate them

I've tried solving the following case:
many small text files (in subfolders) need their content (lines) matched to lines that exist in another (large) text file. The small files then need to be updated or copied with those matching Lines.
I was able to come up with some running code for this but I need to improve it or use a complete other method because it is extremely slow and would take >40h to get through all files.
One idea I already had was to use a SQL Server to bulk-import all files in a single table with [relative path],[filename],[jap content] and the translation file in a table with [jap content],[eng content] and then join [jap content] and bulk-export the joined table as separate files using [relative path],[filename]. Unfortunately I got stuck right at the beginning due to formatting and encoding issues so I dropped it and started working on a PowerShell script.
Now in detail:
Over 40k txt files spread across multiple subfolders with multiple lines each, every line can exist in multiple files.
Content:
UTF8 encoded Japanese text that also can contain special characters like \\[*+(), each Line ending with a tabulator character. Sounds like csv files but they don't have headers.
One large File with >600k Lines containing the translation to the small files. Every line is unique within this file.
Content:
Again UTF8 encoded Japanese text. Each line formatted like this (without brackets):
[Japanese Text][tabulator][English Text]
Example:
テスト[1]	Test [1]
End result should be a copy or a updated version of all these small files where their lines got replaced with the matching ones of the translation file while maintaining their relative path.
What I have at the moment:
$translationfile = 'B:\Translation.txt'
$inputpath = 'B:\Working'
$translationarray = [System.Collections.ArrayList]@()
$translationarray = @(Get-Content $translationfile -Encoding UTF8)
Get-Childitem -path $inputpath -Recurse -File -Filter *.txt | ForEach-Object -Parallel {
$_.Name
$filepath = ($_.Directory.FullName).substring(2)
$filearray = [System.Collections.ArrayList]@()
$filearray = @(Get-Content -path $_.FullName -Encoding UTF8)
$filearray = $filearray | ForEach-Object {
$result = $using:translationarray -match ("^$_" -replace '[[+*?()\\.]','\$&')
if ($result) {
$_ = $result
}
$_
}
If(!(test-path B:\output\$filepath)) {New-Item -ItemType Directory -Force -Path B:\output\$filepath}
#$("B:\output\"+$filepath+"\")
$filearray | Out-File -FilePath $("B:\output\" + $filepath + "\" + $_.Name) -Force -Encoding UTF8
} -ThrottleLimit 10
I would appreciate any help and ideas, but please keep in mind that I rarely write scripts, so anything too complex might fly right over my head.
Thanks
As zett42 states, using a hash table is your best option for mapping the Japanese-only phrases to the dual-language lines.
Additionally, use of .NET APIs for file I/O can speed up the operation noticeably.
# Be sure to specify all paths as full paths, not least because .NET's
# current directory usually differs from PowerShell's
$translationfile = 'B:\Translation.txt'
$inPath = 'B:\Working'
$outPath = (New-Item -Type Directory -Force 'B:\Output').FullName
# Build the hashtable mapping the Japanese phrases to the full lines.
# Note that ReadLines() defaults to UTF-8
$ht = @{ }
foreach ($line in [IO.File]::ReadLines($translationfile)) {
$ht[$line.Split("`t")[0] + "`t"] = $line
}
Get-ChildItem $inPath -Recurse -File -Filter *.txt | Foreach-Object -Parallel {
# Translate the lines to the matching lines including the translation
# via the hashtable.
# NOTE: If an input line isn't represented as a key in the hashtable,
# it is passed through as-is.
$lines = foreach ($line in [IO.File]::ReadLines($_.FullName)) {
($using:ht)[$line] ?? $line
}
# Synthesize the output file path, ensuring that the target dir. exists.
$outFilePath = (New-Item -Force -Type Directory ($using:outPath + $_.Directory.FullName.Substring(($using:inPath).Length))).FullName + '/' + $_.Name
# Write to the output file.
# Note: If you want UTF-8 files *with BOM*, use -Encoding utf8bom
Set-Content -Encoding utf8 $outFilePath -Value $lines
} -ThrottleLimit 10
Note: Your use of ForEach-Object -Parallel implies that you're using PowerShell [Core] 7+, where BOM-less UTF-8 is the consistent default encoding (unlike in Windows PowerShell, where default encodings vary wildly).
Therefore, in lieu of the .NET [IO.File]::ReadLines() API in a foreach loop, you could also use the more PowerShell-idiomatic switch statement with the -File parameter for efficient line-by-line text-file processing.
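A minimal sketch of the switch -File variant, assuming a tiny hypothetical input file and a one-entry hashtable (file name, keys, and values are all made up). It uses ContainsKey() instead of ?? so that it is not tied to PowerShell 7 syntax:

```powershell
# Hypothetical scratch file with tab-terminated lines, as in the question.
$path = Join-Path ([IO.Path]::GetTempPath()) 'switch-file-demo.txt'
Set-Content -LiteralPath $path -Value "one`t", "two`t", "three`t"

# Illustrative lookup table: source line -> full translated line.
$ht = @{ "two`t" = "two`tzwei" }

# switch -File reads the file line by line; $_ is the current line.
$lines = switch -File $path {
  default {
    if ($ht.ContainsKey($_)) { $ht[$_] } else { $_ }   # pass unknown lines through
  }
}

$lines
```

Lines without an entry in the hashtable are passed through unchanged, matching the behavior of the foreach version.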

Replacing text only in lines that match a criteria, using the pipeline

My goal is to replace specific texts in specific lines in a text file, and I want to do that using the pipeline.
At first, I tried to write the code for the text replacement, without the condition that set the replacement to happen only in specific lines:
$fileName = Read-Host "Enter the full path of the file, without quotes"
(Get-Content -Path $fileName -Encoding UTF8) |
ForEach-Object { $_ -replace "01", "January " } |
Set-Content -Path $fileName -Encoding UTF8
It seems that it works. But then, I inserted an IF statement to the pipeline:
$fileName = Read-Host "Enter the full path of the file, without quotes"
(Get-Content -Path $fileName -Encoding UTF8) |
ForEach-Object { if ($_ -match "Month") {$_ -replace "03", "March"} } |
Set-Content -Path $fileName -Encoding UTF8
When I ran the last script, the resulting file contained only the lines that matched the if statement. If I understand correctly, only the lines that match the if statement are passed to the next stage of the pipeline. So I understand why I got this output, but I still can't figure out how to solve it: how do I pass all the lines of the file through all the stages of the pipeline while making the text replacements only in lines that match a specific criterion?
Could you please assist me with this issue?
Please notice that I would like not to use a temporary file for this and also remember that I prefer an elegant way of doing this, using the pipeline.
You have to add an else branch, like this:
(Get-Content -Path $fileName -Encoding UTF8) |
Foreach-Object { If ($_ -match "Month") { $_ -replace "03", "March"} else { $_ } } |
Set-Content -Path $fileName -Encoding UTF8
Without the else branch, non-matching lines were never written to the pipeline, so your if acted as a filter.
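The effect is easy to reproduce without touching a file. This sketch runs both variants over a made-up in-memory array of lines:

```powershell
# Illustrative input lines.
$lines = 'Month: 03', 'Day: 03', 'Year: 2003'

# Without else: non-matching lines produce no output, so they are dropped.
$filtered = $lines | ForEach-Object {
  if ($_ -match 'Month') { $_ -replace '03', 'March' }
}

# With else: non-matching lines are passed through unchanged.
$complete = $lines | ForEach-Object {
  if ($_ -match 'Month') { $_ -replace '03', 'March' } else { $_ }
}

$filtered
$complete
```

The first pipeline yields a single line, while the second yields all three with only the Month line modified.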
Depending on what your input data looks like you may not need a nested conditional (or a ForEach-Object) at all. If your input looks for instance like this:
Month: 03
you can do the replacement like this:
(Get-Content -Path $fileName -Encoding UTF8) -replace '^(.*Month.*)03','$1March' |
Set-Content -Path $fileName -Encoding UTF8
That will modify just the lines matching the pattern (^(.*Month.*)03) and leave everything else unchanged.
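Because -replace applied to an array operates on each element and returns non-matching elements as-is, the behavior can be checked on a few made-up lines:

```powershell
# Illustrative input lines.
$lines = 'Month: 03', 'Day: 03', 'Month: 11'

# Only lines containing 'Month' followed later by '03' match the pattern;
# $1 re-inserts everything captured before the '03'.
$result = $lines -replace '^(.*Month.*)03', '$1March'

$result
```

Only the first line changes; the Day line contains 03 but not Month, and the last Month line contains no 03.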

Powershell. Writing out lines based on string within the file

I'm looking for a way to export all lines from within a text file where part of the line matches a certain string. The string is actually the first 4 bytes of the file and I'd like to keep the command to only checking those bytes; not the entire row. I want to write the entire row. How would I go about this?
I am using Windows only and don't have the option to use many other tools that might do this.
Thanks in advance for any help.
Do you want to perform a simple "grep"? Then try this
select-string .\test.txt -pattern "\Athat" | foreach {$_.Line}
or this (very similar regex), also writes to an outfile
select-string .\test.txt -pattern "^that" | foreach {$_.Line} | out-file -filepath out.txt
This assumes that you want to search for the 4-byte string "that" at the beginning of the string or at the beginning of the line, respectively.
Something like the following Powershell function should work for you:
function Get-Lines {
[cmdletbinding()]
param(
[string]$filename,
[string]$prefix
)
if( Test-Path -Path $filename -PathType Leaf -ErrorAction SilentlyContinue ) {
# filename exists, and is a file
$lines = Get-Content $filename
foreach ( $line in $lines ) {
if ( $line -like "$prefix*" ) {
$line
}
}
}
}
To use it, assuming you save it as get-lines.ps1, you would load the function into memory with:
. .\get-lines.ps1
and then to use it, you could search for all lines starting with "DATA" with something like:
get-lines -filename C:\Files\Datafile\testfile.dat -prefix "DATA"
If you need to save it to another file for viewing later, you could do something like:
get-lines -filename C:\Files\Datafile\testfile.dat -prefix "DATA" | out-file -FilePath results.txt
Or, if I were more awake, you could ignore the script above and use a simpler solution such as the following one-liner:
get-content -path C:\Files\Datafile\testfile.dat | select-string -Pattern "^DATA"
Which just uses the ^ regex character to make sure it's only looking for "DATA" at the beginning of each line.
To get all the lines from c:\somedir\somefile.txt that begin with 'abcd' :
(get-content c:\somedir\somefile.txt) -like 'abcd*'
provided c:\somedir\somefile.txt is not an unusually large (hundreds of MB) file. For that situation:
get-content c:\somedir\somefile.txt -readcount 1000 |
foreach {$_ -like 'abcd*'}
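With -ReadCount, Get-Content sends lines through the pipeline in batches (arrays) rather than one at a time, and -like applied to an array returns only the matching elements. A small sketch with a hypothetical scratch file and a batch size of 2; the @(...) wrapper is an added precaution (not in the original) so that -like always acts as a filter, even if a batch were to arrive as a single string:

```powershell
# Hypothetical scratch file; the name and contents are placeholders.
$path = Join-Path ([IO.Path]::GetTempPath()) 'readcount-demo.txt'
Set-Content -LiteralPath $path -Value 'abcd 1', 'xyz 2', 'abcd 3', 'abcd 4', 'xyz 5', 'xyz 6'

# Each $_ is a batch of up to 2 lines; -like filters the batch.
$found = Get-Content -LiteralPath $path -ReadCount 2 |
  ForEach-Object { @($_) -like 'abcd*' }

$found
```

Larger batch sizes such as the 1000 used above reduce per-line pipeline overhead while keeping memory use bounded.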