Using PowerShell to copy and replace content from one file to another

I have two files, FileA and FileB, that are nearly identical.
Both files have a section that starts with ////////// MAIN \\\\\\\\\\. I need to replace everything from this point to the end of the file.
At a high level, the process looks like this:
Find the content (starting with ////////// MAIN \\\\\\\\\\) through the end of FileA and copy it to the clipboard.
Find the content (starting with ////////// MAIN \\\\\\\\\\) through the end of FileB and replace it with the content from the clipboard.
How do I do this?
I understand that it would look something like this (found online), but I'm missing the pattern and logic for selecting the text through the end of the file:
# FileA
$inputFileA = "C:\fileA.txt"
# Text to be inserted
$inputFileB = "C:\fileB.txt"
# Output file
$outputFile = "C:\fileC.txt"
# Find where the last </location> tag is
if ((Select-String -Pattern "\</location\>" -Path $inputFileA |
select -last 1) -match ":(\d+):")
{
$insertPoint = $Matches[1]
# Build up the output from the various parts
Get-Content -Path $inputFileA | select -First $insertPoint | Out-File $outputFile
Get-Content -Path $inputFileB | Out-File $outputFile -Append
Get-Content -Path $inputFileA | select -Skip $insertPoint | Out-File $outputFile -Append
}

You could do that in two lines of code:
# first write the top part including the '////////// MAIN \\\\\\\\\\' from FileB to the new file
((Get-Content -Path "D:\Test\fileB.txt" -Raw) -split '(?<=/+ MAIN \\+\r?\n)', 2)[0] | Set-Content -Path "D:\Test\fileC.txt" -NoNewline
# then append the bottom part excluding the '////////// MAIN \\\\\\\\\\' from FileA to the new file
((Get-Content -Path "D:\Test\fileA.txt" -Raw) -split '/+ MAIN \\+\r?\n', 2)[-1] | Add-Content -Path "D:\Test\fileC.txt"
Regex details:
(?<= # Assert that the regex below can be matched, with the match ending at this position (positive lookbehind)
/ # Match the character “/” literally
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\ MAIN\ # Match the characters “ MAIN ” literally
\\ # Match the character “\” literally
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\r # Match a carriage return character
? # Between zero and one times, as many times as possible, giving back as needed (greedy)
\n # Match a line feed character
)
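To see what the two -split calls return, here is a minimal sketch that applies them to in-memory strings instead of files. The sample contents of $fileA and $fileB are made up for illustration; only the two patterns come from the answer above.

```powershell
# Hypothetical sample contents standing in for fileA.txt and fileB.txt
$fileA = "header A`r`n////////// MAIN \\\\\\\\\\`r`nnew body A`r`n"
$fileB = "header B`r`n////////// MAIN \\\\\\\\\\`r`nold body B`r`n"

# Top part of FileB: the lookbehind splits just AFTER the marker line,
# so the marker stays in the first element
$top = ($fileB -split '(?<=/+ MAIN \\+\r?\n)', 2)[0]

# Bottom part of FileA: the marker line itself is consumed by the split,
# so the last element starts with the text that follows it
$bottom = ($fileA -split '/+ MAIN \\+\r?\n', 2)[-1]

$merged = $top + $bottom
$merged
```

The second argument (2) limits each split to two pieces, so a later line that happens to look like the marker does not cut the text again.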
Or, if the files are quite large:
# first write the top part including the '////////// MAIN \\\\\\\\\\' from FileB to the new file
$copyThis = $true
$content = switch -Regex -File "D:\Test\fileB.txt" {
'/+ MAIN \\+' { $copyThis = $false; $_ ; break}
default { if ($copyThis) { $_ } }
}
$content | Set-Content -Path "D:\Test\fileC.txt"
# then append the bottom part excluding the '////////// MAIN \\\\\\\\\\' from FileA to the new file
$copyThis = $false
$content = switch -Regex -File "D:\Test\fileA.txt" {
'/+ MAIN \\+' { $copyThis = $true }
default { if ($copyThis) { $_ } }
}
$content | Add-Content -Path "D:\Test\fileC.txt"
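If the same merge is needed for several file pairs, the two switch blocks could be wrapped in a small function. This is only a sketch: the name Merge-AtMarker and its parameters are hypothetical, and the demo files are created in the temp folder purely for illustration.

```powershell
function Merge-AtMarker {
    param(
        [string]$TopFrom,      # file whose top part (marker line included) is kept
        [string]$BottomFrom,   # file whose text after the marker line is appended
        [string]$Destination,
        [string]$Marker = '/+ MAIN \\+'
    )

    # Emit lines up to and including the first marker line, then stop reading
    $top = switch -Regex -File $TopFrom {
        $Marker { $_; break }
        default { $_ }
    }

    # Emit only the lines that follow the marker line
    $copyThis = $false
    $bottom = switch -Regex -File $BottomFrom {
        $Marker { $copyThis = $true }
        default { if ($copyThis) { $_ } }
    }

    Set-Content -Path $Destination -Value (@($top) + @($bottom))
}

# Demo with throwaway files
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
$a = Join-Path $dir 'fileA.txt'
$b = Join-Path $dir 'fileB.txt'
$c = Join-Path $dir 'fileC.txt'
Set-Content -Path $a -Value 'header A', '////////// MAIN \\\\\\\\\\', 'body A'
Set-Content -Path $b -Value 'header B', '////////// MAIN \\\\\\\\\\', 'body B'

Merge-AtMarker -TopFrom $b -BottomFrom $a -Destination $c
Get-Content -Path $c
```

Because switch -File reads one line at a time, only the lines that are actually kept are held in memory before being written.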

Related

How to search and replace combined with if & else in powershell

Every night I get a text file that needs to be edited manually.
The file contains approximately 250 rows. Three example rows:
112;20-21;32;20-21;24;0;2;248;271;3;3;;
69;1;4;173390;5;0;0;5460;5464;3;3;;
24;7;4;173390;227;0;0;0;0;3;3;;
I need to replace the last two values in each row.
All rows ending with ;0;3;3;; should be replaced with ;0;17;18;; (the last example row; this part is solved).
The logic for the other two:
If the row contains a '-', the last two values should change from ;3;3;; to ;21;21;;
If it doesn't contain a '-', the last two values should change from ;3;3;; to ;22;22;;
This is my script
foreach ($file in Get-ChildItem *.*)
{
(Get-Content $file) -replace ';0;3;3;;',';;0;17;18;;' -replace ';3;3;;',';21;21;;' |Out-file -encoding ASCII $file-new}
If I could add a '-' at the end of each row containing a '-', I could solve the issue with a modified script:
(Get-Content $file) -replace ';0;3;3;;',';;0;17;18;;' -replace ';3;3;;-',';22;22;;' -replace ';3;3;;',';21;21;;'|Out-file -encoding ASCII $file-new}
But how do I add a '-' at the end of a row if the row contains a '-'?
Best Regards
Mr DXF
I tried with select-string, but I can't figure it out...
if select-string -pattern '-' {append-text '-'|out-file -encoding ascii $file-new
else end
}
The following might do the trick, it uses a switch with the -Regex flag to read your files and match lines with regular expressions.
foreach ($file in Get-ChildItem *.* -File) {
& {
switch -Regex -File $file.FullName {
# if the line ends with `;3;3;;` but is not preceded by `;0`
'(?<!;0);3;3;;$' {
# if it contains a `-`
if($_.Contains('-')) {
$_ -replace ';3;3;;$', ';21;21;;'
continue
}
# if it doesn't contain a `-`
$_ -replace ';3;3;;$', ';22;22;;'
continue
}
# if the line ends with `;0;3;3;;`
';0;3;3;;$' {
$_ -replace ';0;3;3;;$', ';0;17;18;;'
continue
}
# if none of the above conditions are matched,
# output as is
Default { $_ }
}
} | Set-Content "$($file.BaseName)-new$($file.Extension)" -Encoding ascii
}
Using the content example in question the end result would become:
112;20-21;32;20-21;24;0;2;248;271;21;21;;
69;1;4;173390;5;0;0;5460;5464;22;22;;
24;7;4;173390;227;0;0;0;0;17;18;;
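The same rules can be checked in isolation with a small function that transforms one row at a time. This is a sketch, not the answer's code: the function name Convert-Row is hypothetical, and it tests the more specific ;0;3;3;; ending first instead of using a negative lookbehind.

```powershell
function Convert-Row {
    param([string]$Row)

    switch -Regex ($Row) {
        # Most specific ending first: ;0;3;3;; becomes ;0;17;18;;
        ';0;3;3;;$' {
            return $Row -replace ';0;3;3;;$', ';0;17;18;;'
        }
        # Any other row ending in ;3;3;; depends on whether it contains a '-'
        ';3;3;;$' {
            if ($Row.Contains('-')) {
                return $Row -replace ';3;3;;$', ';21;21;;'
            }
            return $Row -replace ';3;3;;$', ';22;22;;'
        }
        # Anything else passes through unchanged
        default { return $Row }
    }
}

# The three example rows from the question
$rows = @(
    '112;20-21;32;20-21;24;0;2;248;271;3;3;;'
    '69;1;4;173390;5;0;0;5460;5464;3;3;;'
    '24;7;4;173390;227;0;0;0;0;3;3;;'
)
$converted = $rows | ForEach-Object { Convert-Row -Row $_ }
$converted
```

Returning from the first matching clause plays the role that continue plays in the file-based version: a row is never rewritten twice.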

Powershell remove lines containing the incorrect number of words

Can this be done easily, or at all, in PowerShell?
How would one remove all lines from "test.txt" that do not contain exactly 24 words?
This is not too hard in PowerShell.
Something like below should do it:
# read the file and use Where-Object to capture only those lines that have 24 words exactly.
# the regex -split uses '\W+', meaning to split each line on (at least one) Non-Word character.
$result = Get-Content -Path 'D:\test.txt' | Where-Object { ($_ -split '\W+').Count -eq 24 }
# output on screen
$result
# write output to new file
$result | Out-File -FilePath 'D:\test24.txt' -Force
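One thing to watch with -split '\W+' is that a line starting or ending with punctuation produces empty strings, which are counted as well. The sketch below compares that count with a count of whitespace-delimited tokens; treating a "word" as a run of non-whitespace characters is an assumption, and the sample line is made up.

```powershell
# Hypothetical sample line that starts and ends with non-word characters
$line = '"Hello, world!" she said.'

# Splitting on non-word characters yields an empty string at each end
$splitCount = ($line -split '\W+').Count

# Counting runs of non-whitespace characters ignores leading/trailing punctuation
$tokenCount = [regex]::Matches($line, '\S+').Count

"split count: $splitCount, token count: $tokenCount"

# The same idea applied as a line filter (here: keep lines with exactly 4 words)
$kept = @($line, 'too short') | Where-Object { [regex]::Matches($_, '\S+').Count -eq 4 }
```

For the original task the filter would compare against 24 instead of 4.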

New line is being added after find and replace [duplicate]

This question already has answers here: "How can I prevent additional newlines with set-content while keeping existing ones when saving in UTF8?" (2 answers) and "Set-Content appends a newline (line break, CRLF) at the end of my file" (3 answers).
Closed 4 years ago.
I have a snippet of code which gets the content of each file and will replace values within it if it matches my variable list.
The code works fine. However, after the scan it's leaving a blank line at the end of the file which I do not want to happen.
# From the location set in the first statement,
# recurse through each file in each folder, skipping the
# extensions listed in -Exclude
$configFiles = Get-ChildItem $Destination -Recurse -File -Exclude *.exe,*.css,*.scss,*.png,*.min.js
foreach ($file in $configFiles) {
Write-Host $file.FullName
# Get the content of each file and search and replace values as defined in
# the search/replace table
$fileContent = Get-Content $file.FullName
$fileContent | ForEach-Object {
$line = $_
$lookupTable.GetEnumerator() | ForEach-Object {
# [Regex]::Escape($_.Key) treats regex metacharacters in the search
# string as string literals
if ($line -match [Regex]::Escape($_.Key)) {
$line = $line -replace [Regex]::Escape($_.Key), $_.Value
}
}
$line
} | Set-Content $file.FullName
}
I've tried adding:
Set-Content $file.FullName -NoNewline
This just puts everything in the file on one line.
Some files will already have a blank line at the end which I want to stay the same, so I can't just remove the last line of every file.
How do I stop this script from adding a new line once finished scanning?
$lookuptable for reference:
$lookupTable = @{
'Dummy' = $ReplacementValue
'Dummy2' = $ReplacementValue
}
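One way to keep each file's ending exactly as it was is to read the file as a single string with -Raw, do the replacements on that string, and write it back with -NoNewline. The sketch below assumes PowerShell 5 or later (for -NoNewline); the lookup table entries and the demo file are hypothetical.

```powershell
# Hypothetical lookup table and demo file, for illustration only
$lookupTable = @{
    'Alpha' = 'One'
    'Beta'  = 'Two'
}
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'raw-replace-demo.txt'

# Write a file that deliberately has NO trailing newline
[System.IO.File]::WriteAllText($path, "Alpha=1`r`nBeta=2")

# Read the file as one string so the existing line endings travel with the text
$text = Get-Content -LiteralPath $path -Raw
foreach ($entry in $lookupTable.GetEnumerator()) {
    $text = $text -replace [Regex]::Escape($entry.Key), $entry.Value
}

# -NoNewline stops Set-Content from appending a line break of its own
Set-Content -LiteralPath $path -Value $text -NoNewline

$after = [System.IO.File]::ReadAllText($path)
```

Since the whole file is one string here, -NoNewline no longer joins lines together: the original line breaks are part of $text, and a file that ended with a blank line keeps it.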

PowerShell read text file line by line and find missing file in folders

I am a novice looking for some assistance. I have a text file containing two columns of data. One column is the Vendor and one is the Invoice.
I need to scan that text file, line by line, and see if there is a match on Vendor and Invoice in a path. In the path, $Location, the first wildcard is the Vendor number and the second wildcard is the Invoice
I want the non-matches output to a text file.
$Location = "I:\\Vendors\*\Invoices\*"
$txt = "C:\\Users\sbagford.RECOEQUIP\Desktop\AP.txt"
$Output ="I:\\Vendors\Missing\Missing.txt"
foreach ($line in Get-Content $txt) {
if (-not($line -match $location)){$line}
}
set-content $Output -value $Line
Sample Data from txt or csv file.
kvendnum wapinvoice
000953 90269211
000953 90238674
001072 11012016
002317 448668
002419 06123711
002419 06137343
002419 06134382
002419 759208
002419 753087
002419 753069
002419 762614
003138 N6009348
003138 N6009552
003138 N6009569
003138 N6009612
003182 770016
003182 768995
003182 06133429
In the above data, the only matches are on the second line: 000953 90238674
and the 6th line: 002419 06137343
Untested, but here's how I'd approach it:
$Location = "I:\\Vendors\\.+\\Invoices\\.+"
$txt = "C:\\Users\sbagford.RECOEQUIP\Desktop\AP.txt"
$Output ="I:\\Vendors\Missing\Missing.txt"
select-string -path $txt -pattern $Location -notMatch |
set-content $Output
There's no need to pick through the file line-by-line; PowerShell can do this for you using select-string. The -notMatch parameter simply inverts the search and sends through any lines that don't match the pattern.
select-string sends out a stream of matchinfo objects that contain the lines that met the search conditions. These objects actually contain far more information than just the matching line, but fortunately PowerShell is smart enough to know how to send the relevant item through to set-content.
Regular expressions can be tricky to get right, but are worth getting your head around if you're going to do tasks like this.
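If the output file should contain only the original text of each line, one option is to pick the Line property of each matchinfo object explicitly rather than relying on how the objects are converted to strings. A minimal sketch, using a made-up demo file and pattern:

```powershell
# Hypothetical demo file in the temp folder
$txt = Join-Path ([System.IO.Path]::GetTempPath()) 'select-string-demo.txt'
Set-Content -Path $txt -Value 'keep this line', 'skip: matches pattern', 'keep this one too'

# -NotMatch inverts the search; .Line holds the text of each selected line
$nonMatching = Select-String -Path $txt -Pattern 'pattern' -NotMatch |
    ForEach-Object { $_.Line }

$nonMatching
```

The resulting strings can then be piped to Set-Content as in the snippet above.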
EDIT
$Location = "I:\Vendors\{0}\Invoices\{1}.pdf"
$txt = "C:\\Users\sbagford.RECOEQUIP\Desktop\AP.txt"
$Output = "I:\Vendors\Missing\Missing.txt"
get-content -path $txt |
% {
# extract fields from the line
$lineItems = $_ -split " "
# construct path based on fields from the line
$testPath = $Location -f $lineItems[0], $lineItems[1]
# for debugging purposes
write-host ( "Line:'{0}' Path:'{1}'" -f $_, $testPath )
# test for existence of the path; ignore errors
if ( -not ( get-item -path $testPath -ErrorAction SilentlyContinue ) ) {
# path does not exist, so write the line to pipeline
write-output $_
}
} |
Set-Content -Path $Output
I guess we will have to pick through the file line-by-line after all. If there is a more idiomatic way to do this, it eludes me.
Code above assumes a consistent format in the input file, and uses -split to break the line into an array.
EDIT - version 3
$Location = "I:\Vendors\{0}\Invoices\{1}.pdf"
$txt = "C:\\Users\sbagford.RECOEQUIP\Desktop\AP.txt"
$Output = "I:\Vendors\Missing\Missing.txt"
get-content -path $txt |
select-string "(\S+)\s+(\S+)" |
%{
# pull vendor and invoice numbers from matchinfo
$vendor = $_.matches[0].groups[1]
$invoice = $_.matches[0].groups[2]
# construct path
$testPath = $Location -f $vendor, $invoice
# for debugging purposes
write-host ( "Line:'{0}' Path:'{1}'" -f $_.line, $testPath )
# test for existence of the path; ignore errors
if ( -not ( get-item -path $testPath -ErrorAction SilentlyContinue ) ) {
# path does not exist, so write the line to pipeline
write-output $_
}
} |
Set-Content -Path $Output
It seemed that the -split " " behaved differently in a running script to how it behaves on the command line. Weird. Anyway, this version uses a regular expression to parse the input line. I tested it against the example data in the original post and it seemed to work.
The regex is broken down as follows
( Start the first matching group
\S+ Greedily match one or more non-white-space characters
) End the first matching group
\s+ Greedily match one or more white-space characters
( Start the second matching group
\S+ Greedily match one or more non-white-space characters
) End the second matching group
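For comparison, splitting on runs of whitespace gives the same two fields without Select-String. The sketch below assumes the columns may be separated by several spaces or a tab, which is only a guess at why -split " " misbehaved; the sample line is taken from the question's data with extra spaces added.

```powershell
$Location = 'I:\Vendors\{0}\Invoices\{1}.pdf'

# Sample line with several spaces between the columns (an assumed input shape)
$line = '000953    90238674'

# Split on runs of whitespace; Trim() guards against leading or trailing blanks
$vendor, $invoice = $line.Trim() -split '\s+'

# Build the path to test from the two fields
$testPath = $Location -f $vendor, $invoice
$testPath

# Test-Path returns $true or $false, so no error suppression is needed
$missing = -not (Test-Path -LiteralPath $testPath)
```

A single-space split would return empty strings between consecutive spaces, which would shift the invoice number out of position 1.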

Powershell - reading ahead and While

I have a text file in the following format:
.....
ENTRY,PartNumber1,,,
FIELD,IntCode,123456
...
FIELD,MFRPartNumber,ABC123,,,
...
FIELD,XPARTNUMBER,ABC123
...
FIELD,InternalPartNumber,3214567
...
ENTRY,PartNumber2,,,
...
...
The ... indicates there is other data between these fields. The ONLY thing I can be certain of is that the field starting with ENTRY is a new set of records. The rows starting with FIELD can be in any order, and not all of them may be present in each group of data.
I need to read in a chunk of data.
Search for any field matching the string ABC123.
If ABC123 is found, search for the existence of the InternalPartNumber field & return that row of data.
I have not seen a way to use Get-Content that can read in a variable number of rows as a set & be able to search it.
Here is the code I currently have, which will read a file, searching for a string & replacing it with another. I hope this can be modified to be used in this case.
$ftype = "*.txt"
$fnames = gci -Path $filefolder1 -Filter $ftype -Recurse|% {$_.FullName}
$mfgPartlist = Import-Csv -Path "C:\test\mfrPartList.csv"
foreach ($file in $fnames) {
$contents = Get-Content -Path $file
foreach ($partnbr in $mfgPartlist) {
$oldString = $partnbr.OldValue
$newString = $partnbr.NewValue
if (Select-String -Path $file -SimpleMatch $oldString -Debug -Quiet) {
$stringData = $contents -imatch $oldString
$stringData = $stringData -replace "[\n\r]","|"
foreach ($dataline in $stringData) {
$file +"|"+$stringData+"|"+$oldString+"|"+$newString|Out-File "C:\test\Datachanges.txt" -Width 2000 -Append
}
$contents = $contents -replace $oldString, $newString
Set-Content -Path $file -Value $contents
}
}
}
Is there a way to read & search a text file in "chunks" using Powershell? Or to do a Read-ahead & determine what to search?
Assuming your file isn't too big to read into memory all at once:
$Text = Get-Content testfile.txt -Raw
($Text -split '(?ms)^(?=ENTRY)') |
foreach {
if ($_ -match '(?ms)^FIELD\S+ABC123')
{$_ -replace '(?ms).+(^Field\S+InternalPartNumber.+?$).+','$1'}
}
FIELD,InternalPartNumber,3214567
That reads the entire file in as a single multiline string, and then splits it at the beginning of any line that starts with 'ENTRY'. Then it tests each segment for a FIELD line that contains 'ABC123', and if it does, removes everything except the FIELD line for the InternalPartNumber.
This is not my best work as I have just got back from vacation. You could use a while loop reading the text and set an entry flag to gobble up the text in chunks. However if your files are not too big then you could just read up the text file at once and use regex to split up the chunks and then process accordingly.
$pattern = "ABC123"
$matchedRowToReturn = "InternalPartNumber"
$fileData = Get-Content "d:\temp\test.txt" | Where-Object{$_ -match '^(entry|field)'} | Out-String
$parts = $fileData | Select-String '(?smi)(^Entry).*?(?=^Entry|\Z)' -AllMatches | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value
$parts | Where-Object{$_ -match $pattern} | Select-String "$matchedRowToReturn.*$" | Select-Object -ExpandProperty Matches | Select-Object -ExpandProperty Value
What this will do is read in the text file, drop any lines that are not entry or field related, join the rest into one long string, and split it up into chunks that start with lines that begin with the word "Entry".
Then we drop those "parts" that do not contain the $pattern. Of the remaining that match extract the InternalPartNumber line and present.
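The loop-with-a-flag idea mentioned above can be sketched as follows: collect lines into a chunk, and each time a new ENTRY line arrives, inspect the finished chunk. The function name Get-InternalPartNumberRow is hypothetical, and the sample lines are a shortened version of the data in the question.

```powershell
function Get-InternalPartNumberRow {
    param(
        [string[]]$Lines,
        [string]$Pattern = 'ABC123'
    )

    $chunk = @()
    # The extra 'ENTRY' appended at the end is a sentinel that flushes the last chunk
    foreach ($line in @($Lines) + 'ENTRY') {
        if ($line -match '^ENTRY') {
            # A new record starts: check the chunk collected so far
            if ($chunk -match [regex]::Escape($Pattern)) {
                # Emit the InternalPartNumber row(s) of a matching chunk
                $chunk -match '^FIELD,InternalPartNumber,'
            }
            $chunk = @()
        }
        $chunk += $line
    }
}

$lines = @(
    'ENTRY,PartNumber1,,,'
    'FIELD,IntCode,123456'
    'FIELD,MFRPartNumber,ABC123,,,'
    'FIELD,InternalPartNumber,3214567'
    'ENTRY,PartNumber2,,,'
    'FIELD,MFRPartNumber,XYZ789,,,'
    'FIELD,InternalPartNumber,7654321'
)
$rows = @(Get-InternalPartNumberRow -Lines $lines)
$rows
```

For a real file, the lines could come from Get-Content, so only one record is held in the chunk at a time.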