Concatenate files using PowerShell - powershell

I am using PowerShell 3.
What is best practice for concatenating files?
file1.txt + file2.txt = file3.txt
Does PowerShell provide a facility for performing this operation directly? Or do I need each file's contents be loaded into local variables?

If all the files exist in the same directory and can be matched by a simple pattern, the following code will combine all files into one.
Get-Content .\File?.txt | Out-File .\Combined.txt

I would go this route:
Get-Content file1.txt, file2.txt | Set-Content file3.txt
Use the -Encoding parameter on Set-Content if you need something other than the default encoding (the system's ANSI code page in Windows PowerShell, BOM-less UTF-8 in PowerShell 6 and later).
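As a minimal sketch of the above, assuming UTF-8 is the encoding you want: the file names and the temporary directory are made up for illustration, and the snippet builds its own sample inputs so that it can run anywhere.

```powershell
# Work in a throwaway directory so the sketch does not touch real files.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null

# Two small sample inputs (hypothetical names).
$file1 = Join-Path $dir 'file1.txt'
$file2 = Join-Path $dir 'file2.txt'
$file3 = Join-Path $dir 'file3.txt'
Set-Content -Path $file1 -Value 'alpha', 'beta' -Encoding UTF8
Set-Content -Path $file2 -Value 'gamma' -Encoding UTF8

# Concatenate in the order given, stating the encoding on both sides.
Get-Content -Path $file1, $file2 -Encoding UTF8 |
    Set-Content -Path $file3 -Encoding UTF8

# Read the result back to check it.
$combined = @(Get-Content -Path $file3 -Encoding UTF8)
```

Naming the encoding on both the read and the write side avoids relying on version-specific defaults.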

If you need more flexibility, you could use something like
Get-ChildItem -Recurse *.cs | ForEach-Object { Get-Content $_.FullName } | Out-File -FilePath .\all.txt

Warning: Concatenation using a simple Get-Content (with or without the -Raw flag) works only for text files; for anything else, PowerShell is too helpful:
Without -Raw, it "fixes" (i.e. breaks, pun intended) line breaks, or what PowerShell thinks is a line break.
With -Raw, you get a terminating line end (normally CR+LF) at the end of each file part, which is added at the end of the pipeline. Set-Content in newer PowerShell versions has an option (-NoNewline) to suppress it.
To concatenate a binary file (that is, an arbitrary file that was split for some reason and needs to be put together again), use either this:
Get-Content -Raw file1, file2 | Set-Content -NoNewline destination
or something like this:
Get-Content file1 -Encoding Byte -Raw | Set-Content destination -Encoding Byte
Get-Content file2 -Encoding Byte -Raw | Add-Content destination -Encoding Byte
An alternative is to use the CMD shell and use
copy file1 /b + file2 /b + file3 /b + ... destinationfile
You must not overwrite any part, that is, use any of the parts as the destination. The destination file must be different from every part. Otherwise you're in for a surprise and will have to find a backup copy of the overwritten part.
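If you would rather bypass PowerShell's text handling entirely, the parts can be copied with .NET streams. This is only a sketch: Join-BinaryFile is a hypothetical helper name, not a built-in cmdlet, and the sample files are created just for the demonstration. It also enforces the rule that the destination must not be one of the parts.

```powershell
# Hypothetical helper: byte-for-byte concatenation via .NET streams.
function Join-BinaryFile {
    param(
        [Parameter(Mandatory = $true)] [string[]] $Path,
        [Parameter(Mandatory = $true)] [string]   $Destination
    )
    # Resolve to full paths; .NET ignores PowerShell's current location.
    $dest = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Destination)
    $sources = @(foreach ($p in $Path) { (Resolve-Path -LiteralPath $p).ProviderPath })
    if ($sources -contains $dest) { throw 'Destination must differ from every part.' }

    $out = [System.IO.File]::Create($dest)
    try {
        foreach ($s in $sources) {
            $in = [System.IO.File]::OpenRead($s)
            try { $in.CopyTo($out) } finally { $in.Dispose() }
        }
    }
    finally { $out.Dispose() }
}

# Demonstration with two tiny sample parts in a throwaway directory.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
$part1 = Join-Path $dir 'part1.bin'
$part2 = Join-Path $dir 'part2.bin'
$whole = Join-Path $dir 'whole.bin'
$bytes1 = [byte[]] (0, 13, 10, 255)
$bytes2 = [byte[]] (10, 0, 200)
[System.IO.File]::WriteAllBytes($part1, $bytes1)
[System.IO.File]::WriteAllBytes($part2, $bytes2)

Join-BinaryFile -Path $part1, $part2 -Destination $whole
$bytes = [System.IO.File]::ReadAllBytes($whole)
```

Because nothing is decoded as text, bytes such as 13 and 10 (CR and LF) pass through untouched.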

A generalization based on @Keith's answer:
gc <wildcard pattern> | sc output

Here is an interesting example of how to make a zip-in-image file based on Powershell 7
Get-Content -AsByteStream file1.png, file2.7z | Set-Content -AsByteStream file3.png
Get-Content -AsByteStream file1.png, file2.7z | Add-Content -AsByteStream file3.png

gc file1.txt, file2.txt > output.txt
I think this is as short as it gets.

In case you would like to ensure the concatenation is done in a specific order, use the Sort-Object -Property <Some Name> argument. For example, concatenate based on the name sorting in an ascending order:
Get-ChildItem -Path ./* -Include *.txt -Exclude output.txt | Sort-Object -Property Name | ForEach-Object { Get-Content $_ } | Out-File output.txt
IMPORTANT: -Exclude and Out-File MUST contain the same values, otherwise, it will recursively keep on adding to output.txt until your disk is full.
Note that you must append a * at the end of the -Path argument because you are using -Include, as mentioned in Get-ChildItem documentation.

Related

Find lines with specific characters recursively

I am searching for all lines with '.png' and '.jpg' strings in them across multiple folders of TXT files.
Tried:
(Get-ChildItem K:\FILES -Recurse -Include '*.txt') | ForEach-Object {
(Get-Content $_) -match '\.png','\.jpg' | out-file K:\Output.txt
}
but it does not output anything. No error either. I did something similar recently and it was working. I am scratching my head wondering what am I doing wrong here...
By placing your Out-File call inside the ForEach-Object script block, you're rewriting your output file in full for every input file, so that the last input file's results - which may be none - end up as the sole content of the file.
The immediate fix is to move the Out-File call to its own pipeline segment, so that it receives all output, across all files:
Get-ChildItem K:\FILES -Recurse -Include '*.txt' |
ForEach-Object {
@(Get-Content $_) -match '\.(png|jpg)'
} |
Out-File K:\Output.txt
Note: Technically, adding -Append to your Out-File call inside the ForEach-Object could have worked too, but this approach should be avoided:
Every Out-File call must open and close the output file, which makes the operation much slower.
You need to ensure that there is no preexisting output file beforehand - otherwise you'll end up appending to that file's existing content.
However, consider speeding up your command with the help of Select-String:
Get-ChildItem K:\FILES -Recurse -Include '*.txt' |
Select-String -Pattern '\.png', '\.jpg' |
ForEach-Object Line |
Out-File K:\Output.txt
Note:
In PowerShell (Core) 7+, you can use the -Raw switch with Select-String, which directly outputs only the text of all matching lines, in which case ForEach-Object Line isn't needed.
If you want to prefix each matching line with the source file path:
Get-ChildItem K:\FILES -Recurse -Include '*.txt' |
Select-String -Pattern '\.png', '\.jpg' |
ForEach-Object { '{0}: {1}' -f $_.Path, $_.Line } |
Out-File K:\Output.txt
Note: If you pipe Select-String output directly (without -Raw or ForEach-Object Line) to Out-File (or if you use >), you'll get similar output (even including a character position), but with limitations:
You'll get a blank line at the top and the bottom of the file.
Long line texts may be truncated.
The reason is that Out-File and its virtual alias > send the for-display representations of the input objects to the output file, which aren't meant for programmatic processing and can incur truncation of the data based on the line length (number of columns) of the current console window.
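Here is a small self-contained sketch of the Select-String approach, assuming a throwaway directory with made-up sample files in place of K:\FILES. It extracts the Line property explicitly, so it behaves the same in Windows PowerShell and in PowerShell 7+, and writes the plain strings with Set-Content.

```powershell
# Build sample input in a throwaway directory (hypothetical file names).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'a.txt') -Value 'see image.png', 'no match here'
Set-Content -Path (Join-Path $dir 'b.txt') -Value 'photo.jpg attached'

# Collect the text of every matching line across all .txt files.
$hits = @(
    Get-ChildItem -Path $dir -Filter '*.txt' -Recurse |
        Select-String -Pattern '\.png', '\.jpg' |
        ForEach-Object { $_.Line }
)

# Write the strings once, outside the per-file loop.
# The .log extension keeps the output out of the *.txt input set.
$hits | Set-Content -Path (Join-Path $dir 'Output.log')
```

Writing once at the end of the pipeline avoids both the overwrite problem and the cost of reopening the output file for every input file.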

Add eighth and ninth lines to all *.txt files

I have more than 100 txt files in C:\myfolder\*.txt
When I run this script from "C:\myfolder", I can add the eighth and ninth lines to somename.txt:
@echo off
powershell "$f=(Get-Content somename.txt);$f[8]='heretext1';$f | set-content somename.txt"
powershell "$f=(Get-Content somename.txt);$f[9]='heretext2';$f | set-content somename.txt"
But how can I add the eighth and ninth lines to all *.txt files located in the path C:\myfolder\*.txt?
Can someone explain to me how to do it, please?
Sorry for my English, and sorry if I didn't explain my problem well. I will try now:
I use "*.uci" files instead of *.txt files. I wrote txt because the uci extension is unknown to most people. These *.uci files are settings for chess engines that use the UCI protocol.
So when you use the ChessBase program you have a lot of chess engines, and each engine creates its own "enginename.uci" file.
If you want to change the number of cores used on your PC from 1 to 16, you need to do it manually by adding the following information to the *.uci file, like this:
[OPTIONS]
Threads=1
That's why it is better to make a small batch file or ps1 script that changes the settings for all engines by adding these two lines with one click.
Perhaps something like this PowerShell script would suit your task:
Get-ChildItem -Path 'C:\myfolder' -Filter '*.txt' | ForEach-Object {
$LineIndex = 0
$FileContent = Switch -File $_.FullName {Default {
$LineIndex++
If ($LineIndex -Eq 8) {@'
heretext1
heretext2
'@}
$_}}
Set-Content -Path $_.FullName -Value $FileContent}
Note:
Your code isn't adding lines, it is modifying existing lines. The solution below does the same.
Indices [8] and [9] access the 9th and 10th lines, not the 8th and 9th, given that array indexing is 0-based.
You need to call Get-ChildItem with your file-name pattern, C:\myfolder\*.txt, and process each matching file via ForEach-Object:
@echo off
powershell "Get-ChildItem C:\myfolder\*.txt | ForEach-Object { $f=$_ | Get-Content -ReadCount 0; $f[8]='heretext1'; $f[9]='heretext2'; Set-Content $_.FullName $f }"
Due to calling from a batch file (cmd.exe), the PowerShell command is specified on a single line; here's the readable version:
Get-ChildItem C:\myfolder\*.txt | # get all matching files
ForEach-Object { # process each
$f = $_ | Get-Content -ReadCount 0 # read all lines
$f[8] = 'heretext1'; $f[9] = 'heretext2' # update the 9th and 10th line
Set-Content $_.FullName $f # save result back to input file
}
Note:
Consider adding -noprofile after powershell, so as to suppress potentially unnecessary loading of profile files - see the documentation of the Windows PowerShell CLI, powershell.exe.
Using -ReadCount 0 with Get-Content greatly speeds up processing, because all lines are then read into a single array, instead of streaming the lines one by one, which requires collecting them in an array, which is much slower.
Note: If a given file has fewer than 10 lines, the above solution won't work, because you can only assign to existing elements of an array (an array is a fixed-size data structure). If you need to deal with this case, insert the following after the $f = $_ | Get-Content -ReadCount 0 line, which appends empty lines as needed to ensure that at least 10 lines are present:
if ($f.Count -lt 10) { $f += @('') * (10 - $f.Count) }
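The padding step can also be wrapped in a small function. This is only a sketch: Set-LineAt is a hypothetical name, and it works on an in-memory array rather than on a file, so you would still read and write the file around it.

```powershell
# Hypothetical helper: set the line at a 0-based index, padding with
# empty lines first if the array is too short.
function Set-LineAt {
    param(
        [string[]] $Lines = @(),
        [int]      $Index,
        [string]   $Text
    )
    $result = @($Lines)
    if ($result.Count -le $Index) {
        # Array replication: append as many empty strings as are missing.
        $result += @('') * ($Index + 1 - $result.Count)
    }
    $result[$Index] = $Text
    $result
}

# A 3-line "file": indices 8 and 9 do not exist yet.
$f = @('line1', 'line2', 'line3')
$f = @(Set-LineAt -Lines $f -Index 8 -Text 'heretext1')
$f = @(Set-LineAt -Lines $f -Index 9 -Text 'heretext2')
```

After the two calls, $f has ten elements: the three original lines, five empty lines, and the two new ones at indices 8 and 9 (the 9th and 10th lines).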
Easiest solution I can think of is using the -Index parameter provided in Select-Object for that.
Get-ChildItem -Path .\Desktop\*.txt | % { Get-Content $_.FullName | Select-Object -Index 7,8 } |
Out-File -FilePath .\Desktop\index.txt
Edit: based on your post.

Batch command to remove a string pattern from input file

I'm very new to scripting.
I have a couple of files File1.txt and File2.txt. "RemPattern" is the pattern which I'm expecting to find and remove recursively from the above files.
Is it possible to remove them with the help of any windows or powershell batch command?
I have seen Get-Content can be used to remove an entire line of the matched pattern, but it doesn't fit for my case.
(Get-Content 'File1.txt') -notmatch 'RemPattern' | Set-Content 'File1.txt'
Is it required to write a batch file to achieve this or is it possible to do it by batch commands?
You can try -replace instead of -notmatch.
(Get-Content 'D:\File.txt') -replace 'RemPattern' | Set-Content 'D:\File.txt'
I was assuming that you wanted to recurse through a set of files and not do them by manually typing the filenames. So you can:
Get-ChildItem F:\ -Filter File*.txt | Foreach-Object{
(Get-Content $_.FullName) | Foreach-Object {$_ -replace 'RemPattern'} | Set-Content $_.FullName
}
The filter here simply matches File*.txt, which in your example performs the replacement in both File1.txt and File2.txt without having to type out each file name manually. You can change the filter as you please.
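One thing to keep in mind is that -replace treats its pattern as a regular expression, so a pattern containing characters such as . or * may match more than intended. The sketch below uses made-up sample strings and the pattern 'a.b' purely for illustration; [regex]::Escape turns the pattern into a literal match.

```powershell
# Sample lines: only the first contains the literal text 'a.b'.
$lines = 'keep a.b here', 'keep axb here'

# As a regex, '.' matches any character, so both lines are changed.
$asRegex = @($lines -replace 'a.b')

# Escaped, the pattern matches only the literal text 'a.b'.
$asLiteral = @($lines -replace ([regex]::Escape('a.b')))
```

With a single operand after -replace, the matched text is replaced by the empty string, which is what removes it from each line.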

How do I remove carriage returns from text file using Powershell?

I'm outputting the contents of a directory to a txt file using the following command:
$SearchPath="c:\searchpath"
$Outpath="c:\outpath"
Get-ChildItem "$SearchPath" -Recurse | where {!$_.psiscontainer} | Format-Wide -Column 1 `
| Out-File "$OutPath\Contents.txt" -Encoding ASCII -Width 200
What I end up with when I do this is a txt file with the information I need, but it adds numerous carriage returns I don't need, making the output harder to read.
This is what it looks like:
c:\searchpath\directory
name of file.txt
name of another file.txt
c:\searchpath\another directory
name of some file.txt
That makes a txt file that requires a lot of scrolling, but the actual information isn't that much, usually a lot less than a hundred lines.
I would like for it to look like:
c:\searchpath\directory
nameoffile.txt
c:\searchpath\another directory
another file.txt
This is what I've tried so far, not working
$configFiles=get-childitem "c:\outpath\*.txt" -rec
foreach ($file in $configFiles)
{
(Get-Content $file.PSPath) |
Foreach-Object {$_ -replace "'n", ""} |
Set-Content $file.PSPath
}
I've also tried 'r but both options leave the file unchanged.
Another attempt:
Select-String -Pattern "\w" -Path 'c:\outpath\contents.txt' | foreach {$_.line} `
| Set-Content -Path c:\outpath\contents2.txt
When I run that command without the Set-Content at the end, the output appears exactly as I need it in the ISE, but as soon as I add the Set-Content at the end, it once again adds carriage returns where I don't need them.
Here's something interesting: if I create a text file with a few carriage returns and a few tabs, and then use the same -replace script I've been using, but with `t to replace the tabs, it works perfectly. But `r and `n do not work. It's almost as though it doesn't recognize them as escape characters. But if I add `r and `n in the txt file and then run the script, it still doesn't replace anything. It doesn't seem to know what to do with them.
Set-Content adds newlines by default. Replacing Set-Content by Out-File in your last attempt in your question will give you the file you want:
Select-String -Pattern "\w" -Path 'c:\outpath\contents.txt' | foreach {$_.line} |
Out-File -FilePath c:\outpath\contents2.txt
It's not 'r (apostrophe), it's a back tick: `r. That's the key above the tab key on the US keyboard layout. :)
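To illustrate the difference, here is a short sketch with made-up sample strings. Escape sequences such as `n are only expanded inside double-quoted strings. Note also that Get-Content returns each line without its line ending, so a per-line -replace of `r or `n has nothing to match; the last statement therefore works on a single multi-line string instead.

```powershell
# Single quotes: the backtick and the n stay as two literal characters.
$single = 'a`nb'

# Double quotes: `n is expanded to one line-feed character.
$double = "a`nb"

# A multi-line string with one blank line between 'one' and 'two'.
$text = "one`r`n`r`ntwo`r`n"

# Collapse runs of two or more line breaks into a single CR+LF.
$collapsed = $text -replace "(`r?`n){2,}", "`r`n"
```

Here $single is four characters long, $double is three, and $collapsed no longer contains the blank line.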
You can simply avoid all those empty lines by using Select-Object -ExpandProperty Name:
Get-ChildItem "$SearchPath" -Recurse |
Where { !$_.PSIsContainer } |
Select-Object -ExpandProperty Name |
Out-File "$OutPath\Contents.txt" -Encoding ASCII -Width 200
... if you don't need the folder names.

Using PowerShell, read multiple known file names, append text of all files, create and write to one output file

I have five .sql files and know the name of each file. For this example, call them one.sql, two.sql, three.sql, four.sql and five.sql. I want to append the text of all files and create one file called master.sql. How do I do this in PowerShell? Feel free to post multiple answers to this problem because I am sure there are several ways to do this.
My attempt does not work and creates a file with several hundred thousand lines.
PS C:\sql> get-content '.\one.sql' | get-content '.\two.sql' | get-content '.\three.sql' | get-content '.\four.sql' | get-content '.\five.sql' | out-file -encoding UNICODE master.sql
Get-Content one.sql,two.sql,three.sql,four.sql,five.sql > master.sql
Note that > is equivalent to Out-File -Encoding Unicode. I only tend to use Out-File when I need to specify a different encoding.
There are some good answers here but if you have a whole lot of files and maybe you don't know all of the names this is what I came up with:
$vara = get-childitem -name "path"
$varb = foreach ($a in $vara) {gc "path\$a"}
example
$vara = get-childitem -name "c:\users\test"
$varb = foreach ($a in $vara) {gc "c:\users\test\$a"}
You can obviously pipe this directly into | add-content or whatever but I like to capture in variables so I can manipulate later on.
See if this works better
get-childitem "one.sql","two.sql","three.sql","four.sql","five.sql" | get-content | out-file -encoding UNICODE master.sql
I needed something similar, Chris Berry's post helped, but I think this is more efficient:
gci -name "*PathToFiles*" | gc > master.sql
The first part gci -name "*PathToFiles*" gets you your file list. This can be done with wildcards to just get your .sql files i.e. gci -name "\\share\folder\*.sql"
It then pipes to Get-Content and redirects the output to your master.sql file. As noted by Keith Hill, you can use Out-File in place of > to better control your output if needed.
I think the logical way of solving this is to use Add-Content:
$files = Get-ChildItem '.\one.sql', '.\two.sql', '.\three.sql', '.\four.sql', '.\five.sql'
$files | foreach { Get-Content $_ | Add-Content '.\master.sql' -encoding UNICODE }
However, Get-Content is usually very slow when reading multiple very large files. If that's your case, this article could help: http://keithhill.spaces.live.com/blog/cns!5A8D2641E0963A97!756.entry
What about:
Get-Content .\one.sql,.\two.sql,.\three.sql,.\four.sql,.\five.sql | Set-Content .\master.sql
Here is how I concatenate SQL files from the Sql folder:
# Set the current location of the script to use relative path
Set-Location $PSScriptRoot
# Concatenate all the sql files
$concatSql = Get-Content -Path .\Sql\*.sql
# Append the sql to a single file (Add-Content appends; use Set-Content to overwrite)
Add-Content -Path concatFile.sql -Value $concatSql