Merge same named .txt file in different directories? - powershell

I have a folder that is filled with sub folders of past dates (20120601 for example), inside each date folder there is a file named test.txt, along with another file named example.txt. How can I merge all the test.txt files into one?
I am trying to do this in Windows and have access to Windows PowerShell and Windows Command Processor (cmd.exe). What would be the best way to do this?
My hierarchy would look something like this:
\Data
\20120601
test.txt
example.txt
\20120602
test.txt
example.txt
\20120603
test.txt
example.txt
\20120604
test.txt
example.txt
\20120605
test.txt
example.txt
I would imagine it is something like
copy *\test.txt alltestfiles.txt
Is that possible? Can you specify a wildcard for a directory?

Fairly easy, actually:
Get-ChildItem \Data -Recurse -Include test.txt |
Get-Content |
Out-File -Encoding UTF8 alltestfiles.txt
or shorter:
ls \Data -r -i test.txt | gc | sc -enc UTF8 alltestfiles.txt
This first gathers all test.txt files, then reads their contents, and finally writes the combined contents to the new file.
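To try the pipeline above without touching real data, here is a minimal sketch that builds a small throwaway tree first. The temp folder and the sample contents are assumptions made up for illustration; Sort-Object is added only to keep the date folders in a predictable order.

```powershell
# Build a throwaway \Data-like tree under the temp folder (hypothetical sample data).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
foreach ($day in '20120601', '20120602') {
    $dir = New-Item -ItemType Directory -Path (Join-Path $root $day) -Force
    Set-Content -Path (Join-Path $dir.FullName 'test.txt') -Value "test from $day"
    Set-Content -Path (Join-Path $dir.FullName 'example.txt') -Value "example from $day"
}

# Gather every test.txt, sort by full path so the dates stay in order, and merge.
$merged = Join-Path $root 'alltestfiles.txt'
Get-ChildItem -Path $root -Recurse -Filter test.txt |
    Sort-Object FullName |
    Get-Content |
    Set-Content -Path $merged -Encoding UTF8

# Show the combined content.
Get-Content $merged
```

The output file has a different name from the inputs, so it is never picked up by the test.txt filter while it is being written.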

List all the matching files, then read each file's content and append it to the combined file, like so:
cd data
$files = gci -Recurse | ? { -not $_.psIsContainer -and $_.name -like "test.txt" }
$files | % { Add-Content -Path .\AllTests.txt -Value (Get-Content $_.FullName) }

A simple command like cat path\**\* > target_single_file also works.

Related

How to list file and folder names in powershell?

I want to write all files and folders' names into a .gitignore file like the below:
Folder1
Folder2
File1.bar
File2.foo
and so on.
The writing part can be achieved with the Out-File command, but I'm stuck in printing those names like the format above.
I'm aware of the command Get-ChildItem but it prints out a bunch of metadata like dates and icons too which are useless for the matter. btw, I'm looking for a single-line command, not a script.
Just print the Name property of the files
(ls).Name > .gitignore
(Get-ChildItem).Name | Out-File .gitignore
I'm aware of the command Get-ChildItem but it prints out a bunch of metadata like dates and icons [...]
That's because PowerShell cmdlets output complex objects rather than raw strings. The metadata you're seeing for a file is all attached to a FileInfo object that describes the underlying file system entry.
To get only the names, simply reference the Name property of each. For this, you can use the ForEach-Object cmdlet:
# Enumerate all the files and folders
$fileSystemItems = Get-ChildItem some\root\path -Recurse |Where-Object Name -ne .gitignore
# Grab only their names
$fileSystemNames = $fileSystemItems |ForEach-Object Name
# Write to .gitignore (beware git usually expects ascii or utf8-encoded configs)
$fileSystemNames |Out-File -LiteralPath .gitignore -Encoding ascii
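The same idea can be tried end to end in a scratch folder. This is only a sketch: the folder and file names below are made up, and Select-Object -ExpandProperty is used as one more way to turn the objects into plain name strings.

```powershell
# Scratch folder with hypothetical entries, so the sketch is self-contained.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
New-Item -ItemType Directory -Path (Join-Path $root 'Folder1') -Force | Out-Null
Set-Content -Path (Join-Path $root 'File1.bar') -Value 'demo'
Set-Content -Path (Join-Path $root 'File2.foo') -Value 'demo'

# Get-ChildItem emits objects; -ExpandProperty unwraps Name into plain strings.
$names = Get-ChildItem -Path $root |
    Where-Object { $_.Name -ne '.gitignore' } |
    Select-Object -ExpandProperty Name |
    Sort-Object

# Write one name per line, ASCII-encoded as in the answer above.
$names | Set-Content -Path (Join-Path $root '.gitignore') -Encoding ascii
$names
```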
Would this do?
(get-childitem -Path .\ | select name).name | Out-File .gitignore

Delete first n lines in file (.zip file with many "strange" symbols)

This PowerShell code deletes the first 4 lines of a file:
(gc "old.zip" | select -Skip 4) | sc "new.zip"
But old.zip has Unix (LF) line endings, and this code also converts the line endings to Windows (CR LF).
How can I delete the first 4 lines without converting the line endings?
Because the .zip file contains many "strange" symbols, other ways of removing the first n lines do not work. For example, more +4 "old.zip" > "new.zip" in cmd does not work.
With PowerShell the lines are removed, but not without problems.
Do you know other ways to remove the first n lines of a .zip file?
Something like this?
$PathZipFile="c:\temp\File_1.zip"
$NewPathZipFile="c:\temp\NewFile_1.zip"
# Create a temporary directory name
$TemporyDir=[System.IO.Path]::Combine("c:\temp", [System.IO.Path]::GetRandomFileName())
# Extract the archive into the temporary directory
Expand-Archive $PathZipFile -DestinationPath $TemporyDir
# Remove the first 4 lines of every file and add each file to the new archive
Get-ChildItem $TemporyDir -file | %{
(Get-Content $_.FullName | select -Skip 4) | Set-Content -Path $_.FullName;
Compress-Archive $_.FullName $NewPathZipFile -Update
}
Remove-Item $TemporyDir -Recurse -Force
PowerShell:
(gc "old.txt" | select -Skip 4 | Out-String) -replace "`r`n", "`n" | Out-File "new.txt"
C#:
File.WriteAllText("new.txt", string.Join("\n", File.ReadLines("old.txt").Skip(4)));
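If the file is not really text, a byte-level approach avoids both the encoding and the line-ending conversion. The sketch below is an assumption-laden illustration, not a tested tool: Remove-FirstLines is a hypothetical helper name, it treats the LF byte (0x0A) as the line terminator, and it loads the whole file into memory.

```powershell
# Hypothetical helper: drop everything up to and including the Nth LF byte.
# Pass full paths: .NET file APIs resolve relative paths against the process
# working directory, which can differ from the PowerShell location.
function Remove-FirstLines {
    param([string]$Path, [string]$Destination, [int]$Count = 4)

    $bytes = [System.IO.File]::ReadAllBytes($Path)
    $offset = 0
    $seen = 0
    # Walk forward until $Count line feeds (byte value 10) have been passed.
    while ($seen -lt $Count -and $offset -lt $bytes.Length) {
        if ($bytes[$offset] -eq 10) { $seen++ }
        $offset++
    }
    # Copy the remaining bytes untouched, so LF stays LF.
    $rest = New-Object byte[] ($bytes.Length - $offset)
    [System.Array]::Copy($bytes, $offset, $rest, 0, $rest.Length)
    [System.IO.File]::WriteAllBytes($Destination, $rest)
}

# Usage with a small LF-terminated sample file (made-up content).
$src = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
$dst = "$src.out"
[System.IO.File]::WriteAllText($src, "1`n2`n3`n4`n5`n6`n")
Remove-FirstLines -Path $src -Destination $dst -Count 4
```

Because nothing is decoded or re-encoded, bytes that are not valid text pass through unchanged.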

Concatenate files using PowerShell

I am using PowerShell 3.
What is best practice for concatenating files?
file1.txt + file2.txt = file3.txt
Does PowerShell provide a facility for performing this operation directly? Or do I need each file's contents be loaded into local variables?
If all the files exist in the same directory and can be matched by a simple pattern, the following code will combine all files into one.
Get-Content .\File?.txt | Out-File .\Combined.txt
I would go this route:
Get-Content file1.txt, file2.txt | Set-Content file3.txt
Use the -Encoding parameter on Set-Content if you need something other than its default encoding (the system's ANSI code page in Windows PowerShell, BOM-less UTF-8 in PowerShell 6 and later).
If you need more flexibility, you could use something like
Get-ChildItem -Recurse *.cs | ForEach-Object { Get-Content $_.FullName } | Out-File -FilePath .\all.txt
Warning: Concatenation using a simple Get-Content (with or without the -Raw flag) works only for text files; for anything else, PowerShell is too helpful:
Without -Raw, it "fixes" (i.e. breaks, pun intended) line breaks, or what PowerShell thinks is a line break.
With -Raw, you get a terminating line end (normally CR+LF) at the end of each file part, which is added at the end of the pipeline. Newer PowerShell versions have a Set-Content option to suppress it (-NoNewline).
To concatenate a binary file (that is, an arbitrary file that was split for some reason and needs to be put together again), use either this:
Get-Content -Raw file1, file2 | Set-Content -NoNewline destination
or something like this:
Get-Content file1 -Encoding Byte -Raw | Set-Content destination -Encoding Byte
Get-Content file2 -Encoding Byte -Raw | Add-Content destination -Encoding Byte
An alternative is to use the CMD shell and use
copy file1 /b + file2 /b + file3 /b + ... destinationfile
You must not overwrite any part, that is, use any of the parts as destination. The destination file must be different from any of the parts. Otherwise you're up for a surprise and must find a backup copy of the file part.
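Another way to stay byte-exact from PowerShell is to copy the parts through .NET streams, which never interprets the data as text. This is a minimal sketch under stated assumptions: Join-BinaryFile is a hypothetical helper name, and the destination must be a different file from every part, for the reason given above.

```powershell
# Hypothetical helper: append each part to the destination, byte for byte.
# Pass full paths, since .NET resolves relative paths against the process
# working directory rather than the PowerShell location.
function Join-BinaryFile {
    param([string[]]$Part, [string]$Destination)

    $out = [System.IO.File]::Create($Destination)
    try {
        foreach ($p in $Part) {
            $in = [System.IO.File]::OpenRead($p)
            try { $in.CopyTo($out) } finally { $in.Dispose() }
        }
    }
    finally {
        $out.Dispose()
    }
}

# Usage with two tiny made-up parts containing non-text bytes.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$a = Join-Path $dir 'part1.bin'
$b = Join-Path $dir 'part2.bin'
$joined = Join-Path $dir 'joined.bin'
[System.IO.File]::WriteAllBytes($a, [byte[]](0, 13, 10, 255))
[System.IO.File]::WriteAllBytes($b, [byte[]](10, 1))
Join-BinaryFile -Part $a, $b -Destination $joined
```

Streaming also avoids loading large parts into memory all at once.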
A generalization based on @Keith's answer:
gc <some wildcard pattern> | sc output
Here is an interesting example of how to make a zip-in-image file based on Powershell 7
Get-Content -AsByteStream file1.png, file2.7z | Set-Content -AsByteStream file3.png
Get-Content -AsByteStream file1.png, file2.7z | Add-Content -AsByteStream file3.png
gc file1.txt, file2.txt > output.txt
I think this is as short as it gets.
If you want to ensure the concatenation is done in a specific order, pipe the files through Sort-Object -Property <Some Name>. For example, to concatenate in ascending order by name:
Get-ChildItem -Path ./* -Include *.txt -Exclude output.txt | Sort-Object -Property Name | ForEach-Object { Get-Content $_ } | Out-File output.txt
IMPORTANT: -Exclude and Out-File MUST name the same file. Otherwise the output file is read as input while it is being written, and it keeps growing until your disk is full.
Note that you must append a * at the end of the -Path argument because you are using -Include, as mentioned in Get-ChildItem documentation.

Combine content of several files in folder

I have around 30 directories with .log files in them. I want to go into each folder and combine the text of all the files in the sub-directories separately. I do not want to combine the text of all the files in all the sub-directories.
Example
I have a directory called Machines
in Machines\ I have
Machine2\
Machine3\
Machine4\
Within each Machine* folder, I have :
1.log
2.log
3.log
etc..
I want to create a script that will do:
First: Go into the directory Machine2 and combine the text of all text files in that directory
Second: Go into the Machine3 directory and combine the text of all text files in that directory.
I can use the command below if I only had one folder, but I need it to loop through several subfolders so that I do not have to enter each sub-directory in the command.
Get-ChildItem -path "W:\Machines\Machine2" -recurse |?{ ! $_.PSIsContainer } |?{($_.name).contains(".log")} | %{ Out-File -filepath c:\machine1.txt -inputobject (get-content $_.fullname) -Append}
I think a recursive solution would work well. Given a directory, grab the content of all *.log files and dump into COMBINED.txt. Then pull the names of all subdirectories, and repeat for each.
function CombineLogs
{
param([string] $startingDir)
dir $startingDir -Filter *.log | Get-Content | Out-File (Join-Path $startingDir COMBINED.txt)
dir $startingDir |?{ $_.PsIsContainer } |%{ CombineLogs $_.FullName }
}
CombineLogs 'c:\logs'
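If the Machine* folders are all direct children of one directory, recursion is not strictly needed. The following is a rough sketch with made-up sample data; it writes one combined file per folder, named after the folder, next to the folders rather than inside them (that placement is an assumption, chosen so the output is never read back as input).

```powershell
# Throwaway tree standing in for W:\Machines (hypothetical sample data).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
foreach ($machine in 'Machine2', 'Machine3') {
    $dir = New-Item -ItemType Directory -Path (Join-Path $root $machine) -Force
    foreach ($n in 1, 2) {
        Set-Content -Path (Join-Path $dir.FullName "$n.log") -Value "$machine log $n"
    }
}

# For each subfolder, combine its own .log files into <root>\<FolderName>.txt.
Get-ChildItem -Path $root | Where-Object { $_.PSIsContainer } | ForEach-Object {
    $target = Join-Path $root ($_.Name + '.txt')
    # Sort-Object Name sorts as text, so 10.log would come before 2.log.
    Get-ChildItem -Path $_.FullName -Filter *.log |
        Sort-Object Name |
        Get-Content |
        Set-Content -Path $target
}

Get-Content (Join-Path $root 'Machine2.txt')
```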

Using PowerShell, read multiple known file names, append text of all files, create and write to one output file

I have five .sql files and know the name of each file. For this example, call them one.sql, two.sql, three.sql, four.sql and five.sql. I want to append the text of all files and create one file called master.sql. How do I do this in PowerShell? Feel free to post multiple answers to this problem because I am sure there are several ways to do this.
My attempt does not work and creates a file with several hundred thousand lines.
PS C:\sql> get-content '.\one.sql' | get-content '.\two.sql' | get-content '.\three.sql' | get-content '.\four.sql' | get-content '.\five.sql' | out-file -encoding UNICODE master.sql
Get-Content one.sql,two.sql,three.sql,four.sql,five.sql > master.sql
Note that > is equivalent to Out-File -Encoding Unicode. I only tend to use Out-File when I need to specify a different encoding.
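When the encoding matters, the same one-liner can be written with an explicit Out-File. The sketch below uses a scratch folder and placeholder SQL text, both invented for illustration; the files are read in exactly the order the names are listed.

```powershell
# Scratch folder with three small .sql files (placeholder content).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$names = 'one.sql', 'two.sql', 'three.sql'
foreach ($name in $names) {
    Set-Content -Path (Join-Path $dir $name) -Value "-- contents of $name"
}

# Get-Content accepts an array of paths and reads them in the given order.
$paths = $names | ForEach-Object { Join-Path $dir $_ }
$master = Join-Path $dir 'master.sql'
Get-Content -Path $paths | Out-File -FilePath $master -Encoding UTF8

Get-Content $master
```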
There are some good answers here but if you have a whole lot of files and maybe you don't know all of the names this is what I came up with:
$vara = get-childitem -name "path"
$varb = foreach ($a in $vara) {gc "path\$a"}
example
$vara = get-childitem -name "c:\users\test"
$varb = foreach ($a in $vara) {gc "c:\users\test\$a"}
You can obviously pipe this directly into | add-content or whatever but I like to capture in variables so I can manipulate later on.
See if this works better
get-childitem "one.sql","two.sql","three.sql","four.sql","five.sql" | get-content | out-file -encoding UNICODE master.sql
I needed something similar, Chris Berry's post helped, but I think this is more efficient:
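gci "*PathToFiles*" | gc > master.sql
The first part, gci "*PathToFiles*", gets you your file list. This can be done with wildcards to get just your .sql files, e.g. gci "\\share\folder\*.sql". (Without -Name, gci emits file objects that carry their full path, so gc finds them regardless of the current directory.)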
The file list is then piped to Get-Content and the output is redirected to your master.sql file. As noted by Keith Hill, you can use Out-File in place of > to better control your output if needed.
I think the logical way of solving this is to use Add-Content:
$files = Get-ChildItem '.\one.sql', '.\two.sql', '.\three.sql', '.\four.sql', '.\five.sql'
$files | foreach { Get-Content $_ | Add-Content '.\master.sql' -encoding UNICODE }
However, Get-Content is usually very slow when reading multiple very large files. If that is your case, this article could help: http://keithhill.spaces.live.com/blog/cns!5A8D2641E0963A97!756.entry
What about:
Get-Content .\one.sql,.\two.sql,.\three.sql,.\four.sql,.\five.sql | Set-Content .\master.sql
Here is how I concatenate SQL files from the Sql folder:
# Set the current location of the script to use relative path
Set-Location $PSScriptRoot
# Concatenate all the sql files
$concatSql = Get-Content -Path .\Sql\*.sql
# Append the SQL to a single file (Add-Content appends; use Set-Content to overwrite)
Add-Content -Path concatFile.sql -Value $concatSql