Compare two files and update the differences to the 2nd file - PowerShell

I am trying to use PowerShell with Get-Content to read 2 files and update file 2 with the changes from file 1. Here is my code:
Compare-Object (Get-Content c:\file1) (Get-Content c:file2) | diff > (Get-Content c:file2)
and it's not working. I need it to append any changes to the 2nd file.

A couple of issues here:
You are calling diff, but in PowerShell that is an alias for Compare-Object, as you can see from get-alias diff. I am guessing that was not expected.
If you want to append the differences that occur in the first file, you need to filter the output from Compare-Object accordingly.
So with that in mind I present...
$file1 = "c:\file1"
$file2 = "c:\file2"
Compare-Object (Get-Content $file1) (Get-Content $file2) | Where-Object{$_.SideIndicator -eq "<="} | Add-Content $file2
$_.SideIndicator -eq "<=" will only allow the entries that are unique to $file1 to continue through the pipe to Add-Content. If you just look at the output of Compare-Object before the Where-Object, you can get a good idea of what's going on.
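For example, with two hypothetical files (the line values here are made up), the unfiltered Compare-Object output looks something like this:
PS> Compare-Object (Get-Content $file1) (Get-Content $file2)

InputObject    SideIndicator
-----------    -------------
onlyInFile2    =>
onlyInFile1    <=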

Related

Use PowerShell to compare two text files and remove duplicate lines

I have two text files that contain many duplicate lines. I would like to run a PowerShell statement that will output a new file with only the values NOT already in the first file. Below is an example of two files.
File1.txt
-----------
Alpha
Bravo
Charlie
File2.txt
-----------
Alpha
Echo
Foxtrot
In this case, only Echo and Foxtrot are not in the first file. So these would be the desired results.
OutputFile.txt
------------
Echo
Foxtrot
I reviewed the link below, which is similar to what I want, but it does not write the results to an output file.
Remove lines from file1 that exist in file2 in Powershell
Here's one way to do it:
# Get unique values from first file
$uniqueFile1 = (Get-Content -Path .\File1.txt) | Sort-Object -Unique
# Get lines in second file that aren't in first and save to a file
Get-Content -Path .\File2.txt | Where-Object { $uniqueFile1 -notcontains $_ } | Out-File .\OutputFile.txt
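If File1.txt is very large, the -notcontains test scans the whole array for every line of File2.txt. A minimal sketch of a faster variant (my own variation, not part of the original answer) using a .NET HashSet for constant-time lookups:
# Build a set of File1's lines for fast membership tests
$set = New-Object 'System.Collections.Generic.HashSet[string]'
Get-Content -Path .\File1.txt | ForEach-Object { [void]$set.Add($_) }
# Keep only the File2 lines not present in the set
# (note: Contains is case-sensitive here, unlike -notcontains)
Get-Content -Path .\File2.txt | Where-Object { -not $set.Contains($_) } | Out-File .\OutputFile.txt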
Using the approach in the referenced link will work; however, for every line in the original file it will trigger the second file to be read from disk. This could be painful depending on the size of your files. I think the following approach will meet your needs.
$file1 = Get-Content .\File1.txt
$file2 = Get-Content .\File2.txt
$compareParams = @{
    ReferenceObject  = $file1
    DifferenceObject = $file2
}
Compare-Object @compareParams |
Where-Object -Property SideIndicator -eq '=>' |
Select-Object -ExpandProperty InputObject |
Out-File -FilePath .\OutputFile.txt
This code does the following:
Reads each file into a separate variable
Creates a hashtable for the parameters of Compare-Object (see about_Splatting for more information)
Compares the two files in memory and passes the results to Out-File
Writes the contents of the pipeline to "OutputFile.txt"
If you are comfortable with the overall flow of this, and are only using this in one-off situations, the whole thing can be compressed into a one-liner.
(Compare-Object (gc .\File1.txt) (gc .\File2.txt) | ? SideIndicator -eq '=>').InputObject | Out-File .\OutputFile.txt
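Note that the simplified Where-Object syntax in the one-liner (? SideIndicator -eq '=>') requires PowerShell 3.0 or later; on 2.0, use the script-block form ? {$_.SideIndicator -eq '=>'} instead.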

Powershell Find and Replace Loop, OutOfMemoryException

I have a working PowerShell script to find and replace a few different strings with a new string in thousands of files, without changing the modified date on the files. In any given file there could be hundreds of instances of said strings to replace. The files themselves aren't very large, probably ranging from 1-50MB (a quick glance at the directory I am testing with shows the largest as ~33MB).
I'm running the script inside a Server 2012 R2 VM with 4 vCPUs and 4GB of RAM. I have set the MaxMemoryPerShellMB value for PowerShell to 3GB. As mentioned previously, the script works, but after 2-4 hours PowerShell will start throwing OutOfMemoryExceptions and crash. The script is 'V2 friendly' and I haven't adapted it to V3+, but I doubt that matters too much.
My question is whether or not the script can be improved to prevent/eliminate the memory exceptions I am running into at the moment. I don't mind if it runs slower, as long as it can get the job done without having to check back every couple of hours and restart it.
$i=0
$all = Get-ChildItem -Recurse -Include *.txt
$scriptfiles = Select-String -Pattern string1,string2,string3 $all
$output = "C:\Temp\scriptoutput.txt"
foreach ($file in $scriptFiles)
{
$filecreate=(Get-ChildItem $file.Path).creationtime
$fileaccess=(Get-ChildItem $file.Path).lastaccesstime
$filewrite=(Get-ChildItem $file.Path).lastwritetime
"$file.Path,Created: $filecreate,Accessed: $fileaccess,Modified: $filewrite" | out-file -FilePath $output -Append
(Get-Content $file.Path) | ForEach-Object {$_ -replace "string1", "newstring" `
-replace "string2", "newstring" `
-replace "string3", "newstring"
} | Set-Content $file.Path
(Get-ChildItem $file.Path).creationtime=$filecreate
(Get-ChildItem $file.Path).lastaccesstime=$fileaccess
(Get-ChildItem $file.Path).lastwritetime=$filewrite
$filecreate=(Get-ChildItem $file.Path).creationtime
$fileaccess=(Get-ChildItem $file.Path).lastaccesstime
$filewrite=(Get-ChildItem $file.Path).lastwritetime
"$file.Path,UPDATED Created: $filecreate,UPDATED Accessed: $fileaccess,UPDATED Modified: $filewrite" | out-file -FilePath $output -Append
$i++}
Any comments, criticisms, and suggestions welcomed.
Thanks
The biggest issue I can see is that you are repeatedly getting the file for every property you are querying. Replace that with one call per loop pass and save the result to be used during the pass. Also, Out-File is one of the slower methods of outputting data to a file.
$output = "C:\Temp\scriptoutput.txt"
$scriptfiles = Get-ChildItem -Recurse -Include *.txt |
Select-String -Pattern string1,string2,string3 |
Select-Object -ExpandProperty Path
$scriptfiles | ForEach-Object{
    $file = Get-Item $_
    # Save current file times
    $filecreate=$file.creationtime
    $fileaccess=$file.lastaccesstime
    $filewrite=$file.lastwritetime
    "$file,Created: $filecreate,Accessed: $fileaccess,Modified: $filewrite"
    # Update content.
    (Get-Content $file) -replace "string1", "newstring" `
        -replace "string2", "newstring" `
        -replace "string3", "newstring" | Set-Content $file
    # Write all the original times back.
    $file.creationtime=$filecreate
    $file.lastaccesstime=$fileaccess
    $file.lastwritetime=$filewrite
    # Verify the changes... Should not be required but it is what you were doing.
    $filecreate=$file.creationtime
    $fileaccess=$file.lastaccesstime
    $filewrite=$file.lastwritetime
    "$file,UPDATED Created: $filecreate,UPDATED Accessed: $fileaccess,UPDATED Modified: $filewrite"
} | Set-Content $output
Not tested but should be fine.
Depending on what your replacements actually look like, you could probably save some time there as well. Test first before running in production, obviously.
I removed the counter you had since it was never used elsewhere in the code.
Your logging could easily be CSV-based since you have all the objects ready to go, but I just want to be sure we are on the right track before we go too far.
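To illustrate the CSV idea, here is a minimal sketch (assuming PowerShell 3.0+ for [PSCustomObject]; the output path is just an example, and the content-replacement steps are omitted for brevity) that swaps the two logging strings for objects:
$scriptfiles | ForEach-Object{
    $file = Get-Item $_
    # Emit one object per file; Export-Csv turns the properties into columns
    [PSCustomObject]@{
        Path     = $file.FullName
        Created  = $file.CreationTime
        Accessed = $file.LastAccessTime
        Modified = $file.LastWriteTime
    }
} | Export-Csv 'C:\Temp\scriptoutput.csv' -NoTypeInformation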

Can I combine these CSV files into 1 larger CSV file?

My Old Bat file
Copy F:\File.hdr+F:*.csv F:\FinalOutput.csv
the HDR file is a single entry file that has only header data for the CSV files
Is there a way to perform this in PowerShell (to combine all the CSV files into a single file)?
Here is my PowerShell script that doesn't work:
$CSVFolder = 'F:\Input\';
$OutputFile = 'F:\Output\NewOutput.csv';
$CSV = @();
Get-ChildItem -Path $CSVFolder -Filter *.inv | ForEach-Object {
    $CSV += @(Import-Csv -Path $CSVFolder\$_)
}
$CSVHeader = Import-Csv 'F:\Input\Headings.hdr'
$CSV = $CSVHeader + $CSV
$CSV | Export-Csv -Path $OutputFile -NoTypeInformation -Force;
I get the list of FileNames that are exported and not the content of the Files.
The script is also modifying the date/time stamp on my INV files. It shouldn't be doing that.
You can skip the whole CSV bit if you just append the files as you would before.
Something like this should work:
# First we create the new file and add the header.
get-content $headerfile | set-content $outputfile
# Then we get the input files, read them out with get-content
# and append them to the output file (add-content).
get-childitem -path $csvfolder *.inv | get-content | add-content $outputfile
The CSV cmdlets are handy if you want to process the CSV data in your script, but in your case simply appending the files will do the trick. Not bothering with the CSV conversion will be a lot faster, as PowerShell doesn't have to parse the CSV lines and create PS objects. It's really fast with pure text, though.
Another trick here is how get-content and add-content are used in the pipeline. Since they are pipeline-aware, you can pass in file objects without having to use a foreach loop. This makes your statements a lot shorter.
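For comparison, a sketch of the same append written with an explicit loop, which is functionally equivalent, just more verbose:
# Equivalent to the pipeline version above, one file at a time
foreach ($f in Get-ChildItem -Path $csvfolder -Filter *.inv) {
    Get-Content $f.FullName | Add-Content $outputfile
}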
How about:
get-childitem *.inv | foreach-object {
import-csv $_ -header (get-content Headings.hdr)
} | export-csv NewOutput.csv -notypeinformation
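Note this variant treats each .inv file as a headerless CSV and applies column names via -Header. For that to work, Headings.hdr would need to contain one column name per line (or be split on commas first), since -Header expects an array of names rather than a single comma-separated string.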

How do I remove carriage returns from text file using Powershell?

I'm outputting the contents of a directory to a txt file using the following command:
$SearchPath="c:\searchpath"
$Outpath="c:\outpath"
Get-ChildItem "$SearchPath" -Recurse | where {!$_.psiscontainer} | Format-Wide -Column 1'
| Out-File "$OutPath\Contents.txt" -Encoding ASCII -Width 200
What I end up with when I do this is a txt file with the information I need, but it adds numerous carriage returns I don't need, making the output harder to read.
This is what it looks like:
c:\searchpath\directory
name of file.txt
name of another file.txt
c:\searchpath\another directory
name of some file.txt
That makes a txt file that requires a lot of scrolling, but the actual information isn't that much, usually a lot less than a hundred lines.
I would like for it to look like:
c:\searchpath\directory
nameoffile.txt
c:\searchpath\another directory
another file.txt
This is what I've tried so far, which is not working:
$configFiles=get-childitem "c:\outpath\*.txt" -rec
foreach ($file in $configFiles)
{
    (Get-Content $file.PSPath) |
        Foreach-Object {$_ -replace "'n", ""} |
        Set-Content $file.PSPath
}
I've also tried 'r but both options leave the file unchanged.
Another attempt:
Select-String -Pattern "\w" -Path 'c:\outpath\contents.txt' | foreach {$_.line} |
    Set-Content -Path c:\outpath\contents2.txt
When I run that string without the Set-Content at the end, it appears exactly as I need it in the ISE, but as soon as I add the Set-Content at the end, it once again adds carriage returns where I don't need them.
Here's something interesting: if I create a text file with a few carriage returns and a few tabs, then use the same -replace script I've been using but with `t to replace the tabs, it works perfectly. But `r and `n do not work. It's almost as though it doesn't recognize them as escape characters. But if I add `r and `n in the txt file then run the script, it still doesn't replace anything. It doesn't seem to know what to do with them.
Set-Content adds newlines by default. Replacing Set-Content with Out-File in your last attempt will give you the file you want:
Select-String -Pattern "\w" -Path 'c:\outpath\contents.txt' | foreach {$_.line} |
Out-File -FilePath c:\outpath\contents2.txt
It's not 'r (apostrophe), it's a back tick: `r. That's the key above the tab key on the US keyboard layout. :)
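A quick hypothetical demo of the difference:
PS> "a`tb" -match "`t"    # backtick escape: a real tab character
True
PS> "a`tb" -match "'t"    # apostrophe-t: a literal apostrophe followed by t
False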
You can simply avoid all those empty lines by using Select-Object -ExpandProperty Name:
Get-ChildItem "$SearchPath" -Recurse |
Where { !$_.PSIsContainer } |
Select-Object -ExpandProperty Name |
Out-File "$OutPath\Contents.txt" -Encoding ASCII -Width 200
... if you don't need the folder names.
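If you do want the folder names as headings (the layout shown in the question), a sketch along these lines groups the files by directory, one heading per folder:
Get-ChildItem "$SearchPath" -Recurse |
    Where { !$_.PSIsContainer } |
    Group-Object DirectoryName |
    ForEach-Object {
        $_.Name                                # the directory path
        $_.Group | ForEach-Object { $_.Name }  # the file names in it
    } |
    Out-File "$OutPath\Contents.txt" -Encoding ASCII -Width 200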

Using PowerShell, read multiple known file names, append text of all files, create and write to one output file

I have five .sql files and know the name of each file. For this example, call them one.sql, two.sql, three.sql, four.sql and five.sql. I want to append the text of all files and create one file called master.sql. How do I do this in PowerShell? Feel free to post multiple answers to this problem because I am sure there are several ways to do this.
My attempt does not work and creates a file with several hundred thousand lines.
PS C:\sql> get-content '.\one.sql' | get-content '.\two.sql' | get-content '.\three.sql' | get-content '.\four.sql' | get-content '.\five.sql' | out-file -encoding UNICODE master.sql
Get-Content one.sql,two.sql,three.sql,four.sql,five.sql > master.sql
Note that > is equivalent to Out-File -Encoding Unicode. I only tend to use Out-File when I need to specify a different encoding.
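For example, to write the combined file as UTF-8 instead of the Unicode default:
Get-Content one.sql,two.sql,three.sql,four.sql,five.sql | Out-File master.sql -Encoding UTF8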
There are some good answers here, but if you have a whole lot of files, and maybe you don't know all of the names, this is what I came up with:
$vara = get-childitem -name "path"
$varb = foreach ($a in $vara) {gc "path\$a"}
For example:
$vara = get-childitem -name "c:\users\test"
$varb = foreach ($a in $vara) {gc "c:\users\test\$a"}
You can obviously pipe this directly into Add-Content or whatever, but I like to capture it in variables so I can manipulate it later on.
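For example, once captured, writing $varb out later is a one-liner:
$varb | Out-File -Encoding UNICODE master.sql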
See if this works better:
get-childitem "one.sql","two.sql","three.sql","four.sql","five.sql" | get-content | out-file -encoding UNICODE master.sql
I needed something similar; Chris Berry's post helped, but I think this is more efficient:
gci -name "*PathToFiles*" | gc > master.sql
The first part, gci -name "*PathToFiles*", gets you your file list. This can be done with wildcards to grab just your .sql files, e.g. gci -name "\\share\folder\*.sql"
It then pipes to Get-Content and redirects the output to your master.sql file. As noted by Keith Hill, you can use Out-File in place of > to better control your output if needed.
I think the logical way of solving this is to use Add-Content:
$files = Get-ChildItem '.\one.sql', '.\two.sql', '.\three.sql', '.\four.sql', '.\five.sql'
$files | foreach { Get-Content $_ | Add-Content '.\master.sql' -encoding UNICODE }
However, Get-Content is usually very slow when reading multiple very large files. If that's your case, this article could help: http://keithhill.spaces.live.com/blog/cns!5A8D2641E0963A97!756.entry
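If Get-Content itself is the bottleneck, a minimal sketch (my own variation, not from the article) using System.IO directly; note the .NET methods resolve relative paths against the process working directory, so the full paths are built explicitly:
# Delete any existing master.sql first if you want a fresh file
$dest = Join-Path (Get-Location) 'master.sql'
foreach ($name in 'one.sql','two.sql','three.sql','four.sql','five.sql') {
    $src = Join-Path (Get-Location) $name
    # ReadAllText/AppendAllText avoid per-line pipeline overhead
    [System.IO.File]::AppendAllText($dest, [System.IO.File]::ReadAllText($src), [System.Text.Encoding]::Unicode)
}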
What about:
Get-Content .\one.sql,.\two.sql,.\three.sql,.\four.sql,.\five.sql | Set-Content .\master.sql
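One caveat: Set-Content writes ANSI ("Default") encoding in Windows PowerShell, so add -Encoding Unicode if you want the same output as the > and Out-File examples above.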
Here is how I concatenate the sql files from the Sql folder:
# Set the current location of the script to use relative path
Set-Location $PSScriptRoot
# Concatenate all the sql files
$concatSql = Get-Content -Path .\Sql\*.sql
# Append the combined sql to a single file (Add-Content appends; use Set-Content to overwrite)
Add-Content -Path concatFile.sql -Value $concatSql