PowerShell Find and Replace Loop, OutOfMemoryException

I have a working PowerShell script to find and replace a few different strings with a new string in thousands of files, without changing the modified date on the files. In any given file there could be hundreds of instances of said strings to replace. The files themselves aren't very large, probably ranging from 1-50MB (a quick glance at the directory I am testing with shows the largest as ~33MB).
I'm running the script inside a Server 2012 R2 VM with 4 vCPUs and 4GB of RAM. I have set the MaxMemoryPerShellMB value for PowerShell to 3GB. As mentioned previously, the script works, but after 2-4 hours PowerShell will start throwing OutOfMemoryExceptions and crash. The script is 'V2 friendly' and I haven't adapted it to V3+, but I doubt that matters too much.
My question is whether or not the script can be improved to prevent/eliminate the memory exceptions I am running into at the moment. I don't mind if it runs slower, as long as it can get the job done without having to check back every couple of hours and restart it.
$i=0
$all = Get-ChildItem -Recurse -Include *.txt
$scriptfiles = Select-String -Pattern string1,string2,string3 $all
$output = "C:\Temp\scriptoutput.txt"
foreach ($file in $scriptfiles)
{
    $filecreate = (Get-ChildItem $file.Path).CreationTime
    $fileaccess = (Get-ChildItem $file.Path).LastAccessTime
    $filewrite = (Get-ChildItem $file.Path).LastWriteTime

    "$($file.Path),Created: $filecreate,Accessed: $fileaccess,Modified: $filewrite" | Out-File -FilePath $output -Append

    (Get-Content $file.Path) | ForEach-Object {
        $_ -replace "string1", "newstring" `
           -replace "string2", "newstring" `
           -replace "string3", "newstring"
    } | Set-Content $file.Path

    (Get-ChildItem $file.Path).CreationTime = $filecreate
    (Get-ChildItem $file.Path).LastAccessTime = $fileaccess
    (Get-ChildItem $file.Path).LastWriteTime = $filewrite

    $filecreate = (Get-ChildItem $file.Path).CreationTime
    $fileaccess = (Get-ChildItem $file.Path).LastAccessTime
    $filewrite = (Get-ChildItem $file.Path).LastWriteTime

    "$($file.Path),UPDATED Created: $filecreate,UPDATED Accessed: $fileaccess,UPDATED Modified: $filewrite" | Out-File -FilePath $output -Append

    $i++
}
Any comments, criticisms, and suggestions welcomed.
Thanks

The biggest issue I can see is that you are repeatedly fetching the file for every property you query. Replace that with one Get-Item call per loop pass and reuse the result during the pass. Also, Out-File is one of the slower ways of outputting data to a file.
$output = "C:\Temp\scriptoutput.txt"
$scriptfiles = Get-ChildItem -Recurse -Include *.txt |
    Select-String -Pattern string1,string2,string3 |
    Select-Object -ExpandProperty Path -Unique

$scriptfiles | ForEach-Object {
    $file = Get-Item $_

    # Save current file times
    $filecreate = $file.CreationTime
    $fileaccess = $file.LastAccessTime
    $filewrite = $file.LastWriteTime

    "$file,Created: $filecreate,Accessed: $fileaccess,Modified: $filewrite"

    # Update content.
    (Get-Content $file) -replace "string1", "newstring" `
                        -replace "string2", "newstring" `
                        -replace "string3", "newstring" | Set-Content $file

    # Write all the original times back.
    $file.CreationTime = $filecreate
    $file.LastAccessTime = $fileaccess
    $file.LastWriteTime = $filewrite

    # Verify the changes... should not be required, but it is what you were doing.
    # Refresh re-reads the attributes from disk so the check is real.
    $file.Refresh()
    $filecreate = $file.CreationTime
    $fileaccess = $file.LastAccessTime
    $filewrite = $file.LastWriteTime

    "$file,UPDATED Created: $filecreate,UPDATED Accessed: $fileaccess,UPDATED Modified: $filewrite"
} | Set-Content $output
Not tested but should be fine.
Depending on what your replacements are actually like, you could probably save some time there as well. Test first before running in production, obviously.
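For example, since all three strings map to the same replacement here, the chained -replace operations could collapse into a single regex alternation (a sketch, not tested against your data; [regex]::Escape guards against special characters in the real strings):
# One regex pass instead of three chained -replace operations.
$pattern = ('string1', 'string2', 'string3' | ForEach-Object { [regex]::Escape($_) }) -join '|'
(Get-Content $file) -replace $pattern, "newstring" | Set-Content $file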
I removed the counter you had since it appeared nowhere else in the code.
Your logging could easily be CSV-based since you have all the objects ready to go, but I just want to be sure we are on the right track before we go too far.

Related

Possible to slow down a PowerShell Recursive Search?

This is going to be a weird one for ya.
I need to rate limit or slow down a PowerShell one-liner that I am using to search a local file system, as well as network shares in an enterprise environment. I would like to slow down the search to minimize the possibility of network impact.
The script searches files (mostly document or text files) for certain keywords and phrases. Getting this done quickly is not an issue, as I am not on a time crunch; safety is key.
Here is the one liner:
Get-ChildItem -path C:\ -recurse -Filter *.txt -ErrorAction Continue | Select-String -Pattern "xxxx" | select filename, Linenumber, Line, Path | Format-Table
With your code added, it looks like I was on the right track. Here is your original one-liner with the delay inserted (formatted a bit):
Get-ChildItem -Path C:\ -Recurse -Filter *.txt -ErrorAction Continue |
    ForEach-Object -Process {
        Start-Sleep -Seconds 1
        $_
    } |
    Select-String -Pattern "xxxx" |
    Select-Object Filename, LineNumber, Line, Path |
    Format-Table
Want to make it look more idiomatic? Write your own function that accepts pipeline input and delays execution. I might even use a filter which is a shorthand way of writing a pipeline-aware function:
filter Delay-Object ([int]$Milliseconds) {
    Start-Sleep -Milliseconds $Milliseconds
    $_
}

Get-ChildItem -Path C:\ -Recurse -Filter *.txt -ErrorAction Continue |
    Delay-Object -Milliseconds 1000 |
    Select-String -Pattern "xxxx" |
    Select-Object Filename, LineNumber, Line, Path |
    Format-Table
Without your code this is pure speculation, but let's say you're doing something like this:
Get-ChildItem \\my\share\*.* | ForEach-Object {
    # do your search here
}
You can just introduce a delay right into your iteration:
Get-ChildItem \\my\share\*.* | ForEach-Object {
    Start-Sleep -Seconds 1
    # do your search here
}
If you're not using your own script block, let's say you're using Select-String:
Get-ChildItem \\my\share\*.* | Select-String findme
Then the solution is the same: insert a ForEach-Object!
Get-ChildItem \\my\share\*.* |
    ForEach-Object {
        Start-Sleep -Seconds 1
        $_ # have to return the original object back to the pipeline
    } |
    Select-String findme
You might be thinking that Get-ChildItem is going to find all the files first and pass them all along and that even that will be too much stress on the network drives, but the pipeline doesn't work that way. *
Each item found is going to be passed to the next command in the pipeline one by one, so your delays will happen between each item. Therefore, you can basically insert a delay between any pipeline commands.
* some pipeline commands like Sort-Object need to collect all of the items and then pass them all out at once; from the POV of the next commands it still looks the same, but it will change how/where you need to put delays.
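For example (a toy sketch reusing the share path above), a per-item delay placed before Sort-Object still fires for each file found, but nothing reaches Select-String until the sort has buffered everything:
Get-ChildItem \\my\share\*.* |
    ForEach-Object { Start-Sleep -Seconds 1; $_ } |  # delay still fires per item
    Sort-Object Length |                             # buffers all items here
    Select-String findme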
Invoke-Command has a -ThrottleLimit parameter which might help. It limits the number of concurrent connections, so it might help limit the throughput.
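For instance (a sketch assuming the search runs remotely against a list of computers; $servers is a placeholder):
# -ThrottleLimit caps how many computers are searched at the same time.
Invoke-Command -ComputerName $servers -ThrottleLimit 2 -ScriptBlock {
    Get-ChildItem -Path C:\ -Recurse -Filter *.txt -ErrorAction SilentlyContinue |
        Select-String -Pattern "xxxx" |
        Select-Object Filename, LineNumber, Line, Path
}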

Simplify PowerShell Script with Replace Statements

I am trying to run this script on a 50GB file on Windows Server 2012 R2, and I would like to get the three replace statements into one pass rather than three. Also, it is important that the replaces occur in that order. Any suggestions to simplify this and make it run efficiently would be greatly appreciated!
$filePath = "D:\FileLocation\file_name.csv"
(Get-Content $filePath | out-string).Replace('"', '""') | Set-Content $filePath
(Get-Content $filePath | out-string).Replace('|~|', '"') | Set-Content $filePath
(Get-Content $filePath | out-string).Replace('|#|', ',') | Set-Content $filePath
With such a large file, I suggest you process the file line by line (or in batches), which should speed up the entire process.
You can copy the script mentioned by True here: http://community.idera.com/powershell/ask_the_experts/f/learn_powershell-12/18821/how-to-remove-specific-rows-from-csv-files-in-powershell
but instead of writing $Line straight away, perform your replaces:
$sw.WriteLine($line.Replace('"', '""').Replace('|~|', '"').Replace('|#|', ','))
Be careful with Get-Content, since it will try to load the entire file and becomes very slow once you run out of memory.
Also be careful if you don't have much disk space. The linked solution will make a copy of the file (with the changes) before replacing it.
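A minimal sketch of that line-by-line idea using .NET streams directly (assuming the same file and replacements as above; the temporary file name is arbitrary, and the original is only replaced once the rewrite completes):
$filePath = "D:\FileLocation\file_name.csv"
$tempPath = "$filePath.tmp"
$reader = New-Object System.IO.StreamReader($filePath)
$writer = New-Object System.IO.StreamWriter($tempPath)
try {
    while ($null -ne ($line = $reader.ReadLine())) {
        # Same three replacements, in the required order, one line at a time.
        $writer.WriteLine($line.Replace('"', '""').Replace('|~|', '"').Replace('|#|', ','))
    }
}
finally {
    $reader.Close()
    $writer.Close()
}
Move-Item -Path $tempPath -Destination $filePath -Force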
You can use the -replace operator:
$filepath = "c:\temp\text.txt"
(Get-Content $filepath) -replace 'test','1' -replace 'text','2' -replace '123','3' | Set-Content $filepath
You can chain the .Replace() calls on the same line:
$filepath = "/Users/me/Desktop/text.txt"
'test text 123' | Out-File -FilePath $filepath
(Get-Content $filepath | Out-String).Replace('test','1').Replace('text','2').Replace('123','3') | Set-Content $filepath
Get-Content $filepath
1 2 3

PowerShell append to output

I'm 'teaching myself to PowerShell' and have come a cropper already, and Google/this site hasn't enabled me to find a solution. I'm compiling a text file with file lists from different directories, but I'm having trouble appending new data to the file.
get-childitem $dir -recurse | % {write-output $_.fullname} >$file
creates my file, but then I want to APPEND new records from the below:
get-childitem $dir2 -recurse | % {write-output $_.fullname} >$file
I've tried both Add-Content and -Append, but I can't figure out what I'm not doing to get it right.
Try:
get-childitem $dir -recurse | % {write-output $_.fullname} >> $file
(Tested and works)
The double >> always appends; a single > overwrites each time.
Or change your syntax to use Out-File
get-childitem $dir -recurse | % {write-output $_.fullname} | out-file -filepath $file -Append
(untested)
In this case the variable $file must hold the full path. Like: C:\directory\filename.txt
You can use Out-File to write to a file, adding the append parameter will append to the file.
Get-ChildItem $dir -recurse | Select-object -ExpandProperty Fullname | Out-File -FilePath $file
Get-ChildItem $dir2 -recurse | Select-object -ExpandProperty Fullname | Out-File -FilePath $file -Append
Short Answer
The pipeline used here can be eliminated, and usage of Out-File would make life easy:
Out-File -InputObject (Get-ChildItem $dir -Recurse).FullName -FilePath $File
To append, simply add the -Append flag:
Out-File -InputObject (Get-ChildItem $dir2 -Recurse).FullName -FilePath $File -Append
Note: This only works in PowerShell v3 and up, as PowerShell v2 relied on the pipeline to expand properties of objects within an array. In that case, the best route is to use something more like what #david-martin proposed on this same thread.
Long Answer, and Best Practices
In a different thread, Script to Append The File, someone was having similar difficulties with appending to a file, while also using the pipeline in a way that was unnecessary (more so than in your example).
Their pipeline usage looked like this:
$PathArray | % {$_} | Out-File "C:\SearchString\Output.txt"
Now, again, Out-File has an -Append parameter. Simply modifying their code to have it tagged on at the end took care of things.
Though, their ForEach-Object statement (the % symbol) is pretty useless in the pipeline and isn't needed (much like yours is used). This is because the ForEach-Object loop is only being used to output each object without any modification, which is exactly what the pipeline does by default: pass each object along to the next command.
For more information on the pipeline: About Pipelines
If Update-Help has been run locally, you can also run Get-Help about_pipelines to see this information.
Instead of this:
$PathArray | % {$_} | Out-File "C:\SearchString\Output.txt" -Append
We could do this:
$PathArray | Out-File "C:\SearchString\Output.txt" -Append
[Recommended] That example can also eliminate the need for the pipeline altogether, as using the pipeline is less efficient when the task can be done without it. Doing everything one possibly can without the pipeline, or to the left of each pipe in the pipeline, is to "filter left" (see the following article for more about why one should filter left, format right: Filtering Command Output in PowerShell):
Out-File -InputObject $PathArray -FilePath "C:\SearchString\Output.txt" -Append
Note: In the case above, -Append is only needed if the file already exists and is being extended.
Remember: Get-Help, and Read The Friendly Manual (RTFM)
The easiest way to troubleshoot is to check the help documentation. Use Get-Help to look up whatever you need: parameter sets, available parameters, examples, etc. Make sure to run Update-Help in order to have detailed documentation available locally. To see everything:
Update-Help
Get-Help Out-File -Full
For more detailed information that is good to know about data stream/output redirection:
PowerShell redirection operators, such as > and >> (but also redirection of data streams with n> and n>&1), and the available streams per PowerShell version: About Redirection in PowerShell (or: Get-Help about_redirection in PowerShell)
The Tee-Object cmdlet, which acts as a more robust version of Out-File (or: Get-Help Tee-Object in PowerShell)
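For example (a small sketch reusing $dir and $file from the question), Tee-Object writes the paths to the file and still passes them down the pipeline for display:
Get-ChildItem $dir -Recurse |
    Select-Object -ExpandProperty FullName |
    Tee-Object -FilePath $file
Check Get-Help Tee-Object in your version before relying on it for appending; older versions lack an -Append parameter.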

Delete files containing string

How can I delete all files in a directory whose contents contain a string, using PowerShell?
I've tried something like
$list = get-childitem *.milk | select-string -pattern "fRating=2" | Format-Table Path
$list | foreach { rm $_.Path }
And that worked for some files but did not remove everything. I've tried various other things but nothing is working.
I can easily get the list of file names, and can create an array with just the paths using
$lista = @(); foreach ($f in $list) { $lista += $f.Path }
but can't seem to get any command (del, rm, or Remove-Item) to do anything. Just returns immediately without deleting the files or giving errors.
Thanks
First we can simplify your code as:
Get-ChildItem "*.milk" | Select-String -Pattern "fRating=2" | Select-Object -ExpandProperty Path | Remove-Item -Force -Confirm
The lack of action and errors might be addressed by one of two things. The -Force parameter, which:
Allows the cmdlet to remove items that cannot otherwise be changed,
such as hidden or read-only files or read-only aliases or variables.
I would also suggest that you run this script as administrator. Depending on where these files are located, you might not have permissions. If this is not the case or does not work, please include the error you are getting.
I'm going to guess the error is:
remove-item : Cannot remove item C:\temp\somefile.txt: The process cannot access the file 'C:\temp\somefile.txt'
because it is being used by another process.
Update
In testing, I was also getting a similar error. Upon research, it looks like the Select-String cmdlet was holding onto the file, preventing its deletion (an assumption, based on never having seen Get-ChildItem do this before). The solution in that case would be to enclose the first part of this in parentheses as a subexpression, so it processes all the files before going through the pipe.
(Get-ChildItem | Select-String -Pattern "tes" | Select-Object -ExpandProperty path) | Remove-Item -Force -Confirm
Remove -Confirm if deemed required. It exists as a precaution so that you don't open up a new PowerShell in C:\windows\system32 and copy-paste a Remove-Item cmdlet in there.
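For an even safer dry run, swap -Confirm for -WhatIf, which only reports what would be deleted without touching anything:
(Get-ChildItem *.milk | Select-String -Pattern "fRating=2" | Select-Object -ExpandProperty Path) | Remove-Item -Force -WhatIf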
Another Update
[ and ] are wildcard characters in PowerShell; to stop some cmdlets from interpreting them, use -LiteralPath. Also, Select-String can return multiple hits per file, so we should use -Unique.
(Get-ChildItem *.milk | Select-String -Pattern "fRating=2" | Select-Object -ExpandProperty Path -Unique) | ForEach-Object { Remove-Item -Force -LiteralPath $_ }
Why do you use Select-String -Pattern "fRating=2"? Would you like to select all files with this name?
I don't think the Format-Table Path will work. The Get-ChildItem command doesn't have a property called "Path".
Does this snippet work for you?
$list = get-childitem *.milk | Where-Object -FilterScript {$_.Name -match "fRating=2"}
$list | foreach { rm $_.FullName }
The following code gets all files of type *.milk and puts them in $listA, then uses that list to get all the files that contain the string fRating=[01] and stores them in $listB. The files in $listB are deleted, and then the number of files deleted versus the number of files that contained the match is displayed (they should be equal).
sv -Name listA -Value (Get-ChildItem *.milk)
sv -Name listB -Value ($listA | Select-String -Pattern "fRating=[01]")
($listB | Select-Object -ExpandProperty Path) | ForEach-Object { Remove-Item -Force -LiteralPath $_ }
sv -Name FCount -Value ((Get-ChildItem *.milk).Count)
Write-Host -NoNewline "Files Deleted $($listA.Count - $FCount)/$($listB.Count)`n"
No need to complicate things:
$sourcePath = "\\path\to\the\file\"
Remove-Item "$sourcePath*whatever*"
I tried the answer, but unfortunately errors always seemed to come up; however, I managed to create a solution to get this done:
Without using Get-ChildItem, you can use Select-String directly to search for files matching a certain string. Yes, this will return Filename:Count:Content ... etc., but internally these have names that you can choose or omit; the one you need is "Filename". To do this, pipe into Select-Object, choosing Filename from the output.
So, to select all *.MSG files that have the pattern "Subject: WebServices Restarted", you can do the following:
Select-String -Path .\*.MSG -Pattern 'Subject: WebServices Restarted' -List | Select-Object Filename
Also, to remove these files on the fly, you can pipe into a ForEach-Object with the rm command as follows:
Select-String -Path .\*.MSG -Pattern 'Subject: WebServices Restarted' -List | Select-Object Filename | ForEach-Object { rm $_.FileName }
I tried this myself; it works 100%.
I hope this helps

Using PowerShell, read multiple known file names, append text of all files, create and write to one output file

I have five .sql files and know the name of each file. For this example, call them one.sql, two.sql, three.sql, four.sql and five.sql. I want to append the text of all files and create one file called master.sql. How do I do this in PowerShell? Feel free to post multiple answers to this problem because I am sure there are several ways to do this.
My attempt does not work and creates a file with several hundred thousand lines.
PS C:\sql> get-content '.\one.sql' | get-content '.\two.sql' | get-content '.\three.sql' | get-content '.\four.sql' | get-content '.\five.sql' | out-file -encoding UNICODE master.sql
Get-Content one.sql,two.sql,three.sql,four.sql,five.sql > master.sql
Note that > is equivalent to Out-File -Encoding Unicode. I only tend to use Out-File when I need to specify a different encoding.
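For example, to write master.sql as UTF-8 instead of the Unicode (UTF-16LE) default:
Get-Content one.sql,two.sql,three.sql,four.sql,five.sql |
    Out-File -FilePath master.sql -Encoding UTF8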
There are some good answers here, but if you have a whole lot of files, and maybe you don't know all of the names, this is what I came up with:
$vara = get-childitem -name "path"
$varb = foreach ($a in $vara) {gc "path\$a"}
example
$vara = get-childitem -name "c:\users\test"
$varb = foreach ($a in $vara) {gc "c:\users\test\$a"}
You can obviously pipe this directly into Add-Content or whatever, but I like to capture it in variables so I can manipulate it later on.
See if this works better
get-childitem "one.sql","two.sql","three.sql","four.sql","five.sql" | get-content | out-file -encoding UNICODE master.sql
I needed something similar; Chris Berry's post helped, but I think this is more efficient:
gci -name "*PathToFiles*" | gc > master.sql
The first part, gci -name "*PathToFiles*", gets you your file list. This can be done with wildcards to just get your .sql files, e.g. gci -name "\\share\folder\*.sql"
This then pipes to Get-Content and redirects the output to your master.sql file. As noted by Keith Hill, you can use Out-File in place of > to better control your output if needed.
I think the logical way of solving this is to use Add-Content:
$files = Get-ChildItem '.\one.sql', '.\two.sql', '.\three.sql', '.\four.sql', '.\five.sql'
$files | foreach { Get-Content $_ | Add-Content '.\master.sql' -encoding UNICODE }
However, Get-Content is usually very slow when reading multiple very large files. If that's your case, this article could help: http://keithhill.spaces.live.com/blog/cns!5A8D2641E0963A97!756.entry
What about:
Get-Content .\one.sql,.\two.sql,.\three.sql,.\four.sql,.\five.sql | Set-Content .\master.sql
Here is how I concatenate sql files from the Sql folder:
# Set the current location of the script to use relative path
Set-Location $PSScriptRoot
# Concatenate all the sql files
$concatSql = Get-Content -Path .\Sql\*.sql
# Append the combined sql to a single file
Add-Content -Path concatFile.sql -Value $concatSql