Looking for some pointers / tips to increase the speed and/or efficiency of the script below. I'd be open to other methods, but have only dabbled in PowerShell, cmd and Python.
Also credit where credit is due: This is a hack-job on the following: https://stackoverflow.com/a/44183234/12834479
Rather than working local, I'm hitting a Network Share over VPN with abysmal connection speeds.
Roughly, it's working at 8 secs / PDF.
The goal is to ensure each PDF is readable by Adobe. Images saved with a .pdf extension (but that aren't really PDFs) will open in some PDF software, but Adobe hates them. I have a method to convert them, but my bottleneck is identifying them.
Cases I've tried to take care of:
Adobe PDFs -start with %PDF
Some Bank PDFs - start with "blank space" then %PDF
3rd party software - Junk Headers, but %PDF is within document
$items = Get-ChildItem | Where-Object {$_.Extension -eq ".pdf"}
$arrary = @()
$logFile = "RESULTS_$(get-date -Format yyyymmdd).log"
$badCounter = 0
$goodCounter = 0
$msg = "`n`nProcessing " + $items.count + " files... "
Write-Host -nonewline -foregroundcolor Yellow $msg
foreach ($item in $items)
{
trap { Write-Output "Error trapped: $_"; continue; }
try {
$pdfText = Get-Content $item -raw
$ptr3 = '%PDF'
if ('%PDF' -ne $pdfText.SubString(([System.Math]::Max(0,$pdfText.IndexOf($ptr3))),4)) { $arrary+= "$item |-failed" >>$logfile;$badCounter += 1; $badCounter} else { $goodCounter += 1; $goodCounter}
continue;}
catch [System.Exception]{write-output "$item $_";}}
$totalCounter = $badCounter + $goodCounter
Write-Output $arrary >> $logFile
1..3 | %{ Write-Output "" >> $logFile }
Write-Output "Total: $totalCounter / BAD: $badCounter / GOOD: $goodCounter" >> $logFile
Write-Output "DONE!`n`n"
If it makes any difference, I'm currently running PS version 7.1.3, but also have 5.1.18 on local.
Actually, PDF files aren't plaintext files at all, but binary files, so you should not read them in as string.
What you are looking for is called a FourCC magic number in the file. This four-character code can be seen as Magic number to identify the file type.
For PDF files, these 4 bytes are 0x25, 0x50, 0x44, 0x46 ("%PDF") and the file should start with those bytes.
For those true PDF files, you could test with:
# Windows PowerShell 5.1 syntax; in PowerShell 6+ replace '-Encoding Byte' with '-AsByteStream'
[byte[]]$fourCC = Get-Content -Encoding Byte -ReadCount 4 -TotalCount 4 -Path 'X:\TheFile.pdf'
if ([System.Text.Encoding]::ASCII.GetString($fourCC) -ceq '%PDF') {
Write-Host "This is a true PDF file"
}
However, as you say "Bank pdf's usually start with a blank space", to also consider those files "good", you can do:
# Windows PowerShell 5.1 syntax; in PowerShell 6+ replace '-Encoding Byte' with '-AsByteStream'
[byte[]]$sixCC = Get-Content -Encoding Byte -ReadCount 6 -TotalCount 6 -Path 'X:\TheFile.pdf'
if ([System.Text.Encoding]::ASCII.GetString($sixCC) -cmatch '%PDF') {
Write-Host "This is a PDF file"
}
If you also want to treat files where "%PDF" is found anywhere in the file as "good", you will need to read the whole file as a string, but with a one-to-one mapping between characters and bytes.
For that you can use below helper function:
function ConvertTo-BinaryString {
# converts the bytes of a file to a string that has a
# 1-to-1 mapping back to the file's original bytes.
# Useful for performing binary regular expressions.
Param (
[Parameter(Mandatory = $True, ValueFromPipeline = $True, Position = 0)]
[ValidateScript( { Test-Path $_ -PathType Leaf } )]
[String]$Path
)
# Note: Codepage 28591 returns a 1-to-1 char to byte mapping
$Encoding = [Text.Encoding]::GetEncoding(28591)
$Stream = [System.IO.FileStream]::new($Path, 'Open', 'Read')
$StreamReader = [System.IO.StreamReader]::new($Stream, $Encoding)
$BinaryText = $StreamReader.ReadToEnd()
$StreamReader.Close()
$Stream.Close()
return $BinaryText
}
Next, you can use that function as:
$binString = ConvertTo-BinaryString -Path 'X:\TheFile.pdf'
if ($binString.IndexOf("%PDF") -ge 0) {
Write-Host "This is a PDF file"
}
Putting it all together and assuming you want all files marked as .PDF files where the magic number '%PDF' (case-sensitive) can be found anywhere in the file:
function ConvertTo-BinaryString {
# converts the bytes of a file to a string that has a
# 1-to-1 mapping back to the file's original bytes.
# Useful for performing binary regular expressions.
Param (
[Parameter(Mandatory = $True, ValueFromPipeline = $True, Position = 0)]
[ValidateScript( { Test-Path $_ -PathType Leaf } )]
[String]$Path
)
# Note: Codepage 28591 returns a 1-to-1 char to byte mapping
$Encoding = [Text.Encoding]::GetEncoding(28591)
$Stream = [System.IO.FileStream]::new($Path, 'Open', 'Read')
$StreamReader = [System.IO.StreamReader]::new($Stream, $Encoding)
$BinaryText = $StreamReader.ReadToEnd()
$StreamReader.Close()
$Stream.Close()
return $BinaryText
}
$badCounter = 0
$goodCounter = 0
$logFile = "RESULTS_{0:yyyyMMdd}.log" -f (Get-Date)
# get an array of pdf file FullNames
$files = @(Get-ChildItem -File -Filter '*.pdf').FullName
Write-Host "Processing $($files.Count) files... " -ForegroundColor Yellow
# loop through the array, test if '%PDF' is found and output strings for the log file
$result = foreach ($item in $files) {
$pdfText = ConvertTo-BinaryString -Path $item
if ($pdfText.IndexOf("%PDF") -ge 0) {
$goodCounter++
"Success - $item"
}
else {
$badCounter++
"Fail - $item"
}
}
# write the output to the log file
$result | Set-Content -Path $logFile
"=" * 25 | Add-Content -Path $logFile
"BAD: $badCounter" | Add-Content -Path $logFile
"GOOD: $goodCounter" | Add-Content -Path $logFile
"Total: $($files.Count)" | Add-Content -Path $logFile
Write-Host "DONE!" -ForegroundColor Green
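Since the share is reached over a slow VPN, reading every file in full is probably the main cost. Below is a minimal sketch that reads only the first block of each file, assuming the marker always sits near the start. Acrobat is generally documented as accepting a header within the first 1024 bytes, but treat that limit as an assumption and adjust it to your files. `Test-PdfHeader` and `-ScanBytes` are hypothetical names, not part of the answer above.

```powershell
function Test-PdfHeader {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [int]$ScanBytes = 1024   # assumed scan window, not a guaranteed limit
    )
    # .NET resolves relative paths against the process directory, so convert first
    $fullPath = Convert-Path -LiteralPath $Path
    $buffer = [byte[]]::new($ScanBytes)
    $total = 0
    $stream = [System.IO.File]::OpenRead($fullPath)
    try {
        # Read may return fewer bytes than requested, so loop until full or EOF
        while ($total -lt $buffer.Length) {
            $n = $stream.Read($buffer, $total, $buffer.Length - $total)
            if ($n -le 0) { break }
            $total += $n
        }
    }
    finally {
        $stream.Dispose()
    }
    # ASCII decoding is enough to find the four marker bytes
    $text = [System.Text.Encoding]::ASCII.GetString($buffer, 0, $total)
    return $text.IndexOf('%PDF', [System.StringComparison]::Ordinal) -ge 0
}

# Possible use inside the loop above:
#   if (Test-PdfHeader -Path $item) { $goodCounter++ } else { $badCounter++ }
```

Files whose marker sits beyond the scan window would be reported as bad, so a second pass with the full-file check could be limited to those.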
Guys, I'm having some issues converting my Perl script to PowerShell and need some help. In the hosts file of our machines, we have all of the URLs to our test environments blocked. In my Perl script, based on which environment is selected, it comments out the line for that environment to allow access and blocks the others, so the testers can't mistakenly do things in the wrong environment.
I need help converting to powershell
Below is what I have in PERL:
sub editHosts {
print "Editing hosts file...\n";
my $file = 'C:\\Windows\\System32\\Drivers\\etc\\hosts';
my $data = readFile($file);
my @lines = split /\n/, $data;
my $row = '1';
open (FILE, ">$file") or die "Cannot open $file\n";
foreach my $line (@lines) {
if ($line =~ m/$web/) {
print FILE '#'."$line\n"; }
else {
if ($row > '21') {
$line =~ s/^\#*127\.0\.0\.1/127\.0\.0\.1/;
$line =~ s/[#;].*$//s; }
print FILE "$line\n"; }
$row++;
}
close(FILE);
}
Here is what i've tried in Powershell:
foreach ($line in get-content "C:\windows\system32\drivers\etc\hosts") {
if ($line -contains $web) {
$line + "#"
}
I've tried variation including set-content with what used to be in the host file, etc.
Any help would be appreciated!
Thanks,
Grant
-contains is a "set" operator, not a substring operator. Try .Contains() or -like.
This will comment out lines matching the variable $web, while removing # from non-matches (except in the header):
function Edit-Hosts ([string]$Web, $File = "C:\windows\system32\drivers\etc\hosts") {
#If file exists and $web is not empty/whitespace
if((Test-Path -Path $file -PathType Leaf) -and $web.Trim()) {
$row = 1
(Get-Content -Path $file) | ForEach-Object {
if($_ -like "*$web*") {
#Matched PROD, comment out line
"#$($_)"
} else {
#No match. If past header = remove comment
if($row -gt 21) { $_ -replace '^#' } else { $_ }
}
$row++
} | Set-Content -Path $file
} else {
Write-Error -Category InvalidArgument -Message "'$file' doesn't exist or Web-parameter is empty"
}
}
Usage:
Edit-Hosts -Web "PROD"
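On the -contains point at the start of this answer, a small sketch of the difference may help. The hosts line and search term are made-up values for illustration only.

```powershell
# Hypothetical hosts-file line and search term, for illustration only
$line = '127.0.0.1    prod.example.test'
$web  = 'prod'

# -contains tests whether a collection holds an item equal to $web
$asSet  = $line -contains $web                # the whole line is not equal to 'prod'
$inList = @('prod', 'test') -contains $web    # a list member is equal to 'prod'

# Substring tests
$asLike   = $line -like "*$web*"              # wildcard match, case-insensitive
$asMethod = $line.Contains($web)              # .NET method, case-sensitive

"{0} {1} {2} {3}" -f $asSet, $inList, $asLike, $asMethod   # False True True True
```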
This is a similar answer to Frode F.'s answer, but I'm not yet able to comment to add my 2c worth, so have to provide an alternative answer instead.
It looks like one of the gotchas moving from perl to PowerShell, in this example, is that when we get the content of the file using Get-Content it is an "offline" copy, i.e. any edits are not made directly to the file itself. One approach is to compile the new content to the whole file and then write that back to disk.
I suppose that the print FILE "some text\n"; construct in perl might be similar to "some text" | Out-File $filename -Encoding ascii -Append in PowerShell, albeit you would use the latter either (1) to write line-by-line to a new/empty file or (2) accept that you are appending to existing content.
Two other things about editing the hosts file:
Make sure that your hosts file is ASCII encoded; I caused a major outage for a key enterprise application (50k+ users) learning that...
You may need to run PowerShell / PowerShell ISE by right-clicking and choosing Run as Administrator, otherwise you might not be able to modify the file.
Anyway, here's a version of the previous answer using Out-File:
$FileName = "C:\windows\system32\drivers\etc\hosts"
$web = "PROD"
# Get "offline" copy of file contents
$FileContent = Get-Content $FileName
# The following creates an empty file and returns a file
# object (type [System.IO.FileInfo])
$EmptyFile = New-Item -Path $FileName -ItemType File -Force
foreach($Line in $FileContent) {
if($Line -match "$web") {
"# $Line" | Out-File $EmptyFile -Append -Encoding ascii
} else {
"$Line" | Out-File $EmptyFile -Append -Encoding ascii
}
}
Edit
The ($Line -match "$web") takes whatever is in the $web variable and treats it as a regular expression. In my example I was assuming that you were just wanting to match a simple text string, but you might well be trying to match an IP address, etc. You have a couple of options:
Use ($Line -like "*$web*") instead.
Convert what is in $web to be an escaped regex, i.e. one that will match literally. Do this with ($Line -match [Regex]::Escape($web)).
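To illustrate why the escaping matters, here is a small sketch with made-up values: an unescaped IP address used as a pattern also matches text that only looks similar, because each "." matches any character.

```powershell
# Hypothetical values, for illustration only
$web       = '10.0.0.1'
$lookalike = '127.0.0.1  host-10a0b0c1'    # does not contain the literal text '10.0.0.1'

$unescaped = $lookalike -match $web                    # '.' matches any character
$escaped   = $lookalike -match [Regex]::Escape($web)   # dots must match literally

"{0} {1}" -f $unescaped, $escaped   # True False
```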
You also wanted to strip off comments from any line past row 21 of the hosts file, should that line not match $web. In perl you have used the s substitution operator; the PowerShell equivalent is -replace.
So... here is an updated version of that foreach loop:
$LineCount = 1
foreach($Line in $FileContent) {
if($Line -match [Regex]::Escape($web)) {
# ADD comment to any matched line
$Line = "#" + $Line
} elseif($LineCount -gt 21) {
# Uncomment the other lines
$Line = $Line -replace '^[# ]+',''
}
# Remove 'stacked up' comment characters, if any
$Line = $Line -replace '[#]+','#'
$Line | Out-File $EmptyFile -Append -Encoding ascii
$LineCount++
}
More Information
Are there good references for moving from Perl to Powershell?
How to use operator '-replace' in PowerShell to replace strings of texts with special characters and replace successfully
about_Comparison_Operators
http://www.comp.leeds.ac.uk/Perl/sandtr.html
If you wanted to verify what is in there and then add entries, you could use the script below, which is designed to be run interactively and reports any existing entries you specify in the variables:
Note: `t is PowerShell's escape sequence for a tab character.
$hostscontent
# Script to Verify and Add Host File Entries
$hostfile = gc 'C:\Windows\System32\drivers\etc\hosts'
$hostscontent1 = $hostfile | select-string "autodiscover.XXX.co.uk"
$hostscontent2 = $hostfile | select-string "webmail.XXX.co.uk"
$1 = "XX.XX.XXX.XX`tautodiscover.XXX.co.uk"
$2 = "webmail.XXX.co.uk"
# Replace this machines path with a path to your list of machines e.g. $machines = gc \\machine\machines.txt
$machines = gc 'c:\mytestmachine.txt'
ForEach ($machine in $machines) {
If ($hostscontent1 -ne $null) {
Start-Sleep -Seconds 1
Write-Host "$machine Already has Entry $1" -ForegroundColor Green
} Else {
Write-Host "Adding Entry $1 for $machine" -ForegroundColor Green
Start-Sleep -Seconds 1
Add-Content -Path C:\Windows\System32\drivers\etc\hosts -Value "XX.XX.XXX.XX`tautodiscover.XXX.co.uk" -Force
}
If ($hostscontent2 -ne $null) {
Start-Sleep -Seconds 1
Write-Host "$machine Already has Entry $2" -ForegroundColor Green
} Else {
Write-Host "Adding Entry $2 for $machine" -ForegroundColor Green
Start-Sleep -Seconds 1
Add-Content -Path C:\Windows\System32\drivers\etc\hosts -Value "XX.XX.XXX.XX`twebmail.XXX.co.uk" -Force
}
}
Using PowerShell, I want to replace all exact occurrences of [MYID] in a given file with MyValue. What is the easiest way to do so?
Use (V3 version):
(Get-Content c:\temp\test.txt).replace('[MYID]', 'MyValue') | Set-Content c:\temp\test.txt
Or for V2:
(Get-Content c:\temp\test.txt) -replace '\[MYID\]', 'MyValue' | Set-Content c:\temp\test.txt
I prefer using the File-class of .NET and its static methods as seen in the following example.
$content = [System.IO.File]::ReadAllText("c:\bla.txt").Replace("[MYID]","MyValue")
[System.IO.File]::WriteAllText("c:\bla.txt", $content)
This has the advantage of working with a single String instead of a String-array as with Get-Content. The methods also take care of the encoding of the file (UTF-8 BOM, etc.) without you having to take care most of the time.
Also the methods don't mess up the line endings (Unix line endings that might be used) in contrast to an algorithm using Get-Content and piping through to Set-Content.
So for me: Fewer things that could break over the years.
A little-known thing when using .NET classes is that when you have typed in "[System.IO.File]::" in the PowerShell window you can press the Tab key to step through the methods there.
(Get-Content file.txt) |
Foreach-Object {$_ -replace '\[MYID\]','MyValue'} |
Out-File file.txt
Note that the parentheses around (Get-Content file.txt) are required:
Without the parentheses, the content is read one line at a time and flows down the pipeline until it reaches Out-File or Set-Content, which tries to write to the same file while Get-Content still has it open, and you get an error. The parentheses cause the read to be performed in one go (open, read and close). Only when all lines have been read are they piped one at a time, and when they reach the last command in the pipeline they can be written to the file. It's the same as $content = Get-Content file.txt; $content | where ...
The one above only runs for "One File" only, but you can also run this for multiple files within your folder:
Get-ChildItem 'C:yourfile*.xml' -Recurse | ForEach {
(Get-Content $_.FullName | ForEach { $_ -replace '\[MYID\]', 'MyValue' }) |
Set-Content $_.FullName
}
I found a little known but amazingly cool way to do it from Payette's Windows Powershell in Action. You can reference files like variables, similar to $env:path, but you need to add the curly braces.
${c:file.txt} = ${c:file.txt} -replace 'oldvalue','newvalue'
You could try something like this:
$path = "C:\testFile.txt"
$word = "searchword"
$replacement = "ReplacementText"
$text = get-content $path
$newText = $text -replace $word,$replacement
$newText > $path
This is what I use, but it is slow on large text files.
get-content $pathToFile | % { $_ -replace $stringToReplace, $replaceWith } | set-content $pathToFile
If you are going to be replacing strings in large text files and speed is a concern, look into using System.IO.StreamReader and System.IO.StreamWriter.
try
{
$reader = [System.IO.StreamReader] $pathToFile
$data = $reader.ReadToEnd()
$reader.close()
}
finally
{
if ($reader -ne $null)
{
$reader.dispose()
}
}
$data = $data -replace $stringToReplace, $replaceWith
try
{
$writer = [System.IO.StreamWriter] $pathToFile
$writer.write($data)
$writer.close()
}
finally
{
if ($writer -ne $null)
{
$writer.dispose()
}
}
(The code above has not been tested.)
There is probably a more elegant way to use StreamReader and StreamWriter for replacing text in a document, but that should give you a good starting point.
Credit to @rominator007
I wrapped it into a function (because you may want to use it again)
function Replace-AllStringsInFile($SearchString,$ReplaceString,$FullPathToFile)
{
$content = [System.IO.File]::ReadAllText("$FullPathToFile").Replace("$SearchString","$ReplaceString")
[System.IO.File]::WriteAllText("$FullPathToFile", $content)
}
NOTE: This is case-sensitive; String.Replace does NOT ignore case!
See this post: String.Replace ignoring case
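If a case-insensitive literal replacement is needed on Windows PowerShell 5.1 as well, one option is to go through the regex engine with the search text escaped. This is a minimal sketch, not the method from the linked post; `Replace-LiteralIgnoreCase` is a hypothetical helper name.

```powershell
function Replace-LiteralIgnoreCase {
    param(
        [Parameter(Mandatory = $true)][AllowEmptyString()][string]$Text,
        [Parameter(Mandatory = $true)][string]$SearchString,
        [AllowEmptyString()][string]$ReplaceString = ''
    )
    # Escape the search text so regex metacharacters such as [ and . match literally
    $pattern = [regex]::Escape($SearchString)
    # In a regex replacement '$' introduces substitutions; '$$' emits a literal '$'
    $literalReplacement = $ReplaceString.Replace('$', '$$')
    [regex]::Replace($Text, $pattern, $literalReplacement,
        [System.Text.RegularExpressions.RegexOptions]::IgnoreCase)
}

Replace-LiteralIgnoreCase -Text 'id=[myid]; ID=[MYID]' -SearchString '[MyId]' -ReplaceString 'MyValue'
# id=MyValue; ID=MyValue
```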
If You Need to Replace Strings in Multiple Files:
It should be noted that the different methods posted here can be wildly different with regard to the time it takes to complete. For me, I regularly have large numbers of small files. To test what is most performant, I extracted 5.52 GB (5,933,604,999 bytes) of XML in 40,693 separate files and ran through three of the answers I found here:
## 5.52 GB (5,933,604,999 bytes) of XML files (40,693 files)
$xmls = (Get-ChildItem -Path "I:\TestseT\All_XML" -Recurse -Filter *.xml).FullName
#### Test 1 - Plain Replace
$start = Get-Date
foreach ($xml in $xmls) {
(Get-Content $xml).replace("'", " ") | Set-Content $xml
}
$end = Get-Date
New-TimeSpan -Start $Start -End $End
# TotalMinutes: 103.725113128333
#### Test 2 - Replace with -Raw
$start = Get-Date
foreach ($xml in $xmls) {
(Get-Content $xml -Raw).replace("'", " ") | Set-Content $xml
}
$end = Get-Date
New-TimeSpan -Start $Start -End $End
# TotalMinutes: 10.1600227983333
#### Test 3 - .NET, System.IO
$start = Get-Date
foreach ($xml in $xmls) {
$txt = [System.IO.File]::ReadAllText("$xml").Replace("'"," ")
[System.IO.File]::WriteAllText("$xml", $txt)
}
$end = Get-Date
New-TimeSpan -Start $Start -End $End
# TotalMinutes: 5.83619516833333
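The same kind of timing can also be taken with the built-in Measure-Command cmdlet, which returns a TimeSpan directly. The sketch below uses a small in-memory stand-in workload (an assumption, not the XML test set above), so its numbers say nothing about the three tests.

```powershell
# Stand-in workload (an assumption): 10,000 in-memory strings instead of XML files
$lines = 1..10000 | ForEach-Object { "it's line $_" }

# Measure-Command runs the script block and returns the elapsed time as a TimeSpan
$elapsed = Measure-Command {
    foreach ($line in $lines) {
        $null = $line.Replace("'", ' ')
    }
}
"TotalMinutes: $($elapsed.TotalMinutes)"
```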
Since this comes up often, I defined a function for it. I defaulted to case-sensitive, regex-based matching, but I included switches for targeting literal text and ignoring case.
# Find and replace text in each pipeline string. Omit the -Replace parameter to delete
# text instead. Use the -SimpleMatch switch to work with literal text instead of regular
# expressions. Comparisons are case-sensitive unless the -IgnoreCase switch is used.
Filter Edit-String {
Param([string]$Find, [string]$Replace='', [switch]$SimpleMatch, [switch]$IgnoreCase)
if ($SimpleMatch) {
if ($IgnoreCase) {
# Note: this overload with StringComparison needs .NET Core 2.0+ (PowerShell 6+)
return $_.Replace($Find, $Replace,
[System.StringComparison]::OrdinalIgnoreCase)
}
return $_.Replace($Find, $Replace)
}
if ($IgnoreCase) {
return $_ -replace $Find, $Replace
}
return $_ -creplace $Find, $Replace
}
Set-Alias replace Edit-String
Set-Alias sc Set-Content
Usage
# 1 file
$f = 'a.txt'; (gc $f) | replace '[MYID]' 'MyValue' -SimpleMatch | sc $f
# 0 to many files (the parentheses make sure the file is fully read before it is rewritten)
gci *.txt | % { (gc $_) | replace '\[MYID\]' 'MyValue' | sc $_ }
# Several replacements chained together
... | replace '[1-9]' T | replace a b -IgnoreCase | replace 'delete me' | ...
# Alias cheat sheet
# gci Get-ChildItem
# gc Get-Content
# sc Set-Content
# % ForEach-Object
This worked for me using the current working directory in PowerShell. You need to use the FullName property, or it won't work in PowerShell version 5. I needed to change the target .NET framework version in ALL my CSPROJ files.
gci -Recurse -Filter *.csproj |
% { (get-content "$($_.FullName)").Replace('<TargetFramework>net47</TargetFramework>', '<TargetFramework>net462</TargetFramework>') |
Set-Content "$($_.FullName)"}
A bit old and different, as I needed to change a certain line in all instances of a particular file name.
Also, Set-Content was not returning consistent results, so I had to resort to Out-File.
Code below:
$FileName =''
$OldLine = ''
$NewLine = ''
$Drives = Get-PSDrive -PSProvider FileSystem
foreach ($Drive in $Drives) {
Push-Location $Drive.Root
Get-ChildItem -Filter "$FileName" -Recurse | ForEach {
(Get-Content $_.FullName).Replace($OldLine, $NewLine) | Out-File $_.FullName
}
Pop-Location
}
This is what worked best for me on this PowerShell version:
Major.Minor.Build.Revision
5.1.16299.98
Here's a fairly simple one that supports multiline regular expressions, multiple files (using the pipeline), specifying output encoding, etc. Not recommended for very large files due to the ReadAllText method.
# Update-FileText.ps1
#requires -version 2
<#
.SYNOPSIS
Updates text in files using a regular expression.
.DESCRIPTION
Updates text in files using a regular expression.
.PARAMETER Pattern
Specifies the regular expression pattern.
.PARAMETER Replacement
Specifies the regular expression replacement pattern.
.PARAMETER Path
Specifies the path to one or more files. Wildcards are not supported. Each file is read entirely into memory to support multi-line searching and replacing, so performance may be slow for large files.
.PARAMETER CaseSensitive
Specifies case-sensitive matching. The default is to ignore case.
.PARAMETER SimpleMatch
Specifies a simple match rather than a regular expression match (i.e., the Pattern parameter specifies a simple string rather than a regular expression).
.PARAMETER Multiline
Changes the meaning of ^ and $ so they match at the beginning and end, respectively, of any line, and not just the beginning and end of the entire file. The default is that ^ and $, respectively, match the beginning and end of the entire file.
.PARAMETER UnixText
Causes $ to match only linefeed (\n) characters. By default, $ matches carriage return+linefeed (\r\n). (Windows-based text files usually use \r\n as line terminators, while Unix-based text files usually use only \n.)
.PARAMETER Overwrite
Overwrites a file by creating a temporary file containing all replacements and then replacing the original file with the temporary file. The default is to output but not overwrite.
.PARAMETER Force
Allows overwriting of read-only files. Note that this parameter cannot override security restrictions.
.PARAMETER Encoding
Specifies the encoding for the file when -Overwrite is used. Possible values for this parameter are ASCII, BigEndianUnicode, Unicode, UTF32, UTF7, and UTF8. The default value is ASCII.
.INPUTS
System.IO.FileInfo.
.OUTPUTS
System.String (single-line file) or System.String[] (file with more than one line) without the -Overwrite parameter, or nothing with the -Overwrite parameter.
.LINK
about_Regular_Expressions
.EXAMPLE
C:\> Update-FileText.ps1 '(Ferb) and (Phineas)' '$2 and $1' Story.txt
This command replaces the text 'Ferb and Phineas' with the text 'Phineas and Ferb' in the file Story.txt and outputs the content. Note that the pattern and replacement strings are enclosed in single quotes to prevent variable expansion.
.EXAMPLE
C:\> Update-FileText.ps1 'Perry' 'Agent P' Story2.txt -Overwrite
This command replaces the text 'Perry' with the text 'Agent P' in the file Story2.txt.
#>
[CmdletBinding(SupportsShouldProcess = $true,ConfirmImpact = "High")]
param(
[Parameter(Mandatory = $true,Position = 0,ValueFromPipeline = $true)]
[String[]] $Path,
[Parameter(Mandatory = $true,Position = 1)]
[String] $Pattern,
[Parameter(Mandatory = $true,Position = 2)]
[AllowEmptyString()]
[String] $Replacement,
[Switch] $CaseSensitive,
[Switch] $SimpleMatch,
[Switch] $Multiline,
[Switch] $UnixText,
[Switch] $Overwrite,
[Switch] $Force,
[ValidateSet("ASCII","BigEndianUnicode","Unicode","UTF32","UTF7","UTF8")]
[String] $Encoding = "ASCII"
)
begin {
function Get-TempName {
param(
$path
)
do {
$tempName = Join-Path $path ([IO.Path]::GetRandomFilename())
}
while ( Test-Path $tempName )
$tempName
}
if ( $SimpleMatch ) {
$Pattern = [Regex]::Escape($Pattern)
}
else {
if ( -not $UnixText ) {
$Pattern = $Pattern -replace '(?<!\\)\$','\r$'
}
}
function New-Regex {
$regexOpts = [Text.RegularExpressions.RegexOptions]::None
if ( -not $CaseSensitive ) {
$regexOpts = $regexOpts -bor [Text.RegularExpressions.RegexOptions]::IgnoreCase
}
if ( $Multiline ) {
$regexOpts = $regexOpts -bor [Text.RegularExpressions.RegexOptions]::Multiline
}
New-Object Text.RegularExpressions.Regex $Pattern,$regexOpts
}
$Regex = New-Regex
function Update-FileText {
param(
$path
)
$pathInfo = Resolve-Path -LiteralPath $path
if ( $pathInfo ) {
if ( (Get-Item $pathInfo).GetType().FullName -eq "System.IO.FileInfo" ) {
$fullName = $pathInfo.Path
Write-Verbose "Reading '$fullName'"
$text = [IO.File]::ReadAllText($fullName)
Write-Verbose "Finished reading '$fullName'"
if ( -not $Overwrite ) {
$regex.Replace($text,$Replacement)
}
else {
$tempName = Get-TempName (Split-Path $fullName -Parent)
Set-Content $tempName $null -Confirm:$false
if ( $? ) {
Write-Verbose "Created file '$tempName'"
try {
Write-Verbose "Started writing '$tempName'"
[IO.File]::WriteAllText("$tempName",$Regex.Replace($text,$Replacement),[Text.Encoding]::$Encoding)
Write-Verbose "Finished writing '$tempName'"
Write-Verbose "Started copying '$tempName' to '$fullName'"
Copy-Item $tempName $fullName -Force:$Force -ErrorAction Continue
if ( $? ) {
Write-Verbose "Finished copying '$tempName' to '$fullName'"
}
Remove-Item $tempName
if ( $? ) {
Write-Verbose "Removed file '$tempName'"
}
}
catch [Management.Automation.MethodInvocationException] {
Write-Error $Error[0]
}
}
}
}
else {
Write-Error "The item '$path' must be a file in the file system." -Category InvalidType
}
}
}
}
process {
foreach ( $PathItem in $Path ) {
if ( $Overwrite ) {
if ( $PSCmdlet.ShouldProcess("'$PathItem'","Overwrite file") ) {
Update-FileText $PathItem
}
}
else {
Update-FileText $PathItem
}
}
}
Also available as a gist on Github.
Sample to replace all strings inside a folder:
$path=$args[0]
$oldString=$args[1]
$newString=$args[2]
Get-ChildItem -Path $path -Recurse -File |
ForEach-Object {
(Get-Content $_.FullName).replace($oldString,$newString) | Set-Content $_.FullName
}
Small correction for the Set-Content command. If the searched string is not found the Set-Content command will blank (empty) the target file.
You can first verify if the string you are looking for exist or not. If not it will not replace anything.
If (select-string -path "c:\Windows\System32\drivers\etc\hosts" -pattern "String to look for") `
{(Get-Content c:\Windows\System32\drivers\etc\hosts).replace('String to look for', 'String to replace with') | Set-Content c:\Windows\System32\drivers\etc\hosts}
Else{"Nothing happened"}
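Another way to guard against rewriting a file needlessly is to compare the text before and after the replacement and only write when something changed. This is a minimal sketch under that assumption; `Update-TextIfChanged` is a hypothetical name, and note that WriteAllText without an encoding argument writes UTF-8 without a BOM.

```powershell
function Update-TextIfChanged {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [Parameter(Mandatory = $true)][string]$Find,
        [AllowEmptyString()][string]$Replace = ''
    )
    # .NET resolves relative paths against the process directory, so convert first
    $fullPath = Convert-Path -LiteralPath $Path
    $original = [System.IO.File]::ReadAllText($fullPath)
    $updated  = $original.Replace($Find, $Replace)   # literal, case-sensitive replace

    # Leave the file untouched when the search text was not found
    if ($updated -ceq $original) { return $false }

    [System.IO.File]::WriteAllText($fullPath, $updated)   # UTF-8 without BOM by default
    return $true
}
```

The function returns $true when the file was rewritten and $false when it was left alone, so the caller can print "Nothing happened" as in the snippet above.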
I need to only search the 1st line and last line in a text file to find a "-" and remove it.
How can I do it?
I tried select-string, but I don't know to find the 1st and last line and only remove "-" from there.
Here is what the text file looks like:
% 01-A247M15 G70
N0001 G30 G17 X-100 Y-100 Z0
N0002 G31 G90 X100 Y100 Z45
N0003 ; --PART NO.: NC-HON.PHX01.COVER-SHOE.DET-1000.050
N0004 ; --TOOL: 8.55 X .3937
N0005 ;
N0006 % 01-A247M15 G70
Something like this?
$1 = Get-Content C:\work\test\01.I
$1 | select-object -index 0, ($1.count-1)
Ok, so after looking at this for a while, I decided there had to be a way to do this with a one liner. Here it is:
(gc "c:\myfile.txt") | % -Begin {$test = (gc "c:\myfile.txt" | select -first 1 -last 1)} -Process {if ( $_ -eq $test[0] -or $_ -eq $test[-1] ) { $_ -replace "-" } else { $_ }} | Set-Content "c:\myfile.txt"
Here is a breakdown of what this is doing:
First, the aliases for those not familiar. I only put them in because the command is long enough as it is, so this helps keep things manageable:
gc means Get-Content
% means ForEach-Object
$_ is for the current pipeline value (this isn't an alias, but I thought I would define it since you said you were new)
Ok, now here is what is happening in this:
(gc "c:\myfile.txt") | --> Gets the content of c:\myfile.txt and sends it down the line
% --> Does a foreach loop (goes through each item in the pipeline individually)
-Begin {$test = (gc "c:\myfile.txt" | select -first 1 -last 1)} --> This is a begin block, it runs everything here before it goes onto the pipeline stuff. It is loading the first and last line of c:\myfile.txt into an array so we can check for first and last items
-Process {if ( $_ -eq $test[0] -or $_ -eq $test[-1] ) --> This runs a check on each item in the pipeline, checking if it's the first or the last item in the file
{ $_ -replace "-" } else { $_ } --> if it's the first or last, it does the replacement, if it's not, it just leaves it alone
| Set-Content "c:\myfile.txt" --> This puts the new values back into the file.
Please see the following sites for more information on each of these items:
Get-Content uses
Get-Content definition
Foreach
The Pipeline
Begin and Process part of the Foreach (these are usually for custom functions, but they work in the foreach loop as well)
If ... else statements
Set-Content
So I was thinking about what if you wanted to do this to many files, or wanted to do this often. I decided to make a function that does what you are asking. Here is the function:
function Replace-FirstLast {
[CmdletBinding()]
param(
[Parameter( `
Position=0, `
Mandatory=$true)]
[String]$File,
[Parameter( `
Position=1, `
Mandatory=$true)]
[ValidateNotNull()]
[regex]$Regex,
[Parameter( `
position=2, `
Mandatory=$false)]
[string]$ReplaceWith=""
)
Begin {
$lines = Get-Content $File
} #end begin
Process {
foreach ($line in $lines) {
if ( $line -eq $lines[0] ) {
$lines[0] = $line -replace $Regex,$ReplaceWith
} #end if
if ( $line -eq $lines[-1] ) {
$lines[-1] = $line -replace $Regex,$ReplaceWith
}
} #end foreach
}#End process
end {
$lines | Set-Content $File
}#end end
} #end function
This will create a command called Replace-FirstLast. It would be called like this:
Replace-FirstLast -File "C:\myfiles.txt" -Regex "-" -ReplaceWith "NewText"
The -ReplaceWith parameter is optional; if it is left blank, the matched text is simply removed (default value of ""). The -Regex parameter takes a regular expression describing the text to match. For information on placing this into your profile check this article
Please note: If your file is very large (several GBs), this isn't the best solution. This would cause the whole file to live in memory, which could potentially cause other issues.
try:
$txt = get-content c:\myfile.txt
$txt[0] = $txt[0] -replace '-'
$txt[$txt.length - 1 ] = $txt[$txt.length - 1 ] -replace '-'
$txt | set-content c:\myfile.txt
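One edge case with the snippet above: when the file has a single line, Get-Content returns a plain string rather than an array, so indexing into it no longer addresses whole lines. A minimal sketch that wraps the result in @() to cover that case follows; `Remove-DashFirstLast` is a hypothetical name.

```powershell
function Remove-DashFirstLast {
    param([Parameter(Mandatory = $true)][string]$Path)

    # @() forces an array even for files with zero or one line
    $txt = @(Get-Content -LiteralPath $Path)
    if ($txt.Count -eq 0) { return }   # empty file: nothing to do

    $txt[0]  = $txt[0]  -replace '-'   # first line
    $txt[-1] = $txt[-1] -replace '-'   # last line (same element for a one-line file)
    $txt | Set-Content -LiteralPath $Path
}
```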
You can use the select-object cmdlet to help you with this, since get-content basically spits out a text file as one huge array.
Thus, you can do something like this
get-content "path_to_my_awesome_file" | select -first 1 -last 1
To remove the dash after that, you can use the -replace operator to find the dash and remove it. This is better than using the System.String.Replace(...) method because it can match regex patterns and operate on whole arrays of strings too!
That would look like:
# gc = Get-Content. The parens tell Powershell to do whatever's inside of it
# then treat it like a variable.
(gc "path_to_my_awesome_file" | select -first 1 -last 1) -Replace '-',''
If your file is very large you might not want to read the whole file to get the last line. gc -Tail will get the last line very quickly for you.
function GetFirstAndLastLine($path){
return New-Object PSObject -Property @{
First = Get-Content $path -TotalCount 1
Last = Get-Content $path -Tail 1
}
}
GetFirstAndLastLine "u_ex150417.log"
I tried this on a 20 gb log file and it returned immediately. Reading the file takes hours.
You will still need to read the file if you want to keep all existing content and only want to remove something from the end. Using -Tail is a quick way to check if it is there.
I hope it helps.
A cleaner answer to the above:
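# Read the file first so the line count is known before the loop starts
$Awesome_file = @(Get-Content "path_to_ridiculously_excellent_file")
$Last_index = $Awesome_file.Count - 1
$Line_number_were_on = 0
$Result = $Awesome_file | %{
$Line = $_
# Only the first and last lines get the dash removed
if ($Line_number_were_on -eq 0 -or $Line_number_were_on -eq $Last_index)
{ $Line -Replace '-','' }
else
{ $Line } ;
$Line_number_were_on++
}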
I like one-liners, but I find that readability tends to suffer sometimes when I put terseness over function. If what you're doing is going to be part of a script that other people will be reading/maintaining, readability might be something to consider.
Following Nick's answer: I do need to do this on all text files in the directory tree and this is what I'm using now:
Get-ChildItem -Path "c:\work\test" -Filter *.i | where { !$_.PSIsContainer } | % {
$txt = Get-Content $_.FullName;
$txt[0] = $txt[0] -replace '-';
$txt[$txt.length - 1 ] = $txt[$txt.length - 1 ] -replace '-';
$txt | Set-Content $_.FullName
}
and it looks like it's working well now.
Simple process:
Set $file_txt to your filename
Get-Content $file_txt | Select-Object -Last 1
I was recently searching for comments in the last line of .bat files. It seems to mess up the error code of previous commands. I found this useful for searching for a pattern in the last line of files. Pspath is a hidden property that get-content outputs. If I used select-string, I would lose the filename. *.bat gets passed as -filter for speed.
get-childitem -recurse . *.bat | get-content -tail 1 | where { $_ -match 'rem' } |
select pspath
PSPath
------
Microsoft.PowerShell.Core\FileSystem::C:\users\js\foo\file.bat