select-string with multiple conditions with powershell - powershell

I'm looking for a way to find 2 different lines in a file, and only if both lines exist do I need to perform a task.
So far this is my code
$folderPath = 'c:\test'
$files = Get-ChildItem $Folderpath -Filter *.txt
$find = 'stringA'
$find2 = 'StringB'
$replace = 'something to replace with string b'
if ($files.Length -gt 0 ) {
$files |
select -ExpandProperty fullname |
foreach {
If(Select-String -Path $_ -pattern $find , $find2 -quiet )
{
(Get-Content $_) |
ForEach-Object {$_ -replace $find2, $replace } |
Set-Content $_
write-host "File Changed : " $_
}
}
}
else {
write-host "no files changed"
}
Currently, if I run it once it changes the files, but if I run it again it reports that it changed the same files again instead of printing "no files changed".
Is there a simpler way to make it happen?
Thanks

The Select-String cmdlet selects lines matching any of the patterns supplied to it. This means that the following file contains a match:
PS> Get-Content file.txt
This file contains only stringA
PS> Select-String -Pattern 'stringA', 'stringB' -Path file.txt
file.txt:1:This file contains only stringA
Passing the -Quiet flag to Select-String will produce a boolean result instead of a list of matches. The result is $True even though only one of the patterns is present.
PS> Get-Content file.txt
This file contains only stringA
PS> Select-String -Pattern 'stringA', 'stringB' -Path file.txt -Quiet
True
In your case, Select-String chooses all the files containing either 'stringA' or 'stringB', then replaces all instances of 'stringB' in those files. (Note that replacements are also performed in files you did not want to alter)
Even after the replacements, files containing only 'stringA' still exist: these files are caught and reported by your script the second time you run it.
One solution is to have two separate conditions joined by the -and operator:
If (
(Select-String -Path $_ -Pattern 'stringA' -Quiet) -and
(Select-String -Path $_ -Pattern 'stringB' -Quiet)
)
After this the script should work as intended, except that it won't report "no files changed" correctly.
If you fix your indentation you'll realise that the final else clause actually checks if there are no .txt files in the folder:
$files = Get-ChildItem $Folderpath -Filter *.txt
...
if ($files.length -gt 0) {
...
} else {
# will only display when there are no text files in the folder!
Write-Host "no files changed"
}
The way to resolve this would be to have a separate counter variable that increments every time you find a match. Then at the end, check if this counter is 0 and call Write-Host accordingly.
$counter = 0
...
foreach {
if ((Select-String ...) ...) {
...
$counter += 1
}
}
if ($counter -eq 0) {
Write-Host "no files changed"
}
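Putting the two fixes together, a minimal end-to-end sketch could look like the following. The temporary demo folder, the file names both.txt and onlyA.txt, and the replacement text are assumptions made up for illustration; the real script would point at the original folder instead.

```powershell
# Hypothetical demo folder with two sample files (not part of the original script).
$demoDir = Join-Path ([IO.Path]::GetTempPath()) ("select-string-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $demoDir | Out-Null
Set-Content -Path (Join-Path $demoDir 'both.txt')  -Value 'stringA', 'stringB'
Set-Content -Path (Join-Path $demoDir 'onlyA.txt') -Value 'stringA'

$find    = 'stringA'
$find2   = 'stringB'
$replace = 'replacement text'
$counter = 0

Get-ChildItem $demoDir -Filter *.txt | ForEach-Object {
    $path = $_.FullName
    # Change the file only when BOTH patterns occur somewhere in it.
    if ((Select-String -Path $path -Pattern $find -Quiet) -and
        (Select-String -Path $path -Pattern $find2 -Quiet)) {
        # The parentheses force the whole file to be read before it is rewritten.
        (Get-Content $path) -replace $find2, $replace | Set-Content $path
        Write-Host "File Changed : $path"
        $counter += 1
    }
}

# Report only when no file matched both patterns.
if ($counter -eq 0) {
    Write-Host "no files changed"
}
```

In this sketch only both.txt is rewritten; running the loop a second time should leave the counter at 0, because stringB no longer occurs in any file.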

To complement equatorialsnowfall's helpful answer, which explains the problem with your approach well, here is a streamlined, potentially more efficient solution:
$folderPath = 'c:\test'
$searchStrings = 'stringA', 'stringB'
$replace = 'something to replace string B with'
$countModified = 0
Get-ChildItem $Folderpath -Filter *.txt | ForEach-Object {
if (
(
($_ | Select-String -Pattern $searchStrings).Pattern | Select-Object -Unique
).Count -eq $searchStrings.Count
) {
($_ | Get-Content -Raw) -replace $searchStrings[1], $replace |
Set-Content $_.FullName
++$countModified
write-host "File Changed: " $_
}
}
if ($countModified -eq 0) {
Write-Host "no files changed"
}
A single Select-String call is used to determine if all patterns match (the solution scales to any number of patterns):
Each Microsoft.PowerShell.Commands.MatchInfo output object has a .Pattern property that indicates which of the patterns passed to -Pattern matched on a given line.
If, after removing duplicates with Select-Object -Unique, the number of patterns associated with matching lines is the same as the number of input patterns, you can assume that all input patterns matched (at least once).
Reading each matching file as a whole with Get-Content's -Raw switch and therefore performing only a single -replace operation per file is much faster than line-by-line processing.
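The .Pattern check can be tried without any files by piping strings to Select-String. This is only a small sketch; the sample lines and the variable names $lines, $matched and $onlyA are made up for illustration.

```powershell
$searchStrings = 'stringA', 'stringB'
$lines = 'has stringA here', 'nothing relevant', 'has stringB here', 'stringA again'

# Each MatchInfo records which of the supplied patterns matched its line.
$matched  = ($lines | Select-String -Pattern $searchStrings).Pattern | Select-Object -Unique
$allFound = @($matched).Count -eq $searchStrings.Count

# The same check on input that contains only one of the two strings.
$onlyA           = ('just stringA' | Select-String -Pattern $searchStrings).Pattern | Select-Object -Unique
$allFoundInOnlyA = @($onlyA).Count -eq $searchStrings.Count

"All patterns found in first sample:  $allFound"
"All patterns found in second sample: $allFoundInOnlyA"
```

One caveat worth checking in your own environment: a line that contains both strings appears to be reported under a single pattern only, so the count-based test is most reliable when the strings occur on different lines.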

Related

Correction in sub folder names by replacing first two characters, if needed

I am using the PowerShell script below, which successfully traverses all my case folders within the main folder named Test. What it cannot do is rename each sub folder where required, as can be seen in the current and desired output. The script should first sort the sub folders by their current numbering and then give them proper serial numbers as a folder-name prefix, replacing the undesired serial numbers.
I have hundreds of such cases and their sub folders which need to be renamed properly.
The output below shows two folders named "352" and "451" (take them as order IDs for now), and each of these folders has some sub folders with a 2-digit prefix in their names. But as you can see, they are not properly serialized.
$Search = Get-ChildItem -Path "C:\Users\User\Desktop\test" -Filter "??-*" -Recurse -Directory | Select-Object -ExpandProperty FullName
$Search | Set-Content -Path 'C:\Users\User\Desktop\result.txt'
Below is my current output:
C:\Users\User\Desktop\test\Case-352\02-Proceedings
C:\Users\User\Desktop\test\Case-352\09-Corporate
C:\Users\User\Desktop\test\Case-352\18-Notices
C:\Users\User\Desktop\test\Case-451\01-Contract
C:\Users\User\Desktop\test\Case-451\03-Application
C:\Users\User\Desktop\test\Case-451\09-Case Study
C:\Users\User\Desktop\test\Case-451\14-Violations
C:\Users\User\Desktop\test\Case-451\21-Verdict
My desired output is as follows:
C:\Users\User\Desktop\test\Case-352\01-Proceedings
C:\Users\User\Desktop\test\Case-352\02-Corporate
C:\Users\User\Desktop\test\Case-352\03-Notices
C:\Users\User\Desktop\test\Case-451\01-Contract
C:\Users\User\Desktop\test\Case-451\02-Application
C:\Users\User\Desktop\test\Case-451\03-Case Study
C:\Users\User\Desktop\test\Case-451\04-Violations
C:\Users\User\Desktop\test\Case-451\05-Verdict
Thank you so much. If my desired functionality can be extended to this script, it will be of great help.
Syed
You can do the following based on what you have posted:
$CurrentParent = $null
$Search = Get-ChildItem -Path "C:\Users\User\Desktop\test" -Filter '??-*' -Recurse -Directory | Where Name -match '^\d\d-\D' | Foreach-Object {
if ($_.Parent.Name -eq $CurrentParent) {
$Increment++
} else {
$CurrentParent = $_.Parent.Name
$Increment = 1
}
$CurrentNumber = "{0:d2}" -f $Increment
Join-Path $_.Parent.FullName ($_.Name -replace '^\d\d',$CurrentNumber)
}
$Search | Set-Content -Path 'C:\Users\User\Desktop\result.txt'
I added Where to filter more granularly beyond what -Filter allows.
-match and -replace both use regex to perform the matching. \d is a digit. \D is a non-digit. ^ matches the position at the beginning of the string.
The string format operator -f is used to maintain the 2-digit requirement. If you happen to reach 3-digit numbers, then 3 digit numbers will be output instead.
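As a quick illustration of how the format operator and -replace combine here, the following sketch uses made-up sample values; only the d2 format string and the '^\d\d' pattern come from the script above.

```powershell
$small = "{0:d2}" -f 3      # '03'  : padded to two digits
$two   = "{0:d2}" -f 12     # '12'  : already two digits
$large = "{0:d2}" -f 123    # '123' : wider numbers are printed in full

# Swap the leading two digits of a folder name for the new serial number.
$newName = '09-Corporate' -replace '^\d\d', $small   # '03-Corporate'

$small, $two, $large, $newName
```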
You can take this further to perform a rename operation:
$CurrentParent = $null
Get-ChildItem . -Filter '??-*' -Recurse -Directory | Where Name -match '^\d\d-\D' | Foreach-Object {
if ($_.Parent.Name -eq $CurrentParent) {
$Increment++
} else {
$CurrentParent = $_.Parent.Name
$Increment = 1
}
$CurrentNumber = "{0:d2}" -f $Increment
$NewName = $_.Name -replace '^\d\d',$CurrentNumber
$_ | Where Name -ne $NewName | Rename-Item -NewName $NewName -WhatIf
}
Where Name -ne $NewName passes on only those folders whose current name differs from $NewName, so a folder that is already numbered correctly is not renamed. Remove the -WhatIf once you are happy with the results.

power shell is giving file not found error when renaming a folder

What I am trying to do is change the name of a folder if any file in it contains the string "ErrorCode :value", where the value can be anything other than zero. The old path is in $dirName and the new path, with the suffix Error, is in $newPath.
Here is the code for that
$fileNames = Get-ChildItem -Path $scriptPath -Recurse -Include *.data
foreach ($file in $fileNames) {
If (Get-Content $file | %{$_ -match '"ErrorCode": 0'})
{
}
else{
$dirName= Split-Path $file
$newPath=$dirName+"Error"
Rename-Item $dirName $newPath
}
}
I expect it to rename the folder instead it gives me an error that it does not exist.
How do I solve that? Also, is there a better approach for this situation? I only started learning PowerShell two days ago.
If there are multiple *.data files in the same folder, your code will rename the folder after it finds the first file that does NOT contain "ErrorCode": 0. When it tries to get the next file or to rename the folder again, it won't be able to find it since it has been renamed.
You wrote you want to rename the folder if the file -match '"ErrorCode": 0', but if this condition is fulfilled you execute {} (nothing). However, if the condition is not fulfilled you execute your code else{...}
To prevent your code from renaming the folder multiple times while working in the folder, collect the folder names in an array first and rename them later:
$fileNames = Get-ChildItem -Path $scriptPath -Recurse -Include *.data
$FoldersToRename = @() # initialize as array
foreach ($file in $fileNames) {
If (Get-Content $file | Where-Object {$_ -match '"ErrorCode": 0'}) # true only if at least one line matches
{
$FoldersToRename += Split-Path $file
}
}
$SingleFolders = $FoldersToRename | Select-Object -Unique # select every folder just once
$SingleFolders | ForEach-Object {
$newPath=$_ + "Error"
Rename-Item $_ $newPath
}
edit: Matching anything BUT '"ErrorCode": 0'
-match uses regular expressions (regex), which comes in very handy here.
Any single-digit number but 0 would be [1-9] in regex. If your ErrorCode can have multiple digits, you can use \d{2,} to match 2 or more ({2,}) digits (\d). Combined, these would look like this: ([1-9]|\d{2,}) (| = or)
And here it is in the code from above:
foreach ($file in $fileNames) {
If (Get-Content $file | Where-Object {$_ -match '"ErrorCode": ([1-9]|\d{2,})'}) # true only if at least one line matches
{
$FoldersToRename += Split-Path $file
}
}
edit2: Ignoring whitespace / tabs:
The regex for any kind of whitespace is \s, and * means 0 or more, so
the string would be '"ErrorCode":\s*([1-9]|\d{2,})'
edit3: "Code" optional:
Here is the ultimate regex string to match any kind of Error, with optional quotation marks, "Code" and the colon:
"?Error(Code)?"?:?\s*([1-9]|\d{2,}) which in the code becomes {$_ -match '"?Error(Code)?"?:?\s*([1-9]|\d{2,})'}
Matching examples:
"ErrorCode": 404
"ErrorCode": 5
"ErrorCode": 0404
"ErrorCode":0404
Error:1
Error1
Test it yourself at regex101.com
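The matching examples can also be checked directly in PowerShell. The sketch below runs the final pattern against the samples listed above plus one non-error line; the variable names and the extra '"ErrorCode": 0' sample are added for illustration.

```powershell
$pattern = '"?Error(Code)?"?:?\s*([1-9]|\d{2,})'
$samples = '"ErrorCode": 404', '"ErrorCode": 5', '"ErrorCode": 0404',
           '"ErrorCode":0404', 'Error:1', 'Error1', '"ErrorCode": 0'

# Record for every sample whether the pattern treats it as an error.
$results = foreach ($s in $samples) {
    [pscustomobject]@{ Text = $s; IsError = ($s -match $pattern) }
}
$results | Format-Table -AutoSize
```

Note that \d{2,} accepts any run of two or more digits, so a value written as 00 would also be treated as an error under this pattern.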

Splitting file into smaller files, working script, but need some tweaks

I have a script here that looks for a delimiter in a text file with several reports in it. The script saves each individual report as its own text document. The tweaks I'm trying to achieve are:
In the middle of the data of each page there is - SPEC #: RX:<string>. I want that string to be saved as the filename.
It currently saves from the delimiter down to the next one. This ignores the first report and grabs every one after. I want it to go from the delimiter UP to the next one, but I haven't figured out how to achieve that.
$InPC = "C:\Users\path"
Get-ChildItem -Path $InPC -Filter *.txt | ForEach-Object -Process {
$basename= $_.BaseName
$m = ( ( Get-Content $_.FullName | Where { $_ | Select-String "END OF REPORT" -Quiet } | Measure-Object | ForEach-Object { $_.Count } ) -ge 2)
$a = 1
if ($m) {
Get-Content $_.FullName | % {
If ($_ -match "END OF REPORT") {
$OutputFile = "$InPC\$basename _$a.txt"
$a++
}
Add-Content $OutputFile $_
}
Remove-Item $_.FullName
}
}
This works, as stated it outputs the file with END OF REPORT on top, the first report in the file gets omitted as it does not have END OF REPORT above it.
Edited code:
$InPC = 'C:\Path' #
ForEach($File in Get-ChildItem -Path $InPC -Filter *.txt){
$RepNum=0
ForEach($Report in (([IO.File]::ReadAllText($File.FullName)) -split 'END OF REPORT\r?\n?' -ne '')){
if ($Report -match 'SPEC #: RX:(?<ReportFile>.*?)\.'){
$ReportFile=$Matches.ReportFile
}
$OutputFile = "{0}\{1}_{2}_{3}.txt" -f $InPC,$File.BaseName,$ReportFile,++$RepNum
$Report | Add-Content $OutputFile
}
# Remove-Item $File.FullName
}
I suggest using regular expressions:
read in the file as a single string (with the -Raw parameter),
split the file at the marker END OF REPORT into sections, and
use 'SPEC #: RX:(?<ReportFile>.*?)\.' with a named capture group to extract the string.
Edit adapted to PowerShell v2
## Q:\Test\2019\09\12\SO_57911471.ps1
$InPC = 'C:\Users\path' # 'Q:\Test\2019\09\12\' #
ForEach($File in Get-ChildItem -Path $InPC -Filter *.txt){
$RepNum=0
ForEach($Report in (((Get-Content $File.FullName) -join "`n") -split 'END OF REPORT\r?\n?' -ne '')){
if ($Report -match 'SPEC #: RX:(?<ReportFile>.*?)\.'){
$ReportFile=$Matches.ReportFile
}
$OutputFile = "{0}\{1}_{2}_{3}.txt" -f $InPC,$File.BaseName,$ReportFile,++$RepNum
$Report | Add-Content $OutputFile
}
# Remove-Item $File.FullName
}
This contrived sample text:
## Q:\Test\2019\09\12\SO_57911471.txt
I have a script here that looks for a delimiter in a text file with several reports in it.
In the middle of the data of each page there is -
SPEC #: RX:string1.
I want that string to be saved as the filename.
END OF REPORT
I have a script here that looks for a delimiter in a text file with several reports in it.
In the middle of the data of each page there is -
SPEC #: RX:string2.
I want that string to be saved as the filename.
END OF REPORT
yields:
> Get-ChildItem *string* -name
SO_57911471_string1_1.txt
SO_57911471_string2_2.txt
The added $RepNum counter is just a precaution in case the string cannot be extracted.
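The split and the named capture group can be tried on an in-memory string before touching any files. This is a minimal sketch; the sample text and the variable names $text, $reports and $names are assumptions for illustration.

```powershell
# Two small reports, each terminated by the END OF REPORT marker.
$text = @(
    'Report one body'
    'SPEC #: RX:string1.'
    'END OF REPORT'
    'Report two body'
    'SPEC #: RX:string2.'
    'END OF REPORT'
) -join "`n"

# Split at the marker and drop the empty piece left after the last marker.
$reports = @($text -split 'END OF REPORT\r?\n?' -ne '')

# Pull the name out of each report via the named capture group.
$names = foreach ($report in $reports) {
    if ($report -match 'SPEC #: RX:(?<ReportFile>.*?)\.') {
        $Matches.ReportFile
    } else {
        '(no name found)'
    }
}

"Reports: $($reports.Count)"
"Names:   $($names -join ', ')"
```

Because each piece ends where its marker was, the first report is kept, which is the "delimiter UP" behaviour asked for in the question.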

How do I select files with specific words in their content from a folder?

I am trying to filter files having any of the words January or February in their content.
$files = Get-ChildItem "C:\Users\Desktop\NewFolder\" -Recurse -Filter "*Support*"
$count = 0
$p = 'january', 'February'
foreach ($file in $files){
if((Get-Content $file.FullName) | Select-String -Pattern '^%january%'){
Write-Host "File found"
#write-host $file.FullName
$count++
}
else {
Write-Host "File NOt found"
}
}
Write-Host $count
Currently I am just getting "File NOt found" even though the file exists
Your issue might simply be your regex string, although the script as a whole could still be improved. The percent sign is not a wildcard character in regex. Also, are you expecting the month to appear at the start of a line? That is what the anchor ^ represents.
So your files likely do not have the string %January% at the start of any line. As mentioned, I don't think that is what you wanted.
So let's find all the files you want and filter them based on the presence of either of the words in $p (like in your example above)
$p ='january','February'
$regexPattern = ($p | ForEach-Object{[regex]::Escape($_)}) -join "|"
$files = Get-ChildItem -Path "c:\temp\" -filter "*.txt"
$files | Where-Object{Select-String -Path $_.Fullname -Pattern $regexPattern}
That will output any file objects that have the word January or February anywhere in a line.
$regexPattern ends up being a pipe-delimited string of the words in $p. [regex]::Escape() is a good way to neutralize special regex characters in your strings, especially if you are just using examples.
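To see what the joined pattern looks like and how it behaves, here is a small sketch on in-memory strings; the sample sentences and the 'a.b' example are made up for illustration.

```powershell
$p = 'january', 'February'

# Escape each word, then join the words with the regex alternation character.
$regexPattern = ($p | ForEach-Object { [regex]::Escape($_) }) -join '|'

$hit  = 'Report for JANUARY 2019' -match $regexPattern   # -match ignores case by default
$miss = 'Report for March 2019'   -match $regexPattern

# Escaping matters once a search word contains regex metacharacters.
$escaped = [regex]::Escape('a.b')

"Pattern: $regexPattern"
"Hit: $hit  Miss: $miss  Escaped: $escaped"
```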
You would of course need to change the -Path and -Filter accordingly as well as including -Recurse if the situation calls for it.
I think
Select-String -Path "C:\Users\Desktop\NewFolder\*Support*" -Pattern January,february
or (if you need to recurse the path)
Get-ChildItem -Path "C:\Users\Desktop\NewFolder" -Include *Support* -Recurse | Select-String -Pattern January,february
should get you what you want?
(Select-String also has a -CaseSensitive switch if you should need that)

Using Powershell to replace multiple strings in multiple files & folders

I have a list of strings in a CSV file. The format is:
OldValue,NewValue
223134,875621
321321,876330
....
and the file contains a few hundred rows (each OldValue is unique). I need to process changes over a number of text files in a number of folders & subfolders. My best guess of the number of folders, files, and lines of text are - 15 folders, around 150 text files in each folder, with approximately 65,000 lines of text in each folder (between 400-500 lines per text file).
I will make 2 passes at the data, unless I can do it in one. First pass is to generate a text file I will use as a check list to review my changes. Second pass is to actually make the change in the file. Also, I only want to change the text files where the string occurs (not every file).
I'm using the following Powershell script to go through the files & produce a list of the changes needed. The script runs, but is beyond slow. I haven't worked on the replace logic yet, but I assume it will be similar to what I've got.
# replace a string in a file with powershell
[reflection.assembly]::loadwithpartialname("Microsoft.VisualBasic") | Out-Null
Function Search {
# Parameters $Path and $SearchString
param ([Parameter(Mandatory=$true, ValueFromPipeline = $true)][string]$Path,
[Parameter(Mandatory=$true)][string]$SearchString
)
try {
#.NET FindInFiles Method to Look for file
[Microsoft.VisualBasic.FileIO.FileSystem]::GetFiles(
$Path,
[Microsoft.VisualBasic.FileIO.SearchOption]::SearchAllSubDirectories,
$SearchString
)
} catch { $_ }
}
if (Test-Path "C:\Work\ListofAllFilenamesToSearch.txt") { # if file exists
Remove-Item "C:\Work\ListofAllFilenamesToSearch.txt"
}
if (Test-Path "C:\Work\FilesThatNeedToBeChanged.txt") { # if file exists
Remove-Item "C:\Work\FilesThatNeedToBeChanged.txt"
}
$filefolder1 = "C:\TestFolder\WorkFiles"
$ftype = "*.txt"
$filenames1 = Search $filefolder1 $ftype
$filenames1 | Out-File "C:\Work\ListofAllFilenamesToSearch.txt" -Width 2000
if (Test-Path "C:\Work\FilesThatNeedToBeChanged.txt") { # if file exists
Remove-Item "C:\Work\FilesThatNeedToBeChanged.txt"
}
(Get-Content "C:\Work\NumberXrefList.CSV" |where {$_.readcount -gt 1}) | foreach{
$OldFieldValue, $NewFieldValue = $_.Split("|")
$filenamelist = (Get-Content "C:\Work\ListofAllFilenamesToSearch.txt" -ReadCount 5) #|
foreach ($j in $filenamelist) {
#$testvar = (Get-Content $j )
#$testvar = (Get-Content $j -ReadCount 100)
$testvar = (Get-Content $j -Delimiter "\n")
Foreach ($i in $testvar)
{
if ($i -imatch $OldFieldValue) {
$j + "|" + $OldFieldValue + "|" + $NewFieldValue | Out-File "C:\Work\FilesThatNeedToBeChanged.txt" -Width 2000 -Append
}
}
}
}
$FileFolder = (Get-Content "C:\Work\FilesThatNeedToBeChanged.txt" -ReadCount 5)
Get-ChildItem $FileFolder -Recurse |
select -ExpandProperty fullname |
foreach {
if (Select-String -Path $_ -SimpleMatch $OldFieldValue -Debug -Quiet) {
(Get-Content $_) |
ForEach-Object {$_ -replace $OldFieldValue, $NewFieldValue }|
Set-Content $_ -WhatIf
}
}
In the code above, I've tried several things with Get-Content - default, with -ReadCount, and -Delimiter - in an attempt to avoid an out of memory error.
The only thing I have control over is the length of the old & new replacement strings file. Is there a way to do this in Powershell? Is there a better option/solution? I'm running Windows 7, Powershell version 3.0.
Your main problem is that you're reading the file over and over again to change each of the terms. You need to invert the looping of the replace terms and looping of the files. Also, pre-load the csv. Something like:
$filefolder1 = "C:\TestFolder\WorkFiles"
$ftype = "*.txt"
$filenames = gci -Path $filefolder1 -Filter $ftype -Recurse
$replaceValues = Import-Csv -Path "C:\Work\NumberXrefList.CSV"
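foreach ($file in $filenames) {
# Read the file once, then apply every replacement pair in memory
$contents = Get-Content -Path $file.FullName
foreach ($replaceValue in $replaceValues) {
$contents = $contents -replace $replaceValue.OldValue, $replaceValue.NewValue
}
# Keep a backup copy before overwriting the original
Copy-Item $file.FullName "$($file.FullName).old"
Set-Content -Path $file.FullName -Value $contents
}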
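Since the question also asks to touch only files in which a string actually occurs, one possible refinement is to compare the content before and after the replacements and write only when something changed. The sketch below shows the idea on in-memory data; the sample lines and the variable names $csvText, $updated and $changed are assumptions for illustration.

```powershell
# Stand-in for Import-Csv: the same OldValue,NewValue layout as the CSV file.
$csvText       = 'OldValue,NewValue', '223134,875621', '321321,876330'
$replaceValues = $csvText | ConvertFrom-Csv

# Stand-in for the lines of one text file.
$contents = 'order 223134 shipped', 'no change here', 'order 321321 pending'

# Apply every replacement pair to the lines held in memory.
$updated = $contents
foreach ($pair in $replaceValues) {
    $old     = [regex]::Escape($pair.OldValue)   # treat the old value as literal text
    $updated = $updated -replace $old, $pair.NewValue
}

# Only a file whose content actually changed would need to be backed up and rewritten.
$changed = ($updated -join "`n") -cne ($contents -join "`n")

$updated
"Changed: $changed"
```

In the real loop, the Copy-Item and Set-Content calls would then sit inside an if ($changed) block, so untouched files keep their original timestamps and no backup is created for them.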