Matching Lines in a text file based on values in CSV - powershell

Hi Everyone,
I am having trouble with the below script. Here is the requirement:
1) Each text file needs to be compared with a single CSV file. The CSV file contains the values that, if present in the text file, should count as matches.
2) If the data in the text file matches, output the matches only and run jobs etc.
3) If the text file has no matches to the CSV file, exit with 0 as no matches are found.
I have tried to do this, but I end up with both matches and non-matches. What I really need is to match the lines, run the jobs and exit; if the text file has no matches, return 0.
$CSVFIL = Import-Csv -Path $DRIVE\test\csvfile.csv
$TEXTFIL = Get-Content -Path "$TEXTFILFOL\*.txt" |
Select-String -Pattern 'PAT1' |
Select-String -Pattern 'PAT2' |
Select-String -Pattern 'TEST'
ForEach ($line in $CSVFIL) {
    If ($TEXTFIL -match $line.COL1) {
        Write-Host 'RUNNING:' ($line.JOB01)
    } else {
        Write-Host "No Matches Found Exiting"
    }
}

I would handle this differently. First find the matches; if there are any, process them, otherwise output 0.
$matches = @()
foreach ($line in $CSVFIL)
{
if ($TEXTFIL -contains $line.COL1)
{ $matches += $line }
}
if ($matches.Count -gt 0)
{
$matches | Foreach-Object {
Write-Output "Running: $($_.JOB01)"
}
}
else
{
Write-Output "No matches found, exiting"
}
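One caveat with the loop above: -contains tests whether a collection holds an element equal to the value, whereas -match performs a regex match and therefore also finds substrings. If COL1 holds a fragment of a line rather than a whole line, -match (with the value escaped) is probably what is wanted. A minimal sketch, with made-up in-memory data standing in for the text files and the CSV (the sample lines and CSV rows are assumptions, not taken from the original post):

```powershell
# Hypothetical sample data standing in for the filtered text lines and the CSV.
$TEXTFIL = 'PAT1 PAT2 TEST alpha', 'PAT1 PAT2 TEST beta'
$CSVFIL  = 'COL1,JOB01', 'alpha,JobA', 'gamma,JobC' | ConvertFrom-Csv

# -contains needs a whole-element match, so a fragment of a line is not found.
$containsResult = $TEXTFIL -contains 'alpha'

# -match against an array returns the elements that match (regex/substring match).
# [regex]::Escape treats the CSV value as literal text.
$found = @($CSVFIL | Where-Object { $TEXTFIL -match [regex]::Escape($_.COL1) })

if ($found.Count -gt 0) {
    $found | ForEach-Object { Write-Output "Running: $($_.JOB01)" }
}
else {
    Write-Output 'No matches found, exiting'
}
```

The collected rows are stored in $found rather than $matches here, since $Matches is also the automatic variable that -match fills in.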

$CSVFIL = Import-Csv -Path "$DRIVE\test\csvfile.csv"
Get-Content -Path "$TEXTFILFOL\*.txt" |
where {$_ -like "*PAT1*" -and $_ -like "*PAT2*" -and $_ -like "*TEST*" } |
%{
$TEXTFOUNDED=$_; $CSVFIL | where {$TEXTFOUNDED -match $_.COL1} |
%{ [pscustomobject]@{Job=$_.JOB01;TextFounded=$TEXTFOUNDED;Col=$_.COL1 } }
}

Related

select-string with multiple conditions with powershell

I'm looking for a way to find 2 different lines in a file, and only if both lines exist do I need to perform a task.
So far this is my code:
$folderPath = 'c:\test'
$files = Get-ChildItem $Folderpath -Filter *.txt
$find = 'stringA'
$find2 = 'StringB'
$replace = 'something to replace with string b'
if ($files.Length -gt 0 ) {
$files |
select -ExpandProperty fullname |
foreach {
If(Select-String -Path $_ -pattern $find , $find2 -quiet )
{
(Get-Content $_) |
ForEach-Object {$_ -replace $find2, $replace } |
Set-Content $_
write-host "File Changed : " $_
}
}
}
else {
write-host "no files changed"
}
Currently, if I run it once it changes the files, but if I run it again it reports that it changed the same files again instead of printing "no files changed".
Is there a simpler way to make it happen?
Thanks
The Select-String cmdlet selects lines matching any of the patterns supplied to it. This means that the following file contains a match:
PS> Get-Content file.txt
This file contains only stringA
PS> Select-String -Pattern 'stringA', 'stringB' -Path file.txt
file.txt:1:This file contains only stringA
Passing the -Quiet flag to Select-String will produce a boolean result instead of a list of matches. The result is $True even though only one of the patterns is present.
PS> Get-Content file.txt
This file contains only stringA
PS> Select-String -Pattern 'stringA', 'stringB' -Path file.txt -Quiet
True
In your case, Select-String chooses all the files containing either 'stringA' or 'stringB', then replaces all instances of 'stringB' in those files. (Note that replacements are also performed in files you did not want to alter)
Even after the replacements, files containing only 'stringA' still exist: these files are caught and reported by your script the second time you run it.
One solution is to have two separate conditions joined by the -and operator:
If (
(Select-String -Path $_ -Pattern 'stringA' -Quiet) -and
(Select-String -Path $_ -Pattern 'stringB' -Quiet)
)
After this the script should work as intended, except that it won't report "no files changed" correctly.
If you fix your indentation you'll realise that the final else clause actually checks if there are no .txt files in the folder:
$files = Get-ChildItem $Folderpath -Filter *.txt
...
if ($files.length -gt 0) {
...
} else {
# will only display when there are no text files in the folder!
Write-Host "no files changed"
}
The way to resolve this would be to have a separate counter variable that increments every time you find a match. Then at the end, check if this counter is 0 and call Write-Host accordingly.
$counter = 0
...
foreach {
if ((Select-String ...) ...) {
...
$counter += 1
}
}
if ($counter -eq 0) {
Write-Host "no files changed"
}
To complement equatorialsnowfall's helpful answer, which explains the problem with your approach well, here is a streamlined, potentially more efficient solution:
$folderPath = 'c:\test'
$searchStrings = 'stringA', 'stringB'
$replace = 'something to replace string B with'
$countModified = 0
Get-ChildItem $Folderpath -Filter *.txt | ForEach-Object {
if (
(
($_ | Select-String -Pattern $searchStrings).Pattern | Select-Object -Unique
).Count -eq $searchStrings.Count
) {
($_ | Get-Content -Raw) -replace $searchStrings[1], $replace |
Set-Content $_.FullName
++$countModified
write-host "File Changed: " $_
}
}
if ($countModified -eq 0) {
Write-Host "no files changed"
}
A single Select-String call is used to determine whether all patterns match (the solution scales to any number of patterns):
Each Microsoft.PowerShell.Commands.MatchInfo output object has a .Pattern property that indicates which of the patterns passed to -Pattern matched on a given line.
If, after removing duplicates with Select-Object -Unique, the number of patterns associated with matching lines is the same as the number of input patterns, you can assume that all input patterns matched (at least once).
Reading each matching file as a whole with Get-Content's -Raw switch and therefore performing only a single -replace operation per file is much faster than line-by-line processing.
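To make the .Pattern check concrete, here is a small sketch that runs it against a throwaway file. The file content and the variable names $hits, $matchedPatterns and $allFound are made up for illustration:

```powershell
# Create a throwaway file; the sample content is an assumption for illustration.
$tmp = [System.IO.Path]::GetTempFileName()
Set-Content -Path $tmp -Value 'has stringA here', 'nothing', 'has stringB here', 'stringA again'

$searchStrings = 'stringA', 'stringB'

# Each MatchInfo object carries the pattern that matched its line.
$hits = Select-String -Path $tmp -Pattern $searchStrings
$matchedPatterns = @($hits.Pattern | Select-Object -Unique)

# True only when every input pattern matched at least one line.
$allFound = $matchedPatterns.Count -eq $searchStrings.Count

Remove-Item -Path $tmp
"Lines matched: $($hits.Count); distinct patterns: $($matchedPatterns.Count); all found: $allFound"
```

Wrapping the result in @( ) keeps .Count meaningful even when only one distinct pattern (or none) comes back.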

How to - Find and replace the first occurrence only

I have a script that seems to work correctly, only it works too well.
I have files that contain multiple lines with the string "PROCEDURE DIVISION.", with the period at the end.
What I need to do...
ONLY remove the [2nd occurrence] of the string "PROCEDURE DIVISION." if it's in the text file twice and bypass the file if it is only found once. I need to preserve the 1st occurrence and change/remove the 2nd occurrence.
I can find and replace all the occurrences easily, I have no clue how to replace only 1 of 2.
Is this possible using Powershell?
Here is my code so far...
Get-ChildItem 'C:\Temp\*.cbl' -Recurse | ForEach {
(Get-Content $_ | ForEach { $_ -replace "PROCEDURE DIVISION\.", " "}) | Set-Content $_
}
UPDATE
I got this to work and it's not pretty.
The only problem is that it is capturing the string in the comments section.
What I need to do is only count the string as a hit when it's found starting in position 8 on each line.
Is that possible?
Get-ChildItem 'C:\Thrivent\COBOL_For_EvolveWare\COBOL\COBOL\*.*' -Recurse | ForEach {
($cnt=(Get-Content $_ | select-string -pattern "PROCEDURE DIVISION").length)
if ($cnt -gt "1") {
(Get-Content $_ | ForEach { $_ -replace "PROCEDURE DIVISION\.", " "}) | Set-Content $_
$FileName = $_.FullName
Write-Host "$FileName = $cnt" -foregroundcolor green
}
}
There are potential issues with all of the provided answers. Reading a file using a switch statement is likely going to be the fastest method, but it needs to take into account PROCEDURE DIVISION. appearing multiple times on the same line. The method below is more memory-intensive than using switch but handles the multi-match, single-line condition. Note that you can use -cmatch for case-sensitive matching.
# Matches second occurrence of match when starting in position 7 on a line
Get-ChildItem 'C:\Temp\*.cbl' -Recurse -File | ForEach-Object {
$text = Get-Content -LiteralPath $_.Fullname -Raw
if ($text -match '(?sm)(\A.*?^.{6}PROCEDURE DIVISION\..*?^.{6})PROCEDURE DIVISION\.(.*)\Z') {
Write-Host "Changing file $($_.FullName)"
$matches.1+$matches.2 | Set-Content $_.FullName
}
}
This may be a bit of a hack, but it works. $myMatches = $pattern.Matches in the case below gives us 3 matches, and $myMatches[1].Index is the position of the second occurrence of the string you want to replace.
$text = "Hello foo, where are you foo? I'm here foo."
[regex]$pattern = "foo"
$myMatches = $pattern.Matches($text)
if ($myMatches.count -gt 1)
{
$newtext = $text.Substring(0,$myMatches[1].Index) + "bar" + $text.Substring($myMatches[1].Index + "foo".Length)
$newtext
}
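The same index-based idea can be wrapped in a small function so the occurrence number becomes a parameter. This is only a sketch: Edit-NthMatch and its parameter names are hypothetical, not part of PowerShell.

```powershell
# Hypothetical helper: replace only the Nth match of a regex in a string.
function Edit-NthMatch {
    param(
        [string]$Text,
        [string]$Pattern,
        [string]$Replacement,
        [int]$Occurrence = 2
    )
    $found = [regex]::Matches($Text, $Pattern)
    # Fewer matches than requested: return the text unchanged.
    if ($found.Count -lt $Occurrence) { return $Text }
    $m = $found[$Occurrence - 1]
    # Remove the matched span, then insert the replacement at the same index.
    $Text.Remove($m.Index, $m.Length).Insert($m.Index, $Replacement)
}

$result = Edit-NthMatch -Text "Hello foo, where are you foo? I'm here foo." -Pattern 'foo' -Replacement 'bar'
$unchanged = Edit-NthMatch -Text 'only one foo' -Pattern 'foo' -Replacement 'bar'
$result
```

Passing an empty -Replacement removes the occurrence instead of replacing it, which is closer to what the question asks for.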
try this:
$Founded=Get-ChildItem 'C:\Temp\' -Recurse -file -Filter "*.cbl" | Select-String -Pattern 'PROCEDURE DIVISION.' -SimpleMatch | where LineNumber -GT 1 | select Path -Unique
$Founded | %{
$Nb=0
$FilePath=$_.Path
$Content=Get-Content $FilePath | %{
if($_ -like '*PROCEDURE DIVISION.*')
{
$Nb++
if ($Nb -gt 1)
{
$_.replace('PROCEDURE DIVISION.', '')
}
else
{
$_
}
}
else
{
$_
}
}
$Content | Set-Content -Path $FilePath
}
You could use switch for this:
Get-ChildItem -Path 'C:\Temp' -Filter '*.cbl' -File -Recurse | ForEach-Object {
$occurrence = 0
$contentChanged = $false
$newContent = switch -Regex -File $_.FullName {
'PROCEDURE DIVISION\.' {
$occurrence++
if ($occurrence -eq 2) {
$_ -replace 'PROCEDURE DIVISION\.', " "
$contentChanged = $true
}
else { $_ }
}
default { $_ }
}
# only rewrite the file if a change has been made
if ($contentChanged) {
Write-Host "Updating file '$($_.FullName)'"
$newContent | Set-Content -Path $_.FullName -Force
}
}
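The update in the question asks to count the string only when it starts at a fixed column. Anchoring the pattern with ^ and a fixed-width prefix does that: ^.{7} consumes exactly seven characters, so whatever follows starts in position 8. A small sketch on made-up lines (the sample lines are assumptions; adjust the number if your column numbering differs):

```powershell
# Matches only when PROCEDURE DIVISION. begins in position 8 of the line.
$pattern = '^.{7}PROCEDURE DIVISION\.'

# Hypothetical sample lines built with explicit padding so the columns are clear.
$sampleLines = @(
    ((' ' * 7) + 'PROCEDURE DIVISION.'),         # starts in position 8
    ((' ' * 6) + '* PROCEDURE DIVISION.'),       # comment-style line, starts in position 9
    ((' ' * 7) + 'PROCEDURE DIVISION.')          # starts in position 8
)

$occurrence = 0
$newLines = foreach ($line in $sampleLines) {
    if ($line -match $pattern) {
        $occurrence++
        if ($occurrence -eq 2) {
            # Blank out only the second counted occurrence.
            $line -replace 'PROCEDURE DIVISION\.', ' '
            continue
        }
    }
    $line
}
$newLines
```

The same anchored pattern could be used as the case label in the switch -Regex -File version above.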

Powershell Test-Path in foreach loop null

I have a list of numbers in a file (123-45-678, 876-54-321, but they're on separate lines) and I'm trying to do something if a number is in the file and something else if it is not.
$list = Get-Content $env:USERPROFILE\Documents\list.txt
Foreach ($obj in $list)
{
$nxt = Get-ChildItem -Path $env:USERPROFILE\Documents\files -Recurse -Filter "*$obj*"
$FileExists = Test-Path $nxt
If ($FileExists -ne $True)
{
Write-Host "Yippee"
}
Else
{
Write-Host "Not found"
}
}
If there is a number in the file it seems to work but if it's not I get the following
TEST-PATH : Cannot bind argument to parameter 'Path' because it is null.
I thought the purpose of test-path was to return something if it couldn't find anything?
Why are you negating the results of both Test-Path and if([System.IO.File]::Exists($nxt)) ?
If either returns $true, then the file EXISTS.
Because you say you want to do something if a file with any such number in its name is found and something else if not, I would do it using the regular expression -match operator.
For that, you need to put all possible numbers from the text file in a single string, separated by the regex OR operator |:
# get the numbers from the file and for security remove empty or whitespace-only lines
$list = Get-Content -Path "$env:USERPROFILE\Documents\list.txt" | Where-Object { $_ -match '\S' }
# build a regex string from that array
$regex = ($list | ForEach-Object { [regex]::Escape($_) }) -join '|'
# get all files and test if they match any of the numbers
Get-ChildItem -Path "$env:USERPROFILE\Documents\files" -Recurse -File | ForEach-Object {
if ( $_.Name -match $regex ) {
Write-Host "File '$($_.Name)' exists" -ForegroundColor Green
}
else {
Write-Host "File '$($_.FullName)' does not match any of the items in the list."
}
}
Result would look something like this:
File '876-54-321.txt' exists
File 'C:\snowman\Documents\files\somefile.txt' does not match any of the items in the list.
File 'Whatever 123-45-678-901.txt' exists
File 'XYZ 876-54-321.txt' exists
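As for the error itself: when no file matches, Get-ChildItem returns nothing, so $nxt is $null and Test-Path cannot bind it to -Path. Since Get-ChildItem already tells you whether something was found, one option is to test its result directly. A sketch using a temporary folder and made-up file names (all paths and names here are assumptions):

```powershell
# Build a throwaway folder containing one file whose name includes a listed number.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
New-Item -ItemType File -Path (Join-Path $dir 'XYZ 876-54-321.txt') | Out-Null

$list = '123-45-678', '876-54-321'
$results = foreach ($obj in $list) {
    $nxt = Get-ChildItem -Path $dir -Recurse -Filter "*$obj*"
    # An empty result is falsy, so no Test-Path call is needed.
    if ($nxt) { '{0}: found' -f $obj } else { '{0}: not found' -f $obj }
}
$results

Remove-Item -Path $dir -Recurse
```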

Powershell - Compare multiple files against single csv files

I am trying to compare multiple files against a single document. I have managed to make that part work; my issue is that I want to check whether each file exists before the comparison is run.
I.e. check if file A exists; if so, compare it against the master csv file; if not, continue on and check if file B exists; if so, compare it against the master csv, and so on.
My script so far:
$files = get-content -path "H:\Compare\File Location\servername Files.txt"
$prod = "H:\compare\Results\master_SystemInfo.csv"
foreach ($file in $files) {
If((Test-Path -path $file))
{
Write-Host "File exists, comparing against production"
$content1 = Get-Content "H:\Compare\Results\$file"
$content2 = Get-Content $prod
$comparedLines = Compare-Object $content1 $content2 -IncludeEqual |
Sort-Object { $_.InputObject.ReadCount }
$lineNumber = 0
$comparedLines | foreach {
$pattern = ".*"
if($_.SideIndicator -eq "==" -or $_.SideIndicator -eq "=>")
{
$lineNumber = $_.InputObject.ReadCount
}
if($_.InputObject -match $pattern)
{
if($_.SideIndicator -ne "==")
{
if($_.SideIndicator -eq "=>")
{
$lineOperation = "prod"
}
elseif($_.SideIndicator -eq "<=")
{
$lineOperation = "test"
}
[PSCustomObject] @{
Line = $lineNumber
File = $lineOperation
Text = $_.InputObject
}
}
}
} | Export-Csv "h:\compare\Comparison Reports\Prod.vs.$file" -NoTypeInformation
}
Else
{ "File does not exist, aborting" ; return}
}
The comparison is working; I just need to add the check for the file before running the comparison, as it is still spitting out results for files that don't exist.
Thank you very much,
I found the answer by altering the code. This time I first create a txt file listing the files in the folder, so I don't need Test-Path. The script now generates a file list from the folder, compares each file against the master file and writes one output file per comparison, named after the original file, i.e. "Prod.vs._SystemInfor.csv".
FYI: in the first line, abc123* is a placeholder I put in to look for specific server names within the folder and generate a file list based on those only. We have a number of servers with similar naming conventions; only the last 4 digits differ depending on where they are located.
Thanks
Working Powershell script:
Get-ChildItem -file abc123* H:\Compare\Results -Name | Out-File "H:\Compare\Results\Office Files.txt"
$officefiles = get-content -path "H:\Compare\results\Office Files.txt"
$officeprod = "H:\compare\Results\master_SystemInfo.csv"
foreach ($officefile in $officefiles) {
$content1 = Get-Content "H:\Compare\Results\$officefile"
$content2 = Get-Content $officeprod
$comparedLines = Compare-Object $content1 $content2 -IncludeEqual |
Sort-Object { $_.InputObject.ReadCount }
$lineNumber = 0
$comparedLines | foreach {
$pattern = ".*"
if($_.SideIndicator -eq "==" -or $_.SideIndicator -eq "=>")
{
$lineNumber = $_.InputObject.ReadCount
}
if($_.InputObject -match $pattern)
{
if($_.SideIndicator -ne "==")
{
if($_.SideIndicator -eq "=>")
{
$lineOperation = "prod"
}
elseif($_.SideIndicator -eq "<=")
{
$lineOperation = "test"
}
[PSCustomObject] @{
Line = $lineNumber
File = $lineOperation
Text = $_.InputObject
}
}
}
} | Export-Csv "h:\compare\Comparison Reports\Prod.vs.$officefile" -NoTypeInformation
}
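If an explicit existence check is still wanted, two things about the first script stand out: Test-Path was given the bare name from the list while Get-Content read from the Results folder, and return stops the whole script at the first missing file. It is not certain that this was the cause in the original environment, but a sketch that tests the full path and uses continue to skip missing files could look like this (the temporary folder and file names are stand-ins):

```powershell
# Throwaway folder with one of the two listed files present.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $root | Out-Null
Set-Content -Path (Join-Path $root 'serverA.csv') -Value 'a', 'b'

$files = 'serverA.csv', 'serverB.csv'
$compared = @()
foreach ($file in $files) {
    # Test the same full path that will later be read.
    $fullPath = Join-Path $root $file
    if (-not (Test-Path -Path $fullPath -PathType Leaf)) {
        Write-Host "File '$file' does not exist, skipping"
        continue   # move on to the next file instead of aborting
    }
    # The Compare-Object logic from the script would go here.
    $compared += $file
}
$compared

Remove-Item -Path $root -Recurse
```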

Powershell, read from a txt file and format data (remove lines, remove blank spaces in between)

I am really very new to powershell. I want to use powershell to read a txt file and change it to another format.
Read from a txt file.
Format Data( remove lines, remove blank spaces in between)
Count of records ( "T 000000002" 9 chars)
and then write the output to a new file.
I just started powershell two days ago so I don't know how to do this yet.
Reading from a file:
Get-Content file.txt
Not quite sure what you want here. Get-Content returns an array of strings. You can then manipulate what you get and pass it on. The most helpful cmdlets here are probably Where-Object (for filtering) and ForEach-Object (for manipulating).
For example, to remove all blank lines you can do
Get-Content file.txt | Where-Object { $_ -ne '' } > file2.txt
This can be shortened to
Get-Content file.txt | Where-Object { $_ } > file2.txt
since an empty string in a boolean context evaluates to false.
Or to remove spaces in every line:
Get-Content file.txt | ForEach-Object { $_ -replace ' ' } > file2.txt
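One thing to keep in mind with the truthiness shortcut: only a truly empty string is false, so a line consisting solely of spaces passes the filter. A quick sketch on an in-memory array (the sample lines are made up) that shows the difference and the space removal:

```powershell
# Hypothetical sample lines: one empty, one whitespace-only.
$lines = 'a b  c', '', '   ', 'd e'

# Drops only the empty string; the whitespace-only line is kept.
$nonEmpty = @($lines | Where-Object { $_ })

# Drops empty and whitespace-only lines alike.
$nonBlank = @($lines | Where-Object { $_ -match '\S' })

# -replace with no replacement string deletes every match.
$noSpaces = @($nonBlank | ForEach-Object { $_ -replace ' ' })
$noSpaces
```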
Again, not quite sure what you're after here. Possible things I could think of from your overly elaborate description are something along the lines of
$_.Substring(2).Length
or
$_ -match '(\d+)' | Out-Null
$Matches[1].Length
function Count-Object() {
begin {
$count = 0
}
process {
$count += 1
}
end {
$count
}
}
$a= get-content .\members.txt |
Foreach-Object { ($_ -replace '\s','') } |
Foreach-Object { ($_ -replace '-','') } |
Foreach-Object { ($_ -replace 'OP_ID','') } |
Foreach-Object { ($_ -replace 'EFF_DT','') } |
Where-Object { $_ -ne '' }|
set-content .\newmembers.txt
$b = Get-Content .\newmembers.txt | Count-Object
"T {0:D9}" -f $b | add-content .\newmembers.txt
I also like using the ? alias in place of Where-Object to trim it down that much more.
Get-Content file.txt | ?{ $_ } > file2.txt
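For the record count, the number of cleaned lines can also be taken from the array's .Count and formatted directly, without a custom counting function. A sketch that cleans some made-up lines and appends a trailer such as "T 000000002" (the sample data is an assumption; the question does not say whether the trailer should count itself, so it is not counted here):

```powershell
# Hypothetical raw input: a header, a separator, a blank line and two records.
$raw = 'OP_ID  EFF_DT', '-----  ------', '', 'AB 12 34', 'CD 56 78'

# Strip whitespace, dashes and the header words in one pass, then drop empty lines.
$clean = @(
    $raw |
        ForEach-Object { $_ -replace '\s|-|OP_ID|EFF_DT', '' } |
        Where-Object { $_ -ne '' }
)

# D9 pads the integer count with leading zeros to 9 digits.
$trailer = 'T {0:D9}' -f $clean.Count
$output = $clean + $trailer
$output
```

$output could then be written in one go with Set-Content instead of writing the file, reading it back and appending to it.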