Remove entries of one text file present in another - powershell

I have two text files, as shown below:
exclude.txt
10.1.1.3
10.1.1.4
10.1.1.5
10.1.1.6
free.txt
10.1.1.3
10.1.1.4
10.1.1.5
10.1.1.6
10.1.1.7
10.1.1.8
10.1.1.9
10.1.1.10
I want to exclude the entries of exclude.txt from free.txt and write the result to another file:
10.1.1.7
10.1.1.8
10.1.1.9
10.1.1.10
I tried :
compare-object (get-content $freeips) (get-content $excludeip) -PassThru | format-list | Out-File $finalips
In the final output I always get the first IP of exclude.txt as well:
10.1.1.7
10.1.1.8
10.1.1.9
10.1.1.10
10.1.1.3
and another way I tried
$exclude = Get-Content "C:\exclude.txt"
foreach($ip in $exclude)
{
get-content "C:\free.txt" | select-string -pattern $ip -notmatch | Out-File "C:\diff.txt"
}
But in this case I also get the entries of exclude.txt in the final output.
Please let me know what I am doing wrong here.

The Select-String solution is probably faster. Besides, it doesn't require iterating through the IP addresses, as the -Pattern parameter accepts a string array (String[]). The point, though, is that by default the pattern(s) represent a regular expression, where a dot (.) is a placeholder for any character. To search for a literal pattern, use the -SimpleMatch switch:
$exclude = Get-Content .\exclude.txt
get-content .\free.txt |Select-String -pattern $exclude -NotMatch -SimpleMatch
Note: The space at the top of the displayed exclude.txt file suggests that there might be an empty line at the top of the file (an empty regex pattern matches any string). To get rid of any empty lines, use:
$exclude = Get-Content .\exclude.txt |Where-Object { $_ }
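One caveat: -SimpleMatch performs a substring match, so an excluded address such as 10.1.1.1 would also remove 10.1.1.10. A minimal sketch of an exact, whole-line comparison using the -notin operator (PowerShell 3.0 or later) follows. The sample data is inlined instead of being read from the files, and the variable names are only illustrative:

```powershell
# Sample data inlined for illustration; in practice these would come from
# Get-Content .\exclude.txt and Get-Content .\free.txt.
$exclude = '10.1.1.3', '10.1.1.4', '10.1.1.5', '10.1.1.6', ''
$free    = '10.1.1.3', '10.1.1.4', '10.1.1.5', '10.1.1.6',
           '10.1.1.7', '10.1.1.8', '10.1.1.9', '10.1.1.10'

# Trim whitespace and drop empty lines so stray blanks do not affect the comparison.
$exclude = $exclude | ForEach-Object { $_.Trim() } | Where-Object { $_ }

# Keep only the lines of $free that are not exactly equal to any excluded entry.
$remaining = $free | Where-Object { $_.Trim() -notin $exclude }
$remaining
```

The result can then be piped to Out-File or Set-Content as in the question.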

When comparing, $excludeip should be the referenceObject and $freeips comes after, like this:
compare-object (get-content $excludeip) (get-content $freeips) -PassThru | Out-File $finalips
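Compare-Object reports differences from both sides, so if the goal is strictly the entries that appear only in free.txt, the SideIndicator property can serve as a filter. A sketch, assuming the exclude list is the reference object and using inlined sample data in place of the files:

```powershell
# Inlined sample data standing in for the contents of exclude.txt and free.txt.
$exclude = '10.1.1.3', '10.1.1.4'
$free    = '10.1.1.3', '10.1.1.4', '10.1.1.7', '10.1.1.8'

# '=>' marks items present only in the difference object ($free here).
$onlyInFree = Compare-Object -ReferenceObject $exclude -DifferenceObject $free |
    Where-Object { $_.SideIndicator -eq '=>' } |
    ForEach-Object { $_.InputObject }
$onlyInFree
```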

Related

Multiple Select Strings in a for loop to separate files

I wrote this script to search a lot of text files (~100,000) for 4 different search criteria and export the results to 4 separate files. I thought it would be more efficient to perform all 4 searches on each file as it is loaded, rather than doing 4 full searches as the first version below does. I may be missing some other major inefficiencies, as I am pretty new to PowerShell.
I have rewritten this script from the first version to the second, but I can't figure out how to get the path and data to display together as the first version did. I am struggling to reference the object within the loop and have pieced this second version together; it works, but it does not give me the path to the file, which is necessary.
It seems like I am just missing one or two little things to get me going in the right direction. Thanks in advance for your help.
1st version:
Get-ChildItem -Filter *.txt -Path "\\file\to\search" -Recurse | Select-String -Pattern "abc123" -Context 0,3 | Out-File -FilePath "\\c:\out.txt"
Get-ChildItem -Filter *.txt -Path "\\file\to\search2" -Recurse | Select-String -Pattern "abc124" -Context 0,3 | Out-File -FilePath "\\c:\out2.txt"
Get-ChildItem -Filter *.txt -Path "\\file\to\search3" -Recurse | Select-String -Pattern "abc125" -Context 0,3 | Out-File -FilePath "\\c:\out3.txt"
Get-ChildItem -Filter *.txt -Path "\\file\to\search4" -Recurse | Select-String -Pattern "abc126" -Context 0,3 | Out-File -FilePath "\\c:\out4.txt"
Output:
\\file\that\was\found\example.txt:84: abc123
\\file\that\was\found\example.txt:90: abc123
\\file\that\was\found\example.txt:91: abc123
2nd version:
##$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ Configuration $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
############################################ Global Parameters #############################################
$SearchPath="\\file\to\search"
$ProgressFile="\\progress\file\ResultsCount.txt"
$records = 105325
##----------------------------------------- End Global Parameters -----------------------------------------
########################################### Search Parameters ##############################################
##Search Pattern 1
$Pattern1="abc123"
$SaveFile1="\\c:\out.txt"
##Search Pattern 2
$Pattern2="abc124"
$SaveFile2="\\c:\out2.txt"
##Search Pattern 3
$Pattern3= "abc125"
$SaveFile3= "\\c:\out3.txt"
##Search Pattern 4
$Pattern4= "abc126"
$SaveFile4="\\c:\out4.txt"
##Search Pattern 5
$Pattern5= ""
$SaveFile5=""
##----------------------------------------- End Search Parameters ------------------------------------------
##$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ End of Config $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
############################### SCRIPT #####################################################################
## NOTES
## ------
##$files=Get-ChildItem -Filter *.txt -Path $SearchPath -Recurse ## Set all files to variable #### Long running, needs to be a better way #######
##$records=$files.count ## Set record #
Get-ChildItem -Filter *.txt -Path $SearchPath -Recurse | Foreach-Object { ## loop through search folder
$i=$i+1 ## increment record
##
Get-Content $_.FullName | Select-String -Pattern $Pattern1 -Context 0,3 | Out-File -FilePath $SaveFile1 ## pattern1 search
Get-Content $_.FullName | Select-String -Pattern $Pattern2 | Out-File -FilePath $SaveFile2 ## pattern2 search
Get-Content $_.FullName | Select-String -Pattern $Pattern3 -Context 0,1 | Out-File -FilePath $SaveFile3 ## pattern3 search
Get-Content $_.FullName | Select-String -Pattern $Pattern4 -Context 0,1 | Out-File -FilePath $SaveFile4 ## pattern4 search
##Get-Content $_.FullName | Select-String -Pattern $Pattern5 -Context 0,1 | Out-File -FilePath $SaveFile5 ## pattern5 search (Comment out unneeded search lines like this one)
$progress ="Record $($i) of $($records)" ## set progress
Write-Host "Record $($i) of $($records)" ## Writes progress to window
$progress | Out-File -FilePath $ProgressFile ## progress file
} ##
############################################################################################################
Output:
abc123
abc123
abc123
Edit: I am also trying to figure out a good way to avoid hard-coding the number of records for a decent progress readout. I commented out the approach I thought would work (the first two lines of the script), but there must be a more efficient way than running the same search twice, once for a count and once for the loop.
I would be very interested in any runtime efficiency information you could provide.
[edit - thanks to mklement0 for pointing out the errors about speed and the -SimpleMatch switch. [grin]]
the Select-String cmdlet will accept a -Path parameter ... and it is FAR [i was thinking of Get-Content, not Get-ChildItem] faster than using Get-ChildItem to feed the files to S-S. [grin]
also, the -Pattern parameter accepts a regex OR pattern like Thing|OtherThing|YetAnotherThing - and it accepts simple string patterns if you use the -SimpleMatch switch parameter.
what the code does ...
defines the source dir
defines the file spec
joins those two into a wildcard file path
builds an array of string patterns to use
calls Select-String with a path and an array of strings to search for
uses Group-Object and a calculated property to group the matches by the last part of .Line property from the S-S call
saves that to a $Var
shows that on screen
at that point, you can use the .Name property of each GroupInfo to select the items to send out to each file AND to build your file names.
the code ...
$SourceDir = 'D:\Temp\zzz - Copy'
$FileSpec = '*.log'
$SD_FileSpec = Join-Path -Path $SourceDir -ChildPath $FileSpec
$TargetPatternList = @(
'Accordion Cajun Zydeco'
'better-not-be-there'
'Piano Rockabilly Rowdy'
)
$GO_Results = Select-String -Path $SD_FileSpec -SimpleMatch $TargetPatternList |
Group-Object -Property {$_.Line.Split(':')[-1]}
$GO_Results
output ...
Count Name Group
----- ---- -----
6 Accordion Cajun Zydeco {D:\Temp\zzz - Copy\Grouping-List_08-02.log:11:Accordion Cajun Zydeco, D:\Temp\zzz - Copy\Grouping-List_08-09.log:11:Accordion Cajun Zy...
6 Bawdy Dupe Piano Rocka... {D:\Temp\zzz - Copy\Grouping-List_08-02.log:108:Bawdy Dupe Piano Rockabilly Rowdy, D:\Temp\zzz - Copy\Grouping-List_08-09.log:108:Bawdy...
6 Bawdy Piano Rockabilly... {D:\Temp\zzz - Copy\Grouping-List_08-02.log:138:Bawdy Piano Rockabilly Rowdy, D:\Temp\zzz - Copy\Grouping-List_08-09.log:138:Bawdy Pian...
6 Dupe Piano Rockabilly ... {D:\Temp\zzz - Copy\Grouping-List_08-02.log:948:Dupe Piano Rockabilly Rowdy, D:\Temp\zzz - Copy\Grouping-List_08-09.log:948:Dupe Piano ...
6 Instrumental Piano Roc... {D:\Temp\zzz - Copy\Grouping-List_08-02.log:1563:Instrumental Piano Rockabilly Rowdy, D:\Temp\zzz - Copy\Grouping-List_08-09.log:1563:I...
6 Piano Rockabilly Rowdy {D:\Temp\zzz - Copy\Grouping-List_08-02.log:1781:Piano Rockabilly Rowdy, D:\Temp\zzz - Copy\Grouping-List_08-09.log:1781:Piano Rockabil...
note that the .Group contains an array of lines from the matches sent out by the S-S call. you can send that to your output file.
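The last step described above (using .Name to pick the output file and .Group for its content) could be sketched as follows. The temporary folder, the two sample log files, and the output file naming are assumptions made only for this illustration, and the grouping key here is simply the matched line:

```powershell
# Hypothetical working folder with two small log files, created only for this sketch.
$WorkDir = Join-Path ([System.IO.Path]::GetTempPath()) ('ss-demo-' + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path $WorkDir
Set-Content -Path (Join-Path $WorkDir 'a.log') -Value 'Accordion Cajun Zydeco', 'other line'
Set-Content -Path (Join-Path $WorkDir 'b.log') -Value 'Piano Rockabilly Rowdy', 'Accordion Cajun Zydeco'

$TargetPatternList = @('Accordion Cajun Zydeco', 'Piano Rockabilly Rowdy')

# Search all *.log files once and group the matches by the matched line.
$GO_Results = Select-String -Path (Join-Path $WorkDir '*.log') -SimpleMatch -Pattern $TargetPatternList |
    Group-Object -Property { $_.Line }

foreach ($Group in $GO_Results) {
    # Build a file name from the group name (whitespace replaced with underscores).
    $OutFile = Join-Path $WorkDir (($Group.Name -replace '\s+', '_') + '.txt')
    # Each item in .Group is a MatchInfo; format it as path:line number:text.
    $Group.Group |
        ForEach-Object { '{0}:{1}:{2}' -f $_.Path, $_.LineNumber, $_.Line } |
        Set-Content -Path $OutFile
}
```

The output files use a .txt extension so that they are not picked up by the *.log file spec on a later run.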
Here is my take at solving this problem, very similar to Lee_Dailey's nice answer but with a foreach loop. I would recommend investing some time into researching the multi-threading options available on PowerShell in case you need to increase the performance of the script, you can look specifically at the ThreadJob module by Microsoft which is really easy to use or if you can't install modules due to some work policy, you can use Runspace.
It is worth adding that you can use the -List switch on Select-String, this way the performance of the script would be increased even more:
-List
Only the first instance of matching text is returned from each input file. This is the most efficient way to retrieve a list of files that have contents matching the regular expression.
$map = @{
abc123 = 'C:\out_abc123.txt'
abc124 = 'C:\out_abc124.txt'
abc125 = 'C:\out_abc125.txt'
}
$pattern = $map.Keys -join '|'
$match = foreach($file in Get-ChildItem *.txt)
{
Select-String -LiteralPath $file.FullName -Pattern $pattern
}
$match | Group-Object { $_.Matches.Value } | ForEach-Object {
$_.Group | Select-Object Path, LineNumber, Line | Out-File $map[$_.Name]
}
To complement the answers from @Santiago Squarzon and Lee_Dailey: I think you were actually on the right track yourself, given that the Group-Object cmdlet is pretty expensive, especially in memory usage, as it chokes the PowerShell pipeline, causing all the search results to pile up in memory.
Besides, the Select-String cmdlet supports multiple (-SimpleMatch) patterns, whereas concatenating the search patterns with a | (-join '|') forces you to use an (escaped) regular expression.
To continue on your approach:
(note that in the example, I am using my own settings to search through my script files)
$ProgressFile = '.\ResultsCount.txt'
$SearchRoot = '..\'
$Filter = '*.ps1'
$Searches = @{
'Null' = '.\Null.txt'
'Test' = '.\Test.txt'
'Object' = '.\Object.txt'
}
$Files = Get-ChildItem -Filter $Filter -Path $SearchRoot -Recurse
$Total = $Files.count
$Searches.Values |ForEach-Object { Set-Content -LiteralPath $_ -Value '' }
$i = 0
ForEach ($File in $Files) {
Get-Content -LiteralPath $File.FullName |
Select-String @($Searches.Keys) -AllMatches |ForEach-Object {
$Value = '{0}:{1}:{2}' -f $File.FullName, $_.LineNumber, $_
Add-Content -LiteralPath $Searches[$_.Pattern] -Value $Value
}
'Record {0} of {1}' -f ++$i, $Total |Tee-Object -Append .\ProgressFile.txt
}
Explanations
$Searches = @{ ...
Maps the search patterns to the files. You might also use a PSObject list to specify each search (where you could add columns with e.g. context start/end values, etc.)
$Searches.Values |ForEach-Object { Set-Content -LiteralPath $_ -Value '' }
Empties the result files (knowing that they are not part of the main stream you can't use Add-Content)
$i = 0
Unfortunately there is no automatic index that initializes with a foreach loop (yet, see: #13772 Automatic variable for the pipeline index)
Get-Content -LiteralPath $File.FullName
Load the content once into memory
Note1: this is a string array.
Note2: the $Content will be reused each iteration and therefore overwrites the previous one and unloads it from memory
Select-String @($Searches.Keys) -AllMatches |ForEach-Object {
Searches the string array using your (multiple) defined patterns. (Consider using the -SimpleMatch parameter if your search strings contain special characters.)
Note: Unfortunately, you need to embed $Searches.Keys in an array subexpression operator @( ); for details see .Net issue: #56835 Make OrderedDictionaryKeyValueCollection implement IList
$Value = '{0}:{1}:{2}' -f $File.FullName, $_.LineNumber, $_
Builds a result output string.
Note: the result of the Select-String does have a (hidden) LineNumber and (matched) Pattern property.
Add-Content -LiteralPath $Searches[$_.Pattern] -Value $Value
Add the result string to the specific mapped output file.
'Record {0} of {1}' -f ++$i, $Total |Tee-Object -Append .\ProgressFile.txt
Tee-Object will write the progress to the standard output (display) and also to the specific file.

create file index manually using powershell, tab delimited

Sorry in advance for the probably trivial question, I'm a powershell noob, please bear with me and give me advice on how to get better.
I want to achieve a file index index.txt that contains the list of all files in current dir and subdirs in this format:
./dir1/file1.txt 07.05.2020 16:16 1959281
where
dirs listed are relative (i.e. this will be run remotely and to save space, the relative path is good enough)
the delimiter is a tab \t
the date format is day.month.fullyear hours:minutes:seconds, last written (this is the case for me, but I'm guessing this would be different on system setting and should be enforced)
(the last number is the size in bytes)
I almost get there using this command in powershell (maybe that's useful to someone else as well):
get-childitem . -recurse | select fullname,LastWriteTime,Length | Out-File index.txt
with this result
FullName LastWriteTime Length
-------- ------------- ------
C:\Users\user1\Downloads\test\asdf.txt 07.05.2020 16:19:29 1490
C:\Users\user1\Downloads\test\dirtree.txt 07.05.2020 16:08:44 0
C:\Users\user1\Downloads\test\index.txt 07.05.2020 16:29:01 0
C:\Users\user1\Downloads\test\test.txt 07.05.2020 16:01:23 814
C:\Users\user1\Downloads\test\text2.txt 07.05.2020 15:55:45 1346
So the questions that remain are: How to...
get rid of the headers?
enforce this date format?
tab delimit everything?
get control of what newline character is used (\n or \r or both)?
Another approach could be this:
$StartDirectory = Get-Location
Get-ChildItem -Path $StartDirectory -recurse |
Select-Object -Property @{Name='RelPath';Expression={$_.FullName.toString() -replace [REGEX]::Escape($StartDirectory.ToString()),'.'}},
@{Name='LastWriteTime';Expression={$_.LastWriteTime.toString('dd.MM.yyyy HH:mm:ss')}},
Length |
Export-Csv -Path Result.csv -NoTypeInformation -Delimiter "`t"
I recommend using proper CSV files if you have structured data like this. The resulting CSV file will be saved in the current working directory.
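If the header row must be omitted while staying with the CSV route, one option is ConvertTo-Csv followed by skipping the first output line. A sketch using hypothetical sample objects in place of the Get-ChildItem output; note that by default the values are wrapped in double quotes:

```powershell
# Hypothetical rows standing in for the Select-Object output above.
$rows = @(
    [pscustomobject]@{ RelPath = '.\dir1\file1.txt'; LastWriteTime = '07.05.2020 16:16:00'; Length = 1959281 }
    [pscustomobject]@{ RelPath = '.\test.txt';       LastWriteTime = '07.05.2020 16:01:23'; Length = 814 }
)

# ConvertTo-Csv emits the header as the first string; Select-Object -Skip 1 drops it.
$lines = $rows |
    ConvertTo-Csv -NoTypeInformation -Delimiter "`t" |
    Select-Object -Skip 1
$lines
# $lines | Set-Content -Path Result.csv   # uncomment to write the file
```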
If the path you are running this from is NOT the current script path, do:
$path = 'D:\Downloads' # 'X:\SomeFolder\SomeWhere'
Set-Location $path
first.
Next, this ought to do it:
Get-ChildItem . -Recurse -File | ForEach-Object {
"{0}`t{1:dd.MM.yyyy HH:mm}`t{2}" -f ($_ | Resolve-Path -Relative), $_.LastWriteTime, $_.Length
} | Out-File 'index.txt'
On Windows the newline will be \r\n (CRLF)
If you want control over that, this should do:
$newline = "`n" # for example
# capture the lines as string array in variable $lines
$lines = Get-ChildItem . -Recurse -File | ForEach-Object {
"{0}`t{1:dd.MM.yyyy HH:mm}`t{2}" -f ($_ | Resolve-Path -Relative), $_.LastWriteTime, $_.Length
}
# join the array with the chosen newline and save to file
$lines -join $newline | Out-File 'index.txt' -NoNewline
Because your requirement is to NOT have column headers in the output file, I'm using Out-File here instead of Export-Csv

how to use select-string to get value of particular string from txt file

How do I get a specific string from a txt file using Select-String? For example, what I have tried so far is:
$path = "\\serverpath\servername.txt"
$list = select-string -path $path -pattern "node"
write-host $list
servername.txt contains:
servername is node1 and it is development server, it has problem
servername is node2 and it is production server, it is good
So I need to list only the words node1, node2 ...from the .txt file.
When you want to use Select-String for this you can just expand your Pattern to match the digit following node and extract the matched Value like this:
$path = "\\serverpath\servername.txt"
$list = Select-String -Path $path -Pattern "node\d+" -AllMatches | % {$_.Matches.Value}
Write-Host $list
Explanation:
"node\d+" -> matches the word node followed by one or more digits (\d+)
% -> alias for ForEach-Object, which processes each value coming from the pipeline
$_.Matches.Value -> gives the matched value
You can use a regex pattern and then access the Matches property of the resulting Select-String object.
$path = "\\serverpath\servername.txt"
$list = (select-string -Path $path -pattern "(node\d)").Matches
write-host $list
The regex pattern matches node plus a single digit between 0-9 following it.
https://regex101.com/r/vg4LIz/5
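The difference between the two patterns above (\d versus \d+) only shows up once a node number has more than one digit. A small sketch with inlined sample lines; the node12 entry is made up for illustration:

```powershell
# Inlined sample lines; node12 is a hypothetical two-digit server name.
$text = 'servername is node1 and it is development server',
        'servername is node12 and it is production server'

# \d matches exactly one digit, so node12 is cut short to node1.
$single = $text | Select-String -Pattern 'node\d' -AllMatches |
    ForEach-Object { foreach ($m in $_.Matches) { $m.Value } }

# \d+ matches one or more digits, so the full name node12 is returned.
$multi = $text | Select-String -Pattern 'node\d+' -AllMatches |
    ForEach-Object { foreach ($m in $_.Matches) { $m.Value } }

"single digit: $($single -join ', ')"
"one or more:  $($multi -join ', ')"
```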

Inserting Space/Separator after Select-String -Context

I have a script that reads series of log files located in different places and looks for an error code with Select-String. After the error code I print out the next four lines to a file with "-Context". That file's content gets dumped into an email and sent off.
$logsToCheck = "F:\Log1.log",
"F:\log2.log",
"F:\log3.log"
$logsToCheck | % {Select-String -Path $_ -Pattern "SQLCODE:2627" -Context 0, 4} | Out-File $dupChkFile
$emailbody = Get-Content $dupChkFile | ConvertTo-Html
The actual output of the string is poorly formatted and runs together. Is there a way to add blank lines or spaces after the last line when using Select-String -Context?
Originally I was piping the $emailbody to a Out-String but changed it to HTML to try to clear up formatting.
try reading out the match and context separately.
Select-String -Path $_ "SQLCODE:2627" -Context 0,2 | %{
$_.Line
$_.Context.PostContext
"-----Separator-----"
}
the default output of Select-String with -Context is modified to be human-readable. this returns everything as an array of unmodified strings, so you can be sure there will always be newlines, and it will behave better with other cmdlets, including Out-String, or in loops.
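A self-contained sketch of the same idea, using an empty string as the separator so that each match is followed by a blank line. The temporary log file and its contents are made up for this illustration:

```powershell
# Hypothetical log file created only for this sketch.
$log = Join-Path ([System.IO.Path]::GetTempPath()) ('ctx-demo-' + [guid]::NewGuid() + '.log')
Set-Content -Path $log -Value 'ok', 'SQLCODE:2627 duplicate key', 'detail 1', 'detail 2', 'ok'

$report = Select-String -Path $log -Pattern 'SQLCODE:2627' -SimpleMatch -Context 0, 2 |
    ForEach-Object {
        $_.Line                   # the matching line itself
        $_.Context.PostContext    # the lines that follow it (string array)
        ''                        # blank line as separator between matches
    }
$report
Remove-Item -Path $log
```

$report is a plain string array, so it can be written with Out-File or joined with a chosen newline before being placed in the email body.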
I would suggest using concatenation:
% { "$(Select-String -Path $_ -Pattern 'SQLCODE:2627' -Context 0, 4)`r`n" } |
Out-File $dupChkFile -Append

Powershell Select-String parse email from .rtf to .csv

Parse .rtf file, output email addresses in .csv file?
I have an .rtf file containing a bunch of email addresses, I need this parsed so that I can compare a .csv file to active users in Active Directory.
Basically I want what is to the left of "@my.domain.com"
$finds = Select-String -Path "path\to\my.rtf" -Pattern "@my.domain.com" | ForEach-Object {$_.Matches}
$finds | Select-Object -First 1 | ft *
This of course gives me one result so that I don't have a lot of output.
I only manage to get matches or the complete line.
I've tried adding something along the line of
$finds = Select-String -Path "path\to\my.rtf" -Pattern "\w.@my.domain.com"
This gives me the very two last letters in the addresses.
If I keep adding dots to the "wildcard"
-Pattern "\w.....@my.domain.com"
I also get a ton of numbers/characters (.rtf formatting) for addresses that contains fewer characters.
How do I do this?
EDIT: I will update the question as soon as I've found a solution. As of now I'm trying with regular expressions.
Example:
-Pattern "\w*?@my.domain.com"
$mPattern = "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+(\.[a-zA-Z]{2,4})"
$lines = Get-Content "path\to\your.rtf"
foreach ($line in $lines) {
# Output the first email address found on the current line (empty string if there is none)
([regex]::Match($line, $mPattern, "IgnoreCase")).Value
}
This code worked for me. My initial code but with a new search pattern.
$finds = Select-String -Path "path\to\my.rtf" -Pattern "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+(\.[a-zA-Z]{2,4})" | ForEach-Object {$_.Matches}
$finds | Select-Object -First 10 | ft *
Thanks!
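Since the original goal was only the part to the left of "@my.domain.com", a named capture group can return just that portion. A sketch, assuming a made-up sample string in place of the raw .rtf content; the group name user is arbitrary:

```powershell
# Hypothetical sample text standing in for the raw .rtf content.
$text = '\par alice.smith@my.domain.com\par bob@my.domain.com; carol@other.example'

# Capture the part before @my.domain.com in a named group called 'user'.
$pattern = '(?<user>[a-zA-Z0-9._%+-]+)@my\.domain\.com'

$users = [regex]::Matches($text, $pattern, 'IgnoreCase') |
    ForEach-Object { $_.Groups['user'].Value }
$users
```

The dots in the domain are escaped so that they match literal dots only.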