DISTINCT Select-String output on directory/text search - powershell

I am curious how to produce a distinct file list based on this example.
This example produces a list of all .ps1 and .psm1 files that contain the text "folders", but without the text ".invoke" on the same line.
$text='folders'
dir C:\Workspace\mydirectorytosearch1\ -recurse -filter '*.ps*1' | Get-ChildItem | select-string -pattern $text | where {$_ -NotLike '*.invoke(*'}
dir C:\Workspace\mydirectorytosearch2\ -recurse -filter '*.ps*1' | Get-ChildItem | select-string -pattern $text | where {$_ -NotLike '*.invoke(*'}
This is cool and works well but I get duplicate file output (same file but different line numbers).
How can I keep my file output distinct?
The current undesirable output:
C:\Workspace\mydirectorytosearch1\anonymize-psake.ps1:4:. "$($folders.example.test)\anonymize\Example.vars.ps1"
C:\Workspace\mydirectorytosearch1\anonymize-psake.ps1:5:. "$($folders.missles)\extract\build-utilities.ps1"
The desired output:
C:\Workspace\mydirectorytosearch1\anonymize-psake.ps1
Help me tweak my script??

You can eliminate duplicates with Select-Object and its -Unique parameter:
$text='folders'
Get-ChildItem C:\Workspace\mydirectorytosearch1\,C:\Workspace\mydirectorytosearch2\ -Recurse -Filter '*.ps*1' |
Select-String -Pattern $text | Where-Object {$_ -NotLike '*.invoke(*'} |
Select-Object Path -Unique
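If bare path strings are wanted (as in the desired output) rather than objects with a Path property, -ExpandProperty can be combined with -Unique. A minimal sketch, assuming a throwaway temp directory and a hypothetical sample.ps1 file created only for illustration:

```powershell
# Build a throwaway directory with one script that matches on two lines (hypothetical sample data).
$dir = Join-Path ([IO.Path]::GetTempPath()) ([Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'sample.ps1') -Value @(
    '. "$($folders.a)\one.ps1"'
    '. "$($folders.b)\two.ps1"'
    '$folders.c.invoke()'
)

$text = 'folders'
$paths = Get-ChildItem $dir -Recurse -Filter '*.ps*1' |
    Select-String -Pattern $text |
    Where-Object { $_ -NotLike '*.invoke(*' } |
    Select-Object -ExpandProperty Path -Unique   # plain strings, duplicates removed

$paths   # a single path, even though two lines matched

# Clean up the sample data.
Remove-Item -LiteralPath $dir -Recurse -Force
```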

Related

I need my Get-ChildItem string search command print file names along with their Date Modified values, sorted

I spent quite some time searching for the solution of my problem, but found nothing. I have one single folder with mostly .html files, and I frequently need to search to find the files that contain certain strings. I need the search result to be displayed with just the file name (as the file will only be in that one folder) and file's last write time. The list needs to be sorted by the last write time. This code works perfectly for finding the correct files
Get-ChildItem -Filter *.html -Recurse | Select-String -pattern "keyWord string" | group path | select name
The problem with it is that it displays the entire path of the file (which is not needed), it does not show the last write time, and it is not sorted by the last write time.
I also have this code
Get-ChildItem -Attributes !Directory *.html | Sort-Object -Descending -Property LastWriteTime | Select-Object Name, LastWriteTime
That code prints everything exactly as I want to see it, but it prints all the file names from the folder instead of printing only the files that I need to find with a specific string in them.
Since you are only using Select-String to determine if the text exists in any of the files move it inside a Where-Object filter and use the -Quiet parameter so that it returns true or false. Then sort and select the properties you want.
Get-ChildItem -Filter *.html |
Where-Object { $_ | Select-String -Pattern 'keyWord string' -Quiet } |
Sort-Object LastWriteTime |
Select-Object Name, LastWriteTime
For multiple patterns one way you can do it is like this
Get-ChildItem -Filter *.html |
Where-Object {
($_ | Select-String -Pattern 'keyWord string' -Quiet) -and
($_ | Select-String -Pattern 'keyWord string #2' -Quiet)
} |
Sort-Object LastWriteTime |
Select-Object Name, LastWriteTime
And another way using Select-String with multiple patterns which may be a bit faster
$patterns = 'keyword 1', 'keyword 2', 'keyword 3'
Get-ChildItem -Filter *.html |
Where-Object {
($_ | Select-String -Pattern $patterns | Select-Object -Unique Pattern ).Count -eq $patterns.Count
} |
Sort-Object LastWriteTime |
Select-Object Name, LastWriteTime
If you don't care about it being a bit redundant, you can Get-ChildItem the results after your searching:
Get-ChildItem -Filter *.html -Attributes !Directory -Recurse | Select-String -Pattern "keyWord string" | group path | foreach {Get-ChildItem $_.Name } | Sort-Object -Descending LastWriteTime | Select Name,LastWriteTime
After you Select-String you get the attributes of that object instead of the original, so we're taking the results of that object and passing it back into the Get-ChildItem command to retrieve those attributes instead.
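To illustrate the point about attributes, here is a small sketch (the temp directory and page.html are hypothetical sample data) showing that Select-String emits match objects that carry a Path but no LastWriteTime, which is why the path has to be handed back to Get-Item or Get-ChildItem:

```powershell
# Hypothetical sample data: one .html file in a throwaway directory.
$dir = Join-Path ([IO.Path]::GetTempPath()) ([Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'page.html') -Value '<p>keyWord string</p>'

$hit = Get-ChildItem -Path $dir -Filter *.html |
    Select-String -Pattern 'keyWord string'

$hit.GetType().Name                 # the match object type, not FileInfo
$hit.Path                           # full path of the file that matched
$null -eq $hit.LastWriteTime        # the match object has no such property

# Resolve the path back to a file object to reach the file attributes.
$file = Get-Item -LiteralPath $hit.Path
$file | Select-Object Name, LastWriteTime

Remove-Item -LiteralPath $dir -Recurse -Force
```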

How can i search for multiple string patterns in text files within a directory

I have a textbox that takes an input and searches a drive.
Drive for example is C:/users/me
let's say I have multiple files and subdirectories in there and I would like to search if the following strings exist in the file: "ssn" and "DOB"
Once the user inputs the two strings, I split the input by space so I can loop through the array. Here is my current code, but I'm stuck on how to proceed.
gci "C:\Users\me" -Recurse | where { ($_ | Select-String -pattern ('SSN') -SimpleMatch) -or ($_ | Select-String -pattern ('DOB') -SimpleMatch ) } | ft CreationTime, Name -Wrap -GroupBy Directory | Out-String
The code above works if I paste it manually into PowerShell, but I'm confused about how to recreate it in a script.
The code below is not getting all the files needed.
foreach ($x in $StringArrayInputs) {
if($x -eq $lastItem){
$whereClause = ($_ | Select-String -Pattern $x)
}else{
$whereClause = ($_ | Select-String -Pattern $x) + '-or'
}
$files= gci $dv -Recurse | Where { $_ | Select-String -Pattern $x -SimpleMatch} | ft CreationTime, Name -Wrap -GroupBy Directory | Out-String
}
Select-String's -Pattern parameter accepts an array of strings (any one of which triggers a match), so piping directly to a single Select-String call should do:
$files= Get-ChildItem -File -Recurse $dv |
Select-String -List -SimpleMatch -Pattern $StringArrayInputs |
Get-Item |
Format-Table CreationTime, Name -Wrap -GroupBy Directory |
Out-String
Note:
Using -File with Get-ChildItem makes it return only files, not also directories.
Using -List with Select-String is an optimization that ensures that at most one match per file is looked for and reported.
Passing Select-String's output to Get-Item automatically binds the .Path property of the former's output to the -Path parameter of the latter.
Strictly speaking, binding to -Path subjects the argument to interpretation as a wildcard expression, which, however, is generally not a concern - except if the path contains [ characters.
If that is a possibility, insert a pipeline segment with Select-Object @{ Name='LiteralPath'; Expression='Path' } before Get-Item, which ensures binding to -LiteralPath instead.
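The notes above can be sketched end to end as follows. This is only an illustration under assumptions: the temp directory, the file names (including one containing [ ]) and the sample content are hypothetical.

```powershell
# Hypothetical sample data: one matching file whose name contains brackets, one non-matching file.
$dir = Join-Path ([IO.Path]::GetTempPath()) ([Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -LiteralPath (Join-Path $dir 'report[1].txt') -Value 'ssn: 000-00-0000'
Set-Content -LiteralPath (Join-Path $dir 'notes.txt') -Value 'nothing relevant'

$StringArrayInputs = 'ssn', 'DOB'

$found = Get-ChildItem -File -Recurse -LiteralPath $dir |
    Select-String -List -SimpleMatch -Pattern $StringArrayInputs |      # any one pattern matches; at most one hit per file
    Select-Object @{ Name = 'LiteralPath'; Expression = 'Path' } |      # rename Path so it binds to -LiteralPath
    Get-Item

$found | Format-Table CreationTime, Name -Wrap -GroupBy Directory | Out-String

Remove-Item -LiteralPath $dir -Recurse -Force
```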
I just followed your examples and combined both with a regex. I escaped the regex to avoid accidental use of regex expressions (like a dot matching any character).
It works with my test files but may behave differently with yours. You may need to add "-Encoding UTF8" (or whatever encoding is appropriate) so that region-specific characters are matched as well.
$String = Read-Host "Enter multiple strings separated by space to search for"
$escapedRegex = ([Regex]::Escape($String)) -replace "\\ ","|"
Get-ChildItem -Recurse -Attributes !Directory | Where-Object {
$_ | Get-Content | Select-String -Pattern $escapedRegex
} | Format-Table CreationTime, Name -Wrap -GroupBy Directory | Out-String
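To see what the escape-then-replace step does, here is a small standalone sketch with a hypothetical input string (no files involved). [Regex]::Escape turns each space into a backslash followed by a space, and the -replace then converts those into the regex alternation operator:

```powershell
# Hypothetical user input: two search terms separated by a space.
$String = 'ssn D.O.B'

# Escape regex metacharacters, then turn each escaped space into an alternation.
$escapedRegex = [Regex]::Escape($String) -replace '\\ ', '|'
$escapedRegex                                # ssn|D\.O\.B

# The dots are now literal, so only a real "D.O.B" matches.
$literalHit  = 'my D.O.B is unknown' -match $escapedRegex   # True
$wildcardHit = 'my DxOxB is unknown' -match $escapedRegex   # False
$literalHit
$wildcardHit
```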

Efficiently extract matching lines including path and line number?

The following snippet extracts only the matching lines, I also want the path and line number:
Get-ChildItem $thePath\ -Include "*.txt" -Recurse | Get-Content | Select-String -Pattern 'THE_PATTEN' | Set-Content "output.txt"
I tried with this method and still it only extracts the matching lines:
Get-ChildItem $thePath\ -Include "*.txt" -Recurse | Get-Content | Select-String -Pattern 'THE_PATTEN' | Select-Object -ExpandProperty Line | Set-Content "output.txt"
How can I extract the path:filename:line number: matching line?
You don't need get-content. The path is passed over the pipe. (. is for -path, and *.txt is for -filter for speed)
get-childitem -recurse . *.txt | select-string hi
foo2\file3.txt:1:hi
file1.txt:1:hi
file2.txt:1:hi
First, note that Get-ChildItem -Filter is much more efficient than Get-ChildItem -Include (see help get-childitem). Next, Select-String accepts files directly, so there is no need to get the content first. Then just select the properties you need and export them to a file. (Note that $Matches is an automatic variable and variable names are case-insensitive, so you might not want to use that name for your own data.)
$Patterns = Get-ChildItem $thePath -Filter "*.txt" -Recurse| Select-String -Pattern 'THE_PATTEN' | select Path,Filename,LineNumber,Line
# Export to csv (usable in excel)
$Patterns | Export-Csv output.csv -NoTypeInformation # -Delimiter ";" # the delimiter is optional and depends on your region
# Exporting txt
foreach ($Pattern in $Patterns){
('{0} : {1} : {2}' -f ($Pattern.Path),($Pattern.LineNumber),($Pattern.Line)) | Add-Content "output.txt"
}
Yea, you can get line number and file name from the output of Select-String:
ls *.txt | % { Select-String -Path $_ -Pattern "THE_PATTERN" | select-object LineNumber, Line, Path }
You'll notice this approach is also a touch faster.
Good luck!

How to pipe multiple results into a single command

I've got the following pipeline:
dir -recurse *.* | sls -pattern "matching_pattern" | select -unique path
Which gives me an output like this:
Path
----
D:\code\a.txt
D:\code\b.txt
I want it to call the command gvim a.txt b.txt.
How do I do this?
Use Where-Object instead of Select-String for filtering the files, expand the FullName (or Name) property, so you get an array of paths or filenames, and splat it when calling gvim:
$files = Get-ChildItem -Recurse *.* |
Where-Object { (Get-Content $_.FullName) -match "matching_pattern" } |
Select-Object -Unique -Expand FullName
& gvim @files
Replace FullName with Name in the Select-Object statement to get just the filenames without path.
If you want to stick with Select-String the approach would be similar:
$files = Get-ChildItem -Recurse *.* |
Select-String -Pattern "matching_pattern" |
Select-Object -Unique -Expand Path
& gvim @files
Replace Path with Filename in the Select-Object statement to get just the filenames without path.
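For readers unfamiliar with splatting, the following sketch shows the difference between passing an array as one argument and splatting it with the @ sigil. Show-ArgCount is a hypothetical helper standing in for gvim, and the file names are made up:

```powershell
# Hypothetical stand-in for an external command: reports how many arguments it received.
function Show-ArgCount { $args.Count }

$files = 'a.txt', 'b.txt'

$splatted = Show-ArgCount @files   # each array element becomes its own argument
$asArray  = Show-ArgCount $files   # the whole array arrives as a single argument

"splatted: $splatted"   # splatted: 2
"as array: $asArray"    # as array: 1
```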
You could access the two results by index:
$result = dir -recurse *.* | sls -pattern "matching_pattern" | select -unique path
gvim $result[0].Path $result[1].Path

PowerShell - Search pattern from particular lines

I wrote below command to search all files within Workflow folders and look for only those files that matched pattern 'TextBox.TextBox'. It worked fine.
Now I want to change the command so it only search pattern from line 1 till line 50, instead to search whole file. How can I do that ?
Get-ChildItem E:\Test\Workflow -Recurse | Select-String -pattern "TextBox.TextBox" -SimpleMatch | group path | select name | measure
You could use Get-Content and Select-Object -First:
Get-ChildItem E:\Test\Workflow -Recurse |ForEach-Object {
Get-Content -Path $_.FullName |Select-Object -First 50
} |Select-String -Pattern "TextBox.TextBox" -SimpleMatch
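One caveat with piping Get-Content output into Select-String is that the matches then come from plain strings, so, as far as I can tell, they no longer identify the source file. A variation that keeps the file objects is to do the check inside Where-Object and read only the first 50 lines with -TotalCount. This is a sketch under assumptions: the temp directory and the two sample files are hypothetical.

```powershell
# Hypothetical sample data: one file matching on line 1, one matching only on line 61.
$dir = Join-Path ([IO.Path]::GetTempPath()) ([Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'early.txt') -Value 'TextBox.TextBox'
Set-Content -Path (Join-Path $dir 'late.txt')  -Value (@('filler') * 60 + 'TextBox.TextBox')

$hits = Get-ChildItem -LiteralPath $dir -File -Recurse |
    Where-Object {
        # Read at most the first 50 lines of each file and test them for the literal text.
        Get-Content -LiteralPath $_.FullName -TotalCount 50 |
            Select-String -Pattern 'TextBox.TextBox' -SimpleMatch -Quiet
    }

$hits.Name                      # only the file that matches within its first 50 lines
($hits | Measure-Object).Count  # number of such files

Remove-Item -LiteralPath $dir -Recurse -Force
```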
You can use the Where-Object cmdlet to keep only the matches whose LineNumber is less than or equal to 50:
Get-ChildItem E:\Test\Workflow -Recurse | Select-String -pattern "TextBox.TextBox" -SimpleMatch | Where-Object LineNumber -le 50 | group path