Powershell - Find files that match a pattern for specific number of times - powershell

To find a simple pattern in a set of files in PowerShell, I use:
$pattern= 'mypattern'
$r= Get-ChildItem -Path "C:\.." -recurse |
Select-String -pattern $pattern | group path | select name
$r | Out-GridView
In my scenario, some files contain the pattern more than once and others contain it only once. I am interested only in the files that contain the pattern more than once, not in the rest. Thanks

One approach builds on the Select-String and Group-Object combination you already have.
Select-String -Path (Get-ChildItem C:\temp\ -Filter *.txt -Recurse) -Pattern "140" -AllMatches |
Group-Object Path |
Where-Object{$_.Count -gt 1} |
Select Name, Count |
Out-GridView
This will take all the .txt files under the temp directory and group the results by file path. -AllMatches makes Select-String record every match on a line in the Matches property; by default it records only the first match it finds on a line. Note that Select-String still emits one result per matching line, so Count here is the number of matching lines in each file.
Of those groups, we keep the ones whose count is greater than one using Where-Object. Then we output the file names and their counts with Select Name, Count, where Name is the full path of the file in which the matched text is located.
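If what matters is the number of individual matches per file rather than the number of matching lines, one option is to sum the Matches collection of each result. A minimal sketch, assuming a throwaway temporary folder with made-up sample files (the folder, file names, contents and the MatchCount/Lines property names are illustrative only):

```powershell
# Illustrative sample data in a temporary folder (assumed, not from the question).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("matches_" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'a.txt') -Value 'x 140 y 140', 'no hit'
Set-Content -Path (Join-Path $dir 'b.txt') -Value 'only 140 once'

$result = Get-ChildItem -Path $dir -Filter *.txt -Recurse |
    Select-String -Pattern '140' -AllMatches |
    Group-Object Path |
    ForEach-Object {
        [pscustomobject]@{
            Name       = $_.Name
            # Number of lines that contain at least one match
            Lines      = $_.Count
            # Total number of matches across those lines
            MatchCount = ($_.Group | ForEach-Object { $_.Matches.Count } |
                          Measure-Object -Sum).Sum
        }
    } |
    Where-Object { $_.MatchCount -gt 1 }

$result

# Clean up the sample data
Remove-Item -Path $dir -Recurse -Force
```

With this sample data only a.txt is kept, because its single matching line contains the pattern twice.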
About Out-GridView
I see that you are assigning the output from Out-GridView to $r. If you want to do that, be sure to add the -PassThru parameter.

Related

Use PowerShell -Pattern to search for multiple values on multiple files

Team --
I have a snippet of code that works as designed. It will scan all the files within a folder hierarchy for a particular word and then return the de-duped file paths of the files where instances of the word were found.
$properties = @(
'Path'
)
Get-ChildItem -Path \\Server_Name\Folder_Name\* -recurse |
Select-String -Pattern 'Hello' |
Select-Object $properties -Unique |
Export-Csv \\Server_Name\Folder_Name\File_Name.csv -NoTypeInformation
I'd like to:
1) Expand this code to be able to search for multiple words at once. So all cases where 'Hello' OR 'Hola' are found... and potentially an entire list of words if possible.
2) Have the code return not only the file path but the word that tripped it ... with multiple lines for the same path if both words tripped it.
I've found some articles talking about doing multiple-word searches using methods like:
where { $_ | Select-String -Pattern 'Hello' } |
where { $_ | Select-String -Pattern 'Hola' } |
OR
Select-String -Pattern '(Hello.Hola)|(Hola.Hello)'
This code runs ... but no data is returned in the output file ... it is just blank apart from the header 'Path'.
I'm missing something obvious ... anyone spare a fresh set of eyes?
MS
Select-String's -Pattern parameter accepts an array of patterns.
Each [Microsoft.PowerShell.Commands.MatchInfo] instance output by Select-String has a .Pattern property that indicates the specific pattern that matched.
Get-ChildItem -Path \\Server_Name\Folder_Name\* -recurse |
Select-String -Pattern 'Hello', 'Hola' |
Select-Object Path, Pattern |
Export-Csv \\Server_Name\Folder_Name\File_Name.csv -NoTypeInformation
Note:
If a given matching line matches multiple patterns, only the first matching pattern is reported for it, in input order.
While adding -AllMatches normally finds all matches on a line, as of PowerShell 7.2.x this doesn't work as expected with multiple patterns passed to -Pattern, with respect to the matches reported in the .Matches property - see GitHub issue #7765.
Similarly, the .Pattern property doesn't then reflect the (potentially) multiple patterns that matched; and, in fact, this isn't even possible at the moment, given that the .Pattern property is a [string]-typed member of [Microsoft.PowerShell.Commands.MatchInfo] and therefore cannot reflect multiple patterns.
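One possible way around the one-pattern-per-line limitation is to run Select-String once per pattern and combine the results, at the cost of reading each file several times. A minimal sketch, assuming a temporary folder with made-up sample files (folder, file names and contents are illustrative only):

```powershell
# Illustrative sample data in a temporary folder (assumed, not from the question).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("patterns_" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'both.txt') -Value 'Hello and Hola on one line'
Set-Content -Path (Join-Path $dir 'one.txt')  -Value 'Hola only'

$patterns = 'Hello', 'Hola'
$files = Get-ChildItem -Path $dir -Recurse -File

# One Select-String pass per pattern, so a line that matches several
# patterns is reported once for each of them.
$rows = foreach ($p in $patterns) {
    $files |
        Select-String -Pattern $p |
        Select-Object Path, Pattern -Unique
}

$rows

# Clean up the sample data
Remove-Item -Path $dir -Recurse -Force
```

Here both.txt is listed twice, once per pattern, and one.txt once. The $rows collection could then be piped to Export-Csv as in the original command.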

How can I use PowerShell or a cmd "dir" to get the contents of multiple, but similar paths?

For example, I want the contents of the "Last" folder in the structure below. The various path structures are identical except for the first two levels.
C:\zyx-wvu\abc\Level3\Last
C:\tsr-qpo\def\Level3\Last
C:\nml-kji\ghi\Level3\Last
In PowerShell I get close with:
Get-ChildItem -Path C:\*-*\*
...but it doesn't return any results (as in it never finishes) when I try:
Get-ChildItem -Path C:\*-*\*\Level3
Get-ChildItem -Path C:\*-*\*
will only show you the items one level below any folder in C:\ whose name contains a hyphen,
i.e. it will show
c:\1-2\alpha
c:\1-5\beta
etc...
What you want is
Get-ChildItem -Path C:\*-*\*\*
or more likely you want
Get-ChildItem -Path C:\*-*\* -recurse
If you want to find items that share the SAME name, you could group them together and pull out any group with more than one member. The question isn't very specific about what you want, but here is one idea:
get-childitem -Path c:\*-*\*\* | group-object -property basename | where count -gt 1 | select -expand group
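To list what is inside each Last folder, the wildcard path can be extended with the fixed segments followed by a trailing *. A sketch under the assumption of a small temporary tree that mimics the layout in the question (the root folder, the nohyphen branch and the file names are made up for illustration):

```powershell
# Build a small sample tree in a temporary folder (assumed layout).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("tree_" + [guid]::NewGuid())
$sep  = [System.IO.Path]::DirectorySeparatorChar
$names = @{ 'zyx-wvu' = 'abc'; 'tsr-qpo' = 'def'; 'nohyphen' = 'ghi' }
foreach ($top in $names.Keys) {
    $last = ($root, $top, $names[$top], 'Level3', 'Last') -join $sep
    New-Item -ItemType Directory -Path $last -Force | Out-Null
    Set-Content -Path (Join-Path $last "$top.txt") -Value 'demo'
}

# Wildcards for the first two levels, fixed names below them,
# and a trailing * to return the items inside each Last folder.
$pattern = ($root, '*-*', '*', 'Level3', 'Last', '*') -join $sep
$items = Get-ChildItem -Path $pattern
$items | Select-Object FullName

# Clean up the sample tree
Remove-Item -Path $root -Recurse -Force
```

Only the files under the two hyphenated top-level folders are returned; the nohyphen branch is skipped because it does not match *-*.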

Count Files by Name

I am looking for a way to count files from many sub-folders but the tricky part is that I want to filter them by part of their names. To be more specific, all files have a date in the middle of their names. If I want to just count the files within a specific folder I use this:
dir * |%{$_.Name.SubString(7,8)} | group |select name,count|ft -auto
And it works like a charm. The problem is that it only looks at a single folder. A second problem is that I want the result to show the path of the folder for each grouped count. I am also testing this:
dir -recurse | ?{ $_.PSIsContainer } | %{ Write-Host $_.FullName (dir $_.FullName | Measure-Object).Count }
but I cannot work the date filter on the file name into this command. I am also attaching an example of the format and of how I would like the results to look.
Any help?
I am looking for a way to count files from many sub-folders but the tricky part is that I want to filter them by part of their names. To be more specific, all files have a date at the middle of their names.
It is not 100% clear to me whether you really want to filter them or to group them before counting, so I'll show both.
Assuming that the middle part of their names is delimited by, e.g., an underscore (_), this can be achieved the following way:
# Example path: C:/temp/testFolder/myName_123_folder/text.txt
Get-ChildItem * -Recurse |
    Select-Object -Property Name, @{Name = "CustomDate"; Expression = {$_.Name.Split("_")[1]}} |
    # This is how you would _filter_:
    # Where-Object {$_.CustomDate -eq "123"} |
    Group-Object -Property CustomDate |
    Select-Object Name, Count
Don't forget to check that the file name matches this pattern before splitting. This can be done with a Where-Object statement between Get-ChildItem and the first Select-Object, which tests the file name against your specific pattern.
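A sketch of that check, which also keeps the folder path so that the counts are reported per folder and date. It assumes file names of the form prefix_YYYYMMDD_rest, which is a guess at the naming scheme; the sample files and the regular expression are illustrative only:

```powershell
# Illustrative sample files in a temporary folder (assumed naming scheme).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("dates_" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $root | Out-Null
'aaa_20240101_x.txt', 'bbb_20240101_y.txt', 'ccc_20240102_z.txt', 'nodate.txt' |
    ForEach-Object { Set-Content -Path (Join-Path $root $_) -Value 'demo' }

$counts = Get-ChildItem -Path $root -Recurse -File |
    # Keep only names whose second underscore-delimited token is 8 digits
    Where-Object { $_.Name -match '^[^_]+_\d{8}_' } |
    Select-Object DirectoryName,
        @{ Name = 'CustomDate'; Expression = { $_.Name.Split('_')[1] } } |
    # Group by folder and date so each count carries its path
    Group-Object DirectoryName, CustomDate |
    Select-Object Name, Count

$counts | Format-Table -AutoSize

# Clean up the sample data
Remove-Item -Path $root -Recurse -Force
```

The file without a date is dropped by Where-Object, so Split never runs on a name that lacks the expected token.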
Your question shows also that you wanted to filter for only directories:
dir -recurse | ?{ $_.PSIsContainer } | %{ #[...]
Which is not very efficient.
From the Docs of Get-ChildItem:
-Directory
Gets directories (folders).
To get only directories, use the -Directory parameter and omit the -File parameter. To exclude directories, use the -File parameter and omit the -Directory parameter, or use the -Attributes parameter.
This means, the preferred way to search only for Directories is:
Get-ChildItem -Recurse -Directory | % { #[...]
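As a rough illustration of how the elided loop body could report a file count per folder, here is a minimal sketch. The temporary folder, the sub1/sub2 names and the Path/Count property names are assumptions made for the example:

```powershell
# Illustrative sample tree in a temporary folder (assumed layout).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("dirs_" + [guid]::NewGuid())
$sub1 = Join-Path $root 'sub1'
$sub2 = Join-Path $root 'sub2'
New-Item -ItemType Directory -Path $sub1, $sub2 -Force | Out-Null
Set-Content -Path (Join-Path $sub1 'a.txt') -Value 'demo'
Set-Content -Path (Join-Path $sub1 'b.txt') -Value 'demo'
Set-Content -Path (Join-Path $sub2 'c.txt') -Value 'demo'

# -Directory returns only folders, so no PSIsContainer filter is needed.
$perDir = Get-ChildItem -Path $root -Recurse -Directory | ForEach-Object {
    [pscustomobject]@{
        Path  = $_.FullName
        # @() guarantees a Count even when a folder holds zero or one file
        Count = @(Get-ChildItem -LiteralPath $_.FullName -File).Count
    }
}

$perDir | Format-Table -AutoSize

# Clean up the sample tree
Remove-Item -Path $root -Recurse -Force
```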

Powershell - Match ID's in a text file against filenames in multiple folders

I need to search through 350,000 files to find any that contains certain patterns in the filename. However, the list of patterns (id numbers) that it needs to match is 1000! So I would very much like to be able to script this, because they were originally planning on doing it manually...
So to make it clearer:
Check each File in folder and all subfolders.
If the filename contains any of the IDs in the text file then move it to another folder
Otherwise, ignore it.
So I have the basic code that works with a single value:
$name = Get-Content 'C:\test\list.txt'
get-childitem -Recurse -path "c:\test\source\" -filter "*$name*" |
move-item -Destination "C:\test\Destination"
If I change $name to point to a single ID, it works, if I have a single ID in the txt file, it works. Multiple items in a list:
1111111
2222222
3333333
It fails. What am I doing wrong? How can I get it to work? I'm still new to powershell so please be a little more descriptive in any answers.
Your test fails because it is effectively trying to do this (using your test data).
Get-ChildItem -Recurse -Path "c:\test\source\" -filter "*1111111 2222222 3333333*"
Which obviously does not work. It is squishing the array into one single space delimited string. You have to account for the multiple id logic in a different way.
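The effect is easy to reproduce without touching the file system. In this small sketch the array stands in for what Get-Content returns for the three-line list, and it assumes $OFS has not been changed from its default (a single space):

```powershell
# Stand-in for: $name = Get-Content 'C:\test\list.txt'
$name = '1111111', '2222222', '3333333'

# Inside a double-quoted string, an array is expanded by joining its
# elements with $OFS, which defaults to a single space.
$filter = "*$name*"
$filter   # → *1111111 2222222 3333333*
```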
I am not sure which of these will perform better so make sure you test both of these with your own data to get a better idea of execution time.
Cycle each "filter"
$filters = Get-Content 'C:\test\list.txt'
# Get the files once
$files = Get-ChildItem -Recurse -Path "c:\test\source" -File
# Cycle Each ID filter manually
$filters | ForEach-Object{
    # Capture the current ID, because $_ means something else inside Where-Object
    $singleFilter = $_
    $files | Where-Object{$_.Name -like "*$singleFilter*"}
} | Move-Item -Destination "C:\test\Destination"
Make one larger filter
$filters = Get-Content 'C:\test\list.txt'
# Build a large regex alternative match pattern. Escape each ID in case there are regex metacharacters.
$regex = ($filters | ForEach-Object{[regex]::Escape($_)}) -join "|"
# Get the files once
Get-ChildItem -Recurse -path "c:\test\source" -File |
Where-Object{$_.Name -match $regex} |
Move-Item -Destination "C:\test\Destination"
Try following this tutorial on how to use the Get-Content cmdlet. When the file has multiple lines, you get an array back. You then have to iterate through that array and apply the logic you used for a single item.

Combine all content from several files, find matching strings and get a count of each line

I know how to get the data and search through it using some pattern. But that is not what I need.
Get-ChildItem -recurse -Filter *.xml | Get-Content | Select-String -pattern "something here"
I am searching through 100's of GPO xml files and we are trying to remove GPO's that perform the same thing over and over again. I want to find the unique values and combine them in one big happy gpo and get rid of all the redundant ones.
My goal :
1) Get all information from all *.xml files from 100's of sub folders and combine them into one file.
2) Find all lines that contain the same string and get a count of that string. I need a count for all strings in the combined file.
3) My goal is to find the lines that are unique and save them to a file, for further use.
Here's a quick-and-dirty approach using a Hashtable. Since the Hashtable setter performs an "update or create", you'll end up with a distinct list:
$ht = @{}
Get-ChildItem -recurse -Filter *.xml | Get-Content | %{$ht[$_] = $true}
$ht.Keys
Edit: Just saw you wanted counts as well. You can do this:
$ht = @{}
Get-ChildItem -recurse -Filter *.xml | Get-Content | %{$ht[$_] = $ht[$_]+1}
$ht
To export to CSV:
$ht.GetEnumerator() | select key, value | Export-Csv D:\output.csv -NoTypeInformation
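For the third goal, the lines that occur exactly once, an alternative to the hashtable is Group-Object, which produces the counts directly. A minimal sketch in which the $lines array is a made-up stand-in for the combined output of Get-ChildItem | Get-Content:

```powershell
# Stand-in for: Get-ChildItem -Recurse -Filter *.xml | Get-Content
$lines = 'alpha', 'beta', 'alpha', 'gamma'

# -NoElement skips storing the grouped items, since only counts are needed
$counts = $lines | Group-Object -NoElement | Select-Object Name, Count

# Keep only the lines that occur exactly once
$unique = $counts |
    Where-Object { $_.Count -eq 1 } |
    Select-Object -ExpandProperty Name

$unique
```

The $unique list could then be written out with Set-Content. Note that Group-Object, like a default hashtable, compares strings case-insensitively unless told otherwise.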