How to get files which have multiple required strings - powershell

I am trying to find the files which have given strings. I am using the below line
Get-ChildItem -recurse | Select-String -pattern "Magnet","Stew" | group path | select name
But it returns the files that contain any one of the words "Magnet", "Stew", whereas I want only the files that contain both words. Logically speaking, the above command treats the patterns as an "Or" condition; I want an "And" condition. Can anybody guide me on how to do this?

Try this for your regex:
'.*(?:stew.*magnet)|(?:magnet.*stew).*'
Edit:
If you're looking for those matches anywhere in the file:
Get-ChildItem -Recurse |
where {(
($_ | Select-String -Pattern 'Stew' -SimpleMatch -Quiet) -and
($_ | Select-String -Pattern 'Magnet' -SimpleMatch -Quiet)
)}
The solution posted at the link CB provided will work, but doesn't seem very efficient for what you need to do. -SimpleMatch will be faster than using a regex pattern, and the -Quiet switch makes Select-String return just True or False, depending on whether it found a match in the file. If one of the terms is much less likely to appear in a file, test for that term first so the check fails sooner and moves on to the next file.
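The same test extends to any number of required strings. The following is a minimal sketch, not part of the original answer: the helper name Test-ContainsAll and the $terms list are hypothetical, and it assumes PowerShell 3.0 or later for the -File switch.

```powershell
# Hypothetical helper: returns $true only if the file contains every term.
function Test-ContainsAll {
    param(
        [Parameter(Mandatory)] [string]   $Path,
        [Parameter(Mandatory)] [string[]] $Terms
    )
    foreach ($term in $Terms) {
        # -Quiet stops at the first hit; a miss ends the check for this file early.
        if (-not (Select-String -LiteralPath $Path -Pattern $term -SimpleMatch -Quiet)) {
            return $false
        }
    }
    return $true
}

# Example: list files under the current directory that contain both words.
$terms = 'Stew', 'Magnet'
Get-ChildItem -Recurse -File |
    Where-Object { Test-ContainsAll -Path $_.FullName -Terms $terms } |
    Select-Object -ExpandProperty FullName
```

Putting the rarest term first in $terms keeps the early exit effective.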

Related

Use PowerShell -Pattern to search for multiple values on multiple files

Team --
I have a snippet of code that works as designed. It will scan all the files within a folder hierarchy for a particular word and then return the de-duped file paths of the files where instances of the word were found.
$properties = @(
'Path'
)
Get-ChildItem -Path \\Server_Name\Folder_Name\* -recurse |
Select-String -Pattern 'Hello' |
Select-Object $properties -Unique |
Export-Csv \\Server_Name\Folder_Name\File_Name.csv -NoTypeInformation
I'd like to
Expand this code to be able to search for multiple words at once. So all cases where 'Hello' OR 'Hola' are found... and potentially an entire list of words if possible.
Have the code return not only the file path but the word that tripped it ... with multiple lines for the same path if both words tripped it
I've found some articles talking about doing multiple word searches using methods like:
where { $_ | Select-String -Pattern 'Hello' } |
where { $_ | Select-String -Pattern 'Hola' } |
OR
Select-String -Pattern '(Hello.Hola)|(Hola.Hello)'
This code runs ... but no data is returned in the output file ... it is just blank with the header 'Path'.
I'm missing something obvious ... anyone spare a fresh set of eyes?
MS
Select-String's -Pattern parameter accepts an array of patterns.
Each [Microsoft.PowerShell.Commands.MatchInfo] instance output by Select-String has a .Pattern property that indicates the specific pattern that matched.
Get-ChildItem -Path \\Server_Name\Folder_Name\* -recurse |
Select-String -Pattern 'Hello', 'Hola' |
Select-Object Path, Pattern |
Export-Csv \\Server_Name\Folder_Name\File_Name.csv -NoTypeInformation
Note:
If a given matching line matches multiple patterns, only the first matching pattern is reported for it, in input order.
While adding -AllMatches normally finds all matches on a line, as of PowerShell 7.2.x this doesn't work as expected with multiple patterns passed to -Pattern, with respect to the matches reported in the .Matches property - see GitHub issue #7765.
Similarly, the .Pattern property doesn't then reflect the (potentially) multiple patterns that matched; and, in fact, this isn't even possible at the moment, given that the .Pattern property is a [string]-typed member of [Microsoft.PowerShell.Commands.MatchInfo] and therefore cannot reflect multiple patterns.
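If every pattern that occurs in a file should get its own row, one possible workaround is to search for each pattern separately. This is a sketch under that assumption, not the answer's own method; the helper name Get-PatternHit is hypothetical.

```powershell
# Hypothetical helper: emits one Path/Pattern row per (file, pattern) pair that matches.
function Get-PatternHit {
    param(
        [Parameter(Mandatory)] [string[]] $Path,
        [Parameter(Mandatory)] [string[]] $Pattern
    )
    foreach ($p in $Pattern) {
        # -List stops at the first match in each file, so a file appears once per pattern.
        Select-String -LiteralPath $Path -Pattern $p -List |
            Select-Object Path, Pattern
    }
}
```

Calling it with the full paths returned by Get-ChildItem and -Pattern 'Hello', 'Hola', then piping the result to Export-Csv, would give one line per file and word. The trade-off is that every file is read once per pattern.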

Powershell - Match ID's in a text file against filenames in multiple folders

I need to search through 350,000 files to find any that contains certain patterns in the filename. However, the list of patterns (id numbers) that it needs to match is 1000! So I would very much like to be able to script this, because they were originally planning on doing it manually...
So to make it clearer:
Check each File in folder and all subfolders.
If the filename contains any of the IDs in the text file then move it to another folder
Otherwise, ignore it.
So I have the basic code that works with a single value:
$name = Get-Content 'C:\test\list.txt'
get-childitem -Recurse -path "c:\test\source\" -filter "*$name*" |
move-item -Destination "C:\test\Destination"
If I set $name to a single ID, it works; if the text file contains a single ID, it works. With multiple items in the list:
1111111
2222222
3333333
It fails. What am I doing wrong? How can I get it to work? I'm still new to powershell so please be a little more descriptive in any answers.
Your test fails because it is effectively trying to do this (using your test data).
Get-ChildItem -Recurse -Path "c:\test\source\" -filter "*1111111 2222222 3333333*"
Which obviously does not work. It is squishing the array into one single space delimited string. You have to account for the multiple id logic in a different way.
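The joining behaviour is easy to reproduce on its own. A small sketch, with the array literal standing in for what Get-Content returns for the three-line file:

```powershell
# Stand-in for Get-Content on a three-line file: an array of strings.
$name = '1111111', '2222222', '3333333'

# Inside a double-quoted string, array elements are joined with $OFS (a space by default).
$filter = "*$name*"
$filter
```

With the default $OFS, $filter holds the single string *1111111 2222222 3333333*, which no file name matches.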
I am not sure which of these will perform better so make sure you test both of these with your own data to get a better idea of execution time.
Cycle each "filter"
$filters = Get-Content 'C:\test\list.txt'
# Get the files once
$files = Get-ChildItem -Recurse -Path "c:\test\source" -File
# Cycle Each ID filter manually
$filters | ForEach-Object{
# Capture the current ID; inside Where-Object, $_ refers to the file instead
$singleFilter = $_
$files | Where-Object{$_.Name -like "*$singleFilter*"}
} | Move-Item -Destination "C:\test\Destination"
Make one larger filter
$filters = Get-Content 'C:\test\list.txt'
# Build a large regex alternative match pattern. Escape each ID in case there are regex metacharacters.
$regex = ($filters | ForEach-Object{[regex]::Escape($_)}) -join "|"
# Get the files once
Get-ChildItem -Recurse -path "c:\test\source" -File |
Where-Object{$_.Name -match $regex} |
Move-Item -Destination "C:\test\Destination"
Try following this tutorial on how to use the Get-Content cmdlet. When the file has multiple lines, you get an array back. You then have to iterate through the array and apply the logic you used for a single item to each element.

Powershell - Find files that match a pattern for specific number of times

To find a simple pattern in a set of files in powershell I go
$pattern= 'mypattern'
$r= Get-ChildItem -Path "C:\.." -recurse |
Select-String -pattern $pattern | group path | select name
$r | Out-GridView
In my scenario, some files contain the pattern more than once and others contain it only once. I am interested only in the files that contain the pattern more than once. Thanks
One approach for the start of what you are looking for is Select-String and Group-Object like you already have.
Select-String -Path (Get-ChildItem C:\temp\ -Filter *.txt -Recurse) -Pattern "140" -AllMatches |
Group-Object Path |
Where-Object{$_.Count -gt 1} |
Select Name, Count |
Out-GridView
This will search all the txt files under the temp directory and group the results by file path. -AllMatches tells Select-String to record every match on a line in the .Matches property; by default it records only the first. Note that it still emits one result object per matching line, so the group count is the number of matching lines in each file.
Of those groups we take the ones where the count is higher than one using Where-Object. Then we output just the file names and their counts with Select Name, Count, where Name is the full path of the file in which the matched text was found.
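To count individual matches rather than matching lines, the .Matches collections can be summed per file. A sketch only: the helper name Get-MatchCountPerFile is hypothetical, and [pscustomobject] assumes PowerShell 3.0 or later.

```powershell
# Hypothetical helper: total number of pattern matches per file (not matching lines).
function Get-MatchCountPerFile {
    param(
        [Parameter(Mandatory)] [string[]] $Path,
        [Parameter(Mandatory)] [string]   $Pattern
    )
    Select-String -LiteralPath $Path -Pattern $Pattern -AllMatches |
        Group-Object Path |
        ForEach-Object {
            [pscustomobject]@{
                Name  = $_.Name
                # Each MatchInfo in the group is one line; sum the matches on every line.
                Count = ($_.Group | ForEach-Object { $_.Matches.Count } |
                         Measure-Object -Sum).Sum
            }
        }
}
```

Its output can be filtered with Where-Object { $_.Count -gt 1 } in the same way as the grouped results.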
About Out-GridView
I see that you are assigning the output from Out-GridView to $r. If you want to do that, be sure to add the -PassThru parameter.

Count specific string in text file using PowerShell

Is it possible to count specific strings in a file and save the value in a variable?
For me it would be the String "/export" (without quotes).
Here's one method:
$FileContent = Get-Content "YourFile.txt"
# Avoid the name $Matches: it is an automatic variable that -match overwrites
$Result = Select-String -InputObject $FileContent -Pattern "/export" -AllMatches
$Result.Matches.Count
Here's a way to do it.
$count = (get-content file1.txt | select-string -pattern "/export").length
As mentioned in comments, this will return the count of lines containing the pattern, so if any line has more than one instance of the pattern, the count won't be correct.
If you're searching in a large file (several gigabytes) that could have millions of matches, you might run into memory problems. You can do something like this (inspired by a suggestion from NealWalters):
Select-String -Path YourFile.txt -Pattern '/export' -SimpleMatch | Measure-Object -Line
This is not perfect because
it counts the number of lines that contain the match, not the total number of matches.
it prints some headings along with the count, rather than putting just the count into a variable.
You can probably solve these if you need to. But at least you won't run out of memory.
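One way to address both points while still streaming the file is sketched below. The helper name Get-OccurrenceCount is hypothetical. It swaps -SimpleMatch for an escaped regex, because -AllMatches is ignored when -SimpleMatch is used.

```powershell
# Hypothetical helper: counts every occurrence of a literal string in a file.
function Get-OccurrenceCount {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $Text
    )
    $total = 0
    # Escape the text so it is matched literally, then add up the matches line by line.
    Select-String -LiteralPath $Path -Pattern ([regex]::Escape($Text)) -AllMatches |
        ForEach-Object { $total += $_.Matches.Count }
    $total
}
```

Assigning the result, for example $count = Get-OccurrenceCount -Path 'YourFile.txt' -Text '/export', puts just the number into a variable.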
grep -co vs grep -c
Both are useful and thanks for the "o" version. New one to me.

Why won't it rename the file? Powershell

Can someone tell me why this script won't work?
Get-ChildItem "\\fhnsrv01\home\aborgetti\Documentation\Stage" -Filter *.EDIPROD | `
Foreach-Object{
$content = Get-Content $_.FullName
#filter and save content to a new file
$content | Where-Object {$_ -match 'T042456'} | Rename-Item `
($_.BaseName+'_834.txt')
I found this syntax from another question on here and changed the environment variables.
For some reason it won't change the name of the file. The filename is
'AIDOCCAI.D051414.T042456.MO.EDIPROD'
Help much appreciated.
UPDATE
Thanks to TheMadTechnician I was able to get some working stuff. Great stuff actually. Figure I should share with the world!
#Call Bluezone to do file transfer
#start-process "\\fhnsrv01\home\aborgetti\Documentation\Projects\Automation\OpenBZ.bat"
#Variable Declarations
$a = Get-Date
$b = $a.ToString('MMddyy')
$source = "\\fhnsrv01\home\aborgetti\Documentation\Stage\"
$dest = "\\fhnsrv01\home\aborgetti\Documentation\Stage\orig"
#Find all the files that have EDIPROD extension and proceed to process them
#First copy the original file to the orig folder before any manipulation takes place
Copy-item $source\*.EDIPROD $dest
# Now we must rename the items that are in the table
Switch(GCI \\fhnsrv01\home\aborgetti\Documentation\Stage\*.EDIPROD){
{(GC $_|Select -first 1) -match "834*"}{$_ | Rename-Item -NewName {$_.BaseName+'_834.txt'}}
{(GC $_|Select -first 1) -match "820*"}{$_ | Rename-Item -NewName {$_.BaseName+'_820.txt'}}
}
Get-ChildItem's -Filter has issues, I really hesitate to use it in general. If it were up to me I'd do something like this:
Get-ChildItem "\\fhnsrv01\home\aborgetti\Documentation\Stage" |
?{$_.Extension -match ".EDIPROD" -and $_.name -match "T042456"}|
%{$_.MoveTo($_.FullName+"_834.txt")}
Well, I would put it all on one line, but you can line break after the pipe and it does make it a little easier to read, so there you have it. I'm rambling, sorry.
Edit: Wow, I didn't even address what was wrong with your script. Sorry, kind of distracted at the end of my work day here. So, why doesn't your script work? Here's why:
You pull a file and folder listing for the chosen path. That's great, it should work, more or less, I have almost no faith in the -Filter capabilities of the file system provider, but anyway, moving on!
You take that list and run it through a ForEach loop processing each file that matches your filter as such:
You read the contents of the file, and store them in the variable $content
You run the contents of the file, line by line, through a Where filter looking for the text "T042456"
For each line that matches that text you attempt to rename something to that line's basename plus _834.txt (the line of text is a string, it doesn't have a basename property, and it's not an object that can be renamed, so this is going to fail)
So, that's where the issue is. You're pulling the contents of the file and parsing them line by line, trying to match the text instead of matching against the file name. If you remove everything after the first pipe up to the Where statement, put -NewName before your desired name in Rename-Item, and change the ( ) around the new name to { }, your code will work. Your code, modified as described, would look like:
Get-ChildItem "\\fhnsrv01\home\aborgetti\Documentation\Stage" -Filter *.EDIPROD |
Where-Object {$_ -match 'T042456'} | Rename-Item -NewName {$_.BaseName+'_834.txt'}
Though I have a feeling you want $_.Name and not $_.BaseName. Using $_.BaseName will leave you with (to use your example file name):
'AIDOCCAI.D051414.T042456.MO_834.txt'
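The difference between the two properties can be checked without touching any real file. A small sketch using the example file name; the variable names are only for illustration:

```powershell
# Casting a string creates a FileInfo object without requiring the file to exist.
$file = [System.IO.FileInfo]'AIDOCCAI.D051414.T042456.MO.EDIPROD'

$fromBaseName = $file.BaseName + '_834.txt'   # last extension (.EDIPROD) dropped first
$fromName     = $file.Name     + '_834.txt'   # full name kept, suffix appended
```

Here $fromBaseName is AIDOCCAI.D051414.T042456.MO_834.txt, while $fromName is AIDOCCAI.D051414.T042456.MO.EDIPROD_834.txt.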
Edit2: Really that's a whole different question, how to match multiple criteria, but the question is here, I'm here, why not just get it done?
So, you have multiple criteria for matching the file names. That really doesn't affect your loop to be honest, what it does affect is the Where statement. If you have multiple options what you probably want is a RegEx match. Totally doable! I'm only going to address the Where statement (?{ }) here, this won't change anything else in the script.
We leave the extension part, but we're going to need to modify the file name part. With RegEx you can match against alternative text by putting it in parentheses and separating the options with a pipe character. So it would look something like this:
"(T042456|T195917|T048585)"
Now we can incorporate that into the rest of the Where statement and it looks like this:
?{$_.Extension -match ".EDIPROD" -and $_.name -match "(T042456|T195917|T048585)"}
or in your script:
Where-Object {$_ -match "(T042456|T195917|T048585)"}
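The alternation can be tried against plain strings before wiring it into the pipeline. A quick sketch; the second file name is made up so that one name fails the test:

```powershell
$pattern = '(T042456|T195917|T048585)'

# The first name is the example from the question; the second is a made-up non-match.
$names = 'AIDOCCAI.D051414.T042456.MO.EDIPROD',
         'AIDOCCAI.D051414.T999999.MO.EDIPROD'

# -match succeeds when any one of the alternatives occurs in the name.
$matching = $names | Where-Object { $_ -match $pattern }
```

Only the first name ends up in $matching.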
Edit3: Hm, need the first line for the qualifier. That complicates things a bit. Ok, so what I'm thinking is to get our directory listing, get the first line of each file with the desired extension, make an object that has two properties, the first property is the fileinfo object for the file, and the other property will be the first line of the file. Wait, I think we can do better. Switch (GCI *.EDIPROD){(get-content|select -first 1) -match 820}{Rename 820};{blah blah -match 834}{rename 834}}. Yeah, that should work. Ok, actual script, not theoretical gibberish script time. This way if you have other things to look for you can just add lines for them.
Switch(GCI \\fhnsrv01\home\aborgetti\Documentation\Stage\*.EDIPROD){
{(GC $_|Select -first 1).substring(177) -match "^834"}{$_ | Rename-Item -NewName {"834Dailyin$b"};Continue}
{(GC $_|Select -first 1).substring(177) -match "^820"}{$_ | Rename-Item -NewName {$_.BaseName+'_820.txt'};Continue}
}
Again, if you want the EDIPROD part to remain in the file name change $_.BaseName to $_.Name. Switch is pretty awesome if you're trying to match against different things and perform different actions depending on what the results are. If you aren't familiar with it you may want to go flex your google muscles and check it out.
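For readers new to Switch, here is a stripped-down sketch of the same pattern on plain strings. The sample lines are made up and stand in for the first line of each file:

```powershell
# Made-up first lines standing in for (Get-Content $file | Select-Object -First 1).
$firstLines = '834 enrollment', '820 payment', '999 other'

# Switch tests every element against each script-block condition in turn;
# continue skips the remaining conditions and moves on to the next element.
$suffixes = switch ($firstLines) {
    { $_ -match '^834' } { '_834.txt'; continue }
    { $_ -match '^820' } { '_820.txt'; continue }
    default              { 'unchanged' }
}
```

$suffixes then holds _834.txt, _820.txt and unchanged, in that order. Supporting another code only takes one more condition line.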
Hm, alternatively we could have gotten the first line inside the Where filter, run a regex match against that, and renamed the file based on the regex match.
GCI \\fhnsrv01\home\aborgetti\Documentation\Stage\*.EDIPROD | ?{(GC $_ | Select -First 1) -match "(820|834)"}|Rename-Item -NewName {$_.Name+"_"+$Matches[1]+".txt"}
Then you just have to update the Where statement to include anything you're trying to match against. That's kind of sexy, though not as versatile as the switch. But for just simple search and rename it works fine.
Try it this way:
Get-ChildItem -Filter "*T042456*" -Recurse | Rename-Item -NewName { $_.Name + '_834.txt' }