How to get an exact regex match against a variable in PowerShell

Another one that doesn't seem to have any solution already online.
I have a file with all my volumes and I have a variable where the user writes a volume such as "C" or "D". The problem here is that if I write "D" it will take me every single line that has a D.
I have this code:
$unit=Read-Host -Prompt 'Introduce a volume name'
echo "Searching for the volume..."
Get-Volume > volumes.txt
get-content volumes.txt | select-string -pattern "$unit" > exist2.txt
gc exist2.txt | where{$_ -ne ""} > exist.txt
$disk=(Get-Content exist.txt)
echo $disk
So the regex should be on the "select-string -pattern" and this is what I've tried so far:
get-content volumes.txt | select-string -pattern "/(^|\W)$unit($|\W)/i"
get-content volumes.txt | select-string -pattern "^[$unit]$"
get-content volumes.txt | select-string -pattern '^$unit,'
get-content volumes.txt | select-string -pattern "\$unit\b"
All of them return nothing, and what I want returned is the line of the D unit.
For example, if I write "C" this is what should be returned:
C NTFS Fixed Healthy OK
Thank you very much!

PowerShell is an object-oriented shell precisely so that you don't have to do this kind of string scraping. A PowerShell way to do this is:
Get-Volume | where-object { $_.DriveLetter -eq $unit -or $_.FriendlyName -eq $unit }
The output on screen does contain a header line, but the header is not part of the content. The command returns objects, and if you do nothing with them, PowerShell formats them as a table for display.
If you want to see it without headers:
Get-Volume |
where-object { $_.DriveLetter -eq $unit -or $_.FriendlyName -eq $unit } |
format-Table -HideTableHeaders
But if you're going to work with the result further in the script, don't convert it to text; that only makes it harder to work with later.
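If the text-based route is still wanted, two things are worth knowing: PowerShell patterns are plain .NET regexes (no /.../i delimiters, and matching is case-insensitive by default), and a single-quoted string does not expand $unit. The following is only a sketch under the assumption that the drive letter is the first token on each line; the sample lines are made up and stand in for the contents of volumes.txt.

```powershell
# Sample lines standing in for the text of volumes.txt (made up for illustration).
$lines = @(
    'C NTFS Fixed Healthy OK',
    'D NTFS Fixed Healthy OK',
    'E Data NTFS Fixed Healthy OK'
)
$unit = 'D'

# Anchor at the start of the line, escape the user input, and require
# whitespace right after it so that 'D' cannot match the 'D' in 'Data'.
$pattern = '^\s*' + [regex]::Escape($unit) + '\s'

# Select-String returns MatchInfo objects; .Line is the original text.
$found = $lines | Select-String -Pattern $pattern | ForEach-Object { $_.Line }
$found
```

The object-based filter above remains the more robust choice, because it does not depend on how the table happens to be laid out.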

Related

How can i search for multiple string patterns in text files within a directory

I have a textbox that takes an input and searches a drive.
Drive for example is C:/users/me
let's say I have multiple files and subdirectories in there and I would like to search if the following strings exist in the file: "ssn" and "DOB"
Once the user inputs the two strings, I split the input by space so I can loop through the array. Here is my current code, but I'm stuck on how to proceed.
gci "C:\Users\me" -Recurse | where { ($_ | Select-String -pattern ('SSN') -SimpleMatch) -or ($_ | Select-String -pattern ('DOB') -SimpleMatch ) } | ft CreationTime, Name -Wrap -GroupBy Directory | Out-String
The above code works if I paste it manually into PowerShell, but I'm trying to recreate it in a script and am confused about how to do so.
The code below is not getting all the files needed.
foreach ($x in $StringArrayInputs) {
if($x -eq $lastItem){
$whereClause = ($_ | Select-String -Pattern $x)
}else{
$whereClause = ($_ | Select-String -Pattern $x) + '-or'
}
$files= gci $dv -Recurse | Where { $_ | Select-String -Pattern $x -SimpleMatch} | ft CreationTime, Name -Wrap -GroupBy Directory | Out-String
}
Select-String's -Pattern parameter accepts an array of strings (any one of which triggers a match), so piping directly to a single Select-String call should do:
$files= Get-ChildItem -File -Recurse $dv |
Select-String -List -SimpleMatch -Pattern $StringArrayInputs |
Get-Item |
Format-Table CreationTime, Name -Wrap -GroupBy Directory |
Out-String
Note:
Using -File with Get-ChildItem makes it return only files, not also directories.
Using -List with Select-String is an optimization that ensures that at most one match per file is looked for and reported.
Passing Select-String's output to Get-Item automatically binds the .Path property of the former's output to the -Path parameter of the latter.
Strictly speaking, binding to -Path subjects the argument to interpretation as a wildcard expression, which, however, is generally not a concern - except if the path contains [ characters.
If that is a possibility, insert a pipeline segment with Select-Object @{ Name='LiteralPath'; Expression='Path' } before Get-Item, which ensures binding to -LiteralPath instead.
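Putting the notes above together, the pipeline with the -LiteralPath safeguard could be wrapped roughly as follows. This is a sketch, not a drop-in replacement: the function name Find-FileContainingAny and its parameter names are made up for illustration.

```powershell
# Hypothetical helper: returns the files under $Root that contain
# at least one of the literal strings in $Terms.
function Find-FileContainingAny {
    param(
        [Parameter(Mandatory)][string]$Root,
        [Parameter(Mandatory)][string[]]$Terms
    )
    Get-ChildItem -LiteralPath $Root -File -Recurse |
        # -List stops at the first hit per file; -SimpleMatch treats terms literally.
        Select-String -List -SimpleMatch -Pattern $Terms |
        # Rename Path to LiteralPath so Get-Item does not interpret wildcards such as [.
        Select-Object @{ Name = 'LiteralPath'; Expression = 'Path' } |
        Get-Item
}
```

A call such as Find-FileContainingAny -Root $dv -Terms $StringArrayInputs returns file objects, which can then be piped to Format-Table CreationTime, Name -Wrap -GroupBy Directory as in the answer.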
I just followed your examples and combined both with a regex. I escaped the regex to avoid accidental use of regex expressions (such as a dot matching any character).
It works with my test files but may behave differently with yours. You may need to add "-Encoding UTF8" (or whatever encoding is appropriate) so that region-specific characters are read correctly.
$String = Read-Host "Enter multiple strings separated by space to search for"
$escapedRegex = ([Regex]::Escape($String)) -replace "\\ ","|"
Get-ChildItem -Recurse -Attributes !Directory | Where-Object {
$_ | Get-Content | Select-String -Pattern $escapedRegex
} | Format-Table CreationTime, Name -Wrap -GroupBy Directory | Out-String
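To see what the escape-then-replace step produces, here is a small sketch with a made-up input. [Regex]::Escape turns each space into a backslash followed by a space, and the -replace then swaps that pair for the alternation bar.

```powershell
# Made-up user input: two search terms, one containing a regex metacharacter.
$String = 'a.b DOB'

# Escape first (the dot becomes \. and the space becomes "\ "),
# then turn every escaped space into a regex "or".
$escapedRegex = [regex]::Escape($String) -replace '\\ ', '|'

$escapedRegex            # the resulting pattern
'axb' -match $escapedRegex     # the dot is literal, so this does not match
'dob: 1' -match $escapedRegex  # -match is case-insensitive by default
```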

Include filename in output

I would like to get content from files in a folder (ignoring the header lines, since some file may ONLY contain the header). But in the output, I would like to include the filename from which the line is read. So far, I have the following:
Get-ChildItem | Get-Content | Where { $_ -notlike "HEADER_LINE_TEXT" } | Out-File -FilePath output_text.txt
I've tried to work with creating a variable in the Where block, $filename=$_.BaseName, and using it in the output, but this didn't work.
EDIT:
I ended up with the following:
Get-ChildItem -Path . |
Where-Object { $_.FullName -like "*records.txt"; $fname=$_FullName; } |
Get-Content |
Select-Object { ($fname + "|" + $_.Trim()) } |
Where { $_ -notlike "*HEADER_LINE_TEXT*" } |
Format-Table -HideTableHeaders |
Out-File -FilePath output_text.txt
This looks lengthy, and can probably be made shorter and clearer. Can someone help with cleaning this up a bit? I'll either post the solution, or vote for a cleaner solution, if one is posted. Thanks.
This looks like a case where not making it a one-liner would be more readable, at the cost of a little additional memory usage.
$InputFolder = "C:\example"
$OutputFile = "C:\example\output_text.txt"
$Files = Get-ChildItem $InputFolder | Where-Object { $_.FullName -like "*records.txt"}
Foreach ($File in $Files) {
$FilteredContent = Get-Content $File.FullName | Where-Object {$_ -notlike "*HEADER_LINE_TEXT*"}
$Output = $FilteredContent | Foreach-Object { "$($File.FullName)|$($_.Trim())" }
$Output | Out-File $OutputFile -Append
}
If you are going to go oneliner style for brevity, you could cut down on length by using position for parameters and using aliases.
Here are a couple other changes:
No need for the second semicolon in your first where block.
I think your variable wasn't working because you were missing the period between $_ and fullname.
Format-Table isn't needed because you already have the string you want to output
You can optimize a little by moving the second where earlier so that you don't trim() on lines you are just going to filter
Looks like you want to use foreach instead of select
Removed the + operator for string concatenation, instead using $() to evaluate inside parenthesis
gci . |
? { $_.FullName -like "*records.txt"; $fname=$_.FullName } |
% { gc $_.FullName } |
? { $_ -notlike "*HEADER_LINE_TEXT*" } |
% { "$fname|$($_.Trim())" } |
Out-File output_text.txt
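A middle ground between the two versions is to keep the pipeline but hold the current file in a named variable inside ForEach-Object, which avoids assigning $fname as a side effect of the filter. This is a sketch only; the function name Get-PrefixedLine and its parameters are made up for illustration.

```powershell
# Hypothetical helper: emits "<full path>|<trimmed line>" for every
# non-header line of every matching file in $Folder.
function Get-PrefixedLine {
    param(
        [string]$Folder = '.',
        [string]$Filter = '*records.txt',
        [string]$HeaderText = 'HEADER_LINE_TEXT'
    )
    Get-ChildItem -LiteralPath $Folder -Filter $Filter -File | ForEach-Object {
        $file = $_   # keep the file object; $_ is rebound in the inner pipeline
        Get-Content -LiteralPath $file.FullName |
            Where-Object { $_ -notlike "*$HeaderText*" } |
            ForEach-Object { '{0}|{1}' -f $file.FullName, $_.Trim() }
    }
}
```

The result can be written out with Get-PrefixedLine | Out-File output_text.txt. Files that contain only the header simply contribute no lines.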

powershell searching for a phrase in a large amount of files fast

Hello, my question is: is there a faster way to search for a phrase in a file than Select-String? I need to find a certain phrase in the first line of about 60k files, but the way I am currently doing it is too slow for what I need.
I have tried doing
(Select-String "Phrase I am looking for" (cat mylist1)).Filename > mylist2
which gave me a result of 2 minutes and 30 seconds, and then I tried
cat mylist1| %{ if ((cat $_ -first 1) -match "Phrase I am looking for") {echo $_}} > mylist2
which gave me a result of 2 minutes and 57 seconds. Is there another method of searching for a string through a large number of files that can bring the search time down?
If you have PowerShell 4.0 or later, you can use the .Where() method (introduced in 4.0) together with Get-Content's -TotalCount, and that should help some. -TotalCount defines how many lines of the file are read. I see that you are already using its alias -First, though, so that part won't change much.
$path = "d:\temp"
$matchingPattern = "function"
(Get-ChildItem $path -File).Where{(Get-Content $_ -TotalCount 1) -match $matchingPattern }
I will try to test this against 60K files and see what I get in the meantime. The above returns file objects whose first line contains "function". My test ran against 60K files, though my lines were likely shorter. It still finished in 44 seconds, so perhaps that will help you.
StreamReader will usually beat out Get-Content as well but since we are only getting one line I don't think it will be more efficient. This uses a streamreader in the where clause and reads the first line.
(Get-ChildItem $path -File).Where{([System.IO.File]::OpenText($_.Fullname).ReadLine()) -match $matchingPattern }
Note that the above code never closes the reader, so it can leak file handles, but it finished in 8 seconds compared to the 44 seconds of my first test. Writing to a file added a second or two. Your mileage will vary.
Note that -match supports regex so you would need to escape regex meta characters if present.
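Both caveats above (the unclosed reader and regex metacharacters) can be handled in a small helper. This is a sketch under the assumption that the phrase should be matched literally; the function name Test-FirstLine and its parameters are made up for illustration.

```powershell
# Hypothetical helper: $true if the first line of the file contains $Phrase.
function Test-FirstLine {
    param(
        [Parameter(Mandatory)][string]$Path,    # pass a full path; .NET resolves
                                                # relative paths against the process
                                                # directory, not the PowerShell location
        [Parameter(Mandatory)][string]$Phrase
    )
    $reader = [System.IO.File]::OpenText($Path)
    try {
        $firstLine = $reader.ReadLine()   # $null for an empty file
    }
    finally {
        $reader.Dispose()                 # release the file handle right away
    }
    # Escape the phrase so regex metacharacters in it are matched literally.
    return ($null -ne $firstLine) -and ($firstLine -match [regex]::Escape($Phrase))
}
```

It plugs into the same filter as before, for example (Get-ChildItem $path -File).Where{ Test-FirstLine -Path $_.FullName -Phrase $matchingPattern }. Whether the extra function call costs measurable time over 60K files has not been measured here.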
You can do it simply:
$yoursearch = "PowerShell is cool!"
get-content "c:\temp\*.*" -TotalCount 1 | where { $_ -ilike "*$yoursearch*"} | select PSPath, @{N="Founded";E={$_}}
Or, a short version for non-purists:
gc "c:\temp\*.*" -To 1 | ? { $_ -ilike "*$yoursearch*"} | select PSPath, {$_}
If you want to export your result:
$yoursearch = "PowerShell is cool!"
get-content "c:\temp\*.*" -TotalCount 1 | where { $_ -ilike "*$yoursearch*"} | select PSPath, @{N="Founded";E={$_}} |
export-csv "c:\temp\yourresult.csv" -notype
If you want a better filter for the input files:
Get-ChildItem "c:\temp" -File |
Where {$firstrow= (Get-Content $_.FullName -TotalCount 1); $firstrow -ilike "*$yoursearch*"} |
Select fullName, @{N="Founded";E={$firstrow}} |
Export-Csv "c:\temp\yourresult.csv" -notype
Or, a short version for non-purists:
gci "c:\temp" -File | ? {$r= (gc $_.FullName -TotalCount 1); $r -ilike "*$yoursearch*"} |
Select f*, @{N="Founded";E={$r}} |
epcsv "c:\temp\yourresult.csv" -notype
Note: the -File switch requires PowerShell 3.0 or later; on older versions, test the PSIsContainer property in the where filter instead.
Note 2: You can use the -List switch of Select-String, which searches the whole file but stops at the first matching line.
$yoursearch = "PowerShell where are you"
Select-String -Path "c:\temp\*.*" -Pattern $yoursearch -list | select Path, Line | export-csv "C:\temp\result.csv" -NoTypeInformation
A quick way to write to a file is to use the StreamWriter object. Assuming the files are in a folder:
$writer = [System.IO.StreamWriter] "selection.txt"
$files = gci -Path $path
$pattern ="Phrase"
$files | %{gc -Path $_.FullName | select -First 1 | ?{$_ -match $pattern}} | %{$writer.WriteLine($_)}
# Close the writer so buffered lines are flushed and the file is released
$writer.Close()
If the string you are looking for is part of the file name rather than the file content, I would do something like:
Get-ChildItem -Path $path | Where-Object{$_.Name -like "*My String*"}
This is generally fast. Be advised, however, that if you use -Recurse over the entire C:\ drive you will be waiting a minute or more regardless, unless you multi-thread the search.

Using PowerShell to remove lines from a text file if it contains a string

I am trying to remove all the lines from a text file that contains a partial string using the below PowerShell code:
Get-Content C:\new\temp_*.txt | Select-String -pattern "H|159" -notmatch | Out-File C:\new\newfile.txt
The actual string is H|159|28-05-2005|508|xxx, it repeats in the file multiple times, and I am trying to match only the first part as specified above. Is that correct? Currently I am getting empty as output.
Am I missing something?
Suppose you want to write that in the same file, you can do as follows:
Set-Content -Path "C:\temp\Newtext.txt" -Value (get-content -Path "c:\Temp\Newtext.txt" | Select-String -Pattern 'H\|159' -NotMatch)
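Escape the | character with a backslash. (A backtick is PowerShell's own escape character; the regex engine treats it as an ordinary character, so 'H`|159' would still mean "H` or 159".)
get-content c:\new\temp_*.txt | select-string -pattern 'H\|159' -notmatch | Out-File c:\new\newfile.txt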
Another option for writing to the same file, building on the existing answers. Just add parentheses so that reading completes before the content is sent to the file.
(get-content c:\new\sameFile.txt | select-string -pattern 'H\|159' -notmatch) | Set-Content c:\new\sameFile.txt
You don't need Select-String in this case, just filter the lines out with Where-Object
Get-Content C:\new\temp_*.txt |
Where-Object { -not $_.Contains('H|159') } |
Set-Content C:\new\newfile.txt
String.Contains does a plain string comparison instead of a regex match, so you don't need to escape the pipe character, and it's also faster. Note that it is case-sensitive, unlike -match.
The pipe character | has a special meaning in regular expressions. a|b means "match either a or b". If you want to match a literal | character, you need to escape it:
... | Select-String -Pattern 'H\|159' -NotMatch | ...
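The difference between the escaped and unescaped pattern is easy to see on a few sample lines. The lines below are made up for illustration (only the first follows the format from the question); the sketch shows three ways of matching the pipe literally, plus the unescaped pattern for contrast.

```powershell
$lines = @(
    'H|159|28-05-2005|508|xxx',
    'D|160|28-05-2005|509|yyy',
    'T|159'
)

# 1. Regex with the pipe escaped by a backslash.
$viaRegex = $lines | Where-Object { $_ -notmatch 'H\|159' }

# 2. Let .NET escape the literal text.
$viaEscape = $lines | Where-Object { $_ -notmatch [regex]::Escape('H|159') }

# 3. Skip regex altogether with -SimpleMatch.
$viaSimple = $lines |
    Select-String -Pattern 'H|159' -SimpleMatch -NotMatch |
    ForEach-Object { $_.Line }

# For contrast: unescaped, the pattern means "H or 159",
# so the 'T|159' line is dropped as well.
$unescaped = $lines | Where-Object { $_ -notmatch 'H|159' }
```

The first three variables hold the same two lines; $unescaped holds only the D|160 line.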
This is probably a long way around a simple problem, but it does allow me to remove lines containing any of a number of matches. I did not have a single partial match that could be used, and needed to do this on over 1000 files.
This post did help me get to where I needed to, thank you.
$ParentPath = "C:\temp\test"
$Files = Get-ChildItem -Path $ParentPath -Recurse -Include *.txt
$Match1 = "matchtext1"
$Match2 = "matchtext2"
$Match3 = "matchtext3"
$Match4 = "matchtext4"
$Match5 = "matchtext5"
$Match6 = "matchtext6"
$Match7 = "matchtext7"
$Match8 = "matchtext8"
$Match9 = "matchtext9"
$Match10 = "matchtext10"
foreach ($File in $Files) {
$FullPath = $File | % { $_.FullName }
$OldContent = Get-Content $FullPath
$NewContent = $OldContent `
| Where-Object {$_ -notmatch $Match1} `
| Where-Object {$_ -notmatch $Match2} `
| Where-Object {$_ -notmatch $Match3} `
| Where-Object {$_ -notmatch $Match4} `
| Where-Object {$_ -notmatch $Match5} `
| Where-Object {$_ -notmatch $Match6} `
| Where-Object {$_ -notmatch $Match7} `
| Where-Object {$_ -notmatch $Match8} `
| Where-Object {$_ -notmatch $Match9} `
| Where-Object {$_ -notmatch $Match10}
Set-Content -Path $FullPath -Value $NewContent
Write-Output $File
}
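The ten chained Where-Object filters can also be collapsed into a single alternation pattern. The sketch below assumes the match texts are meant literally (hence the escaping, which the script above does not do); the variable names and sample lines are made up for illustration.

```powershell
# Literal texts to filter out (shortened list for illustration).
$patterns = 'matchtext1', 'matchtext2', 'matchtext3'

# Escape each text and join them with the regex "or" operator.
$combined = ($patterns | ForEach-Object { [regex]::Escape($_) }) -join '|'

# Made-up file content.
$oldContent = @('keep me', 'drop matchtext2 here', 'also keep', 'matchtext3')

# One pass over the lines instead of one pass per pattern.
$newContent = $oldContent | Where-Object { $_ -notmatch $combined }
$newContent
```

Inside the foreach loop, $oldContent would come from Get-Content $FullPath as before.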
If anyone runs into this issue while doing what Robert Brooker suggested:
*These files have different encodings. Left file: Unicode (UTF-8) with signature. Right file: Unicode (UTF-8) without signature. You can resolve the difference by saving the right file with the encoding Unicode (UTF-8) with signature.*
use -Encoding UTF8 with Set-Content, like this:
(get-content c:\new\sameFile.txt | select-string -pattern 'H\|159' -notmatch) | Set-Content c:\new\sameFile.txt -Encoding UTF8

How do I add a newline to command output in PowerShell?

I run the following code using PowerShell to get a list of add/remove programs from the registry:
Get-ChildItem -path hklm:\software\microsoft\windows\currentversion\uninstall `
| ForEach-Object -Process { Write-Output $_.GetValue("DisplayName") } `
| Out-File addrem.txt
I want the list to be separated by newlines per each program. I've tried:
Get-ChildItem -path hklm:\software\microsoft\windows\currentversion\uninstall `
| ForEach-Object -Process { Write-Output $_.GetValue("DisplayName") `n } `
| out-file test.txt
Get-ChildItem -path hklm:\software\microsoft\windows\currentversion\uninstall `
| ForEach-Object {$_.GetValue("DisplayName") } `
| Write-Host -Separator `n
Get-ChildItem -path hklm:\software\microsoft\windows\currentversion\uninstall `
| ForEach-Object -Process { $_.GetValue("DisplayName") } `
| foreach($_) { echo $_ `n }
But all result in weird formatting when output to the console, and with three square characters after each line when output to a file. I tried Format-List, Format-Table, and Format-Wide with no luck. Originally, I thought something like this would work:
Get-ChildItem -path hklm:\software\microsoft\windows\currentversion\uninstall `
| ForEach-Object -Process { "$_.GetValue("DisplayName") `n" }
But that just gave me an error.
Or, just set the output field separator (OFS) to double newlines, and then make sure you get a string when you send it to file:
$OFS = "`r`n`r`n"
"$( gci -path hklm:\software\microsoft\windows\currentversion\uninstall |
ForEach-Object -Process { write-output $_.GetValue('DisplayName') } )" |
out-file addrem.txt
Be sure to use the backtick (`) and not the single quote ('). On my keyboard (US-English QWERTY layout) it is located to the left of the 1 key.
(Moved here from the comments - Thanks Koen Zomers)
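If changing $OFS for the whole session feels too broad, the -join operator gives the same joined string without touching any preference variable. A small sketch with made-up program names:

```powershell
# Made-up display names standing in for the registry values.
$names = 'Adobe AIR', 'Adobe Flash Player 10 ActiveX'

# CRLF twice: one line break to end the line, one for the blank line.
$separator = "`r`n`r`n"

# -join builds a single string with the separator between the items.
$text = $names -join $separator
$text
```

The resulting string can be piped to Out-File exactly like the $OFS version.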
Give this a try:
PS> $nl = [Environment]::NewLine
PS> gci hklm:\software\microsoft\windows\currentversion\uninstall |
ForEach { $_.GetValue("DisplayName") } | Where {$_} | Sort |
Foreach {"$_$nl"} | Out-File addrem.txt -Enc ascii
It yields the following text in my addrem.txt file:
Adobe AIR
Adobe Flash Player 10 ActiveX
...
Note: on my system, GetValue("DisplayName") returns null for some entries, so I filter those out. BTW, you were close with this:
ForEach-Object -Process { "$_.GetValue("DisplayName") `n" }
Except that within a string, if you need to access a property of a variable, that is, "evaluate an expression", then you need to use subexpression syntax like so:
Foreach-Object -Process { "$($_.GetValue('DisplayName'))`r`n" }
Essentially within a double quoted string PowerShell will expand variables like $_, but it won't evaluate expressions unless you put the expression within a subexpression using this syntax:
$(<Multiple statements can go in here>).
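The difference is easiest to see side by side. In this sketch, $entry is a made-up object standing in for a registry key:

```powershell
# Made-up object with a single property.
$entry = [pscustomobject]@{ DisplayName = 'Adobe AIR' }

# Only $entry is expanded; ".DisplayName" is appended as literal text.
$withoutSub = "$entry.DisplayName"

# The subexpression is evaluated first, then its result is inserted.
$withSub = "$($entry.DisplayName)"

$withoutSub
$withSub
```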
I think you had the correct idea with your last example. You only got an error because you were trying to put quotes inside an already quoted string. This will fix it:
gci -path hklm:\software\microsoft\windows\currentversion\uninstall | ForEach-Object -Process { write-output ($_.GetValue("DisplayName") + "`n") }
Edit: Keith's $() operator actually makes for better syntax (I always forget about this one). You can also escape quotes inside quotes, like so:
gci -path hklm:\software\microsoft\windows\currentversion\uninstall | ForEach-Object -Process { write-output "$($_.GetValue(`"DisplayName`"))`n" }
Ultimately, what you're trying to do with the EXTRA blank lines between each one is a little confusing :)
I think what you really want to do is use Get-ItemProperty. You'll get errors when values are missing, but you can suppress them with -ErrorAction 0 or just leave them as reminders. Because the Registry provider returns extra properties, you'll want to stick in a Select-Object that uses the same properties as the Get-ItemProperty call.
Then if you want each property on a line with a blank line between, use Format-List (otherwise, use Format-Table to get one per line).
gci -path hklm:\software\microsoft\windows\currentversion\uninstall |
gp -Name DisplayName, InstallDate |
select DisplayName, InstallDate |
fl | out-file addrem.txt
The option that I tend to use, mostly because it's simple and I don't have to think, is Write-Output, as below. Each string that Write-Output emits ends up on its own line in the file, so you don't have to add an EOL marker yourself and can simply output the finished string.
Write-Output $stringThatNeedsEOLMarker | Out-File -FilePath PathToFile -Append
Alternatively, you could also just build the entire string using Write-Output and then push the finished string into Out-File.