select-string -pattern wildcard for a specific field - powershell

I'm still pretty new to PowerShell. I'm not sure how to fix this, or even what I am doing wrong.
My ultimate goal is to pull codes out of a long txt file. Each code is 5 groups of 5 characters, with the groups delimited by -. Example: JTI45-534YS-PKQN6-MSE9S-2PFNM. There are multiple of these, and I need to pull them all out of the file at once.
I'm trying multiple different variations on
Select-String .\reducedCodes.txt -Pattern "*-*-*-*-*"
or
Select-String .\reducedCodes.txt -Pattern "?????-?????-?????-?????-?????"
Thanks in advance.

Since the string you are looking for is alphanumeric, you could use a regex word character, which is denoted by \w. Since there are five in a row, you could use \w{5}, with the groups separated by the - character. Select-String normally gives you the lines containing the matches; since you just want the matches themselves, get the Matches property, where Value is the full match. Also note the Groups property: if you put each \w{5} inside (), you can get an individual group.
(Select-String .\reducedCodes.txt -Pattern '\w{5}-\w{5}-\w{5}-\w{5}-\w{5}').Matches.Value
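As a rough sketch of the Matches and Groups idea above, the following runs against in-memory sample lines instead of reducedCodes.txt. Only the first code comes from the question; the other two and the variable names are made up for illustration. It also assumes a line may hold more than one code, which is why -AllMatches is added, and it uses [A-Z0-9] in place of \w because \w would also accept underscores.

```powershell
# Sample text standing in for reducedCodes.txt (only the first code is from the question).
$sample = @(
    'first: JTI45-534YS-PKQN6-MSE9S-2PFNM and second: AAAAA-BBBBB-CCCCC-DDDDD-EEEEE'
    'no code on this line'
    'third: 11111-22222-33333-44444-55555'
)

# One capture group per five-character block.
$pattern = '([A-Z0-9]{5})-([A-Z0-9]{5})-([A-Z0-9]{5})-([A-Z0-9]{5})-([A-Z0-9]{5})'

# -AllMatches returns every match on a line, not just the first one.
$found = $sample |
    Select-String -Pattern $pattern -AllMatches |
    ForEach-Object { $_.Matches }

# Value is the full code; Groups[1] is the first five-character block.
$codes       = $found | ForEach-Object { $_.Value }
$firstBlocks = $found | ForEach-Object { $_.Groups[1].Value }

$codes
```

Groups[0] is always the whole match, so the individual blocks are Groups[1] through Groups[5].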

Related

Get count of exact word match within a Text file

The requirement is to get the exact match count for a word "test". So in the following example it should be 1:
testing 1 2 3 "test" testing
Tester testing 2345 tes testers testings testing
test
I tried the below code :
(Get-Content "C:\Users\abc\Desktop\POC\Findstring.txt" |
Select-String -Pattern "test" -AllMatches).matches.count
But it gives me 9, because it behaves like a LIKE match (it also counts tester, testing, etc.).
How should we ensure that we get the count for exact match and not for a LIKE operator scenario (similar to in SQL).
tl;dr
Use regex \btest\b as the -Pattern argument so as to match test as a whole word only.
Pass your input file path directly to Select-String's -LiteralPath parameter, which is much faster and more efficient than streaming the individual lines from the file via Get-Content.
(
Select-String -AllMatches `
-Pattern '\btest\b' `
-LiteralPath C:\Users\abc\Desktop\POC\Findstring.txt
).Matches.Count
Note: The command is spread across multiple lines for readability. To convert it to a single-line form, also remove the line-ending ` (backtick) characters, which act as line continuations.
Your intent is to limit matching test substrings to whole words.
Since Select-String uses regexes (regular expressions), you can do so by enclosing the substring in word-boundary assertions, \b, as Theo advises, i.e. '\btest\b'
For a detailed explanation of this regex and the ability to interact with it, see this regex101.com page
Also note that Select-String - like PowerShell in general - is case-insensitive by default; to match case-sensitively, add the -CaseSensitive switch.
Variation with also ignoring the word test when enclosed in "..."
If you additionally want to ignore "test" substrings (i.e. double-quoted instances of the word), you must amend your regex to also include a negative look-behind assertion, (?<!...), in order to preclude a " preceding the word:
(
Select-String -AllMatches `
-Pattern '(?<!")\btest\b' `
-LiteralPath C:\Users\abc\Desktop\POC\Findstring.txt
).Matches.Count
See this regex101.com page.
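To see how the three patterns differ, here is a small sketch that pipes the three sample lines from the question into Select-String as strings rather than reading a file. The variable names are illustrative only.

```powershell
# The three sample lines from the question.
$lines = @(
    'testing 1 2 3 "test" testing'
    'Tester testing 2345 tes testers testings testing'
    'test'
)

# Plain substring match: also counts testing, Tester, testers, ...
$substringCount = ($lines | Select-String -Pattern 'test' -AllMatches).Matches.Count

# Whole-word match: counts the quoted "test" and the bare test.
$wholeWordCount = ($lines | Select-String -Pattern '\btest\b' -AllMatches).Matches.Count

# Whole-word match that is not directly preceded by a double quote.
$unquotedCount = ($lines | Select-String -Pattern '(?<!")\btest\b' -AllMatches).Matches.Count

"$substringCount $wholeWordCount $unquotedCount"   # 9 2 1
```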
Currently, you search for the pattern test, which also matches inside testing, testers, etc. The following should do the trick:
((Get-Content "C:\tmp\testdata.txt") -split " " | Select-String -Pattern '^(test)$' -AllMatches).count

Read specific text from text files

I wrote a PowerShell script to compare two text files. In file1 the data is organised, but in file2 it is not. I usually organise the data manually, but the amount of data has grown, so I need to automate the organising with PowerShell.
PowerShell has to read the data between two special characters. For example, in <****# my data is ****, and it has to read only the ****. This pattern repeats 'n' times.
Use a regular expression <(.*?)# to match the relevant substring. .*? matches all characters between < and the next occurrence of # (non-greedy/shortest match). The parentheses put the match in a capturing group, so it can be referenced later.
Select-String -Path 'C:\path\to\file2.txt' -Pattern '<(.*?)#' -AllMatches |
Select-Object -Expand Matches |
ForEach-Object { $_.Groups[1].Value }
$_.Groups[1] refers to the first capturing group in a match.
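The difference between the non-greedy and greedy forms can be sketched on an in-memory string. The sample text and variable names here are invented for illustration and do not come from the asker's file.

```powershell
# Made-up sample line containing three <...# fields.
$text = 'noise <alpha# more noise <beta# tail <gamma#'

# Non-greedy: each match stops at the next #, so there are three captures.
$values = $text |
    Select-String -Pattern '<(.*?)#' -AllMatches |
    Select-Object -ExpandProperty Matches |
    ForEach-Object { $_.Groups[1].Value }

# Greedy, for comparison: a single match running from the first < to the last #.
$greedy = [regex]::Match($text, '<(.*)#').Groups[1].Value

$values   # alpha, beta, gamma
$greedy   # alpha# more noise <beta# tail <gamma
```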

Add quotes to each column in a CSV via Powershell

I am trying to create a PowerShell script which wraps quotes around each column of the file on export to CSV. However, the Export-CSV cmdlet only places these where they are needed, i.e. where the text has a space or similar within it.
I have tried to use the following to wrap the quotes on each line, but it ends up wrapping three quotes around each column.
$r.SURNAME = '"'+$r.SURNAME+'"';
Is anyone able to share how to force these on each column of the file? So far I can only find info on stripping them out.
Thanks
Perhaps a better approach would be to simply convert to CSV (not export); a simple regex can then add the quotes before the result is piped out to a file.
Assuming you are exporting the whole object $r:
$r | ConvertTo-Csv -NoTypeInformation `
| % { $_ -replace ',(.*?),',',"$1",' } `
| Select -Skip 1 | Set-Content C:\temp\file.csv
The Select -Skip 1 removes the header. If you want to keep the header, just remove that step.
To clarify what the regex expression is doing:
Match: ,(.*?),
Explanation: This matches the section of each line that has a comma followed by any number of characters (.*) without being greedy (the ? means it matches only the minimum number of characters needed to complete the match), and is finally ended with a comma. The parentheses hold everything between the two commas in a match variable to be used later in the replace.
Replace: ,"$1",
Explanation: The $1 holds what was matched between the parentheses mentioned above. I am surrounding it with quotes and re-adding the commas; since I matched on those as well, they must be put back or they are simply consumed. Please note that while the match portion of the -replace can use double quotes without issue, the replace portion must be surrounded by single quotes, or the $1 gets interpreted by PowerShell as a PowerShell variable and not a match variable.
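For comparison, here is a sketch of a different technique that does not use regex replacement at all: it builds each CSV line by hand and quotes every field explicitly, including the first and last ones. The sample rows and the CITY column are hypothetical stand-ins for $r. Like the Select -Skip 1 variant, it writes no header line.

```powershell
# Hypothetical sample rows standing in for $r.
$rows = @(
    [pscustomobject]@{ SURNAME = 'Smith';   CITY = 'New York' }
    [pscustomobject]@{ SURNAME = 'O"Brien'; CITY = 'Leeds' }
)

$csvLines = foreach ($row in $rows) {
    $fields = foreach ($prop in $row.PSObject.Properties) {
        # Double any embedded quote, then wrap the value in quotes.
        '"' + ([string]$prop.Value).Replace('"', '""') + '"'
    }
    $fields -join ','
}

$csvLines
# To write the result to disk:
# $csvLines | Set-Content C:\temp\file.csv
```

Doubling embedded quotes is the usual CSV convention, so a value such as O"Brien comes out as "O""Brien".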
You can also use the following code:
$r.SURNAME = "`"$($r.SURNAME)`""
I have cheated to get what I want by re-parsing the file through the following; I guess it acts as a simple find and replace on the file.
get-content C:\Data\Downloads\file2.csv |
foreach-object { $_ -replace '"""' ,'"'} |
set-content C:\Data\Downloads\file3.csv
Thanks for the help on this.

Select-String pattern not matching

I have the text of a couple hundred Word documents saved into individual .txt files in a folder. I am having an issue where a MergeField in the Word document wasn't formatted correctly, and now I need to find all the instances in the folder where the incorrect formatting occurs. The incorrect formatting is the string \#,$##,##0.00\*. So, I'm trying to use PowerShell as follows:
select-string -path MY_PATH\.*txt -pattern '\#,$##,##0.00\*'
select-string -path MY_PATH\.*txt -pattern "\#`,`$##`,##0.00\*"
But neither of those commands finds any results, even though I'm sure the string exists in at least one file. I feel like the error is occurring because there are special characters in the parameter (specifically $ and ,) that I'm not escaping correctly, but I'm not sure how else to format the pattern. Any suggestions?
If you are actually looking for \#,$##,##0.00\*, then you need to be aware that Select-String uses regex, and your string contains a lot of regex metacharacters. Your pattern should be
\\\#,\$\#\#,\#\#0\.00\\\*
Or you can use the static method Escape of regex to do the dirty work for you.
[regex]::Escape("\#,$##,##0.00\*")
To put this all together you would get the following:
select-string -path MY_PATH\.*txt -pattern ([regex]::Escape("\#,$##,##0.00\*"))
Or, even simpler, use the -SimpleMatch parameter, since it does not interpret the string as a regex; it just searches for it as is.
select-string -path MY_PATH\.*txt -SimpleMatch "\#,$##,##0.00\*"
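The three behaviours can be compared on in-memory strings. This is only a sketch: the two sample lines are invented, and the literal is held in a single-quoted string so that PowerShell itself leaves the $ alone.

```powershell
# The literal text to find, single-quoted so PowerShell does not touch the $.
$literal = '\#,$##,##0.00\*'

# Invented sample lines: only the first contains the badly formatted string.
$lines = @(
    'Amount: { MERGEFIELD Total \#,$##,##0.00\* MERGEFORMAT }'
    'Amount: { MERGEFIELD Total \# "$#,##0.00" \* MERGEFORMAT }'
)

# Escaped regex and -SimpleMatch both treat the text literally.
$viaEscape = $lines | Select-String -Pattern ([regex]::Escape($literal))
$viaSimple = $lines | Select-String -SimpleMatch -Pattern $literal

# Unescaped, the $ acts as an end-of-line anchor, so nothing can match.
$unescaped = $lines | Select-String -Pattern $literal

@($viaEscape).Count   # 1
@($viaSimple).Count   # 1
@($unescaped).Count   # 0
```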
My try, similar to Matt's:
select-string -path .\*.txt -pattern '\\#,\$##,##0\.00\\\*'
result:
test.txt:1:\#,$##,##0.00\*

Powershell select-string Matching x AND y

I'm trying to do a search of some log files for lines that contain certain strings. The files contain multiple lines like:
ALARM 11/08/2014 10:00:02,InFILE typeID,actionID,customerID: various_other_data_here
ALARM 11/08/2014 10:00:03,OutFILE typeID,actionID,customerID: various_other_data_here
I'm trying to find all lines in all files that have both 'ALARM' and 'OutFILE' in them.
I can use:
select-string .\*.log -pattern "ALARM"
to find all instances of 'ALARM', but how can I add the additional 'OutFILE'.
I've searched for this and found loads of examples that seem to be aimed at matching really complex strings with long RegEx, but nothing for matching simple x AND y type strings.
You can use a simple regex to match your scenario. This should work:
select-string .\*.log -pattern 'ALARM.*outFile'
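Note that 'ALARM.*outFile' only matches when ALARM comes before OutFILE on the line. If the order might vary, two possible sketches are a pair of lookaheads or a second filter on the matched lines. The third sample line below is invented to show the difference; the first two are from the question.

```powershell
$log = @(
    'ALARM 11/08/2014 10:00:02,InFILE typeID,actionID,customerID: various_other_data_here'
    'ALARM 11/08/2014 10:00:03,OutFILE typeID,actionID,customerID: various_other_data_here'
    'OutFILE mentioned before ALARM on this line'   # hypothetical reversed order
)

# Order-dependent: ALARM must appear before OutFILE.
$ordered = $log | Select-String -Pattern 'ALARM.*OutFILE'

# Order-independent: both lookaheads must succeed from the start of the line.
$anyOrder = $log | Select-String -Pattern '^(?=.*ALARM)(?=.*OutFILE)'

# Order-independent without lookaheads: filter the ALARM lines a second time.
$chained = $log |
    Select-String -Pattern 'ALARM' |
    Where-Object { $_.Line -match 'OutFILE' }

@($ordered).Count    # 1
@($anyOrder).Count   # 2
@($chained).Count    # 2
```

All three are case-insensitive by default, which is why the lowercase o in the original pattern still matches OutFILE.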