Remove the at symbol ( @ ) and curly bracket ( { ) from Select-String output in PowerShell - powershell

I'm parsing filenames in Powershell, and when I use Get-ChildItem | select name, I get a clean output of the files:
file1.txt
file2.txt
file3.txt
But when I try to narrow down those files with Select-String, I'm getting a weird @ and { in front of my output:
Get-ChildItem | select name | Select-String -Pattern "1"
@{file1.txt}
Is there a parameter I'm missing? If I pipe with findstr rather than Select-String it works like a charm:
Get-ChildItem | select name | Findstr "1"
file1.txt

You can simplify and speed up your command as follows:
@((Get-ChildItem).Name) -match '1'
Note: @(), the array-subexpression operator, is needed to ensure that -match operates on an array, even if only one file happens to exist in the current dir.
(...).Name uses member-access enumeration to extract all Name property values from the file-info objects returned by Get-ChildItem.
-match, the regular-expression matching operator, due to operating on an array of values, returns the sub-array of matching values.
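To illustrate why the array wrapper matters, here is a minimal sketch that uses a hypothetical single file name in place of real Get-ChildItem output; the variable names are illustrative only:

```powershell
# Hypothetical stand-in for (Get-ChildItem).Name when only one file exists
$single = 'file1.txt'

# Scalar LHS: -match returns a Boolean
$scalarResult = $single -match '1'

# Array LHS (forced via @()): -match returns the matching elements
$arrayResult = @($single) -match '1'

$scalarResult.GetType().Name   # Boolean
$arrayResult.GetType().Name    # Object[]
```

Without @(), a directory containing a single file would therefore yield True rather than the file name.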
To make your original command work:
Get-ChildItem | select -ExpandProperty Name |
Select-String -Pattern "1" | select -ExpandProperty Line
select -ExpandProperty Name makes select (Select-Object) return only the Name property values; by default (implied -Property parameter), a custom object that has a Name property is returned.
select -ExpandProperty line similarly extracts the Line property value from the Microsoft.PowerShell.Commands.MatchInfo instances that Select-String outputs.
Note that in PowerShell [Core] v7+ you could omit this step by instead using Select-String's (new) -Raw switch to request string-only output.
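As a rough sketch of the -Raw variant (PowerShell 7+ only), the following uses a hypothetical array of file names instead of actual Get-ChildItem output so that it runs anywhere:

```powershell
# Requires PowerShell 7+; $names simulates the expanded Name values
$names = 'file1.txt', 'file2.txt', 'file3.txt'

# -Raw makes Select-String emit the matching strings themselves
# instead of MatchInfo objects, so no -ExpandProperty Line step is needed.
$raw = $names | Select-String -Pattern '1' -Raw

$raw.GetType().Name   # String
$raw                  # file1.txt
```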
As for what you tried:
As stated, by not using -ExpandProperty, select name (implied -Property parameter) created a custom object ([pscustomobject] instance) with a Name property.
Select-String stringifies its input objects, if necessary, so it can perform a string search on them, which results in the representation you saw; here's a simulation:
# Stringify a custom object via an expandable string ("...")
PS> "$([pscustomobject] @{ Name = 'file1.txt' })"
@{Name=file1.txt}
As an aside:
The above stringification method is essentially like calling .ToString() on the input objects[1], which often results in useless string representations (by default, just the type name); a more useful and intuitive stringification would be to use PowerShell's rich output-formatting system, i.e. to use the string representation you would see in the console; changing Select-String's behavior to do that is the subject of this feature request on GitHub.
[1] Calling .ToString() directly on a [pscustomobject] instance is actually still broken as of PowerShell Core 7.0.0-rc.2, due to this bug; the workaround is to call .psobject.ToString() or to use an expandable string, as shown above.

Related

Use Select-String to get the single word matching pattern from files

I am trying to get only the word when using Select-String, but instead it is returning the whole string
Select-String -Path .\*.ps1 -Pattern '-Az' -Exclude "Get-AzAccessToken","-Azure","Get-AzContext"
I want to get all words in all .ps1 files that contain '-Az', for example 'New-AzHierarchy'
Select-String outputs objects of type Microsoft.PowerShell.Commands.MatchInfo by default, which supplement the whole line (input object) on which a match was found (.Line property) with metadata about the match (in PowerShell (Core) 7+, you can use -Raw to directly output the matching lines (input objects) only).
Note that in the default display output, it appears that only the matching lines are printed, with PowerShell (Core) 7+ now highlighting the part that matched the pattern(s).
Select-String's -Include / -Exclude parameters do not modify what patterns are matched; instead, they modify the -Path argument to further narrow down the set of input files. Since a wildcard expression as part of the -Path argument is usually sufficient, these parameters are rarely used.
Therefore:
Use the objects in the .Matches collection property of Select-String's output objects to access the part of the line that actually matched the given pattern(s).
Since you want to capture entire command names that contain substring -Az, such as New-AzHierarchy, you must use a regex pattern that also captures the relevant surrounding characters: \w+-Az\w+
The simplest way to exclude specific matches is to filter them out afterwards, using a Where-Object call.
# Note: -AllMatches ensures that if there are *multiple* matches
# on a single line, they are all reported.
Select-String -Path .\*.ps1 -Pattern '\w+-Az\w+' -AllMatches |
ForEach-Object { $_.Matches.Value } |
Where-Object { $_ -notin 'Get-AzAccessToken', '-Azure', 'Get-AzContext' }

Select-String not working on piped object using Out-String

I am making an API request that returns a bunch of data. When I attempt to search through it with Select-String, it just spits out the entire value stored in the variable. The API I am calling is on an internet server.
$return = Invoke-RestMethod -Method GET -Uri $uri -Headers @{"authorization" = $token} -ContentType "application/json"
$file = $return.data
$file | Out-String -Stream | Select-String -Pattern "word"
This returns the entire value of $file. Printing $file looks the same as the pipe output. Why is this not working?
$file.GetType() says it is a System.Object; another answer said to use Out-String, but something is not working.
$file.GetType()
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True String System.Object
To complement iRon7's helpful answer with the precise logic of Out-String's -Stream switch, as of PowerShell 7.1:
Out-String, like the other Out-* cmdlets such as Out-File, uses PowerShell's rich output-formatting system to generate human-friendly representations of its input objects.
Without -Stream, Out-String only ever produces a single, (typically) multiline string.
With -Stream, line-by-line output behavior typically occurs - except for input objects that happen to be multiline strings, which are output as-is.
Because this exception is both obscure and unhelpful, GitHub proposal #14638 suggests removing it.
For so-called in-band data types, -Stream works as follows, which truly results in line-by-line output:
Input objects are formatted by PowerShell's rich formatting system, and the lines that make up the resulting representation are then output one by one.
Out-of-band data types are individually formatted outside of the formatting system, by simply calling their .NET .ToString() method.
In short: data types that represent a single value are out-of-band, and in addition to [string] out-of-band data types also comprise [char] and the various (standard) numeric types, such as [int], [long], [double], ...
[string] is the only out-of-band type that itself can result in a multiline representation, because calling .ToString() on a string is effectively a no-op that returns the string itself - whether it is single- or multiline.
Therefore:
Any string - notably also a multiline string - is output as-is, as a whole, and splitting it into individual lines requires an explicit operation; e.g. (note that regex \r?\n matches both Windows-style CRLF and Unix-style LF-only newlines):
"line 1`nline 2`nline 3" -split '\r?\n' # -> 'line 1', 'line 2', 'line 3'
If your input objects are a mix of in-band objects and (invariably out-of-band) multiline strings, you can combine Out-String -Stream with -split; e.g.:
((Get-Date), "line 1`nline 2`nline 3" | Out-String -Stream) -split '\r?\n'
On closer inspection, I suspect that your issue comes from an ambiguity in the Out-String documentation:
-Stream
Indicates that the cmdlet sends a separate string for each line of an
input object. By default, the strings for each object are accumulated
and sent as a single string.
Where the word line should be read as item.
To split your raw string into separate lines, you will need to split your string using the following command:
$Lines = $return.data -split [Environment]::NewLine
Note that this assumes that your data uses the same characters for a new line as the system you are working on. If this is not the case, you might want to split the lines using a regular expression, e.g.:
$Lines = $return.data -split "`r*`n"
So what does the -Stream parameter do?
It sends a separate string for each item of an input object.
Note that "input object" is singular in this definition because it is a common PowerShell best practice to use a singular name even where multiple input objects are possible.
Meaning if you use the above defined $Lines variable (or something like $Lines = Get-Content .\File.json), the input object "$Lines" is a collection of strings:
$Lines.GetType().Name
String[]
If you pipe this to Out-String, it will (by default) join all the items and return a single string:
($Lines | Out-String).GetType().Name
String
In comparison, if you use the -Stream parameter, it will pass each separated item from the $Lines collection directly to the next cmdlet:
($Lines | Out-String -Stream).GetType().Name
Object[]
I have created a documentation issue for this: #7133 "line" should be "item"
Note:
In general, it is a bad practice to peek and poke directly into a serialized string (including Json) using string methods and/or cmdlets (like Select-String). Instead you should use the related parser (e.g. ConvertFrom-Json) for searching and replacing which will result in an easier syntax and usually takes care of known issues and pitfalls.
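A minimal sketch of the parser-based approach follows. The JSON payload and its property names (data, name, note) are invented for illustration and are not taken from the question's API:

```powershell
# Hypothetical JSON payload; the real API response will look different
$json = '{"data":[{"name":"alpha","note":"contains word here"},{"name":"beta","note":"nothing"}]}'

# Parse the JSON into objects instead of searching the raw text
$parsed = $json | ConvertFrom-Json

# Filter on a property value rather than on lines of text
$hits = $parsed.data | Where-Object { $_.note -match 'word' }

$hits.name   # alpha
```

Filtering on properties this way returns whole objects, so the other fields of each hit remain available for further processing.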
Select-String outputs Microsoft.PowerShell.Commands.MatchInfo objects. It seems to me that the output is somehow fancified via the PS engine or something to highlight your match, but ultimately it does print the entire matched string.
You should check out the members of the object Select-String provides, like this:
$file | Out-String -Stream | Select-String -Pattern "word" | Get-Member
TypeName: Microsoft.PowerShell.Commands.MatchInfo
Name MemberType Definition
---- ---------- ----------
...
Matches Property System.Text.RegularExpressions.Match[] Matches {get;set;}
...
What you're interested in is the Matches property. It contains a bunch of information about the match. To extract exactly what you want, look at the Value property of Matches:
($file | Out-String -Stream | Select-String -Pattern "word").Matches.Value
word
Another way:
$file | Out-String -Stream | Select-String -Pattern "word" | ForEach-Object {$_.Matches} | Select-Object -Property Value
Value
-----
word
Or
$file | Out-String -Stream | Select-String -Pattern "word" | ForEach-Object {$_.Matches} | Select-Object -ExpandProperty Value
word

Powershell, how to capture argument(s) of Select-String and include with matched output

Thanks to @mklement0 for the help with getting this far with answer given in Powershell search directory for code files with text matching input a txt file.
The below Powershell works well for finding the occurrences of a long list of database field names in a source code folder.
$inputFile = 'C:\DataColumnsNames.txt'
$outputFile = 'C:\DataColumnsUsages.txt'
Get-ChildItem C:\ProjectFolder -Filter *.cs -Recurse -Force -ea SilentlyContinue |
Select-String -Pattern (Get-Content $inputFile) |
Select-Object Path, LineNumber, line |
Export-csv $outputfile
However, many lines of source code have multiple matches, especially ADO.NET SQL statements with a lot of field names on one line. If the field name argument was included with the matching output the results will be more directly useful with less additional massaging such as lining up everything with the original field name list. For example if there is a source line "BatchId = NewId" it will match field name list item "BatchId". Is there an easy way to include in the output both "BatchId" and "BatchId = NewId"?
Played with the matches object but it doesn't seem to have the information. Also tried Pipeline variable like here but X is null.
$inputFile = 'C:\DataColumnsNames.txt'
$outputFile = 'C:\DataColumnsUsages.txt'
Get-ChildItem C:\ProjectFolder -Filter *.cs -Recurse -Force -ea SilentlyContinue |
Select-String -Pattern (Get-Content $inputFile -PipelineVariable x) |
Select-Object $x, Path, LineNumber, line |
Export-csv $outputFile
Thanks.
The Microsoft.PowerShell.Commands.MatchInfo instances that Select-String outputs have a Pattern property that reflects the specific pattern among the (potential) array of patterns passed to -Pattern that matched on a given line.
The caveat is that if multiple patterns match, .Pattern only reports the pattern among those that matched that is listed first among them in the -Pattern argument.
Here's a simple example, using an array of strings to simulate lines from files as input:
'A fool and',
'his barn',
'are soon parted.',
'foo and bar on the same line' |
Select-String -Pattern ('bar', 'foo') |
Select-Object Line, LineNumber, Pattern
The above yields:
Line LineNumber Pattern
---- ---------- -------
A fool and 1 foo
his barn 2 bar
foo and bar on the same line 4 bar
Note how 'bar' is listed as the Pattern value for the last line, even though 'foo' appeared first in the input line, because 'bar' comes before 'foo' in the pattern array.
To reflect the actual pattern that appears first on the input line in a Pattern property, more work is needed:
Formulate your array of patterns as a single regex using alternation (|), wrapped as a whole in a capture group ((...)) - e.g., '(bar|foo)'
Note: The expression used below, '({0})' -f ('bar', 'foo' -join '|'), constructs this regex dynamically, from an array (the array literal 'bar', 'foo' here, but you can substitute any array variable or even (Get-Content $inputFile)); if you want to treat the input patterns as literals and they happen to contain regex metacharacters (such as .), you'll need to escape them with [regex]::Escape() first.
Use a calculated property to define a custom Pattern property that reports the capture group's value, which is the first among the values encountered on each input line:
'A fool and',
'his barn',
'are soon parted.',
'foo and bar on the same line' |
Select-String -AllMatches -Pattern ('({0})' -f ('bar', 'foo' -join '|')) |
Select-Object Line, LineNumber,
@{ n='Pattern'; e={ $_.Matches[0].Groups[1].Value } }
This yields (abbreviated to show only the last match):
Line LineNumber Pattern
---- ---------- -------
...
foo and bar on the same line 4 foo
Now, 'foo' is properly reported as the matching pattern.
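If the patterns are meant as literals, the escaping mentioned in the note above could look like the following sketch; the names in $names are hypothetical and merely stand in for the contents of (Get-Content $inputFile):

```powershell
# Hypothetical literal field names; 'Batch.Id' contains the regex metacharacter '.'
$names = 'Batch.Id', 'foo'

# Escape each literal, join with alternation, and wrap in one capture group
$pattern = '({0})' -f (($names | ForEach-Object { [regex]::Escape($_) }) -join '|')

$pattern   # (Batch\.Id|foo)

# Only the line with the literal 'Batch.Id' matches; 'BatchXId' does not
$found = 'Batch.Id = NewId', 'BatchXId = NewId' | Select-String -Pattern $pattern
$found.Line   # Batch.Id = NewId
```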
To report all patterns found on each line:
Switch -AllMatches is required to tell Select-String to find all matches on each line, represented in the .Matches collection of the MatchInfo output objects.
The .Matches collection must then be enumerated (via the .ForEach() collection method) to extract the capture-group value from each match.
'A fool and',
'his barn',
'are soon parted.',
'foo and bar on the same line' |
Select-String -AllMatches -Pattern ('({0})' -f ('bar', 'foo' -join '|')) |
Select-Object Line, LineNumber,
@{ n='Pattern'; e={ $_.Matches.ForEach({ $_.Groups[1].Value }) } }
This yields (abbreviated to show only the last match):
Line LineNumber Pattern
---- ---------- -------
...
foo and bar on the same line 4 {foo, bar}
Note how both 'foo' and 'bar' are now reported in Pattern, in the order encountered on the line.
The solid information and examples from @mklement0 were enough to point me in the right direction for researching and understanding more about Powershell and the object pipeline and calculated properties.
I was finally able to achieve my goal of cross-referencing a list of table and field names to the C# code base. The input file is simply table and field names, pipe delimited. (One of the glitches I had was not using pipe in the split; it was a visual error that took a while to finally see, so check for that.) The output is the table name, field name, code file name, line number and actual line. It's not perfect but much better than manual effort for a few hundred fields! And now there are possibilities for further automation in the data mapping and conversion project. Thought about using C# utility programming but that might have taken just as long to figure out and implement, and would have been much more cumbersome than a working Powershell script.
The key for me at this point is "working"! My first deeper dive into the abstruse world of Powershell. The key points of my solution are the use of the calculated property to get the table and field names in the output, realization that expressions can be used in certain places like to build a Pattern and that the pipeline is passing only certain specific objects after each command (maybe that is too restricted of a view but it's better than what I had before).
Hope this helps someone in future. I could not find any examples close enough to get over the hump and so asked my first ever stackoverflow questions.
$inputFile = "C:\input.txt"
$outputFile = "C:\output.csv"
$results = Get-Content $inputfile
foreach ($i in $results) {
Get-ChildItem -Path "C:\ProjectFolder" -Filter *.cs -Recurse -ErrorAction SilentlyContinue -Force |
Select-String -Pattern $i.Split('|')[1] |
Select-Object @{ n='Pattern'; e={ $i.Split('|')[0], $i.Split('|')[1] -join '|'} }, Filename, LineNumber, line |
Export-Csv $outputFile -Append}

What constitutes a "line" for Select-String method in Powershell?

I would expect that Select-String consider \r\n (carriage-return + newline) the end of a line in Powershell.
However, as can be seen below, abc matches the whole input:
PS C:\Tools\hashcat> "abc`r`ndef" | Select-String -Pattern "abc"
abc
def
If I break the string up into two parts, then Select-String behaves as I would expect:
PS C:\Tools\hashcat> "abc", "def" | Select-String -Pattern "abc"
abc
How can I give Select-String a string whose lines are terminated by \r\n, and then make this cmdlet return only those strings that contain a match?
Select-String operates on each (stringified on demand[1]) input object.
A multi-line string such as "abc`r`ndef" is a single input object.
By contrast, "abc", "def" is a string array with two elements, passed as two input objects.
To ensure that the lines of a multi-line string are passed individually, split the string into an array of lines using PowerShell's -split operator: "abc`r`ndef" -split "`r?`n"
(The ? makes the `r optional so as to also correctly deal with `n-only (LF-only, Unix-style) line endings.)
In short:
"abc`r`ndef" -split "`r?`n" | Select-String -Pattern "abc"
The equivalent, using a PowerShell string literal with regular-expression (regex) escape sequences (the RHS of -split is a regex):
"abc`r`ndef" -split '\r?\n' | Select-String -Pattern "abc"
It is somewhat unfortunate that the Select-String documentation talks about operating on lines of text, given that the real units of operations are input objects - which may themselves comprise multiple lines, as we've seen.
Presumably, this comes from the typical use case of providing input objects via the Get-Content cmdlet, which outputs a text file's lines one by one.
Note that Select-String doesn't return the matching strings directly, but wraps them in [Microsoft.PowerShell.Commands.MatchInfo] objects containing helpful metadata about the match.
Even there the line metaphor is present, however, as it is the .Line property that contains the matching string.
[1] Optional reading: How Select-String stringifies input objects
If an input object isn't a string already, it is converted to one, though possibly not in the way you might expect:
Loosely speaking, the .ToString() method is called on each non-string input object[2], which for non-strings is not the same as the representation you get with PowerShell's default output formatting (the latter is what you see when you print an object to the console or use Out-File, for instance); by contrast, it is the same representation you get with string interpolation in a double-quoted string (when you embed a variable reference or command in "...", e.g., "$HOME" or "$(Get-Date)").
Often, .ToString() just yields the name of the object's type, without containing any instance-specific information; e.g., $PSVersionTable stringifies to System.Management.Automation.PSVersionHashTable.
# Matches NOTHING, because Select-String sees
# 'System.Management.Automation.PSVersionHashTable' as its input.
$PSVersionTable | Select-String PSVersion
In case you do want to search the default output format line by line, use the following idiom:
... | Out-String -Stream | Select-String ...
However, note that for non-string input it is more robust and preferable for subsequent processing to filter the input by querying properties with a Where-Object condition.
That said, there is a strong case to be made for Select-String needing to implicitly apply Out-String -Stream stringification, as discussed in this GitHub feature request.
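As a sketch of the two approaches just described, applied to the $PSVersionTable example (the variable names are illustrative only):

```powershell
# Text-based: search the console-style representation line by line
$viaText = $PSVersionTable | Out-String -Stream | Select-String PSVersion

# Property-based: enumerate the table's entries and filter on the key
$viaProperty = $PSVersionTable.GetEnumerator() | Where-Object { $_.Key -eq 'PSVersion' }

$viaText.Line         # the formatted line that mentions PSVersion
$viaProperty.Value    # the version object itself, not just text
```

The property-based result keeps the original value's type, which is what makes it preferable for subsequent processing.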
[2] More accurately, .psobject.ToString() is called, either as-is, or - if the object's ToString method supports an IFormatProvider-typed argument - as .psobject.ToString([cultureinfo]::InvariantCulture) so as to obtain a culture-invariant representation - see this answer for more information.
"abc`r`ndef"
is one string which if you echo (Write-Output) out in console would result in:
PS C:\Users\gpunktschmitz> echo "abc`r`ndef"
abc
def
Select-String will echo out every string that contains "abc". As "abc" is part of the string, this very string will be selected.
"abc", "def"
is a list of two strings. Using the Select-String here will first test "abc" and then "def" if the pattern matches "abc". As only the first one matches only it will be selected.
Use the following to split the string into a list and select only the elements containing "abc"
"abc`r`ndef".Split("`r`n") | Select-String -Pattern "abc"
Basically Mr. Guenther Schmitz explained the correct usage of Select-String, but I want to just add some points to support his answer.
I did some reverse engineering work against this Select-String cmdlet. It's in the Microsoft.PowerShell.Utility.dll. Some relevant code snippets are as follows; note that this is decompiled code shown for reference, not the actual source code.
string text = inputObject.BaseObject as string;
...
matchInfo = (inputObject.BaseObject as MatchInfo);
object operand = ((object)matchInfo) ?? ((object)inputObject);
flag2 = doMatch(operand, out matchInfo2, out text);
We can see that it just treats the inputObject as a whole string; it doesn't do any splitting.
I couldn't find the actual source code of this cmdlet on GitHub; probably this utility part is not open source yet. But I did find the unit tests for Select-String.
$testinputone = "hello","Hello","goodbye"
$testinputtwo = "hello","Hello"
The test strings they are using for the unit tests are actually lists of strings. It means that they were not even thinking about your use case and very possibly it's just designed to accept a collection of strings as input.
However, if we look at Microsoft's official documentation for Select-String, we do see that it talks about lines a lot, even though the cmdlet can't recognize a line within a string. My personal guess is that the concept of a line is only meaningful when the cmdlet accepts a file as input; in that case the file is like a list of strings, where each item in the list represents a single line.
Hope it can make things more clear.

Select-String in Powershell only displaying part of the line from a text file, need it to display whole thing

I am trying to write a simple PS script to check large .txt log files for a short string: "SRVE0242I:"
$lines = Select-String -Path $logDir -Pattern "SRVE0242I:" | Select-Object line | Out-String
On output though, it only displays the following:
Line
[28/06/17 13:48:27:839] 00000020 ServletWrappe I SRVE0242I: [User] [User] [com_xxxxxxx_...
And not the full line. Is there a limit to how many characters this pulls? I can't find any info on any restrictions for the Select-String cmdlet. Is there a better way to do this so that I don't a) pull the heading "Line" in my list of lines (Don't really want to create table formatting for such a simple output) and b) get the whole line when I pull the info?
You are seeing it like this because it's displaying the Line property using the default Format-Table view and shortening it to the width of the console.
Do this instead:
$lines = Select-String -Path $logDir -Pattern "SRVE0242I:" | Select-Object -ExpandProperty line
This returns the value of the Line property as a string to the $lines variable. You don't need to use Out-String.
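To make the difference concrete, here is a small sketch that feeds Select-String two made-up strings instead of real log files (the sample text is hypothetical):

```powershell
# Hypothetical log lines standing in for the contents of $logDir
$found = 'a long SRVE0242I: log line', 'unrelated line' |
  Select-String -Pattern 'SRVE0242I:'

# Without -ExpandProperty: a custom object that has a Line property
$asObject = $found | Select-Object Line
# With -ExpandProperty: the plain string value of that property
$asString = $found | Select-Object -ExpandProperty Line

$asObject.GetType().Name   # PSCustomObject
$asString.GetType().Name   # String
```

Only the custom object is rendered as a table with a "Line" heading; the plain string prints as-is and in full.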
There is! Long story short, the truncation comes from the default formatting applied to the object that Select-Object returns. Here's one way to get the first untruncated line in a Select-String output:
$(Select-String -Path $logDir -Pattern "SRVE0242I:")[0].Line
When you run into something like this, you can break down the individual steps to determine what's happening by piping things to Get-Member. Here's what's happening in the code above:
Select-String <# args #> | Get-Member
Select-String gives us a MatchInfo object, which (as you've correctly determined) has a 'Line' property. When run on its own, Select-String will actually spit out all the information you're looking for, and will not truncate it by default (at least, on v6.0.0-beta). It does give you an array of MatchInfo objects if it finds multiple matches, so you have to index into that array if you just want the first one (like I did above).
Select-String <# args #> | Select-Object Line | Get-Member
Select-Object Line returns a new object with just a Line property; PowerShell's default formatting for that object is what, in most cases, truncates your output for easier viewing. For objects with a bunch of members (like a MatchInfo object), it will try to do one per line by default.
Select-String <# args #> | Select-Object Line | Out-String | Get-Member
Out-String directly translates its input to a string. That is, rather than trying to cast something to a string or pull a string property out of an object that's passed to it, it just changes whatever it receives into a string. In this case, it turns the already-formatted MatchInfo output into a string. Nothing happens to the output on the terminal, but Get-Member will reveal a String rather than a MatchInfo object.
It's not directly relevant here, but if you're interested in modifying the default formatting, it's governed by Format.ps1xml files (types.ps1xml, by contrast, defines extended type data).