This question is more about understanding PowerShell's objects than about solving this practical example. I know there are other ways of separating out a page number from a string.
In my example I want to do this by accessing the Matches value of the object produced by the piped pattern match.
# data
$headerString = 'BARTLETT-BEDGGOOD__PAGE_5 BEECH-BEST__PAGE_6'
# require the number of page only
$regexPageNum = '([0-9]$)'
# split the header string into two separate strings to access page numbers
[string[]]$pages = $null
$pages = $headerString -split ' '
# access page numbers using regex pattern
$pages[0] | Select-String -AllMatches -Pattern $regexPageNum | Select-Object {$_.Matches.Value}
The output is:
$_.Matches.Value
----------------
5
Okay. So far so good. I see the page number of array member $pages[0]. But how do I take this value from the object? The following does not work.
$x = $pages[0] | Select-String -AllMatches -Pattern $regexPageNum | Select-Object {$_.Matches.Value}
Write-Host "Here it is:"$x
Output:
Here it is: @{$_.Matches.Value=5}
Instead of assigning the value 5 to the variable $x, PowerShell assigns what looks to me like a hash table with an object description as its only member.
But if I try to access my variable using bracket access (as described in the hashtables reference), PowerShell indicates that the variable $x is in fact an array.
$x = $pages[0] | Select-String -AllMatches -Pattern $regexPageNum | Select-Object {$_.Matches.Value}
Write-Host "Here it is:"$x
$y = $x[$_.Matches.Value]
Write-Host "What about now:"$y
Output:
Here it is: @{$_.Matches.Value=5}
InvalidOperation:
Line |
33 | $y = $x[$_.Matches.Value]
| ~~~~~~~~~~~~~~~~~~~~~~~~~
| Index operation failed; the array index evaluated to null.
What about now:
Okay. At this stage I know I'm being silly. But the point I'm trying to make is: How can I retrieve the value I want when I'm done with the Powershell object?
You can use $x.{ $_.Matches.Value } to access the value.
$x = $pages[0] | Select-String -AllMatches -Pattern $regexPageNum | Select-Object { $_.Matches.Value }
$x.{ $_.Matches.Value } # This will print 5
That is, you have to wrap the property name in {} because the property name contains "." characters.
Instead of doing it this way, I would suggest creating a calculated property with Select-Object, which makes the code more readable.
$x = $pages[0] | Select-String -AllMatches -Pattern $regexPageNum | Select-Object @{Name = 'PageNumber'; Expression = {$_.Matches.Value}}
$x.PageNumber
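If only the bare value is needed, one option is to skip Select-Object altogether. The following is a minimal sketch that reuses the question's sample data and pattern; the variable name $pageNumber is an assumption made for illustration. ForEach-Object emits whatever its script block returns, so the result is the matched string itself rather than a new object wrapping it.

```powershell
# Sample data and pattern taken from the question
$headerString = 'BARTLETT-BEDGGOOD__PAGE_5 BEECH-BEST__PAGE_6'
$regexPageNum = '([0-9]$)'
$pages = $headerString -split ' '

# ForEach-Object returns the value itself, not an object with a generated property
$pageNumber = $pages[0] |
    Select-String -Pattern $regexPageNum |
    ForEach-Object { $_.Matches.Value }

Write-Host "Here it is: $pageNumber"   # prints: Here it is: 5
```

Select-Object always builds a new object with the requested properties, which is why the question's version printed @{...} instead of 5.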
#Access matches in case of single match
$x = "red blue yellow green" | select-string -Pattern 'blue'
$x.matches.value
#Output
blue
#Access matches in case of multi match
$x = "red blue yellow green blue" | select-string -Pattern 'blue' -AllMatches
$x.matches.value
#Output
blue
blue
When you use a scriptblock as a parameter to Select-Object the return value will contain a property whose name matches the source code of the script block...
PS> @{ "aaa" = "bbb" } | select-object { $_.aaa; <# xxx #> }
$_.aaa; <# xxx #>
-------------------
bbb
In this pathological case you can't access the property with the default "dotted" notation because the name contains reserved characters, but you can access it if you quote the property name:
PS> $x = @{ "aaa" = "bbb" } | select-object { $_.aaa; <# xxx #> }
# note the leading and trailing spaces in the string because
# the original scriptblock source contains spaces between the "{" and "}"
PS> $x.' $_.aaa; <# xxx #> '
bbb
In your case you'd do this:
PS> $x = $pages[0] | Select-String -AllMatches -Pattern $regexPageNum | Select-Object {$_.Matches.Value}
PS> $x.'$_.Matches.Value'
Other options work too:
$x = $pages[0] `
| Select-String -AllMatches -Pattern $regexPageNum `
| Select-Object {$_.Matches.Value}
# get the property whose name is contained in the $name variable
PS> $name = '$_.Matches.Value'
PS> $x.$name
5
# the scriptblock gets converted into a string, and then that string
# is used as a property name
PS> $x.{$_.Matches.Value}
5
# note the whitespace in both scriptblocks has to match *exactly* otherwise the property name won't be found
PS> $x.{ $_.Matches.Value }
ParentContainsErrorRecordException: The property ' $_.Matches.Value ' cannot be found on this object. Verify that the property exists.
but...
There's an easier way - if you pass a hashtable to Select-Object instead of a scriptblock you can specify the name of the property - e.g.
PS> $x = $pages[0] `
| Select-String -AllMatches -Pattern $regexPageNum `
| Select-Object @{ "l"="Count"; "e"={$_.Matches.Value} }
PS> $x
Count
-----
5
PS> $x.Count
5
References:
about_Calculated_Properties - Hashtable key definitions
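When it is unclear what name Select-Object generated for a script block, the name can be read back from the object instead of being guessed. This is a small sketch, assuming the same sample string and pattern as the question; $names and $value are illustrative variable names.

```powershell
# Build the object the same way as in the question
$x = 'BARTLETT-BEDGGOOD__PAGE_5' |
    Select-String -Pattern '([0-9]$)' |
    Select-Object {$_.Matches.Value}

# List the property names that Select-Object generated
$names = @($x.PSObject.Properties.Name)
$names

# Use the discovered name to read the value, whatever whitespace it contains
$value = $x.($names[0])
$value
```

Reading the name from the object avoids having to reproduce the script block's whitespace exactly.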
I wrote a script to extract the URL and Revision Number from svn info command of a svn repository and save the result in a .txt file.
The $revision and $url are both strings, so the replace method should work on them but it doesn't. Is there possibly something wrong in my code causing this?
$TheFilePath = "C:\Users\MyPC\REPOSITORY\NewProject\OUTPUT.txt"
echo "#- Automatic Package Update `n----------"| Out-File -FilePath $TheFilePath
$url = svn info C:\Users\MyPC\REPOSITORY\NewProject\trunk | Select-String -Pattern 'URL' -CaseSensitive -SimpleMatch | select-object -First 1
$url | Add-Content -path $TheFilePath
$revision = svn info C:\Users\MyPC\REPOSITORY\NewProject\trunk | Select-String -Pattern 'Revision' -CaseSensitive -SimpleMatch | $revision.Replace('Revision','srcrev')
$revision | Add-Content -path $TheFilePath
Here is the output of svn info (irrelevant lines have been omitted):
Path: .
Working Copy Root Path: C:\Users\MyPC\REPOSITORY\NewProject\trunk
URL: https://svn.mycompany.de/svn/NewProject/trunk
Relative URL: ^/trunk
Repository Root: https://svn.mycompany.de/svn/NewProject
Revision: 5884
And here is what I get inside the .txt file after running the code:
#- Automatic Package Update
----------
URL: https://svn.mycompany.de/svn/NewProject/trunk
Revision: 5884
----------
Looking at the example output of svn info and the sample you provided, you should be able to get the information you need more easily with ConvertFrom-StringData than with Select-String.
In PowerShell versions below 7 you can use ConvertFrom-StringData on the output of svn info after changing the colon (:) delimiter into an equals sign (=) to get a Hashtable with all properties and values.
Then, using calculated properties, you can extract the items you're interested in and save them as a CSV file, for instance like this:
$svnInfo = svn info 'C:\Users\MyPC\REPOSITORY\NewProject\trunk'
$result = $svnInfo -replace '(?<!:.*):', '=' | ConvertFrom-StringData |
    Select-Object @{Name = 'URL'; Expression = {$_['URL']}},
                  @{Name = 'srcrev'; Expression = {$_['Revision']}}
# output on screen
$result | Format-Table -AutoSize
# output to CSV file
$result | Export-Csv -Path 'C:\Users\MyPC\REPOSITORIES\NewProject\OUTPUT.csv' -NoTypeInformation
Regex details on the -replace to replace only the first occurrence of the colon:
(?<! Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind)
: Match the character “:” literally
. Match any single character
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
: Match the character “:” literally
If you're using PowerShell 7 or higher, things get easier because ConvertFrom-StringData has an extra -Delimiter parameter, so the -replace step is no longer needed:
$svnInfo = svn info 'C:\Users\MyPC\REPOSITORY\NewProject\trunk'
$result = $svnInfo | ConvertFrom-StringData -Delimiter ':' |
    Select-Object @{Name = 'URL'; Expression = {$_['URL']}},
                  @{Name = 'srcrev'; Expression = {$_['Revision']}}
# output on screen
$result | Format-Table -AutoSize
# output to CSV file
$result | Export-Csv -Path 'C:\Users\MyPC\REPOSITORIES\NewProject\OUTPUT.csv' -NoTypeInformation
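ConvertFrom-StringData handles each string it receives from the pipeline separately, so piping svn info line by line produces one hashtable per line. The sketch below joins the lines first so that a single hashtable holds every key. It is an illustration under stated assumptions: the $svnInfo array is a hard-coded stand-in for the real command output (two lines copied from the question), and $text and $data are hypothetical names.

```powershell
# Stand-in for the output of `svn info` (two lines copied from the question)
$svnInfo = @(
    'URL: https://svn.mycompany.de/svn/NewProject/trunk'
    'Revision: 5884'
)

# Replace only the first colon on each line, then join everything into one string
$text = ($svnInfo -replace '(?<!:.*):', '=') -join "`n"

# A single multi-line string yields a single hashtable
$data = $text | ConvertFrom-StringData

$result = [PSCustomObject]@{
    URL    = $data['URL']
    srcrev = $data['Revision']
}
$result | Format-Table -AutoSize
```

Real svn info output also contains Windows paths. ConvertFrom-StringData interprets backslashes as escape characters, so such lines may need their backslashes doubled before conversion.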
$revision doesn't exist until after the svn command is done.
Use the ForEach-Object cmdlet and refer to the current match as $_ to modify the output object inline - the matched line in the output from Select-String is stored in a property called Line:
$revision = svn info C:\Users\MyPC\REPOSITORY\NewProject\trunk |Select-String -Pattern 'Revision' -CaseSensitive -SimpleMatch |ForEach-Object { $_.Line.Replace('Revision', 'srcrev') }
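The same pipeline can be tried without a working copy. This sketch assumes a hard-coded array in place of the svn info output (lines copied from the question); everything else follows the command above.

```powershell
# Stand-in for the output of `svn info`
$svnInfo = @(
    'URL: https://svn.mycompany.de/svn/NewProject/trunk'
    'Revision: 5884'
)

# Select-String emits MatchInfo objects; the matched text lives in the Line property
$revision = $svnInfo |
    Select-String -Pattern 'Revision' -CaseSensitive -SimpleMatch |
    ForEach-Object { $_.Line.Replace('Revision', 'srcrev') }

$revision   # srcrev: 5884
```

Because Replace is called on $_.Line, which is a string, the result is a plain string that Add-Content can write directly.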
I am using PowerShell to search several files for a specific match to a regular expression. Because it is a regular expression, I want to see only what my regex is written to accept, together with the line number at which it matched.
I then want to take the matched value and the line number and create an object to output to an Excel file.
I can get each item with separate Select-String statements, but then they won't be matched up with each other.
Select-String -Path $pathToFile -Pattern '(?<={\n\s*Box\s=\s")14\d{3}(?=",)' |
Select LineNumber, Matches.Value
#Will only print out the lineNumber
Select-String -Path $pathToFile -Pattern '(?<={\n\s*Box\s=\s")14\d{3}(?=",)' |
Foreach {$_.matches} | Select value
#Will only print matched value and can't print linenumber
Can anyone help me get both the line number and the matched value?
Edit: Just to clarify what I am doing
$files = Get-ChildItem $directory -Include *.vb,*.cs -Recurse
$colMatchedFiles = @()
foreach ($file in $files) {
$fileContent = Select-String -Path $file -Pattern '(?<={\n\s*Box\s=\s")14\d{3}(?=",)' |
Select-Object LineNumber, @{Name="Match"; Expression={$_.Matches[0].Groups[1].Value}}
write-host $fileContent #just for checking if there is anything
}
This still does not get anything, it just outputs a bunch of blank lines
Edit: What I am expecting to happen is for this script to search the content of all the files in the directory and find the lines that match the regular expression. Below is what I would expect for output for each file in the loop
LineNumber Match
---------- -----
324 15
582 118
603 139
... ...
File match sample:
{
Box = "5015",
Description = "test box 1"
}....
{
Box = "5118",
Description = "test box 2"
}...
{
Box = "5139",
Description = "test box 3"
}...
Example 1
Select the LineNumber and group value for each match. Example:
$sampleData = @'
prefix 1 A B suffix 1
prefix 2 A B suffix 2
'@ -split "`n"
$sampleData | Select-String '(A B)' |
    Select-Object LineNumber,
        @{Name="Match"; Expression={$_.Matches[0].Groups[1].Value}}
Example 2
Search *.vb and *.cs for files containing the string Box = "<n>", where <n> is some number, and output the filename, line number of the file, and the number on the box = lines. Sample code:
Get-ChildItem $pathToFiles -Include *.cs,*.vb -Recurse |
Select-String 'box = "(\d+)"' |
Select-Object Path,
LineNumber,
@{Name="Match"; Expression={$_.Matches[0].Groups[1].Value -as [Int]}}
This returns output like the following:
Path LineNumber Match
---- ---------- -----
C:\Temp\test1.cs 2 5715
C:\Temp\test1.cs 6 5718
C:\Temp\test1.cs 10 5739
C:\Temp\test1.vb 2 5015
C:\Temp\test1.vb 6 5118
C:\Temp\test1.vb 10 5139
Example 3
Now that we know that we want the line before the match to contain {, we can use the -Context parameter with Select-String. Example:
Get-ChildItem $pathToFiles -Include *.cs,*.vb -Recurse |
Select-String 'box = "(\d+)"' -Context 1 | ForEach-Object {
# Line prior to match must contain '{' character
if ( $_.Context.DisplayPreContext[0] -like "*{*" ) {
[PSCustomObject] @{
Path = $_.Path
LineNumber = $_.LineNumber
Match = $_.Matches[0].Groups[1].Value
}
}
}
I have the following code which lists the first 5 items in the Inbox folder (of Outlook).
How would I extract only the number portion of each item (say, arbitrary 7-digit numbers that are embedded within other text)? Then, using PowerShell commands, I'd like to take those extracted numbers and dump them to a CSV file so that they can easily be incorporated into an existing spreadsheet I use.
Here's what I tried :
$outlook = new-object -com Outlook.Application
$sentMail = $outlook.Session.GetDefaultFolder(6) # == olFolderInbox
$sentMail.Items | select -last 10 TaskSubject # ideally, grabbing first 20
$matches2 = "\d+$"
$res = gc $sentMail.Items | ?{$_ -match $matches2 | %{ $_ -match $matches2 | out-null; $matches[1] }
This does not run correctly. Instead, it leaves me hanging at the awaiting-input prompt, like so:
>>
>>
>>
Do I need to perhaps create a separate variable in between the 1st part and 2nd part?
Not sure what the $matches variable is for but try to replace your last line with something like below.
For Subject Line Items:
$sentMail.Items | % { $_.TaskSubject | Select-String -Pattern '^\d{3}-\d{3}-\d{4}' | % {([string]$_).Substring(0,12)} }
For Message Body Items:
$sentMail.Items | % { ($_.Body).Split("`n") | Select-String -Pattern '^\d{3}-\d{3}-\d{4}' |% {([string]$_).Substring(0,12)} }
Here is a reference to Select-String, which I use pretty often.
https://technet.microsoft.com/library/hh849903.aspx
Here is a reference to the Phone number portion which I have never used but found pretty cool.
http://blogs.technet.com/b/heyscriptingguy/archive/2011/03/24/use-powershell-to-search-a-group-of-files-for-phone-numbers.aspx
Good luck!
Here is an edited version that extracts 7-digit numbers from the subject line. It assumes the number has a space on each side, but that can be adjusted if necessary. You may also want to look at more items, either by changing the -First portion to Select * or by raising the 100.
$outlook = New-Object -com Outlook.Application
$Mail = $outlook.Session.GetDefaultFolder(6) # Folder Inbox
$Mail.Items | select -First 100 TaskSubject |
% { $_.TaskSubject | Select-String -Pattern '\s\d{7}\s'} |
% {((Select-String -InputObject $_ -Pattern '\s\d{7}\s').Line).split(" ") |
% {if(($_.Length -eq 7) -and ($_ -match '\d{7}')) {$_ | Out-File -FilePath "C:\Temp\SomeFile.csv" -Append}}}
Some of this you have already addressed / figured out but I wanted to explain the issues with your current code.
If you expect multiple matches and want to return them, you would need to use Select-String with the -AllMatches parameter. The regex in your example looks for a sequence of digits at the end of the subject. That would only ever return one match, so let's look at the issues with your code.
$sentMail.Items | select -last 10 TaskSubject
You are filtering the last 10 items but you are not storing those for later use so they would merely be displayed on screen. We cover a solution later.
One of the primary reasons for using -match is to get the Boolean value that is returned for code like if blocks and where clauses. You can still use it in the way you intended. Looking at the current code in question:
$res = gc $sentMail.Items | ?{$_ -match $matches2 | %{ $_ -match $matches2 | out-null; $matches[1] }
There are two big issues here. First, you are calling Get-Content (gc) on $sentMail.Items. Get-Content is for pulling file data, which $sentMail.Items is not. Second, you have an oversized Where block. A Where block passes data to the output stream based on a true or false condition. Your malformed statement ?{$_ -match $matches2 | %{ $_ -match $matches2 | out-null; $matches[1] } is missing a closing brace and won't do this... at least not well.
$outlook = new-object -com Outlook.Application
$sentMail = $outlook.Session.GetDefaultFolder(6) # == olFolderInbox
$matches2 = "\d+$"
$sentMail.Items | select -last 10 -ExpandProperty TaskSubject | ?{$_ -match $matches2} | %{$Matches[0]}
This takes the last 10 email subjects and checks whether any of them match the regex string $matches2. For each one that does, the matched text is returned to standard output.
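To get from the matched numbers to CSV, the values can be wrapped in objects first. The following is a sketch only: the $subjects array is hypothetical data standing in for the Outlook subjects, and the column name Number is an assumption. The -match test and the read of $Matches are kept in the same script block so the value always belongs to the current item.

```powershell
# Hypothetical subject lines standing in for the Outlook items
$subjects = @(
    'Ticket closed 1234567'
    'No number here'
    'Order shipped 7654321'
)
$pattern = '\d+$'

# Emit the matched digits for every subject that ends in a number
$numbers = $subjects | ForEach-Object {
    if ($_ -match $pattern) { $Matches[0] }
}

# Wrap each value in an object so the CSV gets a named column
$csv = $numbers |
    ForEach-Object { [PSCustomObject]@{ Number = $_ } } |
    ConvertTo-Csv -NoTypeInformation
$csv
```

Swapping ConvertTo-Csv for Export-Csv with a -Path argument would write the same rows to a file.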
I'm working on a script that combines parts of two text files. These files are not too large (about 2000 lines each).
I'm seeing strange output from select-string that I don't think should be there.
Here's samples of my two files:
CC.csv - 2026 lines
LS126L47L6/1L2#519,07448,1,B
LS126L47L6/1R1-1#503,07449,1,B
LS126L47L6/1L3#536,07450,1,B
LS126L47L6/2R1#515,07451,1,B
LS126L47L6/10#525,07452,1,B
LS126L47L6/1L4#538,07453,1,B
GI.txt - 1995 lines
07445,B,SH,1
07446,B,SH,1
07448,B,SH,1
07449,B,SH,1
07450,B,SH,1
07451,B,SH,1
07452,B,SH,1
07453,B,SH,1
07454,B,SH,1
And here's a sample of the output file:
output in myfile.csv
LS126L47L6/3R1#516,07446,1,B
LS126L47L6/1L2#519,07448,1,B
LS126L47L6/1R1-1#503,07449,1,B
System.Object[],B
LS126L47L6/2R1#515,07451,1,B
This is the script I'm using:
sc ./myfile.csv "col1,col2,col3,col4"
$mn = gc cc.csv | select -skip 1 | % {$_.tostring().split(",")[1]}
$mn | % {
$a = (gc cc.csv | sls $_ ).tostring() -replace ",[a-z]$", ""
if (gc GI.txt | sls $_ | select -first 1)
{$b = (gc GI.txt | sls $_ | select -first 1).tostring().split(",")[1]}
else {$b = "NULL"
write-host "$_ is not present in GI file"}
$c = $a + ',' + $b
ac ./myfile.csv -value $c
}
The $a variable is where I am sometimes seeing the returned string as System.Object[]
Any ideas why? Also, this script takes quite some time to finish. Any tips for a newb on how to speed it up?
Edit: I should add that I've taken one line from the cc.csv file, saved in a new text file, and run through the script in console up through assigning $a. I can't get it to return "system.object[]".
Edit 2: After following the advice below and trying a couple of things, I've noticed that if I run
$mn | %{(gc cc.csv | sls $_).tostring()}
I get System.Object[].
But if I run
$mn | %{(gc cc.csv | sls $_)} | %{$_.tostring()}
It comes out fine. Go figure.
The problem is caused by a change in multiplicity of matches. If there are multiple matching elements an Object[] array (of MatchInfo elements) is returned; a single matching element results in a single MatchInfo object (not in an array); and when there are no matches, null is returned.
Consider these results, when executed against the "cc.csv" test-data supplied:
# matches many
(gc cc.csv | Select-String "LS" ).GetType().Name # => Object[]
# matches one
(gc cc.csv | Select-String "538").GetType().Name # => MatchInfo
# matches none
(gc cc.csv | Select-String "FAIL") # => null
The result of calling ToString on Object[] is "System.Object[]" while the result is a more useful concatenation of the matched values when invoked directly upon a MatchInfo object.
The immediate problem can be fixed by appending | Select -First 1 to the pipeline, which results in a single MatchInfo being returned in the first two cases. Select-String will still search the entire input; extra results are simply discarded.
However, it seems like the look-back into "cc.csv" (with the Select-String) could be eliminated entirely as that is where $_ originally comes from. Here is a minor [untested] adaptation, of what it may look like:
gc cc.csv | Select -Skip 1 | %{
$num = $_.Split(",")[1]
$a = $_ -Replace ",[a-z]$", ""
# This is still O(m*n) and could be improved with a hash/set probe.
$gc_match = Select-String $num -Path GI.txt -SimpleMatch | Select -First 1
if ($gc_match) {
# Use of "Select -First 1" avoids the initial problem; but
# it /may/ be more appropriate for an error to indicate data problems.
# (Likewise, an error in the original may need further investigation.)
$b = $gc_match.ToString().Split(",")[1]
} else {
$b = "NULL"
Write-Host "$_ is not present in GI file"
}
$c = $a + ',' + $b
ac ./myfile.csv -Value $c
}
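The hash probe mentioned in the comment above could look roughly like this. It is a sketch under stated assumptions: the two arrays are in-memory stand-ins for the files (the second CC row is made up to show the NULL case), the variable names are illustrative, and the lookup is an exact match on the first GI field, whereas the original Select-String search matches the number anywhere in the line.

```powershell
# In-memory stand-ins for cc.csv (without header) and GI.txt
$ccLines = @(
    'LS126L47L6/1L2#519,07448,1,B'
    'LS126L47L6/9X9#999,09999,1,B'   # hypothetical row with no GI entry
)
$giLines = @('07448,B,SH,1', '07449,B,SH,1')

# Build the lookup once: number -> second GI field (first occurrence wins)
$giByNumber = @{}
foreach ($line in $giLines) {
    $fields = $line.Split(',')
    if (-not $giByNumber.ContainsKey($fields[0])) {
        $giByNumber[$fields[0]] = $fields[1]
    }
}

# One hashtable probe per CC line instead of re-reading the GI file each time
$rows = foreach ($line in $ccLines) {
    $num = $line.Split(',')[1]
    $a = $line -replace ',[a-z]$', ''
    $b = if ($giByNumber.ContainsKey($num)) { $giByNumber[$num] } else { 'NULL' }
    "$a,$b"
}
$rows
```

With the real files, the arrays would come from Get-Content, and $rows could be written with a single Add-Content call rather than one call per line.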
I'm trying to extract certain values from a sorted hash table using Select-String.
This works, but why are there extra blank lines in the output?
cls
$fruits = @{"1" = "apple"; "2" = "lemon"; "3" = "orange"; "4" = "apricot"}
foreach ($fruit in $fruits.GetEnumerator() | Sort-Object Value) {
$fruit.Value | Select-String -pattern "ap" -SimpleMatch
}
I think you're getting the empty values because for each item you're doing a Select-String, which is returning a value sometimes but nothing at other times - those nothings are the blank lines.
Try something like this that uses Where-Object:
$fruits.Values | Sort-Object | Where-Object { $_ -match "ap" }
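If Select-String is kept, another option is to take the Line property of each result so that plain strings are emitted instead of MatchInfo objects. This is a minimal sketch using the question's data; the variable name $matching is an assumption.

```powershell
$fruits = @{ '1' = 'apple'; '2' = 'lemon'; '3' = 'orange'; '4' = 'apricot' }

# Select-String emits MatchInfo objects; .Line is the matched input string
$matching = $fruits.Values |
    Sort-Object |
    Select-String -Pattern 'ap' -SimpleMatch |
    ForEach-Object { $_.Line }

$matching   # apple, then apricot
```

Plain strings are written to the console one per line, without the formatting that the host applies to MatchInfo objects.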
I'm using PowerShell 3.0.
I ran two lines of code:
PS C:\> $fruits = @{"1" = "apple"; "2" = "lemon"; "3" = "orange"; "4" = "apricot"}
PS C:\> $fruits.Values -like 'ap*'
apricot
apple
PS C:\>
Further Investigating
It seems that when there is no match, Select-String returns NULL.
foreach($key in $fruits.Keys){(Select-String -Pattern 'ap' -SimpleMatch -InputObject $fruits[$key]) -eq $NULL}
Since the result is not redirected to a variable, it just prints NULL to the host. That's my assumption.
Strange...good question.