Select-String sometimes results in "System.Object[]" - powershell

I'm working on a script that combines parts of two text files. These files are not too large (about 2000 lines each).
I'm seeing strange output from select-string that I don't think should be there.
Here's samples of my two files:
CC.csv - 2026 lines
LS126L47L6/1L2#519,07448,1,B
LS126L47L6/1R1-1#503,07449,1,B
LS126L47L6/1L3#536,07450,1,B
LS126L47L6/2R1#515,07451,1,B
LS126L47L6/10#525,07452,1,B
LS126L47L6/1L4#538,07453,1,B
GI.txt - 1995 lines
07445,B,SH,1
07446,B,SH,1
07448,B,SH,1
07449,B,SH,1
07450,B,SH,1
07451,B,SH,1
07452,B,SH,1
07453,B,SH,1
07454,B,SH,1
And here's a sample of the output file:
output in myfile.csv
LS126L47L6/3R1#516,07446,1,B
LS126L47L6/1L2#519,07448,1,B
LS126L47L6/1R1-1#503,07449,1,B
System.Object[],B
LS126L47L6/2R1#515,07451,1,B
This is the script I'm using:
sc ./myfile.csv "col1,col2,col3,col4"
$mn = gc cc.csv | select -skip 1 | % {$_.tostring().split(",")[1]}
$mn | % {
$a = (gc cc.csv | sls $_ ).tostring() -replace ",[a-z]$", ""
if (gc GI.txt | sls $_ | select -first 1)
{$b = (gc GI.txt | sls $_ | select -first 1).tostring().split(",")[1]}
else {$b = "NULL"
write-host "$_ is not present in GI file"}
$c = $a + ',' + $b
ac ./myfile.csv -value $c
}
The $a variable is where I am sometimes seeing the returned string as System.Object[]
Any ideas why? Also, this script takes quite some time to finish. Any tips for a newb on how to speed it up?
Edit: I should add that I've taken one line from the cc.csv file, saved in a new text file, and run through the script in console up through assigning $a. I can't get it to return "system.object[]".
Edit 2: After following the advice below and trying a couple of things, I've noticed that if I run
$mn | %{(gc cc.csv | sls $_).tostring()}
I get System.Object[].
But if I run
$mn | %{(gc cc.csv | sls $_)} | %{$_.tostring()}
It comes out fine. Go figure.

The problem is caused by a change in multiplicity of matches. If there are multiple matching elements an Object[] array (of MatchInfo elements) is returned; a single matching element results in a single MatchInfo object (not in an array); and when there are no matches, null is returned.
Consider these results, when executed against the "cc.csv" test-data supplied:
# matches many
(gc cc.csv | Select-String "LS" ).GetType().Name # => Object[]
# matches one
(gc cc.csv | Select-String "538").GetType().Name # => MatchInfo
# matches none
(gc cc.csv | Select-String "FAIL") # => null
The result of calling ToString on Object[] is "System.Object[]" while the result is a more useful concatenation of the matched values when invoked directly upon a MatchInfo object.
The immediate problem can be fixed by piping the Select-String result to Select -First 1, which will result in a MatchInfo being returned for the first two cases. Select-String will still search the entire input - extra results are simply discarded.
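A small sketch of the three cases, using an in-memory sample instead of cc.csv (the sample lines are copied from the question, the variable names are made up for illustration). Wrapping the call in @() is one way to always get an array, so .Count and indexing behave the same however many lines match:

```powershell
# Sample lines copied from the question's cc.csv excerpt.
$lines = @(
    'LS126L47L6/1L2#519,07448,1,B'
    'LS126L47L6/1L3#536,07450,1,B'
    'LS126L47L6/1L4#538,07453,1,B'
)

# @() forces an array whether there are zero, one, or many matches.
$many = @($lines | Select-String 'LS')
$one  = @($lines | Select-String '538')
$none = @($lines | Select-String 'FAIL')

"{0} {1} {2}" -f $many.Count, $one.Count, $none.Count

# Picking one element first means ToString() is called on a MatchInfo,
# not on the array, so the matched line comes back.
$first = $many[0].ToString()
$first
```

With this in place, a key that unexpectedly matches more than one line shows up as a Count greater than 1 instead of as "System.Object[]" in the output file.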
However, it seems like the look-back into "cc.csv" (with the Select-String) could be eliminated entirely as that is where $_ originally comes from. Here is a minor [untested] adaptation, of what it may look like:
gc cc.csv | Select -Skip 1 | %{
$num = $_.Split(",")[1]
$a = $_ -Replace ",[a-z]$", ""
# This is still O(m*n) and could be improved with a hash/set probe.
$gc_match = Select-String $num -Path GI.txt -SimpleMatch | Select -First 1
if ($gc_match) {
# Use of "Select -First 1" avoids the initial problem; but
# it /may/ be more appropriate for an error to indicate data problems.
# (Likewise, an error in the original may need further investigation.)
$b = $gc_match.ToString().Split(",")[1]
} else {
$b = "NULL"
Write-Host "$num is not present in GI file"
}
$c = $a + ',' + $b
ac ./myfile.csv -Value $c
}
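The hash probe mentioned in the comment above could look roughly like this. It is a sketch under assumptions: the two arrays stand in for the file contents, the third cc line is invented to exercise the "not present" branch, and the GI key is taken to be the first comma-separated field.

```powershell
# In-memory stand-ins for the two files. With real files these would be
# Get-Content cc.csv | Select-Object -Skip 1 and Get-Content GI.txt.
$ccLines = @(
    'LS126L47L6/1L2#519,07448,1,B'
    'LS126L47L6/1L3#536,07450,1,B'
    'LS126L47L6/9X9#999,09999,1,B'   # hypothetical line with no GI entry
)
$giLines = @(
    '07448,B,SH,1'
    '07450,B,SH,1'
)

# Build the lookup once: key = first field, value = second field.
# The first occurrence of a key wins, like "Select -First 1".
$gi = @{}
foreach ($line in $giLines) {
    $fields = $line.Split(',')
    if (-not $gi.ContainsKey($fields[0])) { $gi[$fields[0]] = $fields[1] }
}

# One pass over cc with a constant-time lookup per line.
$result = foreach ($line in $ccLines) {
    $num = $line.Split(',')[1]
    $a   = $line -replace ',[a-z]$', ''
    $b   = if ($gi.ContainsKey($num)) { $gi[$num] } else { 'NULL' }
    "$a,$b"
}
$result
```

Collecting the lines in $result and writing them with a single Set-Content call also avoids reopening the output file for every row, which is likely part of why the original script is slow.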

Related

Accessing matches variables in Powershell

This question is more about my understanding Powershell's objects rather than solving this practical example. I know there are other ways of separating out a page number from a string.
In my example I want to do this by accessing the object-match-value of the piped pattern match.
# data
$headerString = 'BARTLETT-BEDGGOOD__PAGE_5 BEECH-BEST__PAGE_6'
# require the number of page only
$regexPageNum = '([0-9]$)'
# split the header string into two separate strings to access page numbers
[string[]]$pages = $null
$pages = $headerString -split ' '
# access page numbers using regex pattern
$pages[0] | Select-String -AllMatches -Pattern $regexPageNum | Select-Object {$_.Matches.Value}
The output is:
$_.Matches.Value
----------------
5
Okay. So far so good. I see the page number of array member pages[0]. But how do I take this value from the object? The following does not work.
$x = $pages[0] | Select-String -AllMatches -Pattern $regexPageNum | Select-Object {$_.Matches.Value}
Write-Host "Here it is:"$x
Output:
Here it is: @{$_.Matches.Value=5}
Instead of assigning the value 5 to the variable $x, PowerShell assigns what looks to me like a hash table with an object description as its only member?
But if I try to access my variable using "Brackets for Access" (reference: hashtables), PowerShell indicates that variable $x is in fact an array.
$x = $pages[0] | Select-String -AllMatches -Pattern $regexPageNum | Select-Object {$_.Matches.Value}
Write-Host "Here it is:"$x
$y = $x[$_.Matches.Value]
Write-Host "What about now:"$y
Output:
Here it is: @{$_.Matches.Value=5}
InvalidOperation:
Line |
33 | $y = $x[$_.Matches.Value]
| ~~~~~~~~~~~~~~~~~~~~~~~~~
| Index operation failed; the array index evaluated to null.
What about now:
Okay. At this stage I know I'm being silly. But the point I'm trying to make is: How can I retrieve the value I want when I'm done with the Powershell object?
You can use $x.{ $_.Matches.Value } to access the value.
$x = $pages[0] | Select-String -AllMatches -Pattern $regexPageNum | Select-Object { $_.Matches.Value }
$x.{ $_.Matches.Value } # This will print 5
ie, You would have to wrap the property name inside {} since the property name contains "."
Instead of this, I would suggest creating a calculated property with Select-Object, which makes the code more readable.
$x = $pages[0] | Select-String -AllMatches -Pattern $regexPageNum | Select-Object @{Name = 'PageNumber'; Expression = {$_.Matches.Value}}
$x.PageNumber
#Access matches in case of single match
$x = "red blue yellow green" | select-string -Pattern 'blue'
$x.matches.value
#Output
blue
#Access matches in case of multi match
$x = "red blue yellow green blue" | select-string -Pattern 'blue' -AllMatches
$x.matches.value
#Output
blue
blue
When you use a scriptblock as a parameter to Select-Object the return value will contain a property whose name matches the source code of the script block...
PS> @{ "aaa" = "bbb" } | select-object { $_.aaa; <# xxx #> }
$_.aaa; <# xxx #>
-------------------
bbb
In this pathological case, if you want to access the property you can't use the name in the default "dotted" notation because it contains reserved characters, but you can access it if you quote the property name:
PS> $x = @{ "aaa" = "bbb" } | select-object { $_.aaa; <# xxx #> }
# note the leading and trailing spaces in the string because the
# the original scriptblock source contains spaces between the "{" and "}"
PS> $x.' $_.aaa; <# xxx #> '
bbb
In your case you'd do this:
PS> $x = $pages[0] | Select-String -AllMatches -Pattern $regexPageNum | Select-Object {$_.Matches.Value}
PS> $x.'$_.Matches.Value'
Other options work too:
$x = $pages[0] `
| Select-String -AllMatches -Pattern $regexPageNum `
| Select-Object {$_.Matches.Value}
# get the property whose name is contained in the $name variable
PS> $name = '$_.Matches.Value'
PS> $x.$name
5
# the scriptblock gets converted into a string, and then that string
# is used as a property name
PS> $x.{$_.Matches.Value}
5
# note the whitespace in both scriptblocks has to match *exactly* otherwise the property name won't be found
PS> $x.{ $_.Matches.Value }
ParentContainsErrorRecordException: The property ' $_.Matches.Value ' cannot be found on this object. Verify that the property exists.
but...
There's an easier way - if you pass a hashtable to Select-Object instead of a scriptblock you can specify the name of the property - e.g.
PS> $x = $pages[0] `
| Select-String -AllMatches -Pattern $regexPageNum `
| Select-Object @{ "l"="Count"; "e"={$_.Matches.Value} }
PS> $x
Count
-----
5
PS> $x.Count
5
References:
about_Calculated_Properties - Hashtable key definitions

Using Powershell to compare two files and then output only the different string names

So I am a complete beginner at Powershell but need to write a script that will take a file, compare it against another file, and tell me what strings are different in the first compared to the second. I have had a go at this but I am struggling with the outputs as my script will currently only tell me on which line things are different, but it also seems to count lines that are empty too.
To give some context for what I am trying to achieve, I would like to have a static file of known good Windows processes ($Authorized) and I want my script to pull a list of current running processes, filter by the process name column so to just pull the process name strings, then match anything over 1 character, sort the file by unique values and then compare it against $Authorized, plus finally either outputting the different process strings found in $Processes (to the ISE Output Pane) or just to output the different process names to a file.
I have spent today attempting the following in Powershell ISE and also Googling around to try and find solutions. I heard 'fc' is a better choice instead of Compare-Object but I could not get that to work. I have thus far managed to get it to work but the final part where it compares the two files it seems to compare line by line, for which would always give me false positives as the line position of the process names in the file supplied would change, furthermore I only want to see the changed process names, and not the line numbers which it is reporting ("The process at line 34 is an outlier" is what currently gets outputted).
I hope this makes sense, and any help on this would be very much appreciated.
Get-Process | Format-Table -Wrap -Autosize -Property ProcessName | Out-File c:\users\me\Desktop\Processes.txt
$Processes = 'c:\Users\me\Desktop\Processes.txt'
$Output_file = 'c:\Users\me\Desktop\Extracted.txt'
$Sorted = 'c:\Users\me\Desktop\Sorted.txt'
$Authorized = 'c:\Users\me\Desktop\Authorized.txt'
$regex = '.{1,}'
select-string -Path $Processes -Pattern $regex |% { $_.Matches } |% { $_.Value } > $Output_file
Get-Content $Output_file | Sort-Object -Unique > $Sorted
$dif = Compare-Object -ReferenceObject $(Get-Content $Sorted) -DifferenceObject $(get-content $Authorized) -IncludeEqual
$lineNumber = 1
foreach ($difference in $dif)
{
if ($difference.SideIndicator -ne "==")
{
Write-Output "The Process at Line $linenumber is an Outlier"
}
$lineNumber ++
}
Remove-Item c:\Users\me\Desktop\Processes.txt
Remove-Item c:\Users\me\Desktop\Extracted.txt
Write-Output "The Results are Stored in $Sorted"
From the length and complexity of your script, I feel like I'm missing something, but your description seems clear
Running process names:
$ProcessNames = @(Get-Process | Select-Object -ExpandProperty Name)
.. which aren't blank:
$ProcessNames = $ProcessNames | Where-Object {$_ -ne ''}
List of authorised names from a file:
$AuthorizedNames = Get-Content 'c:\Users\me\Desktop\Authorized.txt'
Compare:
$UnAuthorizedNames = $ProcessNames | Where-Object { $_ -notin $AuthorizedNames }
optional output to file:
$UnAuthorizedNames | Set-Content out.txt
or in the shell:
@(gps).Name -ne '' |? { $_ -notin (gc authorized.txt) } | sc out.txt
1 2   3     4       5      6       7                      8
1. @() forces something to be an array, even if it only returns one thing
2. gps is a default alias of Get-Process
3. using .Property on an array takes that property value from every item in the array
4. using an operator on an array filters the array by whether the items pass the test
5. ? is an alias of Where-Object
6. -notin tests if one item is not in a collection
7. gc is an alias of Get-Content
8. sc is an alias of Set-Content
You should use Set-Content instead of Out-File and > because it handles character encoding nicely, and they don't. And because Get-Content/Set-Content sounds like a memorable matched pair, and Get-Content/Out-File doesn't.
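Since the question started from Compare-Object, here is a sketch of how that cmdlet can report names rather than line positions. The two arrays are invented sample data standing in for the Get-Process names and the contents of Authorized.txt. Compare-Object matches items by value regardless of position, and SideIndicator says which side an item came from:

```powershell
# Hypothetical sample data standing in for Get-Process names and Authorized.txt.
$processNames    = @('svchost', 'explorer', 'notepad', 'svchost', 'evil')
$authorizedNames = @('explorer', 'svchost', 'lsass')

# '=>' marks items found only in the difference set (running, not authorized).
$unauthorized = Compare-Object -ReferenceObject $authorizedNames -DifferenceObject ($processNames | Sort-Object -Unique) |
    Where-Object { $_.SideIndicator -eq '=>' } |
    ForEach-Object { $_.InputObject }

$unauthorized
```

The -notin filter above is shorter; Compare-Object is mainly useful when the reverse direction ('<=', authorized but not running) is also of interest.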

Powershell - Empty entries in CSV

I don't have much experience with CSV, so apologies if I'm really blind here.
I have a basic CSV and script setup to test this with. The CSV has two columns, Letter and Number. Letter goes from A-F and Number goes from 1-10. This means that Number has more rows than Letter, so when running the following script, the output can sometimes provide an empty Letter.
$L = ipcsv ln.csv | Get-Random | Select-Object -ExpandProperty Letter
$N = ipcsv ln.csv | Get-Random | Select-Object -ExpandProperty Number
Write-Output $L
Write-Output $N
Some outputs come out as
B
9
while others can come out as
5
I don't know whether the issue is my script not ignoring empty lines or my CSV being written incorrectly, which is posted below.
Letter,Number
A,1
B,2
C,3
D,4
E,5
F,6
,7
,8
,9
,10
What's my issue here and how do I go about fixing it?
You're asking for a random object from your CSV, not a random letter. Since some of the lines are missing a letter, you might end up picking one that has an empty Letter value.
If you want to pick any line with a letter, you need to filter the rows first to only pick from the ones with a value. Also, you should avoid reading the same file twice; use a variable.
#$csv = Import-CSV -Path ln.csv
$csv = @"
Letter,Number
A,1
B,2
C,3
D,4
E,5
F,6
,7
,8
,9
,10
"@ | ConvertFrom-Csv
$L = $csv | Where-Object { $_.Letter } | Get-Random | Select-Object -ExpandProperty Letter
$N = $csv | Where-Object { $_.Number } | Get-Random | Select-Object -ExpandProperty Number
Write-Output $L
Write-Output $N
CSV might not be the best solution for this scenario. For example, you could store these as arrays in the script, like:
$chars = [char[]](65..70) #A-F uppercase letters
$numbers = 1..10
$L = $chars | Get-Random
$N = $numbers | Get-Random
Write-Output $L
Write-Output $N
Import-Csv turns each line into an object, with a property for each column.
Even though one or more property values may be empty, the object still exists, and Get-Random has no reason to determine that an object with a certain property (such as Letter) having the value "" (i.e. an empty string) should not be picked.
You can fix this by expanding the property values first, then filter for empty values and then finally pick the random value from those that weren't empty:
$L = ipcsv ln.csv |Select-Object -ExpandProperty Letter |Where-Object {$_} |Get-Random
$N = ipcsv ln.csv |Select-Object -ExpandProperty Number |Where-Object {$_} |Get-Random

Using powershell, how do I extract a 7-digit number from a subject-line (of an email ), regular expressions?

I have the following code which lists the first 5 items in the Inbox folder (of Outlook).
How would I extract only the number portion of it (say, arbitrary 7-digit numbers which are embedded within other text)? Then, using PowerShell commands, I'd really like to take those extracted numbers and dump them to a CSV file (so they can be easily incorporated into an existing spreadsheet I use).
Here's what I tried :
$outlook = new-object -com Outlook.Application
$sentMail = $outlook.Session.GetDefaultFolder(6) # == olFolderInbox
$sentMail.Items | select -last 10 TaskSubject # ideally, grabbing first 20
$matches2 = "\d+$"
$res = gc $sentMail.Items | ?{$_ -match $matches2 | %{ $_ -match $matches2 | out-null; $matches[1] }
but this does not run correctly, but rather .. keeps me hanging with awaiting-input symbol: like so :
>>
>>
>>
Do I need to perhaps create a separate variable in between the 1st part and 2nd part?
Not sure what the $matches variable is for but try to replace your last line with something like below.
For Subject Line Items:
$sentMail.Items | % { $_.TaskSubject | Select-String -Pattern '^\d{3}-\d{3}-\d{4}' | % {([string]$_).Substring(0,12)} }
For Message Body Items:
$sentMail.Items | % { ($_.Body).Split("`n") | Select-String -Pattern '^\d{3}-\d{3}-\d{4}' |% {([string]$_).Substring(0,12)} }
Here is a reference to Select-String, which I use pretty often.
https://technet.microsoft.com/library/hh849903.aspx
Here is a reference to the Phone number portion which I have never used but found pretty cool.
http://blogs.technet.com/b/heyscriptingguy/archive/2011/03/24/use-powershell-to-search-a-group-of-files-for-phone-numbers.aspx
Good luck!
Here is an edited version for 7-digit extraction via the subject line. This assumes the number has a space on each side but can be modified a bit if necessary. You may also want to adjust the depth by changing the -First portion to Select * or by raising 100 to a larger number.
$outlook = New-Object -com Outlook.Application
$Mail = $outlook.Session.GetDefaultFolder(6) # Folder Inbox
$Mail.Items | select -First 100 TaskSubject |
% { $_.TaskSubject | Select-String -Pattern '\s\d{7}\s'} |
% {((Select-String -InputObject $_ -Pattern '\s\d{7}\s').Line).split(" ") |
% {if(($_.Length -eq 7) -and ($_ -match '\d{7}')) {$_ | Out-File -FilePath "C:\Temp\SomeFile.csv" -Append}}}
Some of this you have already addressed / figured out but I wanted to explain the issues with your current code.
If you expect multiple matches and want to return those, then you would need to use Select-String with the -AllMatches parameter. Your regex, in your example, is currently looking for a sequence of digits at the end of the subject. That would only return one match, so let's look at the issues with your code.
$sentMail.Items | select -last 10 TaskSubject
You are filtering the last 10 items but you are not storing those for later use so they would merely be displayed on screen. We cover a solution later.
One of the primary reasons for using -match is to get the Boolean value that is returned for code like if blocks and where clauses. You can still use it in the way you intended. Looking at the current code in question:
$res = gc $sentMail.Items | ?{$_ -match $matches2 | %{ $_ -match $matches2 | out-null; $matches[1] }
There are two big issues with this. First, you are calling Get-Content (gc) on each item; Get-Content is for pulling file data, which $sentMail.Items is not. Second, you have a large where block. Where blocks pass data to the output stream based on a true or false condition. Your malformed statement ?{$_ -match $matches2 | %{ $_ -match $matches2 | out-null; $matches[1] } won't do this... at least not well. The where block is also never closed (the opening ?{ has no matching }), which is why the console keeps prompting with >>.
$outlook = new-object -com Outlook.Application
$sentMail = $outlook.Session.GetDefaultFolder(6) # == olFolderInbox
$matches2 = "\d+$"
$sentMail.Items | select -last 10 -ExpandProperty TaskSubject | ?{$_ -match $matches2} | %{$Matches[0]}
Take the last 10 email subjects and check if any of them match the regex string $matches2. If they do, then return the matched string to standard output.
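For the 7-digit case with more than one number per subject, the -AllMatches route mentioned above might be sketched like this. The subject lines are invented for illustration (in the real script they would come from $sentMail.Items | select -ExpandProperty TaskSubject), and the lookaround pattern is one possible way to reject longer digit runs, not the only one:

```powershell
# Hypothetical subject lines used only to demonstrate the pattern.
$subjects = @(
    'RE: Order 1234567 shipped'
    'No number here'
    'Tickets 7654321 and 2345678 closed'
    'Phone 12345678 is too long'
)

# (?<!\d) and (?!\d) reject runs of digits longer than seven.
$pattern = '(?<!\d)\d{7}(?!\d)'

# -AllMatches keeps every hit on a line; Matches.Value unrolls them.
$numbers = $subjects |
    Select-String -Pattern $pattern -AllMatches |
    ForEach-Object { $_.Matches.Value }

# One object per number so the CSV gets a "Number" column.
$rows = $numbers | ForEach-Object { [pscustomobject]@{ Number = $_ } }
$rows | ConvertTo-Csv -NoTypeInformation
```

Replacing the last line with $rows | Export-Csv -Path <file> -NoTypeInformation writes the same rows to a file instead of the console.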

Adding numbers into 2 totals and put into each its own variable

Hope you can help me with this little puzzle.
I have ONE txt file looking like this:
firstnumbers
348.92
237
230
329.31
secondnumbers
18.21
48.92
37
30
29.31
So a txt file with one Column that has 2 strings and some numbers on each line.
I want to take the total of each column and put it into each variable like say $a and $b
Yes it is 1 column, just to make sure no misunderstanding
It's pretty easy, if I use 2 files with each column of numbers without the headers(strings)
$a = (Get-Content 'firstnumbers.txt' | Measure-Object -Sum).Sum
$b = (Get-Content 'secondnumbers.txt' | Measure-Object -Sum).Sum
But it would be a little more cool to have them in one txt file, like the aforementioned with a header over each row of numbers.
I've tried removing the headers with e.g. $a.Replace("first", $null).Replace("sec", $null) and then doing a $b.Split(" ")[1,2,3,4,5] ending with | measure -sum
That gives me the correct number of firstnumbers - but it won't work if I don't keep the specific set of numbers each time. They'll change and there's gonna be more or less of them.
It should be pretty easy, I'm guessing. I just can't seem to wrap my head around it at the moment.
Any advice would be awesome!
cheers
Something like this should work:
$file = "C:\path\to\your.txt"
[IO.File]::ReadAllText($file) | % {
$_ -replace "`n+([0-9])", ' $1' -split "`n"
} | ? { $_ -ne "" } | % {
$a = $_ -split " ", 2
$v = $a[1] -split " " | Measure-Object -Sum
"{0}`t{1}" -f ($a[0], $v.Sum)
}
Output:
firstnumbers 1145,23
secondnumbers 163,44
Here's another approach, rather than parsing the text as one big blob, you could test each line to see if it contains a # or text, if it's text, then it triggers the creation of a new entry in a hashtable where the sums are stored:
# C:\Temp> get-content .\numbers.txt | foreach{
$val=0;
if([Decimal]::TryParse($_,[ref]$val)){
$sums[$key]+=$val
}else{
$sums += @{"$_"=0}; #add new entry to hashtable
$key=$_;}
} -end {$sums}
Name Value
---- -----
secondnumbers 163.44
firstnumbers 1145.23
Edit: As noted in the comments, the $sums variable persists for each run which causes problems if you run this command twice. You could call Remove-variable sums after each run, or add it to the end processing block like this:
# C:\Temp> get-content .\numbers.txt | foreach{
$val=0;
if([Decimal]::TryParse($_,[ref]$val)){
$sums[$key]+=$val
}else{
$sums += @{"$_"=0}; #add new entry to hashtable
$key=$_;}
} -end {$sums; remove-variable sums;}
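Another way around the leftover-variable problem is to create the hashtable explicitly at the start of each run. The following is a sketch under assumptions: the lines are held in an array rather than read from numbers.txt, the first line is a header, and the numbers use "." as the decimal separator (hence the invariant culture):

```powershell
# Sample content from the question, held in memory instead of numbers.txt.
$lines = @('firstnumbers', '348.92', '237', '230', '329.31',
           'secondnumbers', '18.21', '48.92', '37', '30', '29.31')

$sums    = [ordered]@{}   # created fresh on every run, keeps header order
$key     = $null
$culture = [Globalization.CultureInfo]::InvariantCulture

foreach ($line in $lines) {
    $val = [decimal]0
    if ([decimal]::TryParse($line, [Globalization.NumberStyles]::Number, $culture, [ref]$val)) {
        # A number: add it to the total of the most recent header.
        $sums[$key] += $val
    } else {
        # Anything else is treated as a header that starts a new total.
        $key = $line
        $sums[$key] = [decimal]0
    }
}
$sums
```

The totals can then be copied into separate variables, for example $a = $sums['firstnumbers'] and $b = $sums['secondnumbers'].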