Using a variable within a PowerShell Select-String - powershell

I have this one section of my PowerShell script that I'm currently stuck on. Basically, I have two files that I want to do a comparison on via select-string...In detail
For each item in FileA.txt I want to do a select-string to FileB.txt to discover if it exist. If the line item in FileA.txt doesn't exist in FileB.txt, then print the FileA.txt line item to the screen.
This is what the text files looks like..more or less
FileA.txt
1
2
3
4
6
FileB.txt
6
7
8
9
10
Desired output would be the following:
1
2
3
4
This is what my PS code looks like now. My thought process was that I could use the variable within the select-string but its not working out for me :(
$IPs = Get-Content "C:\\FileA.txt"
Get-Content C:\FileB.txt | Select-String -InputObject $IPs
Could someone please help me out and point out what I am doing wrong.

Based on your limited sample data, here is an example of how you could do this :
"1 2 3 4 6" > "fileA.txt"
"6 7 8 9 10" > "fileB.txt"
$arrayA = (Get-Content "fileA.txt").Split(" ")
$arrayB = (Get-Content "fileB.txt").Split(" ")
$arrayResult = #()
foreach($valueA in $arrayA) {
if($arrayB -notcontains $valueA) {
$arrayResult += $valueA
}
}
$arrayResult -join " "
Now I believe the input files will be quite different eventually
EDIT :
Using line breaks :
"1
2
3
4
6" > "fileA.txt"
"6
7
8
9
10" > "fileB.txt"
$arrayA = Get-Content "fileA.txt"
$arrayB = Get-Content "fileB.txt"
$arrayResult = #()
foreach($valueA in $arrayA) {
if($arrayB -notcontains $valueA) {
$arrayResult += $valueA
}
}
$arrayResult -join "`n"
NB : the 2 scripts begin by filling the needed files, I guess you won't need to do it

In this specific example, Compare-Object would probably be a better choice. It is designed to find the differences between two lists.
You would use something like:
Compare-Object -ReferenceObject $(gc .\FileA.txt) -DifferenceObject $(gc .\FileB.txt) | where { $_.SideIndicator -eq '<=' } | select -expand InputObject
However, you could also do this with select-string:
gc .\FileA.txt | select-string -Pattern $(gc .\FileB.txt) -NotMatch
which just finds the lines of FileA that don't match the lines of FileB, however the lines of FileB are interpreted as regular expressions, which probably isn't appropriate for IP addresses since '.' is a wildcard.

Related

Using Powershell to compare two files and then output only the different string names

So I am a complete beginner at Powershell but need to write a script that will take a file, compare it against another file, and tell me what strings are different in the first compared to the second. I have had a go at this but I am struggling with the outputs as my script will currently only tell me on which line things are different, but it also seems to count lines that are empty too.
To give some context for what I am trying to achieve, I would like to have a static file of known good Windows processes ($Authorized) and I want my script to pull a list of current running processes, filter by the process name column so to just pull the process name strings, then match anything over 1 character, sort the file by unique values and then compare it against $Authorized, plus finally either outputting the different process strings found in $Processes (to the ISE Output Pane) or just to output the different process names to a file.
I have spent today attempting the following in Powershell ISE and also Googling around to try and find solutions. I heard 'fc' is a better choice instead of Compare-Object but I could not get that to work. I have thus far managed to get it to work but the final part where it compares the two files it seems to compare line by line, for which would always give me false positives as the line position of the process names in the file supplied would change, furthermore I only want to see the changed process names, and not the line numbers which it is reporting ("The process at line 34 is an outlier" is what currently gets outputted).
I hope this makes sense, and any help on this would be very much appreciated.
Get-Process | Format-Table -Wrap -Autosize -Property ProcessName | Outfile c:\users\me\Desktop\Processes.txt
$Processes = 'c:\Users\me\Desktop\Processes.txt'
$Output_file = 'c:\Users\me\Desktop\Extracted.txt'
$Sorted = 'c:\Users\me\Desktop\Sorted.txt'
$Authorized = 'c:\Users\me\Desktop\Authorized.txt'
$regex = '.{1,}'
select-string -Path $Processes -Pattern $regex |% { $_.Matches } |% { $_.Value } > $Output_file
Get-Content $Output_file | Sort-Object -Unique > $Sorted
$dif = Compare-Object -ReferenceObject $(Get-Content $Sorted) -DifferenceObject $(get-content $Authorized) -IncludeEqual
$lineNumber = 1
foreach ($difference in $dif)
{
if ($difference.SideIndicator -ne "==")
{
Write-Output "The Process at Line $linenumber is an Outlier"
}
$lineNumber ++
}
Remove-Item c:\Users\me\Desktop\Processes.txt
Remove-Item c:\Users\me\Desktop\Extracted.txt
Write-Output "The Results are Stored in $Sorted"
From the length and complexity of your script, I feel like I'm missing something, but your description seems clear
Running process names:
$ProcessNames = #(Get-Process | Select-Object -ExpandProperty Name)
.. which aren't blank: $ProcessNames = $ProcessNames | Where-Object {$_ -ne ''}
List of authorised names from a file:
$AuthorizedNames = Get-Content 'c:\Users\me\Desktop\Authorized.txt'
Compare:
$UnAuthorizedNames = $ProcessNames | Where-Object { $_ -notin $AuthorizedNames }
optional output to file:
$UnAuthorizedNames | Set-Content out.txt
or in the shell:
#(gps).Name -ne '' |? { $_ -notin (gc authorized.txt) } | sc out.txt
1 2 3 4 5 6 7 8
1. #() forces something to be an array, even if it only returns one thing
2. gps is a default alias of Get-Process
3. using .Property on an array takes that property value from every item in the array
4. using an operator on an array filters the array by whether the items pass the test
5. ? is an alias of Where-Object
6. -notin tests if one item is not in a collection
7. gc is an alias of Get-Content
8. sc is an alias of Set-Content
You should use Set-Content instead of Out-File and > because it handles character encoding nicely, and they don't. And because Get-Content/Set-Content sounds like a memorable matched pair, and Get-Content/Out-File doesn't.

Count tabs per line and return the lines with too many tabs

Looking for a PowerShell script that looks in a text file for rows that have too many (or too few) tabs.
I found this PowerShell script that does exactly what I want (almost).
This counts the number of tabs per row:
Get-Content test.txt | ForEach-Object {
($_ | Select-String `t -all).matches | Measure-Object | Select-Object count
}
Can someone extend/modify/re-write this to return only the rows (with row numbers) that have more than, or less than, X number of tabs per row?
Don't use Get-Content before piping to Select-String, you'll lose contextual information about each line.
Instead, use the -Path parameter with Select-String:
$Tabs = Select-String -Path .\test.txt -Pattern "`t" -AllMatches
$Tabs |Select-Object LineNumber,Line,#{Name='TabCount';Expression={ $_.Matches.Count }}
To return only the ones where the number of tabs is greater than $x, use Where-Object:
$x = 3
$Tabs |Where-Object { $_.TabCount -ge $x} | Select-Object -ExpandProperty Line
If you just want a quick overview of the distribution, you could also use Group-Object:
Get-Content .\test.txt | Group-Object { "{0} tabs" -f [regex]::Matches($_,"`t").Count }
Lots of ways to do this. Get-Content works just fine for me and we create a custom object that you can then filter as desired.
Get-Content test.txt | ForEach-Object{
New-Object PSObject -Property #{
Line = $_
LineNumber = $_.ReadCount
NumberofTabs = [regex]::matches($_,"`t").count
}
}
Use the .net regex method to count the tabs returned and populate a value based on the result.
NumberofTabs Number Line
------------ ------ ----
8 1 ;lkjasfdsa
8 2 asdfasdf
4 3 asdfasdfasdfa
2 4 fasdfjasdlfjas;l
Now you can use PowerShell to filter as you see fit.
} | Where-Object { $_.NumberofTabs -ne 4}
So if 4 was the perfect number then line 3 would be ommited from the results.

Using powershell, how do I extract a 7-digit number from a subject-line (of an email ), regular expressions?

I have the following code which lists the first 5 items in the Inbox folder (of Outlook).
How would I extract only the number portion of it( say - 7 digit arbitrary numberss, which are embedded within other text)? Then using Powershell commands, I'd really like to take those extracted numbers and dump them to a CSV file(thus, they can be easily incorporated into an existing spreadsheet I use).
Here's what I tried :
$outlook = new-object -com Outlook.Application
$sentMail = $outlook.Session.GetDefaultFolder(6) # == olFolderInbox
$sentMail.Items | select -last 10 TaskSubject # ideally, grabbing first 20
$matches2 = "\d+$"
$res = gc $sentMail.Items | ?{$_ -match $matches2 | %{ $_ -match $matches2 | out-null; $matches[1] }
but this does not run correctly, but rather .. keeps me hanging with awaiting-input symbol: like so :
>>
>>
>>
Do I need to perhaps create a separate variable in between the 1st part and 2nd part?
Not sure what the $matches variable is for but try to replace your last line with something like below.
For Subject Line Items:
$sentMail.Items | % { $_.TaskSubject | Select-String -Pattern '^\d{3}-\d{3}-\d{4}' | % {([string]$_).Substring(0,12)} }
For Message Body Items:
$sentMail.Items | % { ($_.Body).Split("`n") | Select-String -Pattern '^\d{3}-\d{3}-\d{4}' |% {([string]$_).Substring(0,12)} }
Here is a refrence to Select-String which I use pretty often.
https://technet.microsoft.com/library/hh849903.aspx
Here is a reference to the Phone number portion which I have never used but found pretty cool.
http://blogs.technet.com/b/heyscriptingguy/archive/2011/03/24/use-powershell-to-search-a-group-of-files-for-phone-numbers.aspx
Good luck!
Here is an edited version for 7 digit extraction via subject line. This assumes the number has a space on each side but can be modified a bit if necessary. You may also want to adjust the depth by changing the -First portion to Select * or just making 100 deeper in range.
$outlook = New-Object -com Outlook.Application
$Mail = $outlook.Session.GetDefaultFolder(6) # Folder Inbox
$Mail.Items | select -First 100 TaskSubject |
% { $_.TaskSubject | Select-String -Pattern '\s\d{7}\s'} |
% {((Select-String -InputObject $_ -Pattern '\s\d{7}\s').Line).split(" ") |
% {if(($_.Length -eq 7) -and ($_ -match '\d{7}')) {$_ | Out-File -FilePath "C:\Temp\SomeFile.csv" -Append}}}
Some of this you have already addressed / figured out but I wanted to explain the issues with your current code.
If you expect multiple matches and want to return those then you would need to use Select-String with the -AllMatches parameter. Your regex, in your example, is currently looking for a sequence of digits at the end of the subject. That would only return one match so lets looks at the issues with your code.
$sentMail.Items | select -last 10 TaskSubject
You are filtering the last 10 items but you are not storing those for later use so they would merely be displayed on screen. We cover a solution later.
One of the primary reasons for using -match is to get the Boolean value that is returned for code like if blocks and where clauses. You can still use it in the way you intended. Looking at the current code in question:
$res = gc $sentMail.Items | ?{$_ -match $matches2 | %{ $_ -match $matches2 | out-null; $matches[1] }
The two big issues with this are you are calling Get-Content(gc) on each item. Get-Content is for pulling file data which $sentMail.Items is not. You also having a large where block. Where blocks will pass data to the output steam based on a true or false condition. Your malformed statement ?{$_ -match $matches2 | %{ $_ -match $matches2 | out-null; $matches[1] } wont do this... at least not well.
$outlook = new-object -com Outlook.Application
$sentMail = $outlook.Session.GetDefaultFolder(6) # == olFolderInbox
$matches2 = "\d+$"
$sentMail.Items | select -last 10 -ExpandProperty TaskSubject | ?{$_ -match $matches2} | %{$Matches[0]}
Take the last 10 email subjects and check if either of them match the regex string $matches2. If they do then return the string match to standard output.

compare columns in two csv files

With all of the examples out there you would think I could have found my solution. :-)
Anyway, I have two csv files; one with two columns, one with 4. I need to compare one column from each one using powershell. I thought I had it figured out but when I did a compare of my results, it comes back as false when I know it should be true. Here's what I have so far:
$newemp = Import-Csv -Path "C:\Temp\newemp.csv" -Header login_id, lastname, firstname, other | Select-Object "login_id"
$ps = Import-Csv -Path "C:\Temp\Emplid_LoginID.csv" | Select-Object "login id"
If ($newemp -eq $ps)
{
write-host "IDs match" -forgroundcolor green
}
Else
{
write-host "Not all IDs match" -backgroundcolor yellow -foregroundcolor black
}
I had to specifiy headers for the first file because it doesn't have any. What's weird is that I can call each variable to see what it holds and they end up with the same info but for some reason still comes up as false. This occurs even if there is only one row (not counting the header row).
I started to parse them as arrays but wasn't quite sure that was the right thing. What's important is that I compare row1 of the first file with with row1 of the second file. I can't just do a simple -match or -contains.
EDIT: One annoying thing is that the variables seem to hold the header row as well. When I call each one, the header is shown. But if I call both variables, I only see one header but two rows.
I just added the following check but getting the same results (False for everything):
$results = Compare-Object -ReferenceObject $newemp -DifferenceObject $ps -PassThru | ForEach-Object { $_.InputObject }
Using latkin's answer from here I think this would give you the result set you're looking for. As per latkin's comment, the property comparison is redundant for your purposes but I left it in as it's good to know. Additionally the header is specified even for the csv with headers to prevent the header row being included in the comparison.
$newemp = Import-Csv -Path "C:\Temp\_sotemp\Book1.csv" -Header loginid |
Select-Object "loginid"
$ps = Import-Csv -Path "C:\Temp\_sotemp\Book2.csv" -Header loginid |
Select-Object "loginid"
#get list of (imported) CSV properties
$props1 = $newemp | gm -MemberType NoteProperty | select -expand Name | sort
$props2 = $ps | gm -MemberType NoteProperty | select -expand Name | sort
#first check that properties match
#omit this step if you know for sure they will be
if(Compare-Object $props1 $props2){
throw "Properties are not the same! [$props1] [$props2]"
}
#pass properties list to Compare-Object
else{
Compare-Object $newemp $ps -Property $props1
}
In the second line, I see there a space "login id" and the first line doesn't have it. Could that be an issue. Try having the same name for the headers in the .csv files itself. And it works for without providing header or select statements. Below is my experiment based upon your input.
emp.csv
loginid firstname lastname
------------------------------
abc123 John patel
zxy321 Kohn smith
sdf120 Maun scott
tiy123 Dham rye
k2340 Naam mason
lk10j5 Shaan kelso
303sk Doug smith
empids.csv
loginid
-------
abc123
zxy321
sdf120
tiy123
PS C:\>$newemp = Import-csv C:\scripts\emp.csv
PS C:\>$ps = Import-CSV C:\scripts\empids.csv
PS C:\>$results = Compare-Object -ReferenceObject $newemp -DifferenceObject $ps | foreach { $_.InputObject}
Shows the difference objects that are not in $ps
loginid firstname lastname SideIndicator
------- --------- -------- -------------
k2340 Naam mason <=
lk10j5 Shaan kelso <=
303sk Doug smith <=
I am not sure if this is what you are looking for but i have used the PowerShell to do some CSV formatting for myself.
$test = Import-Csv .\Desktop\Vmtools-compare.csv
foreach ($i in $test) {
foreach ($n in $i.name) {
foreach ($m in $test) {
$check = "yes"
if ($n -eq $m.prod) {
$check = "no"
break
}
}
if ($check -ne "no") {$n}
}
}
this is how my excel csv file looks like:
prod name
1 3
2 5
3 8
4 2
5 0
and script outputs this:
8
0
so basically script takes each number under Name column and then checks it against prod column. If the number is there then it won't display else it will display that number.
I have also done it the opposite way:
$test = Import-Csv c:\test.csv
foreach ($i in $test) {
foreach ($n in $i.name) {
foreach ($m in $test) {
$check = "yes"
if ($n -eq $m.prod) {echo $n}
}
}
}
this is how my excel csv looks like:
prod name
1 3
2 5
3 8
4 2
5 0
and script outputs this:
3
5
2
so script shows the matching entries only.
You can play around with the code to look at different columns.

Adding numbers into 2 totals and put into each its own variable

Hope you can help me with this little puzzle.
I have ONE txt file looking like this:
firstnumbers
348.92
237
230
329.31
secondnumbers
18.21
48.92
37
30
29.31
So a txt file with one Column that has 2 strings and some numbers on each line.
I want to take the total of each column and put it into each variable like say $a and $b
Yes it is 1 column, just to make sure no misunderstanding
It's pretty easy, if I use 2 files with each column of numbers without the headers(strings)
$a = (Get-Content 'firstnumbers.txt' | Measure-Object -Sum).Sum
$b = (Get-Content 'secondnumbers.txt' | Measure-Object -Sum).Sum
But it would be a little more cool to have them in one txt file, like the aforementioned with a header over each row of numbers.
I've tried removing the the headers with i.e. $a.Replace("first", $null).Replace("sec", $null) and then doing a $b.Split(" ")[1,2,3,4,5] ending with | measure -sum
That gives me the correct number of firstnumbers - but it won't work if I don't keep the specific set of numbers each time. They'll change and there's gonna be more or less of them.
It should be pretty easy I'm guessing. I just can't to seem wrap my head around it at the moment.
Any advice would be awesome!
cheers
Something like this should work:
$file = "C:\path\to\your.txt"
[IO.File]::ReadAllText($file) | % {
$_ -replace "`n+([0-9])", ' $1' -split "`n"
} | ? { $_ -ne "" } | % {
$a = $_ -split " ", 2
$v = $a[1] -split " " | Measure-Object -Sum
"{0}`t{1}" -f ($a[0], $v.Sum)
}
Output:
firstnumbers 1145,23
secondnumbers 163,44
Here's another approach, rather than parsing the text as one big blob, you could test each line to see if it contains a # or text, if it's text, then it triggers the creation of a new entry in a hashtable where the sums are stored:
# C:\Temp> get-content .\numbers.txt | foreach{
$val=0;
if([Decimal]::TryParse($_,[ref]$val)){
$sums[$key]+=$val
}else{
$sums += #{"$_"=0}; #add new entry to hashtable
$key=$_;}
} -end {$sums}
Name Value
---- -----
secondnumbers 163.44
firstnumbers 1145.23
Edit: As noted in the comments, the $sums variable persists for each run which causes problems if you run this command twice. You could call Remove-variable sums after each run, or add it to the end processing block like this:
# C:\Temp> get-content .\numbers.txt | foreach{
$val=0;
if([Decimal]::TryParse($_,[ref]$val)){
$sums[$key]+=$val
}else{
$sums += #{"$_"=0}; #add new entry to hashtable
$key=$_;}
} -end {$sums; remove-variable sums;}