Pull MessageTrace data from O365 RestUri - powershell

I'm trying to record Message delivery/failure information in O365. I have over 250K mailboxes and the message I'm trying to trace is a global email sent to a root DL with lots of nested DLs.
I'm trying the piece of code below:
$Root = "https://reports.office365.com/ecp/reportingwebservice/reporting.svc/"
$Format = "`$format=JSON"
$WebService = "MessageTrace"
$Select = "`$select=RecipientAddress,Status"
$Filter = "`$filter=MessageId eq 'xxxxxxxxxxxxxxxxxxx@xxxx.xxx.xxxx.OUTLOOK.COM' and Status eq 'Failed'"
# Build report URL
$url = ($Root + $WebService + "/?" + $Select + "&" + $Filter + "&" + $Format)
$sens = $null
Do {
    $sens = Invoke-RestMethod -Credential $cred -uri $url
    $sens.d.results.Count
    $sens.d.results | select -Last 1 -ExpandProperty RecipientAddress | ft -Wrap
    if ($sens.d.__next) {
        $url = ($sens.d.__next + "&" + $Format)
    }
} While ($sens.d.__next -ne $null)
For a sample message trace in my test domain, I should have:
19 Delivered events
14 Expanded events
5001 Failed events
I hit a problem with PageSize, as the default limit is 2000. Filtering on Delivered and Expanded gives me accurate results because those complete on the first iteration.
But the Failed events have to be broken into 3 pages, and that data is not being fetched correctly.
If my understanding is correct, I should see 2 iterations with 2000 entries each (with __next containing $skiptoken=1999 and 3999 respectively) and a last iteration with 1001 entries and no __next. Instead I keep getting __next even after the 3rd iteration, and the $skiptoken keeps increasing into the tens of thousands.
It seems to be looping over the same results.
2000
sens2500@contoso.local
2000
sens939@contoso.local
2000
sens1183@contoso.local
2000
sens214@contoso.local
2000
sens1183@contoso.local
2000
sens1423@contoso.local
I can also see that the result entries are not unique between the 1st, 2nd, and 3rd iterations; it seems to just pull a random 2000 entries on each attempt.
I tried adding an $orderby=RecipientAddress clause to see if that makes any difference, but I wasn't able to make it work.
Any help on this?
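For reference, this is the shape of the loop I'm aiming for: every page collected into one list, the format parameter only appended when the __next link doesn't already carry it, and a de-duplication pass at the end so any overlap between pages becomes visible. This is only a sketch built on the same $Root/$Select/$Filter/$cred variables above, not a confirmed fix.
$allResults = [System.Collections.Generic.List[object]]::new()
$url = $Root + $WebService + "/?" + $Select + "&" + $Filter + "&" + $Format
do {
    $page = Invoke-RestMethod -Credential $cred -Uri $url
    $allResults.AddRange([object[]]$page.d.results)
    Write-Host ("Fetched {0} rows, {1} total" -f $page.d.results.Count, $allResults.Count)
    $url = $null
    if ($page.d.__next) {
        $url = $page.d.__next
        # only re-append the format parameter if the __next link doesn't already include it
        if ($url -notmatch 'format=') { $url = $url + "&" + $Format }
    }
} while ($url)
# count unique recipients to expose any overlap between pages
($allResults | Sort-Object RecipientAddress -Unique).Count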

How to speed up processing of ~million lines of text in log file

I am trying to parse a very large log file consisting of space-delimited text across about 16 fields. Unfortunately the app logs a blank line between each legitimate one (effectively doubling the lines I must process). It also causes fields to shift, because it uses a space both as the delimiter and for empty fields. I couldn't get around this in LogParser. Fortunately, PowerShell lets me reference fields from the end as well, making it easier to get the later fields affected by the shift.
After a bit of testing with smaller sample files, I've determined that processing line by line as the file is streaming with Get-Content natively is slower than just reading the file completely using Get-Content -ReadCount 0 and then processing from memory. This part is relatively fast (<1min).
The problem comes when processing each line, even though it's in memory. It is taking hours for a 75MB file with 561178 legitimate lines of data (minus all the blank lines).
I'm not doing much in the code itself. I'm doing the following:
Splitting line via space as delimiter
One of the fields is an IP address that I am reverse-DNS resolving, which is obviously going to be slow. So I have wrapped this in code that builds an in-memory ArrayList cache of previously resolved IPs and pulls from it when possible. The IPs are largely the same, so after a few hundred lines resolution shouldn't be an issue any longer.
Saving the needed array elements into my pscustomobject
Adding pscustomobject to arraylist to be used later.
During the loop I'm tracking how many lines I've processed and outputting that info in a progress bar (I know this adds extra time but not sure how much). I really want to know progress.
All in all, it's processing some 30-40 lines per second, but obviously this is not very fast.
Can someone offer alternative methods/objectTypes to accomplish my goals and speed this up tremendously?
Below are some samples of the log showing the field shift (note this is a Windows DNS debug log), followed by my code.
10/31/2022 12:38:45 PM 2D00 PACKET 000000B25A583FE0 UDP Snd 127.0.0.1 6c94 R Q [8385 A DR NXDOMAIN] AAAA (4)pool(3)ntp(3)org(0)
10/31/2022 12:38:45 PM 2D00 PACKET 000000B25A582050 UDP Snd 127.0.0.1 3d9d R Q [8081 DR NOERROR] A (4)pool(3)ntp(3)org(0)
NOTE: the issue in this case being [8385 A DR NXDOMAIN] (4 fields) vs [8081 DR NOERROR] (3 fields)
Other examples would be the "R Q" where sometimes it's " Q".
$Logfile = "C:\Temp\log.txt"
[System.Collections.ArrayList]$LogEntries = #()
[System.Collections.ArrayList]$DNSCache = #()
# Initialize log iteration counter
$i = 1
# Get Log data. Read entire log into memory and save only lines that begin with a date (ignoring blank lines)
$LogData = Get-Content $Logfile -ReadCount 0 | % {$_ | ? {$_ -match "^\d+\/"}}
$LogDataTotalLines = $LogData.Length
# Process each log entry
$LogData | ForEach-Object {
$PercentComplete = [math]::Round(($i/$LogDataTotalLines * 100))
Write-Progress -Activity "Processing log file . . ." -Status "Processed $i of $LogDataTotalLines entries ($PercentComplete%)" -PercentComplete $PercentComplete
# Split line using space, including sequential spaces, as delimiter.
# NOTE: Due to how app logs events, some fields may be blank leading split yielding different number of columns. Fortunately the fields we desire
# are in static positions not affected by this, except for the last 2, which can be referenced backwards with -2 and -1.
$temp = $_ -Split '\s+'
# Resolve DNS name of IP address for later use and cache into arraylist to avoid DNS lookup for same IP as we loop through log
If ($DNSCache.IP -notcontains $temp[8]) {
$DNSEntry = [PSCustomObject]#{
IP = $temp[8]
DNSName = Resolve-DNSName $temp[8] -QuickTimeout -DNSOnly -ErrorAction SilentlyContinue | Select -ExpandProperty NameHost
}
# Add DNSEntry to DNSCache collection
$DNSCache.Add($DNSEntry) | Out-Null
# Set resolved DNS name to that which came back from Resolve-DNSName cmdlet. NOTE: value could be blank.
$ResolvedDNSName = $DNSEntry.DNSName
} Else {
# DNSCache contains resolved IP already. Find and Use it.
$ResolvedDNSName = ($DNSCache | ? {$_.IP -eq $temp[8]}).DNSName
}
$LogEntry = [PSCustomObject]#{
Datetime = $temp[0] + " " + $temp[1] + " " + $temp[2] # Combines first 3 fields Date, Time, AM/PM
ClientIP = $temp[8]
ClientDNSName = $ResolvedDNSName
QueryType = $temp[-2] # Second to last entry of array
QueryName = ($temp[-1] -Replace "\(\d+\)",".") -Replace "^\.","" # Last entry of array. Replace any "(#)" characters with period and remove first period for friendly name
}
# Add LogEntry to LogEntries collection
$LogEntries.Add($LogEntry) | Out-Null
$i++
}
Here is a more optimized version you can try.
What changed:
Removed Write-Progress, especially because it's not known whether Windows PowerShell is being used; in PowerShell versions below 6, Write-Progress has a big performance impact
Changed $DNSCache to Generic Dictionary for fast lookups
Changed $LogEntries to Generic List
Switched from Get-Content to switch -Regex -File
$Logfile = 'C:\Temp\log.txt'
$LogEntries = [System.Collections.Generic.List[psobject]]::new()
$DNSCache = [System.Collections.Generic.Dictionary[string, psobject]]::new([System.StringComparer]::OrdinalIgnoreCase)

# Process each log entry
switch -Regex -File ($Logfile) {
    '^\d+\/' {
        # Split line using space, including sequential spaces, as delimiter.
        # NOTE: Due to how app logs events, some fields may be blank leading split yielding different number of columns. Fortunately the fields we desire
        # are in static positions not affected by this, except for the last 2, which can be referenced backwards with -2 and -1.
        $temp = $_ -Split '\s+'

        $ip = [string] $temp[8]
        $resolvedDNSRecord = $DNSCache[$ip]
        if ($null -eq $resolvedDNSRecord) {
            $resolvedDNSRecord = [PSCustomObject]@{
                IP      = $ip
                DNSName = Resolve-DnsName $ip -QuickTimeout -DnsOnly -ErrorAction Ignore | select -ExpandProperty NameHost
            }
            $DNSCache[$ip] = $resolvedDNSRecord
        }

        $LogEntry = [PSCustomObject]@{
            Datetime      = $temp[0] + ' ' + $temp[1] + ' ' + $temp[2] # Combines first 3 fields Date, Time, AM/PM
            ClientIP      = $ip
            ClientDNSName = $resolvedDNSRecord.DNSName
            QueryType     = $temp[-2] # Second to last entry of array
            QueryName     = ($temp[-1] -Replace '\(\d+\)', '.') -Replace '^\.', '' # Last entry of array. Replace any "(#)" characters with period and remove first period for friendly name
        }
        # Add LogEntry to LogEntries collection
        $LogEntries.Add($LogEntry)
    }
}
If it's still slow, there is also the option of using Start-ThreadJob as a multithreading approach with chunked lines (say 10,000 per job), roughly as sketched below.
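For illustration, here is a rough sketch of that chunked Start-ThreadJob idea (it requires PowerShell 7+ or the ThreadJob module on Windows PowerShell; the chunk size and the per-job DNS cache are my assumptions, and it is untested against your log):
$chunkSize = 10000
$lines = (Get-Content $Logfile -ReadCount 0) -match '^\d+/'

$jobs = for ($start = 0; $start -lt $lines.Count; $start += $chunkSize) {
    $chunk = $lines[$start..([Math]::Min($start + $chunkSize, $lines.Count) - 1)]
    Start-ThreadJob -ArgumentList (,$chunk) -ScriptBlock {
        param([string[]] $chunk)
        $cache = @{}    # per-job DNS cache, so the jobs don't have to share state
        foreach ($line in $chunk) {
            $temp = $line -split '\s+'
            $ip = $temp[8]
            if (-not $cache.ContainsKey($ip)) {
                $cache[$ip] = (Resolve-DnsName $ip -QuickTimeout -DnsOnly -ErrorAction Ignore).NameHost
            }
            [PSCustomObject]@{
                Datetime      = $temp[0..2] -join ' '
                ClientIP      = $ip
                ClientDNSName = $cache[$ip]
                QueryType     = $temp[-2]
                QueryName     = ($temp[-1] -replace '\(\d+\)', '.') -replace '^\.', ''
            }
        }
    }
}

# wait for all jobs and merge the per-chunk results in order
$LogEntries = $jobs | Receive-Job -Wait -AutoRemoveJob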

summing up average infos with PowerShell

I need to calculate the percentage of all my VMs that succeeded their backups, but on a weekly basis. I'm pretty new to all of this and haven't had any courses or training in PowerShell.
It's already working daily, but what I want is to sum everything up and compute a percentage of all the VMs that completed their backups.
I want the script to run every 24 hours, build up a weekly report, and every 7 days send a mail with the results. I've already done the mail part, but I don't know how to do the rest.
Edit
I already have the daily average script:
$success_rate = 100 - ($nbckp_vms * 100 / $total_vms)
But now that I have 7 days, I want to perform this calculation 7 times, save the result to a .txt file each day, and then, on the 7th day, compute a weekly success rate.
Of course I know it's something like "all the results / number of results * 100" or something like that, but I can't actually make this work in my PowerShell script.
I get this information from this part of the script:
# Check backup
$body = "*** VMs not backed up last night ***" + "`r`n" + "`r`n"
$total_vms = 0
$nbckp_vms = 0
foreach ($i in $csv1) {
    $total_vms++
    $VM = $i.VM
    $backup = $i.backup
    $today = Get-Date -Format "M/d/yyyy"
    $yesterday = (Get-Date).AddDays(-1).ToString("M/d/yyyy")
    try {
        if ($backup -notlike "*$yesterday*" -and `
            $backup -notlike "*$today*" -and `
            $backup -notlike "No backup*" -and `
            $backup -notlike "TiNa backup*"
        ) {
            #Write-Output "$VM have not been backuped last night."
            $nbckp_vms++
            $body = $body + "$VM" + "`r`n"
        }
    } catch {
    }
}
What I want is to send myself a weekly mail with the percentage of VMs that succeeded their backups. This is what a normal mail looks like:
*** VMs not backed up last night ***
Machine1
Machine2
Machine3
Machine4
Machine5
Machine6
Machine7
Machine8
Machine9
Machine10
Machine11
Machine12
Machine13
Machine14
Machine15
Machine16
Machine17
Machine18
Machine19
Machine20
Machine21
Machine22
Machine23
Machine24
Machine25
Machine26
Machine27
Machine28
Machine29
Machine30
Machine31
Machine32
Machine33
Machine34
*** Backup success rate for production KPIs ***
Daily success rate = 94.28%
Total VMs = 594
Daily unbacked up VMs = 34
The mail system works perfectly but I just want a weekly thing.
(I gave the VMs generic names)
This is what I tried so far:
$success_rate_weekly = 100 - (($text[1] += $text[2] += $text[3] += $text[4] += $text[5] += $text[6] += $text[7]) /= 7
get-content "E:\PS\Malik\valeurs.txt" | foreach { -split $_ | select -index 4 } | measure -sum
I found the last one on a French forum, but neither of these two lines worked for me.
(Posted on behalf of the question author)
Thanks to Ansgar Wiechers who gave me the idea of running two scripts, and to something really dumb I found on the internet, this is what I have:
First, I used Windows Task Scheduler to run my first script daily and collect my daily information. I also used Task Scheduler to run my second script monthly, which calculates a percentage every month before erasing the content of the .txt file.
I used this line:
$success_rate_weekly = get-content "E:\PS\Malik\valeurs.txt" | measure -average | select -expand average
And this line calculates the average percentage of VMs that succeeded their backups.
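Put together, a minimal sketch of that two-script setup (same file path and variables as above; the Send-MailMessage call is only indicated, and the schedule itself is handled by Task Scheduler):
# Script 1 - runs every 24 hours: append today's success rate to the text file
$success_rate = 100 - ($nbckp_vms * 100 / $total_vms)
Add-Content -Path 'E:\PS\Malik\valeurs.txt' -Value $success_rate

# Script 2 - runs at the end of the period: average the stored daily rates,
# mail the result, then clear the file for the next period
$success_rate_weekly = Get-Content 'E:\PS\Malik\valeurs.txt' | Measure-Object -Average | Select-Object -ExpandProperty Average
# ... Send-MailMessage with $success_rate_weekly included in the body ...
Clear-Content 'E:\PS\Malik\valeurs.txt'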

Efficient way to find and replace many strings in a large text file

The text file contains software output from a time-domain analysis: a 10,800-second simulation with 50 nodes being considered. We have 540,000 strings to be replaced in a 540 MB text file with 4.5 million lines.
This is currently projected to take more than 4 days. Something is going wrong, but I don't know what. Please suggest a more efficient approach.
Below is the function that does the find and replace.
To replace the strings, the script goes through the original text file line by line and at the same time generates a duplicate file with the replaced strings. So another 540 MB file with 4.5 million lines will exist by the end of the script.
Function ReplaceStringsInTextFile
{
    $OutputfilebyLine = New-Object -typename System.IO.StreamReader $inputFilePathFull
    $uPreviousValue = 0
    $time = 60
    $u = 0; $LastStringWithoutFindResult = 0
    $lineNumber = 0

    while ($null -ne ($line = $OutputfilebyLine.ReadLine())) {
        $lineNumber = $lineNumber + 1
        if ($time -le $SimulationTimeSeconds) # time simulation start and end checks
        {
            # 10800 strings corresponds to one node
            # there are 50 nodes.. Thus 540,000 values
            # $StringsToFindFileContent contains strings to find 540,000 strings
            # $StringsToReplaceFileContent contains strings to replace 540,000 strings
            $StringToFindLineSplit = -split $StringsToFindFileContent[$time-60]
            $StringToReplaceLineSplit = -split $StringsToReplaceFileContent[$time-60]
            if ($u -le $NumberofNodes-1)
            {
                $theNode = $Nodes_Ar[$u]
                $StringToFindvalue = $StringToFindLineSplit[$u]
                $StringToReplacevalue = $StringToReplaceLineSplit[$u]
                if (($line -match $theNode) -And ($line -match $StringToFindvalue)) {
                    $replacedLine = $line.replace($StringToFindvalue,$StringToReplacevalue)
                    add-content -path $WriteOutputfilePathFull -value "$replacedLine"
                    $uPreviousValue = $u
                    $checkLineMatched = 1
                    if (($line -match $LastNodeInArray)) {
                        $time = $time + 1
                        $LastStringWithoutFindResult = 0
                    }
                } elseif (($line -match $LastNodeInArray) -And ($checkLineMatched -eq 0)) {
                    $LastStringWithoutFindResult = $LastStringWithoutFindResult + 1
                } else {
                    #"Printing lines without match"
                    add-content -path $WriteOutputfilePathFull -value "$line"
                    $checkLineMatched = 0
                }
            }
            if ($checkLineMatched -eq 1) {
                # incrementing the value of node index to next one in case the last node is found
                $u = $uPreviousValue + 1
                if ($u -eq $Nodes_Ar.count) {
                    $u = 0
                    $timeElapsed = (get-date -displayhint time) - $startTime
                    "$($timeElapsed.Hours) Hours $($timeElapsed.Minutes) Minutes $($timeElapsed.Seconds) Seconds"
                }
            }
        }
        # Checking if the search has failed for more than three cycles
        if ($LastStringWithoutFindResult -ge 5) { # showing error dialog in case of search error
            [System.Windows.Forms.MessageBox]::Show("StringToFind Search Fail. Please correct StringToFind values. Aborting now" , "Status" , 0)
            $OutputfilebyLine.close()
        }
    }
    $OutputfilebyLine.close()
}
The above function is the last part of the script, and it is what takes the most time.
I had run the script in under 10 hours a year ago.
Update: The script sped up after about 4 hours of running, and the projected time to complete suddenly dropped from 4 days to under 3 hours. The script finished in 7 hours and 9 minutes. However, I'm not sure what caused the sudden change in speed, other than asking the question on Stack Overflow :)
As per the suggestion by https://stackoverflow.com/users/478656/tessellatingheckler
I have avoided writing one line at a time using
add-content -path $WriteOutputfilePathFull -value "$replacedLine"
Instead I am now building up ten thousand lines at a time:
$tenThousandLines = $tenThousandLines + "`n" + $replacedLine
And at the appropriate point I use Add-Content to write the 10,000 lines in one go, like below. The if block follows my method's logic:
if ($lineNumber/10000 -gt $tenThousandCounter) {
    clear-host
    add-content -path $WriteOffpipeOutputfilePathFull -value "$tenThousandLines"
    $tenThousandLines = ""
    $tenThousandCounter = $tenThousandCounter + 1
}
I encountered a System.OutOfMemoryException when trying to add 15,000 or 25,000 lines at a time. After this change, the time required for the operation dropped from 7 hours to 5 hours, and on another run to 2 hours and 36 minutes.
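Another option I have not tried: replacing Add-Content with a System.IO.StreamWriter, which buffers writes internally, so there is no large string to build up and no repeated re-opening of the output file (a sketch only; $replacedLine and $line would come from the same matching logic as in the function above):
$writer = New-Object System.IO.StreamWriter $WriteOutputfilePathFull
try {
    while ($null -ne ($line = $OutputfilebyLine.ReadLine())) {
        # ... same node/string matching as above, producing $replacedLine or passing $line through ...
        $writer.WriteLine($replacedLine)
    }
} finally {
    $writer.Close()
    $OutputfilebyLine.Close()
}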

Powershell/Sharepoint anti flooding script

Extreme powershell newbie here. I appreciate any and all help.
I'm trying to put together a simple anti-flooding script to work with SharePoint/PowerShell. It needs to look at a datetime in a field, compare it to the current datetime, and stop execution if we're within 5 seconds of the last submittal. The method I'm using now always seems to evaluate to true.
#get system datetime (output format - 06/12/2014 07:57:25)
$a = (Get-Date)
# Get current List Item
$ListItem = $List.GetItemById($ItemID)
$DateToCompare = $ListItem["baseline"].AddMilliseconds(5000)
if ($DateToCompare -gt $a)
{Break}
#set variable to field
$ListItem["baseline"] = $a
#write new item
$ListItem.Update()
Break
I don't have Sharepoint access so I cannot fully test.
Can you verify the datatype of "baseline" attribute?
($ListItem["baseline"]).getType().Name
Are you sure that 5000 milliseconds is really being added?
Write-Output "NOW: $($curDate) BASELINE: $($DateToCompare) DIFF: $( ($curDate - $DateToCompare).TotalMilliseconds )"
Why use break rather than letting the evaluation end naturally? Below is an alternative way you might restructure your code.
#The difference in Milliseconds acceptable
$threshold = 5000
#Get current date, the formatting depends on what you have defined for output.
$curDate = Get-Date
#Get current list item from SP
$listItem = $List.GetItemById($ItemID)
# Get current List Item's baseline
$DateToCompare = $listItem["baseline"]
Write-Output "NOW: $($curDate) BASELINE: $($DateToCompare) DIFF: $( ($curDate - $DateToCompare).TotalMilliseconds )"
if ( ($curDate - $DateToCompare).TotalMilliseconds -le $threshold ) {
    #set variable to field
    $ListItem["baseline"] = $curDate
    #write new item
    $ListItem.Update()
} else {
    #Outside of threshold
}
So it turns out the script as I gave it above was functional. The issue was that the time I was pulling with Get-Date was the server time (Central), rather than the local time (Eastern).
#bring server time up to eastern time
$a = (Get-Date).AddMilliseconds(7200000)
# Get current List Item
$ListItem = $List.GetItemById($ItemID)
#take baseline time and add 5 seconds
$DateToCompare = $ListItem["baseline"].AddMilliseconds(5000)
#stop if script has run in the last 5 sec (loop prevention)
if ($DateToCompare -gt $a)
{Break}
#stop if the status hasnt changed
if ($ListItem["baselinestatus"] -eq $ListItem["Status"])
{Break}
#get current activity status
$currentstatus = $ListItem["Status"]
#get current contents of log
$log = $ListItem["Log"]
#append new entry to existing and write it to the log
$newentry = $log + "<br>" + $a + " - " + $currentstatus
#set variable to field
$ListItem["Log"] = $newentry
$ListItem["baseline"] = $a
$ListItem["baselinestatus"] = $currentstatus
#write new item
$ListItem.Update()
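If the offset between the server and local time ever changes (daylight saving, a server move), the hard-coded 7,200,000 ms adjustment above will drift. A hypothetical alternative, sketched here, is to let .NET convert the server time to Eastern time explicitly (assuming the Windows time zone ID "Eastern Standard Time"):
# convert the server's current time to Eastern time instead of adding a fixed offset
$eastern = [System.TimeZoneInfo]::FindSystemTimeZoneById("Eastern Standard Time")
$a = [System.TimeZoneInfo]::ConvertTime((Get-Date), $eastern)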

Optimizing a script

Info
I've created a script which analyzes the debug logs from Windows DNS Server.
It does the following:
1. Open the debug log using the [System.IO.File] class
2. Perform a regex match on each line
3. Separate 16 capture groups into different properties inside a custom object
4. Fill dictionaries, appending to the value of each key to produce statistics
Steps 1 and 2 take the longest. In fact, they take a seemingly endless amount of time, because the file is growing as it is being read.
Problem
Due to the size of the debug log (80,000 KB), it takes a very long time.
I believe that my code is fine for smaller text files, but it fails to deal with much larger files.
Code
Here is my code: https://github.com/cetanu/msDnsStats/blob/master/msdnsStats.ps1
Debug log preview
This is what the debug log looks like (including the blank lines).
Multiply this by about 100,000,000 and you have my debug log.
21/03/2014 2:20:03 PM 0D0C PACKET 0000000005FCB280 UDP Rcv 202.90.34.177 3709 Q [1001 D NOERROR] A (2)up(13)massrelevance(3)com(0)
21/03/2014 2:20:03 PM 0D0C PACKET 00000000042EB8B0 UDP Rcv 67.215.83.19 097f Q [0000 NOERROR] CNAME (15)manchesterunity(3)org(2)au(0)
21/03/2014 2:20:03 PM 0D0C PACKET 0000000003131170 UDP Rcv 62.36.4.166 a504 Q [0001 D NOERROR] A (3)ekt(4)user(7)net0319(3)com(0)
21/03/2014 2:20:03 PM 0D0C PACKET 00000000089F1FD0 UDP Rcv 80.10.201.71 3e08 Q [1000 NOERROR] A (4)dns1(5)offis(3)com(2)au(0)
Request
I need ways or ideas on how to open and read each line of a file more quickly than what I am doing now.
I am open to suggestions of using a different language.
I would trade this:
$dnslog = [System.IO.File]::Open("c:\dns.log","Open","Read","ReadWrite")
$dnslog_content = New-Object System.IO.StreamReader($dnslog)

For ($i=0; $i -lt $dnslog.length; $i++)
{
    $line = $dnslog_content.readline()
    if ($line -eq $null) { continue }

    # REGEX MATCH EACH LINE OF LOGFILE
    $pattern = $line | select-string -pattern $regex

    # IGNORE EMPTY MATCH
    if ($pattern -eq $null) {
        continue
    }
for this:
Get-Content 'c:\dns.log' -ReadCount 1000 |
    ForEach-Object {
        foreach ($line in $_)
        {
            if ($line -match $regex)
            {
                #Process matches
            }
        }
    }
That will reduce the number of file read operations by a factor of 1000.
Trading the select-string operation will require re-factoring the rest of the code to work with $matches[n] instead of $pattern.matches[0].groups[$n].value, but is much faster. Select-String returns matchinfo objects which contain a lot of additional information about the match (line number, filename, etc.) which is great if you need it. If all you need is strings from the captures then it's wasted effort.
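For example, pulling one capture group changes from the MatchInfo form to the automatic $matches table (a sketch, with $n standing in for whichever group index you use):
# before: Select-String / MatchInfo
$pattern = $line | select-string -pattern $regex
$value = $pattern.matches[0].groups[$n].value

# after: -match populates the automatic $matches table
if ($line -match $regex) {
    $value = $matches[$n]
}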
You're creating an object ($log), and then accumulating values into array properties:
$log.date += @($pattern.matches[0].groups[$n].value); $n++
that array addition is going to kill your performance. Also, hash table operations are faster than object property updates.
I'd create $log as a hash table first, and the key values as array lists:
$log = @{}
$log.date = New-Object collections.arraylist
Then inside your loop:
$log.date.add($matches[1]) > $null
Then create your object from $log after you've populated all of the array lists.
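For instance, one way to do that last step (shown as a sketch) is to cast the populated hash table, which gives you an object with one property per key:
$logObject = [PSCustomObject]$log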
As a general piece of advice, use Measure-Command to find out which script blocks take the longest.
That being said, the sleep seems a bit odd. If I'm not mistaken, you sleep 20 ms after each row:
sleep -milliseconds 20
Multiply 20 ms by the log size, 100 million iterations, and you get quite a long total sleep time: roughly 2,000,000 seconds, or about 23 days of sleeping alone.
Try sleeping only after some decent batch size instead. See whether 10,000 rows works well, like so:
if ($i % 10000 -eq 0) {
    write-host -nonewline "."
    start-sleep -milliseconds 20
}