Run command, extract a field, run a resultant command - powershell

Apologies if this is an insanely simple question, but I'm at something at a loss.
What I'm trying to do is take a command output - in this case from NetApp DFM:
dfm event list
ID Source Name Severity Timestamp
------- ------- ------------- ----------- ------------
1 332 volume-online Normal 20 Apr 10:16
2 443 volume-online Normal 20 Apr 10:17
3 3222 volume-online Normal 20 Apr 10:18
I have about 17,000 events - I want to delete them all by ID, by running:
dfm event delete <ID>
I know exactly how I'd do this on Unix (and used to, when this was our platform):
for i in `dfm event list | awk '{print $1}'`
do
dfm event delete $i
done
For bonus points - a 'grep' type criteria? I apologise in advance for the basic nature of the question - I've tried looking on Google for a suitable example, but haven't found anything.
I've made a start by:
dfm event list > dfmevent.txt
foreach ( $line in get-content dfmevent.txt ) {
echo $line
}
But I thought I would ask if there's a better way.

I don't have access to your environment to test but if you are just trying to get access to that first element which is the ID then that should be straight forward.
dfm event list | ForEach-Object{$_.Split(" ",2)[0]} | Where-Object{$_ -match '^\d+$'} | ForEach-Object{
#For Testing
Write-Host "Id: $_ will be deleted"
# Then do something
# dfm event delete $_
}
I'm sure the output is already delimited with new line so sending to file might be redundant.
We take each line and try and split it on the first space. Then pass the first element from that array. Next we ensure that element is indeed a number with a simple regex check. This will ensure that we only get numbers. I had thought about skipping the first two lines but this should work for other occurrences of text as well.
The last loop is for processing that ID. I left a Write-Host there for testing. Assuming you get the id's you are looking for you should just be able to uncomment out that last line with dfm event delete $_

Capturing the output of a DOS command into Powershell is a challenge.
Using a native snapin or module from NetApp would be easier.
might be worth checking out if that link helps
Otherwise, your method of writing to a text file and reading it back in is actually quite a good idea, this is one way of reading it back and pushing the data into the command you need.
$a = get-content dfmevent.txt
foreach ($i in $a) { if ($i.ReadCount -gt 2) { dfm event delete ($i.Substring(0,$i.IndexOf(" "))) } }
This will assign to the variable $result only
$a = get-content dfmevent.txt
$result = #()
foreach ($i in $a) { if ($i.ReadCount -gt 2) { $result += $i.Substring(0,$i.IndexOf(" "))} }
And if you did not want to write to a text file, you could use the .NET method of capturing the output directly
$ProcessInfo = New-Object System.Diagnostics.ProcessStartInfo
$ProcessInfo.FileName = "dfm"
$ProcessInfo.RedirectStandardOutput = $true
$ProcessInfo.UseShellExecute = $false
$ProcessInfo.Arguments = "event list"
$Process = New-Object System.Diagnostics.Process
$Process.StartInfo = $ProcessInfo
$Process.Start() | Out-Null
$Process.WaitForExit()
$output = $Process.StandardOutput.ReadToEnd()

Related

How to speed up processing of ~million lines of text in log file

I am trying to parse a very large log file that consists of space delimited text across about 16 fields. Unfortunately the app logs a blank line in between each legitimate one (arbitrarily doubling the lines I must process). It also causes fields to shift because it uses space as both a delineator as well as for empty fields. I couldn't get around this in LogParser. Fortunately Powershell affords me the option to reference fields from the end as well making it easier to get later fields affected by shift.
After a bit of testing with smaller sample files, I've determined that processing line by line as the file is streaming with Get-Content natively is slower than just reading the file completely using Get-Content -ReadCount 0 and then processing from memory. This part is relatively fast (<1min).
The problem comes when processing each line, even though it's in memory. It is taking hours for a 75MB file with 561178 legitimate lines of data (minus all the blank lines).
I'm not doing much in the code itself. I'm doing the following:
Splitting line via space as delimiter
One of the fields is an IP address that I am reverse DNS resolving, which is obviously going to be slow. So I have wrapped this into more code to create an in-memory arraylist cache of previously resolved IPs and pulling from it when possible. The IPs are largely the same so after a few hundred lines, resolution shouldn't be an issue any longer.
Saving the needed array elements into my pscustomobject
Adding pscustomobject to arraylist to be used later.
During the loop I'm tracking how many lines I've processed and outputting that info in a progress bar (I know this adds extra time but not sure how much). I really want to know progress.
All in all, it's processing some 30-40 lines per second, but obviously this is not very fast.
Can someone offer alternative methods/objectTypes to accomplish my goals and speed this up tremendously?
Below are some samples of the log with the field shift (Note this is a Windows DNS Debug log) as well as the code below that.
10/31/2022 12:38:45 PM 2D00 PACKET 000000B25A583FE0 UDP Snd 127.0.0.1 6c94 R Q [8385 A DR NXDOMAIN] AAAA (4)pool(3)ntp(3)org(0)
10/31/2022 12:38:45 PM 2D00 PACKET 000000B25A582050 UDP Snd 127.0.0.1 3d9d R Q [8081 DR NOERROR] A (4)pool(3)ntp(3)org(0)
NOTE: the issue in this case being [8385 A DR NXDOMAIN] (4 fields) vs [8081 DR NOERROR] (3 fields)
Other examples would be the "R Q" where sometimes it's " Q".
$Logfile = "C:\Temp\log.txt"
[System.Collections.ArrayList]$LogEntries = #()
[System.Collections.ArrayList]$DNSCache = #()
# Initialize log iteration counter
$i = 1
# Get Log data. Read entire log into memory and save only lines that begin with a date (ignoring blank lines)
$LogData = Get-Content $Logfile -ReadCount 0 | % {$_ | ? {$_ -match "^\d+\/"}}
$LogDataTotalLines = $LogData.Length
# Process each log entry
$LogData | ForEach-Object {
$PercentComplete = [math]::Round(($i/$LogDataTotalLines * 100))
Write-Progress -Activity "Processing log file . . ." -Status "Processed $i of $LogDataTotalLines entries ($PercentComplete%)" -PercentComplete $PercentComplete
# Split line using space, including sequential spaces, as delimiter.
# NOTE: Due to how app logs events, some fields may be blank leading split yielding different number of columns. Fortunately the fields we desire
# are in static positions not affected by this, except for the last 2, which can be referenced backwards with -2 and -1.
$temp = $_ -Split '\s+'
# Resolve DNS name of IP address for later use and cache into arraylist to avoid DNS lookup for same IP as we loop through log
If ($DNSCache.IP -notcontains $temp[8]) {
$DNSEntry = [PSCustomObject]#{
IP = $temp[8]
DNSName = Resolve-DNSName $temp[8] -QuickTimeout -DNSOnly -ErrorAction SilentlyContinue | Select -ExpandProperty NameHost
}
# Add DNSEntry to DNSCache collection
$DNSCache.Add($DNSEntry) | Out-Null
# Set resolved DNS name to that which came back from Resolve-DNSName cmdlet. NOTE: value could be blank.
$ResolvedDNSName = $DNSEntry.DNSName
} Else {
# DNSCache contains resolved IP already. Find and Use it.
$ResolvedDNSName = ($DNSCache | ? {$_.IP -eq $temp[8]}).DNSName
}
$LogEntry = [PSCustomObject]#{
Datetime = $temp[0] + " " + $temp[1] + " " + $temp[2] # Combines first 3 fields Date, Time, AM/PM
ClientIP = $temp[8]
ClientDNSName = $ResolvedDNSName
QueryType = $temp[-2] # Second to last entry of array
QueryName = ($temp[-1] -Replace "\(\d+\)",".") -Replace "^\.","" # Last entry of array. Replace any "(#)" characters with period and remove first period for friendly name
}
# Add LogEntry to LogEntries collection
$LogEntries.Add($LogEntry) | Out-Null
$i++
}
Here is a more optimized version you can try.
What changed?:
Removed Write-Progress, especially because it's not known if Windows PowerShell is used. PowerShell versions below 6 have a big performance impact with Write-Progress
Changed $DNSCache to Generic Dictionary for fast lookups
Changed $LogEntries to Generic List
Switched from Get-Content to switch -Regex -File
$Logfile = 'C:\Temp\log.txt'
$LogEntries = [System.Collections.Generic.List[psobject]]::new()
$DNSCache = [System.Collections.Generic.Dictionary[string, psobject]]::new([System.StringComparer]::OrdinalIgnoreCase)
# Process each log entry
switch -Regex -File ($Logfile) {
'^\d+\/' {
# Split line using space, including sequential spaces, as delimiter.
# NOTE: Due to how app logs events, some fields may be blank leading split yielding different number of columns. Fortunately the fields we desire
# are in static positions not affected by this, except for the last 2, which can be referenced backwards with -2 and -1.
$temp = $_ -Split '\s+'
$ip = [string] $temp[8]
$resolvedDNSRecord = $DNSCache[$ip]
if ($null -eq $resolvedDNSRecord) {
$resolvedDNSRecord = [PSCustomObject]#{
IP = $ip
DNSName = Resolve-DnsName $ip -QuickTimeout -DnsOnly -ErrorAction Ignore | select -ExpandProperty NameHost
}
$DNSCache[$ip] = $resolvedDNSRecord
}
$LogEntry = [PSCustomObject]#{
Datetime = $temp[0] + ' ' + $temp[1] + ' ' + $temp[2] # Combines first 3 fields Date, Time, AM/PM
ClientIP = $ip
ClientDNSName = $resolvedDNSRecord.DNSName
QueryType = $temp[-2] # Second to last entry of array
QueryName = ($temp[-1] -Replace '\(\d+\)', '.') -Replace '^\.', '' # Last entry of array. Replace any "(#)" characters with period and remove first period for friendly name
}
# Add LogEntry to LogEntries collection
$LogEntries.Add($LogEntry)
}
}
If it's still slow, there is still the option to use Start-ThreadJob as a multithreading approach with chunked lines (like 10000 per job).

PowerShell clearing memory after finishing

I have a PowerShell script that reads a large CSV file (4GB+), finds certain lines, then writes the lines to other files.
I'm noticing that when it gets to "echo "Processed $datacounter total lines in the $datafile file"" the last line of the script, it doesn't actually finish until 5-10 minutes later.
What is it doing for that period? When it does finish, memory usage drops off significantly. Is there a way to force it to clear memory at the end of the script?
Screenshot of Memory Usage
Screenshot of script timestamps
Here is the final version of my script for reference.
# Get the filename
$datafile = Read-Host "Filename"
$dayofweek = Read-Host "Day of week (IE 1 = Monday, 2 = Tuesday..)"
$campaignWriters = #{}
# Create campaign ID hash table
$campaignByID = #{}
foreach($c in (Import-Csv 'campaigns.txt' -Delimiter '|')) {
foreach($id in ($c.CampaignID -split ' ')) {
$campaignByID[$id] = $c.CampaignName
}
foreach($cname in ($c.CampaignName)) {
$writer = $campaignWriters[$cname] = New-Object IO.StreamWriter($dayofweek + $cname + '_filtered.txt')
if($dayofweek -eq 1) {
$writer.WriteLine("ID1|ID2|ID3|ID4|ID5|ID6|Time|Time-UTC-Sec")
}
}
}
# Display the campaigns
$campaignByID.GetEnumerator() | Sort-Object Value
# Read in data file
$encoding = [Text.Encoding]::GetEncoding('iso-8859-1')
$datareader = New-Object IO.StreamReader($datafile, $encoding)
$datacounter = 0
echo "Starting.."
get-date -Format g
while (!$datareader.EndOfStream) {
$data = $datareader.ReadLine().Split('รพ')
# Find the Campaign in the hashtable
$campaignName = $campaignByID[$data[3]]
if($campaignName) {
$writer = $campaignWriters[$campaignName]
# If a campaign name was returned from the hash, add the line using that campaign's writer
$writer.WriteLine(($data[20,3,5,8,12,14,0,19] -join '|'))
}
$datacounter++;
}
$datareader.Close()
foreach ($writer in $campaignWriters.Values) {
$writer.Close()
}
echo "Done!"
get-date -Format g
echo "Processed $datacounter total lines in the $datafile file"
I'm assuming that campaigns.txt is the mult-gigabyte file you are referring to. If it's the other file(s), this might not make as much sense.
If so, invoking import-csv the inside parenthesis then using the foreach statement to iterate through them is what's driving your memory usage so high. A better alternative would be use a PowerShell pipeline to stream records from the file without needing to keep all of them in memory at the same time. You achieve this by changing the foreach statment into a ForEach-Object cmdlet:
Import-Csv 'campaigns.txt' -Delimiter '|' | ForEach-Object {
foreach($id in ($_.CampaignID -split ' ')) {
$campaignByID[$id] = $_.CampaignName
}
}
The .NET garbage collector is optimized cases where the majority of objects are short-lived. Therefor this change should result in a noticeable performance increase, as well as reduced wind-down time at the end.
I advise against forcing garbage collection with [System.GC]::Collect(), the garbage collector knows best when it should run. The reasons for this are complex, if you really want to know details why this is true, Maoni's blog has a wealth of details about garbage collection in the .NET environment.
It may or may not work, but you can try to tell garbage collection to run:
[System.GC]::Collect()
You don't have fine grained control over it though, and it may help to Remove-Variable or set variables to $null for some things before running it so that there aren't references to the data anymore.

In powershell, i want Ioop twice through a text file but in second loop i want to continue from end of first loop

I have a text file to process. Text file has some configuration data and some networking commands. I want to run all those network commands and redirect output in some log file.
At starting of text file,there are some configuration information like File-name and file location. This can be used for naming log file and location of log file. These line starts with some special characters like '<#:'. just to know that rest of the line is config data about file not the command to execute.
Now, before i want start executing networking commands (starts with some special characters like '<:'), first i want to read all configuration information about file i.e. file name, location, overwrite flag etc. Then i can run all commands and dump output into log file.
I used get-content iterator to loop over entire text file.
Question: Is there any way to start looping over file from a specific line again?
So that i can process config information first (loop till i first encounter command to execute, remember this line number), create log file and then keep running commands and redirect output to log file (loop from last remembered line number).
Config File looks like:
<#Result_File_Name:dump1.txt
<#Result_File_Location:C:\powershell
<:ping www.google.com
<:ipconfig
<:traceroute www.google.com
<:netsh interface ip show config
My powerhsell script looks like:
$content = Get-Content C:\powershell\config.txt
foreach ($line in $content)
{
if($line.StartsWith("<#Result_File_Name:")) #every time i am doing this, even for command line
{
$result_file_arr = $line.split(":")
$result_file_name = $result_file_arr[1]
Write-Host $result_file_name
}
#if($line.StartsWith("<#Result_File_Location:"))#every time i am doing this, even for command line
#{
# $result_file_arr = $line.split(":")
# $result_file_name = $result_file_arr[1]
#}
if( $conf_read_over =1)
{
break;
}
if ($line.StartsWith("<:")) #In this if block, i need to run all commands
{
$items = $line.split("<:")
#$items[0]
#invoke-expression $items[2] > $result_file_name
invoke-expression $items[2] > $result_file_name
}
}
If all the config information starts with <# just process those out first separately. Once that is done you can assume the rest are commands?
# Collect config lines and process
$config = $content | Where-Object{$_.StartsWith('<#')} | ForEach-Object{
$_.Trim("<#") -replace "\\","\\" -replace "^(.*?):(.*)" , '$1 = $2'
} | ConvertFrom-StringData
# Process all the lines that are command lines.
$content | Where-Object{!$_.StartsWith('<#') -and ![string]::IsNullOrEmpty($_)} | ForEach-Object{
Invoke-Expression $_.trimstart("<:")
}
I went a little over board with the config section. What I did was convert it into a hashtable. Now you will have your config options, as they were in file, accessible as an object.
$config
Name Value
---- -----
Result_File_Name dump1.txt
Result_File_Location C:\powershell
Small reconfiguration of your code, with some parts missing, would look like the following. You will most likely need to tweak this to your own needs.
# Collect config lines and process
$config = ($content | Where-Object{$_.StartsWith('<#')} | ForEach-Object{
$_.Trim("<#") -replace "\\","\\" -replace "^(.*?):(.*)" , '$1 = $2'
} | Out-String) | ConvertFrom-StringData
# Process all the lines that are command lines.
$content | Where-Object{!$_.StartsWith('<#') -and ![string]::IsNullOrEmpty($_)} | ForEach-Object{
Invoke-Expression $_.trimstart("<:") | Add-Content -Path $config.Result_File_Name
}
As per your comment you are still curious about your restart loop logic which was part of your original question. I will add this as a separate answer to that. I would still prefer my other approach.
# Use a flag to determine if we have already restarted. Assume False
$restarted = $false
$restartIndexPoint = 4
$restartIndex = 2
for($contentIndex = 0; $contentIndex -lt $content.Length; $contentIndex++){
Write-Host ("Line#{0} : {1}" -f $contentIndex, $content[$contentIndex])
# Check to see if we are on the $restartIndexPoint for the first time
if(!$restarted -and $contentIndex -eq $restartIndexPoint){
# Set the flag so this does not get repeated.
$restarted = $true
# Reset the index to repeat some steps over again.
$contentIndex = $restartIndex
}
}
Remember that array indexing is 0 based when you are setting your numbers. Line 20 is element 19 in the string array for example.
Inside the loop we run a check. If it passes we change the current index to something earlier. The write-host will just print the lines so you can see the "restart" portion. We need a flag to be set so that we are not running a infinite loop.

Compare more than two strings

Here is what I am trying to achieve...
I have to view the ADAM db in VMWARE to see the replication times. My question is how would I compare more than two strings using the compare-object command. I cannot find any articles on more than two values.
This is what I started writing. I am trying to make this as dynamic as possible...
#PORT FOR LDAP
$ldap = 389;
#PATH
$path = 'DC=vdi,DC=vmware,DC=int';
#SERVERS
$vm = #("fqdn" , "fqdn" , "fqdn");
#ARRAY FOR LOOP
$comp = #();
#LOOP FOR ARRAY COMPARE
for($i = 1; $i -le $vm.count; $i++)
{
$comp += repadmin.exe /showrepl $svr":"$ldap $path | Select-String "Last attempt";
}
#CREATE DYNAMIC VARIABLES
for($i = 0; $i -le ($comp.count - 1); $i++)
{
New-Variable -name repl$i -Value $comp[$i];
}
Thank you in advanced!!!
As I mentioned in my comment, your question is too vague for us to provide a good answer for your situation, so I'll focus on "compare more than two strings". To do this, I wuold recommend Group-Object. Ex.
$data = #"
==== INBOUND NEIGHBORS ======================================
CN=Configuration,CN={B59C1E29-972F-455A-BDD5-1FA7C1B7D60D}
....
Last attempt # 2010-05-28 07:29:34 was successful.
CN=Schema,CN=Configuration,CN={B59C1E29-972F-455A-BDD5-1FA7C1B7D60D}
....
Last attempt # 2010-05-28 07:29:34 was successful.
OU=WSFG,DC=COM
....
Last attempt # 2010-05-28 07:29:35 failed, result -2146893008
(0x8009033
0):
"# -split [environment]::NewLine
$comp = $data | Select-String "Last attempt"
$comp | Group-Object
Count Name Group
----- ---- -----
2 Last attempt # 2010-05-28 07:29:34 was successful. { Last atte...
1 Last attempt # 2010-05-28 07:29:35 failed, result -2146893008 { Last atte...
Group-Object and PowerShell is very flexible, so you could customize this to ex. display the servernames and status for the servers that wasn't equal to the rest (ex. count = 1 or not in any of the biggest groups) etc., but I won't spend more time going into details because I have no idea of what you are trying to achieve, so I'll probably just waste both of ours time.
Summary: What I can tell you is the I would proabably (99% sure) use Group-Object to "compare more than two strings".

Powershell get repadmin /istg in array?

Is this possible?
I'm brand new to powershell and am currently in the process of converting a vbscript script to Powershell. The following one-liner command seems to do exactly what the entire vbscript does:
Repadmin /istg
which outputs
Repadmin: running command /istg against full DC ST-DC7.somestuff.com
Gathering topology from site BR-CORP (ST-DC7.somestuff.com):
Site ISTG
================== =================
Portland ST-DC4
Venyu ST-DC5
BR-Office ST-DC3
BR-CORP ST-DC7
The problem is I need to return this info (namely the last 4 lines) as objects which contain a "Site" and "ISTG" field. I tried the following:
$returnValues = Repadmin /istg
$returnValues
But this didin't return anything (possibly because Repadmin writes out the lines instead of actually returning the data?)
Is there a way to get the Info from "Repadmin /istg" into an array?
Here's one possible way, using regular expressions:
$output = repadmin /istg
for ( $n = 10; $n -lt $output.Count; $n++ ) {
if ( $output[$n] -ne "" ) {
$output[$n] | select-string '\s*(\S*)\s*(\S*)$' | foreach-object {
$site = $_.Matches[0].Groups[1].Value
$istg = $_.Matches[0].Groups[2].Value
}
new-object PSObject -property #{
"Site" = $site
"ISTG" = $istg
} | select-object Site,ISTG
}
}
You have to start parsing the 10th item of output and ignore empty lines because repadmin.exe seems to insert superflous line breaks (or at least, PowerShell thinks so).