output content of each line after a search string - powershell

I have a debug log which spits out certain numbers that follow a preset error message.
For instance
08:29:25.178 [DEBUG] Error lookup ID 2834
08:29:25.179 [DEBUG] Error lookup ID 2834
The main reason I want to do this is to output only the unique instances of this ID (in the example above, that would be just one 2834). This is not possible otherwise, because the timestamp makes every line unique. So I need to output only the ID at the end of the line (in this case 2834).
I currently have following script which works but I am wondering if there is not a more efficient/elegant way to do all this.
$tempfile='tempfile.txt'
$tempfile2='tempfile2.txt'
$tempfile3='tempfile3.txt'
$finalfile='missingIDs.txt'
get-content 20180131.log -ReadCount 1000 |
foreach { $_ -match " Error lookup ID" } > $tempfile
get-content $tempfile | % { $_.Split(' ')[-1] } >$tempfile2
gc $tempfile2 | sort | get-unique > $tempfile3
gc $tempfile3| get-unique > $finalfile

Restating the problem for clarity:
Given lines of input, find " Error lookup ID" followed by a string of numbers, ID. Return all unique ID found in the input.
$testInput = @(
"08:29:25.177 [INFO] system started 5342"
"08:29:25.177 [DEBUG] Error lookup ID 2834"
"08:29:25.178 [TRACE] entered something"
"08:29:25.179 [DEBUG] Error lookup ID 2834"
"08:29:25.179 [DEBUG] Error lookup ID 2836"
)
$testInput | % { if ($_ -match ".*Error lookup ID (\d+)"){$Matches.1} } | Select-Object -Unique

Remove the intermediate text files and use the pipeline instead.
$finalfile='missingIDs.txt'
Get-Content 20180131.log -ReadCount 1000 |
foreach { $_ -match " Error lookup ID" } |
foreach { $_.Split(' ')[-1]} |
Sort-Object -Unique |
Out-File $finalfile
This makes the whole process more efficient, as there are no intermediate disk writes and reads.
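The two answers above can be combined into a single pipeline. The following is a minimal sketch, not a tested replacement for the original script: it writes a small hypothetical sample log to a temporary file (the path and contents are assumptions for illustration), extracts the ID with a capture group via Select-String, and de-duplicates with Sort-Object -Unique.

```powershell
# Hypothetical sample data standing in for 20180131.log
$logPath = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-20180131.log'
@(
    '08:29:25.177 [INFO] system started 5342'
    '08:29:25.177 [DEBUG] Error lookup ID 2834'
    '08:29:25.178 [TRACE] entered something'
    '08:29:25.179 [DEBUG] Error lookup ID 2834'
    '08:29:25.179 [DEBUG] Error lookup ID 2836'
) | Set-Content -Path $logPath

# Select-String filters the lines; the capture group isolates the numeric ID
$ids = Select-String -Path $logPath -Pattern 'Error lookup ID (\d+)' |
    ForEach-Object { $_.Matches[0].Groups[1].Value } |
    Sort-Object -Unique

$ids
```

Appending | Out-File $finalfile would write the result to disk as in the original script.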

how to count split words select-string pattern powershell. text file log life .txt .log

I need to count rows with values ms 2xx (where xx is any number) it can be 200,201,202,258,269 etc. (It has to start with number 2)
Then do it also with numbers 4 and 5.
I have a .txt file with rows like this:
2022.10.20 13:42:01.570 | INFO | Executed action "PERPIRestEPService.Controllers.PERPIController.GetVersion (PERPIRestEPService)" in 4.9487ms
2022.10.20 13:42:01.570 | INFO | Executed endpoint '"PERPIRestEPService.Controllers.PERPIController.GetVersion (PERPIRestEPService)"'
2022.10.20 13:42:01.570 | INFO | Request finished in 5.5701ms 200 application/json; charset=utf-8
2022.10.20 13:42:01.908 | DBUG | Starting HttpMessageHandler cleanup cycle with 4 items
2022.10.20 13:42:01.908 | DBUG | Ending HttpMessageHandler cleanup cycle after 0.0105ms - processed: 4 items - remaining: 0 items
2022.10.20 13:44:30.632 | DBUG | Received data from rabbit: <?xml version="1.0"?>
<TransactionJournal xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.datapac.sk/Posybe">
<Header>
<GeneratedDate>2022-10-20T13:44:30.5409065+02:00</GeneratedDate>
<BusinessDate>2022-10-20</BusinessDate>
<SourceSystem>Posybe MCFS</SourceSystem>
<Site>C702</Site>
<Version>1.0</Version>
I need to make this table. It should look like this:
Count Name
----- ----
97 200
278 202
2 205
18 275
I have this code:
$files = Get-ChildItem -Path "C:\Users\krivosik\Desktop\Scripts\logs\PosybeRestEPService\*.log"
#Write-Host $files
$files |
Select-String -Pattern 'ms 2','ms 4' |
Group-Object Pattern -NoElement
#| Select-Object Count, Name
foreach ($file in $files){
$mss=$file | Select-String -Pattern 'ms 2','ms 4'
foreach ($l in $mss){
$s = $l -split(" ")
$s[9]
$s[9] | Group-Object | Select-Object Count, Name
#Group-Object Pattern -NoElement |
#Select-String -Pattern 'ms 2','ms 4'
}
}
I tried to split the row, and now I have just the numbers I want. Now I have to count them and build that table, but I don't know how. I should be using Group-Object, but it just does not work for me.
This is the only output I can get from this code:
Count Name
----- ----
1463 ms 2
1 ms 4
202
Count : 1
Name : 202
202
Count : 1
Name : 202
202
Count : 1
Name : 202
202
Count : 1
Name : 202
Group-Object is the right solution in this scenario. You just have to group by the value that was matched for the name, this way it accounts for the total count of how many were found with that pattern:
Select-String -Path 'C:\Users\krivosik\Desktop\Scripts\logs\PosybeRestEPService\*.log' -Pattern '(?<=\d.*?ms )(2|4|5)\d+' |
Group-Object -Property { $_.Matches.Value } -NoElement
As for the pattern matching, use a "positive lookbehind" to ensure the capture of what would be the name property, making it less error-prone in case something else down the line matches ms 2/4/5.
Positive lookbehind: (?<=\d.*?ms ) ensures that this pattern occurs immediately before what follows, without including it in the captured match.
(2|4|5)\d+, here is the actual capture of the name property with options to match a pattern starting with either 2,4, or 5 only.
Now, Group-Object can take the output of Select-String and group them by the values matched via Matches.Value; i.e. 2xx,4xx,5xx.
Edit: The path on where these values are found is already exposed via Select-String but you have to bring it out using a calculated property. Also, if you want to match exact values of ms 2xx the regex following the positive lookbehind has to be changed to those values:
Select-String -Path 'C:\Users\krivosik\Desktop\Scripts\logs\PosybeRestEPService\*.log' -Pattern '(?<=\d.*?ms )(200|202)' |
Group-Object -Property { $_.Matches.Value } |
Select-Object -Property Count, Name, @{
Name = 'Path'
Expression = { $_.Group[0].Path }
}
The | is used as a delimiter for an "or" RegEx operator, so it will match exactly 200, or 202.
If you want to add more values to match exactly, just separate them with the | delimiter inside the ().
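To try the grouping without touching the real log folder, here is a small sketch that pipes in-memory strings to Select-String instead of using -Path. The sample lines are made up to resemble the log format in the question, and the final Sort-Object is only there to make the output order predictable.

```powershell
# Hypothetical lines modeled on the log format shown in the question
$sampleLines = @(
    '2022.10.20 13:42:01.570 | INFO | Request finished in 5.5701ms 200 application/json; charset=utf-8'
    '2022.10.20 13:42:02.114 | INFO | Request finished in 3.2000ms 202 application/json; charset=utf-8'
    '2022.10.20 13:42:03.901 | INFO | Request finished in 1.1000ms 200 application/json; charset=utf-8'
    '2022.10.20 13:42:04.250 | INFO | Request finished in 0.9000ms 404 text/plain'
    '2022.10.20 13:42:01.908 | DBUG | Ending HttpMessageHandler cleanup cycle after 0.0105ms - processed: 4 items - remaining: 0 items'
)

# Group by the matched status code; -NoElement keeps only Count and Name
$table = $sampleLines |
    Select-String -Pattern '(?<=\d.*?ms )(2|4|5)\d+' |
    Group-Object -Property { $_.Matches.Value } -NoElement |
    Sort-Object -Property Name

$table
```

The cleanup-cycle line is not counted, because the text after "ms " does not start with 2, 4, or 5.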

Powershell get all errors not just first

I have a command like so:
(($remoteFiles | Where-Object { -not ($_ | Select-String -Quiet -NotMatch -Pattern '^[a-f0-9]{32}( )') }) -replace '^[a-f0-9]{32}( )', '$0= ' -join "`n") | ConvertFrom-StringData
sometimes it throws a
ConvertFrom-StringData : Data item 'a3512c98c9e159c021ebbb76b238707e' in line 'a3512c98c9e159c021ebbb76b238707e = My Pictures/Tony/Automatic Upload/Tony’s iPhone/2022-10-08 21-46-21 (2).mov'
is already defined.
BUT I believe there to be more and the error is only thrown on the FIRST occurrence, is there a way to get all of the errors so I can act upon them?
is there a way to get all of the errors
I'm afraid there is not, because what ConvertFrom-StringData reports on encountering a problem is a statement-terminating error, which means that it aborts its execution instantly, without considering further input.
You'd have to perform your own analysis of the input in order to detect multiple problems, such as duplicate keys; e.g.:
@'
a = 1
b = 2
a = 10
c = 3
b = 20
'@ | ForEach-Object {
$_ -split '\r?\n' |
Group-Object { ($_ -split '=')[0].Trim() } |
Where-Object Count -gt 1 |
ForEach-Object {
Write-Error "Duplicate key: $($_.Name)"
}
}
Output:
Write-Error: Duplicate key: a
Write-Error: Duplicate key: b
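If the goal is to act on the duplicates rather than only report them, one option is to build the hashtable by hand. The sketch below is a simplified stand-in for ConvertFrom-StringData, under the assumption that every relevant line has the form key = value; it keeps the first value seen for each key and collects the later duplicate lines in a separate list. The variable names are illustrative only.

```powershell
$text = @'
a = 1
b = 2
a = 10
c = 3
b = 20
'@

$table      = @{}                                              # first value per key
$duplicates = [System.Collections.Generic.List[string]]::new() # later repeats

foreach ($line in ($text -split '\r?\n')) {
    if ($line -notmatch '=') { continue }          # skip blank or malformed lines
    $key, $value = ($line -split '=', 2).Trim()    # split on the first '=' only
    if ($table.ContainsKey($key)) {
        $duplicates.Add($line.Trim())
    } else {
        $table[$key] = $value
    }
}

$duplicates
```

Unlike ConvertFrom-StringData, this sketch does not handle comments or escape sequences.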

How to search a text file for string and perform a lookup using results and the contents of a CSV file?

I've been using a PowerShell script that reads a file and extracts error codes. It's quick and simple, and it does the job that I want it to, but I've now been asked to share it with a wide audience, so I need to make it a bit more robust.
The problem I've got is that I need to take the output from my script and use it to lookup against a CSV file so that I get a user friendly table at the end that lists:
A count of the how many time each error occurred (in descending order)
The Error code
The corresponding error message that it displays to the end user
This is the line format in the source file; there are normally upwards of 2000 lines:
17-12-2016,10:17:44:487{String=ERROR->(12345678)<inData:{device=printer, formName=blah.frm, eject=FALSE, operation=readForm}><outData:{fields=0, CODE=, field1=}> <outError:{CODE=Error103102, extendedErrorCode=-1, VARS=0}>}
This is my current script:
$WS = Read-Host "Enter computer name"
$date = Read-host "Enter Date"
# Search pattern for select-string (date always at the beginning of the line and the error code somewhere further in)
$Pattern = $date+".*<outError:{CODE="
# This find the lines in the files that contain the search pattern
$lines = select-string -path "\\$WS\c$\folder\folder\file.dat" -pattern $Pattern
# This is the specific Error code pattern that I'm looking for in each line
$regex = [regex] 'Error\d{1,6}'
$Var = @()
# Loops through each line and extracts Error code
foreach ($line in $lines) { $a = $line -match $regex
# Adds each match to variable
$Var += $matches.Values
}
# Groups and sorts results in to easy to read format
$Var | group | Sort-Object -Property count -descending
And this is the result it gives me:
Count Name Group
----- ---- -----
24 Error106013 {Error106013, Error106013, Error106013, Error106013...}
14 Error106109 {Error106109, Error106109, Error106109, Error106109...}
12 Error203002 {Error203002, Error203002, Error203002, Error203002...}
The CSV that I need to lookup against is as simple as it gets, with just 2 values per line in the format:
Code,Error message
What I need to get to is something like this:
Count Name Error Message
----- ---- -----
24 Error106013 Error:blah
14 Error106109 Error:blah,blah
12 Error203002 Error:blah,blah,balh
Google has failed me so I'm hoping that there is someone out there that can at the least point me in the right direction.
Not tested but it should work with a simple calculated property - just replace the last line with:
$errorMap = Import-Csv 'your_errorcode.csv'
$Var | Group-Object | Sort-Object -Property count -descending |
Select-Object Count, Name, @{l='Error Message'; e={($errorMap | Where-Object Code -eq $_.Name)."Error message"}}
Note: You also have to replace path to your CSV.
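A possible variation, sketched here with inline sample data instead of the real CSV file, is to load the CSV into a hashtable once so that each lookup is a direct key access rather than a Where-Object scan. The CSV rows and the contents of $Var below are made up for illustration; only the Code and Error message column names come from the question.

```powershell
# Build a Code -> message lookup table from (hypothetical) CSV content
$errorMap = @{}
@'
Code,Error message
Error106013,Error:blah
Error106109,"Error:blah,blah"
'@ | ConvertFrom-Csv | ForEach-Object { $errorMap[$_.Code] = $_.'Error message' }

# Stand-in for the error codes extracted by the original script
$Var = 'Error106013', 'Error106109', 'Error106013'

$report = $Var |
    Group-Object -NoElement |
    Sort-Object -Property Count -Descending |
    Select-Object Count, Name, @{ Name = 'Error Message'; Expression = { $errorMap[$_.Name] } }

$report
```

With a real file, the here-string and ConvertFrom-Csv would be replaced by Import-Csv with the path to the CSV.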

Filtering files by partial name match

I have a network share with 20.000 XML files in the format
username-computername.xml
There are duplicate entries (created when a user received a new computer) in the form of
user1-computer1.xml
user1-computer2.xml
or
BLRPPR-SKB52084.xml
BLRSIA-SKB50871.xml
S028DS-SKB51334.xml
s028ds-SKB52424.xml
S02FL6-SKB51644.xml
S02FL6-SKB52197.xml
S02VUD-SKB52083.xml
Since I'm going to manipulate the XMLs later, I can't just discard properties of the array; at the very least I need the full path. The aim is that, if a duplicate is found, the one with the newer timestamp is used.
Here is a snippet of the code where I need that logic:
$xmlfiles = Get-ChildItem "network share"
Here I'm just doing a foreach loop:
foreach ($xmlfile in $xmlfiles) {
[xml]$xmlcontent = Get-Content -Path $xmlfile.FullName -Encoding UTF8
Select-Xml -Xml $xmlcontent -Xpath " "
# create [pscustomobject] etc...
}
Essentially what I need is
if ($xmlfiles.Name.Split("-")[0]) - duplicate) {
# select the one with higher $xmlfiles.LastWriteTime and store either
# the full object or the $xmlfiles.FullName
}
Ideally that should be part of the foreach loop to not to have to loop through twice.
You can use Group-Object to group files by a custom attribute:
$xmlfiles | Group-Object { $_.Name.Split('-')[0] }
The above statement will produce a result like this:
Count Name Group
----- ---- -----
1 BLRPPR {BLRPPR-SKB52084.xml}
1 BLRSIA {BLRSIA-SKB50871.xml}
2 S028DS {S028DS-SKB51334.xml, s028ds-SKB52424.xml}
2 S02FL6 {S02FL6-SKB51644.xml, S02FL6-SKB52197.xml}
1 S02VUD {S02VUD-SKB52083.xml}
where the Group property contains the original FileInfo objects.
Expand the groups in a ForEach-Object loop, sort each group by LastWriteTime, and select the most recent file from it:
... | ForEach-Object {
$_.Group | Sort-Object LastWriteTime -Desc | Select-Object -First 1
}
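Put together, the two steps might look like the sketch below. It creates a few empty files in a temporary folder (hypothetical stand-ins for the network share) with staggered timestamps, so that the selection of the newest file per user can be observed.

```powershell
# Temporary folder standing in for the network share (assumption for the demo)
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("xmldemo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null

# Create sample files, each one day newer than the previous
$stamp = Get-Date '2020-01-01'
foreach ($name in 'S028DS-SKB51334.xml', 's028ds-SKB52424.xml', 'BLRPPR-SKB52084.xml') {
    $file = New-Item -ItemType File -Path (Join-Path $dir $name)
    $file.LastWriteTime = $stamp
    $stamp = $stamp.AddDays(1)
}

# Group by user name (case-insensitive by default), keep the newest file per group
$newest = Get-ChildItem -Path $dir -Filter *.xml |
    Group-Object { $_.Name.Split('-')[0] } |
    ForEach-Object {
        $_.Group | Sort-Object LastWriteTime -Descending | Select-Object -First 1
    }

$newest | Select-Object Name, FullName, LastWriteTime
```

Because the results are still FileInfo objects, FullName remains available for the later XML processing loop.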

Select-String sometimes results in "System.Object[]"

I'm working on a script that combines parts of two text files. These files are not too large (about 2000 lines each).
I'm seeing strange output from select-string that I don't think should be there.
Here's samples of my two files:
CC.csv - 2026 lines
LS126L47L6/1L2#519,07448,1,B
LS126L47L6/1R1-1#503,07449,1,B
LS126L47L6/1L3#536,07450,1,B
LS126L47L6/2R1#515,07451,1,B
LS126L47L6/10#525,07452,1,B
LS126L47L6/1L4#538,07453,1,B
GI.txt - 1995 lines
07445,B,SH,1
07446,B,SH,1
07448,B,SH,1
07449,B,SH,1
07450,B,SH,1
07451,B,SH,1
07452,B,SH,1
07453,B,SH,1
07454,B,SH,1
And here's a sample of the output file:
output in myfile.csv
LS126L47L6/3R1#516,07446,1,B
LS126L47L6/1L2#519,07448,1,B
LS126L47L6/1R1-1#503,07449,1,B
System.Object[],B
LS126L47L6/2R1#515,07451,1,B
This is the script I'm using:
sc ./myfile.csv "col1,col2,col3,col4"
$mn = gc cc.csv | select -skip 1 | % {$_.tostring().split(",")[1]}
$mn | % {
$a = (gc cc.csv | sls $_ ).tostring() -replace ",[a-z]$", ""
if (gc GI.txt | sls $_ | select -first 1)
{$b = (gc GI.txt | sls $_ | select -first 1).tostring().split(",")[1]}
else {$b = "NULL"
write-host "$_ is not present in GI file"}
$c = $a + ',' + $b
ac ./myfile.csv -value $c
}
The $a variable is where I am sometimes seeing the returned string as System.Object[]
Any ideas why? Also, this script takes quite some time to finish. Any tips for a newb on how to speed it up?
Edit: I should add that I've taken one line from the cc.csv file, saved in a new text file, and run through the script in console up through assigning $a. I can't get it to return "system.object[]".
Edit 2: After following the advice below and trying a couple of things, I've noticed that if I run
$mn | %{(gc cc.csv | sls $_).tostring()}
I get System.Object[].
But if I run
$mn | %{(gc cc.csv | sls $_)} | %{$_.tostring()}
It comes out fine. Go figure.
The problem is caused by a change in multiplicity of matches. If there are multiple matching elements an Object[] array (of MatchInfo elements) is returned; a single matching element results in a single MatchInfo object (not in an array); and when there are no matches, null is returned.
Consider these results, when executed against the "cc.csv" test-data supplied:
# matches many
(gc cc.csv | Select-String "LS" ).GetType().Name # => Object[]
# matches one
(gc cc.csv | Select-String "538").GetType().Name # => MatchInfo
# matches none
(gc cc.csv | Select-String "FAIL") # => null
The result of calling ToString on Object[] is "System.Object[]" while the result is a more useful concatenation of the matched values when invoked directly upon a MatchInfo object.
The immediate problem can be fixed by appending | Select -First 1 to the Select-String call, which results in a single MatchInfo being returned for the first two cases. Select-String will still search the entire input; extra results are simply discarded.
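Another way to make the result shape predictable, shown here as a sketch on in-memory lines taken from the question's cc.csv sample, is to wrap the call in the array subexpression operator @( ). The result is then always an array, whether there are zero, one, or many matches, and the Line property gives the matched text without relying on ToString.

```powershell
# A few lines from the cc.csv sample in the question
$lines = @(
    'LS126L47L6/1L2#519,07448,1,B'
    'LS126L47L6/1L3#536,07450,1,B'
    'LS126L47L6/1L4#538,07453,1,B'
)

$many = @($lines | Select-String 'LS')     # several matches
$one  = @($lines | Select-String '538')    # exactly one match
$none = @($lines | Select-String 'FAIL')   # no matches

"{0} / {1} / {2}" -f $many.Count, $one.Count, $none.Count

# Index into the array instead of calling ToString on the whole result
$one[0].Line
```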
However, it seems like the look-back into "cc.csv" (with the Select-String) could be eliminated entirely as that is where $_ originally comes from. Here is a minor [untested] adaptation, of what it may look like:
gc cc.csv | Select -Skip 1 | %{
$num = $_.Split(",")[1]
$a = $_ -Replace ",[a-z]$", ""
# This is still O(m*n) and could be improved with a hash/set probe.
$gc_match = Select-String $num -Path GI.txt -SimpleMatch | Select -First 1
if ($gc_match) {
# Use of "Select -First 1" avoids the initial problem; but
# it /may/ be more appropriate for an error to indicate data problems.
# (Likewise, an error in the original may need further investigation.)
$b = $gc_match.ToString().Split(",")[1]
} else {
$b = "NULL"
Write-Host "$_ is not present in GI file"
}
$c = $a + ',' + $b
ac ./myfile.csv -Value $c
}
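The hash probe mentioned in the code comment could look roughly like the following sketch. It assumes that the number to look up is always the first comma-separated field of a GI.txt line, which is narrower than the substring match used by Select-String, and it uses in-memory sample lines (one of them a made-up row with no GI match) instead of reading the files.

```powershell
# Sample rows; the last cc row is hypothetical and has no counterpart in GI
$ccLines = @(
    'LS126L47L6/1L2#519,07448,1,B'
    'LS126L47L6/1R1-1#503,07449,1,B'
    'LS126L47L6/9X9#999,09999,1,B'
)
$giLines = @('07448,B,SH,1', '07449,B,SH,1')

# Index GI once: number -> second field, keeping the first occurrence
$giByNumber = @{}
foreach ($line in $giLines) {
    $fields = $line.Split(',')
    if (-not $giByNumber.ContainsKey($fields[0])) {
        $giByNumber[$fields[0]] = $fields[1]
    }
}

# One pass over cc with a constant-time lookup per row
$result = foreach ($line in $ccLines) {
    $num = $line.Split(',')[1]
    $a   = $line -replace ',[a-z]$', ''
    $b   = if ($giByNumber.ContainsKey($num)) { $giByNumber[$num] } else { 'NULL' }
    "$a,$b"
}

$result
```

With real data, $ccLines and $giLines would come from Get-Content (skipping the header row of cc.csv), and $result could be written out with a single Set-Content call instead of one Add-Content per row.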