Extract data from a log that contains a certain pattern - PowerShell

I have an Apache log file with lines in this format:
192.168.100.1 - - [13/Dec/2018:15:11:52 -0600] "GET/onabc/soitc/BackChannel/?param=369%2FGetTableEntryList%2F7%2Fonabc-s31%2FHPD%3AIncident%20Management%20Console27%2FDefault%20User%20View%20(Manager)9%2F3020872007%2Resolved%22%20AND%20((%27Assignee%20Login%20ID%27%20%3D%20%22Allen%22)Token=FEIH-MTJQ-H9PR-LQDY-WIEA-ZULM-45FU-P1FK HTTP/1.1"
I need to extract some data from an Apache log file, but only for lines that contain the word "login", and list the IP, date, and login ID ("Allen" is the login ID in this case) or save them to another file.
Thanks to your advice I am now using PowerShell to make this work; this is what I have now:
$Readlog = Get-Content -Path C:\Example_log.txt
$Results = foreach ($Is_login in $Readlog)
{
    if ($Is_login -match 'login')
    {
        [PSCustomObject]@{
            IP      = $Is_login.Split(' ')[0] # No need to trim the start.
            Date    = $Is_login.Split('[]')[1].Split(':')[0]
            Hour    = $Is_login.Split('[]')[1].Split(' ')[0] -Replace ('\d\d\/\w\w\w\/\d\d\d\d:','')
            LoginID = Select-String -InputObject $Is_login -Pattern "(?<=3D%20%22)\w{1,}" -AllMatches | % {$_.Matches.Groups[0].Value}
            Status  = Select-String -InputObject $Is_login -Pattern "(?<=%20%3C%20%22)\w{1,}" -AllMatches | % {$_.Matches.Groups[0].Value}
        }
    }
}
$Results
Thanks to your hints, I now have these results:
IP : 192.168.100.1
Date : 13/Dec/2018
Hour : 15:11:52
LoginID : Allen
Status : Resolved
IP : 192.168.100.30
Date : 13/Dec/2018
Hour : 16:05:31
LoginID : Allen
Status : Resolved
IP : 192.168.100.40
Date : 13/Dec/2018
Hour : 15:11:52
LoginID : ThisisMyIDHank
Status : Resolved
IP : 192.168.100.1
Date : 13/Dec/2018
Hour : 15:11:52
LoginID : Hank
Status : Resolved
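Since the original requirement also mentions saving the matches to another file, one hedged option (the output path below is only an example) is to export the objects built above:
# export the extracted objects to a CSV file; the path is only an example
$Results | Export-Csv -Path 'C:\Example_log_matches.csv' -NoTypeInformation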
Thanks to everyone for your help.

[Replaced the code that used not-really-there asterisks in the sample data.]
[PowerShell v5.1]
This will match any line that contains "login" and then extract the requested info using basic string operators. I tried to use regex, but got bogged down in the pattern matching. [blush] Regex would almost certainly be faster, but this is easier for me to understand.
# fake reading in a text file
# in real life, use Get-Content
$InStuff = @'
192.168.100.1 - - [13/Dec/2018:15:11:52 -0600] "GET/onabc/soitc/BackChannel/?param=369%2FGetTableEntryList%2F7%2Fonabc-s31%2FHPD%3AIncident%20Management%20Console27%2FDefault%20User%20View%20(Manager)9%2F3020872007%2Resolved%22%20AND%20((%27Assignee%20Login%20ID%27%20%3D%20%22Allen%22)Token=FEIH-MTJQ-H9PR-LQDY-WIEA-ZULM-45FU-P1FK HTTP/1.1"
100.100.100.100 - - [06/Nov/2018:10:10:10 -0666] "nothing that contains the trigger word"
'@ -split [environment]::NewLine
$Results = foreach ($IS_Item in $InStuff)
{
    if ($IS_Item -match 'login')
    {
        # build a custom object with the desired items
        # the PSCO makes export to a CSV file very, very easy [*grin*]
        # the split pattern is _very fragile_ and will break if the pattern is not consistent
        # a regex pattern would likely be both faster and less fragile, but i can't figure one out
        [PSCustomObject]@{
            IP        = $IS_Item.Split(' ')[0].TrimStart('**')
            Date      = $IS_Item.Split('[]')[1].Split(':')[0]
            # corrected for not-really-there asterisks
            #LoginName = $IS_Item.Split('*')[-3]
            LoginName = (($IS_Item.Split(')')[-2] -replace '%\w{2}') -csplit 'ID')[1]
        }
    }
}
# show on screen
$Results
# save to a CSV file
$Results |
Export-Csv -LiteralPath "$env:TEMP\Henry_Chinasky_-_LogExtract.CSV" -NoTypeInformation
on screen output ...
IP Date LoginName
-- ---- ---------
192.168.100.1 13/Dec/2018 Allen
csv file content ...
"IP","Date","LoginName"
"192.168.100.1","13/Dec/2018","Allen"

Related

powershell split into hashtable

I'm trying to split a string that I got from the Jira REST API, and I can't find a good way to do it.
The API returns this kind of object:
com.atlassian.greenhopper.service.sprint.Sprint@3b306c49[id=2792,rapidViewId=920,state=CLOSED,name=ABI
Reports/Support sprint
12,startDate=2018-09-11T09:45:26.622+02:00,endDate=2018-09-27T22:00:00.000+02:00,completeDate=2018-09-28T08:15:41.088+02:00,sequence=2792] com.atlassian.greenhopper.service.sprint.Sprint@c518022[id=2830,rapidViewId=920,state=ACTIVE,name=ABI
Reports/Support sprint
13,startDate=2018-09-28T08:30:26.785+02:00,endDate=2018-10-16T20:30:00.000+02:00,completeDate=,sequence=2830]
What I do with it is
$sprints = $issue.fields.customfield_10012 | Select-String -Pattern '\x5b(.*)\x5d' | ForEach-Object {$_.Matches.Groups[1].Value}
Where $issue.fields.customfield_10012 is the field returned from the REST API.
This gives me the object stripped of excess data, which I can convert to a hash table using this:
Foreach ($sprint in $sprints) {
    Try {
        # assign values to variables
        $sprint = $sprint -split ',' | Out-String
        $sprint = ConvertFrom-StringData -StringData $sprint
        [int]$sId = $sprint.id
        $sName = "N'" + $sprint.name.Replace("'", "''") + "'"
        # insert into SQL using Invoke-Sqlcmd
    }
    Catch {
        # Write a log message into the log table about an error in staging the worklog for the ticket
        $logMsg = "Staging sprint ($sId) for ticket ($key): $($_.Exception.Message)"
        Write-Host $logMsg
    }
}
But my users are creative, and one of the sprint names was "Sprint 11 - AS,SS,RS", which breaks my -split ',' and the conversion to a hash table.
Any idea how to split this string into a proper hash table?
com.atlassian.greenhopper.service.sprint.Sprint@3b306c49[id=2792,rapidViewId=920,state=CLOSED,name=ABI
Reports/Support sprint
12,startDate=2018-09-11T09:45:26.622+02:00,endDate=2018-09-27T22:00:00.000+02:00,completeDate=2018-09-28T08:15:41.088+02:00,sequence=2792] com.atlassian.greenhopper.service.sprint.Sprint@c518022[id=2830,rapidViewId=920,state=ACTIVE,name=Sprint
11 -
AS,SS,RS,startDate=2018-09-28T08:30:26.785+02:00,endDate=2018-10-16T20:30:00.000+02:00,completeDate=,sequence=2830]
Split the string on commas followed by a word with an equal sign
Working with each of those records on its own line (if this does not match the source data you can still use the logic below), we do a match to split up the data inside the brackets [] from that outside. Then we do a split on that internal data as discussed above, with a positive lookahead, to get the hashtables.
$lines = "com.atlassian.greenhopper.service.sprint.Sprint#3b306c49[id=2792,rapidViewId=920,state=CLOSED,name=ABI Reports/Support sprint 12,startDate=2018-09-11T09:45:26.622+02:00,endDate=2018-09-27T22:00:00.000+02:00,completeDate=2018-09-28T08:15:41.088+02:00,sequence=2792]",
"com.atlassian.greenhopper.service.sprint.Sprint#c518022[id=2830,rapidViewId=920,state=ACTIVE,name=Sprint 11 - AS,SS,RS,startDate=2018-09-28T08:30:26.785+02:00,endDate=2018-10-16T20:30:00.000+02:00,completeDate=,sequence=2830]"
$lines | Where-Object{$_ -match "^(?<sprintid>.*)\[(?<details>.*)\]"} | ForEach-Object{
$Matches.details -split ",(?=\w+=)" | Out-String | ConvertFrom-StringData
}
If we use the [pscustomobject] type accelerator we can get an object set right from that.
id : 2792
startDate : 2018-09-11T09:45:26.622+02:00
completeDate : 2018-09-28T08:15:41.088+02:00
sequence : 2792
name : ABI Reports/Support sprint 12
rapidViewId : 920
endDate : 2018-09-27T22:00:00.000+02:00
state : CLOSED
id : 2830
startDate : 2018-09-28T08:30:26.785+02:00
completeDate :
sequence : 2830
name : Sprint 11 - AS,SS,RS
rapidViewId : 920
endDate : 2018-10-16T20:30:00.000+02:00
state : ACTIVE
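A minimal sketch of that [pscustomobject] variant, reusing the split from above (the property order will differ, since it comes from a hashtable):
$lines | Where-Object{$_ -match "^(?<sprintid>.*)\[(?<details>.*)\]"} | ForEach-Object{
    # cast the hashtable produced by ConvertFrom-StringData to a custom object
    [pscustomobject]($Matches.details -split ",(?=\w+=)" | Out-String | ConvertFrom-StringData)
}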
I have more experience with ConvertFrom-StringData; however, as TheIncorrigible1 mentions, ConvertFrom-String is also powerful and could reduce some legwork here.

Flatten LaTeX file with PowerShell

I would like to make a simple PowerShell script that:
Takes an input .tex file
replaces occurrences of \input{my_folder/my_file} with the file content itself
outputs a new file
My first step is to match the different file names so as to import them, but the following code outputs not only the file names: it outputs the full matches, i.e. \include{file1}, \include{file2}, etc.
$ms = Get-Content ms.tex -Raw
$environment = "input"
$inputs = $ms | Select-String "\\(?:input|include)\{([^}]+)\}" -AllMatches | Foreach {$_.matches}
Write-Host $inputs
I thought using the parentheses would create a match group, but this fails. Can you explain to me why, and what is the proper way of getting just the filenames instead of the full match?
On regex101 this regexp \\(?:input|include)\{([^}]+)\} seems to work fine.
You are looking for Positive lookbehind and positive lookahead:
@'
Some line
\input{my_folder/my_file}
Other line
'@ | Select-String '(?<=\\input{)[^}]+(?=})' -AllMatches | Foreach {$_.matches}
Result
Groups : {0}
Success : True
Name : 0
Captures : {0}
Index : 18
Length : 17
Value : my_folder/my_file
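To answer the capture-group part directly: the parentheses do create a group, but writing the whole Match objects to the host shows the full matches. The filenames are still available from group 1 with the original pattern, for example (a sketch, assuming $ms was read with -Raw as above):
$inputs = $ms | Select-String '\\(?:input|include)\{([^}]+)\}' -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Groups[1].Value }   # group 1 holds just the file name
$inputs
And for the flattening step itself, a hedged sketch using [regex]::Replace with a script-block evaluator; it assumes the \input paths are relative to the current directory and that a missing .tex extension should be appended:
$flat = [regex]::Replace($ms, '\\input\{([^}]+)\}', {
    param($m)
    $file = $m.Groups[1].Value
    if (-not [IO.Path]::HasExtension($file)) { $file += '.tex' }   # assumption: extension omitted in \input
    Get-Content $file -Raw   # the file's content replaces the \input{...} occurrence
})
Set-Content -Path 'ms_flat.tex' -Value $flat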

.txt Log File Data Extraction Output to CSV with REGEX

I have asked this question before, and LotPings came up with a perfect answer. When speaking to the user this relates to, it turned out I had only been given half the information in the first place!
Now that I know exactly what is required, I will explain the scenario again...
Things to bear in mind:
The Terminal will always be A followed by 3 digits, i.e. A123
The User ID is at the top of the log file and only appears once; it will always start with 89 and be six digits long. The line will always start SELECTED FOR OPERATOR 89XXXX
There are two date patterns in the file (one is the date of the search, the other the DOB), and each needs extracting to a separate column. Not all records have a DOB, and some only have the year.
The Enquirer doesn't always begin with a 'C' and needs the whole of the rest of the line.
The search result always has 'Enquiry', and the extraction comes after that.
Here is the log file
L TRANSACTIONS LOGGED FROM 01/05/2018 0001 TO 31/05/2018 2359
SELECTED FOR OPERATOR 891234
START TERMINAL USER ENQUIRER TERMINAL IP
========================================================================================================================
01/05/18 1603 A555 CART87565 46573 RBCO NPC SERVICES GW/10/0043
SEARCH ENQUIRY RECORD NO : S48456/06P CHAPTER CODE =
RECORD DISPLAYED : S48853/98D
PRINT REQUESTED : SINGLE RECORD
========================================================================================================================
03/05/18 1107 A555 CERT16574 BTD/54/1786 16475
REF ENQUIRY DHF ID : 58/94710W CHAPTER CODE =
RECORD DISPLAYED : S585988/84H
========================================================================================================================
24/05/18 1015 A555 CERT15473 19625 CBRS DDS SERVICES NM/18/0199
IMAGE ENQUIRY NAME : TREVOR SMITH CHAPTER CODE =
DATE OF BIRTH : / /1957
========================================================================================================================
24/05/18 1025 A555 CERT15473 15325 CBRS DDS SERVICES NM/12/0999
REF ENQUIRY DDS ID : 04/102578R CHAPTER CODE =
========================================================================================================================
The log above is an example showing what needs to be extracted and under which header, going into a CSV with one column per field.
The PowerShell script LotPings wrote works perfectly; I just need the User ID to be extracted from the top line, the script to account for not all records having a DOB, and for there being more than one type of enquiry, i.e. Ref Enquiry, Search Enquiry, Image Enquiry.
$FileIn = '.\SO_51209341_data.txt'
$TodayCsv = '.\SO_51209341_data.csv'
$RE1 = [RegEx]'(?m)(?<Date>\d{2}\/\d{2}\/\d{2}) (?<Time>\d{4}) +(?<Terminal>A\d{3}) +(?<User>C[A-Z0-9]+) +(?<Enquirer>.*)$'
$RE2 = [RegEx]'\s+SEARCH REF\s+NAME : (?<Enquiry>.+?) (PAGE|CHAPTER) CODE ='
$RE3 = [RegEx]'\s+DATE OF BIRTH : (?<DOB>[0-9 /]+?/\d{4})'
$Sections = (Get-Content $FileIn -Raw) -split "={30,}`r?`n" -ne ''
$Csv = ForEach($Section in $Sections){
    $Row = @{} | Select-Object Date, Time, Terminal, User, Enquirer, Enquiry, DOB
    $Cnt = 0
    if ($Section -match $RE1) {
        ++$Cnt
        $Row.Date     = $Matches.Date
        $Row.Time     = $Matches.Time
        $Row.Terminal = $Matches.Terminal
        $Row.User     = $Matches.User
        $Row.Enquirer = $Matches.Enquirer.Trim()
    }
    if ($Section -match $RE2) {
        ++$Cnt
        $Row.Enquiry = $Matches.Enquiry
    }
    if ($Section -match $RE3){
        ++$Cnt
        $Row.DOB = $Matches.DOB
    }
    if ($Cnt -eq 3) {$Row}
}
$csv | Format-Table
$csv | Export-Csv $Todaycsv -NoTypeInformation
With such precise data the first answer could have been:
## Q:\Test\2018\07\12\SO_51311417.ps1
$FileIn = '.\SO_51311417_data.txt'
$TodayCsv = '.\SO_51311417_data.csv'
$RE0 = [RegEx]'SELECTED FOR OPERATOR\s+(?<UserID>\d{6})'
$RE1 = [RegEx]'(?m)(?<Date>\d{2}\/\d{2}\/\d{2}) (?<Time>\d{4}) +(?<Terminal>A\d{3}) +(?<Enquirer>.*)$'
$RE2 = [RegEx]'\s+(SEARCH|REF|IMAGE) ENQUIRY\s+(?<SearchResult>.+?)\s+(PAGE|CHAPTER) CODE'
$RE3 = [RegEx]'\s+DATE OF BIRTH : (?<DOB>[0-9 /]+?/\d{4})'
$Sections = (Get-Content $FileIn -Raw) -split "={30,}`r?`n" -ne ''
$UserID = "n/a"
$Csv = ForEach($Section in $Sections){
    If ($Section -match $RE0){
        $UserID = $Matches.UserID
    } Else {
        $Row = @{} | Select-Object Date,Time,Terminal,UserID,Enquirer,SearchResult,DOB
        $Cnt = 0
        If ($Section -match $RE1){
            $Row.Date     = $Matches.Date
            $Row.Time     = $Matches.Time
            $Row.Terminal = $Matches.Terminal
            $Row.Enquirer = $Matches.Enquirer.Trim()
            $Row.UserID   = $UserID
        }
        If ($Section -match $RE2){
            $Row.SearchResult = $Matches.SearchResult
        }
        If ($Section -match $RE3){
            $Row.DOB = $Matches.DOB
        }
        $Row
    }
}
$csv | Format-Table
$csv | Export-Csv $Todaycsv -NoTypeInformation
Sample output
Date Time Terminal UserID Enquirer SearchResult DOB
---- ---- -------- ------ -------- ------------ ---
01/05/18 1603 A555 891234 CART87565 46573 RBCO NPC SERVICES GW/10/0043 RECORD NO : S48456/06P
03/05/18 1107 A555 891234 CERT16574 BTD/54/1786 16475 DHF ID : 58/94710W
24/05/18 1015 A555 891234 CERT15473 19625 CBRS DDS SERVICES NM/18/0199 NAME : TREVOR SMITH / /1957
24/05/18 1025 A555 891234 CERT15473 15325 CBRS DDS SERVICES NM/12/0999 DDS ID : 04/102578R

How to get each logged-in user's status details

$dat = query user /server:$SERVER
This query gives the data below:
USERNAME SESSIONNAME ID STATE IDLE TIME LOGON TIME
>vm82958 console 1 Active 1:28 2/9/2018 9:18 AM
adminhmc 2 Disc 1:28 2/13/2018 10:25 AM
nn82543 3 Disc 2:50 2/13/2018 3:07 PM
I would like to get each individual user's details, like STATE, USERNAME, and ID. I tried the code below, but it is not giving any data:
foreach($proc in $dat) {
    $proc.STATE # This is not working; the command gives no data.
    $proc.ID    # This is not working; the command gives no data.
}
Please help me on this.
The result of $dat.GetType() is:
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True String System.Object
This is very similar to this StackOverflow post, but you have blank fields in your data.
One solution is to deal with this first. An example is below, but it may break given data that is very different from your example. For a more robust and complete solution, see Matt's comment.
# replace 20 spaces or more with TWO commas, because it signifies a missing field
$dat2 = $dat.Trim() -replace '\s{20,}', ',,'
# replace 2 spaces or more with a single comma, then convert the lines to objects
$datTable = $dat2.Trim() -replace '\s{2,}', ',' | ConvertFrom-Csv
foreach($proc in $datTable) {
$proc.STATE
$proc.ID
}
Another option is to use fixed columns with String.Insert, like this:
$content = quser /server:$SERVER
$columns = 14,42,46,54,65 | Sort -Descending
$Delimiter = ','
$dat = $content | % {
    $line = $_
    $columns | % {
        $line = $line.Insert($_, $Delimiter)
    }
    $line -replace '\s'
} |
ConvertFrom-Csv -Delimiter $Delimiter
And Then:
foreach($proc in $dat) {
$proc.STATE
$proc.ID # Will show the relevant Data
}

Convert flat file using mapping to table (e.g. CSV)

I have several hundred entries like the following in a text file:
DisplayName : John, Smith
UPN : MY3043241@domain.local
Status : DeviceOk
DeviceID : ApplC39HJ3JPDTF9
DeviceEnableOutboundSMS : False
DeviceMobileOperator :
DeviceAccessState : Allowed
DeviceAccessStateReason : Global
DeviceAccessControlRule :
DeviceType : iPhone
DeviceUserAgent : Apple-iPhone4C1/902.206
DeviceModel : iPhone
... about 1500 entries like the above, with a blank line between each.
I'm looking to create a table with the following headers from the above:
DisplayName,UPN,Status,DeviceID,DeviceEnableOutboundSMS,DeviceMobileOperator,DeviceAccessState,DeviceAccessStateReason,DeviceAccessControlRule,DeviceType,DeviceUserAgent,DeviceModel
The question is: is there a tool or some easy way to do this in Excel or another application? I know it is an easy task to write a simple algorithm, but unfortunately I cannot go that route. PowerShell would be an option, but I'm not good at it, so if you have any tips on how to approach it that way, please let me know.
Although I am a big fan of PowerShell one-liners, one wouldn't be of much help to someone trying to learn or start out with it, and even less so for getting buy-in in a corporate setting.
I have written a cmdlet, documentation included, to get you started.
function Import-UserDevice {
    Param
    (
        # Path of the text file we are importing records from.
        [string] $Path
    )

    if (-not (Test-Path -Path $Path)) { throw "Data file not found: $Path" }

    # Use the StreamReader for efficiency.
    $reader = [System.IO.File]::OpenText($Path)

    # Create the initial record.
    $entry = New-Object -TypeName psobject

    while (-not $reader.EndOfStream) {
        # Trimming is necessary to remove empty spaces.
        $line = $reader.ReadLine().Trim()

        # An empty line indicates we need to start a new record.
        if ($line.Length -le 0 -and -not $reader.EndOfStream) {
            # Output the completed record and prepare a new record.
            $entry
            $entry = New-Object -TypeName psobject
            continue
        }

        # Split the line on ':' to get the property name and value.
        $entry | Add-Member -MemberType NoteProperty -Name $line.Split(':')[0].Trim() -Value $line.Split(':')[1].Trim()
    }

    # Output the residual record.
    $entry

    # Close the file.
    $reader.Close()
}
Here's an example of how you could use it to export records to CSV.
Import-UserDevice -Path C:\temp\data.txt | Export-Csv C:\TEMP\report.csv -NoTypeInformation
PowerShell answer here... I used a test file, C:\Temp\test.txt, which contains:
DisplayName : John, Smith
UPN : MY3043241#domain.local
Status : DeviceOk
DeviceID : ApplC39HJ3JPDTF9
DeviceEnableOutboundSMS : False
DeviceMobileOperator :
DeviceAccessState : Allowed
DeviceAccessStateReason : Global
DeviceAccessControlRule :
DeviceType : iPhone
DeviceUserAgent : Apple-iPhone4C1/902.206
DeviceModel : iPhone
DisplayName : Mary, Anderson
UPN : AR456789@domain.local
Status : DeviceOk
DeviceID : ApplC39HJ3JPDTF8
DeviceEnableOutboundSMS : False
DeviceMobileOperator :
DeviceAccessState : Allowed
DeviceAccessStateReason : Global
DeviceAccessControlRule :
DeviceType : iPhone
DeviceUserAgent : Apple-iPhone4C1/902.206
DeviceModel : iPhone
So that I could have multiple records to parse. I then ran it against this script, which creates an empty array $users and gets the content of that file 13 lines at a time (12 fields + the empty line). It creates a custom object with no properties; then, for each of the 13 lines, if the line is not empty, it adds a new property to that object, where the name is everything before the : and the value is everything after it (with trailing spaces removed from both the name and the value). Finally it adds that object to the array.
$users = @()
gc c:\temp\test.txt -ReadCount 13 | % {
    $User = new-object psobject
    $_ | ? {!([string]::IsNullOrEmpty($_))} | % {
        Add-Member -InputObject $User -MemberType NoteProperty -Name ($_.Split(":")[0].TrimEnd(" ")) -Value ($_.Split(":")[1].TrimEnd(" "))
    }
    $users += $User
}
Once you have the array $Users filled you could do something like:
$Users | Export-CSV C:\Temp\NewFile.csv -notypeinfo
That gives you the CSV you would expect.
For an input file with just a couple hundred records I'd probably read the entire file, split it at empty lines, split the text blocks at line breaks, and the lines at colons. Somewhat like this:
$infile = 'C:\path\to\input.txt'
$outfile = 'C:\path\to\output.csv'
[IO.File]::ReadAllText($infile).Trim() -split "`r`n`r`n" | % {
    $o = New-Object -Type PSObject
    $_.Trim() -split "`r`n" | % {
        $a = "$_ :" -split '\s*:\s*'
        $o | Add-Member -Type NoteProperty -Name $a[0] -Value $a[1]
    }
    $o
} | Export-Csv $outfile -NoType