Parse info from a text file - powershell

I am running this command to pull the last line of a log file:
Get-Content c:\temp\MigrationJobStatus-20171020-123839-515.log |
Select-Object -Last 1
The results do give me the last line, but now I need to filter the results:
10/20/2017 12:38:56 PM Information [Event]: [JobEnd], [JobId]: [70b82296-b6e2-4539-897d-c46384619059], [Time]: [10/20/2017 12:38:49.074], [FilesCreated]: [0], [BytesProcessed]: [0], [ObjectsProcessed]: [34], [TotalExpectedSPObjects]: [34], [TotalErrors]: [19], [TotalWarnings]: [3], [TotalRetryCount]: [0], [MigrationType]: [None], [MigrationDirection]: [Import], [CreatedOrUpdatedFileStatsBySize]: [{}], [ObjectsStatsByType]: [{"SPUser":{"Count":1,"TotalTime":0,"AccumulatedVersions":0,"ObjectsWithVersions":0},"SPFolder": "Count":4,"TotalTime":629,"AccumulatedVersions":0,"ObjectsWithVersions":0},"SPDocumentLibrary":"Count":1,"TotalTime":68,"AccumulatedVersions":0,"ObjectsWithVersions":0},"SPFile":{"Count":13,"TotalTime":0,"AccumulatedVersions":0,"ObjectsWithVersions":0},"SPListItem":{"Count":16,"TotalTime":2240,"AccumulatedVersions":0,"ObjectsWithVersions":0}}], [CorrelationId]: [7bbf249e-701a-4000-8eee-c4a7ef172063]
I need to be able to pull the following and export to CSV:
[JobId]: [70b82296-b6e2-4539-897d-c46384619059]
[FilesCreated]: [0]
[BytesProcessed]: [0]
[ObjectsProcessed]: [34]
[TotalExpectedSPObjects]: [34]
[TotalErrors]: [19]
[TotalWarnings]: [3]
Can someone give me some ideas on how to accomplish this?
I am doing a OneDrive for Business migration and need to pull the results of the Get-SPOMigrationJobProgress log for a few thousand users.

Add the other fields you need to $fields, then save the results with Out-File instead of Write-Host:
$results = ""
$fields = @("[JobId]", "[FilesCreated]")
$items = get-content c:\temp\MigrationJobStatus-20171020-123839-515.log | select-object -last 1 | %{ $_.Split(",")}
foreach($item in $items)
{
$field = ($item.Split(":")[0]).Trim()
if($fields.Contains($field)) { $results+= "$item`r`n" }
}
Write-Host $results

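You can split the line on commas and grab the fields you need by index. After the split, index 1 holds [JobId] and indexes 3 through 8 hold [FilesCreated] through [TotalWarnings]: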
$text = get-content c:\temp\MigrationJobStatus-20171020-123839-515.log | select-object -last 1
$text = ($text -split ",").Trim(" ")
$csvtext = @"
$($text[3])
$($text[4])
$($text[5])
$($text[6])
$($text[7])
$($text[8])
"@
$csvtext | Out-File ".\logfile.csv"

You can get to the fields that you want by using regular expression and then create a psobject from each match:
$regexPattern = '\[([^]]+)\]: \[([^]]+)\]'
$result = Get-Content c:\temp\MigrationJobStatus-20171020-123839-515.log |
Select-Object -Last 1 |
Select-String -Pattern $regexPattern -AllMatches |
ForEach-Object { $_.Matches.Value } |
ForEach-Object { $_ -match $regexPattern |
Select-Object @{n='Name';e={$Matches[1]}},@{n='Value';e={$Matches[2]}} }
You can filter down the resulting object collection with Where-Object and use Export-Csv to get your result into a csv file.
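As a rough sketch of that last step, the name/value pairs can be folded into a single object per log file so that each job becomes one CSV row. The `$line` sample below is shortened from the log entry in the question, and the variable names `$wanted`, `$pairs` and `$row` are made up for illustration. It assumes PowerShell 3.0 or later for `[ordered]` and `[pscustomobject]`.

```powershell
# Sample line shortened from the question; in practice it would come from
# Get-Content <log file> | Select-Object -Last 1
$line = '10/20/2017 12:38:56 PM Information [Event]: [JobEnd], [JobId]: [70b82296-b6e2-4539-897d-c46384619059], [Time]: [10/20/2017 12:38:49.074], [FilesCreated]: [0], [BytesProcessed]: [0], [ObjectsProcessed]: [34], [TotalExpectedSPObjects]: [34], [TotalErrors]: [19], [TotalWarnings]: [3], [TotalRetryCount]: [0]'

# Field names requested in the question (without the surrounding brackets)
$wanted = 'JobId', 'FilesCreated', 'BytesProcessed', 'ObjectsProcessed',
          'TotalExpectedSPObjects', 'TotalErrors', 'TotalWarnings'

# Collect every wanted "[Name]: [Value]" pair, keeping the order found in the line
$pairs = [ordered]@{}
foreach ($m in [regex]::Matches($line, '\[([^\]]+)\]: \[([^\]]*)\]')) {
    $name = $m.Groups[1].Value
    if ($wanted -contains $name) {
        $pairs[$name] = $m.Groups[2].Value
    }
}

# One object per log file, so one CSV row per migration job
$row = [pscustomobject]$pairs
$row | ConvertTo-Csv -NoTypeInformation
```

To build one file for many logs, the last line could be swapped for `Export-Csv` with `-Append` inside a loop over the log files.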

Related

CSV file - count distinct, group by, sum

I have a file that looks like the following;
- Visitor ID,Revenue,Channel,Flight
- 1234,100,Email,BA123
- 2345,200,PPC,BA112
- 456,150,Email,BA456
I need to produce a file that contains;
The count of distinct Visitor IDs (3)
The total revenue (450)
The count of each Channel
Email 2
PPC 1
The count of each Flight
BA123 1
BA112 1
BA456 1
So far I have the following code, however when executing this on the 350MB file, it takes too long and in some cases breaks the memory limit. As I have to run this function on multiple columns, it is going through the file many times. I ideally need to do this in one file pass.
$file = 'log.txt'
function GroupBy($columnName)
{
$objects = Import-Csv -Delimiter "`t" $file | Group-Object $columnName |
Select-Object @{n=$columnName;e={$_.Group[0].$columnName}}, Count
for($i=0;$i -lt $objects.count;$I++) {
$line += $columnName +"|"+$objects[$I]."$columnName" +"|Count|"+ $objects[$I].'Count' + $OFS
}
return $line
}
$finalOutput += GroupBy "Channel"
$finalOutput += GroupBy "Flight"
Write-Host $finalOutput
Any help would be much appreciated.
Thanks,
Craig
The fact that you are importing the CSV again for each column is what is killing your script. Load the data once, then reuse it. For example:
$data = Import-Csv .\data.csv
$flights = $data | Group-Object Flight -NoElement | ForEach-Object {[PsCustomObject]@{Flight=$_.Name;Count=$_.Count}}
$visitors = ($data | Group-Object "Visitor ID" | Measure-Object).Count
$revenue = ($data | Measure-Object Revenue -Sum).Sum
$channel = $data | Group-Object Channel -NoElement | ForEach-Object {[PsCustomObject]@{Channel=$_.Name;Count=$_.Count}}
You can display the data like this:
"Revenue : $revenue"
"Visitors: $visitors"
$flights | Format-Table -AutoSize
$channel | Format-Table -AutoSize
This will probably work, using hashtables.
Pros: it will be faster and use less memory.
Cons: it is far less readable than Group-Object and requires more code.
To make it even less memory-hungry, read the CSV file line by line.
$data = Import-CSV -Path "C:\temp\data.csv" -Delimiter ","
$DistinctVisitors = @{}
$TotalRevenue = 0
$ChannelCount = @{}
$FlightCount = @{}
$data | ForEach-Object {
$DistinctVisitors[$_.'Visitor ID'] = $true
$TotalRevenue += $_.Revenue
if (-not $ChannelCount.ContainsKey($_.Channel)) {
$ChannelCount[$_.Channel] = 0
}
$ChannelCount[$_.Channel] += 1
if (-not $FlightCount.ContainsKey($_.Flight)) {
$FlightCount[$_.Flight] = 0
}
$FlightCount[$_.Flight] += 1
}
$DistinctVisitorsCount = $DistinctVisitors.Keys | Measure-Object | Select-Object -ExpandProperty Count
Write-Output "The count of distinct Visitor IDs $DistinctVisitorsCount"
Write-Output "The total revenue $TotalRevenue"
Write-Output "The Count of each Channel"
$ChannelCount.Keys | ForEach-Object {
Write-Output "$_ $($ChannelCount[$_])"
}
Write-Output "The count of each Flight"
$FlightCount.Keys | ForEach-Object {
Write-Output "$_ $($FlightCount[$_])"
}
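The "line by line" idea mentioned above can be sketched by piping the CSV cmdlet straight into ForEach-Object, so that rows are processed one at a time instead of being held in `$data`. This is only an illustration: `$sample` stands in for the real file (with a file you would pipe from `Import-Csv -Path ...` instead), and the variable names are hypothetical.

```powershell
# Stand-in for the real file; with a file, pipe from Import-Csv instead
$sample = @'
Visitor ID,Revenue,Channel,Flight
1234,100,Email,BA123
2345,200,PPC,BA112
456,150,Email,BA456
'@

$visitors = @{}
$channels = @{}
$flights  = @{}
$revenue  = [decimal]0

# Rows stream through the pipeline one at a time; only the counters are kept
$sample | ConvertFrom-Csv | ForEach-Object {
    $visitors[$_.'Visitor ID'] = $true
    $revenue += [decimal]$_.Revenue
    $channels[$_.Channel]++      # a missing key starts at $null, so ++ yields 1
    $flights[$_.Flight]++
}

"Distinct visitors: $($visitors.Count)"
"Total revenue: $revenue"
$channels.GetEnumerator() | ForEach-Object { "Channel $($_.Key): $($_.Value)" }
$flights.GetEnumerator()  | ForEach-Object { "Flight $($_.Key): $($_.Value)" }
```

Memory use then depends on the number of distinct keys rather than on the number of rows.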

Loop through csv compare content with an array and then add content to csv

I don't know how to append a string to CSV. What am I doing:
I have two csv files. One with a list of host-names and id's and another one with a list of host-names and some numbers.
Example file 1:
Hostname | ID
IWBW140004 | 3673234
IWBW130023 | 2335934
IWBW120065 | 1350213
Example file 2:
ServiceCode | Hostname | ID
4 | IWBW120065 |
4 | IWBW140004 |
4 | IWBW130023 |
Now I read the content of file 1 in a two dimensional array:
$pcMatrix = @(,@())
Import-Csv $outputFile |ForEach-Object {
foreach($property in $_.PSObject.Properties){
$pcMatrix += ,($property.Value.Split(";")[1],$property.Value.Split(";")[2])
}
}
Then I read the content of file 2 and compare it with my array:
Import-Csv $Group".csv" | ForEach-Object {
foreach($property in $_.PSObject.Properties){
for($i = 0; $i -lt $pcMatrix.Length; $i++){
if($pcMatrix[$i][0] -eq $property.Value.Split('"')[1]){
#Add-Content here
}
}
}
}
What do I need to do, to append $pcMatrix[$i][1] to the active column in file 2 in the row ID?
Thanks for your suggestions.
Yanick
It seems like you are over-complicating this task.
If I understand you correctly, you want to populate the ID column in file two, with the ID that corresponds to the correct hostname from file 1. The easiest way to do that, is to fill all the values from the first file into a HashTable and use that to lookup the ID for each row in the second file:
# Read the first file and populate the HashTable:
$File1 = Import-Csv .\file1.txt -Delimiter "|"
$LookupTable = @{}
$File1 |ForEach-Object {
$LookupTable[$_.Hostname] = $_.ID
}
# Now read the second file and update the ID values:
$File2 = Import-Csv .\file2.txt -Delimiter "|"
$File2 |ForEach-Object {
$_.ID = $LookupTable[$_.Hostname]
}
# Then write the updated rows back to a new CSV file:
$File2 | Export-CSV -Path .\file3.txt -NoTypeInformation -Delimiter "|"

Combining like objects in an array

I am attempting to analyze a group of text files (MSFTP logs) and do counts of IP addresses that have submitted bad credentials. I think I have it worked out except I don't think that the array is passing to/from the function correctly. As a result, I get duplicate entries if the same IP appears in multiple log files. What am I doing wrong?
Function LogBadAttempt($FTPLog,$BadPassesArray)
{
$BadPassEx="PASS - 530"
Foreach($Line in $FTPLog)
{
if ($Line -match $BadPassEx)
{
$IP=($Line.Split(' '))[1]
if($BadPassesArray.IP -contains $IP)
{
$CurrentIP=$BadPassesArray | Where-Object {$_.IP -like $IP}
[int]$CurrentCount=$CurrentIP.Count
$CurrentCount++
$CurrentIP.Count=$CurrentCount
}else{
$info=@{"IP"=$IP;"Count"='1'}
$BadPass=New-Object -TypeName PSObject -Property $info
$BadPassesArray += $BadPass
}
}
}
return $BadPassesArray
}
$BadPassesArray=@()
$FTPLogs = Get-Childitem \\ftpserver\MSFTPSVC1\test
$Result = ForEach ($LogFile in $FTPLogs)
{
$FTPLog=Get-Content ($LogFile.fullname)
LogBadAttempt $FTPLog
}
$Result | Export-csv C:\Temp\test.csv -NoTypeInformation
The result looks like...
Count IP
7 209.59.17.20
20 209.240.83.135
18441 209.59.17.20
13059 200.29.3.98
and would like it to combine the entries for 209.59.17.20
You're making this way too complicated. Process the files in a pipeline and use a hashtable to count the occurrences of each IP address:
$BadPasswords = @{}
Get-ChildItem '\\ftpserver\MSFTPSVC1\test' | Get-Content | ? {
$_ -like '*PASS - 530*'
} | % {
$ip = ($_ -split ' ')[1]
$BadPasswords[$ip]++
}
$BadPasswords.GetEnumerator() |
select @{n='IP';e={$_.Name}}, @{n='Count';e={$_.Value}} |
Export-Csv 'C:\Temp\test.csv' -NoType

powershell - rename csv file based on a column value

My CSV file has contents like this:
currentTime, SeqNum, Address
1381868225469, 0,
1381868226491, 1, 38:1c:4a:0:8d:d
1381868227493, 1,
1381868228513, 2, 38:1c:4a:0:8d:d
1381868312825, 43,
1381868312916, 1694564736, 3a:1c:4a:1:a1:98
1381868312920, 1694564736, 3a:1c:4a:1:a1:98
1381868312921, 44,
Depending on whether the 3rd column is empty or not, I want to separate the file into 2 or more files: those with lines containing the 3rd column (the file name should contain the 3rd column) and one without the 3rd column.
Example output:
**File0.txt**
1381868225469, 0,
1381868227493, 1,
1381868312825, 43,
1381868312921, 44,
**File1-381c4a08dd.txt**
1381868226491, 1, 38:1c:4a:0:8d:d
1381868228513, 2, 38:1c:4a:0:8d:d
**File2-3a1c4a1a198.txt**
1381868312916, 1694564736, 3a:1c:4a:1:a1:98
1381868312920, 1694564736, 3a:1c:4a:1:a1:98
I referred to the Stack Overflow questions HERE and HERE to get most of my work done. However, I want to rename my file based on the 3rd column. Since Windows does not accept ":" in a file name, I want to remove the ":" before attaching the 3rd column to my file name. I want my file name to look like this:
FileName-381c4a08dd.txt
How do I go about this? This is my attempt at it so far:
import-csv File.txt | group-object Address | foreach-object {
$_.group | select-object currentTime, SeqNum, Address | convertto-csv -NoTypeInformation | %{$_ -replace '"', ""} | out-file File-$($_.Address.remove(':')).txt -fo -en ascii
}
Try something like this:
$csv = Import-Csv 'C:\path\to\file.txt'
$n = 0
# export rows w/o address
$outfile = "File${n}.txt"
$csv | ? { $null, '' -contains $_.Address } |
Export-Csv $outfile -NoTypeInformation
# export rows w/ address
$csv | ? { $null, '' -notcontains $_.Address } | Group-Object Address | % {
$n++
$outfile = "File${n}-" + $_.Name.Replace(':', '') + '.txt'
$_.Group | Export-Csv $outfile -NoTypeInformation
}
The filter $null, '' -contains $_.Address is required, because the address record will be $null when you have an empty address and no trailing line break in the last line of the input file.
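As a small illustration of how that filter behaves, the snippet below runs it against a null value, an empty string and a real address. The sample values and variable names are chosen here only for demonstration.

```powershell
# Evaluate the filter for $null, an empty string, and a populated address
$results = foreach ($address in $null, '', '38:1c:4a:0:8d:d') {
    # True only when $address is $null or the empty string
    @($null, '') -contains $address
}

$results   # one Boolean per test value, in the same order
```

Rows for which the expression is true go to the file without an address; the `-notcontains` form selects the remaining rows.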
If you want the output files to be created without header line you need to replace
... | Export-Csv $outfile -NoTypeInformation
with
... | ConvertTo-Csv -NoTypeInformation | select -Skip 1 | Out-File $outfile

I need help formatting output with PowerShell's Out-File cmdlet

I have a series of documents that are going through the following function designed to count word occurrences in each document. This function works fine outputting to the console, but now I want to generate a text file containing the information, with the file name prepended to each word in the list.
My current console output is:
"processing document1 with x unique words occuring as follows"
"word1 12"
"word2 8"
"word3 3"
"word4 4"
"word5 1"
I want a delimited file in this format:
document1;word1;12
document1;word2;8
document1;word3;3
document1;word4;4
document1;word5;1
document2;word1;16
document2;word2;11
document2;word3;9
document2;word4;9
document2;word5;13
While the function below gets me the lists of words and occurrences, I'm having a hard time figuring out where or how to insert the filename variable so that it prints at the head of each line. MSDN has been less than helpful, and most of the places I try to insert the variable result in errors (see below).
function Count-Words ($docs) {
$document = get-content $docs
$document = [string]::join(" ", $document)
$words = $document.split(" `t",[stringsplitoptions]::RemoveEmptyEntries)
$uniq = $words | sort -uniq
$words | % {$wordhash=@{}} {$wordhash[$_] += 1}
Write-Host $docs "contains" $wordhash.psbase.keys.count "unique words distributed as follows."
$frequency = $wordhash.psbase.keys | sort {$wordhash[$_]}
-1..-25 | %{ $frequency[$_]+" "+$wordhash[$frequency[$_]]} | Out-File c:\out-file-test.txt -append
$grouped = $words | group | sort count
Do I need to create a string to pass to the out-file cmdlet? is this just something I've been putting in the wrong place on the last few tries? I'd like to understand WHY it's going in a particular place as well. Right now I'm just guessing, because I know I have no idea where to put the out-file to achieve my selected results.
I've tried formatting my command per powershell help, using -$docs and -FilePath, but each time I add anything to the out-file above that runs successfully, I get the following error:
Out-File : Cannot validate argument on parameter 'Encoding'. The argument "c:\out-file-test.txt" does not belong to the set "unicode,utf7,utf8,utf32,ascii,bigendianunicode,default,oem" specified by the ValidateSet attribute. Supply an argument that is in the set and then try the command again.
At C:\c.ps1:39 char:71
+ -1..-25 | %{ $frequency[$_]+" "+$wordhash[$frequency[$_]]} | Out-File <<<< -$docs -width 1024 c:\users\x46332\count-test.txt -append
+ CategoryInfo : InvalidData: (:) [Out-File], ParameterBindingValidationException
+ FullyQualifiedErrorId : ParameterArgumentValidationError,Microsoft.PowerShell.Commands.OutFileCommand
I rewrote most of your code. Working with objects makes it easier to format the output the way you want. This version splits on spaces and groups identical words together. Try this:
Function Count-Words ($paths) {
$output = @()
foreach ($path in $paths) {
$file = Get-ChildItem $path
((Get-Content $file) -join " ").Split(" ", [System.StringSplitOptions]::RemoveEmptyEntries) | Group-Object | Select-Object -Property @{n="FileName";e={$file.BaseName}}, Name, Count | % {
$output += "$($_.FileName);$($_.Name);$($_.Count)"
}
}
$output | Out-File test-out2.txt -Append
}
$filepaths = ".\test.txt", ".\test2.txt"
Count-Words -paths $filepaths
It outputs in the format you asked for (document;word;count). If you want the document name to include the extension, change $file.BaseName to $file.Name. Test output:
test;11;1
test;9;2
test;13;1
test2;word11;5
test2;word1;4
test2;12;1
test2;word2;2
Slightly different approach:
function Get-WordCounts ($doc)
{
$text_ = [IO.File]::ReadAllText($doc.fullname)
$WordHash = @{}
$text_ -split '\b' -match '\w+'|
foreach {$WordHash[$_]++}
$WordHash.GetEnumerator() |
foreach {
New-Object PSObject -Property @{
Word = $_.Key
Count = $_.Value
}
}
}
$docs = gci c:\testfiles\*.txt |
sort name
&{
foreach ($doc in dir $docs)
{
Get-WordCounts $doc |
sort Count -Descending |
foreach {
(&{$doc.Name;$_.Word;$_.Count}) -join ';'
}
}
} | out-file c:\somedir\wordcounts.txt
Try this:
$docs = @("document1", "document2", ...)
$docs | % {
$doc = $_
Get-Content $doc `
| % { $_.split(" `t",[stringsplitoptions]::RemoveEmptyEntries) } `
| Group-Object `
| select @{n="Document";e={$doc}}, Name, Count
} | Export-CSV output.csv -Delimiter ";" -NoTypeInfo
If you want to make this into a function you could do it like this:
function Count-Words($docs) {
foreach ($doc in $docs) {
Get-Content $doc `
| % { $_.split(" `t",[stringsplitoptions]::RemoveEmptyEntries) } `
| Group-Object `
| select @{n="Document";e={$doc}}, Name, Count
}
}
$files = @("document1", "document2", ...)
Count-Words $files | Export-CSV output.csv -Delimiter ";" -NoTypeInfo