How to export two variables into the same CSV as a join via PowerShell?

I have a PowerShell script that uses the PoshWSUS module, like below:
$FileOutput = "C:\WSUSReport\WSUSReport.csv"
$ProcessLog = "C:\WSUSReport\QueryLog2.txt"
$WSUSServers = "C:\WSUSReport\Computers.txt"
$WSUSPort = "8530"
import-module poshwsus
ForEach ($Server in Get-Content $WSUSServers)
{
    & connect-poshwsusserver $Server -port $WSUSPort | out-file $ProcessLog -append
    $r1 = & Get-PoshWSUSClient | select @{name="Computer";expression={$_.FullDomainName}},@{name="LastUpdated";expression={if ([datetime]$_.LastReportedStatusTime -gt [datetime]"1/1/0001 12:00:00 AM") {$_.LastReportedStatusTime} else {$_.LastSyncTime}}}
    $r2 = & Get-PoshWSUSUpdateSummaryPerClient -UpdateScope (new-poshwsusupdatescope) -ComputerScope (new-poshwsuscomputerscope) | Select Computer,NeededCount,DownloadedCount,NotApplicableCount,NotInstalledCount,InstalledCount,FailedCount
}
What I need to do is export a CSV output that includes the results joined together (like an "inner join"), with these columns:
Computer, NeededCount, DownloadedCount, NotApplicableCount, NotInstalledCount, InstalledCount, FailedCount, LastUpdated
I have tried using the line below inside the foreach, but it didn't work as I expected.
$r1 + $r2 | export-csv -NoTypeInformation -append $FileOutput
I'd appreciate any help or advice.
EDIT --> The output I've got:
ComputerName LastUpdate
X A
Y B
X
Y
So there is no error; the first two rows come from $r2 and the last two rows from $r1. It is not joining the tables as I expected.
Thanks!

I've found my guidance in this post: Inner Join in PowerShell (without SQL)
I modified my query accordingly, as below, and it works like a charm.
$FileOutput = "C:\WSUSReport\WSUSReport.csv"
$ProcessLog = "C:\WSUSReport\QueryLog.txt"
$WSUSServers = "C:\WSUSReport\Computers.txt"
$WSUSPort = "8530"
import-module poshwsus
function Join-Records($tab1, $tab2){
    $prop1 = $tab1 | select -First 1 | % {$_.PSObject.Properties.Name} #properties from t1
    $prop2 = $tab2 | select -First 1 | % {$_.PSObject.Properties.Name} #properties from t2
    $join = $prop1 | ? {$prop2 -Contains $_}
    $unique1 = $prop1 | ? {$join -notcontains $_}
    $unique2 = $prop2 | ? {$join -notcontains $_}
    if ($join) {
        $tab1 | % {
            $t1 = $_
            $tab2 | % {
                $t2 = $_
                foreach ($prop in $join) {
                    if (!$t1.$prop.Equals($t2.$prop)) { return; }
                }
                $result = @{}
                $join | % { $result.Add($_, $t1.$_) }
                $unique1 | % { $result.Add($_, $t1.$_) }
                $unique2 | % { $result.Add($_, $t2.$_) }
                [PSCustomObject]$result
            }
        }
    }
}
ForEach ($Server in Get-Content $WSUSServers)
{
    & connect-poshwsusserver $Server -port $WSUSPort | out-file $ProcessLog -append
    $r1 = & Get-PoshWSUSClient | select @{name="Computer";expression={$_.FullDomainName}},@{name="LastUpdated";expression={if ([datetime]$_.LastReportedStatusTime -gt [datetime]"1/1/0001 12:00:00 AM") {$_.LastReportedStatusTime} else {$_.LastSyncTime}}}
    $r2 = & Get-PoshWSUSUpdateSummaryPerClient -UpdateScope (new-poshwsusupdatescope) -ComputerScope (new-poshwsuscomputerscope) | Select Computer,NeededCount,DownloadedCount,NotApplicableCount,NotInstalledCount,InstalledCount,FailedCount
    Join-Records $r1 $r2 | Select Computer,NeededCount,DownloadedCount,NotApplicableCount,NotInstalledCount,InstalledCount,FailedCount,LastUpdated | export-csv -NoTypeInformation -append $FileOutput
}

I think this could be made simpler. Since Select-Object's -Property parameter accepts an array of values, you can create an array of the properties you want to display. The array can be constructed by comparing your two objects' properties and outputting a unique list of those properties.
$selectProperties = $r1.psobject.properties.name | Compare-Object $r2.psobject.properties.name -IncludeEqual -PassThru
$r1,$r2 | Select-Object -Property $selectProperties
Compare-Object by default will output only differences between a reference object and a difference object. Adding the -IncludeEqual switch displays different and equal comparisons. Adding the -PassThru parameter outputs the actual objects that are compared rather than the default PSCustomObject output.
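For illustration, with two hypothetical objects (the property names and values here are made up, not taken from the WSUS cmdlets) the pieces fit together like this:
$r1 = [pscustomobject]@{ Computer = 'PC01'; LastUpdated = '2024-01-01' }
$r2 = [pscustomobject]@{ Computer = 'PC01'; NeededCount = 3; FailedCount = 0 }
$selectProperties = $r1.psobject.properties.name | Compare-Object $r2.psobject.properties.name -IncludeEqual -PassThru
# $selectProperties now holds Computer, NeededCount, FailedCount and LastUpdated
$r1, $r2 | Select-Object -Property $selectProperties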

Related

Find out Text data in CSV File Numeric Columns in Powershell

I am very new to PowerShell.
I am trying to validate my CSV file by finding out if there is any text value in my numeric fields. I can define which columns are numeric.
My source data looks like this:
ColA ColB ColC ColD
23 23 ff 100
2.30E+01 34 2.40E+01 23
df 33 ss df
34 35 36 37
I need output something like this (only the text values, if found in any column):
ColA ColC ColD
2.30E+01 ff df
df 2.40E+01
ss
I have tried some code but am not getting any results, only some output like this:
System.Object[]
---------------
xxx fff' ddd 3.54E+03
...
This is what I was trying
#
cls
function Is-Numeric ($Value) {
return $Value -match "^[\d\.]+$"
}
$arrResult = @()
$arraycol = @()
$FileCol = @("ColA","ColB","ColC","ColD")
$dif_file_path = "C:\Users\$env:username\desktop\f2.csv"
#Importing CSVs
$dif_file = Import-Csv -Path $dif_file_path -Delimiter ","
############## Test Datatype (Is-Numeric)##########
foreach($col in $FileCol)
{
    foreach ($line in $dif_file) {
        $val = $line.$col
        $isnum = Is-Numeric($val)
        if ($isnum -eq $false) {
            $arrResult += $line.$col
            $arraycol += $col
        }
    }
}
[pscustomobject]@{$arraycol = "$arrResult"} | out-file "C:\Users\$env:username\Desktop\Errors1.csv"
####################
Can someone guide me in the right direction?
Thanks
You can try something like this,
function Is-Numeric ($Value) {
return $Value -match "^[\d\.]+$"
}
$dif_file_path = "C:\Users\$env:username\desktop\f2.csv"
#Importing CSVs
$dif_file = Import-Csv -Path $dif_file_path -Delimiter ","
#$columns = $dif_file | Get-member -MemberType 'NoteProperty' | Select-Object -ExpandProperty 'Name'
# Use this to specify certain columns
$columns = "ColB", "ColC", "ColD"
foreach($row in $dif_file) {
    foreach ($col in $columns) {
        if ($col -in $columns) {
            if (!(Is-Numeric $row.$col)) {
                $row.$col = ""
            }
        }
    }
}
$dif_file | Export-Csv C:\temp\formatted.txt
Look up the names of the columns as you go.
Look up the value of each column in each row and, if it is not numeric, change it to "".
Export the updated file.
I think not displaying columns that have no data creates the challenge here. You can do the following:
$csv = Import-Csv "C:\Users\$env:username\desktop\f2.csv"
$finalprops = [collections.generic.list[string]]@()
$out = foreach ($line in $csv) {
    $props = $line.psobject.properties | Where {$_.Value -notmatch '^[\d\.]+$'} |
        Select-Object -Expand Name
    $props | Where {$_ -notin $finalprops} | Foreach-Object { $finalprops.Add($_) }
    if ($props) {
        $line | Select $props
    }
}
$out | Select-Object ($finalprops | Sort)
Given the nature of Format-Table and tabular output, you only see the properties of the first object in the collection. So if object1 has only ColA, but object2 has ColA and ColB, you will only see ColA.
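A quick way to see this behavior with two throwaway objects (the property names and values here are just for illustration):
$object1 = [pscustomobject]@{ ColA = 'df' }
$object2 = [pscustomobject]@{ ColA = '2.30E+01'; ColB = 'ss' }
$object1, $object2 | Format-Table   # only ColA is shown, because object1 defines the columns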
The output order you want is quite different from the input CSV; you are tracking bad text data not by first occurrence but by column order, which requires some extra steps.
test.csv file contents:
ColA,ColB,ColC,ColD
23,23,ff,100
2.30E+01,34,2.40E+01,23
df,33,ss,df
34,35,36,37
Sample code tested to meet your description:
$csvIn = Import-Csv "$PSScriptRoot\test.csv";
# create working data set with headers in same order as input file
$data = [ordered]@{};
$csvIn[0].PSObject.Properties | foreach {
    $data.Add($_.Name, (New-Object System.Collections.ArrayList));
};
# add fields with text data
$csvIn | foreach {
    $_.PSObject.Properties | foreach {
        if ($_.Value -notmatch '^-?[\d\.]+$') {
            $null = $data[$_.Name].Add($_.Value);
        }
    }
}
$removes = @(); # remove `good` columns with numeric data
$rowCount = 0;  # column with most bad values
$data.GetEnumerator() | foreach {
    $badCount = $_.Value.Count;
    if ($badCount -eq 0) { $removes += $_.Key; }
    if ($badCount -gt $rowCount) { $rowCount = $badCount; }
}
$removes | foreach { $data.Remove($_); }
0..($rowCount - 1) | foreach {
    $h = [ordered]@{};
    foreach ($key in $data.Keys) {
        $h.Add($key, $data[$key][$_]);
    }
    [PSCustomObject]$h;
} |
    Export-Csv -NoTypeInformation -Path "$PSScriptRoot\text-data.csv";
output file contents:
"ColA","ColC","ColD"
"2.30E+01","ff","df"
"df","2.40E+01",
,"ss",
@Jawad, finally I have tried:
function Is-Numeric ($Value) {
return $Value -match "^[\d\.]+$"
}
$arrResult = @()
$columns = "ColA","ColB","ColC","ColD"
$dif_file_path = "C:\Users\$env:username\desktop\f1.csv"
$dif_file = Import-Csv -Path $dif_file_path -Delimiter "," |select $columns
$columns = $dif_file | Get-member -MemberType 'NoteProperty' | Select-Object -ExpandProperty 'Name'
foreach($row in $dif_file) {
    foreach ($col in $columns) {
        $val = $row.$col
        $isnum = Is-Numeric($val)
        if ($isnum -eq $false) {
            $arrResult += $col + " " + $row.$col
        }
    }
}
$arrResult | out-file "C:\Users\$env:username\desktop\Errordata.csv"
I get the correct results in my output file, but the order is very ambiguous, like:
ColA ss
ColB 5.74E+03
ColA ss
ColC rrr
ColB 3.54E+03
ColD ss
ColB 8.31E+03
ColD cc
Any idea how to get a proper format? Thanks.
Note: with your suggested code, I get the complete source file with all data, not just the specific error data.

CSV file - count distinct, group by, sum

I have a file that looks like the following:
Visitor ID,Revenue,Channel,Flight
1234,100,Email,BA123
2345,200,PPC,BA112
456,150,Email,BA456
I need to produce a file that contains:
The count of distinct Visitor IDs (3)
The total revenue (450)
The count of each Channel
Email 2
PPC 1
The count of each Flight
BA123 1
BA112 1
BA456 1
So far I have the following code; however, when executing it against the 350 MB file, it takes too long and in some cases breaks the memory limit. Because I have to run this function on multiple columns, it goes through the file many times. Ideally I need to do this in a single pass over the file.
$file = 'log.txt'
function GroupBy($columnName)
{
    $objects = Import-Csv -Delimiter "`t" $file | Group-Object $columnName |
        Select-Object @{n=$columnName;e={$_.Group[0].$columnName}}, Count
    for ($i = 0; $i -lt $objects.Count; $i++) {
        $line += $columnName + "|" + $objects[$i]."$columnName" + "|Count|" + $objects[$i].'Count' + $OFS
    }
    return $line
}
$finalOutput += GroupBy "Channel"
$finalOutput += GroupBy "Flight"
Write-Host $finalOutput
Any help would be much appreciated.
Thanks,
Craig
The fact that you are importing the CSV again for each column is what is killing your script. Do the loading once, then re-use the data. For example:
$data = Import-Csv .\data.csv
$flights = $data | Group-Object Flight -NoElement | ForEach-Object {[PsCustomObject]@{Flight=$_.Name;Count=$_.Count}}
$visitors = ($data | Group-Object "Visitor ID" | Measure-Object).Count
$revenue = ($data | Measure-Object Revenue -Sum).Sum
$channel = $data | Group-Object Channel -NoElement | ForEach-Object {[PsCustomObject]@{Channel=$_.Name;Count=$_.Count}}
You can display the data like this:
"Revenue : $revenue"
"Visitors: $visitors"
$flights | Format-Table -AutoSize
$channel | Format-Table -AutoSize
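Since the goal is a file rather than console output, the same objects can also be exported; the file paths below are only placeholders:
$flights | Export-Csv 'C:\temp\flight-counts.csv' -NoTypeInformation
$channel | Export-Csv 'C:\temp\channel-counts.csv' -NoTypeInformation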
This will probably work, using hashtables.
Pros: it will be faster and use less memory.
Cons: it is far less readable than Group-Object, and requires more code.
To make it even less memory-hungry, read the CSV file line by line (a sketch of that follows the code below).
$data = Import-CSV -Path "C:\temp\data.csv" -Delimiter ","
$DistinctVisitors = @{}
$TotalRevenue = 0
$ChannelCount = @{}
$FlightCount = @{}
$data | ForEach-Object {
    $DistinctVisitors[$_.'Visitor ID'] = $true
    $TotalRevenue += $_.Revenue
    if (-not $ChannelCount.ContainsKey($_.Channel)) {
        $ChannelCount[$_.Channel] = 0
    }
    $ChannelCount[$_.Channel] += 1
    if (-not $FlightCount.ContainsKey($_.Flight)) {
        $FlightCount[$_.Flight] = 0
    }
    $FlightCount[$_.Flight] += 1
}
$DistinctVisitorsCount = $DistinctVisitors.Keys | Measure-Object | Select-Object -ExpandProperty Count
Write-Output "The count of distinct Visitor IDs $DistinctVisitorsCount"
Write-Output "The total revenue $TotalRevenue"
Write-Output "The count of each Channel"
$ChannelCount.Keys | ForEach-Object {
    Write-Output "$_ $($ChannelCount[$_])"
}
Write-Output "The count of each Flight"
$FlightCount.Keys | ForEach-Object {
    Write-Output "$_ $($FlightCount[$_])"
}
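To act on the line-by-line idea mentioned above, a rough sketch is to pipe Import-Csv straight into the loop instead of holding the whole file in $data first (same counting logic, same placeholder path):
$DistinctVisitors = @{}
$TotalRevenue = 0
$ChannelCount = @{}
$FlightCount = @{}
Import-Csv -Path "C:\temp\data.csv" -Delimiter "," | ForEach-Object {
    $DistinctVisitors[$_.'Visitor ID'] = $true
    $TotalRevenue += [double]$_.Revenue
    $ChannelCount[$_.Channel] += 1   # a missing key starts at $null, which +1 turns into 1
    $FlightCount[$_.Flight] += 1
}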

Compare-Object - Separate side columns

Is it possible to display the results of a PowerShell Compare-Object in two columns showing the differences of reference vs difference objects?
For example using my current cmdline:
Compare-Object $Base $Test
Gives:
InputObject SideIndicator
987654 =>
555555 <=
123456 <=
In reality the list is rather long. For easier reading, is it possible to format the data like so:
Base Test
555555 987654
123456
So each column shows which elements exist in that object vs the other.
For bonus points it would be fantastic to have a count in the column header like so:
Base(2) Test(1)
555555 987654
123456
Possible? Sure. Feasible? Not so much. PowerShell wasn't really built for creating this kind of tabular output. What you can do is collect the differences in a hashtable as nested arrays by input file:
$ht = @{}
Compare-Object $Base $Test | ForEach-Object {
    $value = $_.InputObject
    switch ($_.SideIndicator) {
        '=>' { $ht['Test'] += @($value) }
        '<=' { $ht['Base'] += @($value) }
    }
}
then transpose the hashtable:
$cnt = $ht.Values |
    ForEach-Object { $_.Count } |
    Sort-Object |
    Select-Object -Last 1
$keys = $ht.Keys | Sort-Object
0..($cnt-1) | ForEach-Object {
    $props = [ordered]@{}
    foreach ($key in $keys) {
        $props[$key] = $ht[$key][$_]
    }
    New-Object -Type PSObject -Property $props
} | Format-Table -AutoSize
To include the item count in the header name change $props[$key] to $props["$key($($ht[$key].Count))"].
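Applied to the transpose loop above, that change looks like this:
foreach ($key in $keys) {
    $props["$key($($ht[$key].Count))"] = $ht[$key][$_]
}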

Combining like objects in an array

I am attempting to analyze a group of text files (MSFTP logs) and count the IP addresses that have submitted bad credentials. I think I have it worked out, except I don't think the array is being passed to/from the function correctly. As a result, I get duplicate entries if the same IP appears in multiple log files. What am I doing wrong?
Function LogBadAttempt($FTPLog,$BadPassesArray)
{
    $BadPassEx = "PASS - 530"
    Foreach($Line in $FTPLog)
    {
        if ($Line -match $BadPassEx)
        {
            $IP = ($Line.Split(' '))[1]
            if($BadPassesArray.IP -contains $IP)
            {
                $CurrentIP = $BadPassesArray | Where-Object {$_.IP -like $IP}
                [int]$CurrentCount = $CurrentIP.Count
                $CurrentCount++
                $CurrentIP.Count = $CurrentCount
            }else{
                $info = @{"IP"=$IP;"Count"='1'}
                $BadPass = New-Object -TypeName PSObject -Property $info
                $BadPassesArray += $BadPass
            }
        }
    }
    return $BadPassesArray
}
$BadPassesArray = @()
$FTPLogs = Get-Childitem \\ftpserver\MSFTPSVC1\test
$Result = ForEach ($LogFile in $FTPLogs)
{
    $FTPLog = Get-Content ($LogFile.fullname)
    LogBadAttempt $FTPLog
}
$Result | Export-csv C:\Temp\test.csv -NoTypeInformation
The result looks like...
Count IP
7 209.59.17.20
20 209.240.83.135
18441 209.59.17.20
13059 200.29.3.98
I would like it to combine the entries for 209.59.17.20.
You're making this way too complicated. Process the files in a pipeline and use a hashtable to count the occurrences of each IP address:
$BadPasswords = @{}
Get-ChildItem '\\ftpserver\MSFTPSVC1\test' | Get-Content | ? {
    $_ -like '*PASS - 530*'
} | % {
    $ip = ($_ -split ' ')[1]
    $BadPasswords[$ip]++
}
$BadPasswords.GetEnumerator() |
    select @{n='IP';e={$_.Name}}, @{n='Count';e={$_.Value}} |
    Export-Csv 'C:\Temp\test.csv' -NoType
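If you also want the worst offenders listed first, the hashtable entries can be sorted on their value before exporting (just a usage tweak, not required for the counting itself):
$BadPasswords.GetEnumerator() |
    Sort-Object Value -Descending |
    select @{n='IP';e={$_.Name}}, @{n='Count';e={$_.Value}} |
    Export-Csv 'C:\Temp\test.csv' -NoType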

I need help formatting output with PowerShell's Out-File cmdlet

I have a series of documents that are going through the following function, which is designed to count word occurrences in each document. This function works fine outputting to the console, but now I want to generate a text file containing the information, with the file name appended to each word in the list.
My current console output is:
"processing document1 with x unique words occuring as follows"
"word1 12"
"word2 8"
"word3 3"
"word4 4"
"word5 1"
I want a delimited file in this format:
document1;word1;12
document1;word2;8
document1;word3;3
document1;word4;4
document1;word5;1
document2;word1;16
document2;word2;11
document2;word3;9
document2;word4;9
document2;word5;13
While the function below gets me the lists of words and occurrences, I'm having a hard time figuring out where or how to insert the filename variable so that it prints at the head of each line. MSDN has been less than helpful, and most of the places I try to insert the variable result in errors (see below).
function Count-Words ($docs) {
    $document = get-content $docs
    $document = [string]::join(" ", $document)
    $words = $document.split(" `t",[stringsplitoptions]::RemoveEmptyEntries)
    $uniq = $words | sort -uniq
    $words | % {$wordhash=@{}} {$wordhash[$_] += 1}
    Write-Host $docs "contains" $wordhash.psbase.keys.count "unique words distributed as follows."
    $frequency = $wordhash.psbase.keys | sort {$wordhash[$_]}
    -1..-25 | %{ $frequency[$_]+" "+$wordhash[$frequency[$_]]} | Out-File c:\out-file-test.txt -append
    $grouped = $words | group | sort count
}
Do I need to create a string to pass to the Out-File cmdlet? Is this just something I've been putting in the wrong place on the last few tries? I'd like to understand why it goes in a particular place as well. Right now I'm just guessing, because I have no idea where to put the Out-File call to achieve my desired results.
I've tried formatting my command per powershell help, using -$docs and -FilePath, but each time I add anything to the out-file above that runs successfully, I get the following error:
Out-File : Cannot validate argument on parameter 'Encoding'. The argument "c:\out-file-test.txt" does not belong to the set "unicode,utf7,utf8,utf32,ascii,bigendianunicode,default,oem" specified by the ValidateSet attribute. Supply an argument that is in the set and then try the command again.
At C:\c.ps1:39 char:71
+ -1..-25 | %{ $frequency[$_]+" "+$wordhash[$frequency[$_]]} | Out-File <<<< -$docs -width 1024 c:\users\x46332\count-test.txt -append
+ CategoryInfo          : InvalidData: (:) [Out-File], ParameterBindingValidationException
+ FullyQualifiedErrorId : ParameterArgumentValidationError,Microsoft.PowerShell.Commands.OutFileCommand
I rewrote most of your code. You should use objects to make it easier to produce the formatting you want. This one splits on spaces and groups matching words together. Try this:
Function Count-Words ($paths) {
    $output = @()
    foreach ($path in $paths) {
        $file = Get-ChildItem $path
        ((Get-Content $file) -join " ").Split(" ", [System.StringSplitOptions]::RemoveEmptyEntries) |
            Group-Object |
            Select-Object -Property @{n="FileName";e={$file.BaseName}}, Name, Count | % {
                $output += "$($_.FileName);$($_.Name);$($_.Count)"
            }
    }
    $output | Out-File test-out2.txt -Append
}
$filepaths = ".\test.txt", ".\test2.txt"
Count-Words -paths $filepaths
It outputs like you asked (document;word;count). If you want the document name to include the extension, change $file.BaseName to $file.Name. Test output:
test;11;1
test;9;2
test;13;1
test2;word11;5
test2;word1;4
test2;12;1
test2;word2;2
Slightly different approach:
function Get-WordCounts ($doc)
{
    $text_ = [IO.File]::ReadAllText($doc.fullname)
    $WordHash = @{}
    $text_ -split '\b' -match '\w+' |
        foreach {$WordHash[$_]++}
    $WordHash.GetEnumerator() |
        foreach {
            New-Object PSObject -Property @{
                Word  = $_.Key
                Count = $_.Value
            }
        }
}
$docs = gci c:\testfiles\*.txt |
    sort name
&{
    foreach ($doc in dir $docs)
    {
        Get-WordCounts $doc |
            sort Count -Descending |
            foreach {
                (&{$doc.Name;$_.Word;$_.Count}) -join ';'
            }
    }
} | out-file c:\somedir\wordcounts.txt
Try this:
$docs = @("document1", "document2", ...)
$docs | % {
    $doc = $_
    Get-Content $doc `
        | % { $_.split(" `t",[stringsplitoptions]::RemoveEmptyEntries) } `
        | Group-Object `
        | select @{n="Document";e={$doc}}, Name, Count
} | Export-CSV output.csv -Delimiter ";" -NoTypeInfo
If you want to make this into a function you could do it like this:
function Count-Words($docs) {
    foreach ($doc in $docs) {
        Get-Content $doc `
            | % { $_.split(" `t",[stringsplitoptions]::RemoveEmptyEntries) } `
            | Group-Object `
            | select @{n="Document";e={$doc}}, Name, Count
    }
}
$files = @("document1", "document2", ...)
Count-Words $files | Export-CSV output.csv -Delimiter ";" -NoTypeInfo