Powershell: Combine single arrays into columns - powershell

Given:
$column1 = @(1,2,3)
$column2 = @(4,5,6)
How can I combine them into an object $matrix which gets displayed as a matrix with the single arrays as columns:
column1 column2
------- -------
1 4
2 5
3 6

It seems that all of my solutions today require calculated properties. Try:
$column1 = @(1,2,3)
$column2 = @(4,5,6)
0..($column1.Length-1) | Select-Object @{n="Id";e={$_}}, @{n="Column1";e={$column1[$_]}}, @{n="Column2";e={$column2[$_]}}
Id Column1 Column2
-- ------- -------
0 1 4
1 2 5
2 3 6
If the lengths of the arrays are not equal, you could use:
$column1 = @(1,2,3)
$column2 = @(4,5,6,1)
$max = ($column1, $column2 | Measure-Object -Maximum -Property Count).Maximum
0..($max - 1) | Select-Object @{n="Column1";e={$column1[$_]}}, @{n="Column2";e={$column2[$_]}}
I wasn't sure if you needed the Id, so I included it in the first sample to show how to include it.

Little better, maybe:
$column1 = @(1,2,3)
$column2 = @(4,5,6,7)
$i=0
($column1,$column2 | sort length)[1] |
foreach {
    new-object psobject -property @{
        loess = $Column1[$i]
        lowess = $column2[$i++]
    }
} | ft -auto
loess lowess
----- ------
    1      4
    2      5
    3      6
           7

Here's something I created today. It takes a range from 0 to the last index of one of the columns, then maps each index to a hash. Use the select to turn it into a proper table.
$table = 0..($ColA.Length - 1) | % { @{
    ColA = $ColA[$_]
    ColB = $ColB[$_]
}} | Select ColA, ColB
Using the following variables:
$ColA = @(1, 2, 3)
$ColB = @(4, 5, 6)
Results in
ColA ColB
---- ----
1 4
2 5
3 6

I came up with this.. but it seems too verbose. Anything shorter?
&{
    for ($i=0; $i -lt $y.Length; $i++) {
        New-Object PSObject -Property @{
            y = $y[$i]
            loess = $smooth_loess[$i]
            lowess = $smooth_lowess[$i]
        }
    }
} | Format-Table -AutoSize

Here is a combination of mjolinor's and Frode F.'s solutions. I ran into some problems using Frode's object construction trick with Select-Object. For some reason it would output hash values, likely representing object references. I only code in PowerShell a few times a year, so I am just providing this in case anyone else finds it useful (perhaps even my future self).
$column1 = @(1,2,3)
$column2 = @(4,5,6,7)
$column3 = @(2,5,5,2,1,3)
$max = (
    $column1,
    $column2,
    $column3 |
    Measure-Object -Maximum -Property Count).Maximum
$i=0
0..($max - 1) |
foreach {
    new-object psobject -property @{
        col1 = $Column1[$i]
        col3 = $column3[$i]
        col2 = $column2[$i++]
    }
} | ft -auto
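The answers above share one pattern: walk the row indices and build one object per row. A minimal sketch of that pattern as a reusable function, assuming PowerShell 3.0 or later (for [ordered] and [pscustomobject]) and no strict mode. The function name Join-Column and its parameter are hypothetical and do not come from any of the answers.

```powershell
# Hypothetical helper (not from the answers): turns named arrays into table rows.
function Join-Column {
    param(
        # Ordered dictionary mapping column name -> array of values
        [System.Collections.Specialized.OrderedDictionary] $Columns
    )

    # The row count is the length of the longest array.
    $rowCount = 0
    foreach ($values in $Columns.Values) {
        $rowCount = [Math]::Max($rowCount, @($values).Count)
    }

    for ($i = 0; $i -lt $rowCount; $i++) {
        $row = [ordered]@{}
        foreach ($name in $Columns.Keys) {
            # Without strict mode, indexing past the end yields $null,
            # so shorter columns are padded with empty cells.
            $row[$name] = @($Columns[$name])[$i]
        }
        [pscustomobject]$row
    }
}

$matrix = Join-Column ([ordered]@{
    column1 = @(1, 2, 3)
    column2 = @(4, 5, 6, 7)
})
$matrix | Format-Table -AutoSize
```

Using an ordered dictionary keeps the columns in the order they were written, which New-Object psobject -Property with a plain hashtable does not guarantee.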

Related

Powershell Get-Random with Constraints

I'm currently using the Get-Random function of Powershell to randomly pull a set number of rows from a csv. I need to create a constraint that says if one id is pulled, find the other ids that match it and pull their value.
Here is what I currently have:
$chosenOnes = Import-CSV C:\Temp\pk2.csv | sort{Get-Random} | Select -first 6
$i = 1
$count = $chosenOnes | Group-Object householdID
foreach ($row in $count)
{
if ($row.count -gt 1)
{
$students = $row.Group.Student
foreach ($student in $students)
{
$name = $student.tostring()
#...do something
$i = $i + 1
}
}
else
{
$name = $row.Group.Student
if($i -le 5)
{
#...do something
}
else
{
#...do something
}
$i = $i + 1
}
}
Example dataset
ID,name
165,Ernest Hemingway
1204,Mark Twain
1578,Stephen King
1634,Charles Dickens
1726,George Orwell
7751,John Doe
7751,Tim Doe
In this example, there are 7 rows but I'm randomly selecting 6 in my code. What needs to happen is that when ID=7751 is selected, I must return both rows where ID=7751. The IDs cannot be statically set in the code.
Use Get-Random directly, with -Count, to extract a given number of random elements from a collection.
$allRows = Import-CSV C:\Temp\pk2.csv
$chosenHouseholdIDs = ($allRows | Get-Random -Count 6).householdID
Then filter all rows by whether their householdID column contains one of the 6 randomly selected rows' householdID values (PSv3+ syntax), using the -in array-containment operator:
$allRows | Where-Object householdID -in $chosenHouseholdIDs
Optional reading: performance considerations:
$allRows | Get-Random -Count 6 is not only conceptually simpler, but also much faster than $allRows | Sort-Object { Get-Random } | Select-Object -First 6
Using the Time-Command function to compare the performance of two approaches, using a 1000-row test file with 10 columns yields the following sample timings on my Windows 10 VM in Windows PowerShell - note that the Sort-Object { Get-Random }-based solution is more than 15(!) times slower:
Factor Secs (100-run avg.) Command                                                        TimeSpan
------ ------------------- -------                                                        --------
1.00   0.007               $allRows | Get-Random -Count 6                                 00:00:00.0072520
15.65  0.113               $allRows | Sort-Object { Get-Random } | Select-Object -First 6 00:00:00.1134909
Similarly, a single pass through all rows to find matching IDs via array-containment operator -in performs much better than looping over the randomly selected IDs and searching all rows for each.
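A self-contained sketch of the two steps, using the sample data from the question in place of the CSV file. That substitution is an assumption made so the snippet runs on its own; it also means the column is named ID, as in the example dataset, rather than householdID, and that only 3 rows are drawn instead of 6.

```powershell
# Sample rows from the question, parsed in memory instead of via Import-Csv.
$allRows = @'
ID,name
165,Ernest Hemingway
1204,Mark Twain
1578,Stephen King
1634,Charles Dickens
1726,George Orwell
7751,John Doe
7751,Tim Doe
'@ | ConvertFrom-Csv

# Step 1: pick 3 random rows and keep only their (de-duplicated) IDs.
$chosenIDs = @(($allRows | Get-Random -Count 3).ID | Select-Object -Unique)

# Step 2: one pass over all rows, keeping every row whose ID was chosen.
$result = @($allRows | Where-Object ID -in $chosenIDs)

$result | Format-Table -AutoSize
```

Whenever one of the two 7751 rows is drawn, the second step returns both of them, so the result may contain more rows than were drawn.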
I tried sticking with your beginning and came up with this.
$Array = Import-CSV C:\test\StudtentTest.csv
$Array | Sort{Get-Random} | select -first 2 | %{
$id = $_.id
$Array | ?{$_.id -eq $id} | %{
$_
}
}
$Array will be your parsed CSV.
We pipe it in, sort it randomly, and select the first 2 rows (in this case).
We save the ID of each selected object into $id, then search the array for that ID and display every row that matches.
If a selected ID has more than one match, you end up with something like:
ID name
-- ----
7751 John Doe
7751 Tim Doe
1634 Charles Dickens

Add up the data if the reference from another file is correct

I have two CSV Files which look like this:
test.csv:
"Col1","Col2"
"1111","1"
"1122","2"
"1111","3"
"1121","2"
"1121","2"
"1133","2"
"1133","2"
The second looks like this:
test2.csv:
"Number","signs"
"1111","ABC"
"1122","DEF"
"1111","ABC"
"1121","ABC"
"1133","GHI"
Now the goal is to get a summary of all points from test.csv assigned to the "signs" of test2.csv. The numbers serve as the reference, as you can see.
Should be something like this:
ABC = 8
DEF = 2
GHI = 4
I have tried to work this out but cannot reach the goal. What I have so far is:
$var = "C:\PathToCSV"
$csv1 = Import-Csv "$var\test.csv"
$csv2 = Import-Csv "$var\test2.csv"
# Process: group by 'Col1' then sum 'Col2' for each group
# and create output objects on the fly
$test1 = $csv1 | Group-Object Col1 | ForEach-Object {
    New-Object psobject -Property @{
        Col1 = $_.Name
        Sum = ($_.Group | Measure-Object Col2 -Sum).Sum
    }
}
But this gives me back the following output:
Ps> $test1
Sum Col1
--- ----
4 1111
2 1122
4 1121
4 1133
I am not able to get the summary and the mapping of the signs.
Not sure if I understand your question correctly, but I'm going to assume that for each value from the column "signs" you want to lookup the values from the column "Number" in the second CSV and then calculate the sum of the column "Col2" for all matches.
For that I'd build a hashtable with the pre-calculated sums for the unique values from "Col1":
$h1 = @{}
$csv1 | ForEach-Object {
    $h1[$_.Col1] += [int]$_.Col2
}
and then build a second hashtable to sum up the lookup results for the values from the second CSV:
$h2 = @{}
$csv2 | ForEach-Object {
    $h2[$_.signs] += $h1[$_.Number]
}
However, that produced a different value for "ABC" than what you stated as the desired result in your question when I processed your sample data:
Name Value
---- -----
ABC 12
GHI 4
DEF 2
Or did you mean you want to sum up the corresponding values for the unique numbers for each sign? For that you'd change the second code snippet to something like this:
$h2 = @{}
$csv2 | Group-Object signs | ForEach-Object {
    $name = $_.Name
    $_.Group | Select-Object -Unique -Expand Number | ForEach-Object {
        $h2[$name] += $h1[$_]
    }
}
That would produce the desired result from your question:
Name Value
---- -----
ABC 8
GHI 4
DEF 2
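For reference, a self-contained sketch of the second variant, with the sample data from the question inlined via ConvertFrom-Csv instead of read from files (an assumption, so that the snippet runs without test.csv and test2.csv):

```powershell
$csv1 = @'
"Col1","Col2"
"1111","1"
"1122","2"
"1111","3"
"1121","2"
"1121","2"
"1133","2"
"1133","2"
'@ | ConvertFrom-Csv

$csv2 = @'
"Number","signs"
"1111","ABC"
"1122","DEF"
"1111","ABC"
"1121","ABC"
"1133","GHI"
'@ | ConvertFrom-Csv

# Sum of Col2 per unique Col1 value (CSV fields are strings, hence the cast).
$h1 = @{}
foreach ($row in $csv1) {
    $h1[$row.Col1] += [int]$row.Col2
}

# For each sign, add up the pre-calculated sums of its unique numbers.
$h2 = @{}
$csv2 | Group-Object signs | ForEach-Object {
    $name = $_.Name
    $_.Group | Select-Object -Unique -ExpandProperty Number | ForEach-Object {
        $h2[$name] += $h1[$_]
    }
}

$h2
```

The de-duplication matters for "ABC": number 1111 is listed twice in test2.csv, and counting it only once is what turns 12 into 8.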

Conditional criteria in powershell group measure-object?

I have data in this shape:
externalName,day,workingHours,hoursAndMinutes
PRJF,1,11,11:00
PRJF,2,11,11:00
PRJF,3,0,0:00
PRJF,4,0,0:00
CFAW,1,11,11:00
CFAW,2,11,11:00
CFAW,3,11,11:00
CFAW,4,11,11:00
CFAW,5,0,0:00
CFAW,6,0,0:00
and so far code is
$gdata = Import-csv $filepath\$filename | Group-Object -Property Externalname;
$test = @()
$test += foreach($rostername in $gdata) {
    $rostername.Group | Select -Unique externalName,
        @{Name = 'AllDays';Expression = {(($rostername.Group) | measure -Property day).count}}
}
$test;
What I can't work out is how to do a conditional count of the lines where workingHours is non-zero.
The aim is to produce two lines:
PRJF, 4, 2, 11
CFAW, 6, 4, 11
i.e. Roster name, roster length, days on, average hours worked per day on.
You need a Where-Object to filter for non-zero workingHours.
I'd use a [PSCustomObject] to generate a new table.
EDIT: a bit more efficient with only one Measure-Object.
## Q:\Test\2018\08\06\SO_51700660.ps1
$filepath = 'Q:\Test\2018\08\06'
$filename = 'SO_S1700660.csv'
$gdata = Import-Csv (Join-Path $filepath $filename) | Group-Object -Property Externalname
$test = ForEach($Roster in $gdata) {
    $WH = ($Roster.Group.Workinghours|Where-Object {$_ -ne 0}|Measure-Object -Ave -Sum)
    [PSCustomObject]@{
        RosterName = $Roster.Name
        RosterLength = $Roster.Count
        DaysOn = $WH.count
        AvgHours = $WH.Average
        TotalHours = $WH.Sum
    }
}
$test | Format-Table
Sample output:
> .\SO_51700660.ps1
RosterName RosterLength DaysOn AvgHours TotalHours
---------- ------------ ------ -------- ----------
PRJF 4 2 11 22
CFAW 6 4 11 44

Powershell - Prefix each line of Format-Table with String

I would like to know if there is an easy way of prefixing each line of a powershell table with a String.
For example, if I create an Array using the following code:
$Array = @()
$Object = @{}
$Object.STR_PARAM = "A"
$Object.INT_PARAM = 1
$Array += [PSCustomObject] $Object
$Object = @{}
$Object.STR_PARAM = "B"
$Object.INT_PARAM = 2
$Array += [PSCustomObject] $Object
Calling Format-Table gives the following output:
$Array | Format-Table -AutoSize
STR_PARAM INT_PARAM
--------- ---------
A 1
B 2
Instead, I would like to have the following:
$Array | Format-Table-Custom -AutoSize -PrefixString " "
 STR_PARAM INT_PARAM
 --------- ---------
 A 1
 B 2
And if possible, I would also like to be able to use the Property parameter like this:
$SimpleFormat = @{Expression={$_.STR_PARAM}; Label="String Param"},
@{Expression={$_.INT_PARAM}; Label="Integer Param"};
$Array | Format-Table-Custom -Property $SimpleFormat -AutoSize -PrefixString "++"
++String Param Integer Param
++------------ -------------
++A 1
++B 2
Any help would be appreciated. Thanks.
You could just use the format expressions directly:
$f = @{Expression={"++" + $_.STR_PARAM}; Label="++String Param"},
@{Expression={$_.INT_PARAM}; Label="Integer Param"};
$Array | Format-Table $f -AutoSize
Output
++String Param Integer Param
-------------- -------------
++A 1
++B 2
Update to use expression and filter
Filter Format-Table-Custom
{
Param
(
[string]
$PrefixString,
[object]
$Property
)
end {
$rows = $input | Format-Table $property -AutoSize | Out-String
$lines = $rows.Split("`n")
foreach ($line in $lines) {
if ($line.Trim().Length -gt 0) {
$PrefixString + $line
}
}
}
}
$f = @{Expression={"--" + $_.STR_PARAM}; Label="--String Param"},
@{Expression={$_.INT_PARAM}; Label="Integer Param"};
$Array | Format-Table-Custom -Property $f -PrefixString "++"
Output
++--String Param Integer Param
++-------------- -------------
++--A 1
++--B 2
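A shorter variant of the same idea, as a sketch: Out-String -Stream emits the formatted table one line at a time, so no manual splitting is needed. The function name Add-TablePrefix is hypothetical, and the '*' default for -Property is an assumption made here to show all properties when none are given. As with the filter above, the result is plain strings, not objects.

```powershell
# Hypothetical helper: formats pipeline input as a table
# and prefixes every non-blank line of the rendered text.
function Add-TablePrefix {
    param(
        [string] $PrefixString,
        [object[]] $Property = '*'
    )
    # $input holds all pipeline input because the function has no process block.
    $input |
        Format-Table -Property $Property -AutoSize |
        Out-String -Stream |
        Where-Object { $_.Trim().Length -gt 0 } |
        ForEach-Object { $PrefixString + $_ }
}

$Array = @(
    [pscustomobject]@{ STR_PARAM = 'A'; INT_PARAM = 1 }
    [pscustomobject]@{ STR_PARAM = 'B'; INT_PARAM = 2 }
)
$lines = $Array | Add-TablePrefix -PrefixString '++'
$lines
```

Calculated properties such as $SimpleFormat from the question can be passed through -Property unchanged, since they are handed straight to Format-Table.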

How to sum multiple items in an object in PowerShell?

I have:
$report.gettype().name
Object[]
echo $report
Item Average
-- -------
orange 0.294117647058824
orange -0.901960784313726
orange -0.901960784313726
grape 9.91335740072202
grape 0
pear 3.48736462093863
pear -0.0324909747292419
pear -0.0324909747292419
apple 12.1261261261261
apple -0.0045045045045045
I want to create a variable, $total, (such as a hash table) which contains the sum of the 'Average' column for each item, for example,
echo $total
orange -1.5097
grape 9.913
pear 3.423
apple 12.116
Right now I'm thinking of looping through the $report, but it's hell ugly, and I am looking for something more elegant than the following starting point (incomplete):
$tmpPrev = ""
foreach($r in $report){
    $tmp = $r.item
    $subtotal = 0
    if($tmp -ne $tmpPrev){
        $subtotal += $r.average
    }
}
How could I do this?
Cmdlets Group-Object and Measure-Object help to solve the task in a PowerShell-ish way:
Code:
# Demo input
$report = @(
    New-Object psobject -Property @{ Item = 'orange'; Average = 1 }
    New-Object psobject -Property @{ Item = 'orange'; Average = 2 }
    New-Object psobject -Property @{ Item = 'grape'; Average = 3 }
    New-Object psobject -Property @{ Item = 'grape'; Average = 4 }
)
# Process: group by 'Item' then sum 'Average' for each group
# and create output objects on the fly
$report | Group-Object Item | %{
    New-Object psobject -Property @{
        Item = $_.Name
        Sum = ($_.Group | Measure-Object Average -Sum).Sum
    }
}
Output:
Sum Item
--- ----
3 orange
7 grape
I've got a more command-line solution.
Given $report
$groupreport = $report | Group-Object -Property item -AsHashTable
is
Name Value
---- -----
grape {@{Item=grape; Average=9.91335740072202}, @{Item=grape; Average=0}}
orange {@{Item=orange; Average=0.294117647058824}, @{Item=orange; Average=-0.901960784313726...
apple {@{Item=apple; Average=12.1261261261261}, @{Item=apple; Average=-0.0045045045045045}}
pear {@{Item=pear; Average=3.48736462093863}, @{Item=pear; Average=-0.0324909747292419}, @...
then
$tab=@{}
$groupreport.keys | % {$tab += @{$_ = ($groupreport[$_] | measure-object -Property average -sum)}}
gives
PS> $tab["grape"]
Count : 2
Average :
Sum : 9,91335740072202
Maximum :
Minimum :
Property : Average
PS> $tab["grape"].sum
9,91335740072202
It seems short and usable.
Summary
$groupreport = $report | Group-Object -Property item -AsHashTable
$tab = @{}
$groupreport.keys | % {$tab += @{$_ = ($groupreport[$_] | measure-object -Property average -sum)}}
$tab.keys | % {write-host $_ `t $tab[$_].sum}
I don't know if you can get rid of looping. What about:
$report | % {$averages = @{}} {
    if ($averages[$_.item]) {
        $averages[$_.item] += $_.average
    }
    else {
        $averages[$_.item] = $_.average
    }
} {$averages}
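Since the question asks for a hash table named $total, here is a sketch that feeds the Group-Object and Measure-Object combination from the first answer straight into one. The demo values are made up for illustration and are not the data from the question; [pscustomobject] assumes PowerShell 3.0 or later.

```powershell
# Demo input (illustrative values, not the data from the question)
$report = @(
    [pscustomobject]@{ Item = 'orange'; Average = 1.5 }
    [pscustomobject]@{ Item = 'orange'; Average = -0.5 }
    [pscustomobject]@{ Item = 'grape';  Average = 3 }
    [pscustomobject]@{ Item = 'grape';  Average = 4 }
)

# One entry per item: key = item name, value = sum of its 'Average' values.
$total = @{}
$report | Group-Object Item | ForEach-Object {
    $total[$_.Name] = ($_.Group | Measure-Object Average -Sum).Sum
}

$total
```

A single sum can then be read with $total['orange'] or $total.orange.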