Get rows from one datatable where Id not in another using Powershell

I have two DataTables in Powershell with differing columns but one common Id column.
I want to get the rows from DataTable A where the Id of the row doesn't appear in DataTable B.
|DataTable A |
|---------------------------|
|Id|SomeName|SomeDescription|
|--|--------|---------------|
|1 |Blah |Whatevs |
|2 |Foo |Bar |
|3 |Woo |Yeah |
|DataTable B |
|------------------------------------|
| Id | SomeOtherName | SomeOtherDesc |
|----|---------------|---------------|
| 1 | Blah blah | Yadda yadda |
| 2 | Foo foo | Bah bah |
The result I'd like:
|DataTable Result |
|---------------------------|
|Id|SomeName|SomeDescription|
|--|--------|---------------|
|3 |Woo |Yeah |
How is this best done in Powershell?

I quickly whipped this up with CSV data rather than DataTables, but as long as the Id columns can be read as arrays, this should work:
$tableA = @'
Id,somename,somedescription
1,Blah,Whatevs
2,Foo,Bar
3,Woo,Yeah
'@
$tableB = @'
Id,somename,somedescription
1,Blah,Whatevs asd
2,Foo,Bar asd
'@
$importA = $tableA | ConvertFrom-Csv
$importB = $tableB | ConvertFrom-Csv
$importA | Where-Object { $importB.Id -notcontains $_.Id }

Since you work with DataTables and the result should also be a DataTable, this should do it:
# DataTable A
$dtA = New-Object System.Data.DataTable
$dtA.Columns.Add([System.Data.DataColumn]::new("Id",[int]))
$dtA.Columns.Add([System.Data.DataColumn]::new("SomeName"))
$dtA.Columns.Add([System.Data.DataColumn]::new("SomeDescription"))
$row = $dtA.NewRow()
$row["Id"] = 1
$row["SomeName"] = "Blah"
$row["SomeDescription"] = "Whatevs"
$dtA.rows.Add($row)
$row = $dtA.NewRow()
$row["Id"] = 2
$row["SomeName"] = "Foo"
$row["SomeDescription"] = "Bar"
$dtA.rows.Add($row)
$row = $dtA.NewRow()
$row["Id"] = 3
$row["SomeName"] = "Woo"
$row["SomeDescription"] = "Yeah"
$dtA.rows.Add($row)
# DataTable B
$dtB = New-Object System.Data.DataTable
$dtB.Columns.Add([System.Data.DataColumn]::new("Id",[int]))
$dtB.Columns.Add([System.Data.DataColumn]::new("SomeOtherName"))
$dtB.Columns.Add([System.Data.DataColumn]::new("SomeOtherDesc"))
$row = $dtB.NewRow()
$row["Id"] = 1
$row["SomeOtherName"] = "Blah blah"
$row["SomeOtherDesc"] = "Yadda yadda"
$dtB.rows.Add($row)
$row = $dtB.NewRow()
$row["Id"] = 2
$row["SomeOtherName"] = "Foo foo"
$row["SomeOtherDesc"] = "Ba ba"
$dtB.rows.Add($row)
# create a clone of datatable A (no data, just the structure)
$dtResult = $dtA.Clone()
# Get the Id values that are in DataTable A, but not in DataTable B
$diff = Compare-Object -ReferenceObject $dtA.Id -DifferenceObject $dtB.Id -PassThru | Where-Object { $_.SideIndicator -eq '<=' }
$dtA | Where-Object { $diff -contains $_.Id } | ForEach-Object {
    # here, $_ is of type System.Data.DataRow
    # you cannot add this DataRow directly because it belongs to another DataTable,
    # to overcome that use the 'ItemArray' property to get an array of the values inside
    $null = $dtResult.Rows.Add($_.ItemArray)
}
$dtResult
Result (type System.Data.DataTable):
Id SomeName SomeDescription
-- -------- ---------------
3 Woo Yeah
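As a variation on the approach above, here is a minimal sketch that collects the Ids of DataTable B in a HashSet and copies the non-matching rows with ImportRow. It assumes the Id column is typed as [int] in both tables and that PowerShell 5 or later is available for the ::new() syntax; the sample tables are rebuilt here only so the snippet runs on its own.

```powershell
# Build two small sample tables (same data as in the question).
$dtA = [System.Data.DataTable]::new()
[void]$dtA.Columns.Add('Id', [int])
[void]$dtA.Columns.Add('SomeName', [string])
[void]$dtA.Columns.Add('SomeDescription', [string])
[void]$dtA.Rows.Add(1, 'Blah', 'Whatevs')
[void]$dtA.Rows.Add(2, 'Foo', 'Bar')
[void]$dtA.Rows.Add(3, 'Woo', 'Yeah')

$dtB = [System.Data.DataTable]::new()
[void]$dtB.Columns.Add('Id', [int])
[void]$dtB.Columns.Add('SomeOtherName', [string])
[void]$dtB.Rows.Add(1, 'Blah blah')
[void]$dtB.Rows.Add(2, 'Foo foo')

# Collect every Id from table B once, so each lookup below is a hash lookup.
$idsInB = [System.Collections.Generic.HashSet[int]]::new()
foreach ($row in $dtB.Rows) { [void]$idsInB.Add($row['Id']) }

# Clone() copies only the structure; ImportRow() copies a row that belongs to another table.
$dtResult = $dtA.Clone()
foreach ($row in $dtA.Rows) {
    if (-not $idsInB.Contains($row['Id'])) { $dtResult.ImportRow($row) }
}
$dtResult
```

ImportRow avoids the ItemArray workaround, and the HashSet keeps the membership test cheap when table B is large.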

Related

Using PowerShell Core ConvertFrom-Markdown to parse values in a markdown table

I'm interested in using the ConvertFrom-Markdown cmdlet to parse values in a markdown table. The cmdlet uses the markdig markdown processor, which has an Abstract Syntax Tree that should be able to be traversed for this purpose.
How can we search/enumerate the Tokens in the following powershell snippet to return the rows and columns?
(@'
# header1
## header2
| Column1 | Column2 |
| ------- | ------- |
| Row1Column1 | Row1Column2 |
| Row2Column1 | Ro2Column2 |
'@ | ConvertFrom-Markdown).Tokens
The values that I see in the Tokens look promising, I can see Markdig.Extensions.Tables.TableCell in the Parent fields, but that's about as far as I can get.
Here's a way to do it.
Note: I'm not sure whether, for example, a Table can contain only TableRows, so the | where-object { ... } filters might not be necessary.
# set up some sample data
$md = @"
# header1
## header2
| Column1 | Column2 |
| ------- | ------- |
| Row1Column1 | Row1Column2 |
| Row2Column1 | Ro2Column2 |
"@ | ConvertFrom-Markdown
# walk the syntax tree
$mdDoc = $md.Tokens;
$mdTables = @( $mdDoc | where-object { $_ -is [Markdig.Extensions.Tables.Table] } );
foreach( $mdTable in $mdTables )
{
    write-host "table";
    $mdRows = @( $mdTable | where-object { $_ -is [Markdig.Extensions.Tables.TableRow] } );
    foreach( $mdRow in $mdRows )
    {
        write-host " row";
        write-host " header = $($mdRow.IsHeader)";
        $mdCells = @( $mdRow | where-object { $_ -is [Markdig.Extensions.Tables.TableCell] } );
        foreach( $mdCell in $mdCells )
        {
            write-host " cell";
            $mdInline = $mdCell.Inline;
            write-host " inline - $($mdInline.Content)";
        }
    }
}
Which gives the following output:
table
row
header = True
cell
inline - Column1
cell
inline - Column2
row
header = False
cell
inline - Row1Column1
cell
inline - Row1Column2
row
header = False
cell
inline - Row2Column1
cell
inline - Ro2Column2
Hopefully that'll be enough to get you started...
If you would like to import the markdown tables into PowerShell arrays, you can parse them and build PSCustomObjects along the way:
$MarkDown = @"
# header1
## header2
| Column1 | Column2 |
| ------- | ------- |
| Row1Column1 | Row1Column2 |
| Row2Column1 | Ro2Column2 |

| Table2 Column1 |
| ------------- |
| T2 Row1 |
| T2 Row2 |
| T2 Row3 |
"@ | ConvertFrom-Markdown
$mdDoc = $Markdown.Tokens
[array]$tables = $null
$mdTables = @($mdDoc | where {$_ -is [Markdig.Extensions.Tables.Table]})
foreach ($mdTable in $mdTables) {
    [array]$table = $null
    $mdRows = @($mdTable | where {$_ -is [Markdig.Extensions.Tables.TableRow]})
    foreach ($mdRow in $mdRows) {
        $mdCells = @($mdRow | where-object { $_ -is [Markdig.Extensions.Tables.TableCell]})
        $mdCellsValues = @($mdCells.Inline.Content | foreach {$_.ToString()})
        if ($mdRow.IsHeader) { # header cells become the property names, not values
            $CustomProperties = $mdCellsValues
        } else { # iterate through the property names and populate the custom object
            $thisrow = New-Object PSCustomObject | select $CustomProperties
            foreach ($i in 0..($CustomProperties.Count -1)) {
                $thisrow.($CustomProperties[$i]) = $mdCellsValues[$i]
            }
            $table += $thisrow
        } # end if
    } # end table rows
    $tables += ,$table # add each table as a sub-array
} # end tables
$tables
The result is available in two sub arrays
C:\> $tables[0]
Column1 Column2
------- -------
Row1Column1 Row1Column2
Row2Column1 Ro2Column2
C:\> $tables[1]
Table2 Column1
-------------
T2 Row1
T2 Row2
T2 Row3

PowerShell: Iterating Multiple Variables not working as expected

I am trying to iterate over an array $dailyTasks to find 'Blank' i.e. '' values in the EmployeeName column and inject names from another array into those empty values.
Example of what the array looks like before the for loop starts:
| Task | EmployeeName | EmployeeName2 |
|-------|--------------|---------------|
| Task1 | | |
| Task2 | Person Y | |
| Task3 | | |
| Task4 | Person Z | Person X |
This is my for loop code, which produces an undesired result. $randomisedUsers is an Object[]:
$randomisedUsers | Group-Object { $_ -in ($randomisedUsers | Select-Object -Last 2) } | ForEach-Object {
    if ($_.Name -eq 'True') {
        for ($i = 0; $i -lt $dailyTasks.Count; $i++) {
            if ($dailyTasks[$i].Task -eq 'Task4') {
                $dailyTasks[$i].EmployeeName = $_.Group.EmployeeName[0]
                $dailyTasks[$i].EmployeeName2 = $_.Group.EmployeeName[1]
            }
        }
    } else {
        for ($i = 0; $i -lt $dailyTasks.Count; $i++) {
            if ($dailyTasks[$i].EmployeeName -eq '') {
                if ($_.Count -gt '1') {
                    for ($x = 0; $x -lt $_.Group.EmployeeName.Count; $x++) {
                        $dailyTasks[$i].EmployeeName = $_.Group.EmployeeName[$x]
                    }
                } else {
                    $dailyTasks[$i].EmployeeName = $_.Group.EmployeeName
                }
            }
        }
    }
}
Result:
| Task | EmployeeName | EmployeeName2 |
|-------|--------------|---------------|
| Task1 | Person A | |
| Task2 | Person Y | |
| Task3 | Person A | |
| Task4 | Person Z | Person X |
The problem here is that $_.Group.EmployeeName contains two objects, but for whatever reason the result table doesn't populate Person B in the array:
$_.Group.EmployeeName
{Person A, Person B}
The desired result in this case is:
| Task | EmployeeName | EmployeeName2 |
|-------|--------------|---------------|
| Task1 | Person A | |
| Task2 | Person Y | |
| Task3 | Person B | |
| Task4 | Person Z | Person X |
I'm not completely sure where I'm going wrong in my for loops, and I've been stuck on this for a while...
TIA
I would personally use something like this:
$csv = @'
Task,EmployeeName,EmployeeName2
Task1,,
Task2,Person Y,
Task3,,
Task4,Person Z,Person X
'@ | ConvertFrom-Csv
$fillEmployees = [System.Collections.ArrayList]@(
    'Person A'
    'Person B'
)
foreach($line in $csv)
{
    if([string]::IsNullOrWhiteSpace($line.EmployeeName))
    {
        $line.EmployeeName = $fillEmployees[0]
        $fillEmployees.RemoveAt(0)
    }
}
The flow is quite simple: if the loop finds a value in EmployeeName that is null or white space, it replaces that value with the item at index 0 of $fillEmployees and then removes that item from the list.
It's hard to tell what you're trying to accomplish with your code, but if you have an array of type System.Array filled with random names to use for the empty EmployeeName values, you can convert that array to an ArrayList, which allows you to use the .RemoveAt(..) method:
PS /> $fillEmployees = 0..10 | ForEach-Object {"Employee {0}" -f [char](Get-Random -Minimum 65 -Maximum 90)}
PS /> $fillEmployees
Employee J
Employee S
Employee D
Employee P
Employee O
Employee E
Employee M
Employee K
Employee R
Employee F
Employee A
PS /> $fillEmployees.GetType()
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True Object[] System.Array
Attempting to Remove an item from an Array would result in the following:
PS /> $fillEmployees.RemoveAt(0)
Exception calling "RemoveAt" with "1" argument(s): "Collection was of a fixed size."
At line:1 char:1
...
...
However, if you cast it to an ArrayList (which copies the items into a new list rather than converting the array in place), it works:
PS /> $fillEmployees = [System.Collections.ArrayList]$fillEmployees
PS /> $fillEmployees.RemoveAt(0)
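The same "take the next name and remove it" idea can also be sketched with a generic Queue, which has a Dequeue() method for exactly this purpose. This is only an illustration under assumptions: the variable names $tasks and $queue are made up here, and the extra Count check is added so the loop simply stops filling once the names run out.

```powershell
# Sample data in the same shape as the question's table.
$tasks = @'
Task,EmployeeName,EmployeeName2
Task1,,
Task2,Person Y,
Task3,,
Task4,Person Z,Person X
'@ | ConvertFrom-Csv

# A queue hands out names in order; Dequeue() returns the next one and removes it.
$queue = [System.Collections.Generic.Queue[string]]::new([string[]]@('Person A', 'Person B'))

foreach ($task in $tasks) {
    # Only fill blank cells, and only while names are left in the queue.
    if ([string]::IsNullOrWhiteSpace($task.EmployeeName) -and $queue.Count -gt 0) {
        $task.EmployeeName = $queue.Dequeue()
    }
}
$tasks
```

With the two names above, Task1 receives Person A and Task3 receives Person B, while rows that already have a name are left alone.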

PowerShell: Expression only with Last item of an Array

I've been stuck on this for a little while. I have an array of people, and I'm trying to take the last person and create a separate column containing only that person.
I've played around with @{Name = 'NAME'; Expression = {}} in Select-Object, but I don't really know how to tackle it.
Current:
| Employee |
|---------------|
| John Doe |
| Jane West |
| Jordan Row |
| Paul Willson |
| Andrew Wright |
Desired Result:
| Employee | Employee2 |
|--------------|---------------|
| John Doe | |
| Jane West | |
| Jordan Row | |
| Paul Willson | Andrew Wright |
TIA!
So what I decided to do here is create 2 groups. One group contains all of the values except the last 2, and the other group contains these last 2 values:
# create the sample array
$employees = @(
    'John Doe'
    'Jane West'
    'Jordan Row'
    'Paul Willson'
    'Andrew Wright'
)
$employees |
    # Separate objects into 2 groups: those contained in the last 2 values and those not contained in the last 2 values
    Group-Object {$_ -in ($employees | Select-Object -Last 2)} |
    ForEach-Object {
        switch ($_) {
            {$_.name -eq 'False'} { # 'False' Name of group where values are not one of the last 2
                # Iterate through all the values and assign them to Employee property. Leave Employee2 property blank
                $_.group | ForEach-Object {
                    [PSCustomObject]@{
                        Employee = $_
                        Employee2 = ''
                    }
                }
            }
            {$_.name -eq 'True'} { # 'True' Name of group where values are those of the last 2
                # Create an object that assigns the values to Employee and Employee2
                [PSCustomObject]@{
                    Employee = $_.group[0]
                    Employee2 = $_.group[1]
                }
            }
        }
    }
Output
Employee Employee2
-------- ---------
John Doe
Jane West
Jordan Row
Paul Willson Andrew Wright
Edit
Here is another way you can do it
$employees[0..($employees.Count-3)] | ForEach-Object {
    [PSCustomObject]@{
        Employee = $_
        Employee2 = ''
    }
}
[PSCustomObject]@{
    Employee = $employees[-2]
    Employee2 = $employees[-1]
}
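One caveat with the index range: when the array holds exactly two names, 0..($employees.Count-3) becomes 0..-1, which selects the first and the last element instead of nothing. A possible sketch that sidesteps this uses Select-Object -SkipLast (available from PowerShell 5.0). The function name Get-EmployeeRows is hypothetical, and the sketch assumes the input has at least two names.

```powershell
function Get-EmployeeRows {
    param(
        [Parameter(Mandatory)]
        [string[]]$Names   # assumed to contain at least two names
    )
    # Every name except the last two gets a row of its own with an empty Employee2.
    $Names | Select-Object -SkipLast 2 | ForEach-Object {
        [PSCustomObject]@{ Employee = $_; Employee2 = '' }
    }
    # The last two names share the final row.
    [PSCustomObject]@{ Employee = $Names[-2]; Employee2 = $Names[-1] }
}

$five = @(Get-EmployeeRows -Names 'John Doe', 'Jane West', 'Jordan Row', 'Paul Willson', 'Andrew Wright')
$two  = @(Get-EmployeeRows -Names 'Paul Willson', 'Andrew Wright')
$five
```

Here $five holds four rows and $two holds a single row, so the two-name case no longer duplicates entries.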

Add a column to a csv file and fill up new column based on an existing column powershell

I have been trying to add a new column to a csv file and populating the new column based on value in an existing column.
I have a table like this:
|name | number | state | desc|
| ---- | ------ |-------|-----|
|a | 1 | n | i |
|b | 2 | n | j |
|c | 3 | l | j |
|d | 4 | m | k |
I want to add a new column data and populate it based on number column matching with an array.
This is my code so far:
$a=("a","b","c")
$b=("p","q","r")
.
.
.
$c= import-csv -Path "C:\..."
$b |where-object {filtered the file based on some criteria}| select-object number, state, desc, @{Name="data"; Expression={Foreach-object {if ($_.number in $a){$_data = "x"}
elseif($_.number in $b){$_.data = "y"}.......} | export-csv -notypeinformation -path "C:\...."
The script runs but does not populate the new column. Please help.
You've got the right idea. Import-Csv will produce an array of objects and you can use Select-Object to add calculated properties, then pipe again to Export-Csv. However, it's not exactly clear from the description or the example code what the expression should be. How do you want to define the new "data" property?
For now I'll work with what we have. The array variables $a & $b will never match anything. Also you can't use ForEach-Object like that, nor will assigning to $data work. The returning value of the Expression script block gets assigned to the property you named data. The following example demonstrates the point:
$a = ( "1", "2", "3")
$b = ( "4", "5", "6")
Import-Csv -Path "C:\temp\12-22-20.csv" |
    Select-Object number, state, desc,
        @{Name = 'Data'; Expression = { If( $_.Number -in $a ){ 'x' } elseif( $_.Number -in $b ){ 'y' } Else { $null }}} |
    Export-Csv -Path "C:\temp\12-22-20_New.csv" -NoTypeInformation
The resulting Csv file will look something like:
number state desc Data
------ ----- ---- ----
1 n i x
2 n j x
3 l j x
4 m k y
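If the if/elseif chain grows with more arrays, one possible alternative is to build a lookup hashtable once and let the calculated property read from it. This is a sketch under assumptions: the $dataFor name is made up, the CSV is supplied inline via ConvertFrom-Csv instead of a file, and numbers missing from every array simply yield $null.

```powershell
$a = '1', '2', '3'
$b = '4', '5', '6'

# Map each number to the value its "data" column should receive.
$dataFor = @{}
foreach ($n in $a) { $dataFor[$n] = 'x' }
foreach ($n in $b) { $dataFor[$n] = 'y' }

# Inline stand-in for Import-Csv; every field arrives as a string.
$rows = @'
name,number,state,desc
a,1,n,i
b,2,n,j
c,3,l,j
d,4,m,k
'@ | ConvertFrom-Csv

# The value returned by the Expression script block becomes the "data" property.
$result = $rows | Select-Object number, state, desc,
    @{ Name = 'data'; Expression = { $dataFor[$_.number] } }
$result
```

Because CSV fields are strings, the hashtable keys are strings as well, which keeps the lookup from silently missing.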
Update: Example Using Add-Member
You do not need to use a loop to add the property:
$a = ( "1", "2", "3")
$b = ( "4", "5", "6")
Import-Csv -Path "C:\temp\12-22-20.csv" |
Add-Member -MemberType ScriptProperty -Name "data" -Value { If( $this.Number -in $a ){ 'x' } elseif( $this.Number -in $b ){ 'y' } Else { $null }} -PassThru |
Export-Csv -Path C:\temp\12-22-20_New.csv -NoTypeInformation
By using a MemberType of ScriptProperty we can make a slight modification to the script block, replacing $_ with $this. The pipe is an implicit loop. I'm not sure if there are any drawbacks to using a ScriptProperty, but this exports as expected. This approach doesn't require storing the output in $c, but -PassThru would facilitate that if preferred.
99% of the time Select-Object is used for this. The only difference I'm aware of is that Select-Object converts the objects to PSCustomObjects. Add-Member will preserve the underlying type; however, Import-Csv only outputs PSCustomObjects in the first place, so there's no impact here.
Try iterating over the $c array of imported objects and add the new property to all objects. You want to make sure the new column exists in all of the objects. You can either use Select-Object as in your example, or you can use Add-Member to add it to the imported object.
$a=("a","b","c")
$b=("p","q","r")
...
$c = Import-Csv -Path "C:\..."
$c | ForEach-Object {
    $value = ""
    # custom logic for value of "data"
    # if (...) { $value = ... }
    $_ | Add-Member -MemberType NoteProperty -Name "data" -Value $value
}
$c | Export-Csv -NoTypeInformation -path "C:\...."

How to use Group-Object on this?

I am trying to get all the accounts from $f which do not match the accounts in $table4 into $accounts. But I need to also check if the occupancy number matches or not.
CSV $f:
Account_no |occupant_code
-----------|------------
12345 | 1
67890 | 2
45678 | 3
DataTable $table4
Account_no |occupant_code
-----------|------------
12345 | 1
67890 | 1
45678 | 3
Current code:
$accounts = Import-Csv $f |
select account_no, occupant_code |
where { $table4.account_no -notcontains $_.account_no }
What this needs to do is to check that occupant_code doesn't match, i.e.:
12345: account and occupant from $f and $table4 match; so it's ignored
67890: account matches $table4, but occupant_code does not match, so it is added to $accounts.
Current result:
Desired result: 67890
I believe I need to use Group-Object, but I do not know how to use that correctly.
I tried:
Import-Csv $f |
select account_no, occupant_code |
Group-Object account_no |
Where-Object { $_.Group.occupant_code -notcontains $table4.occupant_code }
An alternative to Bill's suggestion would be to fill a hashtable with your reference data ($table4) and look up the occupant_code value for each account from $f, assuming that your account numbers are unique:
$ref = @{}
$table4 | ForEach-Object {
    $ref[$_.Account_no] = $_.occupant_code
}
$accounts = Import-Csv $f |
    Where-Object { $_.occupant_code -ne $ref[$_.Account_no] } |
    Select-Object -Expand Account_no
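One thing to watch with the hashtable lookup: Import-Csv yields strings, while a DataTable column may be typed as a number, and a hashtable keyed by the integer 12345 will not be found with the string '12345'. The sketch below normalizes both sides to strings. It assumes, purely for illustration, that both columns in $table4 are [int]; the $csvRows variable stands in for Import-Csv $f.

```powershell
# Stand-in for $table4, with numeric columns (an assumption for this sketch).
$table4 = [System.Data.DataTable]::new()
[void]$table4.Columns.Add('Account_no', [int])
[void]$table4.Columns.Add('occupant_code', [int])
[void]$table4.Rows.Add(12345, 1)
[void]$table4.Rows.Add(67890, 1)
[void]$table4.Rows.Add(45678, 3)

# Stand-in for Import-Csv $f; every field is a string.
$csvRows = @'
Account_no,occupant_code
12345,1
67890,2
45678,3
'@ | ConvertFrom-Csv

# Key and value are both cast to string so they compare cleanly with CSV fields.
$ref = @{}
foreach ($row in $table4.Rows) {
    $ref[[string]$row['Account_no']] = [string]$row['occupant_code']
}

# Unknown accounts look up as $null, which also differs from the CSV value.
$accounts = $csvRows |
    Where-Object { $ref[$_.Account_no] -ne $_.occupant_code } |
    Select-Object -ExpandProperty Account_no
$accounts
```

With the sample data this leaves only 67890 in $accounts, and an account missing from $table4 entirely would be reported as well.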
Compare-Object?
csv1.csv:
Account_no,occupant_code
12345,1
67890,2
45678,3
csv2.csv:
Account_no,occupant_code
12345,1
67890,1
45678,3
PowerShell command:
Compare-Object (Import-Csv .\csv1.csv) (Import-Csv .\csv2.csv) -Property occupant_code -PassThru
Output:
Account_no occupant_code SideIndicator
---------- ------------- -------------
67890 1 =>
67890 2 <=
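Comparing on occupant_code alone treats the codes as a plain set, regardless of which account they belong to. A hedged variation compares on both properties and keeps only the rows that come from the first input. The inline CSV text below replaces the two files so the sketch runs on its own, and the variable names are illustrative.

```powershell
$f = @'
Account_no,occupant_code
12345,1
67890,2
45678,3
'@ | ConvertFrom-Csv

$reference = @'
Account_no,occupant_code
12345,1
67890,1
45678,3
'@ | ConvertFrom-Csv

# '<=' marks account/occupant pairs that exist only in the reference object ($f).
$accounts = Compare-Object -ReferenceObject $f -DifferenceObject $reference -Property Account_no, occupant_code |
    Where-Object SideIndicator -eq '<=' |
    Select-Object -ExpandProperty Account_no
$accounts
```

Each account number stays tied to its own occupant code this way, so a code that happens to appear on a different account does not hide a mismatch.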
$f | InnerJoin $table4 {$Left.Account_no -eq $Right.Account_no -and $Left.occupant_code -ne $Right.occupant_code} @{Account_no = {$Left.$_}} | Format-Table
Result:
occupant_code Account_no
------------- ----------
{2, 1} 67890
For details see: In Powershell, what's the best way to join two tables into one?
In addition to all the other answers, you might be able to leverage the IndexOf() method on arrays
$services = get-service
$services.name.IndexOf("xbgm")
240
I am on a tablet right now and don't have a handy way to test it, but something along these lines might work for you:
$table4.account_no.IndexOf($_.account_no)
should fetch the index at which your account_no lives in $table4, so you could jam it all into one ugly pipe:
$accounts = Import-Csv $f | select account_no, occupant_code |
where { ($table4.account_no -notcontains $_.account_no) -or ($table4[$table4.account_no.IndexOf($_.account_no)].occupant_code -ne $_.occupant_code) }
An inner join or a normal loop might just be cleaner though, especially if you want to add some other stuff in. Since someone posted an innerjoin, you could try a loop like:
$accounts = new-object System.Collections.ArrayList
$testSet = $table4.account_no
foreach($myThing in Import-Csv $f)
{
    if($myThing.account_no -in $testSet )
    {
        $i = $testSet.IndexOf($myThing.account_no)
        if($table4[$i].occupant_code -eq $myThing.occupant_code) {continue}
    }
    # Add() returns the new index; discard it so it does not leak into the output
    $accounts.add($myThing) | out-null
}
Edit for the OP, who mentioned that $table4 is a DataTable.
There is probably a much better way to do this, as I haven't used DataTables before, but this seems to work fine:
$table = New-Object system.Data.DataTable
$col1 = New-Object system.Data.DataColumn Account_no,([string])
$col2 = New-Object system.Data.DataColumn occupant_code,([int])
$table.columns.add($col1)
$table.columns.add($col2)
$row = $table.NewRow()
$row.Account_no = "12345"
$row.occupant_code = 1
$table.Rows.Add($row)
$row = $table.NewRow()
$row.Account_no = "67890"
$row.occupant_code = 1
$table.Rows.Add($row)
$row = $table.NewRow()
$row.Account_no = "45678"
$row.occupant_code = 3
$table.Rows.Add($row)
$testList = @()
$testlist += [pscustomobject]@{Account_no = "12345"; occupant_code = 1}
$testlist += [pscustomobject]@{Account_no = "67890"; occupant_code = 2}
$testlist += [pscustomobject]@{Account_no = "45678"; occupant_code = 3}
$accounts = new-object System.Collections.ArrayList
$testSet = $table.account_no
foreach($myThing in $testList)
{
    if($myThing.account_no -in $testSet )
    {
        $i = $testSet.IndexOf($myThing.account_no)
        if($table.Rows[$i].occupant_code -eq $myThing.occupant_code) {continue}
    }
    $accounts.add($myThing) | out-null
}
$accounts