Transform complex structure to CSV? [duplicate] - powershell

I have a JSON file that looks like this:
{
"id": 10011,
"title": "Test procedure",
"slug": "slug",
"url": "http://test.test",
"email": "test@test.com",
"link": "http://test.er",
"subject": "testing",
"level": 1,
"disciplines": [
"discipline_a",
"discipline_b",
"discipline_c"
],
"areas": [
"area_a",
"area_b"
]
},
I was trying to use the following command to convert that into the CSV file:
(Get-Content "PATH_TO\test.json" -Raw | ConvertFrom-Json)| Convertto-CSV -NoTypeInformation | Set-Content "PATH_TO\test.csv"
However, for disciplines and areas I am getting System.Object[] in the resulting CSV file.
Is there a way to put all those nested values into separate columns in the CSV file, like area_1, area_2, etc.? And the same for disciplines.

2017-11-20: Completely rewrote the function to improve performance and add features such as -ArrayBase and support for PSStandardMembers and grouped objects.
Flatten-Object
Recursively flattens objects containing arrays, hash tables and (custom) objects. All added properties of the supplied objects will be aligned with the rest of the objects.
Requires PowerShell version 2 or higher.
Cmdlet
Function Flatten-Object { # Version 00.02.12, by iRon
[CmdletBinding()]Param (
[Parameter(ValueFromPipeLine = $True)][Object[]]$Objects,
[String]$Separator = ".", [ValidateSet("", 0, 1)]$Base = 1, [Int]$Depth = 5, [Int]$Uncut = 1,
[String[]]$ToString = ([String], [DateTime], [TimeSpan]), [String[]]$Path = @()
)
$PipeLine = $Input | ForEach {$_}; If ($PipeLine) {$Objects = $PipeLine}
If (@(Get-PSCallStack)[1].Command -eq $MyInvocation.MyCommand.Name -or @(Get-PSCallStack)[1].Command -eq "<position>") {
$Object = @($Objects)[0]; $Iterate = New-Object System.Collections.Specialized.OrderedDictionary
If ($ToString | Where {$Object -is $_}) {$Object = $Object.ToString()}
ElseIf ($Depth) {$Depth--
If ($Object.GetEnumerator.OverloadDefinitions -match "[\W]IDictionaryEnumerator[\W]") {
$Iterate = $Object
} ElseIf ($Object.GetEnumerator.OverloadDefinitions -match "[\W]IEnumerator[\W]") {
$Object.GetEnumerator() | ForEach -Begin {$i = $Base} {$Iterate.($i) = $_; $i += 1}
} Else {
$Names = If ($Uncut) {$Uncut--} Else {$Object.PSStandardMembers.DefaultDisplayPropertySet.ReferencedPropertyNames}
If (!$Names) {$Names = $Object.PSObject.Properties | Where {$_.IsGettable} | Select -Expand Name}
If ($Names) {$Names | ForEach {$Iterate.$_ = $Object.$_}}
}
}
If (@($Iterate.Keys).Count) {
$Iterate.Keys | ForEach {
Flatten-Object @(,$Iterate.$_) $Separator $Base $Depth $Uncut $ToString ($Path + $_)
}
} Else {$Property.(($Path | Where {$_}) -Join $Separator) = $Object}
} ElseIf ($Objects -ne $Null) {
@($Objects) | ForEach -Begin {$Output = @(); $Names = @()} {
New-Variable -Force -Option AllScope -Name Property -Value (New-Object System.Collections.Specialized.OrderedDictionary)
Flatten-Object @(,$_) $Separator $Base $Depth $Uncut $ToString $Path
$Output += New-Object PSObject -Property $Property
$Names += $Output[-1].PSObject.Properties | Select -Expand Name
}
$Output | Select ([String[]]($Names | Select -Unique))
}
}; Set-Alias Flatten Flatten-Object
Syntax
<Object[]> | Flatten-Object [-Separator <String>] [-Base "" | 0 | 1] [-Depth <Int>] [-Uncut <Int>] [-ToString <Type[]>]
or:
Flatten-Object <Object[]> [[-Separator] <String>] [[-Base] "" | 0 | 1] [[-Depth] <Int>] [[-Uncut] <Int>] [[-ToString] <Type[]>]
Parameters
-Objects <Object[]>
The object (or objects) to be flattened.
-Separator <String> (Default: .)
The separator used between the recursive property names.
-Depth <Int> (Default: 5)
The maximal depth of flattening a recursive property. Any negative value will result in an unlimited depth and could cause an infinite loop.
-Uncut <Int> (Default: 1)
The number of object iterations that will be left uncut; further object properties will be limited to just the DefaultDisplayPropertySet. Any negative value will reveal all properties of all objects.
-Base "" | 0 | 1 (Default: 1)
The first index name of an embedded array:
1, arrays will be 1 based: <Parent>.1, <Parent>.2, <Parent>.3, ...
0, arrays will be 0 based: <Parent>.0, <Parent>.1, <Parent>.2, ...
"", the first item in an array will be unnamed and then followed with 1: <Parent>, <Parent>.1, <Parent>.2, ...
-ToString <Type[]= [String], [DateTime], [TimeSpan]>
A list of value types (default [String], [DateTime], [TimeSpan]) that will be converted to string rather than further flattened. E.g. a [DateTime] could be flattened with additional properties like Date, Day, DayOfWeek etc. but will be converted to a single (String) property instead.
Note:
The parameter -Path is for internal use but could be used to prefix property names.
Examples
Answering the specific question:
(Get-Content "PATH_TO\test.json" -Raw | ConvertFrom-Json) | Flatten-Object | Convertto-CSV -NoTypeInformation | Set-Content "PATH_TO\test.csv"
Result:
{
"url": "http://test.test",
"slug": "slug",
"id": 10011,
"link": "http://test.er",
"level": 1,
"areas.2": "area_b",
"areas.1": "area_a",
"disciplines.3": "discipline_c",
"disciplines.2": "discipline_b",
"disciplines.1": "discipline_a",
"subject": "testing",
"title": "Test procedure",
"email": "test@test.com"
}
Stress testing a more complex custom object:
New-Object PSObject @{
String = [String]"Text"
Char = [Char]65
Byte = [Byte]66
Int = [Int]67
Long = [Long]68
Null = $Null
Booleans = $False, $True
Decimal = [Decimal]69
Single = [Single]70
Double = [Double]71
Array = @("One", "Two", @("Three", "Four"), "Five")
HashTable = @{city="New York"; currency="Dollar"; postalCode=10021; Etc = @("Three", "Four", "Five")}
Object = New-Object PSObject -Property @{Name = "One"; Value = 1; Text = @("First", "1st")}
} | Flatten
Result:
Double : 71
Decimal : 69
Long : 68
Array.1 : One
Array.2 : Two
Array.3.1 : Three
Array.3.2 : Four
Array.4 : Five
Object.Name : One
Object.Value : 1
Object.Text.1 : First
Object.Text.2 : 1st
Int : 67
Byte : 66
HashTable.postalCode : 10021
HashTable.currency : Dollar
HashTable.Etc.1 : Three
HashTable.Etc.2 : Four
HashTable.Etc.3 : Five
HashTable.city : New York
Booleans.1 : False
Booleans.2 : True
String : Text
Char : A
Single : 70
Null :
Flattening grouped objects:
$csv | Group Name | Flatten | Format-Table # https://stackoverflow.com/a/47409634/1701026
Flattening common objects:
(Get-Process)[0] | Flatten-Object
Or a list (array) of objects:
Get-Service | Flatten-Object -Depth 3 | Export-CSV Service.csv
Note that a command like the one below takes hours to compute:
Get-Process | Flatten-Object | Export-CSV Process.csv
Why? Because it results in a table with a few hundred rows and several thousand columns. So if you would like to use this for flattening processes, you had better limit the number of rows (using the Where-Object cmdlet) or the number of columns (using the Select-Object cmdlet).
For the latest Flatten-Object version, see: https://powersnippets.com/flatten-object/

The CSV conversion/export cmdlets have no way of "flattening" an object, and I may be missing something, but I know of no way to do this with a built-in cmdlet or feature.
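To see where System.Object[] comes from, here is a minimal sketch with a hypothetical in-memory record (not the asker's file). The CSV cmdlets call ToString() on each property value, and an array's ToString() is just its type name. Joining the array into one delimited string is a different, simpler technique than one column per element, shown only for contrast:

```powershell
# Hypothetical sample record with one array-valued property.
$record = New-Object PSObject -Property @{ id = 1; areas = @('area_a', 'area_b') }

# Plain conversion: the array is rendered as its type name.
$csv = $record | ConvertTo-Csv -NoTypeInformation
$csv

# Alternative (not per-element columns): join the array into a single cell.
$joined = $record |
    Select-Object id, @{ Name = 'areas'; Expression = { $_.areas -join ';' } } |
    ConvertTo-Csv -NoTypeInformation
$joined
```

The ';' separator is an arbitrary choice for illustration; pick one that cannot occur in your values.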
If you can guarantee that disciplines and areas will always have the same number of elements, you can trivialize it by using Select-Object with derived properties to do this:
$properties=@('id','title','slug','url','email','link','subject','level',
@{Name='discipline_1';Expression={$_.disciplines[0]}}
@{Name='discipline_2';Expression={$_.disciplines[1]}}
@{Name='discipline_3';Expression={$_.disciplines[2]}}
@{Name='area_1';Expression={$_.areas[0]}}
@{Name='area_2';Expression={$_.areas[1]}}
)
(Get-Content 'PATH_TO\test.json' -Raw | ConvertFrom-Json)| Select-Object -Property $properties | Export-CSV -NoTypeInformation -Path 'PATH_TO\test.csv'
However, I am assuming that disciplines and areas will be variable length for each record. In that case, you will have to loop over the input and pull the highest count value for both disciplines and areas, then build the properties array dynamically:
$inputData = Get-Content 'PATH_TO\test.json' -Raw | ConvertFrom-Json
$counts = $inputData | Select-Object -Property @{Name='disciplineCount';Expression={$_.disciplines.Count}},@{Name='areaCount';Expression={$_.areas.count}}
$maxDisciplines = $counts | Measure-Object -Maximum -Property disciplineCount | Select-Object -ExpandProperty Maximum
$maxAreas = $counts | Measure-Object -Maximum -Property areaCount | Select-Object -ExpandProperty Maximum
$properties=@('id','title','slug','url','email','link','subject','level')
1..$maxDisciplines | % {
$properties += @{Name="discipline_$_";Expression=[scriptblock]::create("`$_.disciplines[$($_ - 1)]")}
}
1..$maxAreas | % {
$properties += @{Name="area_$_";Expression=[scriptblock]::create("`$_.areas[$($_ - 1)]")}
}
}
$inputData | Select-Object -Property $properties | Export-CSV -NoTypeInformation -Path 'PATH_TO\test.csv'
This code hasn't been fully tested, so it may need some tweaking to work 100%, but I believe the ideas are solid =)
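If you want to try the dynamic approach without touching any files, the following sketch runs it against a small inline JSON string. The sample data and the reduced property list are assumptions made up for illustration; the technique (find the longest array, then generate one calculated property per index) is the same as above.

```powershell
# Hypothetical sample data: two records whose arrays differ in length.
$json = @'
[
  { "id": 1, "disciplines": ["a", "b", "c"], "areas": ["x"] },
  { "id": 2, "disciplines": ["d"], "areas": ["y", "z"] }
]
'@
$records = $json | ConvertFrom-Json

# Longest array per column family decides how many columns are needed.
$maxDisciplines = [int]($records | ForEach-Object { @($_.disciplines).Count } | Measure-Object -Maximum).Maximum
$maxAreas       = [int]($records | ForEach-Object { @($_.areas).Count } | Measure-Object -Maximum).Maximum

# Build the property list; the index is baked into each script block's text.
$properties = @('id')
foreach ($i in 1..$maxDisciplines) {
    $properties += @{ Name = "discipline_$i"; Expression = [scriptblock]::Create("`$_.disciplines[$($i - 1)]") }
}
foreach ($i in 1..$maxAreas) {
    $properties += @{ Name = "area_$i"; Expression = [scriptblock]::Create("`$_.areas[$($i - 1)]") }
}

$csv = $records | Select-Object -Property $properties | ConvertTo-Csv -NoTypeInformation
$csv
```

Records with fewer elements than the maximum simply get empty cells, because indexing past the end of an array yields $null.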

Related

Add a column to a csv file and fill up new column based on an existing column powershell

I have been trying to add a new column to a csv file and populating the new column based on value in an existing column.
I have a table like this:
|name | number | state | desc|
| ---- | ------ |-------|-----|
|a | 1 | n | i |
|b | 2 | n | j |
|c | 3 | l | j |
|d | 4 | m | k |
I want to add a new column data and populate it based on number column matching with an array.
This is my code so far:
$a=("a","b","c")
$b=("p","q","r")
.
.
.
$c= import-csv -Path "C:\..."
$b |where-object {filtered the file based on some criteria}| select-object number, state, desc, @{Name="data"; Expression={Foreach-object {if ($_.number in $a){$_data = "x"}
elseif($_.number in $b){$_.data = "y"}.......} | export-csv -notypeinformation -path "C:\...."
The script runs but does not populate the new column. Please help.
You've got the right idea. Import-Csv will produce an array of objects and you can use Select-Object to add calculated properties, then pipe again to Export-Csv. However, it's not exactly clear from the description or the example code what the expression should be. How do you want to define the new "data" property?
For now I'll work with what we have. The array variables $a & $b will never match anything. Also you can't use ForEach-Object like that, nor will assigning to $data work. The returning value of the Expression script block gets assigned to the property you named data. The following example demonstrates the point:
$a = ( "1", "2", "3")
$b = ( "4", "5", "6")
Import-Csv -Path "C:\temp\12-22-20.csv"|
Select-Object number, state, desc,
@{Name = 'Data'; Expression = { If( $_.Number -in $a ){ 'x' } elseif( $_.Number -in $b ){ 'y' } Else { $null }}} |
Export-Csv -Path "C:\temp\12-22-20_New.csv" -NoTypeInformation
The resulting Csv file will look something like:
number state desc Data
------ ----- ---- ----
1 n i x
2 n j x
3 l j x
4 m k y
Update: Example Using Add-Member
You do not need to use a loop to add the property:
$a = ( "1", "2", "3")
$b = ( "4", "5", "6")
Import-Csv -Path "C:\temp\12-22-20.csv" |
Add-Member -MemberType ScriptProperty -Name "data" -Value { If( $this.Number -in $a ){ 'x' } elseif( $this.Number -in $b ){ 'y' } Else { $null }} -PassThru |
Export-Csv -Path C:\temp\12-22-20_New.csv -NoTypeInformation
By using a MemberType of ScriptProperty we can make a slight modification to the script block, replacing $_ with $this. The pipe is an implicit loop. I'm not sure if there are any drawbacks to using a ScriptProperty, but this exports as expected. This approach doesn't require storing the output in $c, but -PassThru would facilitate that if preferred.
99% of the time Select-Object is used for this. The only difference I'm aware of is that Select-Object converts the objects to PSCustomObjects. Add-Member will preserve the underlying type; however, Import-Csv only outputs PSCustomObjects in the first place, so there's no impact here.
Try iterating over the $c array of imported objects and add the new property to all objects. You want to make sure the new column exists in all of the objects. You can either use Select-Object as in your example, or you can use Add-Member to add it to the imported object.
$a=("a","b","c")
$b=("p","q","r")
...
$c = Import-Csv -Path "C:\..."
$c | ForEach-Object {
$value = ""
# custom logic for value of "data"
# if (...) { $value = ... }
$_ | Add-Member -MemberType NoteProperty -Name "data" -Value $value
}
$c | Export-Csv -NoTypeInformation -path "C:\...."
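As a sketch of what the "custom logic" placeholder could look like, here is the same loop run against inline CSV text instead of a file. The lookup arrays and the rule (numbers in $a map to 'x', numbers in $b map to 'y') are assumptions for illustration, since the question does not say how "data" should be derived.

```powershell
# Inline copy of the table from the question (stands in for Import-Csv).
$csvText = @'
name,number,state,desc
a,1,n,i
b,2,n,j
c,3,l,j
d,4,m,k
'@
$rows = $csvText | ConvertFrom-Csv

# Hypothetical lookup lists; CSV values are strings, so compare with strings.
$a = '1', '2'
$b = '3'

foreach ($row in $rows) {
    $value = if ($row.number -in $a) { 'x' } elseif ($row.number -in $b) { 'y' } else { '' }
    # Add the column to every row so the CSV header stays consistent.
    $row | Add-Member -MemberType NoteProperty -Name 'data' -Value $value
}

$rows | ConvertTo-Csv -NoTypeInformation
```

Note that the -in operator needs PowerShell 3 or later; on version 2 you would write $a -contains $row.number instead.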

Compare-Object - Separate side columns

Is it possible to display the results of a PowerShell Compare-Object in two columns showing the differences of reference vs difference objects?
For example using my current cmdline:
Compare-Object $Base $Test
Gives:
InputObject SideIndicator
987654 =>
555555 <=
123456 <=
In reality the list is rather long. For easier data reading is it possible to format the data like so:
Base Test
555555 987654
123456
So each column shows which elements exist in that object vs the other.
For bonus points it would be fantastic to have a count in the column header like so:
Base(2) Test(1)
555555 987654
123456
Possible? Sure. Feasible? Not so much. PowerShell wasn't really built for creating this kind of tabular output. What you can do is collect the differences in a hashtable as nested arrays by input file:
$ht = @{}
Compare-Object $Base $Test | ForEach-Object {
$value = $_.InputObject
switch ($_.SideIndicator) {
'=>' { $ht['Test'] += @($value) }
'<=' { $ht['Base'] += @($value) }
}
}
then transpose the hashtable:
$cnt = $ht.Values |
ForEach-Object { $_.Count } |
Sort-Object |
Select-Object -Last 1
$keys = $ht.Keys | Sort-Object
0..($cnt-1) | ForEach-Object {
$props = [ordered]@{}
foreach ($key in $keys) {
$props[$key] = $ht[$key][$_]
}
New-Object -Type PSObject -Property $props
} | Format-Table -AutoSize
To include the item count in the header name change $props[$key] to $props["$key($($ht[$key].Count))"].
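Putting both steps together with the count in the header, a self-contained sketch might look like this. The sample lists are made up (they reuse the values from the question plus one shared item), and the sketch assumes there is at least one difference, since 0..-1 would otherwise still iterate.

```powershell
# Hypothetical input lists; '111111' is in both and therefore not reported.
$Base = '555555', '123456', '111111'
$Test = '987654', '111111'

# Collect differences per side. $value is captured first because $_ is
# rebound to the tested value inside the switch block.
$ht = @{ Base = @(); Test = @() }
Compare-Object $Base $Test | ForEach-Object {
    $value = $_.InputObject
    switch ($_.SideIndicator) {
        '=>' { $ht['Test'] += $value }
        '<=' { $ht['Base'] += $value }
    }
}

# Transpose: one output row per index, one column per side.
$rowCount = [int]($ht.Values | ForEach-Object { $_.Count } | Measure-Object -Maximum).Maximum
$rows = foreach ($i in 0..($rowCount - 1)) {
    $props = [ordered]@{}
    foreach ($key in ($ht.Keys | Sort-Object)) {
        $props["$key($($ht[$key].Count))"] = $ht[$key][$i]
    }
    New-Object PSObject -Property $props
}
$rows | Format-Table -AutoSize
```

Pre-seeding the hashtable with empty arrays keeps both columns present even when one side has no unique items.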

Powershell v2.0 substitute null values from a Hash table

I have a hash table as below:
$Hash = @{
Team1=$Team1.count
Team2=$Team2.count
Team3=$Team3.count
}
$GroupByTeam = New-Object psobject -Property $Hash |
Select 'Team1','Team2','Team3' | ConvertTo-Html -Fragment
This is fine and each "team" returns their own value. However, teams may have a null value and I wish to substitute this for "0".
In an attempt to work this out, I have tried to select the null value first but can't seem to do this:
$Hash.values | select -property Values
Values
------
{1, 2}
But
$Hash.values | select -property Values | where {$_.Values is $null}
doesn't pull back anything. Also tried:
$Hash.values | select -expandproperty Values | where {$_.Values is $null}
Any ideas?
thanks
Your best option is to cast the values to int when creating the hashtable:
$Hash = @{
Team1 = [int]$Team1.Count
Team2 = [int]$Team2.Count
Team3 = [int]$Team3.Count
}
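The cast works because converting $null to [int] yields 0. A tiny sketch with made-up team variables (one populated, one never assigned):

```powershell
# Hypothetical inputs: $Team2 stands for a team with no members.
$Team1 = @('alice', 'bob')
$Team2 = $null

$Hash = @{
    Team1 = [int]$Team1.Count   # 2
    Team2 = [int]$Team2.Count   # missing count is converted to 0
}
$Hash
```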
If that's not possible for some reason you could go with an enumerator:
($Hash.GetEnumerator()) | ForEach-Object {
if ($_.Value -eq $null) { $Hash[$_.Name] = 0 }
}
or (as Mathias suggested) use the Keys property to the same end:
($Hash.Keys) | ForEach-Object {
if ($Hash[$_] -eq $null) { $Hash[$_] = 0 }
}
Note that either way you need to use a subexpression (or assign the enumerated objects/keys to a variable) otherwise you'll get an error because you're modifying a data structure while it's being enumerated.
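Assigning the keys to a variable first can be sketched as follows. The sample values are assumptions; the point is only that @($Hash.Keys) copies the keys into a separate array, so the hashtable is no longer being enumerated while it is modified.

```powershell
# Hypothetical hashtable with one null value.
$Hash = @{ Team1 = 3; Team2 = $null; Team3 = 5 }

# Snapshot the keys, then update the hashtable freely.
$keys = @($Hash.Keys)
foreach ($key in $keys) {
    if ($null -eq $Hash[$key]) { $Hash[$key] = 0 }
}
$Hash
```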
What you'll want to do is collect the keys that refer to null values, and then populate those with 0s:
# Create and populate hashtable
$HashTable = @{
Team1 = 123
Team2 = $null
Team3 = 456
}
# Find keys of `$null` values
$nullKeys = $HashTable.Keys |Where-Object { $HashTable[$_] -eq $null }
# Populate appropriate indices with 0
$nullKeys |ForEach-Object { $HashTable[$_] = 0 }

How to convert string to integer in PowerShell

I have a list of directories with numbers. I have to find the highest number, increment it by 1, and create a new directory with that incremented value. I am able to sort the array below, but I am not able to increment the last element as it is a string.
How do I convert this below array element to an integer?
PS C:\Users\Suman\Desktop> $FileList
Name
----
11
2
1
You can put a type in front of a value to force its type. This is called (dynamic) casting:
$string = "1654"
$integer = [int]$string
$string + 1
# Outputs 16541
$integer + 1
# Outputs 1655
As an example, the following snippet adds, to each object in $fileList, an IntVal property with the integer value of the Name property, then sorts $fileList on this new property (the default is ascending), takes the last (highest IntVal) object's IntVal value, increments it and finally creates a folder named after it:
# For testing purposes
#$fileList = @([PSCustomObject]@{ Name = "11" }, [PSCustomObject]@{ Name = "2" }, [PSCustomObject]@{ Name = "1" })
# OR
#$fileList = New-Object -TypeName System.Collections.ArrayList
#$fileList.AddRange(@([PSCustomObject]@{ Name = "11" }, [PSCustomObject]@{ Name = "2" }, [PSCustomObject]@{ Name = "1" })) | Out-Null
$highest = $fileList |
Select-Object *, @{ n = "IntVal"; e = { [int]($_.Name) } } |
Sort-Object IntVal |
Select-Object -Last 1
$newName = $highest.IntVal + 1
New-Item $newName -ItemType Directory
Sort-Object IntVal can only be removed if $fileList is already in ascending numeric order, because Select-Object -Last 1 simply takes the final item it receives.
[int]::MaxValue = 2147483647 so you need to use the [long] type beyond this value ([long]::MaxValue = 9223372036854775807).
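A short sketch of that limit, using a made-up value one above [int]::MaxValue. The -as operator is used here because it returns $null instead of throwing when a conversion fails:

```powershell
$big = '2147483648'          # [int]::MaxValue + 1, as a string

$asLong = [long]$big         # fits in a 64-bit integer
$asInt  = $big -as [int]     # $null, because the value does not fit in 32 bits

$asLong + 1
$null -eq $asInt
```

A plain [int]$big cast would raise a conversion error for the same input.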
Example:
2.032 MB (2,131,022 bytes)
$u=($mbox.TotalItemSize.value).tostring()
$u=$u.trimend(" bytes)") #yields 2.032 MB (2,131,022
$u=$u.Split("(") #yields `$u[1]` as 2,131,022
$uI=[int]$u[1]
The result is 2131022 in integer form.
Use:
$filelist = @(11, 1, 2)
$filelist | sort @{expression={$_[0]}} |
% {$newName = [string]([int]$($_[0]) + 1)}
New-Item $newName -ItemType Directory
Use:
$filelist = @("11", "1", "2")
$filelist | sort @{expression={[int]$_}} | % {$newName = [string]([int]$_ + 1)}
New-Item $newName -ItemType Directory
If someone is looking for how this can be run from command line, as a single command, this is one way it can be done:
$FileList | # Writes the array to the pipeline
Select-Object -Last 1 | # Selects the last item in the array
ConvertFrom-String -TemplateContent "{[int]NameTmp:12}" | # Converts the string to a number and names the property "NameTmp"
Add-Member -Name "Name" -Value { $this.NameTmp + 1 } -MemberType ScriptProperty -PassThru | # Adds a "Name" property holding "NameTmp" incremented by one
New-Item -Type Directory # Creates a new directory in the current folder, taking its name from the pipelined "Name" property
Once you have selected the highest value, which is "11" in my example, you can then cast it to an integer and increment it:
$FileList = "1", "2", "11"
$foldername = [int]$FileList[2] + 1
$foldername
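If the list is not already sorted, one way to avoid relying on position is to convert every name to [int] and let Measure-Object pick the maximum. This is a sketch with made-up objects shaped like the $FileList in the question (a Name property holding a numeric string); the New-Item call is left commented out so the snippet has no side effects.

```powershell
# Hypothetical stand-in for the directory listing in the question.
$FileList = @(
    [PSCustomObject]@{ Name = '11' },
    [PSCustomObject]@{ Name = '2' },
    [PSCustomObject]@{ Name = '1' }
)

# Convert each name to an integer, then take the numeric maximum.
$highest = [int]($FileList | ForEach-Object { [int]$_.Name } | Measure-Object -Maximum).Maximum
$newName = $highest + 1
$newName

# New-Item -Name $newName -ItemType Directory
```

Sorting the raw strings instead would place "2" after "11", which is why the conversion comes first.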

Powershell looping through Object Variables returning values

Suppose I have object $foo with many (500+) properties.
$foo.q1_sales = "1000"
$foo.q1_expense = "800"
$foo.q2_sales = "1325"
$foo.q2_expense = "1168"
$foo.q3_sales = "895"
$foo.q3_expense = "980"
$foo.q4_sales = "900"
$foo.q4_expense = "875"
...
I want to loop through all properties in $foo and get each value and process it in some way.
$quarters = @("1","2","3","4")
foreach($quarter in $quarters) {
if($foo.q$quarter_sales -gt $foo.q$quarter_expense) {
#process data
}
}
How do I accomplish this? Get-Variable? Get-Member? some combination? Some other way?
Changing the structure of $foo is not an option, unless we can do it programmatically. Sorry.
You can use a subexpression to evaluate the property name, such as:
$quarters = @("1","2","3","4")
foreach($quarter in $quarters) {
if($foo.$("q"+$quarter+"_sales") -gt $foo.$("q"+$quarter+"_expense")) {
#process data
}
}
That will evaluate the sub-expressions first, so it figures out "q"+$quarter+"_sales" and then just evaluates $foo.q1_sales as a result.
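An equivalent way to write the lookup is a double-quoted string as the property name. The sketch below uses a made-up two-quarter $foo and also casts to [int], because the sample values are strings and "1000" -gt "800" would otherwise be compared as text (and come out false):

```powershell
# Hypothetical object with string values, as in the question.
$foo = New-Object PSObject -Property @{
    q1_sales = '1000'; q1_expense = '800'
    q2_sales = '895';  q2_expense = '980'
}

$profitable = foreach ($quarter in '1', '2') {
    # "q${quarter}_sales" expands to q1_sales, q2_sales, ...
    $sales   = [int]$foo."q${quarter}_sales"
    $expense = [int]$foo."q${quarter}_expense"
    if ($sales -gt $expense) { $quarter }
}
$profitable
```

The braces in ${quarter} are needed so that the underscore is not read as part of the variable name.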
Get-Member is a good thought. Here's a generalized attempt to expand on that, so that you can see how it could be done. For my example, my object will be a DateTime:
# Define our object
$object = (Get-Date)
# Get the property names
$properties = $object | Get-Member -MemberType "Property" | % { $_.Name }
# Get our collection of values by iterating the collection of properties
# and for each property getting the value.
$values = $properties | % { $object."$_" }
And then the output would just be the values of each property of DateTime:
$values
Tuesday, October 14, 2014 12:00:00 AM
14
Tuesday
287
17
Local
972
44
10
45
635489054859729996
Ticks : 638859729996
Days : 0
Hours : 17
Milliseconds : 972
Minutes : 44
Seconds : 45
TotalDays : 0.739420983791667
TotalHours : 17.746103611
TotalMilliseconds : 63885972.9996
TotalMinutes : 1064.76621666
TotalSeconds : 63885.9729996
2014
This assumes that you only want members of type "Property", so it may not go far enough if you're also after NoteProperty members. For those, -MemberType Properties covers every property kind.
Otherwise, this should work for any arbitrary type with properties. Just swap out my $object with yours.
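For objects like $foo, whose members are NoteProperties, the hidden PSObject member gives you names and values in one pass without Get-Member. A minimal sketch with a made-up two-property object:

```powershell
# Hypothetical object; New-Object -Property creates NoteProperties.
$foo = New-Object PSObject -Property @{ q1_sales = '1000'; q1_expense = '800' }

# Each entry exposes Name, Value and MemberType.
$pairs = foreach ($property in $foo.PSObject.Properties) {
    '{0} = {1}' -f $property.Name, $property.Value
}
$pairs
```

The order of the properties is not guaranteed here, because the object was built from a plain hashtable.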
@TheMadTechnician has already provided the simplest answer. Here's an alternative way:
#Sampledata
$foo = New-Object psobject -Property @{
q1_sales = "1000"
q1_expense = "800"
q2_sales = "1325"
q2_expense = "1168"
q3_sales = "895"
q3_expense = "980"
q4_sales = "900"
q4_expense = "875"
}
#Convert to array
$array = $foo.psobject.Properties |
#Group by quarter id(number) so we can create an object per quarter
Group-Object { $_.Name -replace 'q(\d+)_.*', '$1' } |
ForEach-Object {
New-Object psobject -Property @{
Quarter = $_.Name
Sales = [int]($_.Group | Where-Object { $_.Name -match 'sales' } | Select-Object -ExpandProperty Value)
Expense = [int]($_.Group | Where-Object { $_.Name -match 'expense' } | Select-Object -ExpandProperty Value)
}
}
#Get quarters with positive result
$array | Where-Object { $_.Sales -gt $_.Expense } | ForEach-Object {
#process data (sends object through in this sample)
$_
}
Output:
Sales Expense Quarter
----- ------- -------
1325 1168 2
1000 800 1
900 875 4