Array unrolling in PowerShell

I'm trying to understand array unrolling in Posh and the following puzzles me. In this example I return the array wrapped in another array (using a comma), which basically gives me the original array after the outer array gets unrolled on return. Why then do I get a correct result (just the array) when I indirectly wrap the result by using the array operator (@) on a variable holding the result, but get an array in an array if I use @ directly on the function? Does the @ operator behave differently for different types of its parameter?
Is there any PS help article for the @ operator and array unrolling?
function Unroll
{
return ,@(1,2,3)
}
Write-Host "No wrapper"
$noWrapper = Unroll #This is array[3]
$noWrapper | % { $_.GetType() }
Write-Host "`nWrapped separately"
$arrayWrapper = @($noWrapper) #No change, still array[3]
$arrayWrapper | % { $_.GetType() }
Write-Host "`nWrapped directly"
$directArrayWrapper = @(Unroll) #Why is this array[1][3] then?
$directArrayWrapper | % { $_.GetType() }
Write-Host "`nThe original array in elem 0"
$directArrayWrapper[0] | % { $_.GetType() }
Thanks

If you remove the comma operator from the Unroll function, it behaves as you would expect.
function Unroll
{
return @(1,2,3)
}
In this case $arrayWrapper = @($noWrapper) is the same as $directArrayWrapper = @(Unroll).
You may find more information about array unrolling in this SO question and in this article about array literals in PowerShell.

Looking at this other question, it seems that
Putting the results (an array) within a grouping expression (or
subexpression e.g. $()) makes it eligible again for unrolling.
which seems to be what happens here.
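The difference can be checked by comparing element counts. The sketch below reuses the Unroll function from the question; the variable names $fromVariable and $fromCall are only illustrative.

```powershell
function Unroll {
    # The unary comma wraps the array in a one-element outer array.
    return , @(1, 2, 3)
}

# Assignment: the outer wrapper is unrolled on output, leaving the inner array.
$noWrapper = Unroll

# @() around a variable: the stored 3-element array stays a 3-element array.
$fromVariable = @($noWrapper)

# @() around the call: the wrapper is unrolled once, the inner array arrives
# as a single output object, and @() collects it into a one-element array.
$fromCall = @(Unroll)

'{0} {1} {2} {3}' -f $noWrapper.Count, $fromVariable.Count, $fromCall.Count, $fromCall[0].Count
```

This should print 3 3 1 3, matching the array[3] and array[1][3] shapes described in the question.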

Related

Powershell only outputting first character

I have the following dataset:
id|selectedquery|
1|SELECT fieldX FROM tableA|
2|SELECT fieldY FROM tableB|
that dataset is used in the following code
$rows=($dataSet.Tables | Select-Object -Expand Rows)
$i=0
foreach ($row in $rows)
{
#Write-Output $rows.selectquery[$i].length
$query = $rows.selectquery[$i]
#Write-Output $rows.selectquery[$i]
# ... doing some stuff ...
$i++
}
Often $rows.selectquery[$i] only gives me the first character of the value in the field selectedquery being the 'S'.
When I remove the [$i] from $rows.selectquery it gives me (understandably) multiple records back. If I then put the [$i] back after $rows.selectquery, things work fine.
Can anyone explain this behaviour?
You'll want to reference the SelectQuery property on either $row or $rows[$i] - not the entire $rows collection:
$row.SelectQuery
# or
$rows[$i].SelectQuery
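A minimal sketch of the corrected loop, assuming the rows look like the sample dataset; the [pscustomobject] instances below are only stand-ins for the real DataSet rows, and $queries is an illustrative name.

```powershell
# Stand-in for the rows of the question's dataset (assumed shape).
$rows = @(
    [pscustomobject] @{ id = 1; selectquery = 'SELECT fieldX FROM tableA' }
    [pscustomobject] @{ id = 2; selectquery = 'SELECT fieldY FROM tableB' }
)

$queries = foreach ($row in $rows) {
    # Read the property from the current row, not from the whole collection.
    $row.selectquery
}

$queries
```

Because the property is read from one row at a time, no index is needed and the result does not depend on how many rows there are.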
Mathias' helpful answer shows the best way to solve your particular problem.
As for what happened:
You - inadvertently - used PowerShell's member-access enumeration feature when you used $rows.selectquery; that is, even though $rows is a collection that itself has no .selectquery property, PowerShell accessed that property on every element of the collection and returned the resulting values as an array.
The pitfall is that if the collection only has one element, the return value is not an array - it is just the one and only element's property value itself.
While this is analogous to how the pipeline operates (a single output object is captured by itself if assigned to a variable, while two or more are implicitly collected in an array), it is somewhat counterintuitive in the context of member-access enumeration:
In other words, $collection.SomeProperty is equivalent to $collection | ForEach-Object { $_.SomeProperty } and not to $collection.ForEach('SomeProperty'), which would make more sense, because the latter always returns an array (collection).
GitHub issue #6802 discusses this problem.
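The three forms can be compared on a single-element collection. This is only a sketch; the sample object and the variable names are made up for illustration.

```powershell
# A collection with exactly one element.
$single = @([pscustomobject] @{ SomeProperty = 'baz' })

$viaDot      = $single.SomeProperty                          # member-access enumeration
$viaPipeline = $single | ForEach-Object { $_.SomeProperty }  # pipeline
$viaMethod   = $single.ForEach('SomeProperty')               # .ForEach() method

$viaDot -is [string]       # True: the lone value itself, not an array
$viaPipeline -is [string]  # True: same behavior as the pipeline
$viaMethod.Count           # 1: still a collection
$viaDot[0]                 # b   (indexes into the string)
$viaMethod[0]              # baz (indexes into the collection)
```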
While this behavior is often unproblematic, because PowerShell offers unified handling of scalars and collections (e.g. (42)[0] is the same as 42 itself; see this answer), a problem arises if the single value returned happens to be a string, because indexing into a string returns its characters.
Workaround: Cast to [array] before applying the index:
([array] $rows.selectquery)[0]
A simple example:
# Multi-element array.
[array] $rows1 = [pscustomobject] @{ selectquery = 'foo' },
[pscustomobject] @{ selectquery = 'bar' }
# Single-element array:
[array] $rows2 = [pscustomobject] @{ selectquery = 'baz' }
# Contrast member-access enumeration + index access between the two:
[pscustomobject] @{
MultiElement = $rows1.selectquery[0]
SingleElement = $rows2.selectquery[0]
SinglElementWithWorkaround = ([array] $rows2.selectquery)[0]
}
The above yields the following:
MultiElement SingleElement SinglElementWithWorkaround
------------ ------------- --------------------------
foo          b             baz
As you can see, the multi-element array worked as expected, because the member-access enumeration returned an array too, while the single-element array resulted in single string 'baz' being returned and 'baz'[0] returns its first character, 'b'.
Casting to [array] first avoids that problem (([array] $rows2.selectquery)[0]).
Using @(...), the array-subexpression operator - @($rows.selectquery)[0] - is another option, but, for the sake of efficiency, it should only be used on commands (e.g., @(Get-ChildItem -Name *.txt)[0]), not on expressions, as in the case at hand.

PowerShell create array failed in a loop

I thought I had read enough examples here and elsewhere, but I still fail at creating arrays in PowerShell.
With the following code I hoped to slice an array into pairs of values.
$values = @('hello','world','bonjour','moon','ola','mars')
function slice_array {
param (
[String[]]$Items
)
[int16] $size = 2
$pair = [string[]]::new($size) # size is 2
$returns = [System.Collections.ArrayList]@()
[int16] $_i = 0
foreach($item in $Items){
$pair[$_i] = $Item
$_i++;
if($_i -gt $size - 1){
$_i = 0
[void]$returns.Add($pair)
}
}
return $returns
}
slice_array($values)
the output is
ola
mars
ola
mars
ola
mars
I would hope for
'hello','world'
'bonjour','moon'
'ola','mars'
Is it possible to slice that array into an array of arrays of length 2?
Any explanation why it doesn't work as expected?
How should the code be changed?
Thanks for any hint to properly understand arrays in PowerShell!
Here's a PowerShell-idiomatic solution (the fix required for your code is in the bottom section):
The function is named Get-Slices to adhere to PowerShell's verb-noun naming convention (see the docs for more information).
Note: Often, the singular form of the noun is used, e.g. Get-Item rather than Get-Items, given that you situationally may get one or multiple output values; however, since the express purpose here is to slice a single object into multiple parts, I've chosen the plural.
The slice size (count of elements per slice) is passed as a parameter.
The function uses .., the range operator, to extract a single slice from an array.
It uses PowerShell's implicit output behavior (no need for return, no need to build up a list of return values explicitly; see this answer for more information).
It shows how to output an array as a whole from a function, which requires wrapping it in an auxiliary single-element array using the unary form of ,, the array constructor operator. Without this auxiliary array, the array's elements would be output individually to the pipeline (which is also used for function / script output; see this answer for more information).
# Note: For brevity, argument validation, pipeline support, error handling, ...
# have been omitted.
function Get-Slices {
param (
[String[]] $Items
,
[int] $Size # The slice size (element count)
)
$sliceCount = [Math]::Ceiling($Items.Count / $Size)
if ($sliceCount -le 1) {
# array is empty or as large as or smaller than a slice? ->
# wrap it *twice* to ensure that the output is *always* an
# *array of arrays*, in this case containing just *one* element
# containing the original array.
,, $Items
}
else {
foreach ($offset in 0..($sliceCount-1)) {
, $Items[($offset * $Size)..(($offset+1) * $Size - 1)] # output this slice
}
}
}
To slice an array into pairs and collect the output in an array of arrays (jagged array):
$arrayOfPairs =
Get-Slices -Items 'hello','world','bonjour','moon','ola','mars' -Size 2
Note:
Shell-like syntax is required when you call functions (commands in general) in PowerShell: arguments are whitespace-separated and not enclosed in (...) (see this answer for more information)
Since a function's declared parameters are positional by default, naming the arguments as I've done above (-Items ..., -Size ...) isn't strictly necessary, but helps readability.
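To see why method-style calls are a pitfall, consider this small sketch. Show-Args is a hypothetical helper that only reports what it received.

```powershell
# Hypothetical helper: reports the value bound to each parameter.
function Show-Args {
    param ($First, $Second)
    "First=[$First] Second=[$Second]"
}

# Method-style call: (1, 2) is a single array argument, bound to -First.
$methodStyle = Show-Args(1, 2)

# Shell-style call: two whitespace-separated arguments.
$shellStyle = Show-Args 1 2

$methodStyle   # First=[1 2] Second=[]
$shellStyle    # First=[1] Second=[2]
```

With a single argument, as in slice_array($values), the parentheses happen to be harmless, which is why the mistake often goes unnoticed until a second parameter is added.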
Two sample calls:
"`n-- Get pairs (slice count 2):"
Get-Slices -Items 'hello','world','bonjour','moon','ola','mars' -Size 2 |
ForEach-Object { $_ -join ', ' }
"`n-- Get slices of 3:"
Get-Slices -Items 'hello','world','bonjour','moon','ola','mars' -Size 3 |
ForEach-Object { $_ -join ', ' }
The above yields:
-- Get pairs (slice count 2):
hello, world
bonjour, moon
ola, mars
-- Get slices of 3:
hello, world, bonjour
moon, ola, mars
As for what you tried:
The only problem with your code was that you kept reusing the very same auxiliary array for collecting each pair of elements, so that each iteration overwrote the elements stored by the previous one. In the end, your array list contained multiple references to the same pair array, reflecting the last pair only.
This behavior occurs because arrays are instances of reference types rather than value types - see this answer for background information.
The simplest solution is to add a (shallow) clone of your $pair array to your list, which ensures that each list entry is a distinct array:
[void]$returns.Add($pair.Clone())
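Put together, the original function with that one-line fix might look like the sketch below. The loop counter was renamed to $i and the call uses shell-style syntax; otherwise the logic is the question's own.

```powershell
function slice_array {
    param ([string[]] $Items)

    $size = 2
    $pair = [string[]]::new($size)
    $returns = [System.Collections.ArrayList]::new()
    $i = 0
    foreach ($item in $Items) {
        $pair[$i] = $item
        $i++
        if ($i -ge $size) {
            $i = 0
            # Store a copy, so that later writes to $pair don't affect this entry.
            [void] $returns.Add($pair.Clone())
        }
    }
    return $returns
}

$pairs = slice_array 'hello', 'world', 'bonjour', 'moon', 'ola', 'mars'
$pairs | ForEach-Object { $_ -join ', ' }
```

Each entry in $pairs is now a distinct two-element array, so the three lines printed are the three different pairs.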
Why you got 3 equal pairs instead of different pairs:
.NET (which PowerShell is built on) is an object-oriented platform that has the concept of reference types and value types. Most types, including arrays, are reference types.
What happens in your code:
1. You create the $pair = [string[]] object. The $pair variable actually stores the memory address of (a reference to) the [string[]] object, because arrays are reference types.
2. You fill the $pair array with values.
3. You add (!) $pair to $returns. Remember that $pair is a reference to a memory block, so what gets added to $returns is the memory address of the [string[]] you wrote the values to.
4. You repeat step 2: you fill the $pair array with different values, but the address of this array in memory stays the same. Doing this, you actually replace the values from step 2 with new values in the same $pair object.
5. Same as step 3.
6. Same as step 4.
7. Same as step 3.
As a result, $returns contains the same memory address three times: [[reference to $pair], [reference to $pair], [reference to $pair]]. And the values in $pair were overwritten with the values of the last pair.
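The aliasing described in these steps can be reproduced in a few lines. This is a reduced sketch, with $list standing in for $returns.

```powershell
$pair = [string[]]::new(2)
$list = [System.Collections.ArrayList]::new()

$pair[0] = 'hello'; $pair[1] = 'world'
[void] $list.Add($pair)     # stores a reference to $pair, not a copy

$pair[0] = 'ola'; $pair[1] = 'mars'
[void] $list.Add($pair)     # stores the very same reference again

# Both entries point at one and the same array object.
$sameObject = [object]::ReferenceEquals($list[0], $list[1])
$sameObject           # True
$list[0] -join ', '   # ola, mars
```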
On output it works like this:
PowerShell looks at $returns, which is an array.
PowerShell looks at $returns[0], which references $pair
PowerShell outputs $pair[0]
PowerShell outputs $pair[1]
PowerShell looks at $returns[1], which references $pair
PowerShell outputs $pair[0]
PowerShell outputs $pair[1]
PowerShell looks at $returns[2], which references $pair
PowerShell outputs $pair[0]
PowerShell outputs $pair[1]
So you see, you output the object at the same memory address three times. You have overwritten it 3 times in slice_array, and now it stores only the values of the last pair.
To fix it in your code, you should create a new $pair array in memory each time: add $pair = [string[]]::new($size) right after $returns.Add($pair).

Creating a function to loop through Powershell

I am trying to create a function that loops through a JSON file and, when a specific condition is met, adds the values of the JSON keys to an array, but I am not able to accomplish this.
Following is the json:
$a1 = @'
[ {
"name": "sachin",
"surname": ""
},
{
"name": "Rajesh",
"surname": "Mehta"
}
]
'@
and the function that i have created is below:
function addingkeyval
{
param ($key,
[Array]$tobeaddedarr)
$a1 | Select-Object -Property name,surname | ForEach-Object {
if($_.$key-eq "")
{
}
else
{
$tobeaddedarr +=$_.$key+","
}
}
}
When I call this function with the statement:
$surnamearr = @()
addingkeyval -key surname -tobeaddedarr $surnamearr
I want the output in the array $surnamearr as below
("Mehta")
Somehow I am not able to add the value to the array. Can anyone help me achieve this?
$surnamearr = ([string[]] (ConvertFrom-Json $a1).surname) -ne ''
ConvertFrom-Json $a1 converts the JSON array to an array of PowerShell custom objects (type [pscustomobject]).
(...).surname extracts the values of all .surname properties of the objects in the resulting array, using member-access enumeration.
Cast [string[]] ensures that the result is treated as an array (it wouldn't be, if the array in $a1 happened to contain just one element) and explicitly casts to strings, so that filtering nonempty values via -ne '' also works for input objects that happen to have no .surname property.
-ne '' filters out all empty-string elements from the resulting array (with an array as the LHS, comparison operators such as -ne act as array filters rather than returning a Boolean value).
The result is that $surnamearr is an array containing all non-empty .surname property values.
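The filtering behavior of -ne can be seen in isolation in this short sketch; the sample values are made up.

```powershell
$values = '', 'Mehta', '', 'Singh'

# Array on the LHS: -ne acts as a filter and returns the matching elements.
$nonEmpty = $values -ne ''

# Scalar on the LHS: -ne returns a Boolean.
$isNonEmpty = 'Mehta' -ne ''

$nonEmpty -join ', '   # Mehta, Singh
$isNonEmpty            # True
```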
As for what you tried:
$tobeaddedarr +=$_.$key+","
This statement implicitly creates a new array, distinct from the array reference the caller passed via the $tobeaddedarr parameter variable - therefore, the caller never sees the change.
You generally cannot pass an array to be appended to in-place, because arrays are immutable with respect to their number of elements.
While you could pass a mutable array-like structure (such as a [System.Collections.Generic.List[object]] instance, which you can append to with its .Add() method), it is much simpler to build a new array inside the function and output it.
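For illustration, the mutable-list alternative might look like the sketch below. The function name Add-KeyValue and its parameters -Target and -InputObjects are hypothetical, not part of the question's code.

```powershell
# Hypothetical variant: appends non-empty property values to a caller-supplied list.
function Add-KeyValue {
    param (
        [string] $Key,
        [System.Collections.Generic.List[object]] $Target,
        [object[]] $InputObjects
    )
    foreach ($obj in $InputObjects) {
        $value = $obj.$Key
        if (-not [string]::IsNullOrEmpty($value)) {
            $Target.Add($value)   # modifies the caller's list in place
        }
    }
}

$people = ConvertFrom-Json '[{"name":"sachin","surname":""},{"name":"Rajesh","surname":"Mehta"}]'
$surnames = [System.Collections.Generic.List[object]]::new()
Add-KeyValue -Key surname -Target $surnames -InputObjects $people
$surnames   # Mehta
```

This works because the list is passed by reference and .Add() changes the existing object, whereas += on an array creates a new one. Building and outputting a new array avoids this extra plumbing.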
Therefore, you could write your function as:
function get-keyval
{
param ($key)
# Implicitly output the elements of the array that this expression produces.
# Note: relies on $a1, the JSON string from the question, being defined by the caller.
([string[]] (ConvertFrom-Json $a1).$key) -ne ''
}
$surnamearr = get-keyval surname
Note:
Outputting an array or a collection in PowerShell by default enumerates it: it sends its elements one by one to the output stream (pipeline).
While it is possible to output an array as a whole, via the unary form of ,, the array-construction operator (, (([string[]] (ConvertFrom-Json $a1).surname) -ne '')), which improves performance, the problem is that the caller may not expect this behavior, especially not in a pipeline.
If you capture the output in a variable, such as $surnamearr here, PowerShell automatically collects the individual output objects in an [object[]] array for you; note, however, that a single output object is returned as-is, i.e. not wrapped in an array.
You can wrap @(...) around the function call to ensure that you always get an array.
Alternatively, type-constrain the receiving variable: [array] $surnamearr = ..., which also creates an [object[]] array, or something like [string[]] $surnamearr = ..., which creates a strongly typed string array.
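The options for capturing a single output object can be compared side by side. Get-One is a hypothetical function that outputs exactly one string.

```powershell
# Hypothetical function that outputs a single object.
function Get-One { 'Mehta' }

$plain = Get-One                 # captured as-is: a string
$wrapped = @(Get-One)            # always an array
[array] $constrained = Get-One   # type-constrained: [object[]]
[string[]] $typed = Get-One      # type-constrained: strongly typed string array

$plain.GetType().Name         # String
$wrapped.GetType().Name       # Object[]
$constrained.GetType().Name   # Object[]
$typed.GetType().Name         # String[]
```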

Nested For-Each flattens?

There's an array of objects, where each has a collection of objects, where each has a string property. When I do a nested iteration:
$TheArray | %{$_.TheCollection | %{$_.TheProperty}}
it seems like I end up not with an array of string arrays, but with a 1D array of strings. Is that by design? That is the desired behavior in the first place, but utterly unexpected.
Yes, that output makes sense to me, at least on an intuitive level. I can't explain in accurate technical detail, but the only object written to the pipeline in your expression
$TheArray | %{$_.TheCollection | %{$_.TheProperty} }
is the inner-most
$_.TheProperty
Since this evaluates to a String, a number of Strings are accumulated in the pipeline and returned in an array.
Here's some sample code that mocks-up what you've described:
class HasProperty {
[String] $TheProperty;
HasProperty ([String] $prop){
$this.TheProperty = $prop
}
}
class SomeObject {
[HasProperty[]] $TheCollection
SomeObject ([HasProperty[]] $array) {
$this.TheCollection = $array
}
}
[SomeObject[]]$TheArray = @()
$TheArray = foreach ($i in (0..9)) {
[HasProperty[]]$tempArray = foreach ($n in (0..3)) { [HasProperty]::new("Property$i-$n") }
[SomeObject]::new($tempArray)
}
$TheArray | %{$_.TheCollection | %{$_.TheProperty} }
PowerShell's object-oriented pipeline makes it easy to extract values from some collection of objects. I've used it to get the group membership of a collection of users to determine how their memberships overlap, for instance.
PowerShell by default enumerates (unrolls) collections when outputting them to the pipeline: that is, instead of outputting the collection itself, its elements are output, one by one.
PowerShell collects all output objects from a pipeline in a flat array.
Therefore, even outputting multiple arrays creates a single, flat output array by default, which is the concatenation of these arrays.
A simpler example:
# Output 2 2-element arrays.
> 1..2 | % { @(1, 2) } | Measure-Object | % Count
4 # i.e., @(1, 2, 1, 2).Count
In order to produce nested arrays, you must suppress enumeration, which can be achieved in two ways:
Simplest option: Wrap the output array in an aux. outer array so that enumerating the outer array yields the embedded one as a single output object:
# Use unary , to wrap the RHS in a single-element array.
> 1..2 | % { , @(1, 2) } | Measure-Object | % Count
2 # i.e., @(@(1, 2), @(1, 2)).Count
Alternative, using Write-Output -NoEnumerate (PSv4+):
> 1..2 | % { Write-Output -NoEnumerate @(1, 2) } | Measure-Object | % Count
2 # i.e., @(@(1, 2), @(1, 2)).Count
Note: While use of @(...) is not necessary to create array literals (as used above) - for literals, separating elements with , is sufficient - you still need @(...) to ensure that output from enclosed expressions or commands is treated as an array, in case only a single object happens to be output.
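Applied to the nested iteration from the question, the wrapping technique might look like this sketch. The sample data is made up; only the property names TheCollection and TheProperty come from the question.

```powershell
# Two outer objects, each holding a collection of two inner objects.
$TheArray = @(
    [pscustomobject] @{ TheCollection = @(
        [pscustomobject] @{ TheProperty = 'a1' }
        [pscustomobject] @{ TheProperty = 'a2' }
    ) }
    [pscustomobject] @{ TheCollection = @(
        [pscustomobject] @{ TheProperty = 'b1' }
        [pscustomobject] @{ TheProperty = 'b2' }
    ) }
)

# Default behavior: all strings end up in one flat array.
$flat = $TheArray | ForEach-Object { $_.TheCollection | ForEach-Object { $_.TheProperty } }

# Wrapping each inner result with unary , preserves one array per outer object.
$nested = $TheArray | ForEach-Object {
    , @($_.TheCollection | ForEach-Object { $_.TheProperty })
}

$flat.Count           # 4
$nested.Count         # 2
$nested[1] -join ','  # b1,b2
```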

Receive array with one array element from function

Good day to all!
I have some function which splits an input array to $partitionCount partitions. Return value should be an array of arrays.
If $partitionCount equals 1, you don't need to perform any split logic and can return the input array itself. In that case PowerShell flattens the returned result into a simple array (not an array of arrays, as required).
I have tried the following approaches found on Stack Overflow and other resources:
return @($input)
flattens result into single array
return @(,$input)
also flattens result into single array
return @($input, @())
it works but uses empty array to prevent flattening, bad one
$bucket = @(Split-Array $input 1)
place an array directive in call site, works fine, but you should place it in each call, which is very confusing and not obvious
Is there any right way to handle that case?
To prevent flattening of a single-item array, prepend a comma to the array you're trying to return (,@($value)):
function Get-SingleArray {
param($InputArray)
return ,@($InputArray[0])
}
Demonstration:
PS C:\> $a = Get-SingleArray 2,3,4,5
PS C:\> $a[0] -eq 2
True
PS C:\> $a.GetType().FullName
System.Object[]
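Applied to the partitioning scenario from the question, the same idea might look like the sketch below. The question does not show its Split-Array, so the parameter names and the chunking logic here are assumptions. In the single-partition case, the input is wrapped twice, so that one wrapper survives the unrolling on output.

```powershell
# Assumed reimplementation; the question's actual Split-Array is not shown.
function Split-Array {
    param (
        [object[]] $InputArray,
        [int] $PartitionCount = 1
    )
    if ($PartitionCount -le 1 -or $InputArray.Count -le 1) {
        # Wrap twice: the outer wrapper is unrolled on output, the inner one
        # remains, so the caller receives a one-element array of arrays.
        return , (, $InputArray)
    }
    $size = [int] [Math]::Ceiling($InputArray.Count / $PartitionCount)
    for ($start = 0; $start -lt $InputArray.Count; $start += $size) {
        $end = [Math]::Min($start + $size, $InputArray.Count) - 1
        # Wrap each partition so that it is output as a single object.
        , @($InputArray[$start..$end])
    }
}

$one = Split-Array 1, 2, 3 1
$two = Split-Array 1, 2, 3, 4 2

$one.Count          # 1
$one[0].Count       # 3
$two.Count          # 2
$two[1] -join ','   # 3,4
```

The caller then gets an array of arrays in both cases without needing @(...) at each call site.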