I'm learning to work with the import-excel module and have successfully imported the data from a sample.xlsx file. I need to extract out the total amount based on the values of another column values. Basically, I want to just create a grouped data view where I can store the sum of values next to each type. Here's the sample data view.
Type Amount
level 1 $1.00
level 1 $2.00
level 2 $3.00
level 3 $4.00
level 3 $5.00
Now to import I'm just using the simple code
$fileName = "C:\SampleData.xlsx"
$data = Import-Excel -Path $fileName
#extracting distinct type values
$distinctTypes = $importedExcelRows | Select-Object -ExpandProperty "Type" -Unique
#looping through distinct types and storing it in the output
$output = foreach ($type in $distinctTypes)
{
$data | Group-Object $type | %{
New-Object psobject -Property #{
Type = $_.Name
Amt = ($_.Group | Measure-Object 'Amount' -Sum).Sum
}
}
}
$output
The output I'm looking for looks somewhat like:
Type Amount
level 1 $3.00
level 2 $3.00
level 3 $9.00
However, I'm getting nothing in the output. It's $null I think. Any help is appreciated I think I'm missing something in the looping.
You're halfway there by using Group-Object for this scenario, kudos on that part. Luckily, you can group by the type at your import and then measure the sum:
$fileName = "C:\SampleData.xlsx"
Import-Excel -Path $fileName | Group-Object -Property Type | % {
$group = $_.Group | % {
$_.Amount = $_.Amount -replace '[^0-9.]'
$_
} | Measure-Object -Property Amount -Sum
[pscustomobject]#{
Type = $_.Name
Amount = "{0:C2}" -f $group.Sum
}
}
Since you can't measure the amount in currency format, you can remove the dollar sign with some regex of [^0-9.], removing everything that is not a number, or ., or you could use ^\$ instead as well. This allows for the measurement of the amount and you can just format the amount back to currency format using the string format operator '{0:C2} -f ....
I don't know what your issue is but when the dollar signs are not part of the data you pull from the Excel sheet it should work as expected ...
$InputCsvData = #'
Type,Amount
level 1,1.00
level 1,2.00
level 2,3.00
level 3,4.00
level 3,5.00
'# |
ConvertFrom-Csv
$InputCsvData |
Group-Object -Property Type |
ForEach-Object {
[PSCustomObject]#{
Type = $_.Name
Amt = '${0:n2}'-f ($_.Group | Measure-Object -Property Amount -Sum).Sum
}
}
The ouptut looks like this:
Type Amt
---- ---
level 1 $3,00
level 2 $3,00
level 3 $9,00
Otherwise you may remove the dollar signs before you try to summarize the numbers.
Let's say we have an array of objects $objects. Let's say these objects have a "Name" property.
This is what I want to do
$results = #()
$objects | %{ $results += $_.Name }
This works, but can it be done in a better way?
If I do something like:
$results = objects | select Name
$results is an array of objects having a Name property. I want $results to contain an array of Names.
Is there a better way?
I think you might be able to use the ExpandProperty parameter of Select-Object.
For example, to get the list of the current directory and just have the Name property displayed, one would do the following:
ls | select -Property Name
This is still returning DirectoryInfo or FileInfo objects. You can always inspect the type coming through the pipeline by piping to Get-Member (alias gm).
ls | select -Property Name | gm
So, to expand the object to be that of the type of property you're looking at, you can do the following:
ls | select -ExpandProperty Name
In your case, you can just do the following to have a variable be an array of strings, where the strings are the Name property:
$objects = ls | select -ExpandProperty Name
As an even easier solution, you could just use:
$results = $objects.Name
Which should fill $results with an array of all the 'Name' property values of the elements in $objects.
To complement the preexisting, helpful answers with guidance of when to use which approach and a performance comparison.
Outside of a pipeline[1], use (requires PSv3+):
$objects.Name # returns .Name property values from all objects in $objects
as demonstrated in rageandqq's answer, which is both syntactically simpler and much faster.
Accessing a property at the collection level to get its elements' values as an array (if there are 2 or more elements) is called member-access enumeration and is a PSv3+ feature.
Alternatively, in PSv2, use the foreach statement, whose output you can also assign directly to a variable: $results = foreach ($obj in $objects) { $obj.Name }
If collecting all output from a (pipeline) command in memory first is feasible, you can also combine pipelines with member-access enumeration; e.g.:
(Get-ChildItem -File | Where-Object Length -lt 1gb).Name
Tradeoffs:
Both the input collection and output array must fit into memory as a whole.
If the input collection is itself the result of a command (pipeline) (e.g., (Get-ChildItem).Name), that command must first run to completion before the resulting array's elements can be accessed.
In a pipeline, in case you must pass the results to another command, notably if the original input doesn't fit into memory as a whole, use: $objects | Select-Object -ExpandProperty Name
The need for -ExpandProperty is explained in Scott Saad's answer (you need it to get only the property value).
You get the usual pipeline benefits of the pipeline's streaming behavior, i.e. one-by-one object processing, which typically produces output right away and keeps memory use constant (unless you ultimately collect the results in memory anyway).
Tradeoff:
Use of the pipeline is comparatively slow.
For small input collections (arrays), you probably won't notice the difference, and, especially on the command line, sometimes being able to type the command easily is more important.
Here is an easy-to-type alternative, which, however is the slowest approach; it uses ForEach-Object via its built-in alias, %, with simplified syntax (again, PSv3+):
; e.g., the following PSv3+ solution is easy to append to an existing command:
$objects | % Name # short for: $objects | ForEach-Object -Process { $_.Name }
Note: Use of the pipeline is not the primary reason this approach is slow, it is the inefficient implementation of the ForEach-Object (and Where-Object) cmdlets, up to at least PowerShell 7.2. This excellent blog post explains the problem; it led to feature request GitHub issue #10982; the following workaround greatly speeds up the operation (only somewhat slower than a foreach statement, and still faster than .ForEach()):
# Speed-optimized version of the above.
# (Use `&` instead of `.` to run in a child scope)
$objects | . { process { $_.Name } }
The PSv4+ .ForEach() array method, more comprehensively discussed in this article, is yet another, well-performing alternative, but note that it requires collecting all input in memory first, just like member-access enumeration:
# By property name (string):
$objects.ForEach('Name')
# By script block (more flexibility; like ForEach-Object)
$objects.ForEach({ $_.Name })
This approach is similar to member-access enumeration, with the same tradeoffs, except that pipeline logic is not applied; it is marginally slower than member-access enumeration, though still noticeably faster than the pipeline.
For extracting a single property value by name (string argument), this solution is on par with member-access enumeration (though the latter is syntactically simpler).
The script-block variant ({ ... }) allows arbitrary transformations; it is a faster - all-in-memory-at-once - alternative to the pipeline-based ForEach-Object cmdlet (%).
Note: The .ForEach() array method, like its .Where() sibling (the in-memory equivalent of Where-Object), always returns a collection (an instance of [System.Collections.ObjectModel.Collection[psobject]]), even if only one output object is produced.
By contrast, member-access enumeration, Select-Object, ForEach-Object and Where-Object return a single output object as-is, without wrapping it in a collection (array).
Comparing the performance of the various approaches
Here are sample timings for the various approaches, based on an input collection of 10,000 objects, averaged across 10 runs; the absolute numbers aren't important and vary based on many factors, but it should give you a sense of relative performance (the timings come from a single-core Windows 10 VM:
Important
The relative performance varies based on whether the input objects are instances of regular .NET Types (e.g., as output by Get-ChildItem) or [pscustomobject] instances (e.g., as output by Convert-FromCsv).
The reason is that [pscustomobject] properties are dynamically managed by PowerShell, and it can access them more quickly than the regular properties of a (statically defined) regular .NET type. Both scenarios are covered below.
The tests use already-in-memory-in-full collections as input, so as to focus on the pure property extraction performance. With a streaming cmdlet / function call as the input, performance differences will generally be much less pronounced, as the time spent inside that call may account for the majority of the time spent.
For brevity, alias % is used for the ForEach-Object cmdlet.
General conclusions, applicable to both regular .NET type and [pscustomobject] input:
The member-enumeration ($collection.Name) and foreach ($obj in $collection) solutions are by far the fastest, by a factor of 10 or more faster than the fastest pipeline-based solution.
Surprisingly, % Name performs much worse than % { $_.Name } - see this GitHub issue.
PowerShell Core consistently outperforms Windows Powershell here.
Timings with regular .NET types:
PowerShell Core v7.0.0-preview.3
Factor Command Secs (10-run avg.)
------ ------- ------------------
1.00 $objects.Name 0.005
1.06 foreach($o in $objects) { $o.Name } 0.005
6.25 $objects.ForEach('Name') 0.028
10.22 $objects.ForEach({ $_.Name }) 0.046
17.52 $objects | % { $_.Name } 0.079
30.97 $objects | Select-Object -ExpandProperty Name 0.140
32.76 $objects | % Name 0.148
Windows PowerShell v5.1.18362.145
Factor Command Secs (10-run avg.)
------ ------- ------------------
1.00 $objects.Name 0.012
1.32 foreach($o in $objects) { $o.Name } 0.015
9.07 $objects.ForEach({ $_.Name }) 0.105
10.30 $objects.ForEach('Name') 0.119
12.70 $objects | % { $_.Name } 0.147
27.04 $objects | % Name 0.312
29.70 $objects | Select-Object -ExpandProperty Name 0.343
Conclusions:
In PowerShell Core, .ForEach('Name') clearly outperforms .ForEach({ $_.Name }). In Windows PowerShell, curiously, the latter is faster, albeit only marginally so.
Timings with [pscustomobject] instances:
PowerShell Core v7.0.0-preview.3
Factor Command Secs (10-run avg.)
------ ------- ------------------
1.00 $objects.Name 0.006
1.11 foreach($o in $objects) { $o.Name } 0.007
1.52 $objects.ForEach('Name') 0.009
6.11 $objects.ForEach({ $_.Name }) 0.038
9.47 $objects | Select-Object -ExpandProperty Name 0.058
10.29 $objects | % { $_.Name } 0.063
29.77 $objects | % Name 0.184
Windows PowerShell v5.1.18362.145
Factor Command Secs (10-run avg.)
------ ------- ------------------
1.00 $objects.Name 0.008
1.14 foreach($o in $objects) { $o.Name } 0.009
1.76 $objects.ForEach('Name') 0.015
10.36 $objects | Select-Object -ExpandProperty Name 0.085
11.18 $objects.ForEach({ $_.Name }) 0.092
16.79 $objects | % { $_.Name } 0.138
61.14 $objects | % Name 0.503
Conclusions:
Note how with [pscustomobject] input .ForEach('Name') by far outperforms the script-block based variant, .ForEach({ $_.Name }).
Similarly, [pscustomobject] input makes the pipeline-based Select-Object -ExpandProperty Name faster, in Windows PowerShell virtually on par with .ForEach({ $_.Name }), but in PowerShell Core still about 50% slower.
In short: With the odd exception of % Name, with [pscustomobject] the string-based methods of referencing the properties outperform the scriptblock-based ones.
Source code for the tests:
Note:
Download function Time-Command from this Gist to run these tests.
Assuming you have looked at the linked code to ensure that it is safe (which I can personally assure you of, but you should always check), you can install it directly as follows:
irm https://gist.github.com/mklement0/9e1f13978620b09ab2d15da5535d1b27/raw/Time-Command.ps1 | iex
Set $useCustomObjectInput to $true to measure with [pscustomobject] instances instead.
$count = 1e4 # max. input object count == 10,000
$runs = 10 # number of runs to average
# Note: Using [pscustomobject] instances rather than instances of
# regular .NET types changes the performance characteristics.
# Set this to $true to test with [pscustomobject] instances below.
$useCustomObjectInput = $false
# Create sample input objects.
if ($useCustomObjectInput) {
# Use [pscustomobject] instances.
$objects = 1..$count | % { [pscustomobject] #{ Name = "$foobar_$_"; Other1 = 1; Other2 = 2; Other3 = 3; Other4 = 4 } }
} else {
# Use instances of a regular .NET type.
# Note: The actual count of files and folders in your file-system
# may be less than $count
$objects = Get-ChildItem / -Recurse -ErrorAction Ignore | Select-Object -First $count
}
Write-Host "Comparing property-value extraction methods with $($objects.Count) input objects, averaged over $runs runs..."
# An array of script blocks with the various approaches.
$approaches = { $objects | Select-Object -ExpandProperty Name },
{ $objects | % Name },
{ $objects | % { $_.Name } },
{ $objects.ForEach('Name') },
{ $objects.ForEach({ $_.Name }) },
{ $objects.Name },
{ foreach($o in $objects) { $o.Name } }
# Time the approaches and sort them by execution time (fastest first):
Time-Command $approaches -Count $runs | Select Factor, Command, Secs*
[1] Technically, even a command without |, the pipeline operator, uses a pipeline behind the scenes, but for the purpose of this discussion using the pipeline refers only to commands that use |, the pipeline operator, and therefore by definition involve multiple commands.
Caution, member enumeration only works if the collection itself has no member of the same name. So if you had an array of FileInfo objects, you couldn't get an array of file lengths by using
$files.length # evaluates to array length
And before you say "well obviously", consider this. If you had an array of objects with a capacity property then
$objarr.capacity
would work fine UNLESS $objarr were actually not an [Array] but, for example, an [ArrayList]. So before using member enumeration you might have to look inside the black box containing your collection.
(Note to moderators: this should be a comment on rageandqq's answer but I don't yet have enough reputation.)
I learn something new every day! Thank you for this. I was trying to achieve the same. I was directly doing this:
$ListOfGGUIDs = $objects.{Object GUID}
Which basically made my variable an object again! I later realized I needed to define it first as an empty array,
$ListOfGGUIDs = #()
I'm trying to come up with a percent column by dividing two columns in a select-object. This is what I'm using:
Get-DbaDbLogSpace -SqlInstance serverName |
Select-Object ComputerName, InstanceName, SqlInstance, Database, LogSize,
LogSpaceUsed, LogSpaceUsedPercent,
#{Name="PercentFree"; Expression={($_.LogSpaceUsed / $_.LogSize)}} |
Format-Table
This returns an 'OB' on the expression column (see pic below). How do I do math with two columns in a Select-Object expression please?
If doing this a different way outside of a Select-Object would be better, I'm open to it.
This is what the data looks like for the above code:
Thanks for the help.
The operands of your calculation appear to be strings with byte multipliers (e.g. 38.99 MB), so you'll have to transform them to numbers in order to perform division on them.
Here's a simplified example:
Note: I'm using Invoke-Expression to transform the strings to numbers, relying on PowerShell's support for byte-multiplier suffixes such as mb in number literals (e.g., 1mb - note that there must be no space before the suffix). While Invoke-Expression (iex) should generally be avoided, it is safe to use if you trust that the relevant property values only ever contain strings such as '38.99 MB'.
[pscustomobject] #{
Database = 'db1'
LogSize = '11 MB'
LogSpaceUsed = '3 MB'
} |
Format-Table Database, LogSize, LogSpaceUsed,
#{
Name = "PercentFree"
Expression = {
'{0:P2}' -f ((Invoke-Expression ($_.LogSpaceUsed -replace ' ')) /
(Invoke-Expression ($_.LogSize -replace ' ')))
}
}
Note that I'm passing the properties, including the calculated one directly to Format-Table - no need for an intermediary Select-Object call. Note that the calculated property outputs the percentage as a formatted string, using the -f operator, so that the number of decimal places can be controlled.
Output:
Database LogSize LogSpaceUsed PercentFree
-------- ------- ------------ -----------
db1 11 MB 3 MB 27.27%
Thanks to #mklement0, this is what works.
Get-DbaDbLogSpace -SQLInstance servername | Format-Table Database, LogSize, LogSpaceUsed,
#{
Name = "PercentFree"
Expression = {
'{0:P2}' -f ((Invoke-Expression ($_.LogSpaceUsed -replace ' ')) /
(Invoke-Expression ($_.LogSize -replace ' ')))
}
}
Below is column imported from CSV file. Measure is not able to calculate sum, and others operations on it. As you can see I tried data type conversion, but still Measure won't calculate sum. Salary column may have blanks and zeros. Is there one liner solution for this?
Salary
------
120
220
450
620
780
0
This is info from power shell
# measure just gives count, but not sum, max, min
# count in string = 7 (0 and blank included)
# count in int32 = 5
# Input object "" is not numeric.
Import-Csv -path .\testing.csv | Where-Object {$_.Salary -as [Int32]} | Select-Object Salary | Measure -Sum
I also tried below, code. It gave output in currency format, and I feel, which is numeric, and measure can sum, average it, but it still failed.
$objFile | ForEach-Object { $_.Salary.ToString("C2") } | Measure
Also, tried below, which ensured that Salary was Int, but still measure won't sum it.
$objFile = Import-Csv -Path .\testing.csv | Select #{Name="Salary"; Expression={[int32]$_.Salry}}
$objFile |GM
#Measure-Object : Input object "" is not numeric.
Are you looking for this:
(Import-Csv -path .\testing.csv | Select-Object -expandproperty Salary | Measure -Sum).sum
More about Expandproperty Here
I have the following code that pulls in some server information from a text file and spits it into a hashtable.
Get-Content $serverfile | Foreach-Object {
if($_ -match '^([^\W+]+)\s+([^\.+]+)')
{
$sh[$matches[1]] = $matches[2]
}
}
$sh.GetEnumerator()| sort -Property Name
This produces the following:
Name Value
---- -----
Disk0 40
Disk1 40
Disk2 38
Disk3 43
Memory 4096
Name Value
Number_of_disks 1
Number_of_network_cards 2
Number_of_processors 1
ServerName WIN02
Depending on the server there may be one Disk0 or many more.
My challenge here is to pull each Disk* value from each of the varying number of Disk keys and return the values in a comma separated list, for example;
$disks = 40,40,38,43
I have tried varying approaches to this problem however none have met the criteria of being dynamic and including the ',' after each disk.
Any help would be appreciated.
I assume that when you say "Depending on the server there may be one Disk0 or many more", you mean "one Disk or many more", each with a different number? You can't have more than one Disk0, because key names can't be duplicated in a hash.
This will give you a list of all the hash values for keys starting with "Disk":
$sh.Keys | ?{$_ -match '^Disk'} | %{$sh.$_}
If you actually want to get a comma-separated list (a single string value), you can use the -join operator:
$disks = ($sh.Keys | ?{$_ -match '^Disk'} | %{$sh.$_}) -join ','
However, if the reason you want a comma-separated list is in order to get an array of the values, you don't really need the comma-separated list; just assign the results (which are already an array) to the variable:
$disks = $sh.Keys | ?{$_ -match '^Disk'} | %{$sh.$_}
Note, BTW, that hashes are not ordered. There's no guarantee that the order of the keys listed will be the same as the order in which you added them or in ascending alphanumeric order. So, in the above example, your result could be 38,40,43,40. If order does matter (i.e. you're counting on the values in $disks to be in the order of their respective Disk numbers, you have two options.
Filter the listing of the keys through Sort-Object:
$sh.Keys | ?{$_ -match '^Disk'} | sort | %{$sh.$_}
(You can put the | sort between $sh.Keys and | ?{..., but it's more efficient this way...which makes little difference here but would matter with larger data sets.)
Use an ordered dictionary, which functions pretty much the same as a hash, but maintains the keys in the order added:
$sh = New-Object System.Collections.Specialized.OrderedDictionary