PowerShell: Converting String Representations of Numbers to Integers

I've tried really hard not to ask this question, but I keep coming back to it as I'm not sure if I'm doing everything as efficiently as I can or if there might be problems under the hood. Basically, I have a CSV file that contains a number field, but it includes a decimal and values out to the ten-thousandths place, e.g. 15.0000. All I need to do is convert that to a whole number without the decimal place.
I came across a related question here, but the selected answer seems to cast doubt on casting the string representation directly to an integer data type - without explaining why.
Simply casting the string as an int won't work reliably. You need to convert it to an int32.
I haven't had much luck getting the [System.Convert] method to work, or doing something like $StringNumber.ToInt32(). I realize that once I save the data back to the PSCustomObject they'll be stored as strings, so at the end of the day maybe I'm making this even more complicated than necessary for my use case and I just need to reformat $StringNumber...but even that has caused me some problems.
Any ideas on why casting wouldn't be reliable or better ways to handle this in my case?
Examples of what I've tried:
PS > $StringNumber = '15.0000'
PS > [Convert]::ToInt32($StringNumber)
#MethodInvocationException: Exception calling "ToInt32" with "1" argument(s): "Input string was not in a correct format."
PS > [Convert]::ToInt32($StringNumber, [CultureInfo]::InvariantCulture)
#MethodInvocationException: Exception calling "ToInt32" with "2" argument(s): "Input string was not in a correct format."
PS > $StringNumber.ToInt32()
#MethodException: Cannot find an overload for "ToInt32" and the argument count: "0".
PS > $StringNumber.ToInt32([CultureInfo]::InvariantCulture)
#MethodInvocationException: Exception calling "ToInt32" with "1" argument(s): "Input string was not in a correct format."
PS > $StringNumber.ToString("F0")
#MethodException: Cannot find an overload for "ToString" and the argument count: "1".
PS > $StringNumber.ToString("F0", [CultureInfo]::CurrentCulture)
#MethodException: Cannot find an overload for "ToString" and the argument count: "2".
PS > "New format: {0:F0}" -f $StringNumber
#New format: 15.0000
So basically what I've come up with is:
Someone in 2014 said casting my string to an int wouldn't work reliably, even though it seems like the Cast operator is actually doing a conversion
The ToInt32 methods don't like strings with decimals as the input
Apparently String.ToString Method is useless
Thanks to String.ToString and the processing order of composite formatting, simple "reformatting" of my string representation won't work
In summary: Is there a way to safely cast my $StringNumber into a whole number, and, if so, what's the most efficient way to do it on a large dataset?
Bonus Challenge:
If anyone can make this work using the ForEach magic method then I'll buy you a beer. Here's some pseudo code that doesn't work, but would be awesome if it did. As far as I can figure out, there's no way to reference the current item in the collection when setting the value of a string property
#This code DOES NOT work as written
PS > $CSVData = Import-Csv .\somedata.csv
PS > $CSVData.ForEach('StringNumberField', [int]$_.StringNumberField)

If your string representation can be interpreted as a number, you can cast it to an integer, as long as the specific integer type used is large enough to accommodate (the integer portion of) the value represented (e.g. [int] '15.0000')
A string that can not be interpreted as a number or represents a number that is too large (or small, for negative numbers) for the target type, results in a statement-terminating error; e.g. [int] 'foo' or [int] '444444444444444'
Note that PowerShell's casts and implicit string-to-number conversions use the invariant culture, which means that only ever . is recognized as the decimal mark (and , is effectively ignored, because it is interpreted as the thousands-grouping symbol), irrespective of the culture currently in effect (as reflected in $PSCulture).
As for integer types you can use (all of them - except the open-ended [bigint] type - support ::MinValue and ::MaxValue to determine the range of integers they can accommodate; e.g. [int]::MaxValue)
Signed integer types: [sbyte], [int16], [int] ([int32]), [long] ([int64]), [bigint]
Unsigned integer types: [byte], [uint16], [uint] ([uint32]), [ulong] ([uint64]) - but note that PowerShell itself uses only signed types natively in its calculations.
Casting to an integer type performs half-to-even midpoint rounding, which means that a string representing a value whose fractional part is .5 is rounded to the nearest even integer; e.g. [int] '1.5' and [int] '2.5' both round to 2.
To choose a different midpoint rounding strategy, use [Math]::Round() with a System.MidpointRounding argument; e.g.:
[Math]::Round('2.5', [MidPointRounding]::AwayFromZero) # -> 3
To unconditionally round up or down to the nearest integer, use [Math]::Ceiling(), [Math]::Floor(), or [Math]::Truncate(); e.g.:
[Math]::Ceiling('2.5') # -> 3
[Math]::Floor('2.5') # -> 2
[Math]::Truncate('2.5') # -> 2
#
[Math]::Ceiling('-2.5') # -> -2
[Math]::Floor('-2.5') # -> -3
[Math]::Truncate('-2.5') # -> -2
Note: While the resulting number is conceptually an integer, technically it is a [double] or - with explicit [decimal] or integer-number-literal input - a [decimal].
As for the bonus challenge:
With an integer-type cast:
[int[]] (Import-Csv .\somedata.csv).StringNumberField
Note: (Import-Csv .\somedata.csv).StringNumberField.ForEach([int]) would work too, but offers no advantage here.
With a [Math]::*() call and the .ForEach() array method:
(Import-Csv .\somedata.csv).StringNumberField.ForEach(
{ [Math]::Round($_, [MidPointRounding]::AwayFromZero) }
)
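If the goal is to write the whole numbers back into the imported objects rather than just extract them, a plain loop over the rows is probably the simplest route. The following is only a minimal sketch: it assumes a column named StringNumberField as in the question, and the inline sample data is made up for illustration and stands in for somedata.csv.

```powershell
# Hypothetical sample data standing in for: Import-Csv .\somedata.csv
$csvData = @'
Name,StringNumberField
first,15.0000
second,2.5000
'@ | ConvertFrom-Csv

foreach ($row in $csvData) {
    # The cast parses with the invariant culture and rounds half-to-even.
    $row.StringNumberField = [int] $row.StringNumberField
}

# Inspect the result; Export-Csv could be used here instead.
$csvData | ConvertTo-Csv -NoTypeInformation
```

After the loop the property holds [int] values in memory; note that 2.5000 ends up as 2 because of the half-to-even rounding described above.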

Casting to [int], as you explained, works in most cases, but it is prone to errors: what if the number is higher than [int]::MaxValue? To avoid the exception you could use the -as [int] operator instead, but that has a problem of its own: if the value cannot be converted to an integer, you get $null as the result.
To be sure the conversion never yields $null, either make certain that the data you're feeding in is valid, or assume the worst and combine [math]::Round(..) with -as [decimal], -as [long] or -as [double] (∞) to round your numbers:
[math]::Round('123.123' -as [decimal]) # => 123
[math]::Round('123.asd' -as [decimal]) # => 0
Note: I'm using round but [math]::Ceiling(..) or [math]::Floor(..) or [math]::Truncate(..) are valid alternatives too, depending on your expected output.
Another alternative is [decimal]::TryParse(..), which does not throw on input that is not a number; instead it returns $false and sets the target variable to 0:
$StringNumber = '15.0000'
$ref = 0
[decimal]::TryParse( $StringNumber, ([ref]$ref) )
[math]::Round($ref) # => 15
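Because TryParse reports failure through its Boolean return value, it may be worth checking that value explicitly. The sketch below is one possible way to do so; it uses the four-argument overload with the invariant culture so that . is always treated as the decimal mark (the two-argument overload follows the current culture). The variable names and sample inputs are illustrative only.

```powershell
$parsed = [decimal] 0
$results = foreach ($text in '15.0000', '123.asd') {
    # TryParse returns $true or $false instead of throwing.
    $ok = [decimal]::TryParse(
        $text,
        [System.Globalization.NumberStyles]::Number,
        [cultureinfo]::InvariantCulture,
        [ref] $parsed
    )
    if ($ok) {
        "$text -> $([math]::Round($parsed))"
    } else {
        "$text -> not a number"
    }
}
$results
```

This keeps invalid rows distinguishable from rows that genuinely contain 0.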
Using Hazrelle's advice would work too but again, would throw an exception for invalid input or "Value was either too large or too small for an Int32."
[System.Decimal]::ToInt32('123123123.123') # => 123123123
As for the Bonus Challenge, I don't think it's possible to cast and then set the rounded values on your CSV in one go using ForEach(type convertToType), and even if it were, it could also cause problems because of what was mentioned before:
$csv = @'
"Col1","Col2"
"val1","15.0000"
"val2","20.123"
"val3","922337203685477.5807"
'@ | ConvertFrom-Csv
$csv.Col2.ForEach([int])
Cannot convert argument "item", with value: "922337203685477.5807", for "Add" to type "System.Int32": "Cannot convert value "922337203685477.5807" to type "System.Int32".
Using .foreach(..) array method combined with a script block would work:
$csv.ForEach({
$_.Col2 = [math]::Round($_.Col2 -as [decimal])
})
In case you wonder why not just use [math]::Round(..) over the string and forget about it:
[math]::Round('123.123') # => 123 Works!
But what about:
PS /> [math]::Round([decimal]::MaxValue -as [string])
7.92281625142643E+28
PS /> [math]::Round([decimal]([decimal]::MaxValue -as [string]))
79228162514264337593543950335

Related

Using Sort-Object to sort a Currency/Money

I have an inventory database from my company and I want to sort some entries based on pricing. I originally thought I would have to do everything by hand, but I figured Sort-Object should work... until I remembered Sort-Object and its infamous string sorting. Easy, I'll sort by converting it to an integer, except of course a currency value has a symbol such as $ at the start.
The original code I used which caused the string sorting is below. The classic 200 is higher than 1000 etc:
$Result | Sort-Object -Property Price | Format-Table -Property Price
The int code I tried is:
$Result | Sort-Object -Property { [int]$_.Price } | Format-Table -Property Price
This results in output like "Cannot convert value "$414.50" to type "System.Int32". | Error: "Input string was not in a correct format." Makes sense, you can't convert a $ to an int.
So is there any way around this without me having to sort by hand?
Thanks
To add to mclayton's helpful answer:
It is simpler to use a predefined [cultureinfo] instance that uses your currency format, such as en-US (US-English), in the [decimal]::Parse() call, in combination with Currency, the [System.Globalization.NumberStyles] value that permits currency formats.
@(
[pscustomobject] @{ Price='$414.50' },
[pscustomobject] @{ Price='99.02$' },
[pscustomobject] @{ Price='999.03' },
[pscustomobject] @{ Price='$5.04' }
) |
Sort-Object { [decimal]::Parse($_.Price, 'Currency', [cultureinfo] 'en-US') }
Output (correctly numerically sorted):
Price
-----
$5.04
99.02$
$414.50
999.03
Note:
As the sample input values show, there's some flexibility with respect to what input formats are accepted, such as a trailing $, and a value without $.
If the current culture can be assumed to be en-US (or a different culture that uses the same currency symbol and formatting, notably also the same decimal separator, .), you can omit the [cultureinfo] 'en-US' argument in the [decimal]::Parse() call above - though for robustness I suggest keeping it.
As an aside: PowerShell's casts (which don't support currency values) always use the invariant culture with string operands, irrespective of the current culture. Thus, something like [decimal] '3.14' is recognized even while a culture that uses , as the decimal separator is in effect.
While the invariant culture - whose purpose is to provide representations that aren't culture-dependent and remain stable over time - is based on the US-English culture, it can not be used here, because its currency symbol is ¤; e.g., (9.99).ToString('C', [cultureinfo]::InvariantCulture) yields ¤9.99.
An input value that cannot be parsed as a currency causes an (effectively) non-terminating error,[1] and such values sort before the currency values.
If you simply want to ignore non-conforming values, use try { [decimal]::Parse(...) } catch { }
If you want to abort processing on encountering non-conforming values, pass -ErrorAction Stop to the Sort-Object call.
[1] A .NET method call that fails causes a statement-terminating error, but since the error occurs in a script block (in the context of a calculated property), only the statement inside the script block is terminated, not the enclosing Sort-Object call
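The "ignore non-conforming values" option could look roughly like the sketch below. The sample values (including the deliberately invalid 'n/a') are made up for illustration, and the NumberStyles value is spelled out as an enum member rather than a string to keep the call unambiguous.

```powershell
$us    = [cultureinfo] 'en-US'
$style = [System.Globalization.NumberStyles]::Currency

$sorted = '$414.50', 'n/a', '$5.04', '999.03' | Sort-Object {
    # A value that cannot be parsed produces no sort key instead of an error.
    try { [decimal]::Parse($_, $style, $us) } catch { }
}
$sorted
```

The values that do parse are ordered numerically; the empty catch block simply suppresses the error for the rest.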
Firstly, you probably want [decimal] instead of [int] because [int] "414.50" is 414, not 414.50 so you'll be losing precision.
That aside, I'm adapting this answer for C#: https://stackoverflow.com/a/56603818/3156906
$fi = new-object System.Globalization.NumberFormatInfo;
$fi.CurrencySymbol = "`$";
@("`$10.00", "`$2.00") | Sort-Object -Property @{
"Expression" = { [decimal]::Parse($_, "Currency", $fi) }
};
# $2.00
# $10.00
The advantage of this is that invalid database values like - e.g. $1.$10 - that might have crept in will throw an exception, as will different currencies like £1.00 so you're getting a bit of extra data validation for free.
Note that the results remain as strings, but they're sorted as currency amounts (decimals). If you want the actual numeric value you'll need to convert the values separately...

Powershell String format fails to add hex prefix when looping

Ciao all -
I'm using Powershell 7.2 to automate some hardware configuration through the hardware's CLI.
I am using a loop to generate strings that include "0x" prefixes to express hex bytes, but having an issue where any consecutive iterations after the first pass of the loop do not print the "0x" prefix.
The following will produce the issue:
function fTest($id)
{
foreach($n in @(1, 2, 3))
{
write-host $id.gettype()
write-host ("{0:x}" -f $id)
$id++
}
}
fTest 0x1a
Actual output:
System.Int32
0x1a
System.Int32
1b
System.Int32
1c
The 0x prefixes are omitted in iters 2 and 3.
Why is this happening?
What is a clean way to correct the issue?
I'm a PowerShell noob, so I am happy to receive suggestions or examples of entirely different approaches.
Thanks in advance for the help!
tl;dr
Type-constrain your $id parameter to unambiguously make it a number (integer), as Theo suggests:
function fTest($id) -> function fTest([int] $id)
Build the 0x prefix into the format string passed to -f:
"{0:x}" -f $id -> '0x{0:x}' -f $id
Building on the helpful comments:
Why is this happening?
Format string {0:x}, when applied to a number, only ever produces a hexadecimal representation without a 0x prefix; e.g.:
PS> '{0:x}' -f 10
a # NOT '0xa'
If the operand is not a number, the numeric :x specification is ignored:
PS> '{0:x}' -f 'foo'
foo
The problem in your case is related to how PowerShell handles arguments passed to parameters that are not type-constrained:
Argument 0x1a is ambiguous: it could be a number - expressed as hexadecimal constant 0x1a, equivalent to decimal 26 - or a string.
While in expression-parsing mode this ambiguity would not arise (strings must be quoted there), it does in argument-parsing mode, where quoting around strings is optional (except if the string contains metacharacters) - see the conceptual about_Parsing topic.
What PowerShell does in this case is to create a hybrid argument value: The value is parsed as a number, but it caches its original string representation behind the scenes, which is used for display formatting, for instance:
PS> & { param($p) $p; $p.ToString() } 0x1a
0x1a # With default output formatting, the original string form is used.
26 # $p is an [int], so .ToString() yields its decimal representation
As of PowerShell 7.2.2, surprisingly and problematically, in the context of -f, the string-formatting operator, such a hybrid value is treated as a string, even though it self-reports as a number:
PS> & { param($p) $p.GetType().FullName; '{0:N2}' -f $p } 0x1a
System.Int32 # $p is of type [int] == System.Int32
0x1a # !! With -f $p is unexpectedly treated *as a string*,
# !! yielding the cached original string representation.
This unexpected behavior has been reported in GitHub issue #17199.
Type-constraining the parameter to which such a hybrid argument is passed, as shown at the top, avoids the ambiguity: on invocation, the argument is converted to an unwrapped instance of the parameter's type (see next point).
As for why the output changed starting with the 2nd iteration:
The cached string representation is implemented by way of an invisible [psobject] wrapper around the instance of the numeric type stored in $id, in this case.
When you update this value by way of an increment operation (++), the [psobject] wrapper is lost, and the variable is updated with an unwrapped number (the original value + 1).
Therefore, starting with the 2nd iteration, $id contained an unwrapped [int] instance, resulting in the {0:x} number format being honored and therefore yielding a hexadecimal representation without a 0x prefix.
The only reason the 1st iteration yielded a 0x prefix was that it was present in the original string representation of the argument; as stated above, the numeric :x format specifier was ignored in this case, given that the -f operand was (unexpectedly) treated as a string.
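Putting the two suggestions from the top together, a corrected version of the function from the question might look like this. It is a sketch rather than the only possible fix; it emits the strings to the pipeline instead of calling Write-Host so that the result can be captured.

```powershell
function fTest([int] $id) {
    foreach ($n in 1..3) {
        # The prefix is part of the format string, so it no longer
        # depends on how the argument was originally typed.
        '0x{0:x}' -f $id
        $id++
    }
}

$lines = fTest 0x1a
$lines   # 0x1a, 0x1b, 0x1c (one per line)
```

Because $id is constrained to [int], the argument arrives as a plain number and every iteration is formatted the same way.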

Why is powershell converting simple arithmetic on Integers to double?

I am using powershell for some time now, and just stumbled upon the PSKoan Project:
https://github.com/vexx32/PSKoans
On one of the very first koans I found some strange behaviour that I can't explain to myself.
The koan about number types tries to teach how PowerShell converts variable types dynamically from int to long and from int to double on certain operations.
So I filled out the blanks (as expected during the course) of this specific pester test:
It 'can be a larger number if needed' {
<#
Integers come in two flavours:
- int (Int32)
- long (Int64)
If an integer value exceeds the limits of the Int32 type, it is
automatically expanded to the larger Int64 type.
#>
# What exactly are the limitations of the [int] type?
$MaxValue = [int]::MaxValue
$MinValue = [int]::MinValue
2147483647 | Should -Be $MaxValue
-2147483648 | Should -Be $MinValue
# If you enter a number larger than that, the type should change.
$BigValue = $MaxValue +1
$BigValue | Should -BeOfType [long]
$BigValue | Should -BeGreaterThan $MaxValue
$SmallValue = $MinValue -1
$SmallValue | Should -BeOfType [long]
$SmallValue | Should -BeLessThan $MinValue
}
Use Show-Karma to let PSKoans do its magic:
The answers you seek...
Expected the value to have type [long] or any of its subtypes, but got 2147483648 with type [double].
This can be narrowed down to the following example, which I don't understand:
Why the heck is this a double now, and not a long?
I am using PS 5.1:
P.S. Its late here, so perhaps I miss something really obvious, I am already prepared for the great face palm ;)
Indeed, in the context of expressions (calculations), PowerShell automatically widens anything that exceeds the max. value of [int] / [uint] (32-bit signed / unsigned integers) or [long] / [ulong] (64-bit) to [double], which is easy to verify[1]:
# Ditto for [uint], [long], [ulong]
PS> ([int]::MaxValue + 1).GetType().Name
Double # [double]
By contrast, integer types smaller than [int] are promoted to [int] (System.Int32) if their max. value is exceeded in a calculation:
# Ditto for [sbyte], [int16], [uint16]
PS> ([byte]::MaxValue + 1).GetType().Name
Int32 # same as: [int]
It is only with number literals that promotion to the next biggest - signed - integer type occurs for values beyond [int]:
Note: Any number literal whose value is between [int]::MinValue and [int]::MaxValue - even if it could fit into a smaller integer type - defaults to [int], i.e., a 32-bit signed integer (System.Int32).
# Note: 2147483648 is the result of ([int]::MaxValue + 1)
PS> (2147483648).GetType().Name
Int64 # same as: [long]
And if a number literal exceeds the value of [long]::MaxValue, promotion to [decimal] occurs:
# Note: 9223372036854775808 is the result of ([long]::MaxValue + 1)
PS> (9223372036854775808).GetType().Name
Decimal # [decimal]
It is only if you exceed [decimal]::MaxValue that promotion to [double] - with its potential loss of precision - occurs:
# Note: 79228162514264337593543950336 is the result of ([decimal]::MaxValue + 1)
PS> (79228162514264337593543950336).GetType().Name
Double # [double]
Outputting the value above directly makes the conversion to [double] immediately obvious, because the output formatting uses exponential notation: 7.92281625142643E+28
[1] Curiously, trying to calculate a value beyond [decimal]::MaxValue fails, causing a statement-terminating error: ([decimal]::MaxValue + 1).GetType().Name
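If an integer result is needed from such a calculation, one option (a sketch, not part of the koan itself) is to cast one operand to [long] before adding, so that the arithmetic is performed on 64-bit integers from the start:

```powershell
$max = [int]::MaxValue

$asDouble = $max + 1            # widened to [double]
$asLong   = [long] $max + 1     # the cast applies to $max, so this stays [long]

$asDouble.GetType().Name        # Double
$asLong.GetType().Name          # Int64
$asLong                         # 2147483648
```

The same widening to [double] would still occur if the [long] calculation itself exceeded [long]::MaxValue.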

Count the scale of a given decimal

How can I count the scale of a given decimal in Powershell?
$a = 0.0001
$b = 0.000001
Casting $a to a string and returning $a.Length gives a result of 6...I need 4.
I thought there'd be a decimal or math function but I haven't found it and messing with a string seems inelegant.
There's probably a better mathematic way but I'd find the decimal places like this:
$a = 0.0001
$decimalPlaces = ("$a" -split '\.')[-1].TrimEnd('0').Length
Basically, split the string on the . character and get the length of the last string in the array. Wrapping $a in double quotes implicitly calls .ToString() with the invariant culture (you could expand this as $a.ToString([CultureInfo]::InvariantCulture)), which makes this way of determining the number of decimal places culture-invariant.
.TrimEnd('0') is used in case $a was sourced from a string rather than a proper number type, because trailing zeroes could then be included that should not count as decimal places. However, if you want the scale and not just the used decimal places, leave .TrimEnd('0') off like so:
$decimalPlaces = ("$a" -split '\.')[-1].Length
mclayton helpfully linked to this answer to a related C# question in a comment, and the solution there can indeed be adapted to PowerShell, if working with or conversion to type [decimal] is acceptable:
# Define $a as a [decimal] literal (suffix 'd')
# This internally records the scale (number of decimal places) as specified.
$a = 0.0001d
# [decimal]::GetBits() allows extraction of the scale from
# the internal representation:
[decimal]::GetBits($a)[-1] -shr 16 -band 0xFF # -> 4, the number of decimal places
The System.Decimal.GetBits method returns an array of internal bit fields whose last element contains the scale in bits 16 - 23 (8 bits, even though the max. scale allowed is 28), which is what the above extracts.
Note: A PowerShell number literal that is a fractional number without the d suffix - e.g., 0.0001 - becomes a [double] instance, i.e. a double-precision binary floating-point number.
PowerShell automatically converts [double] to [decimal] values on demand, but do note that there can be rounding errors due to the differing internal representations, and that [double] can store larger numbers than [decimal] can (although not accurately).
A [decimal] literal - one with suffix d (note that C# uses suffix m) - is parsed with a scale exactly as specified, so that applying the above to 0.000d and 0.010d yields 3 in both cases; that is, the trailing zeros are meaningful.
This does not apply if you (implicitly) convert from [double] instances such as 0.000 and 0.010, for which the above yields 0 and 2, respectively.
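The GetBits() technique can be wrapped in a small helper for reuse. The function name Get-DecimalScale is hypothetical, chosen only for this sketch; the [decimal] parameter constraint means that strings are converted on the way in, which preserves the trailing zeros they contain.

```powershell
function Get-DecimalScale {
    param([decimal] $Value)
    # The last element returned by GetBits() holds the scale in bits 16-23.
    [decimal]::GetBits($Value)[3] -shr 16 -band 0xFF
}

Get-DecimalScale (0.0001d)    # 4
Get-DecimalScale '15.0000'    # 4
Get-DecimalScale (0.010d)     # 3
```

Passing a [double] such as 0.010 would instead go through the double-to-decimal conversion described above, so trailing zeros would not be counted.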
A string-based solution:
To offer a more concise (also culture-invariant) alternative to Bender The Greatest's helpful answer:
$a = 0.0001
("$a" -replace '.+\.').Length # -> 4, the number of decimal places
Caveat: This solution relies on the default string representation of a [double] number, which need not match the original input format; for instance, .0100, when stringified later, becomes '0.01'; however, as discussed above, you can preserve trailing zeros if you start with a [decimal] literal: .0100d stringifies to '0.0100' (input number of decimals preserved).
"$a" uses an expandable string (PowerShell's string interpolation) to create a culture-invariant string representation of the number so as to ensure that the string representation uses . as the decimal mark.
In effect, PowerShell calls $a.ToString([cultureinfo]::InvariantCulture) behind the scenes.[1]
By contrast, .ToString() (argument-less) applies the rules of the current culture, and in some cultures it is , - not . - that is used as the decimal mark.
Caveat: If you use just $a as the LHS of -replace, $a is implicitly stringified, in which case you - curiously - get culture-sensitive behavior, as with .ToString() - see this GitHub issue.
-replace '.+\.' effectively removes all characters up to and including the decimal point from the input string, and .Length counts the characters in the resulting string - the number of decimal places.
[1] Note that casts from strings in PowerShell too use the invariant culture (effectively, ::Parse($value, [cultureinfo]::InvariantCulture) is called) so that in order to parse a culture-local string representation you'll need to use the ::Parse() method explicitly; e.g., [double]::Parse('1,2'), not [double] '1,2'.

Powershell - remove currency formatting from a number

Can you please tell me how to remove currency formatting from a variable (which is probably treated as a string)?
How do I strip out currency formatting from a variable and convert it to a true number?
Thank you.
example
PS C:\Users\abc> $a=($464.00)
PS C:\Users\abc> "{0:N2}" -f $a
<- returns blank
However
PS C:\Users\abc> $a=-464
PS C:\Users\abc> "{0:C2}" -f $a
($464.00) <- this works
PowerShell, the programming language, does not "know" what money or currency is - everything PowerShell sees is a variable name ($464) and a property reference (.00) that doesn't exist, so $a ends up with no value.
If you have a string in the form: $00.00, what you can do programmatically is:
# Here is my currency amount
$mySalary = '$500.45'
# Remove anything that's not either a dot (`.`), a digit, or parentheses:
$mySalary = $mySalary -replace '[^\d\.\(\)]'
# Check if input string has parentheses around it
if($mySalary -match '^\(.*\)$')
{
# remove the parentheses and add a `-` instead
$mySalary = '-' + $mySalary.Trim('()')
}
So far so good, now we have the string 500.45 (or -500.45 if input was ($500.45)).
Now, there's a couple of things you can do to convert a string to a numerical type.
You could explicitly convert it to a [double] with the Parse() method:
$mySalaryNumber = [double]::Parse($mySalary)
Or you could rely on PowerShell performing an implicit conversion to an appropriate numerical type with a unary +:
$mySalaryNumber = +$mySalary
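Note that [double]::Parse() with a single argument follows the current culture, so on a system where , is the decimal mark the cleaned-up string may be misread. As a possible alternative (a sketch, assuming the input uses US-style currency formatting), [decimal]::Parse() with the Currency number style and the en-US culture can handle the currency symbol and the accounting-style parentheses directly:

```powershell
$text = '($464.00)'

$value = [decimal]::Parse(
    $text,
    [System.Globalization.NumberStyles]::Currency,
    [cultureinfo] 'en-US'
)

$value                   # -464.00
$value.GetType().Name    # Decimal
```

Using [decimal] rather than [double] also avoids binary floating-point rounding, which tends to matter for monetary amounts.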