I recently answered an SO question about using -lt or -gt with strings. My answer was based on something I'd read earlier, which said that -lt compares one char from each string at a time until an ASCII value is not equal to the other. At that point the result (lower/equal/greater) decides. By that logic, "Less" -lt "less" should return True because L has a lower ASCII byte value than l, but it doesn't:
[System.Text.Encoding]::ASCII.GetBytes("Less".ToCharArray())
76
101
115
115
[System.Text.Encoding]::ASCII.GetBytes("less".ToCharArray())
108
101
115
115
"Less" -lt "less"
False
It seems that I may have been missing a crucial piece: the test is case-insensitive.
#L has a lower ASCII-value than l. PS doesn't care. They're equal
"Less" -le "less"
True
#The last s has a lower ASCII-value than t. PS cares.
"Less" -lt "lest"
True
#T has a lower ASCII-value than t. PS doesn't care
"LesT" -lt "lest"
False
#Again PS doesn't care. They're equal
"LesT" -le "lest"
True
I then tried to test char vs single-character-string:
[int][char]"L"
76
[int][char]"l"
108
#Using string it's case-insensitive. L = l
"L" -lt "l"
False
"L" -le "l"
True
"L" -gt "l"
False
#Using chars it's case-sensitive! L < l
([char]"L") -lt ([char]"l")
True
([char]"L") -gt ([char]"l")
False
For comparison, I tried to use the case-sensitive less-than operator, but it says L > l which is the opposite of what -lt returned for chars.
"L" -clt "l"
False
"l" -clt "L"
True
How does the comparison actually work, given that it clearly doesn't use ASCII values, and why does it behave differently for chars vs. strings?
A big thank-you to PetSerAl for all his invaluable input.
tl;dr:
-lt and -gt compare [char] instances numerically by Unicode codepoint.
Confusingly, so do -ilt, -clt, -igt, -cgt - even though they only make sense with string operands, but that's a quirk in the PowerShell language itself (see bottom).
-eq (and its alias -ieq), by contrast, compare [char] instances case-insensitively, which is typically, but not necessarily like a case-insensitive string comparison (-ceq again compares strictly numerically).
-eq/-ieq ultimately also compares numerically, but first converts the operands to their uppercase equivalents using the invariant culture; as a result, this comparison is not fully equivalent to PowerShell's string comparison, which additionally recognizes so-called compatible sequences (distinct characters or even sequences considered to have the same meaning; see Unicode equivalence) as equal.
In other words: PowerShell special-cases the behavior of only -eq / -ieq with [char] operands, and does so in a manner that is almost, but not quite the same as case-insensitive string comparison.
This distinction leads to counter-intuitive behavior such as [char] 'A' -eq [char] 'a' and [char] 'A' -lt [char] 'a' both returning $true.
To be safe:
always cast to [int] if you want numeric (Unicode codepoint) comparison.
always cast to [string] if you want string comparison.
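As a minimal sketch of the two casts recommended above (the variable names are illustrative, not part of the original answer), the following shows how the same pair of characters compares once the intent is made explicit:

```powershell
$upper = [char] 'A'   # code point 65
$lower = [char] 'a'   # code point 97

# Numeric comparison: cast both operands to [int] first.
$numericLess  = [int] $upper -lt [int] $lower             # True: 65 < 97
$numericEqual = [int] $upper -eq [int] $lower             # False: 65 is not 97

# String comparison: cast both operands to [string] first.
$stringEqual     = [string] $upper -eq  [string] $lower   # True: case-insensitive by default
$stringEqualCase = [string] $upper -ceq [string] $lower   # False: case-sensitive
```

With the casts in place, the result no longer depends on how PowerShell special-cases [char] operands.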
For background information, read on.
PowerShell's usually helpful operator overloading can be tricky at times.
Note that in a numeric context (whether implicit or explicit), PowerShell treats characters ([char] ([System.Char]) instances) numerically, by their Unicode codepoint (not ASCII).
[char] 'A' -eq 65 # $true, in the 'Basic Latin' Unicode range, which coincides with ASCII
[char] 'Ā' -eq 256 # $true; 0x100, in the 'Latin Extended-A' Unicode range
What makes [char] unusual is that its instances are compared to each other numerically as-is, by Unicode codepoint, EXCEPT with -eq/-ieq.
-ceq, -lt, and -gt compare directly by Unicode codepoints, and - counter-intuitively - so do -ilt, -clt, -igt and -cgt:
[char] 'A' -lt [char] 'a' # $true; Unicode codepoint 65 ('A') is less than 97 ('a')
-eq (and its alias -ieq) first transforms the characters to uppercase, then compares the resulting Unicode codepoints:
[char] 'A' -eq [char] 'a' # !! ALSO $true; equivalent of 65 -eq 65
It's worth reflecting on this Buddhist turn: this and that: in the world of PowerShell, character 'A' is both less than and equal to 'a', depending on how you compare.
Also, directly or indirectly - after transformation to uppercase - comparing Unicode codepoints is NOT the same as comparing them as strings, because PowerShell's string comparison additionally recognizes so-called compatible sequences, where characters (or even character sequences) are considered "the same" if they have the same meaning (see Unicode equivalence); e.g.:
# Distinct Unicode characters U+2126 (Ohm Sign) and U+03A9 (Greek Capital Letter Omega)
# ARE recognized as the "same thing" in a *string* comparison:
"Ω" -ceq "Ω" # $true, despite having distinct Unicode codepoints
# -eq/-ieq: with [char], by only applying transformation to uppercase, the results
# are still different codepoints, which - compared numerically - are NOT equal:
[char] 'Ω' -eq [char] 'Ω' # $false: uppercased codepoints differ
# -ceq always applies direct codepoint comparison.
[char] 'Ω' -ceq [char] 'Ω' # $false: codepoints differ
Note that use of prefixes i or c to explicitly specify case-matching behavior is NOT sufficient to force string comparison, even though conceptually operators such as -ceq, -ieq, -clt, -ilt, -cgt, -igt only make sense with strings.
Effectively, the i and c prefixes are simply ignored when applied to -lt and -gt while comparing [char] operands; as it turns out (unlike what I originally thought), this is a general PowerShell pitfall - see below for an explanation.
As an aside: -lt and -gt logic in string comparison is not numeric, but based on collation order (a human-centric way of ordering independent of codepoints / byte values), which in .NET terms is controlled by cultures (either by default by the one currently in effect, or by passing a culture parameter to methods).
As @PetSerAl demonstrates in a comment (and unlike what I originally claimed), PS string comparisons use the invariant culture, not the current culture, so their behavior is the same, irrespective of what culture is the current one.
Behind the scenes:
As @PetSerAl explains in the comments, PowerShell's parsing doesn't distinguish between the base form of an operator and its i-prefixed form; e.g., both -lt and -ilt are translated to the same value, Ilt.
Thus, PowerShell cannot implement differing behavior for -lt vs. -ilt, -gt vs. -igt, ..., because it treats them the same at the syntax level.
This leads to somewhat counter-intuitive behavior in that operator prefixes are effectively ignored when comparing data types where case-sensitivity has no meaning - as opposed to getting coerced to strings, as one might expect; e.g.:
"10" -cgt "2" # $false, because "2" comes after "1" in the collation order
10 -cgt 2 # !! $true; *numeric* comparison still happens; the `c` is ignored.
In the latter case I would have expected the use of -cgt to coerce the operands to strings, given that case-sensitive comparison is only a meaningful concept in string comparison, but that is NOT how it works.
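If string comparison of numbers is what you want, one way to get it is to cast the operands explicitly. The sketch below also shows that, with mixed operand types, the left-hand operand's type decides how the right-hand one is converted; the variable names are illustrative:

```powershell
# Number operands: comparison is numeric and the 'c' prefix has no effect.
$numeric = 10 -cgt 2                      # True

# Explicit [string] casts force string comparison.
$asText  = [string] 10 -cgt [string] 2    # False: "10" sorts before "2"

# Mixed operands: the RHS is converted to the type of the LHS.
$mixedA  = '10' -gt 2                     # False: compared as strings "10" and "2"
$mixedB  = 10 -gt '2'                     # True: compared as numbers 10 and 2
```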
If you want to dig deeper into how PowerShell operates, see @PetSerAl's comments below.
Not quite sure what to post here other than the comparisons are all correct when dealing with strings/characters. If you want an Ordinal comparison, do an Ordinal comparison and you get results based on that.
Best Practices for Using Strings in the .NET Framework
[string]::Compare('L','l')
returns 1
and
[string]::Compare("L","l", [stringcomparison]::Ordinal)
returns -32
Not sure what to add here to help clarify.
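For the strings from the question, an ordinal comparison can be sketched as follows. [string]::CompareOrdinal() is a standard .NET method; only its sign is relied on here, and the variable names are illustrative:

```powershell
# Ordinal comparison looks only at the UTF-16 code unit values.
$ordinal     = [string]::CompareOrdinal('Less', 'less')   # negative: 'L' (76) < 'l' (108)
$ordinalLess = $ordinal -lt 0                             # True

# PowerShell's own operators apply linguistic (collation-based) rules instead.
$psLess  = 'Less' -lt  'less'    # False: case-insensitive, so the strings are equal
$psCLess = 'Less' -clt 'less'    # False: lowercase sorts before uppercase
```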
Also see: Upper vs Lower Case
On comparing '1' with '_' the answer I'm expecting is '1' < '_' because their ASCII values are 49 and 95 respectively. But the answer is the other way. For that matter, even ':' instead of '_' gives the same result.
[byte][char]'1' -gt [byte][char]'_'
>> False
which makes sense. However:
'1' -gt '_'
>> True
Would appreciate any pointers on what I may be missing here. In essence I'm looking for a reliable way to lexicographically compare strings in powershell. Thanks!
Let's break down your two examples.
[byte][char]'1' -gt [byte][char]'_'
In this example you're comparing a byte to a byte. It is important to note that byte and char are both numeric values. The only real difference is that a char is 16 bits (so it can represent Unicode characters) and a byte is only 8 bits. Casting a string to a char gets the numeric representation of the character in the string (provided the string only contains a single character).
This means that [byte][char]'1' results in the number 49 and [byte][char]'_' results in 95. The expression will evaluate to false since 49 is not greater than 95.
Now let's look at your second example
'1' -gt '_'
In this example, you're comparing a string to a string. When comparing two strings using -gt, -ge, -lt, or -le, it uses the alphabetical sort order to determine whether or not the expression should be true or false, not the numeric values of the characters in the string. If one string is sorted before another, the first string is considered less than the second and vice versa.
You can see this behavior if you pass some strings to the Sort-Object cmdlet.
'1', '2', '3', '_' | Sort-Object
# returns '_', '1', '2', '3'
This means that your second example will return true because in the sort order implemented by .NET, _ comes before 1.
The order of special characters can vary by language and/or culture as there does not appear to be a standard, however it is pretty universally accepted that special characters should be sorted before numbers and letters.
Your first example is using Byte.CompareTo(Byte) whereas your second example is using String.CompareTo(String).
'1' -gt '_' returns $true because 1 follows _.
Two different ways you could see it:
'1'.CompareTo('_') # => 1
([char] '1').CompareTo([char] '_') # => -46
'1', '_' | Sort-Object # => `_` goes first
'1', '_' -as [char[]] | Sort-Object # => `1` goes first
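If the goal is a sort or comparison by character code rather than by collation order, one option is to use the ordinal comparers that .NET provides. This is a sketch, not the only way to do it; the sample array is made up for illustration:

```powershell
[string[]] $items = '_', '1', 'a', 'B'

# [Array]::Sort() sorts in place; StringComparer.Ordinal compares by code unit value.
[Array]::Sort($items, [StringComparer]::Ordinal)
$ordinalOrder = $items -join ' '    # '1 B _ a' (code points 49, 66, 95, 97)

# Ordinal comparison of the two strings from the question.
$oneBeforeUnderscore = [string]::CompareOrdinal('1', '_') -lt 0   # True: 49 < 95
```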
Is there an easy way in PowerShell to format numbers and the like in another locale? I'm currently writing a few functions to ease SVG generation for me and SVG uses . as a decimal separator, while PowerShell honors my locale settings (de-DE) when converting floating-point numbers to strings.
Is there an easy way to set another locale for a function or so without sticking
.ToString((New-Object Globalization.CultureInfo ""))
after every double variable?
Note: This is about the locale used for formatting, not the format string.
(Side question: Should I use the invariant culture in that case or rather en-US?)
ETA: Well, what I'm trying here is something like the following:
function New-SvgWave([int]$HalfWaves, [double]$Amplitude, [switch]$Upwards) {
"<path d='M0,0q0.5,{0} 1,0{1}v1q-0.5,{2} -1,0{3}z'/>" -f (
$(if ($Upwards) {-$Amplitude} else {$Amplitude}),
("t1,0" * ($HalfWaves - 1)),
$(if ($Upwards -xor ($HalfWaves % 2 -eq 0)) {-$Amplitude} else {$Amplitude}),
("t-1,0" * ($HalfWaves - 1))
)
}
Just a little automation for stuff I tend to write all the time and the double values need to use the decimal point instead of a comma (which they use in my locale).
ETA2: Interesting trivia to add:
PS Home:> $d=1.23
PS Home:> $d
1,23
PS Home:> "$d"
1.23
By putting the variable into a string the set locale doesn't seem to apply, somehow.
While Keith Hill's helpful answer shows you how to change a script's current culture on demand (more modern alternative as of PSv3+ and .NET framework v4.6+:
[cultureinfo]::CurrentCulture = [cultureinfo]::InvariantCulture), there is no need to change the culture, because - as you've discovered in your second update to the question - PowerShell's string interpolation - as opposed to using the -f operator - always uses the invariant rather than the current culture:
In other words:
If you replace 'val: {0}' -f 1.2 with "val: $(1.2)", the number literal 1.2 is not formatted according to the rules of the current culture.
You can verify in the console by running (on a single line; PSv3+, .NET framework v4.6+):
PS> [cultureinfo]::currentculture = 'de-DE'; 'val: {0}' -f 1.2; "val: $(1.2)"
val: 1,2 # -f operator: GERMAN culture applies, where ',' is the decimal mark
val: 1.2 # string interpolation: INVARIANT culture applies, where '.' is the decimal mark.
Note: In PowerShell (Core) 7+, the change to a different culture remains in effect for the remainder of the session (as it arguably should for Windows PowerShell too, but doesn't).
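Applied to the kind of SVG path fragment from the question, this gives two culture-independent ways to embed a number. This is a sketch with a made-up $amplitude value; [string]::Format() with an explicit provider is shown as the invariant counterpart of -f:

```powershell
$amplitude = 1.5

# String interpolation formats the [double] with the invariant culture.
$viaInterpolation = "q0.5,$amplitude 1,0"

# [string]::Format() accepts an explicit format provider, unlike the -f operator.
$viaFormat = [string]::Format([cultureinfo]::InvariantCulture, 'q0.5,{0} 1,0', $amplitude)

# Both yield 'q0.5,1.5 1,0', whatever the current culture is.
```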
Background:
By design,[1] but perhaps surprisingly, PowerShell applies the invariant rather than the current culture in the following string-related contexts, if the type at hand supports culture-specific conversion to and from strings:
As explained in this in-depth answer, PowerShell explicitly requests culture-invariant processing, if possible - by passing the [cultureinfo]::InvariantCulture instance - in the following scenarios (the stringification PowerShell performs is the equivalent of calling .psobject.ToString([NullString]::Value, [cultureinfo]::InvariantCulture) on a value):
When string-interpolating: if the object's type implements the IFormattable interface.
When casting:
to a string, including implicit conversion when binding to a [string]-typed parameter: if the source type implements the [IFormattable] interface.
from a string: if the target type's static .Parse() method has an overload with an [IFormatProvider]-typed parameter (which is an interface implemented by [cultureinfo]).
When string-comparing (-eq, -lt, -gt), using a String.Compare() overload that accepts a CultureInfo parameter.
Others?
Note that, separately, custom stringification is applied in casts / implicit stringification for the following .NET types:
Arrays and, more generally, similar list-like collection types that PowerShell enumerates in the pipeline (see the bottom section of this answer for what those types are).
The (stringified) elements of such types are concatenated with spaces (strictly speaking: with the string specified in the rarely used $OFS preference variable); the stringification of the elements is recursively subject to the rules described here.
E.g., [string] (1, 2) yields '1 2'
[pscustomobject]
Such instances result in a hashtable-like string format described in this answer; e.g.:
# -> '@{foo=1; bar=2.2}'; values are formatted with the *invariant* culture
[string] ([pscustomobject] @{ foo = 1; bar = 2.2 })
The fact that calling .ToString() directly on a [pscustomobject] instance does not yield this representation and instead returns the empty string should be considered a bug - see GitHub issue #6163.
Others?
As for the purpose of the invariant culture:
The invariant culture is culture-insensitive; it is associated with the English language but not with any country/region.
[...]
Unlike culture-sensitive data, which is subject to change by user customization or by updates to the .NET Framework or the operating system, invariant culture data is stable over time and across installed cultures and cannot be customized by users. This makes the invariant culture particularly useful for operations that require culture-independent results, such as formatting and parsing operations that persist formatted data, or sorting and ordering operations that require that data be displayed in a fixed order regardless of culture.
Presumably, it is the stability across cultures that motivated PowerShell's designers to consistently use the invariant culture when implicitly converting to and from strings.
For instance, if you hard-code a date string such as '7/21/2017' into a script and later try to convert it to a date with a [datetime] cast, PowerShell's culture-invariant behavior ensures that the script doesn't break even when run while a culture other than US-English is in effect - fortunately, the invariant culture also recognizes ISO 8601-format date and time strings;
e.g., [datetime] '2017-07-21' works too.
On the flip side, if you do want to convert to and from current-culture-appropriate strings, you must do so explicitly.
To summarize:
Converting to strings:
Embedding instances of data types with culture-sensitive-by-default string representations inside "..." yields a culture-invariant representation ([double] or [datetime] are examples of such types).
To get a current-culture representation, call .ToString() explicitly or use -f, the formatting operator (possibly inside "..." via an enclosing $(...)).
Converting from strings:
A direct cast ([<type>] ...) only ever recognizes culture-invariant string representations.
To convert from a current-culture-appropriate string representation (or a specific culture's representation), use the target type's static ::Parse() method explicitly (optionally with an explicit [cultureinfo] instance to represent a specific culture).
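The summary above can be sketched with an explicit culture object, so that the result does not depend on the session's current culture. The de-DE culture is an arbitrary choice for illustration:

```powershell
$german = [cultureinfo] 'de-DE'   # ',' is the decimal mark in this culture

# Converting to strings
$invariantText = "$(1.2)"                   # '1.2': interpolation is culture-invariant
$germanText    = (1.2).ToString($german)    # '1,2': explicit culture-specific formatting

# Converting from strings
$fromInvariant = [double] '1.2'                    # 1.2: casts parse culture-invariantly
$fromGerman    = [double]::Parse('1,2', $german)   # 1.2: explicit culture-specific parsing
```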
Culture-INVARIANT examples:
string interpolation and casts:
"$(1/10)" and [string] (1/10)
both yield string literal 0.1, with decimal mark ., irrespective of the current culture.
Similarly, casts from strings are culture-invariant; e.g., [double] '1.2'
. is always recognized as the decimal mark, irrespective of the current culture.
Another way of putting it: [double] '1.2' is not translated to the culture-sensitive-by-default method overload [double]::Parse('1.2'), but to the culture-invariant [double]::Parse('1.2', [cultureinfo]::InvariantCulture)
string comparison (assume that [cultureinfo]::CurrentCulture='tr-TR' is in effect - Turkish, where i is NOT a lowercase representation of I)
[string]::Equals('i', 'I', 'CurrentCultureIgnoreCase')
$false with the Turkish culture in effect.
'i'.ToUpper() shows that in the Turkish culture the uppercase is İ, not I.
'i' -eq 'I'
is still $true, because the invariant culture is applied.
implicitly the same as: [string]::Equals('i', 'I', 'InvariantCultureIgnoreCase')
Culture-SENSITIVE examples:
The current culture IS respected in the following cases:
With -f, the string-formatting operator (as noted above):
[cultureinfo]::currentculture = 'de-DE'; '{0}' -f 1.2 yields 1,2
Pitfall: Due to operator precedence, any expression as the RHS of -f must be enclosed in (...) in order to be recognized as such:
E.g., '{0}' -f 1/10 is evaluated as if ('{0}' -f 1) / 10 had been specified;
use '{0}' -f (1/10) instead.
Default output to the console:
e.g., [cultureinfo]::CurrentCulture = 'de-DE'; 1.2 yields 1,2
The same applies to output from cmdlets; e.g.,
[cultureinfo]::CurrentCulture = 'de-DE'; Get-Date '2017-01-01' yields
Sonntag, 1. Januar 2017 00:00:00
Caveat: In certain scenarios, literals passed to a script block as unconstrained parameters can result in culture-invariant default output - see GitHub issue #4557 and GitHub issue #4558.
In (all?) cmdlets:
Those that perform equality comparisons:
Select-Object with the -Unique switch; also note that - unusually - case-sensitive comparison is performed, and as of PowerShell 7.2.4 case-insensitivity isn't even available as an opt-in - see GitHub issue #12059.
Select-Object
Compare-Object
Others?
Those that write to files:
Set-Content and Add-Content
Out-File and therefore its virtual alias, > (and >>)
e.g., [cultureinfo]::CurrentCulture = 'de-DE'; 1.2 > tmp.txt; Get-Content tmp.txt yields 1,2
Due to .NET's logic, when using the static ::Parse() / ::TryParse() methods on number types such as [double] while passing only the string to parse; e.g., with culture fr-FR in effect (where , is the decimal mark), [double]::Parse('1,2') returns double 1.2 (i.e., 1 + 2/10).
Caveat: As bviktor points out, thousands separators are recognized by default, but in a very loose fashion: effectively, the thousands separator can be placed anywhere inside the integer portion, irrespective of how many digits are in the resulting groups, and a leading 0 is also accepted; e.g., in the en-US culture (where , is the thousands separator), [double]::Parse('0,18') perhaps surprisingly succeeds and yields 18.
To suppress recognition of thousands separators, use something like [double]::Parse('0,18', 'Float'), via the NumberStyles parameter
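The difference between the default and the stricter parsing style can be sketched as follows. The invariant culture is passed explicitly here (it uses , as the group separator, like en-US) so that the result does not depend on the session's culture; the variable names are illustrative:

```powershell
$inv = [cultureinfo]::InvariantCulture

# Default style (Float plus AllowThousands): the comma is accepted and skipped.
$loose = [double]::Parse('0,18', $inv)   # 18

# NumberStyles.Float does not allow group separators, so parsing fails.
$strictFailed = $false
try {
    $null = [double]::Parse('0,18', [System.Globalization.NumberStyles]::Float, $inv)
}
catch {
    $strictFailed = $true   # a FormatException, surfaced by PowerShell as a method-call error
}
```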
Unintentional culture-sensitivity that won't be corrected to preserve backward compatibility:
In parameter-binding type conversions for compiled cmdlets (but PowerShell code - scripts or functions - is culture-invariant) - see GitHub issue #6989.
In the -as operator - see GitHub issue #8129.
In [hashtable] key lookups - see this answer and GitHub issue #8280.
[Fixed in v7.1+] In the LHS of -replace operations - see GitHub issue #10948.
Others?
[1] The aim is to support programmatic processing using representations that do not vary by culture and do not change over time. See the linked quote from the docs later in the answer.
This is a PowerShell function I use for testing script in other cultures. I believe it could be used for what you are after:
function Using-Culture ([System.Globalization.CultureInfo]$culture =(throw "USAGE: Using-Culture -Culture culture -Script {scriptblock}"),
[ScriptBlock]$script=(throw "USAGE: Using-Culture -Culture culture -Script {scriptblock}"))
{
$OldCulture = [System.Threading.Thread]::CurrentThread.CurrentCulture
$OldUICulture = [System.Threading.Thread]::CurrentThread.CurrentUICulture
try {
[System.Threading.Thread]::CurrentThread.CurrentCulture = $culture
[System.Threading.Thread]::CurrentThread.CurrentUICulture = $culture
Invoke-Command $script
}
finally {
[System.Threading.Thread]::CurrentThread.CurrentCulture = $OldCulture
[System.Threading.Thread]::CurrentThread.CurrentUICulture = $OldUICulture
}
}
PS> $res = Using-Culture fr-FR { 1.1 }
PS> $res
1.1
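Note that the 1.1 in the transcript above is a [double] that is only formatted for display after the original culture has been restored. To capture culture-specific text, the formatting has to happen inside the script block. The sketch below uses a compact, hypothetical variant of the function (Invoke-WithCulture is not part of the original answer) to show the difference:

```powershell
# Hypothetical compact variant of Using-Culture, for illustration only.
function Invoke-WithCulture([cultureinfo] $Culture, [scriptblock] $Script) {
    $old = [cultureinfo]::CurrentCulture
    try {
        [cultureinfo]::CurrentCulture = $Culture
        & $Script
    }
    finally {
        # Always restore the previous culture, even if the script block fails.
        [cultureinfo]::CurrentCulture = $old
    }
}

# The number is returned as a [double]; it is formatted later, outside the function.
$number = Invoke-WithCulture fr-FR { 1.1 }

# Formatting inside the script block captures the French representation.
$text = Invoke-WithCulture fr-FR { '{0}' -f 1.1 }   # '1,1'
```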
I was thinking about how to make it easy and came up with accelerators:
Add-type -typedef @"
using System;
public class InvFloat
{
    double _f = 0;
    private InvFloat (double f) {
        _f = f;
    }
    private InvFloat(string f) {
        _f = Double.Parse(f, System.Globalization.CultureInfo.InvariantCulture);
    }
    public static implicit operator InvFloat (double f) {
        return new InvFloat(f);
    }
    public static implicit operator double(InvFloat f) {
        return f._f;
    }
    public static explicit operator InvFloat (string f) {
        return new InvFloat (f);
    }
    public override string ToString() {
        return _f.ToString(System.Globalization.CultureInfo.InvariantCulture);
    }
}
"@
$acce = [type]::gettype("System.Management.Automation.TypeAccelerators")
$acce::Add('f', [InvFloat])
$y = 1.5.ToString()
$z = ([f]1.5).ToString()
I hope it will help.
If you already have the culture loaded in your environment,
#>Get-Culture
LCID Name DisplayName
---- ---- -----------
1031 de-DE German (Germany)
#>Get-UICulture
LCID Name DisplayName
---- ---- -----------
1033 en-US English (United States)
it is possible to resolve this problem:
PS Home:> $d=1.23
PS Home:> $d
1,23
like this:
$d.ToString([cultureinfo]::CurrentUICulture)
1.23
Of course you need to keep in mind that if other users run the script with a different locale setting, the results may not turn out as originally intended.
Nevertheless, this solution could come in useful. Have fun!
How can I count the scale of a given decimal in Powershell?
$a = 0.0001
$b = 0.000001
Casting $a to a string and returning $a.Length gives a result of 6...I need 4.
I thought there'd be a decimal or math function but I haven't found it and messing with a string seems inelegant.
There's probably a better mathematical way but I'd find the decimal places like this:
$a = 0.0001
$decimalPlaces = ("$a" -split '\.')[-1].TrimEnd('0').Length
Basically, split the string on the . character and get the length of the last string in the array. Wrapping $a in double-quotes implicitly calls .ToString() with an invariant culture (you could expand this as $a.ToString([CultureInfo]::InvariantCulture)), making this method to determine the number of decimal places culture-invariant.
.TrimEnd('0') is there in case $a was sourced from a string rather than a proper number type, in which case trailing zeroes could be included that should not count as decimal places. However, if you want the scale and not just the used decimal places, leave .TrimEnd('0') off like so:
$decimalPlaces = ("$a" -split '\.')[-1].Length
mclayton helpfully linked to this answer to a related C# question in a comment, and the solution there can indeed be adapted to PowerShell, if working with or conversion to type [decimal] is acceptable:
# Define $a as a [decimal] literal (suffix 'd')
# This internally records the scale (number of decimal places) as specified.
$a = 0.0001d
# [decimal]::GetBits() allows extraction of the scale from
# the internal representation:
[decimal]::GetBits($a)[-1] -shr 16 -band 0xFF # -> 4, the number of decimal places
The System.Decimal.GetBits method returns an array of internal bit fields whose last element contains the scale in bits 16 - 23 (8 bits, even though the max. scale allowed is 28), which is what the above extracts.
Note: A PowerShell number literal that is a fractional number without the d suffix (e.g., 0.0001) becomes a [double] instance, i.e. a double-precision binary floating-point number.
PowerShell automatically converts [double] to [decimal] values on demand, but do note that there can be rounding errors due to the differing internal representations, and that [double] can store larger numbers than [decimal] can (although not accurately).
A [decimal] literal - one with suffix d (note that C# uses suffix m) - is parsed with a scale exactly as specified, so that applying the above to 0.000d and 0.010d yields 3 in both cases; that is, the trailing zeros are meaningful.
This does not apply if you (implicitly) convert from [double] instances such as 0.000 and 0.010, for which the above yields 0 and 2, respectively.
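The bit extraction above can be wrapped in a small helper. Get-DecimalScale is a hypothetical name chosen for this sketch; the logic is the GetBits() expression already shown:

```powershell
# Hypothetical helper name; returns the scale (number of decimal places) of a [decimal].
function Get-DecimalScale([decimal] $Value) {
    # The last of the four integers returned by GetBits() holds the scale in bits 16-23.
    ([decimal]::GetBits($Value)[3] -shr 16) -band 0xFF
}

$a = 0.0001d
$b = 0.000001d
$c = 0.010d

$scaleA = Get-DecimalScale $a   # 4
$scaleB = Get-DecimalScale $b   # 6
$scaleC = Get-DecimalScale $c   # 3: trailing zeros of a [decimal] literal count
```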
A string-based solution:
To offer a more concise (also culture-invariant) alternative to Bender The Greatest's helpful answer:
$a = 0.0001
("$a" -replace '.+\.').Length # -> 4, the number of decimal places
Caveat: This solution relies on the default string representation of a [double] number, which need not match the original input format; for instance, .0100, when stringified later, becomes '0.01'; however, as discussed above, you can preserve trailing zeros if you start with a [decimal] literal: .0100d stringifies to '0.0100' (input number of decimals preserved).
"$a" uses an expandable string (PowerShell's string interpolation) to create a culture-invariant string representation of the number, which ensures that the string representation uses . as the decimal mark.
In effect, PowerShell calls $a.ToString([cultureinfo]::InvariantCulture) behind the scenes.[1]
By contrast, .ToString() (argument-less) applies the rules of the current culture, and in some cultures it is , - not . - that is used as the decimal mark.
Caveat: If you use just $a as the LHS of -replace, $a is implicitly stringified, in which case you - curiously - get culture-sensitive behavior, as with .ToString() - see this GitHub issue.
-replace '.+\.' effectively removes all characters up to and including the decimal point from the input string, and .Length counts the characters in the resulting string - the number of decimal places.
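One further caveat, shown as a sketch: very small [double] values stringify in exponential notation, which contains no decimal point for the regex to match. Converting to [decimal] before stringifying is one possible workaround, subject to the rounding caveats of that conversion mentioned earlier; the variable names are illustrative:

```powershell
$b = 0.000001        # a [double]
$default = "$b"      # '1E-06': exponential notation, no '.' to match

# Converting to [decimal] first yields plain decimal notation.
$plain  = "$([decimal] $b)"                   # '0.000001'
$places = ($plain -replace '.+\.').Length     # 6
```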
[1] Note that casts from strings in PowerShell too use the invariant culture (effectively, ::Parse($value, [cultureinfo]::InvariantCulture) is called) so that in order to parse a culture-local string representation you'll need to use the ::Parse() method explicitly; e.g., [double]::Parse('1,2'), not [double] '1,2'.
I have code that works, but I have no idea WHY it works.
This will generate a list containing each letter of the English alphabet:
[char[]]([char]'a'..[char]'z')
However, this will not:
[char]([char]'a'..[char]'z')
and this will actually generate a list of numbers from 97 - 122
([char]'a'..[char]'z')
Could any experts out there explain to me how this works (or doesn't)?
In your second example, you are trying to cast an array of characters to a single character [char]. That won't work. In the third example, the 'a' is considered a string by PowerShell. So casting it to [char] tells PowerShell it is a single char. The .. operator ranges over numbers. Fortunately, PowerShell can convert the character 'a' to its ASCII value 97 and 'z' to 122. So you effectively wind up with 97..122. Then in your first example, the [char[]] converts that array of ints back to an array of characters: a through z.
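The round trip described above can be spelled out step by step. In this sketch the range operands are cast to [int] explicitly, so that the intermediate result is an array of numbers regardless of how a given PowerShell version treats [char] operands of the range operator; the variable names are illustrative:

```powershell
# Explicit [int] casts make the range operands numbers.
$codes = ([int] [char] 'a')..([int] [char] 'z')   # 97, 98, ..., 122

# [char[]] converts each number back to the character with that code point.
$letters  = [char[]] $codes
$alphabet = -join $letters    # 'abcdefghijklmnopqrstuvwxyz'
```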
In PowerShell, 'a' is a [string] type. [char]'a' is, obviously, a [char] type. These are very different things.
$string = 'a'
$char = [char]$string
$string can be cast to [char] because it is a string consisting of a single character. If there is more than one character in the string, e.g. 'ab', then you need an array of [char], which is type [char[]]. The extra set of square brackets designates an array.
$string | get-member
$char | get-member
reveals very different methods for the two types. The [char] type exposes numeric conversion methods such as .ToInt32(). If you cast it to [int], you get the numeric code of that character.
[int]$char
returns 97, the ASCII code for the letter 'a'.
Is there an easy way in PowerShell to format numbers and the like in another locale? I'm currently writing a few functions to ease SVG generation for me and SVG uses . as a decimal separator, while PowerShell honors my locale settings (de-DE) when converting floating-point numbers to strings.
Is there an easy way to set another locale for a function or so without sticking
.ToString((New-Object Globalization.CultureInfo ""))
after every double variable?
Note: This is about the locale used for formatting, not the format string.
(Side question: Should I use the invariant culture in that case or rather en-US?)
ETA: Well, what I'm trying here is something like the following:
function New-SvgWave([int]$HalfWaves, [double]$Amplitude, [switch]$Upwards) {
"<path d='M0,0q0.5,{0} 1,0{1}v1q-0.5,{2} -1,0{3}z'/>" -f (
$(if ($Upwards) {-$Amplitude} else {$Amplitude}),
("t1,0" * ($HalfWaves - 1)),
$(if ($Upwards -xor ($HalfWaves % 2 -eq 0)) {-$Amplitude} else {$Amplitude}),
("t-1,0" * ($HalfWaves - 1))
)
}
Just a little automation for stuff I tend to write all the time and the double values need to use the decimal point instead of a comma (which they use in my locale).
ETA2: Interesting trivia to add:
PS Home:> $d=1.23
PS Home:> $d
1,23
PS Home:> "$d"
1.23
By putting the variable into a string the set locale doesn't seem to apply, somehow.
While Keith Hill's helpful answer shows you how to change a script's current culture on demand (more modern alternative as of PSv3+ and .NET framework v4.6+:
[cultureinfo]::CurrentCulture = [cultureinfo]::InvariantCulture), there is no need to change the culture, because - as you've discovered in your second update to the question - PowerShell's string interpolation - as opposed to using the -f operator - always uses the invariant rather than the current culture:
In other words:
If you replace 'val: {0}' -f 1.2 with "val: $(1.2)", the number literal 1.2 is not formatted according to the rules of the current culture.
You can verify in the console by running (on a single line; PSv3+, .NET framework v4.6+):
PS> [cultureinfo]::currentculture = 'de-DE'; 'val: {0}' -f 1.2; "val: $(1.2)"
val: 1,2 # -f operator: GERMAN culture applies, where ',' is the decimal mark
val: 1.2 # string interpolation: INVARIANT culture applies, where '.' is the decimal mark.
Note: In PowerShell (Core) 7+, the change to a different culture remains in effect for the remainder of the session (as it arguably should for Windows PowerShell too, but doesn't).
Background:
By design,[1] but perhaps surprisingly, PowerShell applies the invariant rather than the current culture in the following string-related contexts, if the type at hand supports culture-specific conversion to and from strings:
As explained in this in-depth answer, PowerShell explicitly requests culture-invariant processing, if possible - by passing the [cultureinfo]::InvariantCulture instance - in the following scenarios (the stringification PowerShell performs is the equivalent of calling .psobject.ToString([NullString]::Value, [cultureinfo]::InvariantCulture) on a value):
When string-interpolating: if the object's type implements the IFormattable interface.
When casting:
to a string, including implicit conversion when binding to a [string]-typed parameter: if the source type implements the [IFormattable] interface.
from a string: if the target type's static .Parse() method has an overload with an [IFormatProvider]-typed parameter (which is an interface implemented by [cultureinfo]).
When string-comparing (-eq, -lt, -gt), using a String.Compare() overload that accepts a CultureInfo parameter.
Others?
Note that, separately, custom stringification is applied in casts / implicit stringification for the following .NET types:
Arrays and, more generally, similar list-like collection types that PowerShell enumerates in the pipeline (see the bottom section of this answer for what those types are).
The (stringified) elements of such types are concatenated with spaces (strictly speaking: with the string specified in the rarely used $OFS preference variable); the stringification of the elements is recursively subject to the rules described here.
E.g., [string] (1, 2) yields '1 2'.
[pscustomobject]
Such instances result in a hashtable-like string format described in this answer; e.g.:
# -> '@{foo=1; bar=2.2}'; values are formatted with the *invariant* culture
[string] ([pscustomobject] @{ foo = 1; bar = 2.2 })
The fact that calling .ToString() directly on a [pscustomobject] instance does not yield this representation and instead returns the empty string should be considered a bug - see GitHub issue #6163.
Others?
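The array and [pscustomobject] stringification rules above can be sketched as follows. This is only an illustration, assuming a PowerShell version in which [cultureinfo]::CurrentCulture is settable (PSv3+ with .NET Framework v4.6+, or PowerShell 7+) and in which the de-DE culture data is available; the variable names are made up for the example.

```powershell
# Assumption: de-DE is available and CurrentCulture is settable in this session.
$prev = [cultureinfo]::CurrentCulture
try {
    [cultureinfo]::CurrentCulture = 'de-DE'   # ',' is the decimal mark here

    # Array elements are joined with a space by default; each element is
    # stringified with the invariant culture, so '.' stays the decimal mark.
    $joined = [string] (1.5, 2.5)

    # $OFS replaces the separator; setting it inside a script block keeps the
    # change local to that block.
    $joinedCustom = & { $OFS = ', '; [string] (1.5, 2.5) }

    # [pscustomobject] instances get the hashtable-like representation.
    $custom = [string] ([pscustomobject] @{ bar = 2.2 })

    $joined        # 1.5 2.5
    $joinedCustom  # 1.5, 2.5
    $custom        # @{bar=2.2}
}
finally {
    [cultureinfo]::CurrentCulture = $prev     # restore the original culture
}
```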
As for the purpose of the invariant culture:
The invariant culture is culture-insensitive; it is associated with the English language but not with any country/region.
[...]
Unlike culture-sensitive data, which is subject to change by user customization or by updates to the .NET Framework or the operating system, invariant culture data is stable over time and across installed cultures and cannot be customized by users. This makes the invariant culture particularly useful for operations that require culture-independent results, such as formatting and parsing operations that persist formatted data, or sorting and ordering operations that require that data be displayed in a fixed order regardless of culture.
Presumably, it is the stability across cultures that motivated PowerShell's designers to consistently use the invariant culture when implicitly converting to and from strings.
For instance, if you hard-code a date string such as '7/21/2017' into a script and later convert it to a date with a [datetime] cast, PowerShell's culture-invariant behavior ensures that the script doesn't break even when run while a culture other than US-English is in effect. Fortunately, the invariant culture also recognizes ISO 8601-format date and time strings;
e.g., [datetime] '2017-07-21' works too.
On the flip side, if you do want to convert to and from current-culture-appropriate strings, you must do so explicitly.
To summarize:
Converting to strings:
Embedding instances of data types with culture-sensitive-by-default string representations inside "..." yields a culture-invariant representation ([double] or [datetime] are examples of such types).
To get a current-culture representation, call .ToString() explicitly or use -f, the formatting operator (possibly inside "..." via an enclosing $(...)).
Converting from strings:
A direct cast ([<type>] ...) only ever recognizes culture-invariant string representations.
To convert from a current-culture-appropriate string representation (or a specific culture's representation), use the target type's static ::Parse() method explicitly (optionally with an explicit [cultureinfo] instance to represent a specific culture).
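As a rough sketch of the summary above, the following contrasts the implicit (invariant) conversions with the explicit (culture-sensitive) ones. It assumes that [cultureinfo]::CurrentCulture is settable and that the de-DE and fr-FR cultures are available; the variable names are illustrative only.

```powershell
# Assumption: de-DE / fr-FR culture data is available in this session.
$prev = [cultureinfo]::CurrentCulture
try {
    [cultureinfo]::CurrentCulture = 'de-DE'   # ',' is the decimal mark
    $n = 1.2

    # To strings
    $invariant = "$n"                          # interpolation: invariant -> '1.2'
    $viaToString = $n.ToString()               # explicit call: current   -> '1,2'
    $viaFormat = '{0}' -f $n                   # -f operator:   current   -> '1,2'
    $embedded = "val: $('{0}' -f $n)"          # -f inside "..." via $(...)

    # From strings
    $fromCast = [double] '1.2'                 # cast: invariant only
    $fromCurrent = [double]::Parse('1,2')      # ::Parse(): current culture (de-DE)
    $fromFrench = [double]::Parse('1,2', [cultureinfo] 'fr-FR')  # specific culture
}
finally {
    [cultureinfo]::CurrentCulture = $prev      # restore the original culture
}
```

All three numeric results equal 1.2; only the string forms differ by culture.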
Culture-INVARIANT examples:
string interpolation and casts:
"$(1/10)" and [string] (1/10)
both yield string literal 0.1, with decimal mark ., irrespective of the current culture.
Similarly, casts from strings are culture-invariant; e.g., [double] '1.2'
. is always recognized as the decimal mark, irrespective of the current culture.
Another way of putting it: [double] '1.2' is not translated to the culture-sensitive-by-default method overload [double]::Parse('1.2'), but to the culture-invariant [double]::Parse('1.2', [cultureinfo]::InvariantCulture)
string comparison (assume that [cultureinfo]::CurrentCulture='tr-TR' is in effect - Turkish, where i is NOT a lowercase representation of I)
[string]::Equals('i', 'I', 'CurrentCultureIgnoreCase')
$false with the Turkish culture in effect.
'i'.ToUpper() shows that in the Turkish culture the uppercase is İ, not I.
'i' -eq 'I'
is still $true, because the invariant culture is applied.
implicitly the same as: [string]::Equals('i', 'I', 'InvariantCultureIgnoreCase')
Culture-SENSITIVE examples:
The current culture IS respected in the following cases:
With -f, the string-formatting operator (as noted above):
[cultureinfo]::currentculture = 'de-DE'; '{0}' -f 1.2 yields 1,2
Pitfall: Due to operator precedence, any expression as the RHS of -f must be enclosed in (...) in order to be recognized as such:
E.g., '{0}' -f 1/10 is evaluated as if ('{0}' -f 1) / 10 had been specified;
use '{0}' -f (1/10) instead.
Default output to the console:
e.g., [cultureinfo]::CurrentCulture = 'de-DE'; 1.2 yields 1,2
The same applies to output from cmdlets; e.g.,
[cultureinfo]::CurrentCulture = 'de-DE'; Get-Date '2017-01-01' yields
Sonntag, 1. Januar 2017 00:00:00
Caveat: In certain scenarios, literals passed to a script block as unconstrained parameters can result in culture-invariant default output - see GitHub issue #4557 and GitHub issue #4558.
In (all?) cmdlets:
Those that perform equality comparisons:
Select-Object with the -Unique switch; also note that - unusually - case-sensitive comparison is performed, and as of PowerShell 7.2.4 case-insensitivity isn't even available as an opt-in - see GitHub issue #12059.
Select-Object
Compare-Object
Others?
Those that write to files:
Set-Content and Add-Content
Out-File and therefore its virtual alias, > (and >>)
e.g., [cultureinfo]::CurrentCulture = 'de-DE'; 1.2 > tmp.txt; Get-Content tmp.txt yields 1,2
Due to .NET's logic, when using the static ::Parse() / ::TryParse() methods on number types such as [double] while passing only the string to parse; e.g., with culture fr-FR in effect (where , is the decimal mark), [double]::Parse('1,2') returns double 1.2 (i.e., 1 + 2/10).
Caveat: As bviktor points out, thousands separators are recognized by default, but in a very loose fashion: effectively, the thousands separator can be placed anywhere inside the integer portion, irrespective of how many digits are in the resulting groups, and a leading 0 is also accepted; e.g., in the en-US culture (where , is the thousands separator), [double]::Parse('0,18') perhaps surprisingly succeeds and yields 18.
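To suppress recognition of thousands separators, use something like [double]::Parse('0,18', 'Float'), which passes a NumberStyles value that lacks the AllowThousands flag, so that an input such as '0,18' is rejected in the en-US culture.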
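The loose thousands-separator handling and its suppression can be sketched like this. To keep the example independent of the session's current culture, it passes an explicit en-US [cultureinfo] instance rather than relying on the default; the variable names are made up for illustration.

```powershell
$enUS = [cultureinfo] 'en-US'   # ',' is the thousands separator, '.' the decimal mark

# Default number styles include AllowThousands, so ',' is accepted anywhere
# in the integer part and is simply skipped.
$loose = [double]::Parse('0,18', $enUS)

# NumberStyles.Float does not include AllowThousands, so the same input
# is rejected with an exception.
$rejected = $false
try {
    [double]::Parse('0,18', [System.Globalization.NumberStyles]::Float, $enUS)
}
catch {
    $rejected = $true
}

$loose      # 18
$rejected   # True
```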
Unintentional culture-sensitivity that won't be corrected to preserve backward compatibility:
In parameter-binding type conversions for compiled cmdlets (but PowerShell code - scripts or functions - is culture-invariant) - see GitHub issue #6989.
In the -as operator - see GitHub issue #8129.
In [hashtable] key lookups - see this answer and GitHub issue #8280.
[Fixed in v7.1+] In the LHS of -replace operations - see GitHub issue #10948.
Others?
[1] The aim is to support programmatic processing using representations that do not vary by culture and do not change over time. See the linked quote from the docs later in the answer.
This is a PowerShell function I use for testing scripts in other cultures. I believe it could be used for what you are after:
function Using-Culture ([System.Globalization.CultureInfo]$culture =(throw "USAGE: Using-Culture -Culture culture -Script {scriptblock}"),
[ScriptBlock]$script=(throw "USAGE: Using-Culture -Culture culture -Script {scriptblock}"))
{
$OldCulture = [System.Threading.Thread]::CurrentThread.CurrentCulture
$OldUICulture = [System.Threading.Thread]::CurrentThread.CurrentUICulture
try {
[System.Threading.Thread]::CurrentThread.CurrentCulture = $culture
[System.Threading.Thread]::CurrentThread.CurrentUICulture = $culture
Invoke-Command $script
}
finally {
[System.Threading.Thread]::CurrentThread.CurrentCulture = $OldCulture
[System.Threading.Thread]::CurrentThread.CurrentUICulture = $OldUICulture
}
}
PS> $res = Using-Culture fr-FR { 1.1 }
PS> $res
1.1
I was thinking about how to make it easy and came up with accelerators:
Add-type -typedef @"
using System;
public class InvFloat
{
double _f = 0;
private InvFloat (double f) {
_f = f;
}
private InvFloat(string f) {
_f = Double.Parse(f, System.Globalization.CultureInfo.InvariantCulture);
}
public static implicit operator InvFloat (double f) {
return new InvFloat(f);
}
public static implicit operator double(InvFloat f) {
return f._f;
}
public static explicit operator InvFloat (string f) {
return new InvFloat (f);
}
public override string ToString() {
return _f.ToString(System.Globalization.CultureInfo.InvariantCulture);
}
}
"@
$acce = [type]::gettype("System.Management.Automation.TypeAccelerators")
$acce::Add('f', [InvFloat])
$y = 1.5.ToString()
$z = ([f]1.5).ToString()
I hope it will help.
If you already have the culture loaded in your environment,
#>Get-Culture
LCID Name DisplayName
---- ---- -----------
1031 de-DE German (Germany)
#>Get-UICulture
LCID Name DisplayName
---- ---- -----------
1033 en-US English (United States)
it is possible to resolve this problem:
PS Home:> $d=1.23
PS Home:> $d
1,23
like this:
$d.ToString([cultureinfo]::CurrentUICulture)
1.23
Of course you need to keep in mind that if other users run the script with a different locale setting, the results may not turn out as originally intended.
Nevertheless, this solution could come in useful. Have fun!