How do I encode Unicode character codes in a PowerShell string literal? - powershell

How can I encode the Unicode character U+0048 (H), say, in a PowerShell string?
In C# I would just do this: "\u0048", but that doesn't appear to work in PowerShell.

Replace '\u' with '0x' and cast it to System.Char:
PS > [char]0x0048
H
You can also use the "$()" syntax to embed a Unicode character into a string:
PS > "Acme$([char]0x2122) Company"
AcmeT Company
Here T is how the console renders ™ (U+2122, the sign for unregistered trademarks); a console whose font and code page support the glyph shows ™ itself.
Note: this method works only for characters in Plane 0, the BMP (Basic Multilingual Plane), chars < U+10000.

According to the documentation, PowerShell Core 6.0 adds support with this escape sequence:
PS> "`u{0048}"
H
see https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_special_characters?view=powershell-6#unicode-character-ux

Maybe this isn't the PowerShell way, but this is what I do. I find it to be cleaner.
[regex]::Unescape("\u0048") # Prints H
[regex]::Unescape("\u0048ello") # Prints Hello

For those of us still on 5.1 and wanting to use the higher-order Unicode charset (for which none of these answers work) I made this function so you can simply build strings like so:
'this is my favourite park ',0x1F3DE,'. It is pretty sweet ',0x1F60A | Unicode
#takes in a stream of strings and integers,
#where integers are unicode codepoints,
#and concatenates these into valid UTF16
Function Unicode {
Begin {
$output=[System.Text.StringBuilder]::new()
}
Process {
$output.Append($(
if ($_ -is [int]) { [char]::ConvertFromUtf32($_) }
else { [string]$_ }
)) | Out-Null
}
End { $output.ToString() }
}
Note that getting these to display in your console is a whole other problem, but if you're outputting to an Outlook email or a GridView it will just work (as UTF-16 is native for .NET interfaces).
This also means you can output plain control (not necessarily Unicode) characters pretty easily if you're more comfortable with decimal, since you don't actually need the 0x (hex) syntax to make the integers. 'hello',32,'there' | Unicode would put an ordinary space (U+0020) between the two words, the same as if you had used 0x20 instead.

Another way using PowerShell.
$Heart = $([char]0x2665)
$Diamond = $([char]0x2666)
$Club = $([char]0x2663)
$Spade = $([char]0x2660)
Write-Host $Heart -BackgroundColor Yellow -ForegroundColor Magenta
Use the command help Write-Host -Full to read all about it.

To make it work for characters outside the BMP you need to use Char.ConvertFromUtf32()
'this is my favourite park ' + [char]::ConvertFromUtf32(0x1F3DE) +
'. It is pretty sweet ' + [char]::ConvertFromUtf32(0x1F60A)

Note that characters outside the BMP, like 🌎 (U+1F30E), are stored as two UTF-16 code units (a surrogate pair), so with [char] casts you have to supply both halves:
PS> "C:\foo\bar\$([char]0xd83c)$([char]0xdf0e)something.txt"
Will print:
C:\foo\bar\🌎something.txt
You can find the two surrogate values here, in the "unicode escape" row:
https://dencode.com/string
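If you would rather not look the two halves up, .NET can compute them for you. A minimal sketch (the variable names are only for illustration) that splits U+1F30E into its UTF-16 code units:

```powershell
# Convert the code point to a string; outside the BMP this yields two [char]s.
$globe = [char]::ConvertFromUtf32(0x1F30E)

# Render each UTF-16 code unit as hex.
$units = $globe.ToCharArray() | ForEach-Object { '0x{0:X4}' -f [int]$_ }
$units -join ' '   # -> 0xD83C 0xDF0E
```

The two values match the [char] casts used in the path example above.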

Related

how to add quotes in this string

I have this string
[{listenport:443,connectaddress:10.1.10.20,connectport:443,firewallrulename:port443,direction:Inbound,action:Allow,protocol:TCP},{listenport:80,connectaddress:10.1.10.20,connectport:80,firewallrulename:port80,direction:Inbound,action:Allow,protocol:TCP}]
i'm wondering how can I write a function to convert it to
[{"listenport":"443","connectaddress":"10.1.10.20","connectport":"443","firewallrulename":"port443","direction":"Inbound","action":"Allow","protocol":"TCP"},{"listenport":"80","connectaddress":"10.1.10.20","connectport":"80","firewallrulename":"port80","direction":"Inbound","action":"Allow","protocol":"TCP"}]
I have tried to use insert and indexof , but couldn't figure out how to do for an entire string
If you really have to work with this format and cannot produce well-formed JSON to begin with, at least in your sample input both the property names and values are composed only of characters that are either . or fall into the \w regex category, so a single -replace operation is possible:
@'
[{listenport:443,connectaddress:10.1.10.20,connectport:443,firewallrulename:port443,direction:Inbound,action:Allow,protocol:TCP},{listenport:80,connectaddress:10.1.10.20,connectport:80,firewallrulename:port80,direction:Inbound,action:Allow,protocol:TCP}]
'@ -replace '[\w.]+', '"$&"'
The result is well-formed JSON, which you can pipe to ConvertFrom-Json for OO processing in PowerShell.
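As a rough sketch of that last step, here is a shortened version of the sample input (only two properties per object, and $raw, $json and $rules are just illustrative names) run through the same -replace and then parsed:

```powershell
# Shortened sample in the same unquoted format as the question.
$raw = '[{listenport:443,connectaddress:10.1.10.20},{listenport:80,connectaddress:10.1.10.20}]'

# Quote every run of word characters and dots, as in the answer above.
$json = $raw -replace '[\w.]+', '"$&"'

# Parse the now well-formed JSON into objects.
$rules = $json | ConvertFrom-Json
$rules.Count           # -> 2
$rules[1].listenport   # -> 80 (a string, because every value was quoted)
```

Note that every value comes back as a string; cast to [int] where you need numbers.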
If you can only assume that the property names are composed of only \w characters:
@'
[{listenport:443,connectaddress:10.1.10.20,connectport:443,firewallrulename:port443,direction:Inbound,action:Allow,protocol:TCP},{listenport:80,connectaddress:10.1.10.20,connectport:80,firewallrulename:port80,direction:Inbound,action:Allow,protocol:TCP}]
'@ -replace '(\w+):', '"$1":"' -replace '\}|(?<!\}),', '"$&'
Eventually I hacked it by chaining Replace() calls:
$proxyinfosjson = $proxyinfosjson.Replace(',', '","').Replace('{', '{"').Replace('}', '"}').replace(':', '":"').Replace('}"', '}').Replace('"{', '{')
So ugly, and I'm not proud of it, but it works.

Getting the binary value of a character string in Powershell

$getInput = Read-Host "ASCII or Binary? `n"
$getInput = $getInput.toLower()
if($getInput -eq "ascii"){
""
#Write-Host "Type In Your ASCII" -backgroundcolor "black"
$getAscii = Read-Host "Type In Your ASCII`n"
""
""
$readAscii = @($getAscii)
[byte[]]$outBytes = $readAscii
}
elseif($getInput -eq "binary"){
}
else{
Write-Host "Wrong Input... [ASCII] or [BINARY]" -backgroundcolor "red" -foregroundcolor "white"
}
I want to be able to take a user's paragraph, or whatever string they enter, and convert it to binary. [convert]::ToString($getAscii,2) only works for integers.
Try this
$string = "ABCDEF"
[system.Text.Encoding]::Default.GetBytes($String) | %{[System.Convert]::ToString($_,2).PadLeft(8,'0') }
[system.Text.Encoding]::Default.GetBytes($String)
This turns a string into a byte array. You can change Default to another Encoding
| %{[System.Convert]::ToString($_,2).PadLeft(8,'0') }
This turns each byte in the byte array into a binary representation.
[System.Convert]::ToString(value, toBase) takes the byte's numeric value (65 for "A", say) and renders it in the given base; the 2 means base 2. You could also use 8 (octal), 10 (decimal, the same as passing no base) or 16 (hex). PadLeft(8,'0') then pads the left with '0' characters until the result is 8 characters long.
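The question also has an empty "binary" branch, so here is a hedged sketch of the reverse direction. It assumes the binary input is a list of 8-bit strings like the ones produced above, and the variable names are only illustrative:

```powershell
# Forward: string -> bytes -> 8-digit binary strings.
$bits = [System.Text.Encoding]::UTF8.GetBytes('Hi') |
    ForEach-Object { [System.Convert]::ToString($_, 2).PadLeft(8, '0') }
# $bits now holds 01001000 and 01101001

# Reverse: parse each binary string as a base-2 byte, then decode the bytes.
[byte[]]$bytes = $bits | ForEach-Object { [System.Convert]::ToByte($_, 2) }
$text = [System.Text.Encoding]::UTF8.GetString($bytes)
$text   # -> Hi
```

Use the same Encoding in both directions, otherwise non-ASCII text will not round-trip.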
'hello world' -split '' | % {
if ($_ -ne '') {
#[int][char]$_
[System.Convert]::ToString(([int][char]$_),2)
}
}
Use the split operator to split the string by each character
Send that down the pipeline to a foreach-object loop
The split operation ends up including empty strings at the start and end of the result,
so the conditional makes sure we don't act upon them--we filter them out.
The commented line was for testing purposes. Each character has a
TYPE of [string] and we need it as a [char] so we explicitly cast it
as such and the PowerShell engine dynamically switches it for us (as
long as it can). In the same line, we explicitly cast the [char] to
an [int] to get the ASCII->decimal representation. This test was just to
ensure I was getting the right output and I left it commented in
case the OP wanted to see it.
Finally, we use the ToString() method of the System.Convert class, which accepts a "base"
parameter to define that we want a base-2 (binary) representation of
the integer supplied in position 1, returned as a [string].
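A possible variation, not from the answer itself: String.ToCharArray() gives you [char] values directly, so there are no empty strings to filter out and no [string] to [char] cast. A small sketch:

```powershell
# Each element is already a [char]; cast to [int] for its code unit value.
$bits = 'hi'.ToCharArray() |
    ForEach-Object { [System.Convert]::ToString([int]$_, 2) }
$bits   # -> 1101000 and 1101001 (one per line, not padded to 8 digits)
```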
I recommend utilizing the Encoding library similarly to this user:
$stringToConvert = "Hello World"
$test = [System.Text.Encoding]::UTF8.GetBytes($stringToConvert) | %{ [System.Convert]::ToString($_,2).PadLeft(8,'0') }
$test
Source: https://www.reddit.com/r/PowerShell/comments/3e82vk/convert_string_to_binary_and_back/
*Note: I believe the original poster of this method intended to assign $foo to the second conversion. I believe it will work either way because the return will be dumped to the variable below.

String.Trim() not removing characters in a string

I need to create a String from a double, then use String.Trim() to remove the full stop, but it doesn't remove it. I think there is also a way to do this numerically but I'd like to do it with the string. Is there a reason it won't remove it? The output from the code is 5.5
$MyDouble = 5.5
[String]$MyDouble2 = $MyDouble
$MyDouble2.Trim(".")
$MyDouble2
String.Trim() only trims from the beginning and end of strings, so it has no effect in your command, because the . only occurs inside your input string.
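A quick way to see this for yourself (the variable names are just for the demonstration):

```powershell
# The dot is in the middle, so nothing is trimmed.
$inner = '5.5'.Trim('.')     # -> 5.5

# Leading and trailing dots are removed; the inner one stays.
$outer = '.5.5.'.Trim('.')   # -> 5.5
```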
If you truly want to remove just the . and keep the post-decimal-point digits, use the -replace operator:
$MyDouble2 -replace '\.' # -> '55'
Note:
* -replace takes a regex (regular expression) as the search operand, hence the need to escape regex metacharacter . as \.
* The above is short for $MyDouble2 -replace '\.', ''. Since the replacement string is the empty string in this case, it can be omitted.
If you only want to extract the integer portion, use either 4c74356b41's .Split()-based answer, or adapt the regex passed to -replace to match everything from the . through the end of the string.
$MyDouble2 -replace '\..*' # -> '5'
@Matt mentions the following alternatives:
For removing the . only: Using String.Replace() to perform literal substring replacement (note how . therefore does not need \-escaping, as it did with -replace, and that specifying the replacement string is mandatory):
$MyDouble2.Replace('.', '') # -> '55'
For removing the fractional part of the number (extracting the integer part only), using a numerical operation directly on $MyDouble (as opposed to via the string representation stored in $MyDouble2), via Math.Floor():
[math]::Floor($MyDouble) # -> 5 (still a [double])
Looking at some documentation for .Trim([char[]]) you will see that
Removes all leading and trailing occurrences of a set of characters specified in an array from the current String object.
That does not cover the middle of strings, so using the .Replace() method would accomplish that.
I think there is also a way to do this numerically but I'd like to do it with the string.
Just wanted to mention that converting numbers to strings to then drop decimals via string manipulation is a poor approach. Assuming your example is what you are actually trying to do, I suggest using a static method from the [math] class instead.
$MyDouble = 5.5
[math]::Floor($MyDouble)
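One caveat worth checking if your values can be negative (an assumption, since the question only shows 5.5): Floor rounds toward negative infinity, while Truncate simply drops the fractional part. A short comparison:

```powershell
$floorPos = [math]::Floor(5.5)       # -> 5
$floorNeg = [math]::Floor(-5.5)      # -> -6
$truncNeg = [math]::Truncate(-5.5)   # -> -5
```

For positive numbers the two give the same result.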
$MyDouble = 5.5
[String]$MyDouble2 = $MyDouble
$MyDouble2.Replace(".", "")
Well, Trim() only removes characters at the start or end of a string, and your . is in the middle. What you (probably) need is:
$MyDouble = 5.5
[String]$MyDouble2 = $MyDouble
$MyDouble2.Split(".")[0]
$MyDouble = 5.5
[String]$MyDouble2 = $MyDouble
$res=$MyDouble2 -split "\."
$res[0..($res.Count-1)] -join ""

Prevent coercion

Assuming:
Function Invoke-Foo {
Param(
[string[]]$Ids
)
Foreach ($Id in $Ids) {
Write-Host $Id
}
}
If I do this:
PS> Invoke-Foo -ids '0000','0001'
0000
0001
If I do this:
PS> Invoke-Foo -ids 0000,0001
0
1
In the second case, is there a way to prevent the coercion, other than make them explicit strings (first case)?
No, unfortunately not.
From the about_Parsing help file:
When processing a command, the Windows PowerShell parser operates
in expression mode or in argument mode:
- In expression mode, character string values must be contained in
quotation marks. Numbers not enclosed in quotation marks are treated
as numerical values (rather than as a series of characters).
- In argument mode, each value is treated as an expandable string
unless it begins with one of the following special characters: dollar
sign ($), at sign (@), single quotation mark ('), double quotation
mark ("), or an opening parenthesis (().
So, the parser evaluates 0001 before anything is passed to the function. We can test the effect of treating 0001 as an "Expandable String" with the ExpandString() method:
PS C:\> $ExecutionContext.InvokeCommand.ExpandString(0001)
1
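You can also watch the conversion happen at the call site. In this sketch Get-ArgType and Get-IdText are hypothetical helper names made up for the demonstration; the first has an untyped parameter, so it reports whatever type the parser produced:

```powershell
# Untyped parameter: shows the type the argument already has on arrival.
function Get-ArgType { param($Value) $Value.GetType().Name }

$bare   = Get-ArgType 0001     # -> Int32
$quoted = Get-ArgType '0001'   # -> String

# With [string[]], the numbers are converted back to text, minus the zeros.
function Get-IdText { param([string[]]$Ids) $Ids }
$ids = Get-IdText 0000,0001    # -> 0 and 1
```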
At least, if you are sure that your ids are in the range [0, 9999], you can do the formatting like this:
Function Invoke-Foo {
Param([int[]]$Ids)
Foreach ($Id in $Ids) {
Write-Host ([System.String]::Format("{0:D4}", $Id))
}
}
More about padding numbers with leading zeros can be found here.
What's important to note here:
Padding works for numbers. I changed the parameter type to int[] so that any strings you pass are converted to numbers and get padded too.
This method (as it stands) limits you to the range of ids I mentioned before, and it always gives you output zero-padded to four digits, even if you pass it '003'.
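For reference, a small sketch of how the D4 format behaves, using the -f operator as a shorthand for [System.String]::Format (the variable names are illustrative):

```powershell
$short = '{0:D4}' -f 3          # -> 0003
$text  = '{0:D4}' -f [int]'003' # -> 0003 (the string is converted to 3 first)
$long  = '{0:D4}' -f 12345      # -> 12345 (D4 is a minimum width, nothing is cut off)
```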

How to convert Unicode characters to escape codes

So, I have a bunch of strings like this: {\b\cf12 よろてそ } . I'm thinking I could iterate over each character and replace any unicode (Edit: Anything where AscW(char) > 127 or < 0) with a unicode escape code (\u###). However, I'm not sure how to programmatically do so. Any suggestions?
Clarification:
I have a string like {\b\cf12 よろてそ } and I want a string like {\b\cf12 [STUFF]}, where [STUFF] will display as よろてそ when I view the rtf text.
You can simply use the AscW() function to get the correct value:-
sRTF = "\u" & CStr(AscW(char))
Note that, unlike other Unicode escapes, RTF uses the decimal signed short int (2 bytes) representation of the character. AscW() returns exactly that kind of value, which makes the conversion in VB6 really quite easy.
Edit
As MarkJ points out in a comment you would only do this for characters outside of 0-127 but then you would also need to give some other characters inside the 0-127 range special handling as well.
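To keep the examples on this page in one language, here is the same idea sketched in PowerShell rather than VB6. ConvertTo-RtfEscaped is a made-up name, and two things are assumptions on my part: the input is already RTF (so \, { and } are left alone), and each escape is followed by "?" as the fallback character for readers that do not understand \u.

```powershell
function ConvertTo-RtfEscaped {
    param([string]$Text)
    $sb = [System.Text.StringBuilder]::new()
    foreach ($c in $Text.ToCharArray()) {
        $code = [int]$c
        if ($code -gt 127) {
            # RTF wants a signed 16-bit decimal, so wrap values above 32767.
            if ($code -gt 32767) { $code -= 65536 }
            [void]$sb.Append(('\u{0}?' -f $code))
        }
        else {
            [void]$sb.Append($c)   # ASCII passes through unchanged
        }
    }
    $sb.ToString()
}

# Build the four Japanese characters from their code points.
$jp  = -join [char[]](0x3088, 0x308D, 0x3066, 0x305D)
$rtf = ConvertTo-RtfEscaped "{\b\cf12 $jp }"
$rtf   # -> {\b\cf12 \u12424?\u12429?\u12390?\u12381? }
```

Characters outside the BMP arrive as two surrogate [char]s and are emitted as two \u escapes here.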
Another more roundabout way, would be to add the MSScript.OCX to the project and interface with VBScript's Escape function. For example
Sub main()
Dim s As String
s = ChrW$(&H3088) & ChrW$(&H308D) & ChrW$(&H3066) & ChrW$(&H305D)
Debug.Print MyEscape(s)
End Sub
Function MyEscape(s As String) As String
Dim scr As Object
Set scr = CreateObject("MSScriptControl.ScriptControl")
scr.Language = "VBScript"
scr.Reset
MyEscape = scr.eval("escape(" & dq(s) & ")")
End Function
Function dq(s)
dq = Chr$(34) & s & Chr$(34)
End Function
The Main routine passes in the original Japanese characters and the debug output says:
%u3088%u308D%u3066%u305D
HTH