How can I stop Server.HtmlEncode to encode UTF8 characters? I set the codepage to UTF8 but didn't help
Here is my test case:
<%#CODEPAGE=65001%>
<%
Response.CodePage = 65001
Response.CharSet = "utf-8"
%>
<%=Server.HtmlEncode("русский stuff <b>bold stuff</b>")%>
It should normally output this:
русский stuff <b>bold stuff</b>
but the output is:
русский stuff <b>bold stuff</b>
Server.HtmlEncode method escapes >,<,&," chars and any ascii code character whose code is greater than or equal to 0x80.You can filter which character will be escaped.
There are generic characters will be encoded in pattern.
If you prefer, you can add some other chars too.
Private Function cb_Escape(ByVal a, ByVal b, ByVal c, ByVal d)
cb_Escape = Server.HTMLEncode(b)
End Function
Private Function HTMLEncode2(ByVal sHTML)
Dim oReg
Set oReg = New RegExp
oReg.Global = True
oReg.Pattern = "([<>""&]+)"
HTMLEncode2 = oReg.Replace(sHTML, GetRef("cb_Escape"))
Set oReg = Nothing
End Function
Response.Write HTMLEncode2("русский stuff <b>bold stuff</b>")
Related
When Sha256 and md5 are added together include in page
<!--#include file="hash/sha256.asp"-->
<!---#include file="hash/md5.asp"--->
gives this error
Microsoft VBScript derleme hatasý hata '800a0411'
Ad yeniden tanýmlandý
/3wPay/include/hash/md5.asp, satır 3
Private Const BITS_TO_A_BYTE = 8
--------------^
but it works alone when md5 or sha256 is included.
How can I run both together
I don't know what that error says, but I'm assuming that BITS_TO_A_BYTE is already defined. I would clean up both of those files and created a 3rd file with all of your constants, and remove the constants from both of those files.
When you create your 3rd file, only define each constant once.
Then include it whenever you use either md5.asp or sha256.asp
As already mentioned, looks like you have 2 constants with the same name. A more efficient and compact alternative in my opinion would be to use the System.Security.Cryptography.X object, which supports MD5, SHA1, SHA256, SHA384 and SHA512.
The example code below also allows you to encode as Hex or Base64:
<%
Function Hash(ByVal Input, Alg, Encoding)
Dim hAlg, hEnc
Set hAlg = CreateObject("System.Security.Cryptography." & Get_Hash_Obj(Alg))
Set hEnc = CreateObject("System.Text.UTF8Encoding")
Hash = BinaryEncode(hAlg.ComputeHash_2(hEnc.GetBytes_4(Input)),Encoding)
Set hEnc = Nothing
Set hAlg = Nothing
End Function
Function Get_Hash_Obj(Alg)
' Get the cryptography class name for the specified hashing algorithm,
' return the class name for SHA1 if not found
Select Case uCase(Alg)
Case "MD5"
Get_Hash_Obj = "MD5CryptoServiceProvider"
Case "SHA1"
Get_Hash_Obj = "SHA1CryptoServiceProvider"
Case "SHA256"
Get_Hash_Obj = "SHA256Managed"
Case "SHA384"
Get_Hash_Obj = "SHA384Managed"
Case "SHA512"
Get_Hash_Obj = "SHA512Managed"
Case Else
Get_Hash_Obj = "SHA1CryptoServiceProvider"
End Select
End Function
Function BinaryEncode(Binary, Encoding)
Dim Enc
Encoding = lCase(Encoding)
Set Enc = CreateObject("MSXML2.DomDocument").CreateElement("encode")
If Encoding = "base64" OR Encoding = "b64" Then
' base64 string
Enc.DataType = "bin.base64"
Enc.NodeTypedValue = Binary
BinaryEncode = Enc.Text
Else
' Hexadecimal string
Enc.DataType = "bin.hex"
Enc.NodeTypedValue = Binary
BinaryEncode = Enc.Text
End If
Set Enc = Nothing
End Function
%>
Example:
Response.Write Hash("Hello World","SHA256","Hex")
Response.Write Hash("Hello World","SHA256","Base64") ' Or B64
Output:
a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e
pZGm1Av0IEBKARczz7exkNYsZb8LzaMrV7J32a2fFG4=
I want to encode a powershell command (a string (get-date).date) to base64 in order to run it via powershell -encodedcommand xxx.
Using the standard VBS methods (or even https://www.base64encode.org/) I get KGdldC1kYXRlKS5kYXRl which does not run.
Using the following powershell script:
$bytes = [System.Text.Encoding]::Unicode.GetBytes($command)
$encodedCommand = [Convert]::ToBase64String($bytes)
I get KABnAGUAdAAtAGQAYQB0AGUAKQAuAGQAYQB0AGUA which works. The difference appears to be that the command is first encoded as Unicode bytes.
Can anyone provide a VBS Function which does this or the VBS equivalent of Unicode.GetBytes() so that we can get the right string encoded?
PowerShell only accepts Base64 encodings of UTF-16 LE-encoded strings with its
-EncodedCommand parameter.
UTF-16 LE is what Unicode stands for in [System.Text.Encoding]::Unicode, and it encodes the vast majority of Unicode characters (code points) as two bytes each; it is also the string encoding used internally by both VBScript and PowerShell.
By contrast, most VBScript solutions out there use the single-byte ASCII encoding, and even the otherwise laudably Unicode-aware https://www.base64encode.org/ only offers UTF-8-based encoding (which is a mostly-single-byte-for-Western-languages encoding with other languages' chars. represented as 2-4 bytes).
Here's a robust UTF-16 LE-based Base64 encoding solution.
I've posted a more modular variant that optionally supports UTF-8 here; the code in both locations builds on this great answer.
Example call:
Base64EncodeUtf16Le("(get-date).date") ' -> "KABnAGUAdAAtAGQAYQB0AGUAKQAuAGQAYQB0AGUA"
Source code:Tip of the hat to MC ND for helping to simplify the solution.
' Base64-encodes the specified string using UTF-16 LE as the underlying text
' encoding.
Function Base64EncodeUtf16Le(ByVal sText)
Dim bytesUtf16Le
' Create an aux. stream from which we can get a binary (byte array)
' representation of the input string in UTF-16 LE encoding.
With CreateObject("ADODB.Stream")
' Create a UTF 16-LE encoded text stream...
.Type = 2 ' adTypeText
.Charset = "utf-16le"
.Open
.WriteText sText
' ... and convert it to a binary stream,
' so we can get the string as a byte array
.Position = 0
.Type = 1 ' adTypeBinary
.Position = 2 ' Skip BOM
bytesUtf16Le = .Read
.Close
End With
' Use an aux. XML document with a Base64-encoded element.
' Assigning a byte stream (array) to .NodeTypedValue
' automatically performs Base64-encoding, whose result can then be accessed
' as the element's text.
With CreateObject("Msxml2.DOMDocument").CreateElement("aux")
.DataType = "bin.base64"
.NodeTypedValue = bytesUtf16Le
Base64EncodeUtf16Le = .Text
End With
End Function
I'm not sure this will handle all your needs, but at least it matches the output indicated in your question
Function ToBase64( ByVal text )
Const adTypeText = 2
Const adTypeBinary = 1
' Right pad each character with a null
With New RegExp
.Pattern = "(.)"
.Global = True
text = .Replace( text, "$1" & Chr(0) )
End With
' Convert String to binary
With WScript.CreateObject("ADODB.Stream")
.Type = adTypeText
.CharSet = "us-ascii"
.Open
.WriteText text
.Position = 0
.Type = adTypeBinary
text = .Read
End With
' Encode binary as Base64
With WScript.CreateObject("Msxml2.DOMDocument.6.0").CreateElement("base64")
.dataType = "bin.base64"
.nodeTypedValue = text
ToBase64 = Replace( .text, vbLf, "" )
End With
End Function
WScript.Echo ToBase64("(get-date).date")
edited just to adapt my previous code to the information in mklement0 answer where you can find the details of the powershell requirements and all the details about how the code works.
Function ToBase64( ByVal text )
Const adTypeText = 2
Const adTypeBinary = 1
' Change string encoding
With WScript.CreateObject("ADODB.Stream")
' Convert input string to UTF-16 LE
.Type = adTypeText
.CharSet = "utf-16le"
.Open
.WriteText text
' Get binary representation of the string
.Position = 0
.Type = adTypeBinary
.Position = 2 ' Skip BOM
text = .Read
End With
' Encode binary as Base64
With WScript.CreateObject("Msxml2.DOMDocument.6.0").CreateElement("base64")
.dataType = "bin.base64"
.nodeTypedValue = text
ToBase64 = Replace( .text, vbLf, "" )
End With
End Function
WScript.Echo ToBase64("(get-date).date")
I need code for removing all unicode characters in a vb6 string.
If this is UTF-16 text (as normal VB6 String values all are) and you can ignore the issue of surrogate pairs, then this is fairly quick and reasonably concise:
Private Sub DeleteNonAscii(ByRef Text As String)
Dim I As Long
Dim J As Long
Dim Char As String
I = 1
For J = 1 To Len(Text)
Char = Mid$(Text, J, 1)
If (AscW(Char) And &HFFFF&) <= &H7F& Then
Mid$(Text, I, 1) = Char
I = I + 1
End If
Next
Text = Left$(Text, I - 1)
End Sub
This has the workaround for the unfortunate choice VB6 had to make in returning a signed 16-bit integer from the AscW() function. It should have been a Long for symmatry with ChrW$() but it is what it is.
It should beat the pants off any regular expression library in clarity, maintainability, and performance. If better performance is required for truly massive amounts of text then SAFEARRAY or CopyMemory stunts could be used.
Public Shared Function StripUnicodeCharactersFromString(ByVal inputValue As String) As String
Return Regex.Replace(inputValue, "[^\u0000-\u007F]", String.Empty)
End Function
Vb6 - not sure will
sRTF = "\u" & CStr(AscW(char))
work? - You could do this for all char values above 127
StrConv is the command for converting strings.
StrConv Function
Returns a Variant (String) converted as specified.
Syntax
StrConv(string, conversion, LCID)
The StrConv function syntax has these named arguments:
Part Description
string Required. String expression to be converted.
conversion Required. Integer. The sum of values specifying the type of conversion to perform. `128` is Unicode to local code page (or whatever the optional LCID is)
LCID Optional. The LocaleID, if different than the system LocaleID. (The system LocaleID is the default.)
I have a special case when the user is first typing through IME by Press Alphabetic KeyCode on my Grid UserControl, How do I pick up the Unicode on IME Window? If the user is typing in English, it is OK. But if the user is typing Chinese or Japanese on IME, the Unicode turns into question marks.
Select Case uMsg
Case WM_IME_SETCONTEXT
If Not wParam = 0 Then
Dim flag As Boolean
flag = ImmAssociateContextEx(lng_hWnd, 0, 16)
If flag Then
Dim IntPtr As Long
IntPtr = ImmGetContext(lng_hWnd)
flag = ImmSetOpenStatus(IntPtr, True)
End If
End If
Case WM_IME_STARTCOMPOSITION
Dim hIMC As Long
hIMC = ImmGetContext(lng_hWnd)
Dim cf As COMPOSITIONFORM
cf.dwStyle = 2
cf.ptCurrentPos.X = UserControl1.ScaleLeft + 3
cf.ptCurrentPos.Y = UserControl1.ScaleTop + UserControl1.Height - 16
ImmSetCompositionWindow hIMC, cf
Case WM_IME_CHAR
'Send IME Char to UserControl1.KeyPress
UserControl1_KeyPress(wParam And &HFFFF&)
Exit Sub
End Select
After I use different Subclasser from Krool, now I can get Right Unicode. Not sure why Paul Caton and LaVolpe cSelfSubHookCallBack doesn't work.
The Subclasser may internally turn Unicode to ANSI or failed to prevent Windows from UNICODE to ANSI conversion.
So, I have a bunch of strings like this: {\b\cf12 よろてそ } . I'm thinking I could iterate over each character and replace any unicode (Edit: Anything where AscW(char) > 127 or < 0) with a unicode escape code (\u###). However, I'm not sure how to programmatically do so. Any suggestions?
Clarification:
I have a string like {\b\cf12 よろてそ } and I want a string like {\b\cf12 [STUFF]}, where [STUFF] will display as よろてそ when I view the rtf text.
You can simply use the AscW() function to get the correct value:-
sRTF = "\u" & CStr(AscW(char))
Note unlike other escapes for unicode, RTF uses the decimal signed short int (2 bytes) representation for a unicode character. Which makes the conversion in VB6 really quite easy.
Edit
As MarkJ points out in a comment you would only do this for characters outside of 0-127 but then you would also need to give some other characters inside the 0-127 range special handling as well.
Another more roundabout way, would be to add the MSScript.OCX to the project and interface with VBScript's Escape function. For example
Sub main()
Dim s As String
s = ChrW$(&H3088) & ChrW$(&H308D) & ChrW$(&H3066) & ChrW$(&H305D)
Debug.Print MyEscape(s)
End Sub
Function MyEscape(s As String) As String
Dim scr As Object
Set scr = CreateObject("MSScriptControl.ScriptControl")
scr.Language = "VBScript"
scr.Reset
MyEscape = scr.eval("escape(" & dq(s) & ")")
End Function
Function dq(s)
dq = Chr$(34) & s & Chr$(34)
End Function
The Main routine passes in the original Japanese characters and the debug output says:
%u3088%u308D%u3066%u305D
HTH