I would like to get the UTF-8 Code of a character, have attempted to use streams but it doesn't seem to work:
Example: פ should give 16#D7A4, according to https://en.wikipedia.org/wiki/Pe_(Semitic_letter)#Character_encodings
Const adTypeBinary = 1
Dim adoStr, bytesthroughado
Set adoStr = CreateObject("Adodb.Stream")
adoStr.Charset = "utf-8"
adoStr.Open
adoStr.WriteText labelString
adoStr.Position = 0
adoStr.Type = adTypeBinary
adoStr.Position = 3
bytesthroughado = adoStr.Read
Msgbox(LenB(bytesthroughado)) 'gives 2
adoStr.Close
Set adoStr = Nothing
MsgBox(bytesthroughado) ' gives K
Note: AscW gives Unicode - not UTF-8
The bytesthroughado is a value of byte() subtype (see 1st output line) so you need to handle it in an appropriate way:
Option Explicit
Dim ss, xx, ii, jj, char, labelString
labelString = "ařЖפ€"
ss = ""
For ii=1 To Len( labelString)
char = Mid( labelString, ii, 1)
xx = BytesThroughAdo( char)
If ss = "" Then ss = VarType(xx) & " " & TypeName( xx) & vbNewLine
ss = ss & char & vbTab
For jj=1 To LenB( xx)
ss = ss & Hex( AscB( MidB( xx, jj, 1))) & " "
Next
ss = ss & vbNewLine
Next
Wscript.Echo ss
Function BytesThroughAdo( labelChar)
Const adTypeBinary = 1 'Indicates binary data.
Const adTypeText = 2 'Default. Indicates text data.
Dim adoStream
Set adoStream = CreateObject( "Adodb.Stream")
adoStream.Charset = "utf-8"
adoStream.Open
adoStream.WriteText labelChar
adoStream.Position = 0
adoStream.Type = adTypeBinary
adoStream.Position = 3
BytesThroughAdo = adoStream.Read
adoStream.Close
Set adoStream = Nothing
End Function
Output:
cscript D:\bat\SO\61368074q.vbs
8209 Byte()
a 61
ř C5 99
Ж D0 96
פ D7 A4
€ E2 82 AC
I used characters ařЖפ€ to demonstrate the functionality of your UTF-8 encoder (the alts8.ps1 PowerShell script comes from another project):
alts8.ps1 "ařЖפ€"
Ch Unicode Dec CP IME UTF-8 ? IME 0405/cs-CZ; CP852; ANSI 1250
a U+0061 97 …97… 0x61 a Latin Small Letter A
ř U+0159 345 …89… 0xC599 Å� Latin Small Letter R With Caron
Ж U+0416 1046 …22… 0xD096 Ð� Cyrillic Capital Letter Zhe
פ U+05E4 1508 …228… 0xD7A4 פ Hebrew Letter Pe
€ U+20AC 8364 …172… 0xE282AC â�¬ Euro Sign
I have a web service load driver that's a Windows Script File (WSF), that includes some VBScript and JavaScript files. My web service requires that the incoming message is base64 encoded. I currently have a VBScript function that does this, but it's very inefficient (memory intensive, mostly due to VBScripts awful string concatenation)
[Aside; Yes, I've seen Jeff's latest blog post. The concatenation is happening in a loop across messages that are 1,000's to 10,000's bytes in size.]
I've tried using some custom string concatenation routines; one using an array and one using ADODB.Stream. These help, a little, but I think it would help more if I had some other way of encoding the message rather than via my own VBS function.
Is there some other way of encoding my message, preferebly using native Windows methods?
I was originally using some VBScript code from Antonin Foller:
Base64 Encode VBS Function and Base64 Decode VBS Function.
Searching Antonin's site, I saw he had some code for quoted printable encoding, using the CDO.Message object, so I tried that.
Finally, I ported the code mentioned in Mark's answer to VBScript (also used some code from this SO question), and used the Stream___StringToBinary and Stream_BinaryToString functions from Antonin's site to get functions that used MSXML encoding.
I ran a quick test to measure the encoding time for a 1,500 character message (the average message size I need to send to my web service) across all four methods:
Native VBScript (VBScript)
Quoted Printable, using CDO.Message (QP)
Quoted Printable Binary, using CDO.Message (QP Binary)
MSXML/ADODB.Stream (MSXML)
Here are the results:
Iterations : 10,000
Message Size : 1,500
+-------------+-----------+
+ Method | Time (ms) +
+-------------+-----------+
| VBScript | 301,391 |
+-------------+-----------+
| QP | 12,922 |
+-------------+-----------+
| QP (Binary) | 13,953 |
+-------------+-----------+
| MSXML | 3,312 |
+-------------+-----------+
I also monitored the memory utilization (Mem Usage for the cscript.exe process in the Windows Task Manager) while the test was running. I don't have any raw numbers, but the memory utilization for both the quoted printable and MSXML solutions were below the VBScript solution (7,000K for the former, around 16,000K for VBScript).
I decided to go with the MSXML solution for my driver. For those interested, here's the code I'm using:
base64.vbs
Function Base64Encode(sText)
Dim oXML, oNode
Set oXML = CreateObject("Msxml2.DOMDocument.3.0")
Set oNode = oXML.CreateElement("base64")
oNode.dataType = "bin.base64"
oNode.nodeTypedValue =Stream_StringToBinary(sText)
Base64Encode = oNode.text
Set oNode = Nothing
Set oXML = Nothing
End Function
Function Base64Decode(ByVal vCode)
Dim oXML, oNode
Set oXML = CreateObject("Msxml2.DOMDocument.3.0")
Set oNode = oXML.CreateElement("base64")
oNode.dataType = "bin.base64"
oNode.text = vCode
Base64Decode = Stream_BinaryToString(oNode.nodeTypedValue)
Set oNode = Nothing
Set oXML = Nothing
End Function
'Stream_StringToBinary Function
'2003 Antonin Foller, http://www.motobit.com
'Text - string parameter To convert To binary data
Function Stream_StringToBinary(Text)
Const adTypeText = 2
Const adTypeBinary = 1
'Create Stream object
Dim BinaryStream 'As New Stream
Set BinaryStream = CreateObject("ADODB.Stream")
'Specify stream type - we want To save text/string data.
BinaryStream.Type = adTypeText
'Specify charset For the source text (unicode) data.
BinaryStream.CharSet = "us-ascii"
'Open the stream And write text/string data To the object
BinaryStream.Open
BinaryStream.WriteText Text
'Change stream type To binary
BinaryStream.Position = 0
BinaryStream.Type = adTypeBinary
'Ignore first two bytes - sign of
BinaryStream.Position = 0
'Open the stream And get binary data from the object
Stream_StringToBinary = BinaryStream.Read
Set BinaryStream = Nothing
End Function
'Stream_BinaryToString Function
'2003 Antonin Foller, http://www.motobit.com
'Binary - VT_UI1 | VT_ARRAY data To convert To a string
Function Stream_BinaryToString(Binary)
Const adTypeText = 2
Const adTypeBinary = 1
'Create Stream object
Dim BinaryStream 'As New Stream
Set BinaryStream = CreateObject("ADODB.Stream")
'Specify stream type - we want To save binary data.
BinaryStream.Type = adTypeBinary
'Open the stream And write binary data To the object
BinaryStream.Open
BinaryStream.Write Binary
'Change stream type To text/string
BinaryStream.Position = 0
BinaryStream.Type = adTypeText
'Specify charset For the output text (unicode) data.
BinaryStream.CharSet = "us-ascii"
'Open the stream And get text/string data from the object
Stream_BinaryToString = BinaryStream.ReadText
Set BinaryStream = Nothing
End Function
This answer improves on Patrick Cuff's great answer in that it adds support for UTF-8 and UTF-16 LE encodings ("Unicode"). (Additionally, the code is streamlined).
Examples:
' Base64-encode: from UTF-8-encoded bytes.
Base64Encode("Motörhead", False) ' "TW90w7ZyaGVhZA=="
' Base64-encode: from UTF-16 LE-encoded bytes.
Base64Encode("Motörhead", True) ' "TQBvAHQA9gByAGgAZQBhAGQA"
' Base64-decode: back to a VBScript string via UTF-8.
Base64Decode("TW90w7ZyaGVhZA==", False) ' "Motörhead"
' Base64-decode: back to a VBScript string via UTF-16 LE.
Base64Decode("TQBvAHQA9gByAGgAZQBhAGQA", True) ' "Motörhead"
Important:
If you want to be able to represent all Unicode characters (e.g., €) as literals in your .vbs file, save it as UTF-16LE ("Unicode").
If your script is run as a console application, via cscript.exe, not all Unicode characters may render correctly in direct-to-display output (due to font limitations, but you can copy & paste them) and, more importantly, if you try to capture or redirect the output, any non-ASCII-range characters that aren't part of the console's OEM code page are effectively lost (replaced with literal ? characters).
' Base64-encodes the specified string.
' Parameter fAsUtf16LE determines how the input text is encoded at the
' byte level before Base64 encoding is applied.
' * Pass False to use UTF-8 encoding.
' * Pass True to use UTF-16 LE encoding.
Function Base64Encode(ByVal sText, ByVal fAsUtf16LE)
' Use an aux. XML document with a Base64-encoded element.
' Assigning the byte stream (array) returned by StrToBytes() to .NodeTypedValue
' automatically performs Base64-encoding, whose result can then be accessed
' as the element's text.
With CreateObject("Msxml2.DOMDocument").CreateElement("aux")
.DataType = "bin.base64"
if fAsUtf16LE then
.NodeTypedValue = StrToBytes(sText, "utf-16le", 2)
else
.NodeTypedValue = StrToBytes(sText, "utf-8", 3)
end if
Base64Encode = .Text
End With
End Function
' Decodes the specified Base64-encoded string.
' If the decoded string's original encoding was:
' * UTF-8, pass False for fIsUtf16LE.
' * UTF-16 LE, pass True for fIsUtf16LE.
Function Base64Decode(ByVal sBase64EncodedText, ByVal fIsUtf16LE)
Dim sTextEncoding
if fIsUtf16LE Then sTextEncoding = "utf-16le" Else sTextEncoding = "utf-8"
' Use an aux. XML document with a Base64-encoded element.
' Assigning the encoded text to .Text makes the decoded byte array
' available via .nodeTypedValue, which we can pass to BytesToStr()
With CreateObject("Msxml2.DOMDocument").CreateElement("aux")
.DataType = "bin.base64"
.Text = sBase64EncodedText
Base64Decode = BytesToStr(.NodeTypedValue, sTextEncoding)
End With
End Function
' Returns a binary representation (byte array) of the specified string in
' the specified text encoding, such as "utf-8" or "utf-16le".
' Pass the number of bytes that the encoding's BOM uses as iBomByteCount;
' pass 0 to include the BOM in the output.
function StrToBytes(ByVal sText, ByVal sTextEncoding, ByVal iBomByteCount)
' Create a text string with the specified encoding and then
' get its binary (byte array) representation.
With CreateObject("ADODB.Stream")
' Create a stream with the specified text encoding...
.Type = 2 ' adTypeText
.Charset = sTextEncoding
.Open
.WriteText sText
' ... and convert it to a binary stream to get a byte-array
' representation.
.Position = 0
.Type = 1 ' adTypeBinary
.Position = iBomByteCount ' skip the BOM
StrToBytes = .Read
.Close
End With
end function
' Returns a string that corresponds to the specified byte array, interpreted
' with the specified text encoding, such as "utf-8" or "utf-16le".
function BytesToStr(ByVal byteArray, ByVal sTextEncoding)
If LCase(sTextEncoding) = "utf-16le" then
' UTF-16 LE happens to be VBScript's internal encoding, so we can
' take a shortcut and use CStr() to directly convert the byte array
' to a string.
BytesToStr = CStr(byteArray)
Else ' Convert the specified text encoding to a VBScript string.
' Create a binary stream and copy the input byte array to it.
With CreateObject("ADODB.Stream")
.Type = 1 ' adTypeBinary
.Open
.Write byteArray
' Now change the type to text, set the encoding, and output the
' result as text.
.Position = 0
.Type = 2 ' adTypeText
.CharSet = sTextEncoding
BytesToStr = .ReadText
.Close
End With
End If
end function
It's possible to encode base64 in pure vbscript without ADODB.Stream and MSXml2.DOMDocument.
for example:
Function btoa(sourceStr)
Dim i, j, n, carr, rarr(), a, b, c
carr = Array("A", "B", "C", "D", "E", "F", "G", "H", _
"I", "J", "K", "L", "M", "N", "O" ,"P", _
"Q", "R", "S", "T", "U", "V", "W", "X", _
"Y", "Z", "a", "b", "c", "d", "e", "f", _
"g", "h", "i", "j", "k", "l", "m", "n", _
"o", "p", "q", "r", "s", "t", "u", "v", _
"w", "x", "y", "z", "0", "1", "2", "3", _
"4", "5", "6", "7", "8", "9", "+", "/")
n = Len(sourceStr)-1
ReDim rarr(n\3)
For i=0 To n Step 3
a = AscW(Mid(sourceStr,i+1,1))
If i < n Then
b = AscW(Mid(sourceStr,i+2,1))
Else
b = 0
End If
If i < n-1 Then
c = AscW(Mid(sourceStr,i+3,1))
Else
c = 0
End If
rarr(i\3) = carr(a\4) & carr((a And 3) * 16 + b\16) & carr((b And 15) * 4 + c\64) & carr(c And 63)
Next
i = UBound(rarr)
If n Mod 3 = 0 Then
rarr(i) = Left(rarr(i),2) & "=="
ElseIf n Mod 3 = 1 Then
rarr(i) = Left(rarr(i),3) & "="
End If
btoa = Join(rarr,"")
End Function
Function char_to_utf8(sChar)
Dim c, b1, b2, b3
c = AscW(sChar)
If c < 0 Then
c = c + &H10000
End If
If c < &H80 Then
char_to_utf8 = sChar
ElseIf c < &H800 Then
b1 = c Mod 64
b2 = (c - b1) / 64
char_to_utf8 = ChrW(&HC0 + b2) & ChrW(&H80 + b1)
ElseIf c < &H10000 Then
b1 = c Mod 64
b2 = ((c - b1) / 64) Mod 64
b3 = (c - b1 - (64 * b2)) / 4096
char_to_utf8 = ChrW(&HE0 + b3) & ChrW(&H80 + b2) & ChrW(&H80 + b1)
Else
End If
End Function
Function str_to_utf8(sSource)
Dim i, n, rarr()
n = Len(sSource)
ReDim rarr(n - 1)
For i=0 To n-1
rarr(i) = char_to_utf8(Mid(sSource,i+1,1))
Next
str_to_utf8 = Join(rarr,"")
End Function
Function str_to_base64(sSource)
str_to_base64 = btoa(str_to_utf8(sSource))
End Function
'test
msgbox btoa("Hello") 'SGVsbG8=
msgbox btoa("Hell") 'SGVsbA==
msgbox str_to_base64("中文한국어") '5Lit5paH7ZWc6rWt7Ja0
If there are wide characters (AscW(c) > 255 or < 0) in your string, you can convert it to utf-8 before call btoa.
utf-8 convertion also can be written in pure vbscript.
So I have some other full example of encoder and decoder:
Encoder:
' This script reads jpg picture named SuperPicture.jpg, converts it to base64
' code using encoding abilities of MSXml2.DOMDocument object and saves
' the resulting data to encoded.txt file
Option Explicit
Const fsDoOverwrite = true ' Overwrite file with base64 code
Const fsAsASCII = false ' Create base64 code file as ASCII file
Const adTypeBinary = 1 ' Binary file is encoded
' Variables for writing base64 code to file
Dim objFSO
Dim objFileOut
' Variables for encoding
Dim objXML
Dim objDocElem
' Variable for reading binary picture
Dim objStream
' Open data stream from picture
Set objStream = CreateObject("ADODB.Stream")
objStream.Type = adTypeBinary
objStream.Open()
objStream.LoadFromFile("SuperPicture.jpg")
' Create XML Document object and root node
' that will contain the data
Set objXML = CreateObject("MSXml2.DOMDocument")
Set objDocElem = objXML.createElement("Base64Data")
objDocElem.dataType = "bin.base64"
' Set binary value
objDocElem.nodeTypedValue = objStream.Read()
' Open data stream to base64 code file
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFileOut = objFSO.CreateTextFile("encoded.txt", fsDoOverwrite, fsAsASCII)
' Get base64 value and write to file
objFileOut.Write objDocElem.text
objFileOut.Close()
' Clean all
Set objFSO = Nothing
Set objFileOut = Nothing
Set objXML = Nothing
Set objDocElem = Nothing
Set objStream = Nothing
Decoder:
' This script reads base64 encoded picture from file named encoded.txt,
' converts it in to back to binary reprisentation using encoding abilities
' of MSXml2.DOMDocument object and saves data to SuperPicture.jpg file
Option Explicit
Const foForReading = 1 ' Open base 64 code file for reading
Const foAsASCII = 0 ' Open base 64 code file as ASCII file
Const adSaveCreateOverWrite = 2 ' Mode for ADODB.Stream
Const adTypeBinary = 1 ' Binary file is encoded
' Variables for reading base64 code from file
Dim objFSO
Dim objFileIn
Dim objStreamIn
' Variables for decoding
Dim objXML
Dim objDocElem
' Variable for write binary picture
Dim objStream
' Open data stream from base64 code filr
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFileIn = objFSO.GetFile("encoded.txt")
Set objStreamIn = objFileIn.OpenAsTextStream(foForReading, foAsASCII)
' Create XML Document object and root node
' that will contain the data
Set objXML = CreateObject("MSXml2.DOMDocument")
Set objDocElem = objXML.createElement("Base64Data")
objDocElem.DataType = "bin.base64"
' Set text value
objDocElem.text = objStreamIn.ReadAll()
' Open data stream to picture file
Set objStream = CreateObject("ADODB.Stream")
objStream.Type = adTypeBinary
objStream.Open()
' Get binary value and write to file
objStream.Write objDocElem.NodeTypedValue
objStream.SaveToFile "SuperPicture.jpg", adSaveCreateOverWrite
' Clean all
Set objFSO = Nothing
Set objFileIn = Nothing
Set objStreamIn = Nothing
Set objXML = Nothing
Set objDocElem = Nothing
Set objStream = Nothing
This is a decode example that does not use the ADODB object.
option explicit
dim inobj,outobj,infile,myname,state,rec,outfile,content,table(256),bits,c,x,outword
state = 0
const r64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
myname = wscript.scriptfullname
set inobj = createobject("Scripting.FileSystemObject")
set outobj = createobject("Scripting.FileSystemObject")
set infile = inobj.opentextfile(myname,1)
set outfile = outobj.createtextfile("q.png")
for x = 1 to 256 step 1
table(x) = -1
next
for x = 1 to 64 step 1
table(1+asc(mid(r64,x,1))) = x - 1
next
bits = 0
do until(infile.atendofstream)
dim size
rec = infile.readline
if (state = 1) then
content = mid(rec,2)
size = len(content)
for x = 1 to size step 1
c = table(1+asc(mid(content,x,1)))
if (c <> -1) then
if (bits = 0) then
outword = c*4
bits = 6
elseif (bits = 2) then
outword = c+outword
outfile.write(chr(clng("&H" & hex(outword mod 256))))
bits = 0
elseif (bits = 4) then
outword = outword + int(c/4)
outfile.write(chr(clng("&H" & hex(outword mod 256))))
outword = c*64
bits = 2
else
outword = outword + int(c/16)
outfile.write(chr(clng("&H" & hex(outword mod 256))))
outword = c*16
bits = 4
end if
end if
next
end if
if (rec = "'PAYLOAD") then
state = 1
end if
loop
infile.close
outfile.close
wscript.echo "q.png created"
wscript.quit
'PAYLOAD
'iVBORw0KGgoAAAANSUhEUgAAAD4AAAA+CAIAAAD8oz8TAAABoklEQVRo3u2awQrDMAxDl7H/
'/+Xu0EsgSDw7hRF7vWywpO0UW5acjOu6Xmde79ex1+f+GGPACfcqzePXdVvvts7iv6rx56Ou
'8FNYkgyZx9xzZ3TVHfg7VEHdR+o6ZsWV54O/yDvUQj2KzYyH5wof5f14fR97xdPrmjy1ArVQ
'55yteMYzEqma5B2qoM5VBK+OuXUrHutjJ8c59l4z/vV6Vv15PbOjiFRunB/rOcYgIz1jEPek
'nnh+rBPsiYbOaRu/DipzKrqkqNOJdgEIF3mNVLGa7jM9YSReg+t6U/UvFTYqmn13gGeUr9C1
'ul85rlCVgVTHnGeo2xGIdnT3PRR3vbUYhjAJqXxRHxTtslfsrxOe8aziWdlnAukRVPGmuX9P
'KnG0y9Wjv+71IPf8JEMIZxeP9ZHDkvO0z6XoXmlF1APTMIpR38R5qd8ZAa7gc76JaMl+ZwR4
'N0vdn6hRf89+ZwRIXZy/e473bks9sd9uterERvmbKP4end6cVlFRHt2n9mxTN9b3PTzfIco5
'4Ip9mGd1ud8bUriS3Oh6RuC318GofwHqKhl/Nn0DHQAAAABJRU5ErkJggg==
So you can use this object to Encode or Decode Base64 = CreateObject("Msxml2.DOMDocument.3.0")
And use Array to Encode or Decode It.
More info VBS_Array
Here is my way :
Function Base64Encode(sText)
Set oNode = CreateObject("Msxml2.DOMDocument.3.0").CreateElement("base64")
oNode.dataType = "bin.base64"
oNode.nodeTypedValue =Stream_StringToBinary(sText)
Base64Encode = oNode.text
Set oNode = Nothing
End Function
Function Base64Decode(ByVal vCode)
Set oNode = CreateObject("Msxml2.DOMDocument.3.0").CreateElement("base64")
oNode.dataType = "bin.base64"
oNode.text = vCode
Base64Decode = Stream_BinaryToString(oNode.nodeTypedValue)
Set oNode = Nothing
End Function
Function Stream_StringToBinary(Text)
Set BinaryStream = CreateObject("ADODB.Stream")
BinaryStream.Type = 2
' All Format => utf-16le - utf-8 - utf-16le
BinaryStream.CharSet = "us-ascii"
BinaryStream.Open
BinaryStream.WriteText Text
BinaryStream.Position = 0
BinaryStream.Type = 1
BinaryStream.Position = 0
Stream_StringToBinary = BinaryStream.Read
Set BinaryStream = Nothing
End Function
Function Stream_BinaryToString(Binary)
Set BinaryStream = CreateObject("ADODB.Stream")
BinaryStream.Type = 1
BinaryStream.Open
BinaryStream.Write Binary
BinaryStream.Position = 0
BinaryStream.Type = 2
' All Format => utf-16le - utf-8 - utf-16le
BinaryStream.CharSet = "utf-8"
Stream_BinaryToString = BinaryStream.ReadText
Set BinaryStream = Nothing
End Function
''''''''''''''''''''''''''''''''''''''''''''''Testing'''''''''''''''''''''''''''''''''''''''''
arr=array("Hello","&Welcome","To My Program")
For Each Endcode In arr
WSH.Echo Base64Encode(Endcode)
Next
arr=array("2LPZhNin2YU==","R29vZA==","QnkhIQ==")
For Each Decode In arr
WSH.Echo Base64Decode(Decode)
Next