Base64 Encode String in VBScript - encoding

I have a web service load driver that's a Windows Script File (WSF), that includes some VBScript and JavaScript files. My web service requires that the incoming message is base64 encoded. I currently have a VBScript function that does this, but it's very inefficient (memory intensive, mostly due to VBScripts awful string concatenation)
[Aside; Yes, I've seen Jeff's latest blog post. The concatenation is happening in a loop across messages that are 1,000's to 10,000's bytes in size.]
I've tried using some custom string concatenation routines; one using an array and one using ADODB.Stream. These help, a little, but I think it would help more if I had some other way of encoding the message rather than via my own VBS function.
Is there some other way of encoding my message, preferebly using native Windows methods?

I was originally using some VBScript code from Antonin Foller:
Base64 Encode VBS Function and Base64 Decode VBS Function.
Searching Antonin's site, I saw he had some code for quoted printable encoding, using the CDO.Message object, so I tried that.
Finally, I ported the code mentioned in Mark's answer to VBScript (also used some code from this SO question), and used the Stream___StringToBinary and Stream_BinaryToString functions from Antonin's site to get functions that used MSXML encoding.
I ran a quick test to measure the encoding time for a 1,500 character message (the average message size I need to send to my web service) across all four methods:
Native VBScript (VBScript)
Quoted Printable, using CDO.Message (QP)
Quoted Printable Binary, using CDO.Message (QP Binary)
MSXML/ADODB.Stream (MSXML)
Here are the results:
Iterations : 10,000
Message Size : 1,500
+-------------+-----------+
+ Method | Time (ms) +
+-------------+-----------+
| VBScript | 301,391 |
+-------------+-----------+
| QP | 12,922 |
+-------------+-----------+
| QP (Binary) | 13,953 |
+-------------+-----------+
| MSXML | 3,312 |
+-------------+-----------+
I also monitored the memory utilization (Mem Usage for the cscript.exe process in the Windows Task Manager) while the test was running. I don't have any raw numbers, but the memory utilization for both the quoted printable and MSXML solutions were below the VBScript solution (7,000K for the former, around 16,000K for VBScript).
I decided to go with the MSXML solution for my driver. For those interested, here's the code I'm using:
base64.vbs
Function Base64Encode(sText)
Dim oXML, oNode
Set oXML = CreateObject("Msxml2.DOMDocument.3.0")
Set oNode = oXML.CreateElement("base64")
oNode.dataType = "bin.base64"
oNode.nodeTypedValue =Stream_StringToBinary(sText)
Base64Encode = oNode.text
Set oNode = Nothing
Set oXML = Nothing
End Function
Function Base64Decode(ByVal vCode)
Dim oXML, oNode
Set oXML = CreateObject("Msxml2.DOMDocument.3.0")
Set oNode = oXML.CreateElement("base64")
oNode.dataType = "bin.base64"
oNode.text = vCode
Base64Decode = Stream_BinaryToString(oNode.nodeTypedValue)
Set oNode = Nothing
Set oXML = Nothing
End Function
'Stream_StringToBinary Function
'2003 Antonin Foller, http://www.motobit.com
'Text - string parameter To convert To binary data
Function Stream_StringToBinary(Text)
Const adTypeText = 2
Const adTypeBinary = 1
'Create Stream object
Dim BinaryStream 'As New Stream
Set BinaryStream = CreateObject("ADODB.Stream")
'Specify stream type - we want To save text/string data.
BinaryStream.Type = adTypeText
'Specify charset For the source text (unicode) data.
BinaryStream.CharSet = "us-ascii"
'Open the stream And write text/string data To the object
BinaryStream.Open
BinaryStream.WriteText Text
'Change stream type To binary
BinaryStream.Position = 0
BinaryStream.Type = adTypeBinary
'Ignore first two bytes - sign of
BinaryStream.Position = 0
'Open the stream And get binary data from the object
Stream_StringToBinary = BinaryStream.Read
Set BinaryStream = Nothing
End Function
'Stream_BinaryToString Function
'2003 Antonin Foller, http://www.motobit.com
'Binary - VT_UI1 | VT_ARRAY data To convert To a string
Function Stream_BinaryToString(Binary)
Const adTypeText = 2
Const adTypeBinary = 1
'Create Stream object
Dim BinaryStream 'As New Stream
Set BinaryStream = CreateObject("ADODB.Stream")
'Specify stream type - we want To save binary data.
BinaryStream.Type = adTypeBinary
'Open the stream And write binary data To the object
BinaryStream.Open
BinaryStream.Write Binary
'Change stream type To text/string
BinaryStream.Position = 0
BinaryStream.Type = adTypeText
'Specify charset For the output text (unicode) data.
BinaryStream.CharSet = "us-ascii"
'Open the stream And get text/string data from the object
Stream_BinaryToString = BinaryStream.ReadText
Set BinaryStream = Nothing
End Function

This answer improves on Patrick Cuff's great answer in that it adds support for UTF-8 and UTF-16 LE encodings ("Unicode"). (Additionally, the code is streamlined).
Examples:
' Base64-encode: from UTF-8-encoded bytes.
Base64Encode("Motörhead", False) ' "TW90w7ZyaGVhZA=="
' Base64-encode: from UTF-16 LE-encoded bytes.
Base64Encode("Motörhead", True) ' "TQBvAHQA9gByAGgAZQBhAGQA"
' Base64-decode: back to a VBScript string via UTF-8.
Base64Decode("TW90w7ZyaGVhZA==", False) ' "Motörhead"
' Base64-decode: back to a VBScript string via UTF-16 LE.
Base64Decode("TQBvAHQA9gByAGgAZQBhAGQA", True) ' "Motörhead"
Important:
If you want to be able to represent all Unicode characters (e.g., €) as literals in your .vbs file, save it as UTF-16LE ("Unicode").
If your script is run as a console application, via cscript.exe, not all Unicode characters may render correctly in direct-to-display output (due to font limitations, but you can copy & paste them) and, more importantly, if you try to capture or redirect the output, any non-ASCII-range characters that aren't part of the console's OEM code page are effectively lost (replaced with literal ? characters).
' Base64-encodes the specified string.
' Parameter fAsUtf16LE determines how the input text is encoded at the
' byte level before Base64 encoding is applied.
' * Pass False to use UTF-8 encoding.
' * Pass True to use UTF-16 LE encoding.
Function Base64Encode(ByVal sText, ByVal fAsUtf16LE)
' Use an aux. XML document with a Base64-encoded element.
' Assigning the byte stream (array) returned by StrToBytes() to .NodeTypedValue
' automatically performs Base64-encoding, whose result can then be accessed
' as the element's text.
With CreateObject("Msxml2.DOMDocument").CreateElement("aux")
.DataType = "bin.base64"
if fAsUtf16LE then
.NodeTypedValue = StrToBytes(sText, "utf-16le", 2)
else
.NodeTypedValue = StrToBytes(sText, "utf-8", 3)
end if
Base64Encode = .Text
End With
End Function
' Decodes the specified Base64-encoded string.
' If the decoded string's original encoding was:
' * UTF-8, pass False for fIsUtf16LE.
' * UTF-16 LE, pass True for fIsUtf16LE.
Function Base64Decode(ByVal sBase64EncodedText, ByVal fIsUtf16LE)
Dim sTextEncoding
if fIsUtf16LE Then sTextEncoding = "utf-16le" Else sTextEncoding = "utf-8"
' Use an aux. XML document with a Base64-encoded element.
' Assigning the encoded text to .Text makes the decoded byte array
' available via .nodeTypedValue, which we can pass to BytesToStr()
With CreateObject("Msxml2.DOMDocument").CreateElement("aux")
.DataType = "bin.base64"
.Text = sBase64EncodedText
Base64Decode = BytesToStr(.NodeTypedValue, sTextEncoding)
End With
End Function
' Returns a binary representation (byte array) of the specified string in
' the specified text encoding, such as "utf-8" or "utf-16le".
' Pass the number of bytes that the encoding's BOM uses as iBomByteCount;
' pass 0 to include the BOM in the output.
function StrToBytes(ByVal sText, ByVal sTextEncoding, ByVal iBomByteCount)
' Create a text string with the specified encoding and then
' get its binary (byte array) representation.
With CreateObject("ADODB.Stream")
' Create a stream with the specified text encoding...
.Type = 2 ' adTypeText
.Charset = sTextEncoding
.Open
.WriteText sText
' ... and convert it to a binary stream to get a byte-array
' representation.
.Position = 0
.Type = 1 ' adTypeBinary
.Position = iBomByteCount ' skip the BOM
StrToBytes = .Read
.Close
End With
end function
' Returns a string that corresponds to the specified byte array, interpreted
' with the specified text encoding, such as "utf-8" or "utf-16le".
function BytesToStr(ByVal byteArray, ByVal sTextEncoding)
If LCase(sTextEncoding) = "utf-16le" then
' UTF-16 LE happens to be VBScript's internal encoding, so we can
' take a shortcut and use CStr() to directly convert the byte array
' to a string.
BytesToStr = CStr(byteArray)
Else ' Convert the specified text encoding to a VBScript string.
' Create a binary stream and copy the input byte array to it.
With CreateObject("ADODB.Stream")
.Type = 1 ' adTypeBinary
.Open
.Write byteArray
' Now change the type to text, set the encoding, and output the
' result as text.
.Position = 0
.Type = 2 ' adTypeText
.CharSet = sTextEncoding
BytesToStr = .ReadText
.Close
End With
End If
end function

It's possible to encode base64 in pure vbscript without ADODB.Stream and MSXml2.DOMDocument.
for example:
Function btoa(sourceStr)
Dim i, j, n, carr, rarr(), a, b, c
carr = Array("A", "B", "C", "D", "E", "F", "G", "H", _
"I", "J", "K", "L", "M", "N", "O" ,"P", _
"Q", "R", "S", "T", "U", "V", "W", "X", _
"Y", "Z", "a", "b", "c", "d", "e", "f", _
"g", "h", "i", "j", "k", "l", "m", "n", _
"o", "p", "q", "r", "s", "t", "u", "v", _
"w", "x", "y", "z", "0", "1", "2", "3", _
"4", "5", "6", "7", "8", "9", "+", "/")
n = Len(sourceStr)-1
ReDim rarr(n\3)
For i=0 To n Step 3
a = AscW(Mid(sourceStr,i+1,1))
If i < n Then
b = AscW(Mid(sourceStr,i+2,1))
Else
b = 0
End If
If i < n-1 Then
c = AscW(Mid(sourceStr,i+3,1))
Else
c = 0
End If
rarr(i\3) = carr(a\4) & carr((a And 3) * 16 + b\16) & carr((b And 15) * 4 + c\64) & carr(c And 63)
Next
i = UBound(rarr)
If n Mod 3 = 0 Then
rarr(i) = Left(rarr(i),2) & "=="
ElseIf n Mod 3 = 1 Then
rarr(i) = Left(rarr(i),3) & "="
End If
btoa = Join(rarr,"")
End Function
Function char_to_utf8(sChar)
Dim c, b1, b2, b3
c = AscW(sChar)
If c < 0 Then
c = c + &H10000
End If
If c < &H80 Then
char_to_utf8 = sChar
ElseIf c < &H800 Then
b1 = c Mod 64
b2 = (c - b1) / 64
char_to_utf8 = ChrW(&HC0 + b2) & ChrW(&H80 + b1)
ElseIf c < &H10000 Then
b1 = c Mod 64
b2 = ((c - b1) / 64) Mod 64
b3 = (c - b1 - (64 * b2)) / 4096
char_to_utf8 = ChrW(&HE0 + b3) & ChrW(&H80 + b2) & ChrW(&H80 + b1)
Else
End If
End Function
Function str_to_utf8(sSource)
Dim i, n, rarr()
n = Len(sSource)
ReDim rarr(n - 1)
For i=0 To n-1
rarr(i) = char_to_utf8(Mid(sSource,i+1,1))
Next
str_to_utf8 = Join(rarr,"")
End Function
Function str_to_base64(sSource)
str_to_base64 = btoa(str_to_utf8(sSource))
End Function
'test
msgbox btoa("Hello") 'SGVsbG8=
msgbox btoa("Hell") 'SGVsbA==
msgbox str_to_base64("中文한국어") '5Lit5paH7ZWc6rWt7Ja0
If there are wide characters (AscW(c) > 255 or < 0) in your string, you can convert it to utf-8 before call btoa.
utf-8 convertion also can be written in pure vbscript.

So I have some other full example of encoder and decoder:
Encoder:
' This script reads jpg picture named SuperPicture.jpg, converts it to base64
' code using encoding abilities of MSXml2.DOMDocument object and saves
' the resulting data to encoded.txt file
Option Explicit
Const fsDoOverwrite = true ' Overwrite file with base64 code
Const fsAsASCII = false ' Create base64 code file as ASCII file
Const adTypeBinary = 1 ' Binary file is encoded
' Variables for writing base64 code to file
Dim objFSO
Dim objFileOut
' Variables for encoding
Dim objXML
Dim objDocElem
' Variable for reading binary picture
Dim objStream
' Open data stream from picture
Set objStream = CreateObject("ADODB.Stream")
objStream.Type = adTypeBinary
objStream.Open()
objStream.LoadFromFile("SuperPicture.jpg")
' Create XML Document object and root node
' that will contain the data
Set objXML = CreateObject("MSXml2.DOMDocument")
Set objDocElem = objXML.createElement("Base64Data")
objDocElem.dataType = "bin.base64"
' Set binary value
objDocElem.nodeTypedValue = objStream.Read()
' Open data stream to base64 code file
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFileOut = objFSO.CreateTextFile("encoded.txt", fsDoOverwrite, fsAsASCII)
' Get base64 value and write to file
objFileOut.Write objDocElem.text
objFileOut.Close()
' Clean all
Set objFSO = Nothing
Set objFileOut = Nothing
Set objXML = Nothing
Set objDocElem = Nothing
Set objStream = Nothing
Decoder:
' This script reads base64 encoded picture from file named encoded.txt,
' converts it in to back to binary reprisentation using encoding abilities
' of MSXml2.DOMDocument object and saves data to SuperPicture.jpg file
Option Explicit
Const foForReading = 1 ' Open base 64 code file for reading
Const foAsASCII = 0 ' Open base 64 code file as ASCII file
Const adSaveCreateOverWrite = 2 ' Mode for ADODB.Stream
Const adTypeBinary = 1 ' Binary file is encoded
' Variables for reading base64 code from file
Dim objFSO
Dim objFileIn
Dim objStreamIn
' Variables for decoding
Dim objXML
Dim objDocElem
' Variable for write binary picture
Dim objStream
' Open data stream from base64 code filr
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFileIn = objFSO.GetFile("encoded.txt")
Set objStreamIn = objFileIn.OpenAsTextStream(foForReading, foAsASCII)
' Create XML Document object and root node
' that will contain the data
Set objXML = CreateObject("MSXml2.DOMDocument")
Set objDocElem = objXML.createElement("Base64Data")
objDocElem.DataType = "bin.base64"
' Set text value
objDocElem.text = objStreamIn.ReadAll()
' Open data stream to picture file
Set objStream = CreateObject("ADODB.Stream")
objStream.Type = adTypeBinary
objStream.Open()
' Get binary value and write to file
objStream.Write objDocElem.NodeTypedValue
objStream.SaveToFile "SuperPicture.jpg", adSaveCreateOverWrite
' Clean all
Set objFSO = Nothing
Set objFileIn = Nothing
Set objStreamIn = Nothing
Set objXML = Nothing
Set objDocElem = Nothing
Set objStream = Nothing

This is a decode example that does not use the ADODB object.
option explicit
dim inobj,outobj,infile,myname,state,rec,outfile,content,table(256),bits,c,x,outword
state = 0
const r64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
myname = wscript.scriptfullname
set inobj = createobject("Scripting.FileSystemObject")
set outobj = createobject("Scripting.FileSystemObject")
set infile = inobj.opentextfile(myname,1)
set outfile = outobj.createtextfile("q.png")
for x = 1 to 256 step 1
table(x) = -1
next
for x = 1 to 64 step 1
table(1+asc(mid(r64,x,1))) = x - 1
next
bits = 0
do until(infile.atendofstream)
dim size
rec = infile.readline
if (state = 1) then
content = mid(rec,2)
size = len(content)
for x = 1 to size step 1
c = table(1+asc(mid(content,x,1)))
if (c <> -1) then
if (bits = 0) then
outword = c*4
bits = 6
elseif (bits = 2) then
outword = c+outword
outfile.write(chr(clng("&H" & hex(outword mod 256))))
bits = 0
elseif (bits = 4) then
outword = outword + int(c/4)
outfile.write(chr(clng("&H" & hex(outword mod 256))))
outword = c*64
bits = 2
else
outword = outword + int(c/16)
outfile.write(chr(clng("&H" & hex(outword mod 256))))
outword = c*16
bits = 4
end if
end if
next
end if
if (rec = "'PAYLOAD") then
state = 1
end if
loop
infile.close
outfile.close
wscript.echo "q.png created"
wscript.quit
'PAYLOAD
'iVBORw0KGgoAAAANSUhEUgAAAD4AAAA+CAIAAAD8oz8TAAABoklEQVRo3u2awQrDMAxDl7H/
'/+Xu0EsgSDw7hRF7vWywpO0UW5acjOu6Xmde79ex1+f+GGPACfcqzePXdVvvts7iv6rx56Ou
'8FNYkgyZx9xzZ3TVHfg7VEHdR+o6ZsWV54O/yDvUQj2KzYyH5wof5f14fR97xdPrmjy1ArVQ
'55yteMYzEqma5B2qoM5VBK+OuXUrHutjJ8c59l4z/vV6Vv15PbOjiFRunB/rOcYgIz1jEPek
'nnh+rBPsiYbOaRu/DipzKrqkqNOJdgEIF3mNVLGa7jM9YSReg+t6U/UvFTYqmn13gGeUr9C1
'ul85rlCVgVTHnGeo2xGIdnT3PRR3vbUYhjAJqXxRHxTtslfsrxOe8aziWdlnAukRVPGmuX9P
'KnG0y9Wjv+71IPf8JEMIZxeP9ZHDkvO0z6XoXmlF1APTMIpR38R5qd8ZAa7gc76JaMl+ZwR4
'N0vdn6hRf89+ZwRIXZy/e473bks9sd9uterERvmbKP4end6cVlFRHt2n9mxTN9b3PTzfIco5
'4Ip9mGd1ud8bUriS3Oh6RuC318GofwHqKhl/Nn0DHQAAAABJRU5ErkJggg==

So you can use this object to Encode or Decode Base64 = CreateObject("Msxml2.DOMDocument.3.0")
And use Array to Encode or Decode It.
More info VBS_Array
Here is my way :
Function Base64Encode(sText)
Set oNode = CreateObject("Msxml2.DOMDocument.3.0").CreateElement("base64")
oNode.dataType = "bin.base64"
oNode.nodeTypedValue =Stream_StringToBinary(sText)
Base64Encode = oNode.text
Set oNode = Nothing
End Function
Function Base64Decode(ByVal vCode)
Set oNode = CreateObject("Msxml2.DOMDocument.3.0").CreateElement("base64")
oNode.dataType = "bin.base64"
oNode.text = vCode
Base64Decode = Stream_BinaryToString(oNode.nodeTypedValue)
Set oNode = Nothing
End Function
Function Stream_StringToBinary(Text)
Set BinaryStream = CreateObject("ADODB.Stream")
BinaryStream.Type = 2
' All Format => utf-16le - utf-8 - utf-16le
BinaryStream.CharSet = "us-ascii"
BinaryStream.Open
BinaryStream.WriteText Text
BinaryStream.Position = 0
BinaryStream.Type = 1
BinaryStream.Position = 0
Stream_StringToBinary = BinaryStream.Read
Set BinaryStream = Nothing
End Function
Function Stream_BinaryToString(Binary)
Set BinaryStream = CreateObject("ADODB.Stream")
BinaryStream.Type = 1
BinaryStream.Open
BinaryStream.Write Binary
BinaryStream.Position = 0
BinaryStream.Type = 2
' All Format => utf-16le - utf-8 - utf-16le
BinaryStream.CharSet = "utf-8"
Stream_BinaryToString = BinaryStream.ReadText
Set BinaryStream = Nothing
End Function
''''''''''''''''''''''''''''''''''''''''''''''Testing'''''''''''''''''''''''''''''''''''''''''
arr=array("Hello","&Welcome","To My Program")
For Each Endcode In arr
WSH.Echo Base64Encode(Endcode)
Next
arr=array("2LPZhNin2YU==","R29vZA==","QnkhIQ==")
For Each Decode In arr
WSH.Echo Base64Decode(Decode)
Next

Related

How to split word files by the number of characters

Could you anybody help me how to split word file by character!
I can't find any way to split word files by the number of characters on the internet!
For example, to split a document into 500-character blocks:
Sub SplitDocument()
Application.ScreenUpdating = False
Dim Rng As Range, i As Long, j As Long
Const Char As Long = 500
With ActiveDocument
' Process each character block
For i = 1 To Int(.Characters.Count / Char)
j = j + 1
' Get the character block
Set Rng = .Range((i - 1) * Char, i * Char)
' Copy the character block
Rng.Copy
Rng.Collapse wdCollapseEnd
Call NewDoc(ActiveDocument, (i - 1) * Char + 1, j)
Next
If Rng.End < .Range.End Then
i = i + 1: j = j + 1
Rng.End = .Range.End
' Copy the range
Rng.Copy
Rng.Collapse wdCollapseEnd
Call NewDoc(ActiveDocument, (i - 1) * Char + 1, j)
End If
End With
Set Rng = Nothing
Application.ScreenUpdating = True
End Sub
Sub NewDoc(DocSrc As Document, i As Long, j As Long)
Dim DocTgt As Document, HdFt As HeaderFooter
' Create the output document
Set DocTgt = Documents.Add(Visible:=False)
With DocTgt
' Paste contents into the output document, preserving the formatting
.Range.PasteAndFormat (wdFormatOriginalFormatting)
' Replicate the headers & footers
For Each HdFt In DocSrc.Sections(DocSrc.Characters(i).Sections(1).Index).Headers
.Sections(1).Headers(HdFt.Index).Range.FormattedText = HdFt.Range.FormattedText
Next
For Each HdFt In DocSrc.Sections(DocSrc.Characters(i).Sections(1).Index).Footers
.Sections(1).Footers(HdFt.Index).Range.FormattedText = HdFt.Range.FormattedText
Next
' Save & close the output document
.SaveAs FileName:=Split(DocSrc.FullName, ".doc")(0) & "_" & j & ".docx", _
FileFormat:=wdFormatXMLDocument, AddToRecentFiles:=False
.Close SaveChanges:=False
End With
Set DocTgt = Nothing: Set DocSrc = Nothing
End Sub

VB6 Hashing SHA1 output not matched

need help for my problem here. i do searching and googling for this problem but still don't found the solution why my output didnt matched with the expected output.
data to hash :
0800210142216688003333311100000554478000000
expected output :
DAAC526D4806C88CEDB8B7C6EA42A7442DE6E7DC
my output :
805C790E6BF39E3482067C44909EE126F9CBB878
and i am using this function to generate the hash
Public Function HashString(ByVal Str As String, Optional ByVal Algorithm As HashAlgorithm = SHA1) As String
On Error Resume Next
Dim hCtx As Long
Dim hHash As Long
Dim lRes As Long
Dim lLen As Long
Dim lIdx As Long
Dim AbData() As Byte
lRes = CryptAcquireContext(hCtx, vbNullString, vbNullString, PROV_RSA_FULL, CRYPT_VERIFYCONTEXT)
If lRes <> 0 Then
lRes = CryptCreateHash(hCtx, Algorithm, 0, 0, hHash)
If lRes <> 0 Then
lRes = CryptHashData(hHash, ByVal Str, Len(Str), 0)
If lRes <> 0 Then
lRes = CryptGetHashParam(hHash, HP_HASHSIZE, lLen, 4, 0)
If lRes <> 0 Then
ReDim AbData(0 To lLen - 1)
lRes = CryptGetHashParam(hHash, HP_HASHVAL, AbData(0), lLen, 0)
If lRes <> 0 Then
For lIdx = 0 To UBound(AbData)
HashString = HashString & Right$("0" & Hex$(AbData(lIdx)), 2)
Next
End If
End If
End If
CryptDestroyHash hHash
End If
End If
CryptReleaseContext hCtx, 0
If lRes = 0 Then
MsgBox Err.LastDllError
End If
End Function
and this is command to call the function
Dim received As String
Dim HASH As String
HASH = "0800210142216688003333311100000554478000000"
received = HashString(HASH)
Debug.Print ("HASH VALUE : " & received)
thanks
UPDATE:
finally i managed to get the expected output. i change the function to generate the sha1 using the sha1 function in this website :
http://vb.wikia.com/wiki/SHA-1.bas
and i do use this function to convert my hexstring to byte array
Public Function HexStringToByteArray(ByRef HexString As String) As Byte()
Dim bytOut() As Byte, bytHigh As Byte, bytLow As Byte, lngA As Long
If LenB(HexString) Then
' preserve memory for output buffer
ReDim bytOut(Len(HexString) \ 2 - 1)
' jump by every two characters (in this case we happen to use byte positions for greater speed)
For lngA = 1 To LenB(HexString) Step 4
' get the character value and decrease by 48
bytHigh = AscW(MidB$(HexString, lngA, 2)) - 48
bytLow = AscW(MidB$(HexString, lngA + 2, 2)) - 48
' move old A - F values down even more
If bytHigh > 9 Then bytHigh = bytHigh - 7
If bytLow > 9 Then bytLow = bytLow - 7
' I guess the C equivalent of this could be like: *bytOut[++i] = (bytHigh << 8) || bytLow
bytOut(lngA \ 4) = (bytHigh * &H10) Or bytLow
Next lngA
' return the output
HexStringToByteArray = bytOut
End If
End Function
and i using this command to get the expected output
Dim received As String
Dim HASH As String
Dim intVal As Integer
Dim temp() As Byte
HASH = "08002101422166880033333111000005544780000000"
temp = HexStringToByteArray(HASH)
received = Replace(HexDefaultSHA1(temp), " ", "")
Debug.Print ("HASH VALUE : " & received)
and finally i got the same output as expected. Yeah!!..
805c... is the SHA1 hash of the characters in your input string, i.e. '0', '8', '0', '0', ...
daac... is the SHA1 hash of the characters in your input string after conversion of each pair of hexadecimal digits to a byte, i.e. 0x08, 0x00, ...
Convert the input string to an array of bytes prior to hashing.
Your output is correct. This is SHA1 using python:
>>> import hashlib
>>> s = hashlib.sha1('0800210142216688003333311100000554478000000')
>>> s.hexdigest()
'805c790e6bf39e3482067c44909ee126f9cbb878'
Where did you get the other SHA1 computation from?

Encoding of Text Files in VB 6.0

I have huge external files with the "ANSI" and "UCS-2 Little Endian" encoding formats.
Now I want to change the file encoding format into UTF-8 using Visual Basic 6.0. I don't want to modify the original file; I just want to change the encoding format alone.
I have searched on the web; but can't understand the VB code, and have no idea how to do it.
I would also like to be able to read lines one at a time from UTF-8 encoded files.
NOTE. This answer has been extensively expanded to fit in with the edited question, which in turn was due to Visual Basic 6 and UTF-8
The following code wraps up converting ANSI, UTF-16 and UTF-32 encoded strings from a file to UTF-8 strings, in VB6. You have to load in the entire file and output it. Note that if it was truly generic, the LineInputUTF8() method would be LineInput(), and require a code page.
Option Explicit
Private Declare Function MultiByteToWideChar Lib "Kernel32.dll" ( _
ByVal CodePage As Long, _
ByVal dwFlags As Long, _
ByVal lpMultiByteStr As Long, _
ByVal cbMultiByte As Long, _
ByVal lpWideCharStr As Long, _
ByVal cchWideChar As Long _
) As Long
Private Declare Function WideCharToMultiByte Lib "Kernel32.dll" ( _
ByVal CodePage As Long, _
ByVal dwFlags As Long, _
ByVal lpWideCharStr As Long, _
ByVal cchWideChar As Long, _
ByVal lpMultiByteStr As Long, _
ByVal cbMultiByte As Long, _
ByVal lpDefaultChar As Long, _
ByVal lpUsedDefaultChar As Long _
) As Long
Public Const CP_ACP As Long = 0 ' Default ANSI code page.
Public Const CP_UTF8 As Long = 65001 ' UTF8.
Public Const CP_UTF16_LE As Long = 1200 ' UTF16 - little endian.
Public Const CP_UTF16_BE As Long = 1201 ' UTF16 - big endian.
Public Const CP_UTF32_LE As Long = 12000 ' UTF32 - little endian.
Public Const CP_UTF32_BE As Long = 12001 ' UTF32 - big endian.
' Purpose: Heuristic to determine whether bytes in a file are UTF-8.
Private Function FileBytesAreUTF8(ByVal the_iFileNo As Integer) As Boolean
Const knSampleByteSize As Long = 2048
Dim nLof As Long
Dim nByteCount As Long
Dim nByteIndex As Long
Dim nCharExtraByteCount As Long
Dim bytValue As Byte
' We look at the first <knSampleByteSize> bytes of the file. However, if the file is smaller, we will have to
' use the smaller size.
nLof = LOF(the_iFileNo)
If nLof < knSampleByteSize Then
nByteCount = nLof
Else
nByteCount = knSampleByteSize
End If
' Go to the start of the file.
Seek #the_iFileNo, 1
For nByteIndex = 1 To nByteCount
Get #the_iFileNo, , bytValue
' If the character we are processing has bytes beyond 1, then we are onto the next character.
If nCharExtraByteCount = 0 Then
'
' The UTF-8 specification says that the first byte of a character has masking bits which indicate how many bytes follow.
'
' See: http://en.wikipedia.org/wiki/UTF-8#Description
'
' Bytes in
' sequence Byte 1 Byte 2 Byte 3 Byte 4
' 1 0xxxxxxx
' 2 110xxxxx 10xxxxxx
' 3 1110xxxx 10xxxxxx 10xxxxxx
' 4 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
'
If (bytValue And &H80) = &H0 Then
nCharExtraByteCount = 0
ElseIf (bytValue And &HE0) = &HC0 Then
nCharExtraByteCount = 1
ElseIf (bytValue And &HF0) = &HE0 Then
nCharExtraByteCount = 2
ElseIf (bytValue And &HF8) = &HF0 Then
nCharExtraByteCount = 3
Else
' If none of these masks were matched, then this can't be a UTF-8 character.
FileBytesAreUTF8 = False
Exit Function
End If
Else
' All following bytes must be masked as in the table above.
If (bytValue And &HC0) = &H80 Then
nCharExtraByteCount = nCharExtraByteCount - 1
If nCharExtraByteCount = 0 Then
FileBytesAreUTF8 = True
End If
Else
' Not a UTF8 character.
FileBytesAreUTF8 = False
Exit Function
End If
End If
Next nByteIndex
End Function
' Purpose: Take a string whose bytes are in the byte array <the_abytCPString>, with code page <the_nCodePage>, convert to a VB string.
Private Function FromCPString(ByRef the_abytCPString() As Byte, ByVal the_nCodePage As Long) As String
Dim sOutput As String
Dim nValueLen As Long
Dim nOutputCharLen As Long
' If the code page says this is already compatible with the VB string, then just copy it into the string. No messing.
If the_nCodePage = CP_UTF16_LE Then
FromCPString = the_abytCPString()
Else
' Cache the input length.
nValueLen = UBound(the_abytCPString) - LBound(the_abytCPString) + 1
' See how big the output buffer will be.
nOutputCharLen = MultiByteToWideChar(the_nCodePage, 0&, VarPtr(the_abytCPString(LBound(the_abytCPString))), nValueLen, 0&, 0&)
' Resize output byte array to the size of the UTF-8 string.
sOutput = Space$(nOutputCharLen)
' Make this API call again, this time giving a pointer to the output byte array.
MultiByteToWideChar the_nCodePage, 0&, VarPtr(the_abytCPString(LBound(the_abytCPString))), nValueLen, StrPtr(sOutput), nOutputCharLen
' Return the array.
FromCPString = sOutput
End If
End Function
Public Function GetContents(ByVal the_sTextFile As String, ByRef out_nCodePage As Long, Optional ByVal the_nDesiredCodePage As Long = -1, Optional ByRef out_bContainedBOM As Boolean) As String
Dim iFileNo As Integer
Dim abytFileContents() As Byte
Dim nDataSize As Long
iFileNo = FreeFile
OpenForInput the_sTextFile, iFileNo, out_nCodePage, the_nDesiredCodePage, out_bContainedBOM
' We want to read the entire contents of the file (not including any BOM value).
' After calling OpenForInput(), the file pointer should be positioned after any BOM.
' So size file contents buffer to <file size> - <current position> + 1.
nDataSize = LOF(iFileNo) - Seek(iFileNo) + 1
ReDim abytFileContents(1 To nDataSize)
Get #iFileNo, , abytFileContents()
Close iFileNo
' Now we must convert this to UTF-8. But we have to first convert to the Windows NT standard UTF-16 LE.
GetContents = FromCPString(abytFileContents(), out_nCodePage)
End Function
' Purpose: Reads up to the end of the current line of the file, repositions to the beginning of the next line, if any, and
' outputs all characters found.
' Inputs: the_nFileNo The number of the file.
' Outputs: out_sLine The line from the current position in the file.
' Return: True if there is more data.
Public Function LineInputUTF8(ByVal the_nFileNo As Integer, ByRef out_sLine As String) As Boolean
Dim bytValue As Byte
Dim abytLine() As Byte
Dim nStartOfLinePos As Long
Dim nEndOfLinePos As Long
Dim nStartOfNextLine As Long
Dim nLineLen As Long
' Save the current file position as the beginning of the line, and cache this value.
nStartOfLinePos = Seek(the_nFileNo)
' Retrieves the first byte from the current position.
Get #the_nFileNo, , bytValue
' Loop until the end of file is encountered.
Do Until EOF(the_nFileNo)
' Check whether this byte represents a carriage return or line feed character (indicating new line).
If bytValue = 13 Or bytValue = 10 Then
' By this point, the current position is *after* the CR or LF character, so to get the position of the
' last byte in the line, we must go back two bytes.
nEndOfLinePos = Seek(the_nFileNo) - 2
' If this is a carriage return, then we must check the next character.
If bytValue = 13 Then
Get #the_nFileNo, , bytValue
' Is this a line feed?
If bytValue = 10 Then
' Yes. Assume that CR-LF counts as a single NewLine. So the start of the next line should skip over the line feed.
nStartOfNextLine = nEndOfLinePos + 3
Else
' No. The start of the next line is the current position.
nStartOfNextLine = nEndOfLinePos + 2
End If
ElseIf bytValue = 10 Then
' If this is a line feed, then the start of the next line is the current position.
nStartOfNextLine = nEndOfLinePos + 2
End If
' Since we have processed all the bytes in the line, exit the loop.
Exit Do
End If
' Get the next byte.
Get #the_nFileNo, , bytValue
Loop
' Check to see if there was an end of line.
If nEndOfLinePos = 0 Then
' No, this is the end of the file - so use all the remaining characters.
nLineLen = Seek(the_nFileNo) - nStartOfLinePos - 1
Else
' Yes - so use all the characters up to the end of line position.
nLineLen = nEndOfLinePos - nStartOfLinePos + 1
End If
' Is this line empty?
If nLineLen = 0 Then
' Yes - just return an empty string.
out_sLine = vbNullString
Else
' No - pull all the bytes from the beginning to the end of the line into a byte array, and then convert that from UTF-8 to a VB string.
ReDim abytLine(1 To nLineLen)
Get #the_nFileNo, nStartOfLinePos, abytLine()
out_sLine = FromCPString(abytLine(), CP_UTF8)
End If
' If there is a line afterwards, then move to the beginning of the line, and return True.
If nStartOfNextLine > 0 Then
Seek #the_nFileNo, nStartOfNextLine
LineInputUTF8 = True
End If
End Function
' Purpose: Analogue of 'Open "fileName" For Input As #fileNo' - but also return what type of text this is via a Code Page value.
' Inputs: the_sFileName
' the_iFileNo
' (the_nDesiredCodePage) The code page that you want to use with this file.
' If this value is set to the default, -1, this indicates that the code page will be ascertained from the file.
' Outputs: out_nCodePage There are only six valid values that are returned if <the_nDesiredCodePage> was set to -1.
' CP_ACP ANSI code page
' CP_UTF8 UTF-8
' CP_UTF16LE UTF-16 Little Endian (VB and NT default string encoding)
' CP_UTF16BE UTF-16 Big Endian
' CP_UTF32LE UTF-32 Little Endian
' CP_UTF32BE UTF-32 Big Endian
' (out_bContainedBOM) If this was set to True, then the file started with a BOM (Byte Order Marker).
Public Sub OpenForInput(ByRef the_sFilename As String, ByVal the_iFileNo As Integer, ByRef out_nCodePage As Long, Optional ByVal the_nDesiredCodePage As Long = -1, Optional ByRef out_bContainedBOM As Boolean)
' Note if we want to take account of every case, we should read in the first four bytes, and check for UTF-32 low and high endian BOMs, check
' the first three bytes for the UTF-8 BOM, and finally check the first two bytes for UTF-16 low and hight endian BOMs.
Dim abytBOM(1 To 4) As Byte
Dim nCodePage As Long
' By default, there is no BOM.
out_bContainedBOM = False
Open the_sFilename For Binary Access Read As #the_iFileNo
' We are interested in -1 (ascertain code page), and then various UTF encodings.
Select Case the_nDesiredCodePage
Case -1, CP_UTF8, CP_UTF16_BE, CP_UTF16_LE, CP_UTF32_BE, CP_UTF32_LE
' Default code page.
nCodePage = CP_ACP
' Pull in the first four bytes to determine the BOM (byte order marker).
Get #the_iFileNo, , abytBOM()
' The following are the BOMs for text files:
'
' FF FE UTF-16, little endian
' FE FF UTF-16, big endian
' EF BB BF UTF-8
' FF FE 00 00 UTF-32, little endian
' 00 00 FE FF UTF-32, big-endian
'
' Work out the code page from this information.
Select Case abytBOM(1)
Case &HFF
If abytBOM(2) = &HFE Then
If abytBOM(3) = 0 And abytBOM(4) = 0 Then
nCodePage = CP_UTF32_LE
Else
nCodePage = CP_UTF16_LE
End If
End If
Case &HFE
If abytBOM(2) = &HFF Then
nCodePage = CP_UTF16_BE
End If
Case &HEF
If abytBOM(2) = &HBB And abytBOM(3) = &HBF Then
nCodePage = CP_UTF8
End If
Case &H0
If abytBOM(2) = &H0 And abytBOM(3) = &HFE And abytBOM(4) = &HFF Then
nCodePage = CP_UTF32_BE
End If
End Select
' Did we match any BOMs?
If nCodePage = CP_ACP Then
' No - we are still defaulting to the ANSI code page.
' Special check for UTF-8. The BOM is not specified in the standard for UTF-8, but according to Wikipedia (which is always right :-) ),
' only Microsoft includes this marker at the beginning of files.
If FileBytesAreUTF8(the_iFileNo) Then
out_nCodePage = CP_UTF8
Else
out_nCodePage = CP_ACP
End If
Else
' Yes - we have worked out the code page from the BOM.
' If no code page was suggested, we now return the code page we found.
If the_nDesiredCodePage = -1 Then
out_nCodePage = nCodePage
End If
' Inform the caller that a BOM was found.
out_bContainedBOM = True
End If
' Reset the file pointer to the beginning of the file data.
If out_bContainedBOM Then
' Note that if the code page found was one of the two UTF-32 values, then we are already in the correct position.
' Otherwise, we have to move to just after the end of the BOM.
Select Case nCodePage
Case CP_UTF16_BE, CP_UTF16_LE
Seek #the_iFileNo, 3
Case CP_UTF8
Seek #the_iFileNo, 4
End Select
Else
' There is no BOM, so simply go the beginning of the file.
Seek #the_iFileNo, 1
End If
Case Else
out_nCodePage = the_nDesiredCodePage
End Select
End Sub
' Purpose: Analogue of 'Open "fileName" For Append As #fileNo'
Public Sub OpenForAppend(ByRef the_sFilename As String, ByVal the_iFileNo As Integer, Optional ByVal the_nCodePage As Long = CP_ACP, Optional ByVal the_bPrefixWithBOM As Boolean = True)
' Open the file and move to the end of the file.
Open the_sFilename For Binary Access Write As #the_iFileNo
Seek the_iFileNo, LOF(the_iFileNo) + 1
If the_bPrefixWithBOM Then
WriteBOM the_iFileNo, the_nCodePage
End If
End Sub
' Purpose: Analogue of 'Open "fileName" For Output As #fileNo'
Public Sub OpenForOutput(ByRef the_sFilename As String, ByVal the_iFileNo As Integer, Optional ByVal the_nCodePage As Long = CP_ACP, Optional ByVal the_bPrefixWithBOM As Boolean = True)
' Ensure we overwrite the file by deleting it ...
On Error Resume Next
Kill the_sFilename
On Error GoTo 0
' ... before creating it.
Open the_sFilename For Binary Access Write As #the_iFileNo
If the_bPrefixWithBOM Then
WriteBOM the_iFileNo, the_nCodePage
End If
End Sub
' Purpose: Analogue of the 'Print #fileNo, value' statement. But only one value allowed.
' Setting <the_bAppendNewLine> = False is analagous to 'Print #fileNo, value;'.
Public Sub Print_(ByVal the_iFileNo As Integer, ByRef the_sValue As String, Optional ByVal the_nCodePage As Long = CP_ACP, Optional ByVal the_bAppendNewLine As Boolean = True)
Const kbytNull As Byte = 0
Const kbytCarriageReturn As Byte = 13
Const kbytNewLine As Byte = 10
Put #the_iFileNo, , ToCPString(the_sValue, the_nCodePage)
If the_bAppendNewLine Then
Select Case the_nCodePage
Case CP_UTF16_BE
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytCarriageReturn
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytNewLine
Case CP_UTF16_LE
Put #the_iFileNo, , kbytCarriageReturn
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytNewLine
Put #the_iFileNo, , kbytNull
Case CP_UTF32_BE
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytCarriageReturn
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytNewLine
Case CP_UTF32_LE
Put #the_iFileNo, , kbytCarriageReturn
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytNewLine
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytNull
Put #the_iFileNo, , kbytNull
Case Else
Put #the_iFileNo, , kbytCarriageReturn
Put #the_iFileNo, , kbytNewLine
End Select
End If
End Sub
Public Sub PutContents(ByRef the_sFilename As String, ByRef the_sFileContents As String, Optional ByVal the_nCodePage As Long = CP_ACP, Optional the_bPrefixWithBOM As Boolean)
Dim iFileNo As Integer
iFileNo = FreeFile
OpenForOutput the_sFilename, iFileNo, the_nCodePage, the_bPrefixWithBOM
Print_ iFileNo, the_sFileContents, the_nCodePage, False
Close iFileNo
End Sub
' Purpose: Converts a VB string (UTF-16) to UTF8 - as a binary array.
Private Function ToCPString(ByRef the_sValue As String, ByVal the_nCodePage As Long) As Byte()
Dim abytOutput() As Byte
Dim nValueLen As Long
Dim nOutputByteLen As Long
If the_nCodePage = CP_UTF16_LE Then
ToCPString = the_sValue
Else
' Cache the input length.
nValueLen = Len(the_sValue)
' See how big the output buffer will be.
nOutputByteLen = WideCharToMultiByte(the_nCodePage, 0&, StrPtr(the_sValue), nValueLen, 0&, 0&, 0&, 0&)
If nOutputByteLen > 0 Then
' Resize output byte array to the size of the UTF-8 string.
ReDim abytOutput(1 To nOutputByteLen)
' Make this API call again, this time giving a pointer to the output byte array.
WideCharToMultiByte the_nCodePage, 0&, StrPtr(the_sValue), nValueLen, VarPtr(abytOutput(1)), nOutputByteLen, 0&, 0&
End If
' Return the array.
ToCPString = abytOutput()
End If
End Function
Private Sub WriteBOM(ByVal the_iFileNo As Integer, ByVal the_nCodePage As Long)
' FF FE UTF-16, little endian
' FE FF UTF-16, big endian
' EF BB BF UTF-8
' FF FE 00 00 UTF-32, little endian
' 00 00 FE FF UTF-32, big-endian
Select Case the_nCodePage
Case CP_UTF8
Put #the_iFileNo, , CByte(&HEF)
Put #the_iFileNo, , CByte(&HBB)
Put #the_iFileNo, , CByte(&HBF)
Case CP_UTF16_LE
Put #the_iFileNo, , CByte(&HFF)
Put #the_iFileNo, , CByte(&HFE)
Case CP_UTF16_BE
Put #the_iFileNo, , CByte(&HFE)
Put #the_iFileNo, , CByte(&HFF)
Case CP_UTF32_LE
Put #the_iFileNo, , CByte(&HFF)
Put #the_iFileNo, , CByte(&HFE)
Put #the_iFileNo, , CByte(&H0)
Put #the_iFileNo, , CByte(&H0)
Case CP_UTF32_BE
Put #the_iFileNo, , CByte(&H0)
Put #the_iFileNo, , CByte(&H0)
Put #the_iFileNo, , CByte(&HFE)
Put #the_iFileNo, , CByte(&HFF)
End Select
End Sub
The following code was added to a Form which had a VSFlexGrid control with Lucida Console font - purely to provide a way to display as many characters as possible:
Option Explicit
Private Sub Command_Click()
Example_ConvertFileToUTF8
End Sub
Private Sub Command2_Click()
Example_IterateUTF8Lines
End Sub
Private Sub Command3_Click()
Example_ReadWriteUTF8Lines
End Sub
Private Sub Form_Load()
VSFlexGrid.ColWidth(0) = 7000!
End Sub
' Purpose: Converts *any* pure text file (UTF16, ASCII, ANSI) to UTF8.
Private Sub Example_ConvertFileToUTF8()
Dim nCodePage As Long
Dim bContainedBOM As Boolean
Dim sFileContents As String
' Read in contents.
sFileContents = TextFile.GetContents("C:\MysteryEncoding.txt", nCodePage, , bContainedBOM)
' And then convert to UTF8.
TextFile.PutContents "C:\output.txt", sFileContents, CP_UTF8, bContainedBOM
End Sub
' Purpose: Iterates through each line of a UTF-8 text file, and adds it to a control which can display VB strings containing non-ANSI characters.
' In this case, I am adding items to a FlexGrid with Font = "Lucida Console".
Private Sub Example_IterateUTF8Lines()
Dim iFileNo As Integer
Dim lCodePage As Long
Dim sLine As String
iFileNo = FreeFile
TextFile.OpenForInput "C:\UTF8.txt", iFileNo, lCodePage
If lCodePage = CP_UTF8 Then
Do While TextFile.LineInputUTF8(iFileNo, sLine)
VSFlexGrid.AddItem sLine
Loop
VSFlexGrid.AddItem sLine
Else
MsgBox "This is not a UTF8 file."
End If
Close #iFileNo
End Sub
Private Sub Example_ReadWriteUTF8Lines()
Dim iFileNoInput As Integer
Dim iFileNoOutput As Integer
Dim lCodePage As Long
Dim bBOM As Boolean
Dim sLine As String
iFileNoInput = FreeFile
TextFile.OpenForInput "C:\UTF8.txt", iFileNoInput, lCodePage, , bBOM
If lCodePage = CP_UTF8 Then
iFileNoOutput = FreeFile
TextFile.OpenForOutput "C:\output.txt", iFileNoOutput, lCodePage, bBOM
Do While TextFile.LineInputUTF8(iFileNoInput, sLine)
TextFile.Print_ iFileNoOutput, sLine, lCodePage
Loop
TextFile.Print_ iFileNoOutput, sLine, lCodePage, False
Close #iFileNoOutput
Else
MsgBox "This is not a UTF8 file."
End If
Close #iFileNoInput
End Sub
I have to some vb6 code to change file encoding ANSI to Encoding UTF-8.
'--- start function for convert to UTF-8
Private Function OpenAppendUTF8(ByVal FileName As String) As Integer
OpenAppendUTF8 = FreeFile(0)
Open FileName For Binary Access Write As #OpenAppendUTF8
Seek #OpenAppendUTF8, LOF(OpenAppendUTF8) + 1
End Function
Sub DeleteFile(ByVal FileToDelete As String)
If Dir$(FileToDelete) = "" Then 'See above
Else
SetAttr FileToDelete, vbNormal
Kill FileToDelete
End If
End Sub
'-
Private Sub WriteUTF8( _
ByVal FNum As Integer, _
ByVal Text As String, _
Optional ByVal NL As Boolean)
Dim lngResult As Long
Dim UTF8() As Byte
If NL Then Text = Text & vbNewLine
lngResult = WideCharToMultiByte(CP_UTF8, 0, StrPtr(Text), Len(Text), _
0, 0, 0, 0)
If lngResult > 0 Then
ReDim UTF8(lngResult - 1)
WideCharToMultiByte CP_UTF8, 0, StrPtr(Text), Len(Text), _
VarPtr(UTF8(0)), lngResult, 0, 0
Put #FNum, , UTF8
End If
End Sub
'------- end function for convert to UTF-8
'>>> coding
Dim ReadString As String
Dim writeString As String
Dim txtNotepad As String
Dim FileNumber As Integer
Dim UTF8 As String
Dim strsql As String
Dim iloop As Integer
FileNumber = FreeFile(1)
'--- if you need to delete file first.
DeleteFile App.Path & "\PRB\fieldRules.txt"
'---
FileNumber = OpenAppendUTF8(App.Path & "\PRB\fieldRules.txt")
iloop = 1
Do While iloop > 0
iloop = InStr(1, txtNotepad, "\n")
' "\n" for end of line
If iloop <= 5 Then
iloop = 0
Else
'writeString = Mid(txtNotepad, 1, iloop - 1)
'Print #FileNumber, writeString
WriteUTF8 FileNumber, Mid(txtNotepad, 1, iloop - 1), True
txtNotepad = Mid(txtNotepad, iloop + 3, 9999)
End If
Loop
Close #FileNumber
<<<< end code

Can images be read from an iPhone programmatically using CreateFile in Windows?

When an iPhone is connected to a Win7 computer, the images can be viewed using Explorer (and the open file dialog of my app). However, the file location does not contain a drive letter.
For example Computer\Apple iPhone\Internal Storage\DCIM\800AAAAA\IMG_0008.JPG instead of E:\DCIM\800AAAAA\IMG_0008.JPG which is common of sdcards, usb drives, etc...
I've tried using CreateFileW to read images from an iPhone but it fails with '(Error Code: 3) The system cannot find the path specified.' I've also tried accessing them with Chrome and it fails too.
Any suggestions?
The folder is actually what is referred to as a 'Virtual Folder' and does not have a full path on the file system. You will need to use the shell item returned from the open dialog to get the content of the file rather than using CreateFile.
The data should be accessible, but you should follow the instructions from the MSDN documentation. I'm sure there are probably better examples (as this only gives guidelines).
edit the rough process is to get the IShellItem from IFileOpenDialog, then to bind to the stream and then read the stream (assuming reading only) - bear in mind that this code is pretty much without error handling or checking or safety:
if (pitem->GetDisplayName(SIGDN_NORMALDISPLAY, &destName) == S_OK) {
std::cout << destName << std::endl;
CoTaskMemFree(destName);
}
IStream *pistream;
if (pitem->BindToHandler(0, BHID_Stream, IID_PPV_ARGS(&pistream)) == S_OK) {
char input[1024];
long to_read = 1024;
unsigned long read;
while (S_OK == pistream->Read(input, to_read, &read)) {
std::cout << input << std::endl;
}
pistream->Release();
}
pitem->Release();
Most often such a device is inserted in the Windows Explorer as a Shell Namespace Extension and not like an USB stick with drive letter. Most of the normal file commands like CopyFile(..), FindFirst() or GetFileInfo(..) can not be used directly in such a Shell Namespace extension. Only the CopyHere(..) is working.
I needed long time to figure out how to enumerate the files on a digicam and now also on an Android device with an vb.net program and to copy my pictures to my Windows PC:
Public Const MyComputer As Integer = &H11&
Sub EnumMyComputer()
Dim oItem As Object
Dim res As Integer
For Each oItem In DirectCast(CreateObject("Shell.Application").Namespace(MyComputer).Items, System.Collections.IEnumerable)
Debug.Print(oItem.Type.ToString)
if oItem.Type.ToString="Tragbares Medienwiedergabegerät" then '<- check, adopt!
res = EnumNamespaceItems(oItem, "", oItem.Name.ToString, 0)
End If
Next oItem
End Sub
Function EnumNamespaceItems(oItem As Object, SrcCPath As String, SrcDPath As String, folderLevel As Integer) As Integer
Dim y As Object
Dim tempFullFileName As String
Debug.Print(StrDup(folderLevel, " ") & "\" & oItem.Name.ToString & " (" & oItem.Path.ToString & ")")
For Each y In DirectCast(oItem.GetFolder.items, System.Collections.IEnumerable)
'Debug.Print(StrDup(folderLevel, " ") & SrcDPath & y.Name.ToString)
If y.IsFolder = True Then
Dim n1 As Integer
n1 = EnumNamespaceItems(y, SrcCPath & y.Path.ToString & "\", SrcDPath & y.Name.ToString & "\", 1 + folderLevel)
If n1 < 0 Then 'failure: Cancel
EnumNamespaceItems = n1
Exit Function
End If
Else 'it's a file:
Debug.Print(StrDup(folderLevel, " ") & " " & y.Name.ToString)
tempFullFileName = System.IO.Path.GetTempPath() & y.Name.ToString
' CopyFile is not possible here if SrcCPath is like "::{…}…":
' My.Computer.FileSystem.CopyFile(SrcCPath & y.Name.ToString , fFile.FullName)
Dim suc As Integer = CopyHereFileWait(y, My.Computer.FileSystem.SpecialDirectories.Temp)
If suc >= 0 Then 'now we can do things like this:
Dim MyFileInfo As System.IO.FileInfo = My.Computer.FileSystem.GetFileInfo(tempFullFileName)
Dim fileDate As Date = MyFileInfo.LastWriteTime
End If 'suc
End If 'else y.IsFolder
Next y
EnumNamespaceItems = 0
End Function
Function CopyHereFileWait(sourceNamespaceObject As Object, targetFolder As String) As Integer
Dim fsMyStream As System.IO.FileStream
Dim n1 As Integer
Dim taregetFullFileName As String
n1 = Len(targetFolder)
If Mid(targetFolder, n1, 1) = "\" Then
targetFolder = Microsoft.VisualBasic.Left(targetFolder, n1 - 1)
End If
taregetFullFileName = targetFolder & "\" & sourceNamespaceObject.Name.ToString
Dim oNsTargetFolder As Object
oNsTargetFolder = CreateObject("Shell.Application").Namespace(CStr(targetFolder))
oNsTargetFolder.copyHere(sourceNamespaceObject)
'returns immediately and is doing the work in the background
n1 = 0
Do
Threading.Thread.Sleep(50) 'ms
Try
fsMyStream = System.IO.File.Open(taregetFullFileName, IO.FileMode.Open, IO.FileAccess.ReadWrite)
fsMyStream.Close()
CopyHereFileWait = n1
Exit Function
Catch ex As Exception
Debug.Print(ex.Message)
End Try
n1 = n1 + 1
Loop While n1 < 400 'timeout 400*50ms = 20s
CopyHereFileWait = -n1
End Function
You may add to check for folders with y.Name.ToString="DCIM" (on folderLevel=1) and for files with ".jpg".

Renaming a Word document and saving its filename with its first 10 letters

I have recovered some Word documents from a corrupted hard drive using a piece of software called photorec. The problem is that the documents' names can't be recovered; they are all renamed by a sequence of numbers. There are over 2000 documents to sort through and I was wondering if I could rename them using some automated process.
Is there a script I could use to find the first 10 letters in the document and rename it with that? It would have to be able to cope with multiple documents having the same first 10 letters and so not write over documents with the same name. Also, it would have to avoid renaming the document with illegal characters (such as '?', '*', '/', etc.)
I only have a little bit of experience with Python, C, and even less with bash programming in Linux, so bear with me if I don't know exactly what I'm doing if I have to write a new script.
How about VBScript? Here is a sketch:
FolderName = "C:\Docs\"
Set fs = CreateObject("Scripting.FileSystemObject")
Set fldr = fs.GetFolder(Foldername)
Set ws = CreateObject("Word.Application")
For Each f In fldr.Files
If Left(f.name,2)<>"~$" Then
If InStr(f.Type, "Microsoft Word") Then
MsgBox f.Name
Set doc = ws.Documents.Open(Foldername & f.Name)
s = vbNullString
i = 1
Do While Trim(s) = vbNullString And i <= doc.Paragraphs.Count
s = doc.Paragraphs(i)
s = CleanString(Left(s, 10))
i = i + 1
Loop
doc.Close False
If s = "" Then s = "NoParas"
s1 = s
i = 1
Do While fs.FileExists(s1)
s1 = s & i
i = i + 1
Loop
MsgBox "Name " & Foldername & f.Name & " As " & Foldername & s1 _
& Right(f.Name, InStrRev(f.Name, "."))
'' This uses copy, because it seems safer
f.Copy Foldername & s1 & Right(f.Name, InStrRev(f.Name, ".")), False
'' MoveFile will copy the file:
'' fs.MoveFile Foldername & f.Name, Foldername & s1 _
'' & Right(f.Name, InStrRev(f.Name, "."))
End If
End If
Next
msgbox "Done"
ws.Quit
Set ws = Nothing
Set fs = Nothing
Function CleanString(StringToClean)
''http://msdn.microsoft.com/en-us/library/ms974570.aspx
Dim objRegEx
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.IgnoreCase = True
objRegEx.Global = True
''Find anything not a-z, 0-9
objRegEx.Pattern = "[^a-z0-9]"
CleanString = objRegEx.Replace(StringToClean, "")
End Function
Word documents are stored in a custom format which places a load of binary cruft on the beginning of the file.
The simplest thing would be to knock something up in Python that searched for the first line beginning with ASCII chars. Here you go:
#!/usr/bin/python
import glob
import os
for file in glob.glob("*.doc"):
f = open(file, "rb")
new_name = ""
chars = 0
char = f.read(1)
while char != "":
if 0 < ord(char) < 128:
if ord("a") <= ord(char) <= ord("z") or ord("A") <= ord(char) <= ord("Z") or ord("0") <= ord(char) <= ord("9"):
new_name += char
else:
new_name += "_"
chars += 1
if chars == 100:
new_name = new_name[:20] + ".doc"
print "renaming " + file + " to " + new_name
f.close()
break;
else:
new_name = ""
chars = 0
char = f.read(1)
if new_name != "":
os.rename(file, new_name)
NOTE: if you want to glob multiple directories you'll need to change the glob line accordingly. Also this takes no account of whether the file you're trying to rename to already exists, so if you have multiple docs with the same first few chars then you'll need to handle that.
I found the first chunk of 100 ASCII chars in a row (if you look for less than that you end up picking up doc keywords and such) and then used the first 20 of these to make the new name, replacing anything that's not a-z A-Z or 0-9 with underscores to avoid file name issues.