I am finding that VBScript's SendKeys does not support Unicode. It handles some characters, such as A (65), but not foreign letters like Aleph (א) from the Hebrew alphabet, which is probably outside its supported range. It may be that it only supports the ASCII range and gives a "?" for decimal values of 128 and above.
I can type and see Hebrew letters on my computer using Windows XP. So the OS support for the characters is there and set up. My source code demonstrates that, since the line
msgbox Chrw(1488)
displays the Aleph character and I've displayed it in Notepad and MS Word.
It looks to me like it is sending a question mark for a character it doesn't recognize. I think MS Word or Notepad if they did have a problem displaying a character (e.g. when the font doesn't support a char), they would display a box rather than a question mark. Certainly in the case of Notepad anyway. So it looks like a SendKeys issue. Any ideas? Any kind of workaround?
Dim objShell
Set objShell = CreateObject("WScript.Shell")
objShell.Run "notepad" ''#can change to winword
Wscript.Sleep 2000
msgbox Chrw(1488) ''#aleph
objShell.SendKeys ("abc" & ChrW(1488) & "abc") ''#bang, it displays a ? instead of an aleph
WScript.Quit
You're most likely right in your guess that VBscript's SendKeys doesn't support Unicode.
Monitoring the Windows API function calls performed by SendKeys (using Blade API Monitor on Russian Windows XP with English US, Russian and Hebrew keyboards) shows that SendKeys isn't Unicode aware. Specifically, SendKeys does the following:
Calls the ANSI (not Unicode) version of the VkKeyScan function — VkKeyScanA — to get the virtual key code of the character to be sent. This function translates the character into VK_SHIFT + VK_OEM_2, so it seems that somewhere before or in the process the Aleph character is converted into a different, ANSI character.
Calls the SendInput function to send the VK_SHIFT + VK_OEM_2 keystrokes instead of the Aleph character.
The main problem here is that to send a Unicode character, SendInput must be called with the KEYEVENTF_UNICODE flag and the character in question must be passed via the function parameters — the experiment shows that none of this is the case. Also, VkKeyScan isn't actually needed in case of a Unicode character, as SendInput itself handles Unicode input.
Given this, the only way to send Unicode input to an application from VBScript is to write a custom utility or COM component that will utilize SendInput properly and to call this utility/component from your script. (VBScript doesn't have any native means to access Windows API.)
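As a rough sketch of what such a helper utility could look like, here is a Python version using ctypes (Python is used purely for illustration; a small C or COM component would follow the same steps). The function names build_unicode_inputs and send_unicode are made up for this example. It fills one INPUT structure per UTF-16 code unit with the KEYEVENTF_UNICODE flag, wVk set to 0 and the character in wScan, then hands the array to SendInput. I have not tested it against the setup in the question.

```python
import ctypes
import struct

# Constants from the Win32 headers (winuser.h).
INPUT_KEYBOARD = 1
KEYEVENTF_KEYUP = 0x0002
KEYEVENTF_UNICODE = 0x0004


class KEYBDINPUT(ctypes.Structure):
    _fields_ = [("wVk", ctypes.c_uint16),
                ("wScan", ctypes.c_uint16),
                ("dwFlags", ctypes.c_uint32),
                ("time", ctypes.c_uint32),
                ("dwExtraInfo", ctypes.c_size_t)]


class MOUSEINPUT(ctypes.Structure):
    # Not used here, but it is the largest union member, so it must be
    # present for sizeof(INPUT) to match what SendInput expects.
    _fields_ = [("dx", ctypes.c_int32),
                ("dy", ctypes.c_int32),
                ("mouseData", ctypes.c_uint32),
                ("dwFlags", ctypes.c_uint32),
                ("time", ctypes.c_uint32),
                ("dwExtraInfo", ctypes.c_size_t)]


class INPUTUNION(ctypes.Union):
    _fields_ = [("ki", KEYBDINPUT), ("mi", MOUSEINPUT)]


class INPUT(ctypes.Structure):
    _fields_ = [("type", ctypes.c_uint32), ("u", INPUTUNION)]


def utf16_units(text):
    """Split text into 16-bit UTF-16 code units (surrogate pairs become two)."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def build_unicode_inputs(text):
    """Build a key-down and a key-up INPUT event for every UTF-16 code unit."""
    events = []
    for unit in utf16_units(text):
        for flags in (KEYEVENTF_UNICODE, KEYEVENTF_UNICODE | KEYEVENTF_KEYUP):
            event = INPUT()
            event.type = INPUT_KEYBOARD
            # wVk must be 0 when KEYEVENTF_UNICODE is set; the character goes in wScan.
            event.u.ki = KEYBDINPUT(0, unit, flags, 0, 0)
            events.append(event)
    return events


def send_unicode(text):
    """Send text to the focused window. Windows only: ctypes.windll does not exist elsewhere."""
    events = build_unicode_inputs(text)
    array = (INPUT * len(events))(*events)
    return ctypes.windll.user32.SendInput(len(events), array, ctypes.sizeof(INPUT))


# Example (Windows only, with the target window focused):
# send_unicode("abc\u05d0abc")
```

A script could then call a helper like this instead of objShell.SendKeys for any text that contains non-ASCII characters.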
Note added by barlop: While VBScript's obj.SendKeys(..) isn't Unicode-aware, VB's SendKeys.Send(..) would be.
I'm using Dragon NaturallySpeaking with Hebrew, and SendKeys indeed cannot send Hebrew characters even though they show correctly in the macro editor. What I did instead was set the clipboard to the Hebrew text I wanted and then use SendKeys to send Ctrl-V for paste. That does work; it's just SendKeys that messes with the encoding.
Clipboard("טקסט בעברית")
SendKeys("^V")
That overwrites the user's clipboard and won't work for mixing command characters (Ctrl, Alt, Shift) with Hebrew characters, but it's a workaround.
You can get around this by using:
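wshShell.SendKeys "05D0+({LEFT}{LEFT}{LEFT}{LEFT})%(x)"
This types the character's hexadecimal code point (05D0 for Aleph, which is 1488 in decimal), selects those four characters with Shift+Left, and then presses Alt+X, the keyboard shortcut that converts a hex code into its Unicode character. The shortcut is an application feature (Microsoft Word supports it), so it will not work in every target window.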
Try setting the font to Microsoft Sans Serif in Notepad.
I am facing a problem in my VB 6.0 application: Unicode characters are not supported. I need to store Chinese characters in a field of a recordset in my application (the size of each field is set by the program itself). When we put Chinese characters into the recordset field, we get a "Multiple-step operation" error, because the field is not large enough to hold them. The error does not occur if we set the system locale to Chinese in the server's regional settings (Control Panel > Region and Language > Administrative tab > Change system locale... > Chinese).
However, if we change this setting, the time settings of our application change as well. How can we solve this problem without changing the Control Panel setting?
please help.
Thanks in advance.
In Windows, you can set your regional settings to Chinese, while keeping the time and date format. http://www.techpavan.com/2009/04/07/change-time-format-windows/
For using Unicode in Visual Basic 6 applications, here is an article with thorough explanations and examples: http://www.example-code.com/vb/vbUnicode1.asp
Quoting this link:
Internally, VB6 stores strings as Unicode. Your VB6 program is capable of manipulating strings in any language containing any character -- whether it's Chinese, Japanese, Icelandic, Arabic, etc. It's fully Unicode capable. A single string may contain characters in multiple languages. You can save these strings to databases, files, etc., and there shouldn't be a problem. Problems arise only when trying to display (i.e. render the glyphs) for foreign characters in the standard VB6 controls.
When displaying a string, the standard VB6 textbox and label controls do an implicit (and internal) conversion from Unicode to ANSI. This is the confounding behavior that causes all the trouble. Internal to VB6, the runtime is converting Unicode to the current Windows ANSI code page identifier for the operating system. There is no way to change this conversion short of changing the ANSI code page for the system.
The standard VB6 textbox and label controls display the ANSI bytes according to a character encoding that you can specify. After the Unicode-to-ANSI conversion, VB6 then attempts to display the character data according to the control's Font.Charset property, which if left unchanged is equal to the ANSI charset. Changing the control's Font.Charset changes the way VB6 interprets the "ANSI" bytes. In other words, you're telling VB6 to treat the bytes as some other character encoding instead of "ANSI". Note: VB6 is capable of displaying characters in all the major languages. It simply needs to be told to do so, and the correct bytes need to be in place internally for it to happen.
Try setting the font on those controls to Lucida Sans Unicode to add Unicode support.
Our MFC application uses Multi Byte Character Set (MBCS). OS is Windows 7.
We can type Simplified Chinese characters using the virtual keyboard, but copying and pasting Chinese characters from Google Translate into an edit box in the application shows junk characters: "????"
Is this a known issue with MBCS applications? Is there a workaround?
When copying and pasting into a multi-byte app the Unicode characters will be converted into the local code page. If they can't be converted you'll get ?. You really should be compiling and distributing your app in Unicode otherwise you'll be fighting these sorts of issues all the time.
If you can't re-compile in Unicode try catching the 'Paste' action and handle the clipboard yourself. Use GetClipboardData and read the value for CF_UNICODETEXT, which will be the valid text. You'll then need to do your own conversion to the correct multi-byte format.
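To illustrate the conversion described above, here is a minimal Python sketch (Python is used only for illustration, and the code pages are assumptions: cp1252 stands in for a Western system code page and gbk for a Simplified Chinese one). It shows why characters that cannot be converted come out as ?, and why the same text survives when the target code page contains them.

```python
def to_code_page(text, code_page):
    """Round-trip text through a legacy code page, replacing unmappable characters with '?'."""
    return text.encode(code_page, errors="replace").decode(code_page)


pasted = "你好 abc"

# A Western code page has no Chinese characters, so each one becomes '?'.
western = to_code_page(pasted, "cp1252")

# A Simplified Chinese code page can represent the text, so nothing is lost.
chinese = to_code_page(pasted, "gbk")

print(western)
print(chinese)
```

In the MFC case, the equivalent of the second call is your own conversion from the CF_UNICODETEXT data to the multi-byte encoding the application actually expects.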
I have a problem whenever I copy and paste any of the characters mentioned below into a text box.
These are the copied characters (test this in Notepad):
…
”
‘
These are the typed characters:
...
"
'
The pasted characters are converted to junk characters. How can I block this?
When I type these characters on the keyboard it works, but when I copy and paste them they are converted to junk.
How can I detect and delete all of these characters before processing? Users don't know about this issue.
I want to delete those characters when the user presses the Submit button.
” and ‘ are not junk characters. They are perfectly good Unicode characters (U+201D RIGHT DOUBLE QUOTATION MARK and U+2018 LEFT SINGLE QUOTATION MARK). Modern applications should be capable of dealing with all Unicode characters; if you can't handle the smart quotes you probably also can't handle accents, Greek, Cyrillic, Chinese or any of the other characters users are likely to want to use. You should concentrate on ensuring that your application supports Unicode, rather than trying to fix this one visible symptom.
Pasting ' and " (ASCII straight quote) characters into a text box should not turn them into non-ASCII ‘smart’ quotes. Where they typically tend to come from is Microsoft Word's misguided ‘AutoReplace’ feature, which replaces straight quotes with smart quotes as you type. This is an annoyance, but ultimately it's limited to Office and there's not really much you can do about it. Whilst you can manually replace “ and ” with " by doing a trivial string replacement (and how you do that depends on what language/environment you are talking about), you'll also be removing correct usage of those characters, and you won't be fixing all the other sad broken auto-replacements that MS Office does.
The … single-character ellipsis is a slightly different case, and arguably ‘junk’: to Unicode, U+2026 HORIZONTAL ELLIPSIS is a ‘compatibility character’ which is only intended to round-trip nicely to existing encodings that include it as a separate character. Normally three dot characters should be used instead. You can replace compatibility characters by using Unicode normalisation, in particular Normal Form KC. Again, how you access normalisation is something that depends on your programming language/environment. For example in Python, unicodedata.normalize('NFKC', u'…') gives you u'...'.
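A slightly fuller Python sketch of the normalisation step, in case it helps (the function name fold_compatibility_chars is made up for this example). Note that NFKC leaves the smart quotes alone, since they are not compatibility characters, and that it also folds other compatibility characters such as the ﬁ ligature, which may or may not be what you want.

```python
import unicodedata


def fold_compatibility_chars(text):
    """Apply Unicode Normal Form KC, which expands compatibility characters."""
    return unicodedata.normalize("NFKC", text)


sample = "Wait\u2026 \u201cquoted\u201d"
folded = fold_compatibility_chars(sample)

# The ellipsis becomes three dots; the curly quotes are unchanged.
print(folded)
```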
Is your VNC client or server running? Try shutting down all VNC servers and clients, then check whether copy and paste works.
I'm stuck with the following problem:
I have a script which retrieves the title from the Firefox window:
tell application "Firefox"
    if the (count of windows) is not 0 then
        set window_name to name of front window
    end if
end tell
It works well as long as the title contains only English characters, but when the title contains non-ASCII characters (Cyrillic in my case) it produces UTF-8 garbage. I've analyzed this garbage a bit, and it seems that my Cyrillic characters are converted to UTF-8 without regard to the code page: instead of using the Cyrillic code page for the conversion, it uses no code page at all, so I end up with UTF-8 text whose characters differ from those in the window title.
My question is: how can I retrieve the window title in UTF-8 directly, without any conversion?
I can achieve this with AXAPI, but I want to do it with AppleScript because AXAPI requires an option to be turned on in the system.
UPD:
It works fine in the AppleScript Editor, but I'm compiling it from C++ code via OSACompile -> OSAExecute -> OSADisplay.
I don't know the internals of the AppleScript Editor, so maybe it has some inside information about how to encode the characters.
I found the answer while writing the update. Sometimes it is good to ask a question in order to understand it better :)
So, for future searchers: if you want a Unicode result from the script execution, pass typeUnicodeText to OSADisplay; the result AEDesc will then contain the text in UTF-16LE.
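If UTF-8 is what you need in the end, the UTF-16LE bytes still have to be converted once. The actual code here would be C++, but the step itself is small; as an illustration only, this is what it amounts to in Python (utf16le_to_utf8 is a made-up name, and the sample bytes are a stand-in for whatever you copy out of the AEDesc).

```python
def utf16le_to_utf8(raw):
    """Decode UTF-16LE bytes and re-encode them as UTF-8."""
    return raw.decode("utf-16-le").encode("utf-8")


# Stand-in for the bytes copied out of the result descriptor (assumption).
raw_title = "Привет".encode("utf-16-le")

utf8_title = utf16le_to_utf8(raw_title)
print(utf8_title.decode("utf-8"))
```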
What is the secret to Japanese characters in a Windows XP .bat file?
We have a script to open a file off disk in kiosk mode:
@ECHO OFF
"%ProgramFiles%\Internet Explorer\iexplore.exe" -K "%CD%\XYZ.htm"
It works fine when the OS is English, and it works fine on the Japanese OS when XYZ is made up of English characters, but when XYZ is made up of Japanese characters, they are mangled into gibberish by the time IE tries to find the file.
If the batch file is saved as Unicode or Unicode big endian, the script won't even run.
I have tried various ways of encoding the Japanese characters. Ampersand escape does not work (〹)
Percent escape does not work %xx%xx%xx
ABC works, AB%43 becomes AB3 in the error message, so it looks like the percent escape is trying to do parameter substitution. This is confirmed because %043 puts in the name of the script !
One thing that does work is pasting the Japanese characters into a command prompt.
@ECHO OFF
CD "%ProgramFiles%\Internet Explorer\"
Set /p URL="file to open: "
start iexplore.exe -K %URL%
This tells me that iexplore.exe will accept and parse the parameter correctly when it has Japanese characters, but not when they are written into the script.
So it would be nice to know what the secret may be to getting the parameter into IE successfully via the batch file, as opposed to via the clipboard and an environment variable.
Any suggestions greatly appreciated !
best regards
Richard Collins
P.S.
Another post has made this suggestion, which I have yet to follow up:
You might have more luck in cmd.exe if you opened it in UNICODE mode. Use "cmd /U".
Batch renaming of files with international chars on Windows XP
I will need to find out if this can be done from inside the script.
For the record, a simple answer has been found for this question.
If the batch file is saved as ANSI, it works!
First of all: Batch files are pretty limited in their internationalization support. There is no direct way of telling cmd what codepage a batch file is in. UTF-16 is out anyway, since cmd won't even parse that.
I have detailed an option in my answer to the following question:
Batch file encoding
which might be helpful for your needs.
In principle it boils down to the following:
Use an encoding which has single-byte mappings for ASCII
Put a chcp ... at the start of the batch file
Use the set codepage for the rest of the file
You can use codepage 65001, which is UTF-8 but make sure that your file doesn't include the U+FEFF character at the start (used as byte-order mark in UTF-16 and UTF-32 and sometimes used as marker for UTF-8 files as well). Otherwise the first command in the file will produce an error message.
So just use the following:
@echo off
chcp 65001
"%ProgramFiles%\Internet Explorer\iexplore.exe" -K "%CD%\XYZ.htm"
and save it as UTF-8 without BOM (Note: Notepad won't allow you to do that) and it should work.
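Since Notepad can't do it, one way to get such a file is to write it from a small script. Here is a sketch in Python, chosen only for illustration: the helper name write_batch_file, the output file name kiosk.bat and the Japanese file name 日本語.htm are all placeholders, not taken from the question. Python's "utf-8" codec never writes a BOM (the "utf-8-sig" codec is the one that does).

```python
import codecs
import os
import tempfile


def write_batch_file(path, lines):
    """Write lines as UTF-8 without a BOM, using Windows (CRLF) line endings."""
    with open(path, "w", encoding="utf-8", newline="\r\n") as handle:
        for line in lines:
            handle.write(line + "\n")


script_lines = [
    "@echo off",
    "chcp 65001",
    # Placeholder file name; substitute the real Japanese file name here.
    '"%ProgramFiles%\\Internet Explorer\\iexplore.exe" -K "%CD%\\日本語.htm"',
]

path = os.path.join(tempfile.mkdtemp(), "kiosk.bat")
write_batch_file(path, script_lines)

# Check that the file does not start with the UTF-8 byte-order mark.
with open(path, "rb") as handle:
    print(handle.read().startswith(codecs.BOM_UTF8))  # prints False
```

Any editor that offers "UTF-8 without BOM" (or "UTF-8 without signature") as a save option would do the same job.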
cmd /u won't do anything here, that advice is pretty much bogus. The /U switch only specifies that Unicode will be used for redirection of input and output (and piping). It has nothing to do with the encoding the console uses for output or reading batch files.
URL encoding won't help you either. cmd is hardly a web browser and outside of HTTP and the web URL encoding isn't exactly widespread (hence the name). cmd uses percent signs for environment variables and arguments to batch files and subroutines.
"Ampersand escape" also known as character entities known from HTML and XML, won't work either, because cmd is also not HTML or XML. The ampersand is used to execute multiple commands in a single line.
I too suffered this frustrating problem in batch/cmd files. However, so far as I can see, no one yet has stated the reason why this problem occurs, here or in other, similar posts at StackOverflow. The nearest statement addressing this was:
“First of all: Batch files are pretty limited in their internationalization support. There is no direct way of telling cmd what codepage a batch file is in.”
Here is the basic problem. Cmd files are the Windows-2000+ successor to MS-DOS and IBM-DOS bat(ch) files. MS and IBM DOS (1984 vintage) were written in the IBM-PC character set (code page 437). There, the 8th-bit codes were assigned (or “clothed” with) characters different from those assigned to the corresponding codes of Windows, ANSI, or Unicode. The presumption of CP437 encoding is unalterable (except, as previously noted, through cmd.exe /u). Where the characters of the IBM-PC set have exact counterparts in the Unicode set, Windows Explorer remaps them to the Unicode counterparts. Alas, even Windows-1252 characters like š and ¾ have no counterpart in code page 437.
Here is another way to see the problem. Try opening your batch/cmd script using the Windows Edit.com program (at C:\Windows\system32\Edit.com). The Windows-1252 character 0146 ’ (Unicode 8217) instead appears as IBM-PC 146 Æ. A batch command to rename Mary'sFile.txt as Mary’sFile.txt fails, as it is interpreted as MaryÆsFile.txt.
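The mismatch can be reproduced outside of cmd. The following Python sketch (Python is used only for illustration; as_seen_by_dos is a made-up name) encodes text as Windows-1252, the way an ANSI-saved script stores it, and then decodes the same bytes as code page 437. The curly quotes at bytes 145 and 146 come back as æ and Æ.

```python
def as_seen_by_dos(text, saved_as="cp1252", read_as="cp437"):
    """Encode text in one code page and reinterpret the same bytes in another."""
    return text.encode(saved_as).decode(read_as)


# U+2019 (right single quote) is byte 146 in Windows-1252.
print(as_seen_by_dos("Mary\u2019sFile.txt"))

# U+2018 (left single quote) is byte 145 in Windows-1252.
print(as_seen_by_dos("\u2018"))
```

Plain ASCII characters pass through unchanged, because both code pages agree on the first 128 codes.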
This problem can be avoided in the case of copying a file named Mary’sFile.txt: cite it as Mary?sFile.txt, e.g.:
xCopy Mary?sFile.txt Mary?sLastFile.txt
You will see a similar treatment (substitution of question marks) in a DIR list of files having Unicode characters.
Obviously, this is useless unless an extant file has the Unicode characters. This solution’s range is paltry and inadequate, but please make what use of it you can.
You can try to use Shift-JIS encoding.