Is it possible to automatically escape variables in XP batch (cmd) scripts?

So in a *.cmd file I have:
set /p variable=user input here
Now I'm supposed to do something with this variable, but the user may enter anything, even characters that are disallowed unless escaped. Is there anything in the CMD toolbox to deal with this? That is, can I automatically create (without the user having to know what is allowed and what is not) another variable that contains the input with all necessary characters escaped, so that I can use it safely in my script?

SET /P will properly accept any input, including unescaped special characters. When it comes time to use the variable (possibly) containing special characters, you just need to use delayed expansion.
@echo off
setlocal enableDelayedExpansion
set /p "test=Enter something: "
echo test=!test!
The quotes in the SET /P statement are not needed to allow special character entry. They are just there to enable a space at the end of the prompt.

Related

How can I add content containing operators like < to a .txt file with PowerShell? Can I use a delimiter to treat everything as text?

I'm trying to add content to a txt file with the Add-Content command, but the text contains things like "", commas, [] and commands that I don't want PowerShell to run, and I can't get PowerShell to treat all of it as text.
My question: can I add some kind of delimiter, as in MySQL, that tells PowerShell to treat everything inside it as text?
To prevent PowerShell from interpreting those characters, put a backtick before each one (e.g. `[, `\, `"). Do not confuse the backtick with the apostrophe (').

Cannot seem to use a base64 string from a file for a variable in batch

I'm having an issue parsing a base64 string from a text file into a batch variable.
I have a script that generates an XML config for an application using batch, and the XML itself generates fine. The problem is that the XML I generate contains some base64 that encodes more XML with variables that I need to modify. Headache and a half. (The application in question requires this or it breaks the config.)
I have the XML that needs to be encoded as a base64 string in a text file, and I need to load that text file into a variable, but I think the string is breaking the variable.
The base64 it has generated is this:
PD94bWwgdmVyc2lvbj0iMS4wIj8+DQo8QXJyYXlPZlN5c3RlbVZhcmlhYmxlIHhtbG5zOnhzaT0iaHR0cDovL3d3dy53My5vcmcvMjAwMS9YTUxTY2hlbWEtaW5zdGFuY2UiIHhtbG5zOnhzZD0iaHR0cDovL3d3dy53My5vcmcvMjAwMS9YTUxTY2hlbWEiPg0KPFN5c3RlbVZhcmlhYmxlPg0KPElEPiVVU0VSTkFNRSU8L0lEPg0KPFZhbHVlPmxlZ2FsaXQ8L1ZhbHVlPg0KPFJlYWRPbmx5PnRydWU8L1JlYWRPbmx5Pg0KPFR5cGU+U3RyaW5nPC9UeXBlPg0KPC9TeXN0ZW1WYXJpYWJsZT4NCjxTeXN0ZW1WYXJpYWJsZT4NCjxJRD4lTE9HT05fVVNFUk5BTUUlPC9JRD4NCjxSZWFkT25seT50cnVlPC9SZWFkT25seT4NCjxWYWx1ZT5sZWdhbGl0PC9WYWx1ZT4NCjxUeXBlPlN0cmluZzwvVHlwZT4NCjwvU3lzdGVtVmFyaWFibGU+DQo8U3lzdGVtVmFyaWFibGU+DQo8SUQ+JVNFX0xPQ0FMX1RFTVAlPC9JRD4NCjxWYWx1ZT5DOlxVc2Vyc1xsZWdhbGl0XEFwcERhdGFcTG9jYWxcVGVtcFwyXDwvVmFsdWU+DQo8UmVhZE9ubHk+dHJ1ZTwvUmVhZE9ubHk+DQo8VHlwZT5QYXRoPC9UeXBlPg0KPC9TeXN0ZW1WYXJpYWJsZT4NCjxTeXN0ZW1WYXJpYWJsZT4NCjxJRD4lU0VfTE9DQUxfRElDVF9ST09UJTwvSUQ+DQo8VmFsdWU+QzpcVXNlcnNcbGVnYWxpdFxEb2N1bWVudHNcU3BlZWNoRXhlY1w8L1ZhbHVlPg0KPFJlYWRPbmx5PnRydWU8L1JlYWRPbmx5Pg0KPFR5cGU+UGF0aDwvVHlwZT4NCjwvU3lzdGVtVmFyaWFibGU+DQo8U3lzdGVtVmFyaWFibGU+DQo8SUQ+JVNFX0NFTlRSQUxfRElDVF9ST09UJTwvSUQ+DQo8VmFsdWU+XFxMSVQtU0VSVkVSXFBoaWxpcHNfU0VfRW50ZXJwcmlzZVxDZW50cmFsX0RpY3RhdGlvbjwvVmFsdWU+DQo8UmVhZE9ubHk+ZmFsc2U8L1JlYWRPbmx5Pg0KPFR5cGU+UGF0aDwvVHlwZT4NCjwvU3lzdGVtVmFyaWFibGU+DQo8L0FycmF5T2ZTeXN0ZW1WYXJpYWJsZT4NCg==
for /f "tokens=*" %%c in (%~dp0\base.txt) do (
set base=%%c
)
echo %base%
I'm using the above for loop to load the file into a variable, but when echoing I get no output; the variable doesn't seem to have been set for some reason. Other text files I've loaded into a variable using this method work.
TL;DR -
Is this a pre-WinXp system?
Details:
Is that a long base64, or are you happy to see me? :-)
First I wondered about the line's length. But any WinXp+ will handle 8k chars, so the 7,664 in your example shouldn't be an issue.
So I wondered about plus signs and slashes outside double quotes. However, the lack of double quotes wasn't an issue in my testing. This worked just fine:
--CMD:--
for /f %A in (c:\temp\z.txt) do @set zLongTmp=%A
--OUTPUT:--
...a 7,664 char output...
So I checked the string and found only [A-Za-z0-9+/=], so nothing out of the ordinary.
I've done other tests, but all succeeded. I'm left wondering whether this is a pre-WinXP system, which has a command-line maximum of 2047 characters. If that's not it, I'd still advise using double quotes around the assignment in the 'set' command.
Example:
for /f %A in (c:\temp\z.txt) do @set "zLongTmp=%A"
echo %zLongTmp%
The main issue was that the file I was trying to load into a variable was encoded as UCS-2 LE with a BOM. Once I made sure the file was encoded as UTF-8 without a BOM, everything worked as required.
Thanks to everyone who helped figure this out.
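For anyone who wants to script that re-encoding step, here is a minimal Python sketch. It is not part of the original fix; the function name to_utf8_no_bom is made up for illustration, and it assumes that input without any BOM is already UTF-8 (or plain ASCII, as base64 is).

```python
import codecs


def to_utf8_no_bom(raw: bytes) -> bytes:
    """Re-encode text bytes as UTF-8 without a BOM.

    The BOM, if present, is used to pick the source encoding.
    Assumption: input without a BOM is already UTF-8 or ASCII.
    """
    if raw.startswith(codecs.BOM_UTF8):
        # Drop the three-byte UTF-8 signature before decoding.
        text = raw[len(codecs.BOM_UTF8):].decode("utf-8")
    elif raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The generic "utf-16" codec reads the BOM and consumes it.
        text = raw.decode("utf-16")
    else:
        text = raw.decode("utf-8")
    # Encoding with "utf-8" never writes a BOM.
    return text.encode("utf-8")


if __name__ == "__main__":
    # Simulate a file saved as "UCS-2 LE BOM" holding a short base64 string.
    sample = codecs.BOM_UTF16_LE + "PD94bWwg".encode("utf-16-le")
    print(to_utf8_no_bom(sample))
```

To convert a real file, read it in binary mode, pass the bytes through the function and write the result back in binary mode.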

How to programmatically rename a file containing decomposed characters?

I occasionally have to deal with files produced in a Mac environment whose filenames contain decomposed characters (looks like "é", but really is "e´"). These are apparently not recognized by Scripting.FileSystemObject and therefore cannot be acted on. I need to programmatically rename those files to remove the decomposed characters before further processing.
From what I found : "é (U+00E9) is a character that can be decomposed into an equivalent string of the base letter e (U+0065) and combining acute accent (U+0301)."
In other words, both strings look exactly like this : "é", but the length of the first one is 1 and the length of the second one is 2. If converted, it actually looks like this "e´".
Here's a little script for testing purposes :
(Please create those two test files by copy/pasting the names)
Filename with composed character (working) : é.txt
Filename with decomposed character (not working) : é.txt
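Set FSO = CreateObject("Scripting.FileSystemObject")
' Rename every file dropped onto the script to "a.txt" in its own folder
For Each Arg In WScript.Arguments
    Set objFile = FSO.GetFile(Arg)
    ' Folder part of the path, including the trailing backslash
    fPath = Left(objFile.Path, Len(objFile.Path) - Len(objFile.Name))
    FSO.MoveFile Arg, fPath & "a.txt"
    Set objFile = Nothing
Next
' Release the FileSystemObject only after the loop, so later iterations can still use it
Set FSO = Nothing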
The file with the decomposed character produces a "File not found" error.
I managed to convert a string from decomposed to composed characters, but still not working when trying to rename an actual file.
I'm completely stuck at this point, and any help would be highly appreciated! Thanks in advance.
This has to do with the VBS/WSH DropHandler (HKEY_CLASSES_ROOT\VBSFile\ShellEx\DropHandler)
The DropHandler of VBS/WSH files is {60254CA5-953B-11CF-8C96-00AA00B8708C}.
EXE/BAT/CMD files are handled by {86C86720-42A0-1069-A2E8-08002B30309D}.
VBS/WSH drophandler parses the dropped object(s) to a long file path while the EXE/BAT/CMD drophandler parses the dropped object(s) to short file path (such as C:\PROGRA~1).
The problem is that the DropHandler of VBS doesn't parse the dropped object in a Unicode-aware way.
Your code apparently expects items to be dropped onto it, so it relies on WScript.Arguments.
The FSO functions CAN handle filenames like you describe.
You can test this by performing a
Set objFile = FSO.GetFile("<PATH>\e´.txt")
or even
FSO.FileExists("<PATH>\e´.txt")
However, coming in through the arguments, the filenames are already crippled by the drophandler. I see no safe way of changing this behaviour other than messing around in the Windows Registry or by changing your script to not use 'drag-'n-drop' but getting the filenames from the OpenFile dialog perhaps.
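As an illustration of the "don't take the names from drag-and-drop" route, here is a minimal sketch in Python rather than VBScript. It enumerates a folder itself and renames every entry whose name is not in composed (NFC) form. The function names nfc_name and rename_to_nfc are made up for this example, and whether this approach suits a given Windows setup is an assumption that has not been tested here.

```python
import os
import tempfile
import unicodedata


def nfc_name(name):
    """Return the composed (NFC) form of a file name."""
    return unicodedata.normalize("NFC", name)


def rename_to_nfc(directory):
    """Rename entries whose names are not in NFC form.

    Returns a list of (old_name, new_name) pairs for the renamed entries.
    """
    renamed = []
    for name in os.listdir(directory):
        composed = nfc_name(name)
        if composed == name:
            continue  # already composed, nothing to do
        target = os.path.join(directory, composed)
        if os.path.exists(target):
            continue  # skip rather than overwrite an existing file
        os.rename(os.path.join(directory, name), target)
        renamed.append((name, composed))
    return renamed


if __name__ == "__main__":
    # Demo in a throwaway folder: "e" + U+0301 (combining acute accent).
    with tempfile.TemporaryDirectory() as folder:
        open(os.path.join(folder, "e\u0301.txt"), "w").close()
        for old, new in rename_to_nfc(folder):
            print(len(old), "->", len(new))
```

The length check in the demo mirrors the point made in the question: the decomposed name is one character longer than the composed one.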

Charset conversion from XXX to utf-8, command line

I have a bunch of text files that are encoded in ISO-8859-2 (they contain some Polish characters). Is there a command-line tool for Linux/Mac that I could run from a shell script to convert them to a saner UTF-8?
Use iconv, for example like this:
iconv -f ISO-8859-2 -t UTF-8 input.txt > output.txt
Some more information:
You may want to specify UTF-8//TRANSLIT instead of plain UTF-8. To quote the manpage:
If the string //TRANSLIT is appended to to-encoding, characters being converted are transliterated when needed and possible. This means that when a character cannot be represented in the target character set, it can be approximated through one or several similar looking characters. Characters that are outside of the target character set and cannot be transliterated are replaced with a question mark (?) in the output.
For a full list of encoding codes accepted by iconv, execute iconv -l.
The example above makes use of shell redirection. Make sure you are not using a shell that mangles encodings on redirection – that is, do not use PowerShell for this.
recode latin2..utf8 myfile.txt
This will overwrite myfile.txt with the new version. You can also use recode without a filename as a pipe.
GNU 'libiconv' should be able to do the job.
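If neither tool is installed, the same conversion can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the default source encoding of ISO-8859-2 is taken from the question, so adjust it if your files use something else.

```python
def convert_bytes(data, source="iso-8859-2"):
    """Decode bytes from the source encoding and re-encode them as UTF-8."""
    return data.decode(source).encode("utf-8")


def convert_file(src_path, dst_path, source="iso-8859-2"):
    """Convert a whole file to UTF-8, leaving the original untouched."""
    # newline="" keeps the original line endings exactly as they are.
    with open(src_path, "r", encoding=source, newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


if __name__ == "__main__":
    # The four bytes below spell a Polish city name in ISO-8859-2.
    print(convert_bytes(b"\xb3\xf3d\xbc").decode("utf-8"))
```

The source encoding has to match the file: decoding the same bytes with the wrong code page raises no error for single-byte encodings, it just silently produces different characters.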

ja chars in windows batch file

What is the secret to Japanese characters in a Windows XP .bat file?
We have a script to open a file from disk in kiosk mode:
@ECHO OFF
"%ProgramFiles%\Internet Explorer\iexplore.exe" -K "%CD%\XYZ.htm"
It works fine when the OS is English, and it works fine on the Japanese OS when XYZ is made up of English characters, but when XYZ is made up of Japanese characters, they are mangled into gibberish by the time IE tries to find the file.
If the batch file is saved as Unicode or Unicode big endian, the script won't even run.
I have tried various ways of encoding the Japanese characters. Ampersand escape does not work (&#12345;).
Percent escape does not work (%xx%xx%xx).
ABC works, but AB%43 becomes AB3 in the error message, so it looks like the percent escape is being treated as parameter substitution. This is confirmed because %043 puts in the name of the script!
One thing that does work is pasting the ja characters into a command prompt.
@ECHO OFF
CD "%ProgramFiles%\Internet Explorer\"
Set /p URL="file to open: "
start iexplore.exe -K %URL%
This tells me that iexplore.exe will accept and parse the parameter correctly when it has ja characters, but not when they are written into the script.
So it would be nice to know what the secret may be to getting the parameter into IE successfully via the batch file, as opposed to via the clipboard and an environment variable.
Any suggestions greatly appreciated !
best regards
Richard Collins
P.S.
Another post (Batch renaming of files with international chars on Windows XP) has made this suggestion, which I have yet to follow up:
You might have more luck in cmd.exe if you opened it in UNICODE mode. Use "cmd /U".
I will need to find out if this can be done from inside the script.
For the record, a simple answer has been found for this question.
If the batch file is saved as ANSI - it works !
First of all: Batch files are pretty limited in their internationalization support. There is no direct way of telling cmd what codepage a batch file is in. UTF-16 is out anyway, since cmd won't even parse that.
I have detailed an option in my answer to the question "Batch file encoding", which might be helpful for your needs.
In principle it boils down to the following:
Use an encoding which has single-byte mappings for ASCII
Put a chcp ... at the start of the batch file
Use the set codepage for the rest of the file
You can use codepage 65001, which is UTF-8 but make sure that your file doesn't include the U+FEFF character at the start (used as byte-order mark in UTF-16 and UTF-32 and sometimes used as marker for UTF-8 files as well). Otherwise the first command in the file will produce an error message.
So just use the following:
@echo off
chcp 65001
"%ProgramFiles%\Internet Explorer\iexplore.exe" -K "%CD%\XYZ.htm"
and save it as UTF-8 without BOM (Note: Notepad won't allow you to do that) and it should work.
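If your editor cannot save UTF-8 without a BOM, the file can also be produced by a small script. The following Python sketch is only an illustration: the helper names encode_batch and strip_utf8_bom are made up, and the Japanese file name in the demo is a placeholder rather than anything from the question.

```python
import codecs


def encode_batch(lines):
    """Join batch lines with CRLF and encode them as UTF-8 without a BOM."""
    body = "\r\n".join(lines) + "\r\n"
    # The "utf-8" codec never writes a BOM ("utf-8-sig" would).
    return body.encode("utf-8")


def strip_utf8_bom(data):
    """Remove a leading UTF-8 BOM (U+FEFF) from already encoded bytes."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


if __name__ == "__main__":
    script = encode_batch([
        "@echo off",
        "chcp 65001",
        # Placeholder file name, used only for this demo.
        '"%ProgramFiles%\\Internet Explorer\\iexplore.exe" -K "%CD%\\\u65e5\u672c\u8a9e.htm"',
    ])
    # Write the bytes in binary mode so nothing gets re-encoded on the way out.
    print(script[:11], script.startswith(codecs.BOM_UTF8))
```

strip_utf8_bom is there for the opposite case: a batch file that already exists and was saved with the marker at the start.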
cmd /u won't do anything here, that advice is pretty much bogus. The /U switch only specifies that Unicode will be used for redirection of input and output (and piping). It has nothing to do with the encoding the console uses for output or reading batch files.
URL encoding won't help you either. cmd is hardly a web browser and outside of HTTP and the web URL encoding isn't exactly widespread (hence the name). cmd uses percent signs for environment variables and arguments to batch files and subroutines.
"Ampersand escape" also known as character entities known from HTML and XML, won't work either, because cmd is also not HTML or XML. The ampersand is used to execute multiple commands in a single line.
I too suffered this frustrating problem in batch/cmd files. However, so far as I can see, no one yet has stated the reason why this problem occurs, here or in other, similar posts at StackOverflow. The nearest statement addressing this was:
“First of all: Batch files are pretty limited in their internationalization support. There is no direct way of telling cmd what codepage a batch file is in.”
Here is the basic problem. Cmd files are the Windows-2000+ successor to MS-DOS and IBM-DOS bat(ch) files. MS and IBM DOS (1984 vintage) were written in the IBM-PC character set (code page 437). There, the 8th-bit codes were assigned (or “clothed” with) characters different from those assigned to the corresponding codes of Windows, ANSI, or Unicode. The presumption of CP437 encoding is unalterable (except, as previously noted, through cmd.exe /u). Where the characters of the IBM-PC set have exact counterparts in the Unicode set, Windows Explorer remaps them to the Unicode counterparts. Alas, even Windows-1252 characters like š and ¾ have no counterpart in code page 437.
Here is another way to see the problem. Try opening your batch/cmd script using the Windows Edit.com program (at C:\Windows\system32\Edit.com). The Windows-1252 character 0146 ’ (Unicode 8217) instead appears as IBM-PC 146 Æ. A batch command to rename Mary'sFile.txt as Mary’sFile.txt fails, as it is interpreted as MaryÆsFile.txt.
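The mismatch is easy to reproduce outside of cmd by decoding the same byte under both code pages. A minimal Python sketch follows; the function names are made up for illustration, and cp1252 and cp437 are the names of Python's standard codecs for the two code pages.

```python
def decode_both(byte_value):
    """Decode one byte as Windows-1252 and as IBM-PC code page 437."""
    raw = bytes([byte_value])
    return raw.decode("cp1252"), raw.decode("cp437")


def as_seen_by_cp437(name):
    """Show how a name saved as Windows-1252 reads when taken as CP437."""
    return name.encode("cp1252").decode("cp437")


if __name__ == "__main__":
    # 0x91 and 0x92 are the Windows-1252 curly single quotes (145 and 146).
    for value in (0x91, 0x92):
        ansi, oem = decode_both(value)
        print(value, hex(ord(ansi)), hex(ord(oem)))
    print(as_seen_by_cp437("Mary\u2019sFile.txt"))
```

Byte 145 is the left quote (U+2018) in Windows-1252 but æ in CP437, and byte 146 is the right quote (U+2019) in Windows-1252 but Æ in CP437.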
This problem can be avoided in the case of copying a file named Mary’sFile.txt: cite it as Mary?sFile.txt, e.g.:
xCopy Mary?sFile.txt Mary?sLastFile.txt
You will see a similar treatment (substitution of question marks) in a DIR list of files having Unicode characters.
Obviously, this is useless unless an extant file has the Unicode characters. This solution’s range is paltry and inadequate, but please make what use of it you can.
You can try to use Shift-JIS encoding.