Generating md5 checksum using Windows Certutil program - hash

I need to generate an md5 hash for a list of tif and jpg images.
The hash must be inserted in an XML file with metadata about each image, which will be used for the digitalisation of the document.
The use of the md5 hash was not my decision, but a formal requirement of a standard based on the Dublin Core for the digitalisation of these kinds of documents.
I am currently generating each md5 hash using Windows built-in Certutil program from the command prompt.
My question is simple: am I doing this right?
I know the process is slow, but the list is short.

It looks OK. Remember that specifying the hash algorithm (MD5) works from Windows 7 and up (older versions of Windows throw an error) and that it must be in uppercase.
You can also append find /v "hash" to get only the hash itself,
like this:
certUtil -hashfile pathToFileToCheck MD5 | find /v "hash"
For example, running on Windows 8, I got this output:
C:\Users\xxxx\Documents>certutil -hashfile innfo MD5
MD5 hash of file innfo:
67 4b ba 79 42 32 d6 24 f0 56 91 b6 da 41 34 6d
CertUtil: -hashfile command completed successfully.
and with find /v "hash" I got:
C:\Users\xxxx\Documents>certutil -hashfile innfo MD5 | find /v "hash"
67 4b ba 79 42 32 d6 24 f0 56 91 b6 da 41 34 6d
The find trick excludes (the /v parameter) every line containing the string "hash"; the string must be given in double quotes.
Because the first and last lines around the hash itself both contain the word hash, you get clean output.
My version works from cmd, not PowerShell.
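For comparison, the same filtering can be sketched in Python. This is only an illustration, assuming the three-line certutil output shown above; the helper names extract_hash and md5_of_file are made up for this example and are not part of certutil. hashlib offers an independent way to cross-check a value that certutil produced.

```python
import hashlib


def extract_hash(certutil_output: str) -> str:
    """Mimic `find /v "hash"`: drop every line containing "hash", then remove spaces."""
    kept = [line for line in certutil_output.splitlines() if "hash" not in line]
    return "".join(kept).replace(" ", "").strip()


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the MD5 of a file in chunks, so large images are not loaded at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the value from extract_hash on certutil's output equals md5_of_file for the same image, the two tools agree on the checksum.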

I often need to create a hash file to place on an FTP service.
These hash files must contain the name of the original file to allow automatic verification that is done by various tools.
For example, if you have a file named foo.txt, it is necessary to have a file foo.txt.md5 with the following content:
a3713593c5edb65c8287eb6ff9ec4bc0 *foo.txt
The following batch does the job
for %%a in (%1) do set filename=%%~nxa
@certutil -hashfile "%1" md5 | find /V "hash" >temp.txt
for /f "delims=" %%i in (temp.txt) do set hash=%%i
echo %hash% *%filename%>%1.md5
del temp.txt
You can replace md5 with sha256 or sha512 if you need to; certutil supports both.
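The sidecar format described above (hash, a space, an asterisk, then the bare file name) can also be sketched in Python. This is a hedged illustration rather than a replacement for the batch file; the function name write_md5_sidecar is an assumption made for this example.

```python
import hashlib
import os


def write_md5_sidecar(path: str) -> str:
    """Write '<md5> *<filename>' to '<path>.md5' and return the sidecar path."""
    with open(path, "rb") as handle:
        digest = hashlib.md5(handle.read()).hexdigest()
    sidecar = path + ".md5"
    # Only the file name (not the directory) goes into the sidecar line.
    with open(sidecar, "w", encoding="utf-8", newline="\n") as out:
        out.write(f"{digest} *{os.path.basename(path)}\n")
    return sidecar
```

Calling write_md5_sidecar on foo.txt would create foo.txt.md5 next to it, in the layout that verification tools expect.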

How to remove leading whitespace denoted with ? using rename in MacOS

I have directories that look like this on macOS:
For example ???9-24_v_hMrgprx2, where ??? is actually
whitespace. What I want to do is use rename to remove that leading whitespace.
I tried this but failed.
rename "s/\s*//g" *
What's the right way to do it?
Update
Hexdump looks like this:
ls | hexdump -C
00000000 e3 80 80 39 2d 32 34 5f 76 5f 68 4d 72 67 70 72 |...9-24_v_hMrgpr|
00000010 78 32 0a |x2.|
00000013
Verify what those characters are first, since macOS doesn't display ASCII whitespace characters in filenames as ? (unless you have some weird encoding issue going on). It would help if you added information like this to your question:
$ touch "   touched"
$ ls -l *touched
-rw-rw-r--@ 1 brian staff 0 Aug 18 13:52    touched
$ ls *touched | hexdump -C
00000000 20 20 20 74 6f 75 63 68 65 64 0a |   touched.|
0000000b
For rename, you've almost got it right if those leading characters were whitespace. However, you want to anchor the pattern so you only match whitespace at the beginning of the name:
rename 's/\A\s+//' *
Now that we know your filenames start with U+3000 (which is whitespace), I can see what's going on.
There are various versions of rename. Larry Wall wrote one, @tchrist wrote one based on that (and I use that), and File::Rename is another modification of Larry's original. Then there is Aristotle's version.
The problem with my rename (from @tchrist) is that it doesn't interpret the filenames as UTF-8. So U+3000 looks like the three bytes you see: e3 80 80. I'm guessing that your font might not support any of those. There could be all sorts of things going on. See @tchrist's Unicode answer.
I can create the file:
% perl -CS -le 'print qq(\x{3000}abc)' | xargs touch
I can easily see the file, but I have a font that can display that character:
$ ls -l *abc
-rw-rw-r-- 1 brian staff 0 Aug 22 02:44  abc
But, when I try to rename it, using the -n for a dry run, I get no output (so, no matching files to change):
$ rename -n 's/\A\s+//' *abc
If I run perl directly and give it -CSA to treat the standard file handles (-CS) and the command-line arguments (-CA) as UTF-8, the file matches and the replacement happens:
$ perl -CSA `which rename` -n 's/\A\s+//' *abc
rename  abc abc
So, for my particular version, I can edit the shebang line to have the options I need. This works for me because I know my terminal settings, so it might not work everywhere for all settings:
#!/usr/bin/env perl -CSA
But the trick is how did I get that version of rename? I'm pretty sure I installed some module from CPAN that gave it to me, but what? I'd supply a patch if I could.
E3 80 80 is the UTF-8 encoding of U+3000 which is a CJK whitespace character.
rename is not a standard utility on MacOS, and there are several popular utilities with this name, so what exactly works will depend on which version you have installed. The syntax looks like you have this one, from the Perl distribution. Maybe try
rename 's/\xe3\x80\x80//' *
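The difference between matching on decoded characters and matching on raw bytes, which is what both answers turn on, can be sketched in Python. This is an illustration only; the function names are made up for this example, and Python's regex engine stands in for Perl's here.

```python
import re

# U+3000 IDEOGRAPHIC SPACE, the character found in the hexdump.
IDEOGRAPHIC_SPACE = "\u3000"


def strip_leading_whitespace(name: str) -> str:
    """On decoded text, \\s matches Unicode whitespace, including U+3000."""
    return re.sub(r"\A\s+", "", name)


def strip_leading_whitespace_bytes(raw: bytes) -> bytes:
    """On raw bytes, \\s only matches ASCII whitespace, so e3 80 80 is left alone."""
    return re.sub(rb"\A\s+", b"", raw)
```

The bytes variant behaves like a rename that does not decode filenames as UTF-8: the pattern never matches, so nothing is renamed.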

Hashing a string, and then verifying a string if it equals to said hash

So I'm essentially looking to password protect my batch script in a more secure way where I don't store the password within the batch file. The best idea in mind for me is to simply hash the password under SHA256, and then have the batch file let me provide the preimage to the hash. So essentially the hash is being stored in the batch file code but not the genuine password itself.
How can I do this?
Synopsis
I need a way that I can hash a string under SHA256
Then input that hash into the batch file and require an input value that will be checked to see if equal to specified hash.
I can't seem to find any kind of native command to make my batch script check for a specified hash
set value==certutil -hashstring blablabla SHA256`
if %value%== 492F3F38D6B5D3CA859514E250E25BA65935BCDD9F4F40C124B773FE536FEE7D echo this is the valid hash preimage, authenticated!
That's an example of what I'm going for.
Specifics- Windows 10. Powershell
Certutil can hash only files so you need to write the string you want into a file. You can use sha256.bat that utilizes certutil:
(echo(blabla)>#
call sha256 # value
del /q /f # >nul 2>nul
echo %value%
check also this
@ECHO OFF
SETLOCAL
set /p "userinput=Batch Hash ? "
FOR /f %%h IN ('certutil -hashfile "%~f0" sha256 ^|find /v ":"') DO SET "hash=%%h"
ECHO hash=%hash%
if "%userinput%"=="%hash%" (echo match) else (echo miss)
GOTO :EOF
Nothing complicated. "%~f0" is the full filename of the batch file. The certutil output for it is passed through an escaped pipe to find, which eliminates the lines containing :, and the remaining line is assigned to %%h and from there to hash.
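The check the question asks for (store only a SHA-256 hash, accept input whose hash matches) can be sketched in Python for comparison. This is a minimal sketch under stated assumptions: the placeholder password "abc" and the names STORED_SHA256 and is_valid_preimage are invented for this example.

```python
import hashlib
import hmac

# SHA-256 of the placeholder password "abc" (an assumption for this sketch).
STORED_SHA256 = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"


def is_valid_preimage(candidate: str, stored_hex: str = STORED_SHA256) -> bool:
    """Hash the candidate string and compare it with the stored hex digest."""
    candidate_hex = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(candidate_hex, stored_hex.lower())
```

One thing to keep in mind when comparing with the batch approach: a file written with echo ends with a line break, so its hash differs from the hash of the bare string.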

How do applications know character encoding?

Lets say I have two files as below :
$ ll
total 8
-rw-rw-r--. 1 matias matias 6 Nov 27 20:25 ascii.txt
-rw-rw-r--. 1 matias matias 8 Nov 28 21:57 unicode.txt
Both contain a single line of text, but there is an extra character in the second file as shown here ( Greek letter Sigma ) :
$ cat ascii.txt
matias
$ cat unicode.txt
matiasΣ
If I pass them through file command this is the output :
$ file *
ascii.txt: ASCII text, with no line terminators
unicode.txt: UTF-8 Unicode text, with no line terminators
Which seems ok. Now If I make an hexdump of the file I get this :
$ hexdump -C ascii.txt
00000000 6d 61 74 69 61 73 |matias|
00000006
$ hexdump -C unicode.txt
00000000 6d 61 74 69 61 73 ce a3 |matias..|
00000008
So, my question is: how does an application such as cat know that the last two bytes are actually a single Unicode character? If I print the last two bytes individually I get:
$ printf '%d' '0xce'
206
$ printf '%d' '0xa3'
163
Which in extended ASCII are :
$ py3 -c 'print(chr(206))'
Î
$ py3 -c 'print(chr(163))'
£
Is my logic flawed? What Am I missing here?
Command-line tools work with bytes – they receive bytes and send bytes.
The notion of a character – be it represented by a single or multiple bytes – is a task-specific interpretation of the raw bytes.
When you call cat on a UTF-8 file, I assume it just forwards the bytes it reads without caring about characters.
But your terminal, which has to display the output of cat, does take care to interpret the bytes as characters and show a single character for the byte sequence 206, 163.
From its configuration (locale env vars etc.), your terminal apparently assumes that text IO happens with UTF-8.
If this assumption is violated (eg. if a command sends the byte 206 in isolation, which is invalid UTF-8), you will see � symbols or other text garbage.
Since UTF-8 was designed to be backwards-compatible with ASCII, ASCII text files can be treated just like UTF-8 files (they are UTF-8).
While cat probably doesn't care about characters, many other commands do, eg. the wc -m command to count characters (not bytes!) in a text file.
Such commands all need to know how UTF-8 (or whatever your terminal encoding is) maps bytes to characters and vice versa.
For example, when you print(chr(206)) in Python, then it sends the bytes 195, 142 to STDOUT because:
(a) it has figured out your terminal expects UTF-8 and (b) the character "Î" (to which Unicode codepoint 206 corresponds) is represented with these two bytes in UTF-8.
Finally, the terminal displays "Î", because it decodes the two bytes to the corresponding character.
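The byte values in this answer can be checked with a short Python sketch. It only illustrates the decoding step a terminal performs; the names SIGMA_BYTES and interpretations are made up for this example.

```python
# The last two bytes of unicode.txt from the hexdump: ce a3.
SIGMA_BYTES = bytes([0xCE, 0xA3])


def interpretations(raw: bytes) -> dict:
    """Decode the same bytes under two different encoding assumptions."""
    return {
        # Invalid UTF-8 sequences become U+FFFD instead of raising an error.
        "utf-8": raw.decode("utf-8", errors="replace"),
        # Latin-1 maps every byte to exactly one character.
        "latin-1": raw.decode("latin-1"),
    }
```

Read as UTF-8 the two bytes are one character (Σ); read as Latin-1 they are two (Î and £). The bytes are identical in both cases, only the interpretation changes.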
How do applications know character encoding?
Either:
(They guess—perhaps with heuristics. This isn't "knowing".)
They tell you exactly which one to use (via documentation, standard, convention, etc). (This isn't really "knowing" either.)
They allow you to tell them which one you are using.
It's your file; You have to know.

How to use Rar or WinRAR for creating an encrypted archive with a password starting with a double quote?

I am trying to create a command line to compress as RAR file using password through command line in Windows 7. I have installed WinRAR 5.31 x64.
The following command works for me:
rar a -r -m0 -hp"!(/!$!#!#=)\%" "C:\files1.rar" *.*
The password is !(/!$!#!#=)\%.
My problem occurs if I wanted to put double quotes " inside my password, for example at the beginning:
rar a -r -m0 -hp""!(/!$!#!#=)\%" "C:\files1.rar" *.*
The password should be "!(/!$!#!#=)\%.
That does not work for me, I tried putting \ before of ", but this is also not working.
Could anyone guide me through it in order to figure it out this special character in my password?
Further to the answer by Mofi:
Especially for Linux users using winrar/rar from the commandline, it may be worth realizing that rar effectively accepts "keyfiles", which may overcome the need to fiddle with quotes as part of the password.
Rar's documented maximum password length is 127 characters/bytes. It is not clear (to me) precisely which characters are part of the password space, but at least base64-encoded strings work. However, rar currently uses a password based key derivation function based on PBKDF2 using the HMAC-SHA256 hash function, which has a block size of 512 bits. Per PBKDF2, passwords longer than the block size of the hash function are first pre-hashed into a digest of 256 bits, which digest is then used as the password (instead of the original password). To avoid this, the archive password you pick should be no longer than 512 bits or 64 characters.
In a base64-encoded string, each character represents 6 bits of data; a 64 character password thus amounts to 384 random bits, which may be derived from 48 random bytes.
rar a -hp"$(dd if=/dev/urandom bs=48 count=1 | base64 -w0 | tee /tmp/pwd)" archive
The dd-pipe above will read 48 (pseudo)random bytes from the kernel's (non-blocking) random number source device, convert these into a 64 character password, tell rar to use that password for deriving a 256-bit (AES256) encryption key (RAR5-format), and at the same time store the password in the file `/tmp/pwd'.
The archive may again be accessed, e.g. listed, by reading the password back from the file, for instance like so:
rar l -p"$(cat /tmp/pwd)" archive.rar
The password file may be safely stored separately or together with the archive, in the latter case (of course) after encrypting it, e.g. with gpg using your own public key so as to lock the archive password under your private key/key phrase. All of this aims to conveniently make good use of rar's password/key space.
I note that I didn't dive into unrar's publicly available source code; the above is merely based on the general documentation. In addition, I don't know if the above can be made to work under Windows.
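The arithmetic behind the 64-character password (48 random bytes, 6 bits per base64 character) can be checked with a small Python sketch. This only mirrors the dd | base64 step; the function name make_password is an assumption for this example, and nothing here calls rar.

```python
import base64
import secrets


def make_password(num_bytes: int = 48) -> str:
    """Return a base64 password built from num_bytes of OS-provided randomness."""
    raw = secrets.token_bytes(num_bytes)
    # 48 is a multiple of 3, so the encoding has no '=' padding: 48 * 8 / 6 = 64 chars.
    return base64.b64encode(raw).decode("ascii")
```

The result is 64 characters long and carries 384 random bits, matching the figures given above.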
The Windows command interpreter cmd.exe and Rar.exe itself determine how arguments specified on the command line are interpreted when the command line is parsed. Argument strings containing a space or one of these characters &()[]{}^=;!'+,`~<|> must be enclosed in double quotes. This makes it very difficult to pass a double quote character as part of an argument string to a console application, especially at the beginning of an argument string.
But there is a solution for this very uncommon and very specific problem, caused by a password/passphrase starting with a straight double quote character, which usually marks the beginning or end of an argument string within which all characters are interpreted literally.
The manual of console version of WinRAR is the text file Rar.txt in program files folder of WinRAR. It can be read in this manual that Rar.exe supports reading switches from an environment variable RAR. By using this environment variable and special parsing of Windows command line interpreter on a SET command line it is possible to create a RAR archive from command line with a password starting with a single straight double quote character.
@echo off
setlocal EnableExtensions DisableDelayedExpansion
set "RAR=-hp""!(/!$!#!#=)\%%""
"%ProgramFiles%\WinRAR\Rar.exe" a -r -m0 -x"%~f0" "%USERPROFILE%\Desktop\files1.rar" *.*
endlocal
The switch -hp is read from environment variable RAR in addition to the other switches specified directly on RAR command line as explained by the manual.
The environment variable RAR is defined using syntax set "variable=value" as explained in detail by answer on Why is no string output with 'echo %var%' after using 'set var = text' on command line?
A password/passphrase with space or one of these characters &()[]{}^=;!'+,`~<|> needs to be enclosed in double quotes on Windows command line. For that reason Rar.exe removes from the passed password/passphrase the first and last double quote if there is one at begin and/or end. So it is not possible to define the password with "!(/!$!#!#=)\%. The password must be defined with two additional double quotes using ""!(/!$!#!#=)\%" to let really used password start with a straight double quote character.
In a batch file % marks begin/end of an environment variable reference except it is escaped with one more %.
So finally the command line set "RAR=-hp""!(/!$!#!#=)\%%"" defines the environment variable RAR with switch -hp passing the string "!(/!$!#!#=)\% to Rar.exe as password to use on encryption.
The RAR archive files1.rar is created on the user's desktop by this code because the root of drive C: is usually write-protected.
Note: Rar and WinRAR interpret *.* differently from *, as also explained in the manual, whereas Windows kernel functions interpret them identically. With *.*, Rar adds only files whose names contain a dot to the RAR archive file. So you might be better off using just * as the wildcard.
The switch -x"%~f0" prevents adding the batch file also into the RAR archive file if being stored in current directory on execution of the batch file. Run in a command prompt window call /? for an explanation of %~f0 – full name of argument 0 which means batch file name with extension and full path.

Batch script is not executed if chcp was called

I'm trying to delete some files with unicode characters in them with batch script (it's a requirement). So I run cmd and execute:
> chcp 65001
Effectively setting codepage to UTF-8. And it works:
D:\temp\1>dir
Volume in drive D has no label.
Volume Serial Number is 8C33-61BF
Directory of D:\temp\1
02.02.2010 09:31 <DIR> .
02.02.2010 09:31 <DIR> ..
02.02.2010 09:32 508 1.txt
02.02.2010 09:28 12 delete.bat
02.02.2010 09:20 95 delete.cmd
02.02.2010 09:13 <DIR> Rún
02.02.2010 09:13 <DIR> Гуцул Каліпсо
3 File(s) 615 bytes
4 Dir(s) 11 576 438 784 bytes free
D:\temp\1>rmdir Rún
D:\temp\1>dir
Volume in drive D has no label.
Volume Serial Number is 8C33-61BF
Directory of D:\temp\1
02.02.2010 09:56 <DIR> .
02.02.2010 09:56 <DIR> ..
02.02.2010 09:32 508 1.txt
02.02.2010 09:28 12 delete.bat
02.02.2010 09:20 95 delete.cmd
02.02.2010 09:13 <DIR> Гуцул Каліпсо
3 File(s) 615 bytes
3 Dir(s) 11 576 438 784 bytes free
Then I put the same rmdir commands in batch script and save it in UTF-8 encoding. But when I run nothing happens, literally nothing: not even echo works from batch script in this case. Even saving script in OEM encoding does not help.
So it seems that when I change codepage to UTF-8 in console, scripts just stop working. Does somebody know how to fix that?
If you want to have unicode supported in batch file, then CHCP on a line by itself just aborts the batch file. What I suggest is putting CHCP on each batch file line that needs unicode as follows
chcp 65001 > nul && <real command here>
Example: In my case I wanted to have a nice TAIL of my log files while debugging, but the content for even Latin-1 characters was being messed up. So here is my batch file which wraps the real tail implementation from Windows Resource Kit.
@C:\WINDOWS\system32\chcp.com 65001 >nul && tail.exe -f %1
In addition, for output to a console, you need to set a TrueType font, e.g. Lucida Console.
And apparently for output to a file the command line needs to run as Unicode, so you would kick off your batch script as follows
cmd /u /c <batch file command here>
Disclaimer: Tested on Windows XP sp3 with Windows Resource Kit.
The Unicode support in console, and especially in batch files, is pretty bad.
Can you "twist" the requirement to say PowerShell or Active Scripting (VBScript or JScript)?
It will save you a lot of grief in the long run (if you need to grow this beyond this simple task)
Not to mention that both PowerShell and ActiveScripting use way more powerful languages, allowing for functions, proper loops, real variables, debuggers, a lot of goodies for a more serious project.
Try inserting a blank line as first line in your batch file...
Line 1:
Line 2:CHCP 65001
Line 3:script commands
Should work!