What is the easiest way to list all files in a directory with their hashes?
I am trying to compare a list of files in a folder. The problem is that all the file sizes are the same, but I need to ensure that their contents are the same too.
PowerShell has a cmdlet named Get-FileHash. You can simply run ls and pipe the output to Get-FileHash (the default algorithm is SHA256):
e.g. ls | Get-FileHash
You can also specify the hash algorithm by passing the -Algorithm parameter:
e.g. ls | Get-FileHash -Algorithm MD5
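To answer the "are the contents the same" part directly, the hashes can be grouped so that files with identical content end up together. This is only a minimal sketch: the function name Get-ContentGroup is made up for illustration, and the -File switch assumes PowerShell 3.0 or later.

```powershell
# Hypothetical helper: groups the files directly inside $Path by content hash.
# Files that land in the same group have identical content.
function Get-ContentGroup {
    param([string]$Path = '.')

    Get-ChildItem -LiteralPath $Path -File |
        Get-FileHash -Algorithm SHA256 |
        Group-Object -Property Hash
}
```

Wrap the call in @() before counting, because a single group object has its own Count property (the number of files in that group). If @(Get-ContentGroup -Path .).Count is 1, every file in the folder has the same content.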
I receive files from a client and I need to make sure those files remain intact even after I process them.
What I want to guard against is possible tampering with files that have already been processed.
For example, a customer may mistakenly send a file with missing information and later edit the original file to claim the information was there all along.
That's why I had the idea of producing a report with the hashes of the files in the reception directory.
I tried hard to make everything clear and intelligible, even though I don't speak English.
Scene 1:
I have a directory with several text files and I need to export the SHA-256 and MD5 hash calculations from each file to a CSV somewhere.
Scene 2:
I have a directory with several subfolders named after customers, and within those, other subfolders. How can I extract the SHA-256 and MD5 hashes of all files in these directories and subdirectories?
Something like the following should get you started. This finds the SHA256 hash of every file inside the current directory (not including subdirectories):
Get-ChildItem | Get-FileHash | Export-CSV -Path "C:\Temp\summary.csv"
PS C:\Users\Jan\Dropbox\py\advent_of_code> cat "C:\Temp\summary.csv"
#TYPE Microsoft.Powershell.Utility.FileHash
"Algorithm","Hash","Path"
"SHA256","8E8FF02948E62C54EF782373500CD0D97B8A2DA0F1655A6134B37284CF5BCE79","C:\Users\Jan\Dropbox\py\advent_of_code\20191201.in"
"SHA256","63E68322B7E8131CDDEFB77492EC7E1B8B8C46696772CE561850C854E4E8B6EA","C:\Users\Jan\Dropbox\py\advent_of_code\20191201.py"
"SHA256","FFF1D3F7F7FBDDC4CDC90E123D9BC7B0B7A450DC28F3A4F2D786701B4A9B279D","C:\Users\Jan\Dropbox\py\advent_of_code\20191204.py"
"SHA256","DE3662893F2779446AFC78B20E63F5250826D5F52206E5718E1A2713F876941E","C:\Users\Jan\Dropbox\py\advent_of_code\20191206.in"
"SHA256","B9381E11A442DDC8CF802F337F0487E9269800B145106FFDDB0E6472D8C6F129","C:\Users\Jan\Dropbox\py\advent_of_code\20191206.py"
If you really need both SHA256 and MD5:
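# Collect the SHA256 hashes as an array (the @(...) keeps it an array even for one file)
$h = @(Get-ChildItem | Get-FileHash)
# Hash the same files again with MD5, in the same order
$h2 = @($h | Get-Item | Get-FileHash -Algorithm MD5)
for ($i=0; $i -lt $h.Length; $i++) {
    # Replace each entry with one object that carries both hashes
    $h[$i] = [PSCustomObject]@{Path=$h[$i].Path; SHA256=$h[$i].Hash; MD5=$h2[$i].Hash}
}
$h | Export-Csv "C:\Temp\expo.txt"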
PS C:\Users\Jan\Dropbox\py\advent_of_code> cat "C:\Temp\expo.txt"
#TYPE System.Management.Automation.PSCustomObject
"Path","SHA256","MD5"
"C:\Users\Jan\Dropbox\py\advent_of_code\20191201.in","8E8FF02948E62C54EF782373500CD0D97B8A2DA0F1655A6134B37284CF5BCE79","11689BA3058C306DAA3651562621BE20"
"C:\Users\Jan\Dropbox\py\advent_of_code\20191201.py","63E68322B7E8131CDDEFB77492EC7E1B8B8C46696772CE561850C854E4E8B6EA","3F50E29C797ED4BD65122F8DA1208D4D"
"C:\Users\Jan\Dropbox\py\advent_of_code\20191204.py","FFF1D3F7F7FBDDC4CDC90E123D9BC7B0B7A450DC28F3A4F2D786701B4A9B279D","41094A3E067669F46C965D0E34EA5CA6"
"C:\Users\Jan\Dropbox\py\advent_of_code\20191206.in","DE3662893F2779446AFC78B20E63F5250826D5F52206E5718E1A2713F876941E","1A7571873E6B9430A2BFD846CA0B8AB7"
"C:\Users\Jan\Dropbox\py\advent_of_code\20191206.py","B9381E11A442DDC8CF802F337F0487E9269800B145106FFDDB0E6472D8C6F129","0CBB058D315944B7B5183E33FC780A4D"
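For Scene 2 (subfolders inside subfolders), the same idea can be extended with -Recurse. The following is a sketch rather than a definitive solution: the function name Get-DualHash is hypothetical, and it reads every file twice, once per algorithm.

```powershell
# Hypothetical helper: emits one object per file under $Path (subdirectories
# included) carrying the full path plus the SHA256 and MD5 hashes.
function Get-DualHash {
    param([string]$Path = '.')

    Get-ChildItem -LiteralPath $Path -Recurse -File |
        ForEach-Object {
            [PSCustomObject]@{
                Path   = $_.FullName
                SHA256 = (Get-FileHash -LiteralPath $_.FullName -Algorithm SHA256).Hash
                MD5    = (Get-FileHash -LiteralPath $_.FullName -Algorithm MD5).Hash
            }
        }
}
```

It could then be used like this (the output path is just an example, and it should live outside the tree being hashed):
Get-DualHash -Path . | Export-Csv -Path "C:\Temp\hashes.csv" -NoTypeInformation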
I'm trying to generate an MD5-Checksum with powershell for a whole directory.
On Linux there is a 1-liner that works just great, like this one:
$ tar -cf - somedir | md5sum
I learned that "tar" is now part of Windows 10 and that it can be called from PowerShell.
So I tried this:
tar -cf - C:\data | Get-FileHash -Algorithm MD5
What I get from PowerShell is this:
tar.exe: Removing leading drive letter from member names
Get-FileHash : the input object cannot be bound to any parameters of the command because the command does not accept pipeline input or the input and its properties do not match any of the parameters that accept pipeline Input
My shell is set to German, so I ran the German error text through a translation tool (https://www.translator.eu/#).
I wondered why I get this particular error message, because Get-FileHash IS able to process pipeline input, e.g.:
ls | Get-FileHash -Algorithm MD5
This command just works like a charm, but it gives me checksums for each and every file.
What I want is 1 checksum for a complete given directory.
So, I probably messed up something… - any ideas?
EDIT: Here's an alternate method that is consistent even if all the files are moved/copied to another location. This one uses the hashes of all files to create a "master hash". It takes longer to run for obvious reasons but will be more reliable.
$HashString = (Get-ChildItem C:\Temp -Recurse | Get-FileHash -Algorithm MD5).Hash | Out-String
Get-FileHash -InputStream ([IO.MemoryStream]::new([char[]]$HashString))
Original, faster but less robust, method:
$HashString = Get-ChildItem C:\script\test\TestFolders -Recurse | Out-String
Get-FileHash -InputStream ([IO.MemoryStream]::new([char[]]$HashString))
This could be condensed into one line if wanted, although it starts getting harder to read:
Get-FileHash -InputStream ([IO.MemoryStream]::new([char[]]"$(Get-ChildItem C:\script\test\TestFolders -Recurse|Out-String)"))
Basically it creates a memory stream with the information from Get-ChildItem and passes that to Get-FileHash.
Not sure if this is a great way of doing it, but it's one way :-)
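One possible refinement of the "master hash" idea, offered as a sketch under a few assumptions: sort the files by path so the order is stable, and include each file's path relative to the root so that renames change the result while moving the whole folder does not. The function name Get-DirectoryHash and the "hash, two spaces, relative path" line format are made up here, not something Get-FileHash provides. Path separators and sort order can differ between systems, so the value is only meant to be compared on similar machines.

```powershell
# Hypothetical helper: one hash for a whole directory tree.
function Get-DirectoryHash {
    param(
        [Parameter(Mandatory)] [string]$Path,
        [string]$Algorithm = 'SHA256'
    )

    # Absolute file-system path of the root, used to build relative paths.
    $root = (Resolve-Path -LiteralPath $Path).ProviderPath

    # One line per file: "<hash>  <relative path>", in a stable order.
    $lines = Get-ChildItem -LiteralPath $root -Recurse -File |
        Sort-Object -Property FullName |
        ForEach-Object {
            $relative = $_.FullName.Substring($root.Length).TrimStart([char[]]'\/')
            $hash = (Get-FileHash -LiteralPath $_.FullName -Algorithm $Algorithm).Hash
            "$hash  $relative"
        }

    # Hash the combined listing through a memory stream, as in the answer above.
    $bytes = [System.Text.Encoding]::UTF8.GetBytes(($lines -join "`n"))
    $stream = [System.IO.MemoryStream]::new($bytes)
    try { (Get-FileHash -InputStream $stream -Algorithm $Algorithm).Hash }
    finally { $stream.Dispose() }
}
```

For example: Get-DirectoryHash -Path C:\data -Algorithm MD5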
I was trying to compute hashes using the PowerShell Get-FileHash cmdlet, but it gives an error.
Get-FileHash "C:\Intel\Logs\IntelCPHS.log" -Algorithm sha256 | Format-List
This is the error:
Get-FileHash : The file 'C:\Intel\Logs\IntelCPHS.log' cannot be read:
The process cannot access the file 'C:\Intel\Logs\IntelCPHS.log'
because it is being used by another process.
So, I'm trying to copy the file to another location.
How can I do it?
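Besides copying the file first, one thing that may be worth trying is to open the file yourself with permissive sharing flags and hand the stream to Get-FileHash -InputStream. This is only a sketch: the function name Get-SharedFileHash is hypothetical, and it can only work if the process holding the file allows other readers. Also keep in mind that a log file that is still being written may hash differently a moment later.

```powershell
# Hypothetical helper: hashes a file that another process may have open for writing.
function Get-SharedFileHash {
    param(
        [Parameter(Mandatory)] [string]$Path,
        [string]$Algorithm = 'SHA256'
    )

    # Open read-only while explicitly allowing other processes to read and write.
    $stream = [System.IO.File]::Open(
        $Path,
        [System.IO.FileMode]::Open,
        [System.IO.FileAccess]::Read,
        [System.IO.FileShare]::ReadWrite)
    try {
        (Get-FileHash -InputStream $stream -Algorithm $Algorithm).Hash
    }
    finally {
        $stream.Dispose()
    }
}
```

For example: Get-SharedFileHash -Path "C:\Intel\Logs\IntelCPHS.log"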
I am trying to write a Windows PowerShell script. I need to get the file hash of every file in a directory tree.
This is what I got so far:
Get-ChildItem -Path "c:\temp\path" -Recurse -Force -Attributes !Directory | % {Get-FileHash $_.Fullname} | Out-File "c:\temp\report_file.txt"
File c:\temp\report_file.txt is something like this:
Algorithm Hash Path
--------- ---- ----
SHA256 E3B0C44298...E4649B934CA495991B7852B855 c:\temp\path\report1.txt
SHA256 7B989C1C95...6756624B3887E501DCC377DB23 c:\temp\path\report2.txt
SHA256 EA0155401C...A6D44F1DEBB95E401AEFF4F908 c:\temp\path\report3.txt
SHA256 06DAA0E452...32E3F3104EA4564EAB67CA6A0A c:\temp\path\report4.txt
**SHA256 9C7C9FEA96...45F460BA9015C8F0A5CA830B6B c:\temp\path\report5.txt**
All works fine.
Except:
I run this cmdlet many times per day. Files in this directory tree are deleted and re-created from time to time, and sometimes the order of the files in the output file is not the same.
In the example below, report5.txt should be on the last line of the report file, but it is on the second line. I suppose this is because the -Recurse option is selected, which I need. When I run the cmdlet on a directory with no subdirectories, the result is the same every time, but on a directory with subdirectories (a directory tree) it is not.
Algorithm Hash Path
--------- ---- ----
SHA256 E3B0C44298...E4649B934CA495991B7852B855 c:\temp\path\report1.txt
**SHA256 9C7C9FEA96...45F460BA9015C8F0A5CA830B6B c:\temp\path\report5.txt**
SHA256 7B989C1C95...6756624B3887E501DCC377DB23 c:\temp\path\report2.txt
SHA256 EA0155401C...A6D44F1DEBB95E401AEFF4F908 c:\temp\path\report3.txt
SHA256 06DAA0E452...32E3F3104EA4564EAB67CA6A0A c:\temp\path\report4.txt
Is there a way to sort all the data by the full path column before it is written to the report file?
You can sort by the path property of the hash object.
You can also run the files directly into Get-FileHash without using a loop, and I suggest exporting them to CSV instead of text, so it keeps the algorithm, hash and path separated so you can use them more easily:
Get-ChildItem -path "c:\temp\path" -Recurse -Force -File |
Get-FileHash |
Sort-Object -Property 'Path' |
Export-Csv -Path "c:\temp\report_file.csv" -NoTypeInformation
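Once the reports are CSV files, two runs can also be compared directly instead of by eye. A rough sketch, assuming both reports were produced with Export-Csv as above, both contain at least one row, and both have Path and Hash columns; the function name Compare-HashReport is made up for illustration.

```powershell
# Hypothetical helper: lists rows whose Path/Hash pair exists in only one report.
# SideIndicator '<=' means "only in the old report", '=>' "only in the new one".
function Compare-HashReport {
    param(
        [Parameter(Mandatory)] [string]$OldReport,
        [Parameter(Mandatory)] [string]$NewReport
    )

    $old = @(Import-Csv -LiteralPath $OldReport)
    $new = @(Import-Csv -LiteralPath $NewReport)

    Compare-Object -ReferenceObject $old -DifferenceObject $new -Property Path, Hash
}
```

A file whose content changed between runs shows up twice, once per side, with the same Path and different Hash values. Files that were added or deleted show up once.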
I am struggling to combine the output from two commands into a single CSV / TXT file.
The first command is to recursively search a folder and create an MD5 number for each document. This is then exported to a CSV file that includes the full path.
dir -recurse | Get-FileHash -Algorithm MD5 | Export-CSV MD5ofFolder.csv
The second command is to retrieve all the filenames within the folder (and sub-folders) WITHOUT including any pathing:
get-childitem -recurse|foreach {$_.name} > filename.txt
In a perfect world, I would be able to export a single CSV or TXT document that contains the MD5 values, the full path, and the filename (with extension).
I note that my second code string also produces the folder names in the output, which is not desirable. I am able to produce a text output without the folder names, but the code is ugly, and it doesn't do what I want:
dir -recurse | Get-FileHash -Algorithm MD5 | dir -recurse | foreach {$_.name} > filename.txt
I am sure this is a simple problem for someone smarter than me, so any and all help would be appreciated - I am VERY new to PowerShell.
Add the name to the output from Get-FileHash with Select-Object and a calculated property:
dir -recurse | Get-FileHash -Algorithm MD5 | Select-Object Hash,Path,@{Name='Name';Expression={[System.IO.Path]::GetFileName($_.Path)}} | Export-Csv filename.csv
Now you have it all in a single CSV.