Checksum between 2 files - hash

I have two files, let's call them "a.exe" and "b.exe". What I want is for "a.exe" to contain "b.exe"'s checksum and compare against it, so it knows if "b.exe" was edited, and vice versa.
What I already tried:
My problem is that when I include the checksum in "a.exe", its own checksum changes; the same happens if I include "a.exe"'s checksum in "b.exe".
Is there any workaround for this?
Basically, what I want is for both files to check each other, so that if one of them has been changed the other will know.

The following can work, although it's kind of a hack and it's not particularly secure. But depending on your use case, it could be good enough.
First, compile a.exe and b.exe. Then compute the checksum of each original file: append a.exe's checksum to the end of b.exe, and append b.exe's checksum to the end of a.exe. (Compute both checksums before appending anything, so each stored value describes the other file without its appended checksum.)
Now, to have a.exe verify the checksum of b.exe, you do the following in a.exe:
Open b.exe as a binary file.
Read the contents of b.exe, minus the last few bytes that contain the a.exe checksum, into memory.
Compute the checksum of that memory block.
Read the checksum for b.exe from the end of a.exe.
Compare the computed checksum against the value you got from the end of a.exe (a minimal sketch of these steps is shown after the copy command below).
This isn't secure, because somebody could modify the files and change the checksums.
To append the checksum to an exe in Windows:
copy /b a.exe + checksum.txt newa.exe
I don't know how to do it on other operating systems.
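Here is a rough sketch of the verification step in Python, just to make the byte handling concrete. The answer only says "checksum", so the algorithm (SHA-256), the raw 32-byte digest format, and the filenames are my assumptions; a real a.exe would do the equivalent in whatever language it is written in.
import hashlib

DIGEST_LEN = 32  # length of a raw SHA-256 digest

def other_file_is_intact(own_path, other_path):
    # Read the other file minus its trailing digest (those last bytes hold *our* digest).
    with open(other_path, "rb") as f:
        other_data = f.read()
    computed = hashlib.sha256(other_data[:-DIGEST_LEN]).digest()

    # Read the digest we stored for the other file from the end of our own file.
    with open(own_path, "rb") as f:
        f.seek(-DIGEST_LEN, 2)   # whence=2: seek relative to end of file
        expected = f.read(DIGEST_LEN)

    return computed == expected

# Inside a.exe:  other_file_is_intact("a.exe", "b.exe")
# Inside b.exe:  other_file_is_intact("b.exe", "a.exe")
The same function, called with the arguments swapped, gives the reverse check from b.exe.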


Add the hash of the code in executable file

I have an STM32 application which uses two blocks of memory. In the 0th block I have boot code (which runs just after power-on), and in the 7th block I have application code (which may or may not run, depending on the authorization decision made by the boot code).
These two programs are developed, and hence generated, by two separate projects. They are flashed to their specific blocks (boot code to the 0th block, application code to the 7th block) of the STM32 NOR flash using the openocd tool, by giving an offset value to openocd's write_image command.
What I would like to do in the boot code is basically this: calculate the hash of the application code and compare it with a reference digest. If they are equal, I will hand control over to the application code. For that, after I generate the executable (in elf, hex or bin format) of the application code, I want to:
Create another file (in any format listed above) which has 128K byte size
Copy the content of the executable file to the recently created file from its beginning (0 offset)
Write the hash of the executable to the last 32 bytes of the recently created file
Fill the gap with 0xFF
Finally, flash this file (if it can still be called an executable) to the 7th block of the memory
Do you think that it is doable and feasible? If so:
Which format should I use to generate the executable?
Do I have something that I need to give specific attention to achieve this?
Lastly, do you think that it makes sense to do that or is there any other more standard way for this purpose?
Thanks a lot in advance.
You just need to add an additional step to your build sequence. After linking, extract the binary file from the elf.
Then write a program in your favourite programming language which calculates the hash and appends the result to that bin file.
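As an illustration of that extra build step, here is a sketch in Python (my own, not from the answer). It assumes the binary has already been extracted from the elf (for example with objcopy -O binary), that the boot code hashes the first 128 KiB minus 32 bytes of the block and compares against the trailing 32 bytes, and that SHA-256 is used because its digest is exactly 32 bytes; filenames are illustrative.
import hashlib

BLOCK_SIZE = 128 * 1024   # size of the 7th block
DIGEST_LEN = 32           # last 32 bytes hold the digest

with open("app.bin", "rb") as f:            # binary extracted from the elf
    image = bytearray(f.read())

if len(image) > BLOCK_SIZE - DIGEST_LEN:
    raise SystemExit("application too large for the 128K block")

# Fill the gap with 0xFF (the erased-flash value), leaving room for the digest.
image += b"\xFF" * (BLOCK_SIZE - DIGEST_LEN - len(image))

# Write the hash into the last 32 bytes of the block.
image += hashlib.sha256(image).digest()

with open("app_padded.bin", "wb") as f:     # flash this file to the 7th block
    f.write(image)
Hashing the padded region (rather than only the raw binary) lets the boot code hash a fixed-size area without knowing the application's actual length; if you prefer to hash only the executable itself, you also need to store its length somewhere.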

Is Strawberry Perl safe?

Strawberry Perl lists SHA1 digests on its download page.
However, looking at download page snapshots on archive.org, their SHA1 digests for the same perl version and build seem to change over time.
Example: in the download page snapshot from 2013-05-10, strawberry-perl-5.16.3.1-32bit-portable.zip is shown to be 86.8 MB long with an SHA1 digest of 3b9c4c32bf29e141329c3be417d9c425a7f6c2ff.
In the download page snapshot from 2017-02-14, the same strawberry-perl-5.16.3.1-32bit-portable.zip is shown to be 87.3 MB long with an SHA1 digest of 7f6da2c3e1b7a808f27969976777f47a7a7c6544.
And on the current download page, the same strawberry-perl-5.16.3.1-32bit-portable.zip is shown to be 91.0 MB long with an SHA1 digest of 00ba29e351e2f74a7dbceaad5d9bc20159dd7003
I thought they might have recompiled the package for some reason, but the current strawberry-perl-5.10.0.6-portable.zip has only one file dated later than 2009 (it's portable.perl), so this doesn't explain why the archive grew over time. Sadly, I don't have older zip files, so I have no way of knowing what changed inside the archive.
What's going on here? Why do past builds change over time? I am kind of concerned that some hackers might be injecting malicious code or something into binary Perl packages...
Is there a rational explanation here? Thanks...
A hash such as an SHA1 digest is good for defending against communication errors, that is, for ensuring the integrity of what you downloaded (basically proving: "file on your hard disk" = "file on the webserver"), but by itself it does not ensure authentication.
For that, files should be signed with some PGP signature, or using X.509 certificates. This is the only way you could verify that the file was indeed produced by the true intended authors.
So by itself, your observation neither signals an attack nor, in fact, helps you defend against one.
Like @ikegami said, you can even configure compressors with a different RAM/time ratio, and the same one will then produce different results.
See for example in Unix zip:
-Z cm
--compression-method cm
Set the default compression method. Currently the main methods supported by zip are store and deflate. Compression method can be set to:
store - Setting the compression method to store forces zip to store entries with no compression. This is generally faster than compressing entries, but results in no space savings. This is the same as using -0 (compression level zero).
deflate - This is the default method for zip. If zip determines that storing is better than deflation, the entry will be stored instead.
bzip2 - If bzip2 support is compiled in, this compression method also becomes available. Only some modern unzips currently support the bzip2 compression method, so test the unzip you will be using before relying on archives using this method (compression method 12).
and
-#
(-0, -1, -2, -3, -4, -5, -6, -7, -8, -9)
Regulate the speed of compression using the specified digit #, where -0 indicates no compression (store all files), -1 indicates the fastest compression speed (less compression) and -9 indicates the slowest compression speed (optimal compression, ignores the suffix list). The default compression level is -6.
Though still being worked, the intention is this setting will control compression speed for all compression methods. Currently only deflation is controlled.
The same source content could simply have been recompressed over time whenever the website was regenerated: same content, but a different archive as a result.
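To make that concrete, here is a small Python sketch (my own illustration, not from the answer): zipping the same file with two different deflate levels produces byte-different archives, and therefore different SHA1 digests, even though the extracted content is identical. The file name is just an example; use any reasonably large file you have lying around.
import hashlib
import zipfile

def sha1_of(path):
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()

for level in (1, 9):
    name = f"demo-{level}.zip"
    with zipfile.ZipFile(name, "w", zipfile.ZIP_DEFLATED, compresslevel=level) as z:
        z.write("portable.perl")   # same input file both times
    print(level, sha1_of(name))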
I have downloaded all three files. They are in fact exactly the same size (either that, or the Wayback Machine did not store them correctly; it does not do a redirection), and all of them have the SHA1 hash 00ba29e351e2f74a7dbceaad5d9bc20159dd7003.
Your best bet is probably to ask the StrawberryPerl organization directly.

check if MAT file is corrupt without load

I have a data set consisting of a large number of .mat files. Each .mat file is of considerable size, i.e. loading them is time-consuming. Unfortunately, some of them are corrupt, and load('<name>') returns an error on those files. I have implemented a try-catch routine to determine which files are corrupt. However, given that only a handful of them are corrupt, loading every file just to check whether it is corrupt takes too long. Is there any way I can check the health of a .mat file without using load('<name>')?
I have been unsuccessful in finding such solution anywhere.
The matfile function is used to access variables in MAT-files, without loading them into memory. By changing your try-catch routine to use matfile instead of load, you reduce the overhead of loading the large files into the memory.
As matfile appears to only issue a warning when reading a corrupt file, you'll have to check if this warning was issued. This can be done using lastwarn: clear lastwarn before calling matfile, and check if the warning was issued afterwards:
lastwarn('');                    % reset the last warning
matfile(...);                    % open the file without loading its contents
[~, warnId] = lastwarn;          % identifier of any warning issued by matfile
if strcmp(warnId, 'relevantWarningId')
    % File is corrupt
end
You will have to find out the relevant warning id first, by running the above code on a corrupt file, and saving the warnId.
A more robust solution would be to calculate a checksum or hash (e.g. MD5) of the file upon creation, and comparing this checksum before reading the file.
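As a language-agnostic illustration of that idea (sketched in Python rather than MATLAB; the sidecar-file convention and names are purely my assumptions): write a checksum file next to each .mat file when it is created, and compare it before loading.
import hashlib

def file_md5(path):
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def record_checksum(mat_path):
    # Call this right after the .mat file is written.
    with open(mat_path + ".md5", "w") as f:
        f.write(file_md5(mat_path))

def is_intact(mat_path):
    # Call this before attempting to load the .mat file.
    with open(mat_path + ".md5") as f:
        return f.read().strip() == file_md5(mat_path)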

How to rewrite a file from a shell script without any danger of truncating the file if out of disk space?

How to rewrite a file from a shell script without any danger of truncating the file if out of disk space?
This handy perl one liner replaces all occurrences of "foo" with "bar" in a file called test.txt:
perl -pi -e 's/foo/bar/g' test.txt
This is very useful, but ...
If the file system where test.txt resides has run out of disk space, test.txt will be truncated to a zero-byte file.
Is there a simple, race-condition-free way to avoid this truncation occurring?
I would like the test.txt file to remain unchanged and the command to return an error if the file system is out of space.
Ideally the solution should be easily used from a shell script without requiring additional software to be installed (beyond "standard" UNIX tools like sed and perl).
Thanks!
In general, this can’t be done. Remember that the out-of-space condition can hit anywhere along the sequence of actions that give the appearance of in-place editing. Once the filesystem is full, perl may not be able to undo previous actions in order to restore the original state.
A safer way to use the -i switch is to use a nonempty backup suffix, e.g.,
perl -pi.bak -e 's/foo/bar/g' test.txt
This way, if something goes wrong along the way, you still have your original data.
If you want to roll your own, be sure to check the value returned from the close system call. As the Linux manual page notes,
Not checking the return value of close() is a common but nevertheless serious programming error. It is quite possible that errors on a previous write(2) operation are first reported at the final close(). Not checking the return value when closing the file may lead to silent loss of data. This can especially be observed with NFS and with disk quota.
As with everything else in life, leave yourself more margin for error. Disk is cheap. Dig out the pocket change from your couch cushions and go buy yourself half a terabyte or so.
From perldoc perlrun:
-i[extension]
specifies that files processed by the "<>" construct are to be edited in-place.
It does this by renaming the input file, opening the output file by the original
name, and selecting that output file as the default for print() statements. The
extension, if supplied, is used to modify the name of the old file to make a
backup copy, following these rules:
If no extension is supplied, no backup is made and the current file is
overwritten.
[…]
Rephrased:
The backup filename is determined from the value of the -i switch, if one is given.
The original file is renamed to that backup filename, and it is this renamed file that the script reads from. Renaming is atomic on most filesystems.
A file with the name of the original file is opened for writing. The file will start with length zero, but is not identical to the original file (which has a different name now).
After the script has finished, and if no explicit backup extension was provided, the backup file is deleted. The original file is then lost.
Should the system run out of drive space, then the new file is endangered, not the original file which was never copied or moved (at least on filesystems with an inode-like concept).
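The same protect-the-original idea can be applied by hand: write the modified content to a temporary file in the same directory, and only swap it into place once writing and closing have succeeded. Here is a sketch in Python (my own illustration, not one of the answers; the helper name and the transform are made up). If the disk fills up, the exception fires before the rename, so test.txt is left untouched.
import os
import tempfile

def safe_rewrite(path, transform):
    dir_name = os.path.dirname(os.path.abspath(path))
    with open(path, "rb") as f:
        data = f.read()
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(transform(data))
            tmp.flush()
            os.fsync(tmp.fileno())      # surface ENOSPC/write errors before the rename
        os.replace(tmp_path, path)      # atomic on POSIX filesystems
    except BaseException:
        os.unlink(tmp_path)             # original file is still intact
        raise

safe_rewrite("test.txt", lambda d: d.replace(b"foo", b"bar"))
Note that the replacement is a brand-new file, so the original's ownership and permissions are not automatically preserved; copy them over explicitly if that matters.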

Why do some sites have a md5 string on each file?

On some sites, each file in the download section has an md5. An md5 of what? I can't understand the purpose.
On phpBB.com, for example:
Download phpBB 3.0.6 (zip)
Size: 2.30 MiB
MD5: 63e2bde5bd03d8ed504fe181a70ec97a
It is the file's MD5 hash. The idea is that you can run MD5 against the downloaded file, then compare the result against that value to make sure you did not end up with a corrupted download.
This is a checksum, for verifying that the file as-downloaded is intact, without transmission errors. If the checksum listing is on a different server than the download, it also may give a little peace of mind that the download server hasn't been hacked (with the presumption that two servers are harder to hack than one).
It's a hash of the file. Used to ensure file integrity once you download said file. You'd use an md5 checksum tool to verify the file state.
Sites will post checksums so that you can make sure the file downloaded is the same as the file they're offering. This lets you ensure that file has not been corrupted or tampered with.
On most unix operating systems you can run md5 or md5sum on a file to get the hash for it. If the hash you get matches the hash from the website, you can be reasonably certain that the file is intact. A quick Google search will get you md5sum utilities for Windows.
You might also see an SHA-1 hash sometimes. It's the same concept, but a different and more secure algorithm.
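If you would rather script the check than run md5sum by hand, a small Python sketch like the following works (the local filename is an assumption; the expected digest is the phpBB value quoted in the question):
import hashlib

EXPECTED_MD5 = "63e2bde5bd03d8ed504fe181a70ec97a"

h = hashlib.md5()
with open("phpBB-3.0.6.zip", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)

print("OK" if h.hexdigest() == EXPECTED_MD5 else "MISMATCH: corrupted download?")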
This is an md5 hash of the entire binary contents of the file. The point is that if two files have different md5 hashes, they are different. This helps you determine whether a local file on your computer is the same as the file on the website, without having to download it again. For instance:
You downloaded your local copy somewhere else and think there might be a virus inside.
Your connection is lossy and you fear the file might have been corrupted during the download.
You have changed the local file name and want to know which version you have.