MD5 hashing algorithm, steps 2 and 3 - hash

I created a repository on GitHub to implement some computer security algorithms, and it's time to write the MD5 algorithm. I searched for papers/videos explaining the algorithm step by step with examples, but I didn't find any.
I wrote this for step 1, and I don't know whether it is correct:
//step 1
var textP = ToBinaryString(Encoding.UTF8, text);
textP = textP.Length < 448 ? textP + '1' : textP;
while (textP.Length < 448)
{
    textP += '0';
}
Console.WriteLine(textP);
Second, for step 2 (append length):
A 64-bit representation of b is appended to the result of the previous step.
The resulting message has a length that is an exact multiple of 512 bits.
Does this mean appending the original length in bits of the string to the 448 bits?

No, the first step is not correct, as it doesn't pad correctly when there are fewer than 65 bits left in the block. In that case the padding has to span two blocks - first put in a 1 bit and fill the rest of the block with zeros, then fill the next block with zeros up to the 448-bit mark.
The second sentence is unclear to me as quoted, but yes: the 64-bit encoding of the input size in bits needs to be appended after the padding has taken place.
Note that you're trying to recreate the algorithm description literally. That's not a good idea. You need to process blocks of input, keeping count of the number of bits or bytes, and then perform the padding and length encoding when the end of the stream is indicated. You need a 512-bit buffer plus update and final methods.
Creating the hash by using a string that represents binary digits is not a good idea. You should process bytes, and possibly words, internally. You only need to encode anything for debugging purposes.
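To make the byte-oriented approach concrete, here is a minimal sketch of the whole padding step in Go (a sketch only; the function name is mine, and note that MD5 stores the length little-endian, unlike the SHA family):

package main

import (
	"encoding/binary"
	"fmt"
)

// md5Pad appends MD5 padding to msg: a 0x80 byte (the single 1 bit,
// byte-aligned), zero bytes until the length is 56 mod 64, and finally
// the original length in bits as a little-endian 64-bit value.
func md5Pad(msg []byte) []byte {
	bitLen := uint64(len(msg)) * 8
	padded := append([]byte{}, msg...)
	padded = append(padded, 0x80)
	for len(padded)%64 != 56 { // may spill into a second block
		padded = append(padded, 0x00)
	}
	var length [8]byte
	binary.LittleEndian.PutUint64(length[:], bitLen)
	return append(padded, length[:]...)
}

func main() {
	padded := md5Pad([]byte("abc"))
	fmt.Printf("%d bytes = %d block(s)\n", len(padded), len(padded)/64)
}

The result is always a whole number of 64-byte blocks, and when 56 or more bytes of the last block are occupied, the padding automatically spills into a following block - exactly the case the answer above warns about.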

SHA256 Hashing with two blocks: Where should the hash of the first block be inputed?

I have just implemented an SHA256 generator, but am encountering problems with multi-block hashing. Could anyone help to clarify the problem, please?
For easy checking, we use this specific text input: "And to every beast of the earth, and to every fowl of th"
Since this is exactly 448 bits, it must be split into 2 blocks according to the padding rule (after the mandatory 1 bit, only 63 bits remain in the first block - one bit too few for the 64-bit length field).
My step by step outputs are as follows:
Original message in binary is: 0,1,0,0,0,0,0,1,0,1,1,0,1,1,1,0,0,1,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1,1,0,1,1,1,1,0,0,1,0,0,0,0,0,0,1,1,0,0,1,0,1,0,1,1,1,0,1,1,0,0,1,1,0,0,1,0,1,0,1,1,1,0,0,1,0,0,1,1,1,1,0,0,1,0,0,1,0,0,0,0,0,0,1,1,0,0,0,1,0,0,1,1,0,0,1,0,1,0,1,1,0,0,0,0,1,0,1,1,1,0,0,1,1,0,1,1,1,0,1,0,0,0,0,1,0,0,0,0,0,0,1,1,0,1,1,1,1,0,1,1,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1,1,0,1,0,0,0,0,1,1,0,0,1,0,1,0,0,1,0,0,0,0,0,0,1,1,0,0,1,0,1,0,1,1,0,0,0,0,1,0,1,1,1,0,0,1,0,0,1,1,1,0,1,0,0,0,1,1,0,1,0,0,0,0,0,1,0,1,1,0,0,0,0,1,0,0,0,0,0,0,1,1,0,0,0,0,1,0,1,1,0,1,1,1,0,0,1,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1,1,0,1,1,1,1,0,0,1,0,0,0,0,0,0,1,1,0,0,1,0,1,0,1,1,1,0,1,1,0,0,1,1,0,0,1,0,1,0,1,1,1,0,0,1,0,0,1,1,1,1,0,0,1,0,0,1,0,0,0,0,0,0,1,1,0,0,1,1,0,0,1,1,0,1,1,1,1,0,1,1,1,0,1,1,1,0,1,1,0,1,1,0,0,0,0,1,0,0,0,0,0,0,1,1,0,1,1,1,1,0,1,1,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1,1,0,1,0,0,0
message length is: 448 bits
Padded complete message is: 0,1,0,0,0,0,0,1,0,1,1,0,1,1,1,0,0,1,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1,1,0,1,1,1,1,0,0,1,0,0,0,0,0,0,1,1,0,0,1,0,1,0,1,1,1,0,1,1,0,0,1,1,0,0,1,0,1,0,1,1,1,0,0,1,0,0,1,1,1,1,0,0,1,0,0,1,0,0,0,0,0,0,1,1,0,0,0,1,0,0,1,1,0,0,1,0,1,0,1,1,0,0,0,0,1,0,1,1,1,0,0,1,1,0,1,1,1,0,1,0,0,0,0,1,0,0,0,0,0,0,1,1,0,1,1,1,1,0,1,1,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1,1,0,1,0,0,0,0,1,1,0,0,1,0,1,0,0,1,0,0,0,0,0,0,1,1,0,0,1,0,1,0,1,1,0,0,0,0,1,0,1,1,1,0,0,1,0,0,1,1,1,0,1,0,0,0,1,1,0,1,0,0,0,0,0,1,0,1,1,0,0,0,0,1,0,0,0,0,0,0,1,1,0,0,0,0,1,0,1,1,0,1,1,1,0,0,1,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1,1,0,1,1,1,1,0,0,1,0,0,0,0,0,0,1,1,0,0,1,0,1,0,1,1,1,0,1,1,0,0,1,1,0,0,1,0,1,0,1,1,1,0,0,1,0,0,1,1,1,1,0,0,1,0,0,1,0,0,0,0,0,0,1,1,0,0,1,1,0,0,1,1,0,1,1,1,1,0,1,1,1,0,1,1,1,0,1,1,0,1,1,0,0,0,0,1,0,0,0,0,0,0,1,1,0,1,1,1,1,0,1,1,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0
Message length after padding is: 1024 bits
Starting block 1 hash for partial message: 0,1,0,0,0,0,0,1,0,1,1,0,1,1,1,0,0,1,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1,1,0,1,1,1,1,0,0,1,0,0,0,0,0,0,1,1,0,0,1,0,1,0,1,1,1,0,1,1,0,0,1,1,0,0,1,0,1,0,1,1,1,0,0,1,0,0,1,1,1,1,0,0,1,0,0,1,0,0,0,0,0,0,1,1,0,0,0,1,0,0,1,1,0,0,1,0,1,0,1,1,0,0,0,0,1,0,1,1,1,0,0,1,1,0,1,1,1,0,1,0,0,0,0,1,0,0,0,0,0,0,1,1,0,1,1,1,1,0,1,1,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1,1,0,1,0,0,0,0,1,1,0,0,1,0,1,0,0,1,0,0,0,0,0,0,1,1,0,0,1,0,1,0,1,1,0,0,0,0,1,0,1,1,1,0,0,1,0,0,1,1,1,0,1,0,0,0,1,1,0,1,0,0,0,0,0,1,0,1,1,0,0,0,0,1,0,0,0,0,0,0,1,1,0,0,0,0,1,0,1,1,0,1,1,1,0,0,1,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1,1,0,1,1,1,1,0,0,1,0,0,0,0,0,0,1,1,0,0,1,0,1,0,1,1,1,0,1,1,0,0,1,1,0,0,1,0,1,0,1,1,1,0,0,1,0,0,1,1,1,1,0,0,1,0,0,1,0,0,0,0,0,0,1,1,0,0,1,1,0,0,1,1,0,1,1,1,1,0,1,1,1,0,1,1,1,0,1,1,0,1,1,0,0,0,0,1,0,0,0,0,0,0,1,1,0,1,1,1,1,0,1,1,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Block length: 512
SHA256 hash value of Block 1 is: 2,1,8,2,5,A,3,D,F,3,1,C,E,E,3,9,3,6,A,D,A,F,C,3,6,0,F,8,0,F,9,8,8,0,C,5,8,D,2,F,F,E,C,3,4,4,A,8,2,9,D,F,6,8,2,F,2,5,D,F,6,D,C,D
Could anyone please clarify these two questions for me:
Is the hash number for Block 1 correct?
If the Block 1 hash is all correct and good, where should this hash number 21825A3D... be input when Block 2 hashing is about to start?
Thanks very much for any help!
Is the hash number for Block 1 correct?
Yes. Congrats. It's not a number though; it is just a hash value, as the binary doesn't represent a single number value.
If the Block 1 hash is all correct and good, where should this hash number 21825A3D... be input when Block 2 hashing is about to start?
There are eight initial hash values, sometimes also called constants. They are named h0..h7 in the pseudocode on Wikipedia. The intermediate hash you found should replace them when hashing the second block. The final h0..h7 values are also the output of the hash (unfortunately, as having no final operation on the hash allows for length extension attacks).
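In other words, the per-block flow looks like this (a Go-flavored sketch; compress is a hypothetical stand-in for the real 64-round SHA-256 compression function, not a library call):

package main

import "fmt"

// compress stands in for the SHA-256 compression of one 512-bit block.
func compress(h [8]uint32, block [64]byte) [8]uint32 {
	// ... the 64 rounds of SHA-256 would go here ...
	return h
}

func main() {
	// The standard initial hash values h0..h7 from the SHA-256 spec.
	h := [8]uint32{
		0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
		0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19,
	}
	var block1, block2 [64]byte // the two 512-bit blocks after padding

	h = compress(h, block1) // intermediate hash: your 21825A3D... value
	h = compress(h, block2) // block 2 starts from block 1's output
	fmt.Printf("%08x\n", h) // h0..h7 after the last block is the digest
}

So the intermediate hash is never input into the message data of Block 2; it simply becomes the starting state for Block 2's compression.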

Struggling with Base 64 encoding in T-SQL - and the padding [duplicate]

What is the purpose of padding in base64 encoding? The following is an extract from Wikipedia:
"An additional pad character is allocated which may be used to force the encoded output into an integer multiple of 4 characters (or equivalently when the unencoded binary text is not a multiple of 3 bytes); these padding characters must then be discarded when decoding but still allow the calculation of the effective length of the unencoded text, when its input binary length is not a multiple of 3 bytes (the last non-pad character is normally encoded so that the last 6-bit block it represents will be zero-padded on its least significant bits; at most two pad characters may occur at the end of the encoded stream)."
I wrote a program which can base64-encode any string and decode any base64-encoded string. What problem does padding solve?
Your conclusion that padding is unnecessary is right. It's always possible to determine the length of the input unambiguously from the length of the encoded sequence.
However, padding is useful in situations where base64 encoded strings are concatenated in such a way that the lengths of the individual sequences are lost, as might happen, for example, in a very simple network protocol.
If unpadded strings are concatenated, it's impossible to recover the original data because information about the number of odd bytes at the end of each individual sequence is lost. However, if padded sequences are used, there's no ambiguity, and the sequence as a whole can be decoded correctly.
Edit: An Illustration
Suppose we have a program that base64-encodes words, concatenates them and sends them over a network. It encodes "I", "AM" and "TJM", sandwiches the results together without padding and transmits them.
I encodes to SQ (SQ== with padding)
AM encodes to QU0 (QU0= with padding)
TJM encodes to VEpN (VEpN with padding)
So the transmitted data is SQQU0VEpN. The receiver base64-decodes this as I\x04\x14\xd1Q) instead of the intended IAMTJM. The result is nonsense because the sender has destroyed information about where each word ends in the encoded sequence. If the sender had sent SQ==QU0=VEpN instead, the receiver could have decoded this as three separate base64 sequences which would concatenate to give IAMTJM.
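This is easy to reproduce in a few lines of Go using the standard encoding/base64 package (note that the unpadded concatenation has length 9, and 9 mod 4 = 1, so the decoder rejects it outright):

package main

import (
	"encoding/base64"
	"fmt"
)

func main() {
	// Padded encodings keep each word's boundary recoverable.
	for _, word := range []string{"I", "AM", "TJM"} {
		fmt.Println(base64.StdEncoding.EncodeToString([]byte(word)))
	}
	// The unpadded concatenation cannot be split back into words;
	// decoded as one sequence, it yields garbage or an error.
	_, err := base64.RawStdEncoding.DecodeString("SQQU0VEpN")
	fmt.Println(err)
}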
Why Bother with Padding?
Why not just design the protocol to prefix each word with an integer length? Then the receiver could decode the stream correctly and there would be no need for padding.
That's a great idea, as long as we know the length of the data we're encoding before we start encoding it. But what if, instead of words, we were encoding chunks of video from a live camera? We might not know the length of each chunk in advance.
If the protocol used padding, there would be no need to transmit a length at all. The data could be encoded as it came in from the camera, each chunk terminated with padding, and the receiver would be able to decode the stream correctly.
Obviously that's a very contrived example, but perhaps it illustrates why padding might conceivably be helpful in some situations.
On a related note, here's an arbitrary base converter I created for you. Enjoy!
https://convert.zamicol.com/
What are Padding Characters?
Padding characters help satisfy length requirements and carry no other meaning.
Decimal Example of Padding:
Given an arbitrary requirement that all strings be 8 characters in length, the number 640 can meet this requirement using leading 0's as padding characters, as they carry no meaning: "00000640".
Binary Encoding
The Byte Paradigm: For encoding, the byte is the de facto standard unit of measurement and any scheme must relate back to bytes.
Base256 fits exactly into the byte paradigm. One byte is equal to one character in base256.
Base16, hexadecimal or hex, uses 4 bits for each character. One byte can represent two base16 characters.
Base64 does not fit evenly into the byte paradigm (nor does base32), unlike base256 and base16. All base64 characters can be represented in 6 bits, 2 bits short of a full byte.
We can represent base64 encoding versus the byte paradigm as a fraction: 6 bits per character over 8 bits per byte. Reduced, this fraction is 3 bytes over 4 characters.
This ratio, 3 bytes for every 4 base64 characters, is the rule we want to follow when encoding base64. Base64 encoding can only promise even measuring in 3-byte bundles, unlike base16 and base256, where every byte can stand on its own.
So why is padding encouraged even though encoding could work just fine without the padding characters?
If the length of a stream is unknown, or if it could be helpful to know exactly when a data stream ends, use padding. The padding characters communicate explicitly that those extra spots should be empty and rule out any ambiguity. Even if the length is unknown, with padding you'll know where your data stream ends.
As a counterexample, some standards like JOSE don't allow padding characters. In that case, if something is missing, a cryptographic signature won't verify or other non-base64 characters (like the ".") will be missing. Although assumptions about length aren't made, padding isn't needed, because if anything is wrong it simply won't work.
And this is exactly what the base64 RFC says,
In some circumstances, the use of padding ("=") in base-encoded data
is not required or used. In the general case, when assumptions about
the size of transported data cannot be made, padding is required to
yield correct decoded data.
[...]
The padding step in base 64 [...] can, if improperly
implemented, lead to non-significant alterations of the encoded data.
For example, if the input is only one octet for a base 64 encoding,
then all six bits of the first symbol are used, but only the first
two bits of the next symbol are used. These pad bits MUST be set to
zero by conforming encoders, which is described in the descriptions
on padding below. If this property does not hold, there is no
canonical representation of base-encoded data, and multiple base-
encoded strings can be decoded to the same binary data. If this
property (and others discussed in this document) holds, a canonical
encoding is guaranteed.
Padding allows us to decode base64 encoding with the promise of no lost bits. Without padding, there is no longer the explicit acknowledgement of measuring in three-byte bundles. Without padding, you may not be able to guarantee exact reproduction of the original encoding without additional information, usually from somewhere else in your stack, like TCP, checksums, or other methods.
An alternative to bucket conversion schemes like base64 is radix conversion, which has no arbitrary bucket sizes and, for left-to-right readers, is left padded. The "iterative divide by radix" conversion method is typically employed for radix conversions.
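For what it's worth, here is a minimal sketch of that iterative divide-by-radix method in Go (the function name is mine; math/big keeps it short):

package main

import (
	"fmt"
	"math/big"
)

// toBase converts a big-endian byte string into digits of the given
// radix by repeatedly dividing by the radix and keeping the remainders.
func toBase(data []byte, radix int64) []int64 {
	n := new(big.Int).SetBytes(data)
	r := big.NewInt(radix)
	mod := new(big.Int)
	var digits []int64
	for n.Sign() > 0 {
		n.DivMod(n, r, mod)
		digits = append([]int64{mod.Int64()}, digits...) // most significant digit first
	}
	return digits
}

func main() {
	fmt.Println(toBase([]byte("foobar"), 64))
}

Unlike bucket schemes, the digit boundaries here no longer line up with byte boundaries at all, which is why radix conversion needs arbitrary-precision arithmetic over the whole input.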
Examples
Here is the example from RFC 4648 (https://www.rfc-editor.org/rfc/rfc4648#section-8).
Each character inside the "BASE64" function uses one byte (base256). We then translate that to base64.
BASE64("") = "" (No bytes used. 0 % 3 = 0)
BASE64("f") = "Zg==" (One byte used. 1 % 3 = 1)
BASE64("fo") = "Zm8=" (Two bytes. 2 % 3 = 2)
BASE64("foo") = "Zm9v" (Three bytes. 3 % 3 = 0)
BASE64("foob") = "Zm9vYg==" (Four bytes. 4 % 3 = 1)
BASE64("fooba") = "Zm9vYmE=" (Five bytes. 5 % 3 = 2)
BASE64("foobar") = "Zm9vYmFy" (Six bytes. 6 % 3 = 0)
Here's an encoder that you can play around with: http://www.motobit.com/util/base64-decoder-encoder.asp
There is not much benefit to it in the modern day. So let's look at this as a question of what the original historical purpose may have been.
Base64 encoding makes its first appearance in RFC 1421 dated 1993. This RFC is actually focused on encrypting email, and base64 is described in one small section 4.3.2.4.
This RFC does not explain the purpose of the padding. The closest we have to a mention of the original purpose is this sentence:
A full encoding quantum is always completed at the end of a message.
It does not suggest concatenation (the top answer here), nor ease of implementation, as an explicit purpose for the padding. However, considering the entire description, it is not unreasonable to assume that this may have been intended to help the decoder read the input in 32-bit units ("quanta"). That is of no benefit today; however, in 1993 unsafe C code would very likely have taken advantage of this property.
With padding, a base64 string always has a length that is a multiple of 4 (if it doesn't, the string has certainly been corrupted), and thus code can easily process the string in a loop that handles 4 characters at a time (always converting 4 input characters to three or fewer output bytes). So padding makes sanity checking easy (length % 4 != 0 ==> error, as that is not possible with padding), and it makes processing simpler and more efficient.
I know what people will think: even without padding, I can process all 4-character chunks in a loop and then just add special handling for the last 1 to 3 characters, if those exist. It's just a few lines of extra code and the speed difference will be too tiny to even measure. Probably true, but you are thinking in terms of C (or higher-level languages) and a powerful CPU with plenty of RAM. What if you need to decode base64 in hardware, using a simple DSP that has very limited processing power and no RAM storage, and you have to write the code in very limited micro-assembly? What if you cannot use code at all and everything has to be done with just transistors stacked together (a hardwired hardware implementation)? With padding, that's way simpler than without.
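To illustrate the simplicity argument, here is a minimal sketch in Go of a decoder that relies on padding to always consume whole 4-character quanta (the lookup is deliberately naive, and the names are mine):

package main

import (
	"fmt"
	"strings"
)

const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

// decodePadded relies on padding: the length check alone rejects
// truncated input, and the loop always consumes whole quanta.
func decodePadded(s string) ([]byte, error) {
	if len(s)%4 != 0 {
		return nil, fmt.Errorf("corrupt input: length %d is not a multiple of 4", len(s))
	}
	var out []byte
	for i := 0; i < len(s); i += 4 {
		v, n := 0, 0
		for _, c := range s[i : i+4] {
			if c == '=' {
				break // pad characters end the quantum
			}
			d := strings.IndexRune(alphabet, c)
			if d < 0 {
				return nil, fmt.Errorf("invalid character %q", c)
			}
			v = v<<6 | d
			n++
		}
		v <<= uint(6 * (4 - n))    // account for the skipped pad slots
		for j := 0; j < n-1; j++ { // n characters carry n-1 bytes
			out = append(out, byte(v>>uint(16-8*j)))
		}
	}
	return out, nil
}

func main() {
	b, err := decodePadded("Zm9vYg==")
	fmt.Printf("%q %v\n", b, err) // "foob" <nil>
}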
Padding fills the output out to a multiple of four characters in a defined way.

Size of binary file after base64 encoding? Need explanation on the solution

So I'm studying for the upcoming exam, and there's this question: given a binary file with a size of 31 bytes, what will its size be after encoding it to base64?
The solution the teacher gave us was (40 + 4) bytes, as it needs to be a multiple of 4.
I can't arrive at this solution, and I have no idea how to work it out, so I was hoping somebody could help me figure it out.
Base64 encoding divides the input data into six-bit blocks, and each block is encoded as one ASCII character.
If you have 31 bytes of input, you have 31*8/6 six-bit blocks to encode. As a rule of thumb, every three bytes of input produce 4 bytes of output.
If the input data is not a multiple of six bits, the base64 encoding fills the last block with 0 bits.
In your example you have 42 blocks of six bits, the last one filled with the missing 0 bits.
The base64 implementation then pads the encoded data with '=' symbols in order to have a multiple of 4 as the final result: 42 characters plus 2 pad characters is 44 bytes, i.e. (40 + 4).
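In Go the same arithmetic is built in, so the exam answer is easy to double-check:

package main

import (
	"encoding/base64"
	"fmt"
)

func main() {
	data := make([]byte, 31) // a 31-byte binary file
	// ceil(31 / 3) = 11 groups, 11 * 4 = 44 bytes.
	fmt.Println(base64.StdEncoding.EncodedLen(len(data)))     // 44
	fmt.Println(len(base64.StdEncoding.EncodeToString(data))) // 44
}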

Go, midstate SHA-256 hash

Having 128 bytes of data, for example:
00000001c570c4764aadb3f09895619f549000b8b51a789e7f58ea750000709700000000103ca064f8c76c390683f8203043e91466a7fcc40e6ebc428fbcc2d89b574a864db8345b1b00b5ac00000000000000800000000000000000000000000000000000000000000000000000000000000000000000000000000080020000
And wanting to perform a SHA-256 hash on it, one would have to separate it into two 64-byte halves of data and hash them individually before hashing the results together. If one were to often change some bits in the second half of the data, one could simplify the calculations and hash the first half of the data only once. How would one do that in Google Go? I tried calling:
func SingleSHA(b []byte) []byte {
	h := sha256.New()
	h.Write(b)
	return h.Sum(nil)
}
But instead of the proper answer
e772fc6964e7b06d8f855a6166353e48b2562de4ad037abc889294cea8ed1070
I got
12E84A43CBC7689AE9916A30E1AA0F3CA12146CBF886B60103AEC21A5CFAA268
When discussing the matter on Bitcoin forum, someone mentioned that there could be some problems with getting that midstate hash.
How do I calculate a midstate SHA-256 hash in Google Go?
Bitcoin-related byte operations are a bit tricky, as they tend to switch endianness on a whim. First off, we take the initial []byte array representing:
00000001c570c4764aadb3f09895619f549000b8b51a789e7f58ea750000709700000000103ca064f8c76c390683f8203043e91466a7fcc40e6ebc428fbcc2d89b574a864db8345b1b00b5ac00000000000000800000000000000000000000000000000000000000000000000000000000000000000000000000000080020000
Then, we separate out the first half of the array, obtaining:
00000001c570c4764aadb3f09895619f549000b8b51a789e7f58ea750000709700000000103ca064f8c76c390683f8203043e91466a7fcc40e6ebc428fbcc2d8
After that, we need to swap some bytes around. We reverse the order of the bytes in every slice of 4 bytes, thus obtaining:
0100000076C470C5F0B3AD4A9F619598B80090549E781AB575EA587F977000000000000064A03C10396CC7F820F8830614E94330C4FCA76642BC6E0ED8C2BC8F
And that is the array we will be using for calculating the midstate. Now, we need to alter the file hash.go, adding this to the type Hash interface:
Midstate() []byte
And change the file sha256.go, adding this function:
func (d *digest) Midstate() []byte {
	var answer []byte
	for i := 0; i < len(d.h); i++ {
		answer = append(answer, Uint322Hex(d.h[i])...)
	}
	return answer
}
Where Uint322Hex converts a uint32 variable into a []byte variable. Having all that, we can call:
var h BitSHA.Hash = BitSHA.New()
h.Write(Str2Hex("0100000076C470C5F0B3AD4A9F619598B80090549E781AB575EA587F977000000000000064A03C10396CC7F820F8830614E94330C4FCA76642BC6E0ED8C2BC8F"))
log.Printf("%X", h.Midstate())
Where Str2Hex turns a string into []byte. The result is:
69FC72E76DB0E764615A858F483E3566E42D56B2BC7A03ADCE9492887010EDA8
Remembering the proper answer:
e772fc6964e7b06d8f855a6166353e48b2562de4ad037abc889294cea8ed1070
We can compare them:
69FC72E7 6DB0E764 615A858F 483E3566 E42D56B2 BC7A03AD CE949288 7010EDA8
e772fc69 64e7b06d 8f855a61 66353e48 b2562de4 ad037abc 889294ce a8ed1070
So we can see that we just need to swap the bytes around in each slice of 4 bytes, and we will have the proper "midstate" used by Bitcoin pools and miners (until it is no longer needed due to deprecation).
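That per-4-byte-word reversal, used both on the input at the start and on the midstate here at the end, can be written as a small helper (a sketch; the function name is mine):

// swapWords reverses the byte order inside every 4-byte word, which is
// the transformation applied above.
func swapWords(b []byte) []byte {
	out := make([]byte, len(b))
	for i := 0; i+4 <= len(b); i += 4 {
		out[i], out[i+1], out[i+2], out[i+3] = b[i+3], b[i+2], b[i+1], b[i]
	}
	return out
}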
The Go code you have is the right way to compute sha256 of a stream of bytes.
Most likely the answer is that what you want to do is not sha256. Specifically:
one would have to separate it into two 64-byte halves of data and hash them individually before hashing the results together. If one were to often change some bits in the second half of the data, one could simplify the calculations and hash the first half of the data only once.
is not a valid way to calculate sha256 (read http://doc.golang.org/src/pkg/crypto/sha256/sha256.go to see, for example, that sha256 does its work on blocks of data, which must be padded, etc.).
The algorithm you described calculates something, but not sha256.
Since you know the expected value you presumably have some reference implementation of your algorithm in another language so just do a line-by-line port to Go.
Finally, it's a dubious optimization in any case. Hashing cost is usually proportional to the size of the data, and here the data is only 128 bytes. The cost is so small that the additional work of trying to be clever by splitting the data into two 64-byte parts will likely cost more than what you save.
In sha256.go, at the start of function Sum() the implementation is making a copy of the SHA256 state. The underlying datatype of SHA256 (struct digest) is private to the sha256 package.
I would suggest making your own private copy of the sha256.go file (it is a small file). Then add a Copy() function to save the current state of the digest:
func (d *digest) Copy() hash.Hash {
	dCopy := *d
	return &dCopy
}
Then simply call the Copy() function to save a midstate SHA256 hash.
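As an alternative to patching sha256.go: in recent versions of Go, the standard crypto/sha256 digest implements encoding.BinaryMarshaler and encoding.BinaryUnmarshaler, so the midstate can be snapshotted and restored without any custom code. A minimal sketch:

package main

import (
	"crypto/sha256"
	"encoding"
	"fmt"
)

func main() {
	data := make([]byte, 128)

	h := sha256.New()
	h.Write(data[:64]) // hash the invariant first 64 bytes only once

	// Snapshot the internal state after the first block.
	state, err := h.(encoding.BinaryMarshaler).MarshalBinary()
	if err != nil {
		panic(err)
	}

	for i := byte(0); i < 3; i++ {
		data[127] = i // vary some bits in the second half

		// Restore the midstate instead of re-hashing the first half.
		h2 := sha256.New()
		if err := h2.(encoding.BinaryUnmarshaler).UnmarshalBinary(state); err != nil {
			panic(err)
		}
		h2.Write(data[64:])
		fmt.Printf("%x\n", h2.Sum(nil))
	}
}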
I ran two Go benchmarks on your 128 bytes of data, using an Intel i5 2.70 GHz CPU. First, 1,000 times, I wrote all 128 bytes to the SHA256 hash and read the sum, which took a total of about 9,285,000 nanoseconds. Second, I wrote the first 64 bytes to the SHA256 hash once and then, 1,000 times, I wrote the second 64 bytes to a copy of the SHA256 hash and read the sum, which took a total of about 6,492,371 nanoseconds. The second benchmark, which assumed the first 64 bytes are invariant, ran in 30% less time than the first benchmark.
Using the first method, you could calculate about 9,305,331,179 SHA256 128-byte sums per day before buying a faster CPU. Using the second method, you could calculate 13,307,927,103 SHA256 128-byte sums per day, assuming the first 64 bytes are invariant 1,000 times in a row, before buying a faster CPU. How many SHA256 128-byte sums per day do you need to calculate? For how many SHA256 128-byte sums per day are the first 64 bytes invariant?
What benchmarks did you run and what were the results?
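For reference, here is roughly how such benchmarks can be written with Go's testing package (a sketch; the midstate variant uses the state snapshot approach described above):

package midstate_test

import (
	"crypto/sha256"
	"encoding"
	"testing"
)

var data = make([]byte, 128)

// BenchmarkFullHash hashes all 128 bytes from scratch each iteration.
func BenchmarkFullHash(b *testing.B) {
	for i := 0; i < b.N; i++ {
		h := sha256.New()
		h.Write(data)
		h.Sum(nil)
	}
}

// BenchmarkMidstate hashes the first 64 bytes once, then restores that
// midstate each iteration and hashes only the second half.
func BenchmarkMidstate(b *testing.B) {
	h := sha256.New()
	h.Write(data[:64])
	state, _ := h.(encoding.BinaryMarshaler).MarshalBinary()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		h2 := sha256.New()
		h2.(encoding.BinaryUnmarshaler).UnmarshalBinary(state)
		h2.Write(data[64:])
		h2.Sum(nil)
	}
}

Run with "go test -bench ." to compare the two on your own hardware.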

Jpeg huffman coding procedure

A Huffman table, in the JPEG standard, is generated from a collection of statistics in two steps. One of the steps is implementing the function/method given by this picture (Figure K.3 in Annex K of the JPEG standard):
Here is the problem. Earlier, the standard (Annex C) says this:
Huffman tables are specified in terms of a 16-byte list (BITS) giving the number of codes for each code length from
1 to 16. This is followed by a list of the 8-bit symbol values (HUFFVAL), each of which is assigned a Huffman code.
Obviously BITS is a list of 16 elements. But in the picture above, i is first set to 32 (i=32), and then we access BITS[i]. I have probably misunderstood something, so could someone please give me an answer?
Here is JPEG standard description of picture:
Figure K.3 gives the procedure for adjusting the BITS list so that no code is longer than 16 bits. Since symbols are paired
for the longest Huffman code, the symbols are removed from this length category two at a time. The prefix for the pair
(which is one bit shorter) is allocated to one of the pair; then (skipping the BITS entry for that prefix length) a code word
from the next shortest non-zero BITS entry is converted into a prefix for two code words one bit longer. After the BITS
list is reduced to a maximum code length of 16 bits, the last step removes the reserved code point from the code length
count.
Here is my code for the picture above:
#include <vector>
using std::vector;

// Figure K.3: adjust BITS so that no code is longer than 16 bits.
// Note that BITS is indexed 1..32 here, so the vector needs 33 entries.
void adjustBitLengthTo16Bits(vector<char> &BITS) {
    int i = 32, j = 0;
    while (true) {
        if (BITS[i] > 0) {
            // Find the next shorter non-zero length, skipping i-1
            // (the prefix length for the removed pair).
            j = i - 1;
            j--;
            while (BITS[j] <= 0)
                j--;
            // Remove a pair of codes from length i; one becomes a code
            // of length i-1, and one code of length j becomes a prefix
            // for two codes of length j+1.
            BITS[i] = BITS[i] - 2;
            BITS[i - 1] = BITS[i - 1] + 1;
            BITS[j + 1] = BITS[j + 1] + 2;
            BITS[j] = BITS[j] - 1;
            continue;
        } else {
            i--;
            if (i != 16)
                continue;
            // Finally, remove the reserved code point from the
            // code length count.
            while (BITS[i] == 0)
                i--;
            BITS[i]--;
            return;
        }
    }
}
This code is only for encoders that want to generate their own custom Huffman tables. The majority of JPEG encoders just use fixed tables that are reasonable approximations of the statistics of most images. In this particular case, the first step in generating a Huffman table for the AC coefficients produces a table with code lengths of up to 32 bits. Since there are only 256 unique symbols to encode (skip/length pairs), no more than 32 bits should ever be needed to specify all of the Huffman codes. After the first pass has produced a set of codes (up to 32 bits in length), the second pass takes the least frequent (longest) codes and "moves" them into shorter-length slots so that the maximum code length is 16 bits. In an ideal Huffman table, the frequency distributions correspond to the code lengths. In this case, the table is being made to fit by squeezing the longest codes into slots reserved for shorter codes. This can be done because the 14/15/16-bit Huffman codes have "room" for more permutations of bits and can "fit" the longer codes in them.
Update:
There is limited benefit to "optimizing" the Huffman tables in JPEG. Most of the compression occurs because of the quantization and DCT transform of the pixels. Switching to arithmetic coding has a measurable benefit (~10% size reduction), but then it limits the audience since most JPEG decoders don't support arithmetic coding due to past patent issues.