How to detect block cipher mode - aes

How can I detect whether a message was encrypted in CBC or ECB mode?
I have written a function that encrypts with AES-128 in either CBC or ECB mode, chosen at random. I computed the Hamming distance between the clear text and the cipher text, but it does not seem to be correlated with the cipher mode.
How can I detect the block cipher mode?
Thank you in advance

The answer is pretty much given in the problem statement:
Remember that the problem with ECB is that it is stateless and
deterministic; the same 16 byte plaintext block will always produce
the same 16 byte cipher text.
Thus, assuming that the plaintext contains some repeated blocks aligned on block boundaries, we can simply go ahead and look for repeated ciphertext blocks.
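A minimal Python sketch of that check, assuming the ciphertext is available as raw bytes (decode hex or base64 first) and a 16-byte block size. The function names are illustrative and do not come from any library.

```python
from collections import Counter

BLOCK_SIZE = 16  # AES block size in bytes


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Cut data into consecutive chunks of `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def count_repeated_blocks(ciphertext: bytes, size: int = BLOCK_SIZE) -> int:
    """Number of blocks that duplicate an earlier block."""
    counts = Counter(split_blocks(ciphertext, size))
    return sum(n - 1 for n in counts.values() if n > 1)


def looks_like_ecb(ciphertext: bytes) -> bool:
    """Heuristic: any repeated 16-byte block suggests ECB."""
    return count_repeated_blocks(ciphertext) > 0
```

A ciphertext without repeats does not prove CBC; the plaintext may simply contain no repeated aligned blocks.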

I am doing the same problem set and just finished this problem (using Clojure).
My first hint is, it will be more clear what you need to do if you are using a language which supports first class functions/lambdas.
Anyways, let's break down the problem a bit:
First, just write a function which validates that a black box is encrypting data with ECB. How would you do this?
It might look something like (pseudocode below)
function boolean isEcbBlackbox(func f) {
    // what input can I use to determine this?
    result = f("chosen input")
    if (result ...) { // what property of result should I look for?
        return true
    } else {
        return false
    }
}
Remember, the key weakness of ECB is identical blocks of plaintext will be encrypted to identical blocks of ciphertext.
EDIT: The challenges are now public, so I will link to my solution(s):
https://github.com/dustinconrad/crypto-tutorial/blob/master/src/crypto_tutorial/lib/block.clj#L118
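One way the black-box test could be sketched in Python is shown here. This is not the linked Clojure solution. To stay self-contained it uses a toy keyed function built from SHA-256 as a stand-in for the AES block encryption (an assumption made only for illustration: it is not AES and cannot be decrypted), wrapped in ECB-style and CBC-style chaining. All names are hypothetical.

```python
import hashlib
import os

BLOCK = 16


def _toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    # Stand-in for AES: deterministic per (key, block), but NOT a real cipher.
    return hashlib.sha256(key + block).digest()[:BLOCK]


def _pad(data: bytes) -> bytes:
    n = BLOCK - len(data) % BLOCK
    return data + bytes([n]) * n


def toy_ecb(key: bytes, plaintext: bytes) -> bytes:
    data = _pad(plaintext)
    return b"".join(_toy_block_encrypt(key, data[i:i + BLOCK])
                    for i in range(0, len(data), BLOCK))


def toy_cbc(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    data, prev, out = _pad(plaintext), iv, []
    for i in range(0, len(data), BLOCK):
        mixed = bytes(a ^ b for a, b in zip(data[i:i + BLOCK], prev))
        prev = _toy_block_encrypt(key, mixed)  # chain into the next block
        out.append(prev)
    return b"".join(out)


def is_ecb_blackbox(f) -> bool:
    # Chosen input: several identical plaintext blocks.
    ciphertext = f(b"A" * (BLOCK * 4))
    blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    # Property to look for: identical ciphertext blocks.
    return len(set(blocks)) < len(blocks)
```

With `key = os.urandom(16)`, calling `is_ecb_blackbox(lambda pt: toy_ecb(key, pt))` reports ECB, while the CBC-style wrapper with a random IV does not. With a real AES library, only the two `toy_*` wrappers would need replacing.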

The AES block size is always 16 bytes (16, 24 and 32 bytes are the possible key sizes), so split the ciphertext into 16-byte blocks.
Compute the Hamming distance between cipher block 1 and each of the remaining cipher blocks.
Average the result per byte using floating-point arithmetic; if the value is below a certain threshold, then it is ECB.
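The measurement described above might look like this in Python. It is a sketch under assumptions: a 16-byte block size, raw ciphertext bytes, and hypothetical function names. The answer does not give a threshold, so none is hard-coded here. For reference, two unrelated random blocks differ in about 4 bits per byte on average.

```python
def hamming_bits(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def avg_distance_to_first_block(ciphertext: bytes, block_size: int = 16) -> float:
    """Average Hamming distance (bits per byte) from block 1 to every other block."""
    blocks = [ciphertext[i:i + block_size]
              for i in range(0, len(ciphertext) - block_size + 1, block_size)]
    if len(blocks) < 2:
        raise ValueError("need at least two full blocks")
    total = sum(hamming_bits(blocks[0], other) for other in blocks[1:])
    return total / ((len(blocks) - 1) * block_size)
```

The caller compares the returned value with a threshold of their own choosing. A value well below 4 means many blocks are identical to the first one.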

I know the exact exercise you're doing; I'm doing it myself right now. I would recommend doing frequency analysis on the encrypted strings (don't forget the string might be base64- or hex-encoded). If you get back a frequency distribution that matches the language of the string you encrypted, then it's safe to assume it's ECB; otherwise it's probably CBC.
I don't know if this will actually work as I'm just doing the exercise now, but it's a start.
EDIT:
I rushed this answer a bit and feel I should explain more. If the string has been encrypted in ECB mode, then the frequency analysis should show a normal-style distribution regardless of the key used and of any padding at the start or end of the string, whereas encryption in CBC mode should give a very random and probably flat distribution.

Related

Is there a technique to identify the hash algorithm/encoding that's being used?

For example, take this hash or encoding string:
[<åæ0®(k±¥Ò,X±}#ãqã Î(KmV
Is it possible to identify the hashing or encoding algorithm used to generate this string using some kind of algorithm?
Hash functions produce a result of a given size, so a 256 bit hash output must have been produced by a hash that produces at least 256 bits. That does not work the other way, since a longer output might be truncated before use -- the 256 bits may be taken from a 512 bit hash output. Beyond that it is basically a matter of trial and error.
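The size argument can be turned into a rough filter. The sketch below uses Python's `hashlib` to list the algorithms whose digest is at least as long as an observed output. The function name is hypothetical, and the result is only a list of candidates for trial and error, not an identification.

```python
import hashlib


def candidate_hashes(output_len_bytes: int) -> list:
    """Names of hashlib algorithms that could have produced an output of this size."""
    names = []
    for name in sorted(hashlib.algorithms_guaranteed):
        try:
            size = hashlib.new(name).digest_size
        except ValueError:
            continue  # algorithm disabled in this Python build
        # digest_size == 0 marks variable-length outputs (the SHAKE family).
        if size == 0 or size >= output_len_bytes:
            names.append(name)
    return names
```

For a 32-byte (256-bit) value, the list includes `sha256` and, because of possible truncation, longer hashes such as `sha512`, but not `md5` or `sha1`.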
Cyphers are designed to produce random-appearing output. Hence, the output from good cyphers appears random and you should not be able to tell which cypher was used to produce this particular random-appearing output.
The presence of IVs, nonces etc. and the cyphertext length might tell you something about what mode was used, and the block size for a block cypher.

Reduce the length of cipher text generated from RSA algorithm

I am generating a cipher text using the RSA algorithm and it's working fine. But the thing is, the cipher text generated is very long.
For example:
Plain text : 249488213
gets generated to,
Cipher text : 94489103D862769B7AE21EA42C2D400A584D0F919BBCAE2450AD1BE57EAC64E4A2F75FAB9F8FA25BCBC12AAAE58F43CCB071DC002332FF4C736F4DA96A36C3ED
which is too large for my use case, as it increases the size of my plain text file by a factor of approximately 2.5.
So my question is: can we reduce the length of the cipher text to some minimum length (regardless of the key size we use), or is there any other asymmetric algorithm that can help me achieve this?
Any help is appreciated. Thanks.
RSA encryption is described as
c = m^e mod N,
where c is a cipher text, m is an original message, e is public exponent (typically 65537) and N is public modulus.
Thus, c is always smaller than N, but in most cases it is of the same order of magnitude. Sure, you can select N and m to get a small c, but this will obviously make the encryption weaker, and you would need a special key for every message.
The same problem will probably arise with other asymmetric cryptosystems: a shorter cipher text is easier to recover. But you can use AES, which in counter mode can produce a cipher text of the same size as the original message. Note that this reveals the size of the message to the attacker.
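The relation c = m^e mod N can be tried out with a deliberately tiny textbook key. The numbers below (p = 61, q = 53, e = 17, d = 2753) are a common classroom example, chosen here as an assumption for illustration only. Such a key is far too small to be secure, and real RSA also applies padding, which this sketch omits.

```python
# Toy textbook RSA key, for illustration only.
P, Q = 61, 53
N = P * Q      # public modulus, 3233
E = 17         # public exponent
D = 2753       # private exponent: (E * D) % lcm(P - 1, Q - 1) == 1


def rsa_encrypt(m: int) -> int:
    """Textbook RSA: c = m^e mod N, so 0 <= c < N for any valid m."""
    if not 0 <= m < N:
        raise ValueError("message must be smaller than the modulus")
    return pow(m, E, N)


def rsa_decrypt(c: int) -> int:
    """Inverse operation: m = c^d mod N."""
    return pow(c, D, N)
```

Whatever the size of m, the result is a number between 0 and N - 1, which is why the ciphertext length follows the key size rather than the message size.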

Why must all inputs to AES be multiples of 16?

I'm using the PyCrypto implementation of AES and I'm trying to encrypt some text (24 bytes) using a 24 byte key.
from Crypto.Cipher import AES
aes_ecb = AES.new('\x00'*24, AES.MODE_ECB)
aes_ecb.encrypt("123456"*4)
I get this surprising error: ValueError: Input strings must be a multiple of 16 in length
So why is it that my input must be a multiple of 16? It would make more sense to me that the input string length must be a multiple of my key size, because this would allow nice bitwise operations between the key and blocks of plaintext.
AES is a block cipher. Quote from the Wikipedia page: “a block cipher is a deterministic algorithm operating on fixed-length groups of bits”.
AES can only work with blocks of 128 bits (that is, 16 bytes, as you noticed).
If your input length may not be a multiple of 128 bits then, depending on your application, you may have to be extremely careful about how you handle padding.
I just want to add some information about modes of operation.
Yes, AES is a 128-bit (16-byte) block cipher with multiple possible key lengths (128, 192 or 256 bits), but the cause of this padding limitation (and of the error message) is the ECB mode of operation. ECB is the simplest of the encryption modes. I don't know your goals, so I will just skip the part about it not providing serious message confidentiality.
CBC and CTR are more common and usually more appropriate, and in CTR mode the message length does not need to be a multiple of 128 bits.
There is also ciphertext stealing (CTS) method for ECB and CBC modes.
Method of using a block cipher mode of operation that allows for
processing of messages that are not evenly divisible into blocks
without resulting in any expansion of the ciphertext, at the cost of
slightly increased complexity
But Ciphertext stealing for ECB mode requires the plaintext to be longer than one 128-bit block.
Because the block size is 16 bytes, the way to handle this is to add padding when encrypting.
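A minimal sketch of such padding in Python, assuming the PKCS#7 scheme (the answers above do not name a specific scheme, so that choice is an assumption) and hypothetical function names:

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append N bytes of value N so the length becomes a multiple of block_size."""
    pad_len = block_size - (len(data) % block_size)  # always 1..block_size
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    """Strip PKCS#7 padding, rejecting malformed input."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("input is not a whole number of blocks")
    pad_len = padded[-1]
    if not 1 <= pad_len <= block_size or padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]
```

The 24-byte input from the question would be padded to 32 bytes before encryption. An input that is already a multiple of 16 bytes gains one full block of padding, so that unpadding stays unambiguous.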

Is there any classic 3 byte fingerprint function?

I need a checksum/fingerprint function for short strings (say, 16 to 256 bytes) which fits in a 24-bit word. Is there any well-known algorithm for that?
I propose to use a 24-bit CRC as an easy solution. CRCs are available in all lengths and always simple to compute. Wikipedia has a matching entry. The quality is far better than a modulo-reduced sum, because swapping characters will most likely produce a different CRC.
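As one possible instance, here is a bitwise Python sketch of the 24-bit CRC used by OpenPGP (initial value 0xB704CE, polynomial 0x1864CFB). Picking this particular variant is an assumption; the answer only says "a 24-bit CRC".

```python
CRC24_INIT = 0xB704CE
CRC24_POLY = 0x1864CFB  # includes the x^24 term


def crc24(data: bytes) -> int:
    """OpenPGP-style CRC-24, processed one bit at a time."""
    crc = CRC24_INIT
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:   # bit 24 set: reduce by the polynomial
                crc ^= CRC24_POLY
    return crc & 0xFFFFFF
```

Unlike a plain sum of bytes, swapping two different characters changes the result, for example `crc24(b"ab") != crc24(b"ba")`.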
The next step (if a wrong string with the same checksum is a real threat) would be a cryptographic MAC such as CMAC. Its standard output is too long, but it can be truncated to the first 24 bits.
Simplest thing to do is a basic checksum - add up the bytes in the string, mod (2^24).
You have to watch out for character set issues when converting to bytes though, so everyone agrees on the same encoding of characters to bytes.
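A short Python sketch of this checksum, with the encoding made an explicit parameter. The function name and the UTF-8 default are assumptions.

```python
def checksum24(text: str, encoding: str = "utf-8") -> int:
    """Sum of the encoded bytes, reduced modulo 2^24."""
    return sum(text.encode(encoding)) % (1 << 24)
```

Both sides must use the same encoding: `checksum24("é", "utf-8")` and `checksum24("é", "latin-1")` differ because the character is two bytes in one encoding and one byte in the other. Note also that reordering characters does not change the sum, so `checksum24("ab") == checksum24("ba")`.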

Binary integral data compression

I need to transmit integral data types over the network but don't want to transfer all 32 (or 64) bits all the time. The data fits into just one byte 99% of the time, so it looks like I need to compress it somehow. For example, the first bit of a byte is 0 if the other 7 bits hold the whole value (0-127); otherwise (if the first bit is 1) the reader needs to shift these 7 bits left and read the next byte, repeating the same process.
Is there some common way to do this? I don't want to reinvent a wheel...
Thank you.
The scheme you describe (which is essentially a base-128 encoding: each byte is a 7-bit base-128 "digit" and a single bit flag to indicate whether or not it is the final digit) is a common way of doing this.
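A minimal Python sketch of this base-128 scheme for unsigned integers, with the least significant 7-bit group first and the high bit of each byte set when more bytes follow. The function names are illustrative.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low group first."""
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: final byte
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0):
    """Return (value, next_offset) for the integer starting at data[offset]."""
    result, shift = 0, 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7
```

Values from 0 to 127 take a single byte. For example, 300 is encoded as the two bytes `AC 02`. Truncated input raises `IndexError` in this sketch.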
For example, see:
the section on "LEB128" in the DWARF spec (§7.6);
"Base 128 Varints" in Google's protocol buffers;
"Variable Width Integers" in the LLVM bitcode format (various different widths are used in various different places there).
Just about any data compression algorithm would be able to compress that kind of data stream very well. Use whatever compression libraries your language provides.
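As a rough illustration of that suggestion in Python, the sketch below packs small values as fixed 32-bit integers and passes the stream through `zlib`. The sample data and function names are assumptions, and the actual ratio depends on the data.

```python
import struct
import zlib


def pack_and_compress(values) -> bytes:
    """Pack integers as little-endian uint32 and deflate the byte stream."""
    raw = struct.pack("<%dI" % len(values), *values)
    return zlib.compress(raw)


def decompress_and_unpack(blob: bytes) -> list:
    """Reverse of pack_and_compress."""
    raw = zlib.decompress(blob)
    return list(struct.unpack("<%dI" % (len(raw) // 4), raw))
```

For a list of 1000 values that each fit in one byte, the raw stream is 4000 bytes, and most of it is zero bytes that a general-purpose compressor removes cheaply.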