Block encoding format in PKCS - RSA

In the PKCS block encoding format for RSA, what is the difference between block type 01 and block type 02? And when is each used?

From https://www.rfc-editor.org/rfc/rfc2313#section-8.1 (Thanks, James):
8.1 Encryption-block formatting
A block type BT, a padding string PS, and the data D shall be
formatted into an octet string EB, the encryption block.
EB = 00 || BT || PS || 00 || D . (1)
The block type BT shall be a single octet indicating the structure of
the encryption block. For this version of the document it shall have
value 00, 01, or 02. For a private-key operation, the block type
shall be 00 or 01. For a public-key operation, it shall be 02.
It's worth noting that the phrase "block type" does not appear in the PKCS#1 v2.0 RFC (https://www.rfc-editor.org/rfc/rfc2437)
Later we see the verification half:
9.4 Encryption-block parsing
The encryption block EB shall be parsed into a block type BT, a
padding string PS, and the data D according to Equation (1).
It is an error if any of the following conditions occurs:
The encryption block EB cannot be parsed
unambiguously (see notes to Section 8.1).
The padding string PS consists of fewer than eight
octets, or is inconsistent with the block type BT.
The decryption process is a public-key operation
and the block type BT is not 00 or 01, or the decryption
process is a private-key operation and the block type is
not 02.
(emphasis mine)
Using the terminology from RFC2313:
Signature Generation is an encryption process which is a private-key operation (BT=01)
Signature Verification is a decryption process which is a public-key operation (BT=01)
Initiate Key Transfer / Enveloping is an encryption process which is a public-key operation (BT=02)
Receive Key Transfer is a decryption process which is a private-key operation (BT=02).
RFC 2437 (PKCS#1 2.0) replaced many of these terms, and got rid of the block type notion altogether. That byte is simply dictated to be 01 for PKCS#1 signature formatting, and dictated to be 02 for PKCS#1 encryption. That byte isn't fixed for either OAEP (encryption) or PSS (signing) padding schemes.
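As an illustration (my own sketch, not from the RFC), here is how the two encryption-block formats from Equation (1) could be built in Python for a k-octet modulus; note that PS must be at least eight octets, so D can be at most k-11 octets:

import os

def eb_block_type_01(data, k):
    # Private-key operation (e.g. signature generation): PS is all FF,
    # so the encryption block is deterministic.
    ps = b"\xff" * (k - len(data) - 3)
    return b"\x00\x01" + ps + b"\x00" + data

def nonzero_random(n):
    # PS for block type 02 must be pseudorandom and must not contain zero octets.
    out = bytearray()
    while len(out) < n:
        out.extend(x for x in os.urandom(n) if x != 0)
    return bytes(out[:n])

def eb_block_type_02(data, k):
    # Public-key operation (e.g. enveloping a session key): PS is random,
    # so encrypting the same data twice yields different blocks.
    ps = nonzero_random(k - len(data) - 3)
    return b"\x00\x02" + ps + b"\x00" + data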

Sign sha256 hash with RSA using pkcs#11 api?

I'm currently using pyPkcs11 to sign files.
The following call works for signing common files with RSA and sha256,
session.sign(privKey, toSign, Mechanism(CKM_SHA256_RSA_PKCS, None))
But some of my files are already hashed (sha256), and the signature needs to give the same output that would be given by this openSSL command:
openssl pkeyutl -sign -pkeyopt digest:sha256 -in <inFilePath> -inkey <keyPath> -out <outFilePath>
I have tried the following call, which does not generate a hash of the file before signature,
session.sign(privKey, toSign, Mechanism(CKM_RSA_PKCS, None))
But the result is not the one I expected, and according to the first answer of this post, CKM_RSA_PKCS vs CKM_RSA_X_509 mechanisms in PKCS#11:
CKM_RSA_PKCS on the other hand also performs the padding as defined in the PKCS#1 standards. This padding is defined within EMSA-PKCS1-v1_5, steps 3, 4 and 5. This means that this mechanism should only accept messages that are 11 bytes shorter than the size of the modulus. To create a valid RSASSA-PKCS1-v1_5 signature, you need to perform steps 1 and 2 of EMSA-PKCS1-v1_5 yourself.
After some research, it appears that my file contains the result of the first step of the signature described by RFC 3447, so the missing part is the second one, where the ASN.1 value is generated.
Can I force this operation with PKCS#11, and how?
The PKCS#11 documentation doesn't seem to contain any information about it.
I see two ways to do this; the appropriate one depends on the token (not all tokens/wrappers do all mechanisms).
As explained in this other answer, you could decorate the 32-octet SHA-256 that you start from into a full padded message representative. Basically, as explained in PKCS#1, if the RSA key is k octets (where I assume k≥51+11=62 octets, that is a public modulus of at least 8⋅62-7=489 bits, which is a must for security), you:
1. Append on the left the 19-octet string
30 31 30 0d 06 09 60 86 48 01 65 03 04 02 01 05 00 04 20
yielding a 51-octet string, which really is the ASN.1 DER encoding for the hash type and value per
DigestInfo ::= SEQUENCE {
    digestAlgorithm DigestAlgorithm,
    digest OCTET STRING
}
2. Further append on the left the string of k-51 octets (of which k-51-3 are FF)
00 01 FF FF FF FF..FF FF FF FF 00
yielding a k-octet string
3. Sign with mechanism CKM_RSA_X_509 (length k octets).
Or, alternatively: perform as in [1.]; skip [2.]; and in [3.] use mechanism CKM_RSA_PKCS (length 51 octets).
Disclaimer: I did not check, and have not used a PKCS#11 device lately.
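To make the steps above concrete, here is a minimal Python sketch, assuming a 2048-bit key (k = 256 octets), the same pyPkcs11/PyKCS11 session and privKey objects as in the question, and that toSign already holds the 32-octet SHA-256 value (constant and class names as exposed by PyKCS11):

from PyKCS11 import Mechanism, CKM_RSA_X_509, CKM_RSA_PKCS

# 19-octet ASN.1 DER prefix for SHA-256 (step 1 above)
SHA256_PREFIX = bytes.fromhex("3031300d060960864801650304020105000420")

def pad_emsa_pkcs1_v15(digest, k=256):
    t = SHA256_PREFIX + digest              # 51 octets
    ps = b"\xff" * (k - len(t) - 3)         # k-51-3 octets of FF (step 2)
    return b"\x00\x01" + ps + b"\x00" + t   # k octets in total

em = pad_emsa_pkcs1_v15(toSign)
signature = bytes(session.sign(privKey, em, Mechanism(CKM_RSA_X_509, None)))  # step 3

# Alternative from above: skip step 2 and let the token do the padding:
# signature = bytes(session.sign(privKey, SHA256_PREFIX + toSign, Mechanism(CKM_RSA_PKCS, None)))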
Note: While there is no known attack against proper implementations of it, use of PKCS#1 v1.5 signature padding is increasingly frowned upon; e.g., French authorities recommend:
RecomSignAsym-1. Il est recommandé d’employer des mécanismes de signature asymétrique disposant d’une preuve de sécurité.
Or, in English:
It is recommended to use asymmetric signature mechanisms featuring a security proof.
They mention RSA-SSA-PSS (sic). As a bonus, the PKCS#11 implementation of that is mechanism CKM_RSA_PKCS_PSS which accepts a hash, rather than the data to sign, making what's asked trivial.
I'm afraid there is no PKCS#11 function which can do what you want.
The only solution that I am aware of would be to apply the PKCS#1 v1.5 padding manually to your hash and then sign the block using the CKM_RSA_X_509 (raw or textbook RSA) mechanism.
Rather than issuing the command,
openssl pkeyutl -sign -pkeyopt digest:sha256 -in <inFilePath> -inkey <keyPath> -out <outFilePath>
try
openssl dgst -sha256 -sign <keyPath> -out <signature file> <inFilePath>
This produces a 256-byte signature (for a 2048-bit key) that would probably be identical to the one generated by the PKCS#11 C_Sign call.

Interpreting ASN.1 indefinite-length encoding with multiple encapsulated octet strings

I have a BER structure like this...
$ openssl asn1parse -inform der -in test.der -i -dump
????:d=4 hl=2 l=inf cons: cont [ 0 ]
????:d=5 hl=3 l= 240 prim: OCTET STRING
0000 - AABBCCDD
????:d=5 hl=2 l= 8 prim: OCTET STRING
0000 - EEFF
????:d=5 hl=2 l= 0 prim: EOC
...or in der2ascii style...
[0] `80`
OCTET_STRING { `AABBCCDD` }
OCTET_STRING { `EEFF` }
`0000`
What I know: indefinite-length encoding must use a constructed type, because a primitive type could introduce ambiguities, e.g. when its contents contain 0x0000. What I want to know: How must a decoder behave when parsing this BER structure? Are the header bytes of both OCTET STRINGs included in the encoding? If yes, how is indefinite-length byte data encoded? And how does an application interpret the value of the TLV field tagged [0] when the second OCTET STRING is, e.g., an INTEGER?
I am asking this question, because in the CMS standard, a field is defined as single OCTET STRING, but in most BER encodings I always see two of them. Is this only due to the indefinite-length encoding? Am I missing something?
From ITU-T X.690:
8.1.4 Contents octets
The contents octets shall consist of zero, one or more octets, and shall encode the data value as specified in
subsequent clauses.
NOTE – The contents octets depend on the type of the data value;
subsequent clauses follow the same sequence as the definition of types
in ASN.1.
Does this mean that I can put any constructed type there and the application must only interpret the value part of the constructed TLV structure?
When you encode a primitive OCTET STRING in indefinite length mode, the encoder must:
split up the value into chunks of smaller OCTET STRINGs
encode each chunk in definite length mode so that each has its own TLV (with length!)
the whole sequence of definite length encoded primitive OCTET STRINGs must be framed by a single, indefinite length encoded constructed OCTET STRING "container" having its own TLV (without a definite length, but with an end-of-contents sentinel)
At the other end, the decoder extracts the V part from the inner, definite length OCTET STRING chunks (dropping their TL headers), then joins the V parts together in order of arrival, also dropping the TL part of the outer frame.
Note that the idea behind indefinite length encoding technique is that both encoder and decoder can emit/consume incomplete, possibly oversized, data.
Chunk size is chosen by the encoder/application based on data availability, memory situation and possibly the estimation of decoder's buffering capabilities. I think this is mentioned somewhere in the X.280/X.680 papers.
Encoder is not allowed to put chunks of different ASN.1 types into any single indefinite length encoded container. In other words, all chunks must be of the same type as the outer container.
That should hopefully explain why you may see multiple (depending on chunk size) OCTET STRINGs in the indefinite length encoded BER/CER stream where just a single OCTET STRING is expected.
DER forbids indefinite length encoding on the grounds that serialized representation of the same data may change on re-encoding (due to potentially changing chunk size).
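As an illustration of the above (my own sketch, not part of the answer), here is how such a frame can be produced and re-assembled in Python:

def encode_indefinite_octet_string(data, chunk_size=1000):
    out = bytearray(b"\x24\x80")                  # constructed OCTET STRING, indefinite length
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        if len(chunk) < 0x80:                     # short-form length
            out += b"\x04" + bytes([len(chunk)]) + chunk
        else:                                     # long-form length
            n = (len(chunk).bit_length() + 7) // 8
            out += b"\x04" + bytes([0x80 | n]) + len(chunk).to_bytes(n, "big") + chunk
    return bytes(out + b"\x00\x00")               # end-of-contents octets

def decode_indefinite_octet_string(encoded):
    assert encoded[:2] == b"\x24\x80"
    pos, value = 2, bytearray()
    while encoded[pos:pos + 2] != b"\x00\x00":
        assert encoded[pos] == 0x04               # every chunk must be a primitive OCTET STRING
        length, pos = encoded[pos + 1], pos + 2
        if length & 0x80:                         # long-form length
            n = length & 0x7F
            length, pos = int.from_bytes(encoded[pos:pos + n], "big"), pos + n
        value += encoded[pos:pos + length]        # keep only the V parts of the chunks
        pos += length
    return bytes(value)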

SignatureValue calculation for XML-DSIG

I am trying to write a method that returns a signature of an XML element for XML-DSIG using .NET framework components (RSACryptoServiceProvider) in C++/CLI. Could someone please explain this excerpt from the XML-DSIG spec ( http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/ ) in simpler words? I have very little programming and maths background and therefore have trouble understanding it. Or could you provide an excerpt from real code as an example where this is implemented?
The SignatureValue content for an RSA signature is the base64 [MIME]
encoding of the octet string computed as per RFC 2437 [PKCS1, section
8.1.1: Signature generation for the RSASSA-PKCS1-v1_5 signature scheme]. As specified in the EMSA-PKCS1-V1_5-ENCODE function RFC 2437
[PKCS1, section 9.2.1], the value input to the signature function MUST
contain a pre-pended algorithm object identifier for the hash
function, but the availability of an ASN.1 parser and recognition of
OIDs is not required of a signature verifier. The PKCS#1 v1.5
representation appears as: CRYPT (PAD (ASN.1 (OID, DIGEST (data))))
Note that the padded ASN.1 will be of the following form: 01 | FF*
| 00 | prefix | hash where "|" is concatenation, "01", "FF", and "00"
are fixed octets of the corresponding hexadecimal value, "hash" is the
SHA1 digest of the data, and "prefix" is the ASN.1 BER SHA1 algorithm
designator prefix required in PKCS1 [RFC 2437], that is, hex 30 21
30 09 06 05 2B 0E 03 02 1A 05 00 04 14 This prefix is included to make
it easier to use standard cryptographic libraries. The FF octet MUST
be repeated the maximum number of times such that the value of the
quantity being CRYPTed is one octet shorter than the RSA modulus.
In other words, if I have the hash value for a certain XML element (not encoded in base64, is that right?), what do I do with it before sending it to the SignHash function (in RSACryptoServiceProvider)?
I know it's in the text, but I have troubles understanding it.
I don't understand "CRYPT (PAD (ASN.1 (OID, DIGEST (data))))" at all, although I understand parts of it... I don't understand the way to get the OID, then the ASN.1, and how to pad it...
Let me try to explain the components, and see if this gets you any closer:
DIGEST(data) is the hash-value you already computed
OID is a globally unique identifier representing the hash-algorithm used. For SHA1 this is 1.3.14.3.2.26
ASN.1 means ASN.1-encoding of the OID and the hash-value as an ASN.1 SEQUENCE. In practice this means the hex values listed in the reference, followed by the actual hash.
PAD means concatenating 01 FF* 00 with the ASN.1-encoded prefix and the hash to get the desired length (FF* means repeating FF an appropriate number of times; the RFC gives the details)
CRYPT is the RSA-encryption-function
However, I believe the SignHash function does all of this for you; you just provide the OID and the hash value.
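Put differently, the only parts you would ever build yourself are ASN.1 and PAD; a rough Python illustration (the prefix is the SHA-1 value quoted from the spec above, and k is the RSA modulus length in octets, e.g. 128 for a 1024-bit key):

import hashlib

SHA1_PREFIX = bytes.fromhex("3021300906052b0e03021a05000414")   # "prefix" from the spec text

def pad_asn1_sha1(data, k):
    digest = hashlib.sha1(data).digest()    # DIGEST(data), 20 octets
    t = SHA1_PREFIX + digest                # ASN.1(OID, DIGEST(data)), 35 octets
    ps = b"\xff" * (k - len(t) - 3)         # FF repeated until the result is k-1 octets long
    return b"\x01" + ps + b"\x00" + t       # PAD(...): 01 | FF* | 00 | prefix | hash

CRYPT is then the raw RSA private-key operation on that value; as noted above, SignHash should do both the padding and the private-key operation for you once you hand it the 20-byte digest and the OID.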

How can I convert the tiger hash values from the official implementations into the form used by Direct Connect?

I am trying to implement a Direct Connect Client, and I am currently stuck at a point where I need to hash the files in order to be able to upload them to other clients.
All the other clients require TTHL (Tiger Tree Hashing Leaves) support for verification of the downloaded data, so I have searched for implementations of the algorithm and found tiger-hash-python.
I have implemented a routine that uses the hash function from before, and is able to hash large files, according to the logic specified in Tree Hash EXchange format (THEX) (basically, the tree diagram is the important part on that page).
However, the value produced by it is similar to those shown on Wikipedia (a hex digest), but different from those shown in the DC clients I'm using for reference.
I have been unable to find out how the hex digest form is converted to this other one (39 characters, A-Z, 0-9). Could someone please explain how that is done?
Well ... I tried what Paulo Ebermann said, using the following functions:
import math
import base64
import tiger  # the tiger-hash-python module mentioned above

def strdivide(list, length):
    result = []
    # Calculate how many blocks there are, using the condition: i*length = len(list).
    # The additional maths operations are to deal with the last block, which might be smaller.
    for i in range(0, int(math.ceil(float(len(list)) / length))):
        result.append(list[i*length:(i+1)*length])
    return result

def dchash(data):
    result = tiger.hash(data)  # From the aforementioned tiger-hash-python script, 48-char hex digest
    # Representation transform: reverse the byte order within each of the three 64-bit words
    result = "".join(["".join(strdivide(result[i:i+16], 2)[::-1]) for i in range(0, 48, 16)])
    bits = "".join([chr(int(c, 16)) for c in strdivide(result, 2)])  # Convert every 2 hex characters into 1 byte (Python 2 str)
    result = base64.b32encode(bits)  # Result will be 40 characters
    return result[:-1]  # Leave behind the trailing '='
The TTH for an empty file was found to be 8B630E030AD09E5D0E90FB246A3A75DBB6256C3EE7B8635A, which after the transformation specified here, becomes 5D9ED00A030E638BDB753A6A24FB900E5A63B8E73E6C25B6. Base-32 encoding this result yielded LWPNACQDBZRYXW3VHJVCJ64QBZNGHOHHHZWCLNQ, which was found to be what DC++ generates.
The only mention of the format of the hash in the Direct Connect protocol I found is on the $SR page on the NMDC Protocol wiki:
For files containing TTH, the <hub_name> parameter is replaced with TTH:<base32_encoded_tth_hash> (ref: TTH_Hash).
So, it is Base32-encoding. This is defined in RFC 4648 (and some earlier ones), section 6.
Basically, you are using the capital letters A-Z and the decimal digits 2 to 7, and one base32 digit represents 5 bits, while one base16 (hexadecimal) digit represents only 4 bits.
This means each 5 hex digits map to 4 base32 digits, and for a Tiger hash (192 bits) you will need 40 base32 digits (in the official encoding, the last one would be an '=' padding character, which seems to be omitted if you say that there are always 39 characters).
I'm not sure of an implementation of a conversion from hex (or bytes) to base32, but it shouldn't be too complicated with a lookup table and some bit-shifting.
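For what it's worth, Python's standard library can do the bytes-to-base32 step directly, so no hand-written lookup table is needed (this assumes the hex digest is already in the byte order DC++ expects, i.e. after the transform shown in the question):

import base64

def hex_to_dc_base32(hexdigest):
    raw = bytes.fromhex(hexdigest)                        # 24 bytes for a 192-bit Tiger hash
    return base64.b32encode(raw).decode().rstrip("=")     # 39 characters once the '=' pad is dropped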

Play! hash password returns bad result

I'm using Play 1.2.1. I want to hash my users' passwords. I thought that Crypto.passwordHash would be good, but it isn't. The passwordHash documentation says it returns an MD5 password hash. I created some user accounts in a fixture, where I put an MD5 password hash:
...
User(admin):
login: admin
password: f1682b54de57d202ba947a0af26399fd
fullName: Administrator
...
The problem is that when I try to log in with something like this:
user.password.equals(Crypto.passwordHash(password))
it doesn't work. So I put a log statement in my authentify method:
Logger.info("\nUser hashed password is %s " +
"\nPassed password is %s " +
"\nHashed passed password is %s",
user.password, password, Crypto.passwordHash(password));
And the password hashes are indeed different, but hey! The output of the passwordHash method isn't even an MD5 hash:
15:02:16,164 INFO ~
User hashed password is f1682b54de57d202ba947a0af26399fd
Passed password is <you don't have to know this :P>
Hashed passed password is 8WgrVN5X0gK6lHoK8mOZ/Q==
How about that? How to fix it? Or maybe I have to implement my own solution?
Crypto.passwordHash returns a Base64-encoded password hash, while you are comparing it to a hex-encoded one.
MD5 outputs a sequence of 16 bytes, each byte having (potentially) any value between 0 and 255 (inclusive). When you want to print the value, you need to convert the bytes to a sequence of "printable characters". There are several possible conventions, the two main being hexadecimal and Base64.
In hexadecimal notation, each byte value is represented as two "hexadecimal digits": such a digit is either a decimal digit ('0' to '9') or a letter (from 'a' to 'f', case is irrelevant). The 16 bytes thus become 32 characters.
In Base64 encoding, each group of three successive bytes is encoded as four characters, taken from a list of 64 possible characters (digits, lowercase letters, uppercase letters, '+' and '/'). One or two final '=' signs may be added so that the encoded string consists of a number of characters which is a multiple of 4.
Here, '8WgrVN5X0gK6lHoK8mOZ/Q==' is the Base64 encoding of a sequence of 16 bytes, the first one having value 241, the second one 104, then 43, and so on. In hexadecimal notation, the first byte would be represented by 'f1', the second by '68', the third by '2b'... and the hexadecimal notation of the complete sequence of 16 bytes is then 'f1682b54de57d202ba947a0af26399fd', the value that you expected.
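A quick way to see both notations side by side (Python, with a made-up password, since the real one is not shown):

import base64, hashlib

digest = hashlib.md5(b"example-password").digest()   # 16 raw bytes
print(digest.hex())                                  # hexadecimal: 32 characters
print(base64.b64encode(digest).decode())             # Base64: 24 characters, ending in '=='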
The play.libs.Codec class contains methods for decoding and encoding Base64 and hexadecimal notations. It also contains Codec.hexMD5() which performs MD5 hashing and returns the value in hexadecimal notation instead of Base64.
As Nickolay said, you are comparing hex vs. Base64 strings. Also, I would recommend using BCrypt for this, not Play's Crypto tool.