How does RSA generate two prime numbers from a passphrase? - rsa

I know that two prime numbers p and q, together with e, are enough to build the public and private keys used for encryption and decryption, but I don't understand how a passphrase entered by the user can be converted into 1024- or 2048-bit primes.
I only know how to convert the passphrase to a number using ASCII; I don't know how to turn that into two primes for p and q.

Related

I want to convert a string to a hash and divide it into N buckets

Problem: I want to divide M strings into N buckets as uniformly as possible.
One solution I was thinking of is:
Create a hash of the string
Convert the hash to an integer by mapping each character in the hash to its ASCII value
Sum up those ASCII values
Divide the sum by N
I believe hashing will take care of the uniform distribution, but I am not sure whether converting to ASCII changes anything.
Please suggest a better solution if you have one.
Thank you in advance
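One possible alternative to the steps above, offered as a sketch rather than an accepted answer: interpret the whole digest as a single unsigned integer and take it modulo N. Summing the ASCII values of a hex digest squeezes the hash into a narrow range of sums, whereas using the full digest keeps all of its bits. The function name `bucket_for` and the choice of SHA-256 are assumptions made for illustration; any well-mixed hash should behave similarly.

```python
import hashlib

def bucket_for(s: str, n_buckets: int) -> int:
    """Map a string to a bucket index in the range [0, n_buckets)."""
    digest = hashlib.sha256(s.encode("utf-8")).digest()
    # Treat the full digest as one big unsigned integer, not a sum of characters.
    value = int.from_bytes(digest, "big")
    return value % n_buckets

def bucket_counts(strings, n_buckets: int) -> list:
    """Count how many of the given strings land in each bucket."""
    counts = [0] * n_buckets
    for s in strings:
        counts[bucket_for(s, n_buckets)] += 1
    return counts

if __name__ == "__main__":
    # Hypothetical sample data, used only to eyeball the spread.
    print(bucket_counts((f"string-{i}" for i in range(10_000)), 8))
```

The same string always maps to the same bucket, so this also works when the strings arrive one at a time.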

Why does gcrypt say to recalculate the coefficient of an RSA key when converting from SSL format to gcrypt?

The documentation for libgcrypt says:
An RSA private key is described by this S-expression:
(private-key
  (rsa
    (n n-mpi)
    (e e-mpi)
    (d d-mpi)
    (p p-mpi)
    (q q-mpi)
    (u u-mpi)))
...and...
p-mpi
RSA secret prime p.
q-mpi
RSA secret prime q with p < q.
u-mpi
Multiplicative inverse u = p^{-1} mod q.
...and...
Note that OpenSSL uses slightly different parameters: q < p and u = q^{-1} mod p.
To use these parameters you will need to swap the values and recompute u.
Here is example code to do this:
if (gcry_mpi_cmp (p, q) > 0)
  {
    gcry_mpi_swap (p, q);
    gcry_mpi_invm (u, p, q);
  }
If p is the smaller prime in one convention and q is the smaller prime in the other, and the two equations are identical apart from exchanging p and q, is it really necessary to recompute u? Is it not sufficient just to exchange p and q?
As a side question, I am curious why gcrypt doesn't use the same values as the PKCS#1 encoding:
RSAPrivateKey ::= SEQUENCE {
    version           Version,
    modulus           INTEGER,  -- n
    publicExponent    INTEGER,  -- e
    privateExponent   INTEGER,  -- d
    prime1            INTEGER,  -- p
    prime2            INTEGER,  -- q
    exponent1         INTEGER,  -- d mod (p-1)
    exponent2         INTEGER,  -- d mod (q-1)
    coefficient       INTEGER,  -- (inverse of q) mod p
    otherPrimeInfos   OtherPrimeInfos OPTIONAL
}
o modulus is the RSA modulus n.
o publicExponent is the RSA public exponent e.
o privateExponent is the RSA private exponent d.
o prime1 is the prime factor p of n.
o prime2 is the prime factor q of n.
o exponent1 is d mod (p - 1).
o exponent2 is d mod (q - 1).
o coefficient is the CRT coefficient q^(-1) mod p.
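The answer is that recalculating "u" is unnecessary. OpenSSL's coefficient is q^{-1} mod p; once "p" and "q" exchange roles, that same value is exactly gcrypt's u = p^{-1} mod q. Simply swap the usage of "p" and "q" and it all works.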
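The claim can be checked numerically. The following is a minimal sketch with toy primes, assuming the OpenSSL-style key has q < p as the gcrypt documentation describes; the helper name `openssl_to_gcrypt` is hypothetical and is not part of either library. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
def openssl_to_gcrypt(p: int, q: int, coefficient: int):
    """Relabel OpenSSL-style CRT parameters for gcrypt's convention.

    OpenSSL convention: coefficient = q^-1 mod p, with q < p.
    gcrypt convention:  u = p^-1 mod q, with p < q.
    Exchanging the names of p and q turns the old coefficient into the new u.
    """
    return q, p, coefficient  # (gcrypt p, gcrypt q, gcrypt u)

# Toy primes, far too small for real use.
ssl_p, ssl_q = 61, 53
ssl_coefficient = pow(ssl_q, -1, ssl_p)  # q^-1 mod p

g_p, g_q, g_u = openssl_to_gcrypt(ssl_p, ssl_q, ssl_coefficient)

# The unchanged coefficient already satisfies gcrypt's definition of u.
assert g_p < g_q
assert g_u == pow(g_p, -1, g_q)
assert (g_u * g_p) % g_q == 1
```

Recomputing u after the swap, as the documentation's example does, yields the same number, which is why the recomputation is redundant under this assumption.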
As a general comment on gcrypt, the asymmetric cryptography APIs are terrible. Truly abysmal.
There is no support for loading keys from a file in ANY format.
There is no support to simply encrypt/decrypt a buffer. Instead you need to convert the buffer to an MPI before you can then convert it to an S-expression. After encryption you need to unwind the resulting S-expression to get the right piece and then call yet another function to get at the data itself. Decryption requires slightly more complexity in creating the S-expression to decrypt from a buffer, but retrieving the data is only one function call.
The parameters to the S-expression for the private key do not match the values in standard PKCS#1 format (although as covered by this question and answer, conversion is fairly easy). Why not?
During the course of my investigations I discovered that there is another GNU encryption library. Why they maintain two I have no idea. The other is called "nettle" and is much better:
*) It uses the GMP library for multi-precision integers, rather than having its own type as gcrypt does (mpi_t).
*) It supports loading keys from files in a variety of formats (I used it as the basis for my own code to load keys for use with gcrypt).
*) It supports conversion from various formats (PEM->DER, DER->Sexp).
*) It supports a variety of symmetric encryption algorithms and modes.
*) It supports asymmetric encryption/decryption/signing/verification.
I didn't actually use it, so I cannot really comment on the API's usability, but from what I saw it was generally much better.
I don't really know the background on nettle, but I do wonder if it was created simply because gcrypt's API is so awful and they would rather start over than enhance gcrypt.

Are SHA1 hashes distributed uniformly?

I have a string in Python. I calculate the SHA1 hash of that string with hashlib. I convert it to its hexadecimal representation and take the last 16 characters to use as an identifier:
import hashlib

hash_str = "foobarbazάλφαβήταγάμμα..."
hash_obj = hashlib.sha1(hash_str.encode('utf-8'))
hash_id = hash_obj.hexdigest()[-16:]  # last 16 hex characters
My goal is an identifier that provides reasonable length and is unlikely to yield the same hash_id value for a different hash_str input.
If the probability of a SHA1 collision is 1/(2^160), or 1/(16^40), then if I take the last sixteen characters of the hex representation, is the probability of a collision only 1/(16^16)? Or are the bytes (or their hex equivalent) not distributed evenly?
Yes. A hash function with the uniformity property gives every value in its output range an equal chance of being produced by a randomly chosen input. Therefore each value of the truncated hash is equally likely too. SHA-1 is a hash function that exhibits uniformity, so your conjecture holds: two given inputs share the same 16-hex-character (64-bit) identifier with probability 1/(16^16) = 1/(2^64).
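To put that figure in context, here is a rough sketch that builds the truncated identifier and estimates the chance of at least one collision among many identifiers. It uses the standard birthday approximation and assumes the truncated output is uniform, as argued above. The function names `short_id` and `collision_probability` are illustrative, not taken from the question.

```python
import hashlib
import math

def short_id(text: str, hex_chars: int = 16) -> str:
    """Return the last hex_chars characters of the SHA-1 hex digest."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[-hex_chars:]

def collision_probability(n_items: int, hex_chars: int = 16) -> float:
    """Approximate chance of at least one collision among n_items identifiers.

    Birthday approximation: 1 - exp(-n(n-1) / (2 * space)),
    where space is the number of possible truncated values.
    """
    space = 16 ** hex_chars
    return -math.expm1(-n_items * (n_items - 1) / (2 * space))

if __name__ == "__main__":
    print(short_id("foobarbaz"))
    print(collision_probability(1_000_000))
```

The per-pair probability stays at 1/(16^16), but the chance that any two identifiers in a collection collide grows roughly with the square of the collection size, which is the number to watch when choosing the truncation length.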

RSA-Calculate d without p and q

I have been given the task of decrypting a message. However, I am only given the values of n and e. Is it still possible to find the value of d? Is there a shortcut formula that can calculate d without knowing p and q?
The security of RSA is derived from the difficulty in calculating d from e and n (the public key). It sounds like the task you have been set is essentially to break RSA by factoring n into its prime factors p and q, and then using these to calculate d. Assuming n is not too large, factorization should be relatively easy (Wolfram|Alpha may be able to do it for example).
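As an illustration of that approach, here is a minimal sketch that factors a small n by trial division and then derives d as the inverse of e modulo (p-1)(q-1). It assumes n is the product of two distinct primes and is small enough for trial division to finish; the function name `recover_d` and the toy values are chosen for the example only. It needs Python 3.8 or later.

```python
from math import isqrt

def recover_d(n: int, e: int) -> int:
    """Recover the private exponent d from (n, e) by factoring n.

    Only feasible when n is small; assumes n = p * q for distinct primes.
    """
    if n % 2 == 0:
        p = 2
    else:
        # Smallest odd divisor up to sqrt(n); raises StopIteration if none exists.
        p = next(c for c in range(3, isqrt(n) + 1, 2) if n % c == 0)
    q = n // p
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse of e mod phi

if __name__ == "__main__":
    n, e = 3233, 17            # toy key: 3233 = 53 * 61
    d = recover_d(n, e)
    ciphertext = pow(65, e, n)
    print(d, pow(ciphertext, d, n))
```

For realistically sized moduli (1024 bits and up) the trial-division step is hopeless, which is exactly the hardness the answer refers to.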

Using hash functions with Bloom filters

A Bloom filter uses a hash function (or several) to generate a value between 0 and m for an input string X. My question is how to use a hash function to generate a value in this way. For example, an MD5 hash is typically represented as a 32-character hex string; how would I use MD5 to generate a value between 0 and m, where I can specify m? I'm using Java at the moment, so an example of how to do this with the MessageDigest functionality it offers would be great, though a generic description of how to go about it would be fine too.
Thanks
You should first convert the hash output to an unsigned integer, then reduce it modulo m. It looks like this:
import java.math.BigInteger;
import java.security.MessageDigest;

MessageDigest md = MessageDigest.getInstance("MD5");
// hash data...
byte[] hashValue = md.digest();
// signum 1 makes BigInteger treat the bytes as an unsigned value
BigInteger n = new BigInteger(1, hashValue);
n = n.mod(m);
// at that point, n has a value between 0 and m-1 (inclusive)
I have assumed that m is a BigInteger instance. If necessary, use BigInteger.valueOf(). Similarly, use n.intValue() or n.longValue() to get the value of n as one of the primitive types of Java.
The modular reduction is somewhat biased, but the bias is very small if m is substantially smaller than 2^128.
The simplest way would probably be to convert the hash output (as a byte sequence) to a single binary number and take that modulo m.
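For comparison, the same idea can be sketched in Python: read the 16 MD5 bytes as one unsigned integer and reduce it modulo m. The function names are illustrative. Deriving several indexes by prefixing a counter to the input is one common way to get "many" hash functions from a single one, and it is an assumption of this sketch, not something stated in the answers above.

```python
import hashlib

def bloom_index(item: str, m: int) -> int:
    """Return an index in [0, m) derived from the MD5 digest of item."""
    digest = hashlib.md5(item.encode("utf-8")).digest()  # 16 bytes
    # Interpret the bytes as a single unsigned big-endian integer.
    return int.from_bytes(digest, "big") % m

def bloom_indexes(item: str, m: int, k: int) -> list:
    """Return k indexes by hashing the item with k different counter prefixes."""
    return [bloom_index(f"{i}:{item}", m) for i in range(k)]

if __name__ == "__main__":
    print(bloom_index("hello", 1000))
    print(bloom_indexes("hello", 1000, 3))
```

As noted above, the modulo step introduces a small bias, which is negligible as long as m is far below 2^128.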