Where are the ASN.1 modules for certificate extensions - x509

I'm writing a DER parser for certificate requests in .NET.
I based my work on RFC 2986, which describes most of the content of the request with ASN.1 modules.
However, it doesn't define how the extensionRequest (OID 1.2.840.113549.1.9.14) is structured. I've searched high and low, but I can't find another RFC or publicly available documentation describing what structure it uses, what types are expected, etc. (i.e., the ASN.1 module of the extensionRequest object and its children).
Sample DER, decoded:
SEQUENCE (3 elem)
  SEQUENCE (4 elem)
    INTEGER 0
    SEQUENCE (14 elem)
    SEQUENCE (2 elem)
      SEQUENCE (2 elem)
        OBJECT IDENTIFIER 1.2.840.113549.1.1.1 rsaEncryption (PKCS #1)
        NULL
      BIT STRING (1120 bit) 001100001000000110001001000000101000000110000001000000001011111100011…
        SEQUENCE (2 elem)
          INTEGER (1024 bit) 134193393845175687447721541202995749257369077931432148182685911334902…
          INTEGER 65537
    [0] (4 elem)
      SEQUENCE (2 elem)
        OBJECT IDENTIFIER 1.3.6.1.4.1.311.13.2.3 osVersion (Microsoft attribute)
        SET (1 elem)
          IA5String 10.0.19042.2
      SEQUENCE (2 elem)
        OBJECT IDENTIFIER 1.3.6.1.4.1.311.21.20 requestClientInfo (Microsoft attribute)
        SET (1 elem)
          SEQUENCE (4 elem)
            INTEGER 5
            UTF8String EDITED
            UTF8String EDITED\edited
            UTF8String MMC.EXE
      SEQUENCE (2 elem)
        OBJECT IDENTIFIER 1.3.6.1.4.1.311.13.2.2 enrolmentCSP (Microsoft attribute)
        SET (1 elem)
          SEQUENCE (3 elem)
            INTEGER 0
            BMPString Microsoft Software Key Storage Provider
            BIT STRING (0 bit)
      SEQUENCE (2 elem)
        OBJECT IDENTIFIER 1.2.840.113549.1.9.14 extensionRequest (PKCS #9 via CRMF)
        SET (1 elem)
          vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv This sequence vvvvvvvvvvvvvvvvvvvvvvvvv
          SEQUENCE (2 elem)
            SEQUENCE (2 elem)
              OBJECT IDENTIFIER 2.5.29.17 subjectAltName (X.509 extension)
              OCTET STRING (153 byte) 308196A41430123110300E060355040B0C076469726E616D658204444E53318204444…
                SEQUENCE (9 elem)
                  [4] (1 elem)
                    SEQUENCE (1 elem)
                      SET (1 elem)
                        SEQUENCE (2 elem)
                          OBJECT IDENTIFIER 2.5.4.11 organizationalUnitName (X.520 DN component)
                          UTF8String dirname
                  [2] (4 byte) DNS1
                  [2] (4 byte) DNS2
                  [1] (17 byte) othermail#mail.fr
                  [0] (2 elem)
                    OBJECT IDENTIFIER 1.3.6.1.4.1.311.25.1 ntdsReplication (Microsoft)
                    [0] (1 elem)
                      OCTET STRING (16 byte) ADC5FA58160E9F4ABB154A7DCEDC00A5
                  [7] (4 byte) 7F000002
                  [7] (16 byte) 00000000000000000000000000000001
                  [6] (3 byte) url
                  [0] (2 elem)
                    OBJECT IDENTIFIER 1.3.6.1.4.1.311.20.2.3 universalPrincipalName (Microsoft UPN)
                    [0] (1 elem)
                      UTF8String userprincipalname
            SEQUENCE (2 elem)
              OBJECT IDENTIFIER 2.5.29.14 subjectKeyIdentifier (X.509 extension)
              OCTET STRING (20 byte) 87E201CF0B06CB290C98E7DF67796CF46AD9D507
                OCTET STRING (20 byte) 87E201CF0B06CB290C98E7DF67796CF46AD9D507
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  SEQUENCE (2 elem)
    OBJECT IDENTIFIER 1.2.840.113549.1.1.11 sha256WithRSAEncryption (PKCS #1)
    NULL
  BIT STRING (1024 bit) 101110000001101000110010011000110101111010001000011101110110001110000…
Do you know where I can find this info?

Certificate extensions are carried in a PKCS#9 request attribute. Specifically, the extensionRequest attribute type is defined in RFC 2985 §5.4.2:
extensionRequest ATTRIBUTE ::= {
        WITH SYNTAX ExtensionRequest
        SINGLE VALUE TRUE
        ID pkcs-9-at-extensionRequest
}

ExtensionRequest ::= Extensions
and in RFC 5280 Appendix A.1:
Extensions ::= SEQUENCE SIZE (1..MAX) OF Extension

Extension ::= SEQUENCE {
    extnID      OBJECT IDENTIFIER,
    critical    BOOLEAN DEFAULT FALSE,
    extnValue   OCTET STRING
                -- contains the DER encoding of an ASN.1 value
                -- corresponding to the extension type identified
                -- by extnID
}
Simply put, the attribute value is a SEQUENCE OF Extension.
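Once you have isolated the single value inside the attribute's SET (the SEQUENCE marked in your dump), you can decode it directly against the RFC 5280 module. Here is a minimal sketch in Python with pyasn1/pyasn1-modules (tooling choice is mine; ext_req_value is a hypothetical variable holding the DER bytes of that inner SEQUENCE):

from pyasn1.codec.der import decoder
from pyasn1_modules import rfc5280

# ext_req_value: DER bytes of the lone value inside the
# extensionRequest attribute's SET (hypothetical placeholder).
extensions, _ = decoder.decode(ext_req_value, asn1Spec=rfc5280.Extensions())

for ext in extensions:
    # extnValue is itself an OCTET STRING wrapping more DER,
    # whose schema depends on extnID.
    print(ext['extnID'], ext['critical'], ext['extnValue'])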

RFC 5912 collects a fairly large set of PKIX-related ASN.1 modules into one RFC, altered to use newer ASN.1 constructs that more formally document open type fields (like the extnID/extnValue fields of Extension).
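In practice, off-the-shelf parsers already implement these modules end to end. For example, a sketch with the pyca/cryptography package (assuming the request sits in request.der), which surfaces exactly this SEQUENCE OF Extension:

from cryptography import x509

with open("request.der", "rb") as f:
    csr = x509.load_der_x509_csr(f.read())

# The library unwraps the extensionRequest attribute (1.2.840.113549.1.9.14)
# into Extension objects matching the RFC 5280 structure.
for ext in csr.extensions:
    print(ext.oid.dotted_string, ext.critical, ext.value)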

Related

I was wondering if someone could explain to me .decode and .encode in hashlib?

I understand that you have a hex string and perform SHA256 on it twice and then byte-swap the final hex string. The goal of this code is to find a Merkle Root by concatenating two transactions. I would like to understand what's going on in the background a bit more. What exactly are you decoding and encoding?
import hashlib
transaction_hex = "93a05cac6ae03dd55172534c53be0738a50257bb3be69fff2c7595d677ad53666e344634584d07b8d8bc017680f342bc6aad523da31bc2b19e1ec0921078e872"
transaction_bin = transaction_hex.decode('hex')
hash = hashlib.sha256(hashlib.sha256(transaction_bin).digest()).digest()
hash.encode('hex_codec')
'38805219c8ac7e9a96416d706dc1d8f638b12f46b94dfd1362b5d16cf62e68ff'
hash[::-1].encode('hex_codec')
'ff682ef66cd1b56213fd4db9462fb138f6d8c16d706d41969a7eacc819528038'
transaction_hex is a regular string of lower-case ASCII characters, and the decode() method with the 'hex' argument turns it into a binary string (in Python 3 you would write bytes.fromhex(transaction_hex) instead) with bytes 0x93, 0xa0, etc. In C it would be an array of unsigned char, of length 64 in this case.
This byte string of length 64 is then hashed with SHA-256, and the result (another binary string, of size 32) is hashed again. So hash is a string of length 32 (a bytes object of that length in Python 3). Then encode('hex_codec') is a synonym for encode('hex') in Python 2; in Python 3, bytes objects have no encode() method, so you would use codecs.encode(hash, 'hex_codec') or simply hash.hex(). It outputs an ASCII string again, replacing each raw byte (which is just a small integer) with the two-character string of its hexadecimal representation. So the final line reverses the double hash and outputs it as hexadecimal, in a form I usually call "lowercase hex ASCII".
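For reference, here is the same computation written for Python 3, where bytes.fromhex() replaces str.decode('hex') and bytes.hex() replaces encode('hex_codec'):

import hashlib

transaction_hex = "93a05cac6ae03dd55172534c53be0738a50257bb3be69fff2c7595d677ad53666e344634584d07b8d8bc017680f342bc6aad523da31bc2b19e1ec0921078e872"
transaction_bin = bytes.fromhex(transaction_hex)

# Double SHA-256, then reverse the bytes for the conventional display order.
digest = hashlib.sha256(hashlib.sha256(transaction_bin).digest()).digest()
print(digest.hex())        # 38805219c8ac7e9a96416d706dc1d8f638b12f46b94dfd1362b5d16cf62e68ff
print(digest[::-1].hex())  # ff682ef66cd1b56213fd4db9462fb138f6d8c16d706d41969a7eacc819528038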

Emacs byte-to-position function is not consistent with the documentation?

Emacs 24.3.1, Windows 2003
I found the byte-to-position function to be a little strange.
According to the documentation:
-- Function: byte-to-position byte-position
Return the buffer position, in character units, corresponding to
given BYTE-POSITION in the current buffer. If BYTE-POSITION is
out of range, the value is `nil'. **In a multibyte buffer, an
arbitrary value of BYTE-POSITION can be not at character boundary,
but inside a multibyte sequence representing a single character;
in this case, this function returns the buffer position of the
character whose multibyte sequence includes BYTE-POSITION.** In
other words, the value does not change for all byte positions that
belong to the same character.
We can make a simple experiment:
Create a buffer, eval this expression: (insert "a" (- (max-char) 128) "b")
Since the maximum number of bytes per character in Emacs' internal coding system is 5, the character between 'a' and 'b' is 5 bytes. (Note that the last 128 characters are used for raw 8-bit bytes; their size is only 2 bytes.)
Then define and eval this test function:
(require 'cl)  ; for `loop'

(defun test ()
  (interactive)
  (let ((max-bytes (1- (position-bytes (point-max)))))
    (message "%s"
             (loop for i from 1 to max-bytes collect (byte-to-position i)))))
What I get is "(1 2 3 2 2 2 3)".
Each number in the list represents the character position in the buffer. Because there is a 5-byte character, there should be five '2's between '1' and '3', so how do we explain the magic '3' among the '2's?
This was a bug. I no longer see this behavior in 26.x. You can read more about it here (which actually references this SO question).
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=20783
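For intuition, the documented mapping can be modeled in a few lines of Python over UTF-8 (a sketch of the behavior, not of Emacs internals: Emacs' internal coding extends UTF-8, and its positions are 1-based where these are 0-based):

text = "a\u00e9b"           # 'é' encodes to 2 bytes in UTF-8
raw = text.encode("utf-8")  # b'a\xc3\xa9b'

def byte_to_position(byte_index):
    # Position of the character containing byte_index: count the
    # non-continuation bytes (anything but 0b10xxxxxx) up to and
    # including it, minus one.
    return sum(1 for b in raw[:byte_index + 1] if b & 0xC0 != 0x80) - 1

print([byte_to_position(i) for i in range(len(raw))])  # [0, 1, 1, 2]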

What does the "0" mean in MongoDB's BinData(0, "e8MEnzZoFyMmD7WSHdNrFJyEk8M=")?

The MongoDB shell prints binary data as a Base64-encoded string wrapped in what looks like a function call:
"_id" : BinData(0,"e8MEnzZoFyMmD7WSHdNrFJyEk8M=")
What does the "0" mean?
http://docs.mongodb.org/manual/reference/mongodb-extended-json/#binary
The BSON BinData datatype is represented via class BinData in the shell. Run help misc for more information.
> new BinData(2, "1234")
BinData(2,"1234")
From the shell:
> help misc
b = new BinData(subtype,base64str) create a BSON BinData value
The 0 in your case is the BSON subtype.
http://bsonspec.org/#/specification
binary ::= int32 subtype (byte*) Binary - The int32 is the number of bytes in the (byte*).
subtype ::= "\x00" Generic binary subtype
| "\x01" Function
| "\x02" Binary (Old)
| "\x03" UUID (Old)
| "\x04" UUID
| "\x05" MD5
| "\x80" User defined
Similar question on this thread
http://groups.google.com/group/mongodb-dev/browse_thread/thread/1965aa234aa3ef1e
Macrolinux is right, but you have to be careful with his example, as it works only by accident.
The first argument to BinData() is the BSON binary subtype which, as has been mentioned, is one of the following:
generic: \x00 (0)
function: \x01 (1)
old: \x02 (2)
uuid_old: \x03 (3)
uuid: \x04 (4)
md5: \x05 (5)
user: \x80 (128)
These are just hints so that the deserializer can interpret the binary data depending on what those bytes represent. The exception is subtype 2, which is like the generic subtype except that it stores an int32 with the length of the byte array as the first 4 bytes of the data.
Now to see why the example is wrong you'll note that calling BinData(2, "1234") doesn't store the binary representing the string "1234" for two reasons:
The BinData function interprets that string as a base64 encoded string.
Type 2 would require that the first 4 bytes be an int32 containing the length of the byte array.
See bsonspec.org for more information.
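The same round trip can be sketched outside the shell, for instance with PyMongo's bson package (my choice of driver; any driver exposes the subtype in a similar way):

import base64
from bson.binary import Binary

# Rebuild the value the shell printed: subtype 0, base64 payload.
payload = base64.b64decode("e8MEnzZoFyMmD7WSHdNrFJyEk8M=")
bindata = Binary(payload, 0)

print(bindata.subtype)  # 0 -> generic BSON binary subtype
print(len(bindata))     # 20 raw bytes, not the length of the base64 text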
I believe they correspond to the BSON subtypes:
subtype ::= "\x00" Binary / Generic
| "\x01" Function
| "\x02" Binary (Old)
| "\x03" UUID
| "\x05" MD5
| "\x80" User defined
Looking at that, it appears that 0 is almost always a valid choice.

Valid identifier characters in Scala

One thing I find quite confusing is knowing which characters and combinations I can use in method and variable names. For instance:
val #^ = 1 // legal
val # = 1 // illegal
val + = 1 // legal
val &+ = 1 // legal
val &2 = 1 // illegal
val £2 = 1 // legal
val ¬ = 1 // legal
As I understand it, there is a distinction between alphanumeric identifiers and operator identifiers. You can mix and match one or the other, but not both, unless they are separated by an underscore (a mixed identifier).
From Programming in Scala section 6.10,
An operator identifier consists of one or more operator characters.
Operator characters are printable ASCII characters such as +, :, ?, ~
or #.
More precisely, an operator character belongs to the Unicode set
of mathematical symbols(Sm) or other symbols(So), or to the 7-bit
ASCII characters that are not letters, digits, parentheses, square
brackets, curly braces, single or double quote, or an underscore,
period, semi-colon, comma, or back tick character.
So we are excluded from using ()[]{}'"_.;, and `
I looked up Unicode mathematical symbols on Wikipedia, but the ones I found didn't include +, :, ? etc. Is there a definitive list somewhere of what the operator characters are?
Also, any ideas why Unicode mathematical operators (rather than symbols) do not count as operators?
Working from the EBNF syntax in the spec:
upper ::= ‘A’ | ... | ‘Z’ | ‘$’ | ‘_’ and Unicode category Lu
lower ::= ‘a’ | ... | ‘z’ and Unicode category Ll
letter ::= upper | lower and Unicode categories Lo, Lt, Nl
digit ::= ‘0’ | ... | ‘9’
opchar ::= “all other characters in \u0020-007F and Unicode
categories Sm, So except parentheses ([]) and periods”
But we must also take into account the very beginning of the Lexical Syntax chapter, which defines:
Parentheses ‘(’ | ‘)’ | ‘[’ | ‘]’ | ‘{’ | ‘}’.
Delimiter characters ‘‘’ | ‘’’ | ‘"’ | ‘.’ | ‘;’ | ‘,’
Here is what I come up with. Working by elimination in the range \u0020-007F, eliminating letters, digits, parentheses and delimiters, we have for opchar... (drumroll):
! # % & * + - / : < = > ? @ \ ^ | ~
and also Sm and So - except for parentheses and periods.
In summary, here are some valid examples that highlight all the cases (watch out for \ in the REPL; I had to escape it as \\):
val !#%&*+-/:<=>?@\^|~ = 1 // all simple opchars
val simpleName = 1
val withDigitsAndUnderscores_ab_12_ab12 = 1
val wordEndingInOpChars_!#%&*+-/:<=>?@\^|~ = 1
val !^©® = 1 // opchars and symbols
val abcαβγ_!^©® = 1 // mixing unicode letters and symbols
Note 1:
I found this Unicode category index to figure out Lu, Ll, Lo, Lt, Nl:
Lu (uppercase letters)
Ll (lowercase letters)
Lo (other letters)
Lt (titlecase)
Nl (letter numbers like roman numerals)
Sm (symbol math)
So (symbol other)
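If you are unsure about a particular character, you can query its category programmatically. A quick sketch using Python's unicodedata (approximate: scalac relies on the JVM's character tables, which may lag or lead Python's):

import unicodedata

# Categories for the characters discussed above:
for ch in "+¬£α$#":
    print(ch, unicodedata.category(ch))
# + Sm  (math symbol -> opchar)
# ¬ Sm  (math symbol -> opchar)
# £ Sc  (currency symbol -> neither opchar nor letter per the spec)
# α Ll  (lowercase letter)
# $ Sc  (currency, but the spec special-cases it as an upper letter)
# # Po  (not Sm/So, but printable 7-bit ASCII -> opchar, reserved on its own)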
Note 2:
val #^ = 1 // legal - two opchars
val # = 1 // illegal - # on its own is a reserved word (type projection), like class or =>
val + = 1 // legal - opchar
val &+ = 1 // legal - two opchars
val &2 = 1 // illegal - an opchar and a digit do not mix arbitrarily
val £2 = 1 // compiles - £ is in Sc (Symbol, currency), which the spec leaves undefined
val ¬ = 1 // legal - part of Sm
Note 3:
Other operator-looking things that are reserved words: _ : = => <- <: <% >: # @ and also \u21D2 ⇒ and \u2190 ←
The language specification gives the rule in Chapter 1, Lexical Syntax (on page 3):
Operator characters. These consist of all printable ASCII characters (\u0020-\u007F) which are in none of the sets above, mathematical symbols (Sm) and other symbols (So).
This is basically the same as your extract from Programming in Scala. + is not a Unicode mathematical symbol, but it is definitely a printable ASCII character not in the excluded sets (not a letter, including _ or $, nor a digit, parenthesis, or delimiter).
In your list:
# is illegal not because the character is not an operator character (#^ is legal), but because it is a reserved word (on page 4), used for type projection.
&2 is illegal because you mix an operator character & with a non-operator character, the digit 2.
£2 is legal because £ is not an operator character: it is not seven-bit ASCII but eight-bit extended ASCII. This is arguably not nice, since $ is not an operator character either (it is considered a letter).
You can use backticks to get around these limitations and use arbitrary Unicode symbols:
val `r→f` = 150
println(`r→f`)

Building IP and Port from Byte Buffer

I have a byte buffer 6 bytes long: the first four bytes contain an IP address and the last 2 a port, in big-endian notation.
To get the IP I am using:
(apply str (interleave (map int (take 4 peer)) (repeat ".")))
Is casting bytes to int safe for getting the IP address?
And in Java I use:
int port = 0;
port |= peerList[i+4] & 0xFF;
port <<= 8;
port |= peerList[i+5] & 0xFF;
this snippet to get the port. How can I convert this to Clojure?
Yes, mapping them to int is safe in this case: any leading zeros introduced by widening to a larger type drop away again when the value is converted to a string. One caveat: Java bytes are signed, so octets of 128 or above come out negative unless you first mask them with (bit-and b 0xFF).
The second part gets a lot easier because you are starting with a list of bytes.
(+ (* 256 (bit-and (nth peer 4) 0xFF)) (bit-and (nth peer 5) 0xFF))
A more general function for turning seqs of bytes into numbers, pulled from here (note it is little-endian: the first byte is least significant, so reverse a big-endian buffer first):
(defn bytes-to-num [bytes]
  (let [powers (iterate #(* % 256) 1)]
    (reduce + 0 (map * bytes powers))))
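As a cross-check of the arithmetic, here is the same decoding in Python (a sketch assuming the identical 6-byte big-endian layout; the sample buffer is made up):

import struct

peer = bytes([192, 168, 0, 1, 0x1F, 0x90])  # hypothetical 6-byte buffer
ip = ".".join(str(b) for b in peer[:4])     # octets are unsigned here
(port,) = struct.unpack(">H", peer[4:6])    # big-endian unsigned 16-bit
print(ip, port)                             # 192.168.0.1 8080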