Encoding of a CHOICE type when the CHOICE itself is used with an implicit tag (using the specific example: CRLDistPoints) - x509

The Botan crypto library has only very limited support for the X.509 extension CRLDistributionPoint and actually throws an exception if any of the "advanced" attributes of the extension that Botan does not expect are set.
Hence, I am trying to patch the decoding of this extension, but I have trouble correctly determining the type of the encoded objects from their tags. Either this is an oversight in the specification of this extension (I doubt it), or I have a fundamental misunderstanding of the encoding/decoding rules.
Here are the relevant parts of the specification:
CertificateExtensions {joint-iso-itu-t ds(5) module(1)
certificateExtensions(26) 5} DEFINITIONS IMPLICIT TAGS
CRLDistPointsSyntax ::= SEQUENCE SIZE (1..MAX) OF DistributionPoint
DistributionPoint ::= SEQUENCE {
distributionPoint [0] DistributionPointName OPTIONAL,
reasons [1] ReasonFlags OPTIONAL,
cRLIssuer [2] GeneralNames OPTIONAL
}
DistributionPointName ::= CHOICE {
fullName [0] GeneralNames,
nameRelativeToCRLIssuer [1] RelativeDistinguishedName
}
The module uses implicit tagging by default. This will be important in the following. A DistributionPoint is a SEQUENCE where all attributes are optional. The first optional attribute, distributionPoint, has the tag 0 and is of type DistributionPointName. In turn, DistributionPointName is a CHOICE whose alternatives carry either tag 0 (if GeneralNames is chosen) or tag 1 (if RelativeDistinguishedName is chosen).
According to my understanding, in case of implicit tagging a CHOICE type is encoded using the tag of the chosen type. In other words, a CHOICE type is not "nested" somehow but encoded on the same level on which the CHOICE type is used. But DistributionPointName has already been given the tag 0.
The specific question is: How is a DistributionPoint encoded, if nameRelativeToCRLIssuer (tag 1) is chosen as the choice for DistributionPointName without triggering a clash with tag 1 of the reasons attribute?
Here is an illustration of my problem:
30 // Type tag for outer SEQUENCE, DistributionPoint starts here
ll // Length of SEQUENCE, omitted here for editorial purposes
+--> 00 vs 01 // Type tag for distributionPoint
| // At first glance, 00 according to SEQUENCE definition for OPTIONAL DistributionPointName,
| // but maybe 01 if RelativeDistinguishedName is the selected CHOICE
| kk // Length of RelativeDistinguishedName, omitted here for editorial purposes
| vv // Encoding of RelativeDistinguishedName begins
| vv
| vv // Encoding of RelativeDistinguishedName ends, according to length kk
+--> 01 // Type tag for OPTIONAL ReasonFlags
jj // Length of ReasonFlags
ww // Encoding of ReasonFlags begins
ww
ww // Encoding of ReasonFlags ends, according to length jj
// Encoding of DistributionPoint ends, too, according to length ll
In line three, the type tag should be 00 to indicate that the OPTIONAL DistributionPointName exists. This also avoids a clash with the type tag 01 of the OPTIONAL ReasonFlags further down.
However, the type tag in line three should also indicate which type has been chosen for DistributionPointName. :-(

According to my understanding, in case of implicit tagging a CHOICE
type is encoded using the tag of the chosen type. In other words, a
CHOICE type is not "nested" somehow but encoded on the same level on
which the CHOICE type is used. But DistributionPointName has already
been given the tag 0.
I'm afraid it is the opposite: a tag applied to an untagged CHOICE is always explicit, whatever the default tagging of the module.
In the X.680 document, there is the following note:
The tagging construction specifies explicit tagging if any of the following holds:
c) the "Tag Type" alternative is used and the value of "TagDefault" for
the module is IMPLICIT TAGS or AUTOMATIC TAGS, but the type defined by
"Type" is an untagged choice type, an untagged open type, or an
untagged "DummyReference" (see Rec. ITU-T X.683 | ISO/IEC 8824-4,
8.3).
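So, if RelativeDistinguishedName is chosen, the distributionPoint component is tagged first with 0 (distributionPoint, explicit) and then, nested inside it, with 1 (nameRelativeToCRLIssuer).
The reason is that CHOICE has no UNIVERSAL tag of its own, so an implicit tag would have nothing to replace and would hide which alternative was chosen.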
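To make the nesting concrete, here is a minimal Python sketch that assembles the DER bytes of such a DistributionPoint by hand. The helper tlv, the RDN content (a single commonName "CA") and the chosen reason bit (keyCompromise) are assumptions made only for illustration; they are not taken from Botan or from the question.

```python
def tlv(tag: int, content: bytes) -> bytes:
    """Encode one DER tag-length-value triple (short-form length only)."""
    assert len(content) < 0x80, "this sketch only handles contents < 128 bytes"
    return bytes([tag, len(content)]) + content

# Hypothetical RDN content: one AttributeTypeAndValue { commonName (2.5.4.3), "CA" }
attribute = tlv(0x30, tlv(0x06, bytes([0x55, 0x04, 0x03])) + tlv(0x0C, b"CA"))

# nameRelativeToCRLIssuer [1] is IMPLICIT: 0xA1 replaces the SET OF tag 0x31
name_relative = tlv(0xA1, attribute)

# distributionPoint [0] is EXPLICIT because DistributionPointName is an
# untagged CHOICE: 0xA0 wraps the complete encoding of the chosen alternative
distribution_point_field = tlv(0xA0, name_relative)

# reasons [1] is IMPLICIT on a BIT STRING: primitive tag 0x81.
# Only keyCompromise (bit 1) set: 6 unused bits, value byte 0x40
reasons_field = tlv(0x81, bytes([0x06, 0x40]))

distribution_point = tlv(0x30, distribution_point_field + reasons_field)
print(" ".join(f"{b:02X}" for b in distribution_point))
```

The inner [1] (A1) sits inside the outer [0] (A0), while the [1] of reasons (81) appears as a sibling of A0 one level up, so the two tags numbered 1 never compete.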


"In Unicode programs, must have the same structure layout, irrespective of the length" error

I'm now getting this error after making a small mod to a working program. The structures have the same type (though the tables are different) but I get this error? I've looked at similar postings but couldn't find an answer to this.
Code snippets included.
* ---
,begin of TY_RESB_CDC
,MANDT type MANDT
,RSNUM type RSNUM
,RSPOS type RSPOS
,RSART type RSART
,UDATE type CDDATUM
,UTIME type CDUZEIT
.include type ZBW_MATERIAL_RESVN_RESB.
types: end of TY_RESB_CDC
,TT_RESB_CDC type hashed table of TY_RESB_CDC
with unique key RSNUM RSPOS RSART
,TT_RESB_STD type standard table of TY_RESB_CDC
with empty key
---
,LT_RESB_CDC type TT_RESB_CDC
,WA_RESB_CDC type TY_RESB_CDC "like LINE OF LT_RESB_CDC
,LT_RESB_STD type TT_RESB_STD
,WA_RESB_STD type TY_RESB_CDC "like line of LT_RESB_STD
---
move-corresponding <FS_DATA> to WA_RESB_STD.
" already exists in CDC
if WA_RESB_STD eq WA_RESB_CDC. "<FS_RESB_CDC>.
continue. "no change, skip this record
In a Unicode program, a component name cannot contain a dot; the same holds for any other ABAP symbolic name.
The code below, with a component named .include, is therefore not permitted. You were probably misled by DDIC structures, which follow different rules:
TYPES: begin of TY_RESB_CDC,
...
UTIME type CDUZEIT,
.include type ZBW_MATERIAL_RESVN_RESB,
end of TY_RESB_CDC.
Instead, you should use the ABAP statement INCLUDE TYPE to include the components of a structure (e.g. ZBW_MATERIAL_RESVN_RESB in your case):
TYPES: begin of TY_RESB_CDC,
...
UTIME type CDUZEIT.
INCLUDE TYPE ZBW_MATERIAL_RESVN_RESB.
TYPES: end of TY_RESB_CDC.

How to escape special characters?

I am using an HTML purifier package to purify my rich text of any XSS before storing it in the database.
But my rich text allows Wiris symbols, which use special characters such as → or  .
The problem is that the package does not let me escape these characters; it removes them completely.
What should I do to escape them?
Example of the string before purifying
<p><math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>x</mi><mn>2</mn></msup><mo> </mo><mo>+</mo><mo> </mo><mmultiscripts><mi>y</mi><mprescripts/><none/><mn>2</mn></mmultiscripts><mo> </mo><mover><mo>→</mo><mo>=</mo></mover><mo> </mo><msup><mi>z</mi><mn>2</mn></msup><mo> </mo></math></p>
After purifying
<p><math xmlns="http://www.w3.org/1998/Math/MathML"><msup><mi>x</mi><mn>2</mn></msup><mo> </mo><mo>+</mo><mo> </mo><mmultiscripts><mi>y</mi><mprescripts></mprescripts><none><mn>2</mn></mmultiscripts><mo> </mo><mover><mo>→</mo><mo>=</mo></mover><mo> </mo><msup><mi>z</mi><mn>2</mn></msup><mo> </mo></math></p>
My guess is that these entities are failing the regexen that HTML Purifier is using to check for valid entities in HTMLPurifier_EntityParser, here:
$this->_textEntitiesRegex =
'/&(?:'.
// hex
'[#]x([a-fA-F0-9]+);?|'.
// dec
'[#]0*(\d+);?|'.
// string (mandatory semicolon)
// NB: order matters: match semicolon preferentially
'([A-Za-z_:][A-Za-z0-9.\-_:]*);|'.
// string (optional semicolon)
"($semi_optional)".
')/';
$this->_attrEntitiesRegex =
'/&(?:'.
// hex
'[#]x([a-fA-F0-9]+);?|'.
// dec
'[#]0*(\d+);?|'.
// string (mandatory semicolon)
// NB: order matters: match semicolon preferentially
'([A-Za-z_:][A-Za-z0-9.\-_:]*);|'.
// string (optional semicolon)
// don't match if trailing is equals or alphanumeric (URL
// like)
"($semi_optional)(?![=;A-Za-z0-9])".
')/';
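Notice the 0* in the // dec part, which is where leading zeros of decimal entities are handled. (Since 0* also matches no zeros at all, treat this as a guess rather than a confirmed cause. The regexen are perfectly sane for a library designed to handle pure HTML, without add-ons, and to make that safe; but in your use case, you want more entity flexibility.)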
You could extend that class and override the constructor, which is where these regexen are defined, with your own version that removes the 0* from the // dec part of the regexen. Then instantiate your class and try setting $this->_entity_parser on a Lexer created with HTMLPurifier_Lexer::create($config) to your instantiated EntityParser object. This is the part I am least sure would work; you might have to patch the Lexer with extends as well. Finally, supply the altered Lexer to the config using Core.LexerImpl.
I have no working proof-of-concept of these steps for you right now (especially in the context of Laravel), but you should be able to go through those motions in the purifier.php file, before the return.
I solved the problem by setting the key Core.EscapeNonASCIICharacters to true under the default key in my purifier.php file.

How to write constraint involving a certain parameter in clingo?

I am attempting to solve the Google ASP Competition 2019 : Insurance Referee Assignment problem. The problem is provided in this link.
There is a hard constraint that if a referee has a preference type of 0, then the case will not be assigned to that referee. I have simplified the problem to include only a few variables.
case(cid) refers to a case with cid as the case id.
ref(rid) refers to the referee with referee id.
pref(rid, type) gives the preference of referee 'rid'; type takes a value from 0 to 3. The higher the number, the more likely the referee is to take the case.
Given pref(10, 3) and pref(9, 2), the higher preference goes to referee 10.
I have tried the following clingo code:
ref(rid).
case(cid).
pref(rid, type).
assign(cid, rid) :- ref(rid), pref(rid, type), type != 0.
case(4).
ref(3).
ref(5).
pref(3, 0).
pref(5, 1).
#show assign/2.
However, when I run the command, it shows satisfiable but outputs only this
assign(cid, rid)
What am I doing wrong?
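In clingo, variables start with a capital letter and are "valid" only within the rule in which they appear; identifiers starting with a lowercase letter, such as cid and rid, are constants. So my guess is you want the following code: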
assign(Rcid, Rrid) :- case(Rcid), ref(Rrid), pref(Rrid, Rtype), Rtype != 0.
case(4).
ref(3). ref(5).
pref(3, 0). pref(5, 1).
#show assign/2.
output:
Solving...
Answer: 1
assign(4,5)
SATISFIABLE
Please note that merely renaming the constants to variable names will result in an error message, because Rcid would then not occur anywhere in the rule body (the rule is unsafe); you have to add case(Rcid) to the rule body as well. The variable names were chosen freely; you can use any variable name as long as it starts with a capital letter.
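For readers who think more easily in a general-purpose language, the corrected rule can be read as a set comprehension. The following Python sketch is only an analogy under that reading and says nothing about how clingo grounds or solves internally; the names cases, refs, prefs and assign are chosen here for illustration.

```python
# Facts from the example program
cases = {4}
refs = {3, 5}
prefs = {(3, 0), (5, 1)}  # (referee id, preference type)

# assign(C, R) :- case(C), ref(R), pref(R, T), T != 0.
# Every variable in the head (C, R) is bound by some "for" clause below,
# which mirrors the requirement that head variables occur in the rule body.
assign = {
    (c, r)
    for c in cases
    for r in refs
    for (pref_ref, pref_type) in prefs
    if pref_ref == r and pref_type != 0
}

print(sorted(assign))
```

Dropping the loop over cases would leave c unbound, which corresponds to the error clingo reports when case(Rcid) is missing from the body.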

Can there be multiples of the same CHOICE field in a rfc5280 certificate?

I'm currently validating my implementation of a certificate conforming to RFC 5280.
The General Name is defined as:
GeneralName ::= CHOICE {
otherName [0] OtherName,
rfc822Name [1] IA5String,
dNSName [2] IA5String,
x400Address [3] ORAddress,
directoryName [4] Name,
ediPartyName [5] EDIPartyName,
uniformResourceIdentifier [6] IA5String,
iPAddress [7] OCTET STRING,
registeredID [8] OBJECT IDENTIFIER }
Now I can't find the definition of the CHOICE keyword. Is it possible for my certificate to contain multiple directoryName-, or URI-fields? Or does choice mean any of the below but not more than once?
Is it possible for my certificate to contain multiple directoryName-, or URI-fields?
Yes.
Or does choice mean any of the below but not more than once?
Also yes.
A choice is a single choice. It probably says it succinctly in the document somewhere, but ITU-T X.680 always refers to choices as single values, such as:
29.8 The choice type contains values which do not all have the same tag. (The tag depends on the alternative which contributed the value to the choice type.)
(emphasis mine)
The certificate can contain multiple directoryName/etc values because things like the subject alternative name extension don't have GeneralName values, they have GeneralNames values. And, of course, GeneralNames ::= SEQUENCE SIZE (1..MAX) OF GeneralName
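As an illustration, here is a small Python sketch that builds a GeneralNames value holding two dNSName entries and splits it apart again. The helpers tlv and parse_tlvs and the host names are hypothetical; the sketch assumes the implicit tagging that RFC 5280 uses for GeneralName, so dNSName [2] is encoded with the primitive tag 0x82.

```python
def tlv(tag: int, content: bytes) -> bytes:
    """Encode one DER tag-length-value triple (short-form length only)."""
    assert len(content) < 0x80, "this sketch only handles contents < 128 bytes"
    return bytes([tag, len(content)]) + content

def parse_tlvs(data: bytes) -> list:
    """Split a byte string into (tag, content) pairs (short-form lengths only)."""
    items, i = [], 0
    while i < len(data):
        tag, length = data[i], data[i + 1]
        items.append((tag, data[i + 2 : i + 2 + length]))
        i += 2 + length
    return items

DNS_NAME = 0x82  # context-specific [2], primitive (IMPLICIT IA5String)

# GeneralNames ::= SEQUENCE OF GeneralName: each element is ONE choice,
# but the sequence may repeat the same alternative as often as needed.
general_names = tlv(0x30, tlv(DNS_NAME, b"a.example") + tlv(DNS_NAME, b"b.example"))

outer_tag, outer_content = parse_tlvs(general_names)[0]
names = parse_tlvs(outer_content)
for tag, value in names:
    print(f"tag {tag:02X}: {value.decode('ascii')}")
```

Each element carries exactly one alternative of the CHOICE; the repetition comes solely from the surrounding SEQUENCE OF.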

I need an example to understand Implicit Tagging in ASN.1

I have been going through the following tutorial
http://www.obj-sys.com/asn1tutorial/node12.html
Can you help me understand implicit tagging with an example?
In ASN.1 tagging, in fact, serves two purposes: typing and naming. Typing means it tells an en-/decoder what kind of data type that is (is it a string, an integer, a boolean, a set, etc.), naming means that if there are multiple fields of the same type and some (or all of them) are optional, it tells the en-/decoder for which field that value is.
If you compare ASN.1 to, let's say, JSON, and you look at the following JSON data:
"Image": {
"Width": 800,
"Height": 600,
"Title": "View from 15th Floor"
}
You'll notice that in JSON every field is always explicitly named ("Image", "Width", "Height", "Title") and either explicitly or implicitly typed ("Title" is a string, because its value is surrounded by quotes, "Width" is an integer, because it has no quotes, only digits, it's not "null", "true" or "false", and it has no decimal period).
In ASN.1 this piece of data would be:
Image ::= SEQUENCE {
Width INTEGER,
Height INTEGER,
Title UTF8String
}
This will work without any special tagging; here only the universal tags are required. Universal tags don't name data, they just type data, so the en-/decoder knows that the first two values are integers and the last one is a string. That the first integer is Width and the second one is Height doesn't need to be encoded in the byte stream; it is defined by their order (sequences have a fixed order, sets don't; the page you referred to uses sets).
Now change the schema as follows:
Image ::= SEQUENCE {
Width INTEGER OPTIONAL,
Height INTEGER OPTIONAL,
Title UTF8String
}
Okay, now we have a problem. Assume that the following data is received:
INTEGER(750), UTF8String("A funny kitten")
What is 750? Width or Height? Could be Width (and Height is missing) or could be Height (and Width is missing), both would look the same as a binary stream. In JSON that would be clear as every piece of data is named, in ASN.1 it isn't. Now a type alone isn't enough, now we also need a name. That's where the non-universal tags enter the game. Change it to:
Image ::= SEQUENCE {
Width [0] INTEGER OPTIONAL,
Height [1] INTEGER OPTIONAL,
Title UTF8String
}
And if you receive the following data:
[1]INTEGER(750), UTF8String("A funny kitten")
You know that 750 is the Height and not the Width (there simply is no Width). Here you declare a new tag (in that case a context specific one) that serves two purposes: It tells the en-/decoder that this is an integer value (typing) and it tells it which integer value that is (naming).
But what is the difference between implicit and explicit tagging? The difference is that implicit tagging just names the data, the en-/decoder needs to know the type implicitly for that name, while explicit tagging names and explicitly types the data.
If tagging is explicit, the data will be sent as:
[1]INTEGER(xxx), UTF8String(yyy)
so even if a decoder has no idea that [1] means Height, it knows that the bytes "xxx" are to be parsed/interpreted as an integer value. Another important advantage of explicit tagging is that the type can be changed in the future without changing the tag. E.g.
Length ::= [0] INTEGER
can be changed to
Length ::= [0] CHOICE {
integer INTEGER,
real REAL
}
Tag [0] still means length, but now length can either be an integer or a floating point value. Since the type is encoded explicitly, decoders will always know how to correctly decode the value and this change is thus forward and backward compatible (at least at decoder level, not necessarily backward compatible at application level).
If tagging is implicit, the data will be sent as:
[1](xxx), UTF8String(yyy)
A decoder that doesn't know what [1] is, will not know the type of "xxx" and thus cannot parse/interpret that data correctly. Unlike JSON, values in ASN.1 are just bytes. So "xxx" may be one, two, three or maybe four bytes and how to decode those bytes depends on their data type, which is not provided in the data stream itself. Also changing the type of [1] will break existing decoders for sure.
Okay, but why would anyone use implicit tagging? Isn't it better to always use explicit tagging? With explicit tagging, the type must also be encoded in the data stream and this will require two additional bytes per tag. For data transmissions containing several thousand (maybe even millions of) tags, where maybe every single byte counts (very slow connection, tiny packets, high packet loss, very weak processing devices) and where both sides know all custom tags anyway, why waste bandwidth, memory, storage and/or processing time on encoding, transmitting and decoding unnecessary type information?
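The size difference can be checked with a few lines of Python. This is a minimal sketch assuming DER with short-form lengths; the helpers tlv and der_integer are written here only for illustration and handle non-negative integers and contents shorter than 128 bytes.

```python
def tlv(tag: int, content: bytes) -> bytes:
    """Encode one DER tag-length-value triple (short-form length only)."""
    assert len(content) < 0x80, "this sketch only handles contents < 128 bytes"
    return bytes([tag, len(content)]) + content

def der_integer(value: int) -> bytes:
    """Content bytes of a non-negative DER INTEGER (leading 0x00 if needed)."""
    assert value >= 0
    return value.to_bytes(value.bit_length() // 8 + 1, "big")

height = der_integer(750)            # 02 EE

universal = tlv(0x02, height)        # plain INTEGER: type only, no name
explicit = tlv(0xA1, universal)      # [1] EXPLICIT: wrapper around the INTEGER
implicit = tlv(0x81, height)         # [1] IMPLICIT: tag replaces the INTEGER tag

for label, data in [("universal", universal), ("explicit", explicit), ("implicit", implicit)]:
    print(f"{label:9s} {' '.join(f'{b:02X}' for b in data)}")
```

The explicit form keeps the inner 02 (INTEGER) tag and its length, which is exactly the two extra bytes; the implicit form drops them and relies on the decoder knowing that [1] is an INTEGER.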
Keep in mind that ASN.1 is a rather old standard and it was intended to achieve a highly compact representation of data at a time where network bandwidth was very expensive and processors several hundred times slower than today. If you look at all the XML and JSON data transfers of today, it seems ridiculous to even think about saving two bytes per tag.
I find this thread to be clear enough; it also contains (small) examples, even though they are quite 'extreme' examples at that. More 'realistic' examples using IMPLICIT tags can be found on this page.
Using the accepted answer as an example of encoding:
Image ::= SEQUENCE {
Width INTEGER,
Height INTEGER,
Title UTF8String
}
An example of encoding would be:
The internal sequence breaks down into:
Explicit Optional
If you then have EXPLICIT OPTIONAL values:
Image ::= SEQUENCE {
Width [0] EXPLICIT INTEGER OPTIONAL,
Height [1] EXPLICIT INTEGER OPTIONAL,
Title UTF8String
}
The encoded sequence might be:
SEQUENCE: 30 16 A1 04 02 02 02 EE 0C 0E 41 20 66 75 6E 6E 79 20 6B 69 74 74 65 6E (22 bytes of content)
And the internal sequence breaks down into:
CONTEXT[1] INTEGER: A1 04 02 02 02 EE = 750 (2-byte value inside a 4-byte explicit wrapper)
UTF8STRING: 0C 0E 41 20 66 75 6E 6E 79 20 6B 69 74 74 65 6E = "A funny kitten" (14 bytes)