Is the character "|" valid for URL parameter names? - rest

Is this a valid URL? Should the character "|" be included in it?
Sure, this works but I don't think it's the right way to do it
https://sales-stage-api.techsg.cloud/requests/statistics?"meetingTime"|date:timeZone={"start":"2022-02-01 00:00:00 +07:00","end":"2022-02-28 23:59:00 +07:00"}

A Vertical Line (U+007C) should never appear in the query part of a URI, because it is not consistent with the production rules defined in RFC-3986
query = *( pchar / "/" / "?" )
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
Note that the spaces and braces are also suspect.
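As a rough way to check a query string against those production rules, here is a minimal Python sketch. The names `QUERY_LITERALS` and `invalid_query_chars` are made up for illustration; the character sets are copied from the RFC 3986 rules quoted above.

```python
import re
import string

# Character classes from RFC 3986, as quoted above.
UNRESERVED = string.ascii_letters + string.digits + "-._~"
SUB_DELIMS = "!$&'()*+,;="
# query = *( pchar / "/" / "?" ), where pchar also allows ":" and "@".
QUERY_LITERALS = set(UNRESERVED + SUB_DELIMS + ":@/?")
PCT_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}")


def invalid_query_chars(query: str) -> list:
    """Return the sorted characters in `query` that the grammar does not allow."""
    # Valid percent-encoded triplets are fine; drop them before checking the rest.
    stripped = PCT_ENCODED.sub("", query)
    return sorted({ch for ch in stripped if ch not in QUERY_LITERALS})


bad = invalid_query_chars('"meetingTime"|date:timeZone={"start":"2022-02-01 00:00:00 +07:00"}')
print(bad)  # [' ', '"', '{', '|', '}']
```

The colon and the plus sign pass, while the space, double quote, braces and vertical line are all reported.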
Key/Value pairs in the query string are normally an indication that they are an application/x-www-form-urlencoded representation of some information (for instance, values collected from HTML Form input controls); so you'll usually want to ensure that the serialization of that information matches the deserialization.
What spellings should be used for keys and values (before serialization/after deserialization) is largely a local design concern: the origin server controls its own space of resource identifiers, so if it wants to have some name like:
"meetingTime"|date:timeZone
Then that's fine? It's tradeoffs - you give up something, and get something else in return; if the thing you get is more important than the thing you are giving up, then you are winning.
That said, I haven't the foggiest idea what this designer thinks they are getting in return, that offsets the assorted miseries that this spelling convention introduces.
This is not a design that would make it through my review process without a lot of supporting documentation.
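If the key does keep that spelling, the serialization side can at least be made valid by form-urlencoding it. A minimal sketch using Python's standard `urllib.parse`, with the key and value taken from the question (the variable names are only for illustration):

```python
from urllib.parse import urlencode, parse_qs

# Key and value as the application sees them, before serialization.
key = '"meetingTime"|date:timeZone'
value = '{"start":"2022-02-01 00:00:00 +07:00","end":"2022-02-28 23:59:00 +07:00"}'

# urlencode produces application/x-www-form-urlencoded output:
# '"' becomes %22, '|' becomes %7C, spaces become '+', a literal '+' becomes %2B.
query = urlencode({key: value})

# parse_qs reverses the encoding, so the original spellings survive the round trip.
decoded = parse_qs(query)
assert decoded == {key: [value]}
```

The resulting query contains only characters that the RFC 3986 grammar allows, at the cost of being much harder to read.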

Related

Special character in key in the conf file for Internationalization in Play Framework

I am trying to use the Internationalization feature of the Play Framework.
It involves creating a new conf file for each language that we want to support. For example, for French we create a messages.fr file in the conf folder.
Inside it we define key-values like this:
Hello.World = 'Bonjour le monde'
Now the issue is that I have lines that contain characters like "," and "(" and if these are included in the key then we get the error in parsing from the MessageApi
Example
Hello.(World) = 'Bonjour (le monde)'
Here the "(" before and after World is throwing an error while parsing.
Anyone having any idea how we could achieve this?
Try to escape these special characters:
Hello.\(World\) = 'Bonjour (le monde)'
Other examples:
string_one = String for translation 1
string_two = one + one \= two
# String key with spaces
key\ with\ spaces = This is the value that could be looked up with the key "key with spaces".
# Backslash in value should be escaped by another backslash
path=c:\\wiki\\ templates
Also, you can try to escape special characters by using Java Unicode:
Hello.\u0028World\u0029 = 'Bonjour (le monde)'
Reference - How to escape the equals sign in properties files

Rest API - Multi-Column Sort issue

I have seen a few articles about best practices for REST APIs, and they suggest the following for multi-column sort.
GET /users?sort_by=-last_modified,+email
https://www.moesif.com/blog/technical/api-design/REST-API-Design-Filtering-Sorting-and-Pagination/
When I am using this approach, I see that - works fine but + gets replaced by a space.
A quick google indicates that + is a special character after ? in URL. What am I missing out here?
> The following characters have special meaning in the path component of
> your URL (the path component is everything before the '?'): ";" |
> "/" | "?"
>
> In addition to those, the following characters have special meaning in
> the query part of your URL (everything after '?'). Therefore, if they
> are after the '?' you need to escape them: ":" | "@" | "&" | "=" |
> "+" | "$" | ","
>
> For a more in-depth explanation, see the RFC.
What am I missing out here?
History, mostly.
U+002B (+) is a sub-delim, in the context of a URI, and can be used freely in the query part; see RFC 3986 Appendix A.
But on the web, a common source of query data is HTML form submissions; when we submit a form, the processing engine collects the key value pairs from the form and creates an application/x-www-form-urlencoded character sequence, which becomes the query of the URI.
Because this is such a common case, the query parsers in web server frameworks often default to reversing the encoding before giving your bespoke code access to the data.
Which means that in your web logs, you would see:
/users?sort_by=-last_modified,+email
because that's the URI that you received, but in your parameter mapping you would see
"sort_by" = "-last_modified, email"
Because the "form data" is being decoded before you get to look at it.
Form urlencoding has an explicit step in it that replaces any spaces (U+0020) with U+002B, and U+002B is instead percent-encoded.
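The same decoding can be reproduced outside a web framework. This is a small sketch with Python's standard `urllib.parse`, which applies the form-urlencoding rules; your framework's parser is assumed to behave similarly.

```python
from urllib.parse import parse_qs, urlencode

# A literal '+' in the query is decoded as a space.
raw = parse_qs("sort_by=-last_modified,+email")
print(raw)  # {'sort_by': ['-last_modified, email']}

# A percent-encoded plus (%2B) is decoded as a real '+'.
escaped = parse_qs("sort_by=-last_modified,%2Bemail")
print(escaped)  # {'sort_by': ['-last_modified,+email']}

# Encoding on the client side does the escaping for you.
query = urlencode({"sort_by": "-last_modified,+email"})
print(query)  # sort_by=-last_modified%2C%2Bemail
```

Letting the client library build the query string avoids having to remember which characters need escaping.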
To check if this is what is going on, try instead the following request:
GET /users?sort_by=-last_modified,%2Bemail
What I expect you will find is that the plus you are looking for now appears in your form parameters:
"sort_by" = "-last_modified,+email"

How to compute a canonical email?

Let's say I log into a website with my email address somewords@gmail.com. A unique index alone cannot prevent someone else from registering the same mailbox, because Some.Words@gmail.com is the same address under GMail's rules. The solution would be to compute a canonical email and make that unique, but how do I compute it?
I couldn't find any resource on this subject except for GMail. Is that because it's the only mailbox provider doing this? If not, are there general rules? If not, what are each provider's specific rules?
RFC5321 Section 4.1.2. defines the possible charset:
Local-part = Dot-string / Quoted-string; MAY be case-sensitive
Where, eventually, it goes back to atext which is accidentally missing from the said RFC, but it's expanded elsewhere (as well as in errata):
atext = ALPHA / DIGIT / ; Printable US-ASCII
"!" / "#" / ; characters not including
"$" / "%" / ; specials. Used for atoms.
"&" / "'" /
"*" / "+" /
"-" / "/" /
"=" / "?" /
"^" / "_" /
"`" / "{" /
"|" / "}" /
"~"
So apart from the wide range of characters available, it also specifies that local parts MAY be case-sensitive.
Apart from those formal rules there are no restrictions on the interpretation of the local part, so what you want doesn't exist. It is always up to the local mail server to interpret local parts, do equivalence conversions, ignore characters and the like.
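In practice that leaves a per-provider rule table. The following Python sketch is illustrative only: `canonical_email` is a hypothetical helper, and the single GMail rule (ignore dots and case in the local part) is taken from the question rather than from any specification. Only the domain is lowercased unconditionally, since domain names are case-insensitive.

```python
def canonical_email(address: str) -> str:
    """Best-effort canonical form; provider rules are assumptions, not a standard."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an addr-spec: {address!r}")

    # Domains are case-insensitive, so this step is always safe.
    domain = domain.lower()

    # Assumed provider-specific rule, as described in the question:
    # GMail ignores dots and letter case in the local part.
    if domain == "gmail.com":
        local = local.replace(".", "").lower()

    # For any other provider the local part is left untouched,
    # because it MAY be case-sensitive.
    return f"{local}@{domain}"


print(canonical_email("Some.Words@GMail.com"))  # somewords@gmail.com
```

Any provider not listed is treated conservatively, which means two spellings of the same mailbox can still slip through.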

mailto: with multiple addresses and real names

I have backend admin tool that manages a number of groups of people working in different sections. From time to time I need to email all the people in one group, so I created a button in my admin tool which does a simple mailto: for all the users in that section. For example:
Mail All
And this works fine. However, I wanted to add their real names to the mailto link so when I'm sending the mail I can quickly see who's in the group. So I tried formatting the link like this:
Mail All
But that seemed to only pick up the first email address and list the 'real name' as one long name with commas.
Searching the web, documentation is scant on multiple addresses with real names (only found info when sending one). So wondering whether this is (a) not possible, (b) possible, but I've got the syntax wrong or c) only possible if I use a workaround like copying all the email address data onto the clipboard and paste it into the mail.
Any email gurus out there?
I had the same question. I fooled around and got it to work using UTF-8 encoding.
"First Last" <firstlastname@example.com>
becomes
Send Email
I was also able to add a bcc field with multiple addresses by following the example above and separated them with commas.
This launched in both Outlook and in Gmail Compose by replacing "mailto:" with "https://mail.google.com/mail/?view=cm&fs=1&tf=1&to="
I say "not possible", at least the way I interpret RFC 6068:
The syntax of a 'mailto' URI is described using the ABNF of [STD68],
non-terminal definitions from [RFC5322] (dot-atom-text, quoted-
string), and non-terminal definitions from [STD66] (unreserved, pct-
encoded):
mailtoURI = "mailto:" [ to ] [ hfields ]
**to = addr-spec *("," addr-spec )
hfields = "?" hfield *( "&" hfield )
hfield = hfname "=" hfvalue
hfname = *qchar
hfvalue = *qchar
**addr-spec = local-part "@" domain
**local-part = dot-atom-text / quoted-string
domain = dot-atom-text / "[" *dtext-no-obs "]"
dtext-no-obs = %d33-90 / ; Printable US-ASCII
%d94-126 ; characters not including
; "[", "]", or "\"
qchar = unreserved / pct-encoded / some-delims
some-delims = "!" / "$" / "'" / "(" / ")" / "*"
/ "+" / "," / ";" / ":" / "@"
(I've marked the interesting rules with **)
Specifically:
<addr-spec> is a mail address as specified in [RFC5322], but excluding
<comment> from [RFC5322].
The address format you are trying to use is called name-addr in RFC5322. addr-spec is just the name@domain part.
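A link that stays within that grammar lists bare addresses separated by commas. Here is a minimal Python sketch; `mailto_uri` is a made-up helper and the example.com addresses are placeholders.

```python
from urllib.parse import quote


def mailto_uri(addresses, subject=None):
    """Build a mailto URI from bare addr-specs (no display names)."""
    # Percent-encode anything unusual in each address but keep "@" literal.
    to = ",".join(quote(addr, safe="@") for addr in addresses)
    uri = "mailto:" + to
    if subject is not None:
        # Header field values are percent-encoded; a space becomes %20.
        uri += "?subject=" + quote(subject, safe="")
    return uri


link = mailto_uri(["alice@example.com", "bob@example.com"], "Group update")
print(link)  # mailto:alice@example.com,bob@example.com?subject=Group%20update
```

Real names have no place in this form, so showing them would have to happen in the admin tool itself rather than in the link.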

What's the best candidate padding char for url-safe and filename-safe base64?

The padding char for the official base64 is '=', which might need to be percent-encoded when used in a URL. I'm trying to find the best padding char so that my encoded string can be both url safe (I'll be using the encoded string as parameter value, such as id=encodedString) AND filename safe (I'll be using the encoded string directly as filename).
Dot ('.') is a popular candidate, it's url safe but it's not exactly filename safe: Windows won't allow a file name which ends with a trailing dot.
'!' seems to be a viable choice, although I googled and I've never seen anybody using it as the padding char. Any ideas? Thanks!
Update: I replaced "+" with "-" (minus) and replaced "/" with "_" (underscore) in my customized base64 encoding already, so '-' or '_' is not available for the padding char any more.
The best solution (I've spent the last month working on this problem with an email sending website) is to not use the padding character (=) at all.
The only reason the padding character is there is because of "lazy" decoders. You can very easily add the missing =: take the text length % 4, subtract the result from 4, and that is how many = you need to append to the end of the string (a result of 4 means none are needed). Here is C# code:
var pad = 4 - (text.Length % 4);
if (pad < 4)
text = text.PadRight(text.Length + pad, '=');
Also, most people who do this are interested in replacing + and / with other URL-safe characters. I propose:
+ replace with -
/ replace with _
DO NOT USE . as it can produce crazy results on different systems / web servers (for example on IIS Base64 encoded string can't end with . or IIS will search for the file)
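For comparison, the same idea can be sketched in Python with the standard `base64` module, whose URL-safe variant already uses - and _. The function names are made up for illustration.

```python
import base64


def encode_unpadded(data: bytes) -> str:
    """URL-safe base64 with the trailing '=' padding stripped."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def decode_unpadded(text: str) -> bytes:
    """Restore the padding that encode_unpadded removed, then decode."""
    pad = -len(text) % 4  # 0, 1, 2 or 3 characters short of a multiple of 4
    return base64.urlsafe_b64decode(text + "=" * pad)


token = encode_unpadded(b"\xfb\xff")
print(token)  # -_8
assert decode_unpadded(token) == b"\xfb\xff"
```

The token contains only letters, digits, - and _, so it can be used as a parameter value or as a file name, subject to the case-sensitivity caveat for file systems.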
The RFC 2396 unreserved characters in URIs are:
"-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
It's worth pointing out, though, that the Microsoft article also says "Do not assume case sensitivity." Perhaps you should just stick with base 16 or 32?
The Wikipedia article states;
a modified Base64 for URL variant
exists, where no padding '=' will be
used
I would go with '-' or '_'
They're URL and file safe, and they look more or less like padding