I was reading the W3C URL Specification and I noticed that there is nothing explicitly mentioned about this.
Experiments
So what I tried in curl was
www.google.com
and then
www.GOOGLE.com
and these returned the same document. So I thought maybe Google owns all variations of its domain name, so I tried other sites and got mixed results.
So I mixed the case in the URL of the URL Specification itself, and it seems to allow mixed case.
Applying this to REST API Design
So when applying this to REST API design, sometimes we use the notion of an identifier to return a specific resource from the server. E.g.
In https://localhost:8080/contacts/MYSELF, MYSELF would be the typical identifier
Based on those previous experiments, the case of MYSELF should not matter. But what if I wanted strict validation on the identifier?
Sure, you can go against the spec and do this in the application; but what is the appropriate thing to do in this case?
So back to the subject. Are URLs meant to be case insensitive?
General URI syntax
In the general URI syntax (as defined by RFC 3986, which is currently the Internet Standard for URIs), only two components are case-insensitive:
Scheme:
[…] schemes are case-insensitive […]
Host:
The host subcomponent is case-insensitive.
And the letters in percent-encoding triplets (i.e., a-f, A-F) are case-insensitive, too.
Everything else is case-sensitive.
However, URI schemes can override this for their URIs (see Case Normalization).
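To illustrate the point about percent-encoding triplets, here is a minimal Python sketch that uppercases only the hex digits of each triplet and leaves every other character untouched. The function name `uppercase_triplets` and the sample path are made up for this example.

```python
import re

# Matches one percent-encoded triplet such as %2f or %2F.
_TRIPLET = re.compile(r"%[0-9A-Fa-f]{2}")


def uppercase_triplets(uri: str) -> str:
    """Uppercase the hex digits of each triplet; leave everything else as-is."""
    return _TRIPLET.sub(lambda match: match.group(0).upper(), uri)


# The letters in "caf" and "Foo" keep their case; only the triplets change.
print(uppercase_triplets("/caf%c3%a9/Foo"))
```

After this step, two URIs that differ only in the case of their triplets compare equal as plain strings.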
HTTP(S) URIs
In the case of HTTP(S) URIs, the spec doesn’t make any other components case-insensitive (it restates that scheme and host are case-insensitive).
That means the following HTTP URIs are equivalent:
http://example.com/foo
HTTP://example.com/foo
http://EXAMPLE.com/foo
http://example.COM/foo
HTTP://EXAMPLE.COM/foo
htTp://exAMPlE.cOm/foo
(Best practice is to normalize scheme and host to lowercase.)
And these are not equivalent:
http://example.com/foo?bar#baz
http://example.com/fOo?bar#baz
http://example.com/foo?bAr#baz
http://example.com/foo?bar#bAz
http://example.com/FOO?BAR#BAZ
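The two lists above can be checked with a small Python sketch that lowercases only the scheme and host before comparing. The helper names are hypothetical, and the sketch assumes URIs without userinfo or IPv6 literals, which it does not preserve.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_scheme_and_host(uri: str) -> str:
    """Lowercase the scheme and host; keep path, query and fragment verbatim."""
    parts = urlsplit(uri)
    host = parts.hostname or ""  # urlsplit already lowercases the hostname
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )


def equivalent(a: str, b: str) -> bool:
    """Compare two URIs after case normalization of scheme and host only."""
    return normalize_scheme_and_host(a) == normalize_scheme_and_host(b)


print(equivalent("http://example.com/foo", "htTp://exAMPlE.cOm/foo"))
print(equivalent("http://example.com/foo?bar#baz", "http://example.com/fOo?bar#baz"))
```

The first comparison differs only in scheme and host case; the second differs in the path, which the sketch deliberately leaves alone.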
Domains are case-insensitive. You don't need to buy every variation, because getting the domain gives you every variation.
There is no specification that says that the 'path' part of a url has to be a particular case. Paths are not case insensitive though, so accessing /foo and /FOO on the same domain may yield different resources.
According to the W3C specification:
URLs in general are case-sensitive (with the exception of machine
names). There may be URLs, or parts of URLs, where case doesn't
matter, but identifying these may not be easy. Users should always
consider that URLs are case-sensitive.
Source:- https://www.w3.org/TR/WD-html40-970708/htmlweb.html
Related
I'm learning REST and I have a question.
Is there a scenario where the endpoint person/pathParm1/PathParam2 is legitimate?
For example:
person/ben/stiller
people/2/4
As far as I understand REST, query parameters should be used for searches:
person?firstName=ben&secondName=stiller
or
person/2/order4
REST doesn't care what spelling conventions you use for your resource identifiers.
So if you want to have a URI template with multiple variables to expand, and more than one of those variables are expanded as path segments, that's fine.
For example, you'll notice that your browser has no trouble with this resource identifier:
https://stackoverflow.com/questions/74969638/endpoint-with-two-path-parameters
which might reasonably be produced by expanding variables into a template like
https://stackoverflow.com/questions/{id}/{hint}
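As a rough illustration of expanding that template, here is a Python sketch. It is not a full URI Template (RFC 6570) implementation; it only handles simple `{name}` variables, and the `expand` helper is made up for this example.

```python
from urllib.parse import quote


def expand(template: str, **variables: str) -> str:
    """Substitute {name} placeholders, percent-encoding each value."""
    # safe="" also encodes "/" so a value cannot spill into another path segment.
    encoded = {name: quote(str(value), safe="") for name, value in variables.items()}
    return template.format(**encoded)


url = expand(
    "https://stackoverflow.com/questions/{id}/{hint}",
    id="74969638",
    hint="endpoint-with-two-path-parameters",
)
print(url)
```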
As far as I understand REST, query parameters should be used for searches:
That's not a REST constraint, although for the special case of the web it turned out that way. This is primarily a historical accident: we didn't have standards for URI templates when the web was young, which meant that searches came about from the standardized implementation of HTML form submissions (application/x-www-form-urlencoded key value parameters replacing the query part of the form action)
REST does say that we use resource identifiers to... identify resources; and that we all use the same general purpose resource identifiers (i.e., conforming to the production rules defined in RFC 3986), but without constraints on the spelling or semantics of those identifiers.
Example: URL shorteners work.
(Note: your misunderstanding is a common one, and not at all your fault; the literature sucks. FWIW, I was once where you are; Stefan Tilkov's 2014 talk was the one that really got my own thinking straightened out.)
That said, you might find a "query parameters should be used for searches" constraint coming from somewhere else; a local style guide, for example.
this means I could also make a restful endpoint like this: api/person/{firstName}/{lastName} instead api/person?firstName=ben&lastName=stiller ?
Yes; you can use either of those spellings for your resource identifiers, and all of the general purpose components out there will still "just work" -- because they are treating the resource identifier as semantically opaque.
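A short Python sketch of what this looks like on the server side, assuming hypothetical handler helpers: both spellings carry the same information, and only the server needs to know how to extract it.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def person_from_path(uri: str) -> dict:
    """Read api/person/{firstName}/{lastName} from the last two path segments."""
    segments = urlsplit(uri).path.strip("/").split("/")
    return {"firstName": unquote(segments[-2]), "lastName": unquote(segments[-1])}


def person_from_query(uri: str) -> dict:
    """Read api/person?firstName=...&lastName=... from the query part."""
    params = parse_qs(urlsplit(uri).query)
    return {"firstName": params["firstName"][0], "lastName": params["lastName"][0]}


print(person_from_path("api/person/ben/stiller"))
print(person_from_query("api/person?firstName=ben&lastName=stiller"))
```

Both calls produce the same dictionary, which is the sense in which the choice between the two spellings is up to the server.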
For getting the latest valid address (of the logged in user), how RESTful is the following URL?
GET /addresses/valid/latest
Probably
GET /addresses?valid=true&limit=1
is the best, but it should then return a list. And I'd like to return an object rather than a list.
Any other suggestions?
Your url structure doesn't have much to do with how RESTful something is.
So let's ask instead which one is the 'best'. That's also a bit hard to say, and pretty subjective.
I would generally avoid a pattern like /addresses/valid/latest. This kinda suggests that there is a 'latest resource' in the 'valid collection', in the 'addresses collection'.
So I like your other suggestion a bit better, because it suggests that you're using an 'addresses' collection, filtering by valid items and only showing 1.
If you don't want all kinds of parameters, I would be more inclined to find a url pattern that's not literally 'addresses, but only the valid, but only the latest', but think about what the purpose is of the endpoint. Maybe something that's easier to remember like /fresh-address =)
how RESTful is the following URL?
Any identifier that satisfies the production rules described by RFC 3986 is RESTful.
General purpose components are not supposed to derive semantics from identifiers, they are opaque. Which means that the server is free to encode information into those identifiers at its own discretion.
Consider Google search: does your browser care what URI is used as the target of the search form? Does your browser care about the href provided by Google with each search result? In both cases, the browser just does what it is told, which is to say it creates an HTTP request based on the representation of application state that was provided by the server.
URI are in the same broad category as variable names in a programming language - the machines don't care so long as the spellings are consistent with some simple constraints. People care, so there are some benefits to having a locally consistent and logical scheme.
But there are contexts in which easily guessed URI are not what you want. See Mark Seemann 2013.
Since the semantic content of the URI is reserved for use by the server only, it follows that the server can choose to encode that information into path segments or the query part. Or both.
Spellings that can be described by a URI Template can be very powerful. The most familiar URI template is probably an HTML form using the GET method, which encodes key value pairs onto the query part of the URI; so you should think about whether that's a use case you want to support.
What would be the best approach to implement a GET REST API in order to check if a given URL existed in the database.
Each GET request will have the following parts : hostname, port, origin, path, and query.
My idea is to setup the resource as follows.
/urlservice/1/{hostname}/{port}/{origin}/{path}/{query}
But this seems very verbose since it will results in resource urls like:
/urlservice/1/google.com/80/"https://google.com/"/"/search"/"?q=aba"
What is a better way of designing this?
The main caveat with REST when designing your URI structure is that you follow the URI spec. That being said, looking at the URI spec in regards to the structure it has a couple important notes to help with your question:
1.2.3. Hierarchical Identifiers
The URI syntax is organized hierarchically, with components listed in
order of decreasing significance from left to right. For some URI
schemes, the visible hierarchy is limited to the scheme itself:
everything after the scheme component delimiter (":") is considered
opaque to URI processing. Other URI schemes make the hierarchy
explicit and visible to generic parsing algorithms.
The generic syntax uses the slash ("/"), question mark ("?"), and
number sign ("#") characters to delimit components that are
significant to the generic parser's hierarchical interpretation of an
identifier...
Now in regards to the query string:
3.4. Query
The query component contains non-hierarchical data that, along with
data in the path component (Section 3.3), serves to identify a
resource within the scope of the URI's scheme and naming authority
(if any). The query component is indicated by the first question
mark ("?") character and terminated by a number sign ("#") character
or by the end of the URI...
With the above, looking at your URI, you need to determine whether your structure is hierarchical or not in order to follow the URI spec and satisfy REST. Of course, there is a bit of subjectivity here, but looking at what you have, most (if not all) of the parameters you called out look like candidates for the query string, as they are non-hierarchical. To that end, I'd recommend moving them to the query string.
/urlservice/1?hostname={hostname}&port={port}&origin={origin}&path={path}&query={query}
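As a sketch of how a client might build that URI in Python: `urlencode` percent-encodes each value, so the nested URL no longer needs the quotes from the original example. The `lookup_uri` helper and the sample values are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def lookup_uri(params: dict) -> str:
    """Build the lookup URI with every value percent-encoded into the query."""
    return "/urlservice/1?" + urlencode(params)


params = {
    "hostname": "google.com",
    "port": "80",
    "origin": "https://google.com/",
    "path": "/search",
    "query": "q=aba",
}
uri = lookup_uri(params)
print(uri)

# The server side can recover the original values from the query part.
print(dict(parse_qsl(urlsplit(uri).query)))
```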
Again, as there is some subjectivity and you know your domain better than others, use the above guidance and make your best judgement call.
You could decompose it to be service based rather than specifying everything in the request:
/urlservice/1/{service}/{request}
The services would be 'service based' so a google search service would know how to build a google search url:
/urlservice/1/google/aba
would be resolved to:
https://google.com/search/?q=aba
by the google service. It also means all clients wouldn't have to change if google changed their service parameters. Only the google service would change its internal implementation of the url builder.
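A minimal Python sketch of this idea, assuming a hypothetical registry of per-service URL builders: clients only ever see /urlservice/1/{service}/{request}, and each builder hides the target's parameter names.

```python
from urllib.parse import urlencode


def build_google_search(request: str) -> str:
    """Hypothetical builder: only this function knows Google's parameter names."""
    return "https://google.com/search/?" + urlencode({"q": request})


# Registry mapping the {service} path segment to its URL builder.
SERVICES = {"google": build_google_search}


def resolve(path: str) -> str:
    """Resolve /urlservice/1/{service}/{request} to the target URL."""
    _, _, service, request = path.strip("/").split("/", 3)
    return SERVICES[service](request)


print(resolve("/urlservice/1/google/aba"))
```

If the target's parameters change, only `build_google_search` needs updating.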
This may be a self-answering question, but I'm hoping one of you could point me to any resource where it is declared, or can be inferred, whether to use upper or lower case letters when declaring an HTTP method name in HTTP or REST requests. The majority of examples I see put GET, PUT, POST, DELETE, PATCH etc in capital letters, whereas I go on the assumption that HTTP method field names are case insensitive - that is, for example, that "get" is equally as valid as "GET". Traditionally, I have always used capital letters, but I would just like to be sure.
The W3C explicitly declares that the method is case-sensitive and uses upper case, but in my travails, I've often encountered HTTP method field values using lower case, which I assume are incorrect, so from my point of view, it seems that practices and standards are somewhat out of touch on this matter.
Upper case is correct - right?
Method names are case-sensitive, and all registered methods are all upper-case.
(and the W3C really doesn't matter here; what's relevant are RFCs 7230 and 7231).
https://www.rfc-editor.org/rfc/rfc7231#section-4.1
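A small Python sketch of what case-sensitive matching means in practice. The set below lists the methods defined in RFC 7231 plus PATCH (RFC 5789); it is not the full registry, and the helper name is made up.

```python
# Methods from RFC 7231 section 4.1, plus PATCH from RFC 5789.
REGISTERED_METHODS = frozenset(
    {"GET", "HEAD", "POST", "PUT", "DELETE", "CONNECT", "OPTIONS", "TRACE", "PATCH"}
)


def is_registered_method(token: str) -> bool:
    """Exact, case-sensitive comparison: 'get' is a different token from 'GET'."""
    return token in REGISTERED_METHODS


print(is_registered_method("GET"))
print(is_registered_method("get"))
```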
Yes, as "Julian Reschke" said, it should be upper-case.
If you are talking directly to a server like Django, Flask or Express (Node), your lower-case method names will be translated to upper-case automatically.
Frontend <---> Backend
But if there is a proxy in between, such as nginx, it will be a problem and you will get an error. Also, most services on cloud platforms like AWS, GCP and Azure use nginx behind the scenes, so you will probably run into this issue.
I'm reading the API for arin and notice that many of the REST parameters are in uppercase.
Is there a standard that defines what I should be expecting in a RESTful service?
The HTTP spec (RFC 2616) states that everything other than the scheme and the host of a URL should be case-sensitive.
I realize this part of the spec is regularly ignored, but that's the official word.
No, there isn't. The REST principle is based on the original ideas of the HTTP protocol, and it doesn't restrict parameter names to be case-sensitive, case-insensitive, upper case or lower case.
First you should decide if you want the parameters to be case-sensitive or not. Perhaps the system that you are using makes either one a natural choice. (Other than that, I can't really think of a good reason to make them case-sensitive.)
Then you should decide on a casing that goes well with your resource addresses. If your addresses are in all lowercase, then it might look better if the parameter names are too.
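If you do decide on case-insensitive parameter names, one way to get there is to fold the names to lowercase when parsing. This is only a sketch; the `parse_params` helper and its `case_sensitive` flag are invented for this example, and only the names are folded, never the values.

```python
from urllib.parse import parse_qsl


def parse_params(query: str, case_sensitive: bool = False) -> dict:
    """Parse a query string; optionally fold parameter names to lowercase."""
    pairs = parse_qsl(query)
    if case_sensitive:
        return dict(pairs)
    # Fold only the names; values keep their original case.
    return {name.lower(): value for name, value in pairs}


print(parse_params("FirstName=Ben&LASTNAME=Stiller"))
print(parse_params("FirstName=Ben&LASTNAME=Stiller", case_sensitive=True))
```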