JMAP uses /.well-known for service discovery; would it be considered a valid use of RFC 5785?

I was surprised to find the following in the JMAP spec:
A JMAP-supporting email host for the domain example.com SHOULD publish a SRV record _jmaps._tcp.example.com which gives a hostname and port (usually port 443).
The authentication URL is https://hostname/.well-known/jmap (following any redirects).
Other autodiscovery options using autoconfig.example.com or autodiscover.example.com may be added to a future version of JMAP to support clients which can’t use SRV lookup.
It doesn't match the original use cases for the well-known URI registry. Stuff like robots.txt, or dnt / dnt-policy.txt. And IPP / CUPS printing works fine without it, using a DNS TXT record to specify a URL. If you can look up SRV records, you can equally look up TXT. And the autodiscovery protocol involves XML which can obviously include a full URI.
E.g. what chance is there of this being accepted by the registry of well-known URIs? Or is it more likely to remain as something non-standard, like made-up URI schemes?

The idea almost certainly came from CalDAV, which is already in the registry of well-known URIs. RFC 6764 defines CalDAV/CardDAV service discovery using DNS SRV, DNS TXT and a well-known URI, so JMAP's proposal is perfectly well-founded.
It might sound strange that the well-known URL doubles as the authentication URL, but this too has a precedent in CalDAV. I think it helps shard users between multiple servers.
IMO it's not a good way to use SRV. On the other hand, JMAP is specifically considering clients that don't use SRV, and one presumes the CalDAV usage exists for similar reasons.
It does seem bizarre that presumably web-centric implementations can't manage to discover full URIs (i.e. if they're using the autoconfig protocol).
I think you have to remember that these approaches start from user email addresses. The hallowed Web Architecture using HTTP URIs for everything... well, let's say it doesn't have much to say about mailto: URIs. DNS has got to be the "right" way to bridge the gap from domains to URIs. But in a web-centric world where you don't necessarily know how to resolve DNS, or only how to look up IPs to speak HTTP with? There are going to be some compromises.
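To make the lookup chain concrete, here is a minimal sketch (in Go, with example.com and the _jmaps._tcp name taken from the quoted spec text; error handling and redirect-following omitted) of going from a mail domain to the well-known authentication URL:

package main

import (
	"fmt"
	"net"
	"strings"
)

func main() {
	// SRV lookup per the quoted spec text: _jmaps._tcp.<domain>
	_, srvs, err := net.LookupSRV("jmaps", "tcp", "example.com")
	if err != nil || len(srvs) == 0 {
		fmt.Println("no SRV record; fall back to autoconfig/autodiscover")
		return
	}
	// Build the well-known URI on the advertised host and port.
	host := strings.TrimSuffix(srvs[0].Target, ".")
	fmt.Printf("https://%s:%d/.well-known/jmap\n", host, srvs[0].Port)
}

A TXT lookup would look much the same with net.LookupTXT; as noted above, the well-known path mostly matters for clients that can only speak HTTP.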

Related

Forwarding to different IP addresses depending on which application I use

Is there any way that I can go from a server to a website depending on whether I use PuTTY or a browser? I have one domain with GoDaddy and I want it to go to a server when using PuTTY but a website when using a browser.
This is what I have so far.
By the way, I am using GitHub Pages for the browser side of things.
HTTP (ports 80 and 443) is an end-user protocol; SSH is an admin protocol.
The semantics (the meaning rules) behind the Domain Name System are that a name represents an IP address, which in turn represents a physical server.
While it is technically possible to break the semantic rules of a protocol, it is not convenient to do so, as it makes maintenance harder by making the design more opaque. If these were both end-user protocols I could understand sacrificing ease of maintenance for the sake of user convenience, but you are trying to create an abstraction for admins, which will only make debugging harder.
The solution is to use a different name for each server. You can use a subdomain, for example ssh.domain.com, if the endpoint is a bastion host that tunnels connections to domain.com, or just use a different domain name altogether if it's a completely different service.
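For example, the DNS records might look something like this (the addresses are placeholders from the documentation ranges; with GitHub Pages the website entry is typically a CNAME to your username.github.io rather than an A record):
domain.com.      IN A 192.0.2.10    ; placeholder: the address serving the website
ssh.domain.com.  IN A 203.0.113.5   ; placeholder: the machine you SSH into with PuTTY
Then PuTTY connects to ssh.domain.com and the browser goes to domain.com; each name simply resolves to the machine that actually speaks that protocol.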

If many clients request the same resource, do reverse proxy servers make a new request each time?

The scenario: when many different clients are requesting the same resource from a reverse proxy server, does the reverse proxy forward a request to the server with the resource for each client request? If so, is that the default, or does it depend on the configuration (the request headers: ETags, If-None-Match, etc.) between the proxy and the server with the resource? Thanks
This question might be more appropriate over on Server Fault. That being said, it depends heavily on the reverse proxy setup: both on the reverse-proxy software being used (there are many, many types, from nginx to Apache to custom C++) and on the configuration set for each. If you're trying to write a reverse proxy, sharing one upstream response among several clients is a tempting but dangerous idea. It works very well for public, static pages, but you have to be careful not to open just one connection to, for example, facebook.com and send the result to two different people (showing one of them the other's personal data). For that reason, it should only be attempted when the resource marks, through appropriate headers, that it is safe to cache.
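As a rough sketch of that idea (not production code: it ignores expiry, Vary, validation with If-None-Match and so on, and origin.internal is a placeholder backend), here is a tiny Go reverse proxy that only re-uses a response across clients when the origin marks it Cache-Control: public, and otherwise makes a fresh upstream request for every client request:

package main

import (
	"bytes"
	"io"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
	"sync"
)

type cachedResponse struct {
	header http.Header
	body   []byte
}

var (
	mu    sync.Mutex
	cache = map[string]cachedResponse{}
)

func main() {
	origin, _ := url.Parse("http://origin.internal:8080") // placeholder backend
	proxy := httputil.NewSingleHostReverseProxy(origin)

	// Keep a copy only of responses the origin explicitly marks as shareable.
	proxy.ModifyResponse = func(resp *http.Response) error {
		if resp.Request.Method != http.MethodGet ||
			resp.StatusCode != http.StatusOK ||
			!strings.Contains(resp.Header.Get("Cache-Control"), "public") {
			return nil // not marked safe to hand to other clients
		}
		body, err := io.ReadAll(resp.Body)
		if err != nil {
			return err
		}
		resp.Body = io.NopCloser(bytes.NewReader(body)) // put the body back for this client
		mu.Lock()
		cache[resp.Request.URL.RequestURI()] = cachedResponse{resp.Header.Clone(), body}
		mu.Unlock()
		return nil
	}

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodGet {
			mu.Lock()
			c, ok := cache[r.URL.RequestURI()]
			mu.Unlock()
			if ok { // serve the shared copy: no new upstream request
				for k, v := range c.header {
					w.Header()[k] = v
				}
				w.Write(c.body)
				return
			}
		}
		proxy.ServeHTTP(w, r) // otherwise every client request becomes an upstream request
	})
	http.ListenAndServe(":8081", nil)
}

Real proxies (nginx, Varnish, Squid and friends) implement the same decision far more carefully, which is why the behaviour you see depends so much on the particular software and its configuration.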

How can I recognize different applications in NetFlow dumps?

I am trying to discover what kinds of applications are used on my network (e.g. Facebook, YouTube, Twitter, etc.). Unfortunately, I can't do Deep Packet Inspection; all I have are NetFlow traces. I was thinking about resolving IP addresses with a DNS server and checking the domain names of the flows, but what if an application uses a domain that doesn't contain the app name? Is there any way to find all the IP addresses that a specific app/website uses?
Outside deep packet inspection (in which I include tech like Cisco NBAR) your main tools are probably going to be whois and port/protocol pair. Some commercial NetFlow collectors will do some of the legwork for you, for example by doing autonomous system lookup on incoming IP addresses, or providing the IANA protocol list.
The term "application" is a bit overloaded in this domain, by the way: often it's used to mean HTTP, SSH, POP3 and similar protocols in the OSI Application Layer, which are generally guessed from the port/protocol pair. For Facebook, Hotmail, etc, the whois protocol is probably your best bet. It's a bit better than reverse DNS, but the return formats aren't standardized among the Regional Internet Registries, so your parser is going to need to have some smarts. Get the IP addresses for a few of the major sites and use the command line whois utility with them to get a feel for the output before scripting anything.
Fortunately, most of the big ones are handled by ARIN. Look for "NetName" and "OrgName" in the results (and watch for the RIR names (RIPE, APNIC, etc) to indicate where that IP address isn't handled by ARIN). For example, I see www.stackoverflow.com as 198.252.206.16. whois 198.252.206.16 returns (among other things,
NetName: SE-NET01
OrgName: Stack Exchange, Inc.
You didn't specify whether you were shell scripting or programming; if the latter, the WHOIS protocol is standard and has a number of implementations in most languages.
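If you do end up programming it, the WHOIS protocol itself (RFC 3912) is about as simple as they come: open TCP port 43 on the registry's server, send the query followed by CRLF, and read until the connection closes. A minimal sketch in Go, querying ARIN directly (a fuller tool would follow referrals to the other RIRs):

package main

import (
	"fmt"
	"io"
	"net"
	"os"
)

func main() {
	ip := "198.252.206.16" // the example IP from above
	conn, err := net.Dial("tcp", "whois.arin.net:43")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer conn.Close()
	fmt.Fprintf(conn, "%s\r\n", ip) // the query is just the text plus CRLF
	io.Copy(os.Stdout, conn)        // dump the record; look for NetName/OrgName
}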

C Programming - Sending HTTP Request

My recent assignment is to make a proxy in C using socket programming. The proxy only needs to be built using HTTP/1.0. After several hours of work, I have made a proxy that can be used with Chromium. Various websites can be loaded, such as Google and several .edu websites; however, many websites give me a 404 error for page not found (these links work fine when not going through my proxy). These 404 errors even occur on the root address "/" of a site... which doesn't make sense.
Could this be a problem with my HTTP request? The HTTP request sent from the browser is parsed for the HTTP request method, hostname, and port. For example, if a GET request is parsed from the browser, a TCP connection is established to the hostname and port provided, and the HTTP GET request is sent in the following format:
GET /path/name/item.html HTTP/1.0\r\n\r\n
This format works for a small amount of websites, but a 404 error message is created for the rest. Could this be the problem? If not, what else could possibly be giving me this problem?
Any help would be greatly appreciated.
One likely explanation is the fact that you've designed a HTTP/1.0 proxy, whereas any website on a shared hosting site will only work with HTTP/1.1 these days (well, not quite, but I'll get to that in a second).
This isn't the only possible problem by a long way, but you'll have to give an example of a website which is failing like this to get some more ideas.
You seem to understand the basics of HTTP, that the client makes a TCP connection to the server and sends a HTTP request over it, which consists of a request line (such as GET /path/name/item.html HTTP/1.0) and then a set of optional header lines, all separated by CRLF (i.e. \r\n). The whole lot is ended with two consecutive CRLF sequences, at which point the server at the other end matches up the request with a resource and sends back an appropriate response. Resources are all identified by a path (e.g. /path/name/item.html) which could be a real file, or it could be a dynamic page.
That much of HTTP has stayed pretty much unchanged since it was first invented. However, think about how the client finds the server to connect to. What you give it is a URL, like this:
http://www.example.com/path/name/item.html
From this it looks at the scheme which is http, so it knows it's making a HTTP connection. The next part is the hostname. Under original HTTP the assumption was that each hostname resolved to its own IP address, and then the client connects to that IP address and makes the request. Since every server only had one website in those days, this worked fine.
As the number of websites increased, however, it became difficult to give every website a different IP address, particularly as many websites were so simple that they could easily be shared on the same physical machine. It was easy to point multiple domains at the same IP address (the DNS system makes this really simple), but when the server received the TCP request it would just know it had a request to its IP address - it wouldn't know which website to send back. So, a new Host header was added so that the client could indicate in the request itself which hostname it was requesting. This meant that one server could host lots of websites, and the webserver could use the Host header to tell which one to serve in the response.
These days this is very common - if you don't use the Host header then a number of websites won't know which site you're asking for. What usually happens is they assume some default website from the list they've got, and the chances are this won't have the file you're asking for. Even if you're asking for /, if you don't provide the Host header then the webserver may give you a 404 anyway, if it's configured that way - this isn't unreasonable if there isn't a sensible default website to give you.
You can find the description of the Host header in the HTTP RFC if you want more technical details.
Also, it's possible that websites just plain refuse HTTP/1.0 - I would be slightly surprised if that happened on so many websites, but you never know. Still, try the Host header first.
Contrary to what some people believe there's nothing to stop you using the Host header with HTTP/1.0, although you might still find some servers which don't like that. It's a little easier than supporting full HTTP/1.1, which requires that you understand chunked encoding and other complexities, although for simple example code you could probably get away with just adding the Host header and calling it HTTP/1.1 (I wouldn't suggest this is adequate for production code, however).
Anyway, you can try adding the Host header to make your request like this:
GET /path/name/item.html HTTP/1.0\r\n
Host: www.example.com\r\n
\r\n
I've split it across lines just for easy reading - you can see there's still the blank line at the end.
Even if this isn't causing the problem you're seeing, the Host header is a really good idea these days as there are definitely sites that won't work without it. If you're still having problems then give me an example of a site which doesn't work for you and we can try and work out why.
If anything I've said is unclear or needs more detail, just ask.
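If you want to sanity-check the exact bytes before wiring them into your C proxy, here is a quick sketch (in Go rather than C, purely for illustration; www.example.com and the path are placeholders) that sends the same request and dumps the raw response, so you can compare it with what your proxy forwards:

package main

import (
	"fmt"
	"io"
	"net"
	"os"
)

func main() {
	host := "www.example.com" // placeholder site to test against
	conn, err := net.Dial("tcp", host+":80")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer conn.Close()
	// HTTP/1.0 request line plus a Host header, each line ended with CRLF,
	// and a blank line to finish the request.
	fmt.Fprintf(conn, "GET /path/name/item.html HTTP/1.0\r\nHost: %s\r\n\r\n", host)
	io.Copy(os.Stdout, conn) // print status line, headers and body as received
}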

RESTful PUT and DELETE and firewalls

In the classic "RESTful Web Services" book (O'Reilly, ISBN 978-0-596-52926-0) it says on page 251 "Some firewalls block HTTP PUT and DELETE but not POST."
Is this still true?
If it's true I have to allow overloaded POST to substitute for DELETE.
Firewalls blocking HTTP PUT/DELETE are typically blocking incoming connections (to servers behind the firewall). Assuming you have control over the firewall protecting your application, you shouldn't need to worry about it.
Also, firewalls can only block PUT/DELETE if they are performing deep inspection on the network traffic. Encryption will prevent firewalls from analyzing the URL, so if you're using HTTPS (you are protecting your data with SSL, right?) clients accessing your web service will be able to use any of the standard four HTTP verbs.
Some layer-7 firewalls could analyze traffic to this degree, but I'm not sure how many places would configure them that way. You might check on serverfault.com to see how popular such a config might be (you could also always check with your IT staff).
I would not worry about overloading a POST to support a DELETE request.
HTML 4.0 and XHTML 1.0 forms only support GET and POST requests, so it is commonplace to tunnel a PUT/DELETE via a hidden form field which is read by the server and dispatched appropriately. This technique preserves compatibility across browsers and allows you to ignore any firewall issues.
Ruby On Rails and .NET both handle RESTful requests in this fashion.
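As a rough sketch of that pattern (the _method field name is the common Rails-style convention; the route and responses below are just placeholders), the server side of an overloaded POST might look like this:

package main

import "net/http"

// methodOverride treats a POST carrying a hidden _method form field as the
// verb named in that field, so ordinary HTML forms can express PUT/DELETE.
func methodOverride(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodPost {
			if m := r.PostFormValue("_method"); m == "PUT" || m == "DELETE" {
				r.Method = m // dispatch as if the real verb had been used
			}
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/items/", func(w http.ResponseWriter, r *http.Request) {
		switch r.Method {
		case http.MethodPut:
			w.Write([]byte("updated\n"))
		case http.MethodDelete:
			w.Write([]byte("deleted\n"))
		default:
			w.Write([]byte("fetched\n"))
		}
	})
	http.ListenAndServe(":8080", methodOverride(mux))
}

On the client side this is just an ordinary POST form with a hidden input named _method set to "PUT" or "DELETE".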
As an aside, GET, POST, PUT & DELETE requests are fully supported through the XMLHttpRequest object at present. XHTML 2.0 forms officially support GET, POST, PUT & DELETE as well.
You can configure a firewall to do whatever you want (at least in theory), so don't be surprised if some sysadmins do block HTTP PUT/DELETE.
The danger of HTTP PUT/DELETE concerns misconfigured servers: PUT replaces documents (and DELETE deletes them ;-) on the target server. So some sysadmins decide outright to block PUT in case a crack is opened somewhere.
Of course we are talking about firewalls acting at "layer 7" and not just at the IP layer ;-)