IHttpClientFactory, SetBearerToken(), AddHttpClient<>, and "You're using HttpClient wrong"

I'm sure we've all read this article that made waves back in 2016: https://aspnetmonsters.com/2016/08/2016-08-27-httpclientwrong/
For those who haven't, in summary (emphasis mine):
Instead of creating a new instance of HttpClient for each execution you should share a single instance of HttpClient for the entire lifetime of the application.
Now, consider an ASP.NET Core application that consumes other web services using HttpClient. Its usage of HttpClient falls into two general situations:
An outgoing request that is unauthenticated or uses the website's own credentials - in which case a single HttpClient really can be shared by all parts of the program.
An outgoing request made on behalf of one of the website's current visitors - or a request that's otherwise using request-specific credentials.
While a single HttpClient instance can be used, you must be careful not to mutate its state, for example, by setting DefaultRequestHeaders (e.g. by using the SetBearerToken extension method).
Practically all of the guidance for using HttpClient in ASP.NET Core says to:
Use Typed Clients (POCOs that accept an HttpClient instance as a constructor parameter, along with any other DI services).
Register these Typed Clients using services.AddHttpClient<TClient>().
This will register TClient as a transient instance.
This will register ITypedHttpClientFactory<TClient> as a transient instance.
This will register IHttpClientFactory as a singleton.
This will register IHttpMessageHandlerFactory as a singleton.
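For concreteness, here is a minimal sketch of that pattern; FooApiClient, GetWidgetsAsync, and the base address are hypothetical names, not anything from the guidance itself:

```csharp
// A minimal sketch of the Typed Client pattern described above.
// FooApiClient, GetWidgetsAsync, and the base address are hypothetical.
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;

public sealed class FooApiClient
{
    private readonly HttpClient _http;

    // The factory supplies a fresh HttpClient for each FooApiClient instance.
    public FooApiClient(HttpClient http) => _http = http;

    public Task<string> GetWidgetsAsync() => _http.GetStringAsync("/widgets");
}

public static class TypedClientRegistration
{
    public static void Register(IServiceCollection services)
    {
        // Registers FooApiClient (transient) plus the factory singletons listed above.
        services.AddHttpClient<FooApiClient>(client =>
            client.BaseAddress = new Uri("https://foo.example.com"));
    }
}
```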
With that established, let me draw your attention to a contradiction:
The famous blog article - and now Microsoft's own documentation - says that HttpClient instances must be long-lived.
ASP.NET Core's HttpClient DI system goes out of its way to ensure that HttpClient instances are short-lived.
However, might this be fine if the blog article should really have been talking about the underlying HttpMessageHandler (the real implementation, which the thin shell class HttpClient merely wraps) instead of the outer HttpClient class?
Confounding things further, people report problems with long-lived HttpClient instances too, and they introduce workarounds which all involve the static ServicePointManager class, which makes me uncomfortable:
https://itnext.io/reusing-httpclient-didnt-solve-all-my-problems-142a32a5b4d8
The author suggests tweaking ServicePointManager during application startup.
...including disabling Nagle's algorithm - which I think is probably a very bad idea.
http://byterot.blogspot.com/2016/07/singleton-httpclient-dns.html
The author suggests using ConnectionLeaseTimeout
...but you need to call ServicePointManager.FindServicePoint() for every URI you call! That doesn't seem right to me.
My questions:
Are HttpClient instances really meant to be long-lived or short-lived?
Or is it just the HttpMessageHandler instances that need to be long-lived?
If HttpClient or HttpMessageHandler instances are meant to be long-lived:
How do I correctly address the ServicePointManager issues brought up in both linked blog posts?
Do I absolutely have to avoid using HttpClient.DefaultRequestHeaders and instead explicitly set the Authorization (Cookies or Credentials) headers on each HttpRequestMessage? Is there another way with less hassle?
Could I add those headers using my own DelegatingHandler? What should the DI registration of this proposed handler be, then?
If HttpClient instances (but not HttpMessageHandler instances) are meant to be short-lived:
What is the lifetime policy of HttpMessageHandler instances, then?
Is it then okay to use HttpClient.DefaultRequestHeaders (and SetBearerToken())?
If both HttpClient and HttpMessageHandler instances are meant to be short-lived:
What about the issues mentioned in the original 2016 blog posting?
Does IHttpClientFactory do anything to mitigate those problems?

The underlying problem is with socket connections. Creating a new HttpClient() for every request opens a new socket (on a new ephemeral port) for each one and, after disposal, leaves those sockets stuck in the TIME_WAIT state:
That isn't a state they should sit in for long; each one ties up a socket without it ever being reused. When the machine runs out of ephemeral ports, .NET throws a SocketException because it can't make a connection without an available socket. This is also called port (or socket) exhaustion.
IHttpClientFactory keeps connections in the ESTABLISHED state, reuses them if another request comes in, and closes them if no request comes in for 4 minutes.
Your questions are all ultimately about HTTP and socket connections, but you're using .NET classes to describe them. IHttpClientFactory manages the TCP socket connections with a pool of HttpClientHandlers so you don't have to. The answers to the rest of your questions are here: https://learn.microsoft.com/en-us/dotnet/architecture/microservices/implement-resilient-applications/use-httpclientfactory-to-implement-resilient-http-requests
Handlers will be reused by the factory when it needs to create new HttpClients. You can still set dynamic values like the bearer token through your own implementation, e.g. write your own Get<T> method that takes in a string bearerToken and then sets the Authorization header so you can keep that code in one place.
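Alternatively, the DelegatingHandler idea raised in the question works for this too. A minimal sketch, assuming hypothetical names (BearerTokenHandler, ITokenAccessor, and the FooApiClient from the earlier sketch):

```csharp
// Hedged sketch: stamp credentials on each HttpRequestMessage instead of
// mutating the shared client's DefaultRequestHeaders.
// BearerTokenHandler and ITokenAccessor are hypothetical names.
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;

public interface ITokenAccessor
{
    string CurrentToken { get; } // e.g. backed by the current visitor's session
}

public sealed class BearerTokenHandler : DelegatingHandler
{
    private readonly ITokenAccessor _tokens;

    public BearerTokenHandler(ITokenAccessor tokens) => _tokens = tokens;

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Per-request header: no shared HttpClient state is touched.
        request.Headers.Authorization =
            new AuthenticationHeaderValue("Bearer", _tokens.CurrentToken);
        return base.SendAsync(request, cancellationToken);
    }
}

public static class HandlerRegistration
{
    public static void Register(IServiceCollection services)
    {
        // Plus a real ITokenAccessor implementation registered elsewhere.
        services.AddTransient<BearerTokenHandler>();
        services.AddHttpClient<FooApiClient>()
                .AddHttpMessageHandler<BearerTokenHandler>();
    }
}
```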
... let me draw your attention to a contradiction:
The famous blog article - and now Microsoft's own documentation - says that HttpClient instances must be long-lived.
ASP.NET Core's HttpClient DI system goes out of its way to ensure that HttpClient instances are short-lived.
Where do you see Microsoft's documentation about IHttpClientFactory saying that HttpClients need to live long? Through the factory, they're registered with a transient lifetime in DI, which means they're disposed at the end of the request. That's why it's safe to reuse the handler but modify the client at runtime.
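Consuming code then just takes the typed client as a constructor dependency; each resolution gets a fresh, short-lived HttpClient riding on a pooled handler. A sketch, reusing the hypothetical FooApiClient from above:

```csharp
// Hedged sketch: each controller instance receives a fresh FooApiClient
// (hypothetical, from the earlier sketch); the factory reuses the pooled
// HttpMessageHandler underneath it.
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("widgets")]
public class WidgetsController : ControllerBase
{
    private readonly FooApiClient _api;

    public WidgetsController(FooApiClient api) => _api = api;

    [HttpGet]
    public Task<string> Get() => _api.GetWidgetsAsync();
}
```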

Related

Can sockets replace HTTP requests? (sockets vs HTTP)

Creating a user, adding a record to a collection in the DB, updating some data, etc.
All of these we regularly do with HTTP requests against a REST API.
Now think about running an event bus as the server instead of a REST API.
In that approach, creating a user would be an event named "CreateUser" instead of the REST API endpoint POST /users.
In response to any action performed on the event bus, it would re-emit a follow-up event, telling anybody who needs to know that the action was done.
If, for example, someone is viewing the vehicles collection and another user edits one of the columns or adds a new vehicle instance, the change is reflected immediately to whoever is viewing it online.
My question is whether there are approaches like the one I described above, whether there are formal names for them, whether this is good practice, whether you know of anyone who regularly uses it, a framework for it, etc. Can a socket.io server handle high workloads and behave like an HTTP server?
You can use websockets for this; they provide a bidirectional channel between client and server to send messages across. You will have to catch and parse the messages on each end yourself, as there is no additional protocol on top of them.
They don't hold state, so there is no knowledge of who is looking at what, or who got what. You could send the same update message to all connected clients and leave it to the client to use it or not.
You would have to reprogram your client code and the API endpoints, because it's a different way of doing things, and it can also do server push.
I have no idea about frameworks though, as I always use websockets without one. Websockets are fast, but server behaviour at high workloads depends on the implementation, and I only have experience with the websocket server I wrote myself. I suppose the performance of socket.io can easily be googled.
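For what it's worth, here is a minimal sketch of that broadcast pattern without a framework, assuming an ASP.NET Core minimal-API project; the /events path and all names are hypothetical:

```csharp
// Hedged sketch: a bare WebSocket endpoint that re-emits every received
// event to all connected clients. Assumes an ASP.NET Core minimal-API
// project; the /events path is a hypothetical choice.
using System.Collections.Concurrent;
using System.Net.WebSockets;

var app = WebApplication.CreateBuilder(args).Build();
app.UseWebSockets();

var clients = new ConcurrentDictionary<Guid, WebSocket>();

app.Map("/events", async (HttpContext context) =>
{
    if (!context.WebSockets.IsWebSocketRequest)
    {
        context.Response.StatusCode = 400;
        return;
    }

    var socket = await context.WebSockets.AcceptWebSocketAsync();
    var id = Guid.NewGuid();
    clients[id] = socket;

    var buffer = new byte[4096];
    while (socket.State == WebSocketState.Open)
    {
        var result = await socket.ReceiveAsync(
            new ArraySegment<byte>(buffer), CancellationToken.None);
        if (result.MessageType == WebSocketMessageType.Close)
            break;

        // Re-emit the event (e.g. "CreateUser") to every connected client;
        // each client decides whether the update is relevant to it.
        var message = new ArraySegment<byte>(buffer, 0, result.Count);
        foreach (var client in clients.Values)
            if (client.State == WebSocketState.Open)
                await client.SendAsync(message, WebSocketMessageType.Text,
                                       endOfMessage: true, CancellationToken.None);
    }

    clients.TryRemove(id, out _);
});

app.Run();
```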

Vertx WebClient shared vs single across multiple verticles?

I am using Vert.x as an API gateway to route calls to downstream services.
As of now, I am using a single WebClient instance which is shared across multiple verticles (injected through Guice).
Does it make sense for each verticle to have its own WebClient? Will it help boost performance? (Each of my gateway instances runs 64 verticles and handles approximately 1,000 requests per second.)
What are the pros and cons of each approach?
Can someone help me figure out the ideal strategy here?
Thanks
Vert.x is optimized for using a single WebClient per Verticle. Sharing a single WebClient instance between threads might work, but it can negatively affect performance, and could lead to some code running on the "wrong" event-loop thread, as described by Julien Viet, the lead developer of Vert.x:
So if you share a web client between verticles, then your verticle might reuse a connection previously open (because of pooling) and you will get callbacks on the event loop you won't expect. In addition there is synchronization in the web client that might become contended when used intensively from different threads.
Additionally, the Vert.x documentation for HttpClient, which is the underlying object used by WebClient, explicitly states not to share it between Vert.x Contexts (each Verticle gets its own Context):
The HttpClient can be used in a Verticle or embedded. When used in a Verticle, the Verticle should use its own client instance.
More generally a client should not be shared between different Vert.x contexts as it can lead to unexpected behavior. For example a keep-alive connection will call the client handlers on the context of the request that opened the connection; subsequent requests will use the same context.

Use of HTTP RESTful methods GET/POST/etc. Are they superfluous?

What is the value of RESTful “methods” (i.e. GET, POST, PUT, DELETE, PATCH, etc.)?
Why not just make every client use the GET method with any/all relevant params, headers, request bodies, JSON, etc.?
On the server side, the response to each method is custom & independently coded!
For example, what difference does it make to issue a database query via GET instead of POST?
I understand that GET is for queries that don’t change the DB (or anything else?).
And POST is for calls that do make changes.
But, near as I can tell, the RESTful standard doesn't prevent one from coding up a server response to GET that issues a stored procedure call which indeed DOES change the DB.
Vice versa: the RESTful standard doesn't prevent one from coding up a server response to POST that issues a stored procedure call which indeed does NOT change ANYTHING!
I'm not arguing that a mid-tier (HTTP) "REST-like" layer is unnecessary. It clearly is necessary.
Let's say I'm wrong (and I may be). Isn't it still likely that there are numerous REST servers violating the proper use of these protocols while suffering ZERO repercussions?
The following do not directly address my questions but merely dance uncomfortably around them like an acidhead stoner at a Dead concert:
Different Models for RESTful GET and POST
RESTful - GET or POST - what to do?
GET vs POST in REST Web Service
PUT vs POST in REST
I just spent ~80 hours trying to communicate a PATCH to my REST server (older Android Java doesn't recognize the newer PATCH so I had to issue a stupid kluge HTTP-OVERIDE-METHOD in the header). A POST would have worked fine but the sysop wouldn't budge because he respects REST.
I just don't understand why we bother with each individual method. They don't seem to have much impact on idempotence. They seem to be mere guidelines. And if you "violate" these "guidelines" they give someone else a chance to point a feckless finger at you. But so what?
Aren't these guidelines more trouble than they're worth?
I'm just confused. Please excuse the stridency of my post.
Aren’t REST GET/POST/etc. methods superfluous?
What is the value of RESTful “methods” (i.e. GET, POST, PUT, DELETE, PATCH, etc.)?
First, a clarification. Those aren't RESTful methods; those are HTTP methods. The web is a reference implementation (for the most part) of the REST architectural style.
Which means that the authoritative answers to your questions are documented in the HTTP specification.
But, near as I can tell, the RESTful standard doesn't prevent one from coding up a server response to GET that issues a stored procedure call which indeed DOES change the DB.
The HTTP specification designates certain methods as being safe. Casually, this designates that a method is read only; the client is not responsible for any side effects that may occur on the server.
The purpose of distinguishing between safe and unsafe methods is to allow automated retrieval processes (spiders) and cache performance optimization (pre-fetching) to work without fear of causing harm.
But you are right, the HTTP standard doesn't prevent you from changing your database in response to a GET request. In fact, it even specifically calls out a case where you may choose to do that:
a safe request initiated by selecting an advertisement on the Web will often have the side effect of charging an advertising account.
The HTTP specification also designates certain methods as being idempotent:
Of the request methods defined by this specification, PUT, DELETE, and safe request methods are idempotent.
The motivation for having idempotent methods? Unreliable networks:
Idempotent methods are distinguished because the request can be repeated automatically if a communication failure occurs before the client is able to read the server's response.
Note that the client here might not be the user agent, but an intermediary component (like a reverse proxy) participating in the conversation.
Thus, if I'm writing a user agent, or a component, that needs to talk to your server, and your server conforms to the definition of methods in the HTTP specification, then I don't need to know anything about your application protocol to know how to correctly handle lost messages when the method is GET, PUT, or DELETE.
On the other hand, POST doesn't tell me anything, and since the unacknowledged message may still be on its way to you, it is dangerous to send a duplicate copy of the message.
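To make that concrete, here is a hedged sketch of the kind of generic retry component those guarantees enable; IdempotentRetryHandler is a hypothetical name, and a real resilience library (e.g. Polly) would also buffer request bodies and back off between attempts:

```csharp
// Hedged sketch: retry only methods the HTTP spec defines as idempotent.
// IdempotentRetryHandler is a hypothetical name; a production component
// would also buffer request bodies and cap/back off its retries.
using System.Collections.Generic;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public sealed class IdempotentRetryHandler : DelegatingHandler
{
    private static readonly HashSet<HttpMethod> Idempotent = new()
    {
        HttpMethod.Get, HttpMethod.Head, HttpMethod.Options,
        HttpMethod.Put, HttpMethod.Delete,
    };

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        try
        {
            return await base.SendAsync(request, cancellationToken);
        }
        catch (HttpRequestException) when (Idempotent.Contains(request.Method))
        {
            // The spec guarantees a repeat has the same effect, so a generic
            // component may resend without knowing anything about the API.
            return await base.SendAsync(request, cancellationToken);
        }
        // A failed POST is NOT retried: the first attempt may still land,
        // and duplicating it is unsafe without application knowledge.
    }
}
```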
Isn't it still likely that there are numerous REST servers violating the proper use of these protocols suffering ZERO repercussions?
Absolutely -- remember, the reference implementation of hypermedia is HTML, and HTML doesn't include support for PUT or DELETE. If you want to afford a hypermedia control that invokes an unsafe operation, while still conforming to the HTTP and HTML standards, then POST is your only option.
Aren't these guidelines more trouble than they're worth?
Not really? They offer real value in reliability, and the extra complexity they add to the mix is pretty minimal.
I just don't understand why we bother with each individual method. They don't seem to have much impact on idempotence.
They don't impact it, they communicate it.
The server already knows which of its resources are idempotent receivers. It's the client and the intermediary components that need that information. The HTTP specification gives you the ability to communicate that information for free to any other compliant component.
Using the maximally appropriate method for each request means that you can deploy your solution into a topology of commodity components, and it just works.
Alternatively, you can give up reliable messaging. Or you can write a bunch of custom code in your components to tell them explicitly which of your endpoints are idempotent receivers.
POST vs PATCH
Same song, different verse. If a resource supports OPTIONS, GET, and PATCH, then I can discover everything I need to know to execute a partial update, and I can do so using the same commodity implementation I use everywhere else.
Achieving the same result with POST is a whole lot more work. For instance, you need some mechanism for communicating to the client that POST has partial update semantics, and what media-types are accepted when patching a specific resource.
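As a sketch of both sides of that trade-off (the URL and body are hypothetical, and X-HTTP-Method-Override is only the common name for the header kluge the question mentions; the server must opt in to honoring it):

```csharp
// Hedged sketch: a direct PATCH, plus the tunnel-through-POST fallback for
// HTTP stacks that don't know PATCH. URL and body are hypothetical, and the
// server must explicitly support the X-HTTP-Method-Override convention.
using System;
using System.Net.Http;
using System.Text;

using var client = new HttpClient();

// Direct PATCH (merge-patch style body).
var patch = new HttpRequestMessage(new HttpMethod("PATCH"), "https://example.com/users/42")
{
    Content = new StringContent("{\"name\":\"new name\"}", Encoding.UTF8,
                                "application/merge-patch+json"),
};
var response = await client.SendAsync(patch);

// Fallback: tunnel the PATCH through POST with a method-override header.
var tunneled = new HttpRequestMessage(HttpMethod.Post, "https://example.com/users/42")
{
    Content = new StringContent("{\"name\":\"new name\"}", Encoding.UTF8,
                                "application/merge-patch+json"),
};
tunneled.Headers.Add("X-HTTP-Method-Override", "PATCH");
response = await client.SendAsync(tunneled);
```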
What do I lose by making every call from the client a GET, with the server honoring it just by paying attention to the request and not the method?
Conforming user-agents are allowed to assume that GET is safe. If you have side effects (writes) on endpoints accessible via GET, then the agent is allowed to pre-fetch the endpoint as an optimization -- the side effects start firing even though nobody expects it.
If the endpoint isn't an idempotent receiver, then you have to consider that the GET calls can happen more than once.
Furthermore, the user agent and intermediary components are allowed to make assumptions about caching -- requests that you expect to get all the way through to the server don't, because conforming components along the way are permitted to serve replies out of their own cache.
To ice the cake, you are introducing another risk: undefined behavior.
A payload within a GET request message has no defined semantics; sending a payload body on a GET request might cause some existing implementations to reject the request.
Where I believe you are coming from, though I'm not certain, is more of an RPC point of view. Client sends a message, server responds; so long as both participants in the conversation have a common understanding of the semantics of the message, does it matter if the text in the message says "GET" or "POST" or "PATCH"? Of course not.
RPC is a fantastic choice when it fits the problem you are trying to solve.
But...
RPC at web scale is hard. Can your team deliver that? Can your team deliver it cost-effectively?
On the other hand, HTTP at scale is comparatively simple; there's an enormous ecosystem of goodies, using scalable architectures, that are stable, tested, well understood, and inexpensive. The tires are well and truly kicked.
You and your team hardly have to do anything; a bit of block and tackle to comply with the HTTP standards, and from that point on you can concentrate on delivering business value while you fall into the pit of success.

HATEOAS and server state

Trying to understand REST HATEOAS:
Suppose I have a service that has states; they are: initial, ready, running. I have a client that connects to the service and obtains a page with links that allow it to mutate the service's state.
It uses one of the links to change the service's state and obtains another page with new links.
As long as there is one client, the state the client holds is identical to the service's. But if there is a second client and it changes the service's state, the first client's representation is stale.
How is this resolved in HATEOAS? From what I've read it seems that REST is not applicable and I should maybe look at something else. If so, what?
Thanks!
This is not resolved by HATEOAS (entirely). As REST is stateless, keeping state in the client and server aligned is something of a paradoxical use case.
Assuming I understand your requirements: yes, your state in client 1 is stale and not the same as the one on the server. But what if the client made a periodic call to the server to see whether some other client changed it? If so, with HATEOAS you could provide a link to serve the current state and omit the link to change the state.
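A hedged sketch of that idea, with all names hypothetical: the representation offers only the links that are valid in the server's current state, so a stale client just re-fetches and receives the fresh set.

```csharp
// Hedged sketch: the representation's links depend on the server's current
// state, so a client holding stale links just re-fetches and gets the valid
// set. All names here (Link, Represent, the paths) are hypothetical.
using System.Collections.Generic;

public record Link(string Href);

public static class ServiceRepresentation
{
    // States from the question: initial, ready, running.
    public static Dictionary<string, Link> Represent(string state)
    {
        var links = new Dictionary<string, Link> { ["self"] = new("/service") };

        if (state == "ready")
            links["start"] = new("/service/start"); // offered only when valid
        if (state == "running")
            links["stop"] = new("/service/stop");
        // A client whose cached page still shows "start" after another client
        // started the service simply follows "self" and gets the fresh links.

        return links;
    }
}
```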
@Kay - Thanks for answering.
I'm going to try to answer my own question. I realized after reading your answer that the "application" in HATEOAS is really the virtual application the client experiences when it retrieves and processes the resources it gets from the server. Its states are the pages (resources) it transitions between. The server (service?) may have its own state but it's not the same as the client's.
As long as this distinction is kept in mind, it is not unRESTful to have stale links in client 1. The server simply responds with new links reflecting its own state. And the client makes new transitions based on the updated links.
Still trying to understand. If I have it wrong, I'd appreciate some help.
Thanks!
The stateless requirement of REST refers to the ability of the server to understand and process the client request independent from any previous interactions it has made with said client. In other words, the client should be able to send a request "out of the blue" to the server (I.e., without a session saved on said server) and have the request processed. Hence there isn't a concept of login and logout in a purely RESTful architecture.
That's a different constraint than HATEOAS. Basically, "hypermedia as the engine of application state" means that all state is conveyed through the media type being used and not the connection itself. The client can (and often does) keep its own state, and can request snapshots of the state of resources from the server through resource representations (a.k.a media types).
If you want to be notified when a resource changes state, REST is (probably) the wrong choice. You'd likely want to use a different application protocol than, say, HTTP.
As Fielding says: it's not REST without HATEOAS. Don't call your service REST if it ignores HATEOAS, and a stateful service cannot be REST. You understood HATEOAS: the server provides hyperlinks for the client, which should be used to change the state located on the CLIENT SIDE.
To solve your problem: keep the server free of any state information. It will make your life easier. Then implement REST using the Richardson Maturity Model while considering the information from here.

Jain SIP in multi-thread environment

It's not clear how to use the JAIN SIP stack in a multi-threaded environment. I need to create multiple SIP sessions from different threads, e.g. each client should be processed in its own transaction. Below are a few options:
Use a single SipProvider for receiving and sending SIP requests and do the multiplexing on the application side. SipProvider is not thread-safe, hence sending requests requires proper locking.
Create a new SipProvider and a new ListeningPoint for each client. This is how it works for me now. However, I don't really like it. And it's not clear whether SipStack is thread-safe or not.
Create a new instance of SipStack for every client.
It's been a long time since I thought about JAIN-SIP (or even SIP for that matter, or even Java) but here goes:
Set the re-entrant listener flag when you create the stack (look it up in the javadocs). Specify a thread pool size. When a SIP request or response comes along, the stack may potentially create a new thread for you and invoke your listener.
Your critical section is the SipListener implementation. You should not block forever in it - otherwise new inbound requests and responses will not be routed to the SipListener for the transaction that is being processed at the time you blocked.
Hope that answers your question. Happy hacking.
That's it.
Why don't you use SIP Servlets? It lets you focus on your application logic and handles those details for you.
See http://code.google.com/p/sipservlets/