How to know if code is coming from the local cache vs the server in Fiddler

In Fiddler, is there any way of knowing whether a piece of code (jscript, jQuery, CSS) is being loaded from the local cache or downloaded from the server? I think this may be represented by different colors in the Web Sessions list, but I wasn't able to find a legend for these colors.

If you see 304 Not Modified responses, those mean that the client made a conditional request, and server is signalling "no need to download, you have the newest version cached". That's one "class" of cached responses.
However, for some entities, not even a conditional request is sent (for example, when the Expires header is in the future; see RFC 2616). Those would not show up in Fiddler at all, because no request is made: the client may assume that the cached version is fresh.
What you can certainly see are the non-cached resources - anything coming back with a response code from the 2xx range should be non-cached (unless there's a seriously misconfigured caching proxy upstream, but those are rare nowadays).
You could clear your caches, and open the page. Save those results. Then open the page again - see what's missing when compared to the first load; those are cached.
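The comparison described above can be sketched in a few lines of Python. This is only an illustration: it assumes you have exported the two captures yourself into a list of URLs (first load, empty cache) and a mapping of URL to status code (second load). The function name and the labels are made up for the example and are not part of Fiddler.

```python
def classify_sessions(first_load_urls, second_load_statuses):
    """Compare a cold-cache capture with a repeat capture.

    first_load_urls: URLs requested when the cache was empty.
    second_load_statuses: dict of URL -> HTTP status seen on the repeat load.
    Returns a dict of URL -> human-readable label (labels are hypothetical).
    """
    result = {}
    for url in first_load_urls:
        status = second_load_statuses.get(url)
        if status is None:
            # No request at all on the second load: served from cache.
            result[url] = "cache (no request sent)"
        elif status == 304:
            # Conditional request answered with Not Modified.
            result[url] = "cache (revalidated, 304)"
        else:
            result[url] = "network"
    return result
```

Anything labelled "cache (no request sent)" is exactly the class of resources that never appears in the proxy on the second load.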

Fiddler is an HTTP proxy, so it does not show cached content at all.

Related

Why does my website keep requesting resources from the server even after it is fully loaded?

I am working on the web vitals for a website and was checking the Network tab in Chrome Developer Tools. The website loads fully, but the number of server requests in the Network tab keeps increasing, the resources requested go up to 7.8MB, and the images of the site's slider keep reappearing in the request list. How can I check why so many requests are made?
Here is the picture of the network tab of the website.
I see that the resource names are slide-X.jpg. Without seeing the website or its code, I can only guess that there's a carousel on the page that cycles through images. If the images aren't cacheable, they'd continue to be loaded over the network. Otherwise if they are cacheable, I'd expect to see no network requests at all or at worst a 304 HTTP "Not Modified" response code.
So I'd recommend confirming what kinds of widgets are on the page, such as a carousel with repetitive behavior, and checking the cache control headers of static content like images so that they don't have to be loaded each time. Personally, I think carousels are bad UX, so I'd even suggest you consider removing it altogether! Regardless, you should still cache your content more efficiently.
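As a rough way to check the cache control headers mentioned above, here is a minimal Python sketch. It assumes you have copied the response headers of an image into a dict keyed exactly as "Cache-Control"; the function names are hypothetical. It looks only at Cache-Control and deliberately ignores Expires and heuristic freshness, so treat a False result as "look closer", not as proof.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a dict of directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def reusable_without_request(headers):
    """True if the response may be reused without contacting the server.

    Simplification: only max-age, no-store and no-cache are considered.
    """
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc or "no-cache" in cc:
        return False
    max_age = cc.get("max-age")
    return max_age is not None and max_age.isdigit() and int(max_age) > 0
```

If this returns False for the slide-X.jpg responses, that would be consistent with the images being fetched again on every carousel cycle.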

How to manage HATEOAS links when the server is the client?

I'm learning about HATEOAS. The backend server I'm working on will use a third party REST API that uses HATEOAS. That API has an end point to return the url for each resource and also returns the related resource links with regular requests.
But I'm wondering what's a good way to manage these links on the server to avoid hardcoding them. For example if the third party changes the url of the resource, how will the server detect that change? Are there any standard practices for managing HATEOAS resource links?
Possible ways I can think of:
When the server starts, get all the resources urls and cache them. Whenever the third party API needs to be called, reuse these cached urls. Whenever there is a 404 or related error, update the resource url. Or update the url periodically in intervals.
Get the resource url each time before calling the end point. Simplest but essentially doubles the number of requests.
But neither sounds like a robust approach.
While discovery is generally a good thing and should allow a HATEOAS system to introduce changes in ways that 'hardcoded urls' don't, if urls start breaking arbitrarily I would still consider this a major issue.
You should be able to store urls / links on your side and have some expectation that those keep working.
There are some mechanisms that deal with changes though:
The server should return 301 / 308 redirects if a resource moved. When that happens, you should also update your stored references.
The server can emit Sunset or Deprecation headers. For Sunset, see: https://www.rfc-editor.org/rfc/rfc8594
Those are more general answers, but ultimately the existence of best practices does not mean that vendors will abide by them. With that in mind I think your best bet is to try and find out what the deprecation policy is of your vendor and see what they recommend.
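To illustrate the redirect point, here is a small Python sketch of "follow a permanent redirect and update the stored reference". Everything in it is an assumption for the example: the `links` dict, the `fetch` callable (which should return a status, a headers dict, and a body), and the hop limit. It is not taken from any particular HTTP library.

```python
from urllib.parse import urljoin

PERMANENT_REDIRECTS = {301, 308}


def resolve_link(links, rel, fetch, max_hops=5):
    """Fetch links[rel]; on 301/308, store the new URL and retry.

    links: dict of relation name -> URL (your stored references).
    fetch: hypothetical callable, fetch(url) -> (status, headers, body).
    """
    url = links[rel]
    for _ in range(max_hops):
        status, headers, body = fetch(url)
        if status in PERMANENT_REDIRECTS and "Location" in headers:
            # Location may be relative, so resolve it against the old URL.
            url = urljoin(url, headers["Location"])
            links[rel] = url  # remember the new location for next time
            continue
        return status, body
    raise RuntimeError("too many redirects")
```

Temporary redirects (302, 307) are intentionally not persisted here, since they do not say the resource has moved for good.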
Use a cached resource if it is valid, request a refresh when you don't have a local valid copy.
RFC 7234 (since obsoleted by RFC 9111) defines the caching semantics of HTTP.
Ideally, you don't implement the caching rules yourself, but instead you use a general purpose cache.
In its ideal form, your bespoke implementation is talking to a headless browser, and the headless browser worries about the caching rules for you.
In theory, you need the initial URL to start the process, and everything else comes from that.
Each resource you get from the server should include links to other edges on the graph of service for that resource.
So, once you get the initial resource, all of the rest come automatically.
That said, it's not untoward to have "well known" entry points that are, ideally, unchanging URLs. But in the end, those are just "bookmarks", and not necessarily guaranteed end points.
Consider a shopping site such as Amazon. Outside of amazon.com, you don't know any of their URLs. They're all provided on the various forms and pages, and the human simply navigates the site. Those URLs can be changing all the time, and no one would know. With HATEOAS, it's up to the machine to follow the links, rather than a human. But the process of navigation is the same.
As others have mentioned, the idea of caching a root resource has merit. Then you rely on the caching headers to tell you how often you have to refresh the links.
But that said, operationally, there's no difference between following a normal link, and following a cached link. Underneath, the cached resource loads faster, but you still need to "follow the link". Because that's where the caching behavior kicks in. This is different from assuming the link is good, assuming you know the result of a resource lookup. Your application follows the link. Always. The underlying infrastructure is responsible for making it efficient.
So, your code should not, say, load up a root resource, stuff a map with links, and then assume they're good. Rather, the code should request the root resource, perhaps as a Map of links (datatypes for the win), and let the next layer handle the details, because it all depends on the type of caching involved. Some responses have coded durations during which no follow-up is necessary. For others, you make the request anyway and the server tier responds "nothing changed", so you can use your local copy, but you're still required to ask in the first place.
Those are implementation details that the SERVER mandates (not the client). It's a server contract. If they want you pinging them each and every time, so be it. That's the contract they're presenting to you, and if you want to be a Good Citizen, then you should honor that contract.
Ideally, the server makes good decisions on these kinds of issues for the sake of efficiency, but in the end it's really up to them.
The client has to go along. The client in a HATEOAS system cedes a lot to the server. They're simply not decisions for the client to make.
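A minimal Python sketch of "always follow the link and let the layer underneath worry about caching" might look like this. It assumes a HAL-style body where links live under "_links" with an "href" each; that shape, the relation names, and the `fetch` callable are assumptions for illustration, and your third-party API may use a different format.

```python
def follow(fetch, entry_url, rels):
    """Navigate from the entry point by relation names, never by stored URLs.

    fetch: hypothetical callable, fetch(url) -> parsed resource (a dict).
           Put your HTTP cache behind it; this code does not care whether
           a response came from the cache or from the network.
    rels:  sequence of link relation names to follow in order.
    """
    resource = fetch(entry_url)
    for rel in rels:
        href = resource["_links"][rel]["href"]
        resource = fetch(href)
    return resource
```

The only URL the application knows is `entry_url`, the "bookmark". If the vendor moves the orders resource, the root resource changes and this code keeps working without any edit.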

HTTP/2 Server Push and Browser Cache

I read some documents about HTTP/2 Server Push.
A blog owner said that:
However, there is a very headache problem in server push. If the browser has already cached the resource files which are to be pushed, pushing is just a waste of bandwidth.
Another one said:
Since server push will send the assets to the client as distinct HTTP objects (each with its own Cache-Control headers), they can be cached by the browser just like anything else.
My question is: do HTTP/2 Push and the browser cache work well together? Or, if I activate the HTTP/2 Push feature for some assets, will the browser cache stop working for those assets?
If you push a resource and the page needs to use it, it will be saved to the browser cache for next time.
The problem arises if you change the resource and push it again while the old version is already in the browser cache and its cache control headers say it’s still valid. The browser will then use the old cached version even though you have pushed a newer one, so the push is wasted.
Good blog post on it here and here and also Chapter 5 of my book due out soon covers this too.

How can a HTTP client request the server for latest data/to refresh cache?

We're designing a REST service with server-side caching. We'd like to provide an option to the client to specifically ask for latest data even if the cached data has not expired. I'm looking into the HTTP 1.1 spec to see if there exists a standard way to do this, and the Cache Revalidation and Reload Controls appears to fit my need.
Questions:
Should we just use Cache Revalidation and Reload Controls?
If not, is it acceptable to include an If-Modified-Since header with epoch time, causing the server to always consider the resource as having changed? The spec doesn't preclude this, but I'm wondering if I'm abusing :) the intent of the header?
What'd be a good way to identify the resource to refresh? In our case, the URL path alone is not enough, and I'm not sure if query or matrix parameters are considered as part of a unique URL. What about using an ETag?
If your client wants a completely fresh representation of a resource, it may specify Cache-Control: max-age=0 in the request. That expresses exactly the intent to receive a response no older than 0 seconds.
All the other mechanisms you mentioned (If-Modified-Since, ETag, If-Match, etc.) work with caches to make sure the resource is in some state. They only work if you definitely know you have a valid state of the resource. You can think of it as optimistic locking. You can make conditional requests for when the resource did, or did not, change. However, you have to know whether you are expecting a change or not.
You could potentially misuse the If-Modified-Since as you say, but max-age communicates your intent better.
Also note, by design there may be multiple caches along the way, not just your server side cache. Most often the client caches also, and there may be other transparent caches on the way.
According to section 5.2.1.4 of RFC 7234, it appears that the no-cache request directive best fits my need.
The "no-cache" request directive indicates that a cache MUST NOT use a
stored response to satisfy the request without successful validation
on the origin server.
Nothing is said about subsequent requests, which is exactly what I want. There is also a no-cache response directive in section-5.2.2.2, but that also applies to subsequent requests.
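For reference, sending either request directive from a client is just a matter of setting the Cache-Control request header. A minimal Python sketch using the standard library's urllib.request follows; the function name and the example URL are made up.

```python
import urllib.request

ALLOWED_DIRECTIVES = ("no-cache", "max-age=0")


def build_refresh_request(url, directive="no-cache"):
    """Build a GET request asking caches not to answer from a stored copy
    without revalidating it first."""
    if directive not in ALLOWED_DIRECTIVES:
        raise ValueError("unsupported directive: %r" % (directive,))
    return urllib.request.Request(url, headers={"Cache-Control": directive})
```

Passing the result to urllib.request.urlopen would send it. Note that the server-side cache in front of the REST service still has to honor the directive; the header only states the client's wish.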

Apache2 mod_perl Last-Modified header ignored

I have a perl generated page. The contents of this page change every 30 minutes, so I'm setting $r->set_last_modified() to the time the contents last changed.
That all works well and I can see the correct header arriving at my browser.
When I refresh the page, I see my browser uses the correct "If-Modified-Since" header in the request to the server, but Apache2 ignores this and re-sends the entire page.
How can I get Apache2 to behave correctly and respond with a "HTTP/1.x 304 Not Modified" ?
(The "last-modified" / "if-modified-since" headers are handled correctly when requesting static content from the same Apache2 process.)
Thanks for any help.
EDIT: Are my expectations wrong? Do I have to explicitly handle inbound If-Modified-Since headers in my perl script?
Sadly, yes, your expectations are wrong.
At the point where you basically say to Apache "OK, I'm dealing with this request...", Apache is going to hand over responsibility for everything to you. If you want the request to honour If-Modified-Since, it's down to your code.
Face it, this is the right behaviour, since there's no way Apache can know what you /really/ mean by 'modified' in a Perl handler: it might be that the best check is to go query your backend DB for a timestamp on a record, for example....
Apache won't store your last-modified value when it processes a request. So in order to decide whether something was modified it will have to run your application.
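The check your handler has to make itself is small. Here is a language-neutral sketch of the logic, written in Python rather than Perl, so it is not mod_perl code. It assumes you already know when your content last changed; the function name is hypothetical. If the function returns True, the handler would answer 304 with no body instead of regenerating the page.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def not_modified(if_modified_since, last_modified):
    """Decide whether a 304 Not Modified response is appropriate.

    if_modified_since: raw header value from the request, or None.
    last_modified: timezone-aware datetime of the last content change.
    """
    if not if_modified_since:
        return False
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # Unparseable date: ignore the header and send the full page.
        return False
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    return last_modified.replace(microsecond=0) <= since
```

With content that changes every 30 minutes, `last_modified` would be the start of the current 30-minute window, which is presumably the same value you pass to set_last_modified.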