Service worker JavaScript update frequency (every 24 hours?) - progressive-web-apps

As per this doc on MDN:
After that it is downloaded every 24 hours or so. It may be
downloaded more frequently, but it must be downloaded every 24h to
prevent bad scripts from being annoying for too long.
Is the same true for Firefox and Chrome? Or does the update to the service worker JavaScript only happen when the user navigates to the site?

Note: As of Firefox 57, and Chrome 68, as well as the versions of Safari and Edge that support service workers, the default behavior has changed to account for the updated service worker specification. In those browsers, HTTP cache directives will, by default, be ignored when checking the service worker script for updates. The description below still applies to earlier versions of Chrome and Firefox.
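For completeness, here is a minimal registration sketch (the /service-worker.js path is an assumption) showing the updateViaCache option, which is how a page can control whether the HTTP cache is consulted during these update checks:

// Registration sketch; '/service-worker.js' is an assumed script URL.
if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/service-worker.js', {
    // 'imports' is the default: the main script bypasses the HTTP cache,
    // while imported scripts may still use it. 'none' bypasses the cache
    // entirely; 'all' restores the older HTTP-cache-respecting behaviour.
    updateViaCache: 'none',
  }).then((registration) => {
    // An update check can also be triggered manually at any time:
    // registration.update();
  });
}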
Every time you navigate to a new page that's under a service worker's scope, Chrome will make a standard HTTP request for the JavaScript resource that was passed in to the navigator.serviceWorker.register() call. Let's assume it's named service-worker.js. This request is only made in conjunction with a navigation or when a service worker is woken up via, e.g., a push event. There is not a background process that refetches each service worker script every 24 hours, or anything automated like that.
This HTTP request will obey standard HTTP cache directives, with one exception (which is covered in the next paragraph). For instance, if your server set appropriate HTTP response headers that indicated the cached response should be used for 1 hour, then within the next hour, the browser's request for service-worker.js will be fulfilled by the browser's cache. Note that we're not talking about the Cache Storage API, which isn't relevant in this situation, but rather standard browser HTTP caching.
The one exception to standard HTTP caching rules, and this is where the 24 hours thing comes in, is that browsers will always go to the network if the age of the service-worker.js entry in the HTTP cache is greater than 24 hours. So, functionally, there's no difference in using a max-age of 1 day or 1 week or 1 year—they'll all be treated as if the max-age was 1 day.
Browser vendors want to ensure that developers don't accidentally roll out a "broken" or buggy service-worker.js that gets served with a max-age of 1 year, leaving users with what might be a persistent, broken web experience for a long period of time. (You can't rely on your users knowing to clear out their site data or to shift-reload the site.)
Some developers prefer to explicitly serve their service-worker.js with response headers causing all HTTP caching to be disabled, meaning that a network request for service-worker.js is made for each and every navigation. Another approach might be to use a very, very short max-age—say a minute—to provide some degree of throttling in case there is a very large number of rapid navigations from a single user. If you really want to minimize requests and are confident you won't be updating your service-worker.js anytime soon, you're free to set a max-age of 24 hours, but I'd recommend going with something shorter on the off chance you unexpectedly need to redeploy.
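As a rough sketch of those serving options (the Node/Express server and the public/ path are assumptions, not part of the original answer):

import express from 'express';
import path from 'path';

const app = express();

// Serve service-worker.js so the browser revalidates it on every navigation ...
app.get('/service-worker.js', (_req, res) => {
  res.set('Cache-Control', 'no-cache');
  // ... or allow a very short-lived cache instead, e.g. one minute of throttling:
  // res.set('Cache-Control', 'max-age=60');
  res.sendFile(path.resolve('public/service-worker.js'));
});

app.listen(8080);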

Some developers prefer to explicitly serve their service-worker.js with response headers causing all HTTP caching to be disabled, meaning that a network request for service-worker.js is made for each and every navigation.
This no-cache strategy may prove useful in a fast-paced «agile» environment.
Here is how:
Simply place the following hidden .htaccess file in the server directory containing the service-worker.js:
# DISABLE CACHING
# Send no-cache headers for everything served from this directory and below.
<IfModule mod_headers.c>
Header set Cache-Control "no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
Header set Expires 0
</IfModule>
# For .html and .js files (service-worker.js included), also drop the validators
# (ETag, Last-Modified) so no conditional caching takes place.
<FilesMatch "\.(html|js)$">
<IfModule mod_expires.c>
ExpiresActive Off
</IfModule>
<IfModule mod_headers.c>
FileETag None
Header unset ETag
Header unset Pragma
Header unset Cache-Control
Header unset Last-Modified
Header set Pragma "no-cache"
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
Header set Expires "Thu, 01 Jan 1970 00:00:00 GMT"
</IfModule>
</FilesMatch>
This disables caching for all .js and .html files in this server directory and those below it, which covers more than service-worker.js alone.
Only these two file types were selected because they are the non-static files of my PWA that can affect users who run the app in a browser tab without having installed it (yet) as a full-fledged, automatically updating PWA.
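If the app is not served by Apache, a comparable setup can be sketched in Node/Express (this is an illustration under assumed paths, not part of the original answer):

import express from 'express';

const app = express();

// Serve the PWA from ./public, disabling caching only for .html and .js files.
app.use(express.static('public', {
  setHeaders: (res, filePath) => {
    if (filePath.endsWith('.js') || filePath.endsWith('.html')) {
      res.setHeader('Cache-Control', 'no-cache, no-store, must-revalidate');
    }
  },
}));

app.listen(8080);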
More details about service worker behaviour are available from Google Web Fundamentals.

Related

Cache-Control header to bust device cache but allow CDN

I am implementing an HTTP polling mechanism to detect device network status. I am planning to make a periodic GET request to a static file /static/byte.txt to validate the device's internet access.
I am using the Cache-Control: no-cache request header to make sure I am not served with a cached copy of the file on the device (which defeats the purpose). But I would like to still use any cached copy of the file on the CDN, as there is no need to download the file from the origin (my servers) every time. Does anyone know of a way to set the cache control headers to achieve that? Thanks!
The Cache-Control request header is a poor fit for this use case, because whatever directive you choose applies to every cache along the path: the client HTTP library and the CDN will both assign it the same meaning.
Instead, I recommend using a Cache-Control response header. In the response, you can use something like Cache-Control: max-age=0, s-maxage=604800, which indicates that the client should not cache the response but the CDN can cache it for up to a week (604,800 seconds).
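A minimal sketch of that response header on the polled endpoint (the path /static/byte.txt comes from the question; the Express server is an assumption):

import express from 'express';

const app = express();

app.get('/static/byte.txt', (_req, res) => {
  // max-age=0: the device must revalidate/refetch every time;
  // s-maxage=604800: shared caches (the CDN) may keep the response for a week.
  res.set('Cache-Control', 'max-age=0, s-maxage=604800');
  res.type('text/plain').send('ok');
});

app.listen(8080);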

Webextension: Set response headers for web_accessible_resources

To provide some custom caching via a webextension, I use web_accessible_resources and redirect accesses towards them in a background script – see my previous question for details.
While that works content wise, I cannot find a way to change the response headers of the cached content, for example the Last-Modified header. So when I cache content that the original website does some consistence checks on, this will fail.
I tried to intercept the redirected response with an onHeadersReceived handler, but this never triggers as “Only requests made using HTTP or HTTPS will trigger events” and my redirect uses the moz-extension:// protocol.
How does one set response headers when serving web_accessible_resources?
Is it possible at all?

Isn't it advantageous for a browser to cache static content?

I have been doing some OWASP tests and one of the low-level threats is:
Low (Medium) Incomplete or No Cache-control and Pragma HTTP Header Set
Description
The Cache-Control and Pragma HTTP headers have not been set properly or are missing, allowing the browser and proxies to cache content.
URL
<redacted url>
Evidence
public, must-revalidate, proxy-revalidate
The suggestion from OWASP is to prevent the content from being cached... but this doesn't make any sense. I thought having the browser cache certain content helped your page loading speed? In addition, how is caching static content a security threat?
It is advantageous for performance.
It is not advantageous for security if those pages contain sensitive information. If these headers are not set, then even after you log out of a website, someone with access to your computer could still reach those pages from your history just by using the 'back' button. Intermediate proxies can also cache the pages. Security is all about context: if there's no sensitive information there, then this isn't a problem.
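For the pages that do carry sensitive information, a hedged sketch (the /account route and the Express server are made up for illustration) of disabling caching only where it matters:

import express from 'express';

const app = express();

// Disable caching only for routes that return sensitive, per-user content.
app.use('/account', (_req, res, next) => {
  res.set('Cache-Control', 'no-store');
  res.set('Pragma', 'no-cache'); // for older HTTP/1.0 proxies
  next();
});

// Static assets elsewhere can keep long-lived caching for performance.

app.listen(8080);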

Prevent an HTTP client from hitting a server with cache (iphone)

Ok, I'm confused. I'm trying to send back the magic headers from my server that will prevent a client from hitting the server again until a resource is stale.
I understand how ETag or Last-Modified works (Validation) - the client will ALWAYS still hit the server, and the server needs to validate the date or etag against the current value to know whether to bother serving up a new one.
Cache-Control and Expires, however, I don't think I understand. I've set the following:
Cache-Control: max-age=86400, must-revalidate
No matter what I do, my client (my browser, curl, NSURLConnection) always hits the server again on the second request. Is this a client thing? What headers should I send back to get the client to use its private cache for a certain length of time?
As Nathan hints at in his answer, clients can issue a subsequent request with an If-Modified-Since header to determine whether or not their cache is stale. If the client receives a 304 Not Modified response, it will serve the content out of the local cache.
According to RFC 2616 (the HTTP/1.1 specification), must-revalidate in the Cache-Control header means that once the cached response has become stale, the client must revalidate it with the origin server before serving it from the cache; it does not force revalidation while the response is still fresh, so on its own it shouldn't cause a request on every load.
For future reference - Mark Nottingham has written a great guide to HTTP caching:
http://www.mnot.net/cache_docs/#CACHE-CONTROL
The server needs to check the If-Modified-Since header and return a 304 Not Modified response if it wants the browser to keep serving its cached copy.
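A rough sketch of that server-side validation (the resource path, the timestamp, and the Express server are invented for illustration):

import express from 'express';

const app = express();

// Pretend the resource was last changed at this fixed time.
const lastModified = new Date('2010-01-01T00:00:00Z');

app.get('/resource', (req, res) => {
  const since = req.get('If-Modified-Since');
  if (since && new Date(since) >= lastModified) {
    // The client's cached copy is still current: no body, just 304.
    res.status(304).end();
    return;
  }
  res.set('Last-Modified', lastModified.toUTCString());
  res.set('Cache-Control', 'max-age=86400, must-revalidate');
  res.send('fresh content');
});

app.listen(8080);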

In what situations does the HTTP_REFERER not work?

I have used the referrer (via $_SERVER['HTTP_REFERER']) in foo.php to decide whether the page iframing foo.php is a particular URL.
It turned out that it worked most of the time (about 98% of the time), but it also seemed like some users arrived at the page with $_SERVER['HTTP_REFERER'] not set in foo.php, which broke the code. [update: These users claimed that they followed the usual page flow, didn't open the URL of foo.php all by itself in the browser (i.e. they let it be an iframe), and never altered their browser settings.]
What are the reasons this could happen?
The HTTP/1.1 RFC does not make it mandatory to send an HTTP Referer header. You can't make any assumptions about its presence when writing robust code; perfectly conformant browsers may not include it.
Moreover, the RFC advises that "The Referer field MUST NOT be sent if the Request-URI was obtained from a source that does not have its own URI, such as input from the user keyboard", and "We suggest, though do not require, that a convenient toggle interface be provided for the user to enable or disable the sending of From and Referer information".
The latter is not very common (though some browsers have a "Private" mode that fulfils the requirements). More likely for your 2% is that people bookmarked the URL, which fulfils the first criterion (a URI obtained from a source without a URI), and so the browser sends no referer.
Not by default AFAIK, but it's easy to turn off (for privacy), e.g. in Firefox via about:config, and surely some users could be using browsers distributed to them (e.g. by their IT department) with that kind of setting. So you should try to avoid relying on REFERER for any important functionality (also because it's mis-spelled, of course ;-).