How to fix service worker red dot error without unregistering it? - progressive-web-apps

I've observed that when I make changes to a service worker, it occasionally displays as a red dot.
I tried refreshing the browser, but the red dot remained; the only way to remove it is to unregister it.
I'm quite sure my service worker is alright, because reinstalling it removes the red dot.
What is causing this?

That means the service worker is installed and waiting, but not yet active. Every new service worker passes through the phases installing => waiting => active. An updated service worker is deliberately kept waiting, so that swapping it in doesn't cause unexpected behaviour while the browser is still running the app, and it has to be activated explicitly if you want it to take over right away. Try skipWaiting():
// service-worker.js
self.addEventListener('install', e => {
  self.skipWaiting() // always activate updated SW immediately
})

Related

How to add multiple service workers on Flutter Web

I want to add a service worker (in addition to the existing "flutter_service_worker").
The problem I have is that I can't get both to be active at the same time. I don't quite understand the scope issue.
When I register my service worker, I do the following in index.html: navigator.serviceWorker.register("./sw.js", { scope: "/" });
Then I install it and activate it (caching assets), but it overlaps with the original one.
Which scope should be used to keep both service workers active?

Service Fabric upgrades keep active connections alive

I am trying to upgrade an application deployed to service fabric.
How can I only upgrade nodes that have no active connections and wait for the busy nodes to finish before upgrading them?
Most of the time you don't really have to worry about upgrades at the node level, as the SF runtime handles this internally if the upgrade is configured in Monitored mode. This is what we've been using with a high level of success, and we never really had to do much. It also fit our requirement that all upgrade domains (nodes) have to match our health state policies before being considered healthy.
If you want more advanced control over your upgrades, such as request draining, have a look at the info mentioned here. But to be honest, we've been quite happy with just using Monitored mode and investigating why something fails when it does. We had some apps with a long-running background task implemented as a stateful actor that sometimes failed to upgrade, and almost always it was due to an issue in the background task itself rather than anything to do with Service Fabric.
Service Fabric knew when no active connections or background tasks were running and only then upgraded those nodes; we could actually see the nodes that were temporarily 'stuck' waiting for an active background task to finish.

Service Fabric "Waiting for upgrade..." using VSTS

I've configured upgrades on my VSTS release of a Service Fabric app containing 5 services to a single node test environment on Azure. Unfortunately when it gets to the release part it just hangs saying "Waiting for upgrade..." over and over again. I left it for 15 hours and it still says the same thing. The initial deployment went ahead without issue.
I've looked at various posts about turning off health check times, but this has not been successful. I've also tried setting the mode to UnmonitoredAuto, but no success.
I've RDPd onto the environment and checked the processor/memory usage in task manager, and everything is pretty much 0%, and very low memory usage.
Is there anything else I can do to stop the upgrade hanging?
OK, I've managed to fix this. It was happening because of the PreUpgradeSafetyCheck that runs before rolling out an upgrade. This check is not relevant for a single-node cluster, where downtime during an upgrade is inevitable anyway.
The status of an upgrade can be inspected with Get-ServiceFabricApplicationUpgrade, which showed the status above.
To fix it, there is an UpgradeReplicaSetCheckTimeoutSec setting in the release task; setting its value to 0 sorts things out.

CoordinatedShutdown timeout on Akka cluster application

We have an Akka cluster application (sharding some actors). Sometimes when we deploy and the application should shut down, we see logs like this:
Coordinated shutdown phase [cluster-sharding-shutdown-region] timed out after 10000 milliseconds
This happens on the first deploy after more than two days since the last deploy (on Mondays, for example). We ask the Akka node to leave the cluster with the JMX helper, and we also have the following code:
actorSystem.registerOnTermination {
logger.error("Gracefully shutdown of node")
System.exit(0)
}
So when this error happens, the node eventually leaves the cluster (or at least it closes the JMX entry point for managing the Akka cluster), but the process doesn't finish and the "Gracefully shutdown of node" log line never appears. When this happens we need to shut down the Java process manually (we handle this with supervisor) and redeploy.
I know the timeout can be tuned through config, but what are the implications of increasing it? Why does coordinated shutdown sometimes time out? And what happens when it does?
Any clue would be appreciated :D
Thank you
What happens after the timeout? Quoting from the Akka documentation:
If tasks are not completed within a configured timeout (see reference.conf) the next phase will be started anyway. It is possible to configure recover=off for a phase to abort the rest of the shutdown process if a task fails or is not completed within the timeout.
Why might the shutdown time out? Quite possibly you have a deadlock somewhere; in that case, increasing the timeout won't help. It may also simply be that you need more time for shutdown; then you must increase the timeout.
But more relevant to your problem could be the following:
By default, the JVM is not forcefully stopped (it will be stopped if all non-daemon threads have been terminated). To enable a hard System.exit as a final action you can configure:
akka.coordinated-shutdown.exit-jvm = on
So you can turn this on, which should take care of the "shut down the Java process manually" step.
Nevertheless, the hard question is to find out why the shutdown times out in the first place. I guess with the above trick you can survive for some time, but you'd better spend some time finding the actual cause.
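If you do decide to raise the timeout of the phase from your log, both that and the hard exit can be set through configuration. A minimal sketch in Scala, assuming the default phase name from the error above; the system name and the 30-second value are only illustrative:

import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

// Illustrative overrides, merged on top of the application's own config.
val shutdownTuning = ConfigFactory.parseString("""
  akka.coordinated-shutdown.exit-jvm = on
  akka.coordinated-shutdown.phases.cluster-sharding-shutdown-region.timeout = 30 s
""")

val system = ActorSystem("app", shutdownTuning.withFallback(ConfigFactory.load()))

With exit-jvm = on, the registerOnTermination/System.exit(0) block from the question should no longer be needed, since Coordinated Shutdown performs the exit itself as its final action.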
We used to face this problem (a timeout in one of the coordinated shutdown phases) with a short-lived application.
The use case where we faced it:
1. The application joins an existing Akka cluster
2. It does some work
3. It leaves the cluster
But at step 3 the member's status was still Joining or WeaklyUp, and if you look at the task added for PhaseClusterLeave, it only removes a member from the cluster if its status is Up.
Snippet from ClusterDaemon.scala, which is invoked when the PhaseClusterLeave phase runs:
def leaving(address: Address): Unit = {
  // only try to update if the node is available (in the member ring)
  if (latestGossip.members.exists(m ⇒ m.address == address && m.status == Up)) {
    // mark node as LEAVING
    val newMembers = latestGossip.members map { m ⇒ if (m.address == address) m.copy(status = Leaving) else m }
    val newGossip = latestGossip copy (members = newMembers)
    updateLatestGossip(newGossip)
    logInfo("Marked address [{}] as [{}]", address, Leaving)
    publishMembershipState()
    // immediate gossip to speed up the leaving process
    gossip()
  }
}
To solve this problem, we ended up writing our own CoordinatedShutdown, which you can refer to here: CswCoordinatedShutdown.scala
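The linked file contains the full implementation; purely as an illustration of the idea (this is a hypothetical sketch, not the code behind that link), a Coordinated Shutdown task that holds the shutdown sequence back until the member has actually reached Up could look roughly like this:

import akka.Done
import akka.actor.{ ActorSystem, CoordinatedShutdown }
import akka.cluster.{ Cluster, MemberStatus }
import scala.concurrent.{ Future, Promise }

object WaitForMemberUp {
  // Hypothetical helper: delay the cluster shutdown phases until this node
  // has reached Up, so that the cluster-leave phase can mark it as Leaving.
  def install(system: ActorSystem): Unit = {
    val cluster = Cluster(system)
    CoordinatedShutdown(system).addTask(
      CoordinatedShutdown.PhaseBeforeClusterShutdown, "wait-for-member-up") { () =>
      if (cluster.selfMember.status == MemberStatus.Up)
        Future.successful(Done)
      else {
        val up = Promise[Done]()
        cluster.registerOnMemberUp(up.trySuccess(Done)) // completes once the member is Up
        up.future
      }
    }
  }
}

If the member never reaches Up, this task simply runs into its own phase timeout and the shutdown proceeds as before, so treat it as a mitigation rather than a root-cause fix.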

Removing service fabric application fails

I have deployed an application to a 5-node standalone cluster. The deployment succeeded, but the application did not start because of a bug in the application.
I tried removing the application from the cluster using the Service Fabric Explorer but this fails.
The health State of the application is “Error” and the status is “Deleting”
The application has 9 services. 6 services show a Health State “Unknown” with a question mark and a Status “Unknown”. 3 services show a health state “Ok” but with a Status “Deleting”.
I have also tried to remove it using powershell:
Remove-ServiceFabricApplication -ApplicationName fabric:/appname -Force -ForceRemove
The result was an "Operation timed out" error.
I also tried the script below that I found in some other post.
Connect-ServiceFabricCluster -ConnectionEndpoint localhost:19000
$nodes = Get-ServiceFabricNode
foreach($node in $nodes)
{
    $replicas = Get-ServiceFabricDeployedReplica -NodeName $node.NodeName - ApplicationName "fabric:/MyApp"
    foreach ($replica in $replicas)
    {
        Remove-ServiceFabricReplica -ForceRemove -NodeName $node.NodeName -PartitionId $replica.Partitionid -ReplicaOrInstanceId $replica.ReplicaOrInstanceId
    }
}
That also gave no result; the script did not find any replicas to remove.
At the same time as we started removing the application, one of the system services also changed state.
The fabric:/System/NamingService service shows a “Warning” health state.
This is on partition 00000000-0000-0000-0000-000000001002.
The primary replica shows:
Unhealthy event: SourceId='System.NamingService', Property='Duration_PrimaryRecovery', HealthState='Warning', ConsiderWarningAsError=false.
The PrimaryRecovery started at 2016-10-06 07:55:21.252 is taking longer than 30:00.000.
I also restarted every node (one at a time), with no result.
How can I force removal of the application without recreating the cluster? Recreating it is not an option for a production environment.
Yeah, this can happen if your code never exits RunAsync or the Open/Close methods of your ICommunicationListener.
Some background:
Your service has a lifecycle that is driven by Service Fabric. A small component in your service - you know it as FabricRuntime - drives this. For stateless service instances, it's a simple open/close lifecycle. For stateful services, it's a bit more complex: a stateful service replica opens and closes, but also changes role between primary, secondary, and none. Lifecycle changes are initiated by Service Fabric and show up as a method call or a cancellation-token trigger in your code. For example, when a replica is switched to primary, we call your RunAsync method. When it switches from primary to something else, or needs to shut down, the cancellation token is triggered. Either way, the system waits for you to finish your work.
When you go to delete a service, we tell your service to change role and close. If your code doesn't respond, it will get stuck in that state.
To get out of that state, you can run Remove-ServiceFabricReplica -ForceRemove. This essentially drops the replica from the system - as far as Service Fabric is concerned, the replica is gone. But your process is still running, so you have to go in and kill the process too.
The error in the script above is the '- ApplicationName' (with a space), which should be '-ApplicationName'.
After correcting that parameter, the script DID remove the hosed-up pieces and got me back to a state where I could fix and redeploy the application to the cluster.