Hangfire Recurring jobs are stuck on processing state in AWS ECS container - postgresql

Hangfire recurring jobs are stuck in the Processing state in an AWS ECS container, even after truncating all Hangfire tables and restarting the container.
We are hosting a console application, and the following code creates the server:
var options = new BackgroundJobServerOptions
{
    WorkerCount = 1
};
Log.Information("********************** Hangfire Server started **********************");
var server = new BackgroundJobServer(options);

connection to external server (mongodb server) fails from fargate container deployed using cdk

I created a simple node.js/express app, built a Docker image, and successfully pushed it to AWS ECR.
Next, I created a CDK project to deploy this container to Fargate behind a public Application Load Balancer, using ecs_patterns.ApplicationLoadBalancedFargateService.
Although the deployment command (cdk deploy) succeeded, the cluster page in the AWS console shows "No tasks running", and the Services tab within the cluster shows a red bar with "0/1 Tasks running". The Tasks tab shows tasks being created and stopped in a loop: every 1 or 2 minutes a task is created, eventually stops, and a new one is created, and this goes on forever.
Opening a stopped task, its Logs tab shows:
ERROR: Connecting to MongoDB failed. Please check if MongoDB server is running at the correct host/port.
This is the error message my app logs when the connection to MongoDB fails during server initialization.
The DB credentials and connection url are valid (see below) and it runs in a separate EC2 instance with EIP and domain name. In fact, I can connect to the DB from my dev machine which is outside aws.
Also, just for trial, I created a stack manually through console by creating security groups (for load balancer and service), target group, application load balancer, listener (port 80 HTTP), cluster, task definition (with correct db credentials set in env var), service, etc., it's working without any issue.
All I want is to create similar stack using cdk (I don't want to manually create/maintain it)
Any clue on why connection to external server/db is failing from a fargate container would be very useful. I'm unable to compare the "cdk created cloudformation template" (that's not working) with the "manually created stack" (that's working) as there are too many items in the autogenerated template.
Here is the cdk code based on aws sample code:
const vpc = new ec2.Vpc(this, "MyVpc", { maxAzs: 2 });
const cluster = new ecs.Cluster(this, "MyCluster", { vpc });
const logDriver = ecs.LogDriver.awsLogs({ streamPrefix: "api-log" });
const ecrRepo = ecr.Repository.fromRepositoryName(this, "app-ecr", "abcdef");
new ecs_patterns.ApplicationLoadBalancedFargateService(
  this, "FargateService", {
    assignPublicIp: true,
    cluster,
    desiredCount: 1,
    memoryLimitMiB: 1024,
    cpu: 512,
    taskImageOptions: {
      containerName: "api-container",
      image: ecs.ContainerImage.fromEcrRepository(ecrRepo),
      enableLogging: true,
      logDriver,
      environment: { MONGO_DB_URL: process.env.DB_URL as string }
    },
    publicLoadBalancer: true,
    loadBalancerName: "api-app-lb",
    serviceName: "api-service"
  }
);
It turned out to be a silly mistake! Instead of MONGO_DB_URL it should be DB_URL because that's what my node.js/express server in the container is using.
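A mismatch like this is easier to spot if the app checks its required environment variables at startup and names the missing one in the error, instead of failing later with a generic connection message. Here is a minimal sketch of that idea; the helper name requireEnv and its signature are hypothetical and not part of the original app:

```typescript
// Hypothetical helper (not from the original app): fail fast at startup and
// name exactly which environment variables are missing.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, names: string[]): Record<string, string> {
  // Treat unset and empty values as missing.
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Naming the variable makes a key mismatch (MONGO_DB_URL vs DB_URL)
    // visible in the stopped task's logs.
    throw new Error(
      `Missing required environment variable(s): ${missing.join(", ")}`
    );
  }
  const values: Record<string, string> = {};
  for (const name of names) {
    values[name] = env[name] as string;
  }
  return values;
}
```

In the express app this could be called once before connecting, for example requireEnv(process.env, ["DB_URL"]). With the CDK code above, the stopped task's log would then report that DB_URL is missing, which points at the environment key rather than at the network.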

JBPM JobExecutor not working on PCF but working in local

We implemented job executors in jBPM, running the jBPM engine as a Spring Boot KIE Server.
The job executors run locally and update the RequestInfo table accordingly, but when we deploy to PCF the scheduled jobs do not run.
We set the property jbpm.executor.enabled to true,
along with other properties.
Please advise what the issue could be in PCF.

ECS get image from QUAY.io and spin ec2Spot: Infinitely waiting for task to start - desiredCount = 1, pendingCount = 0

I've set up a pipeline that talks to ECS and spins up an EC2 Spot instance.
It gets stuck on the following message:
PRIMARY task ******:5 - runningCount = 0 , desiredCount = 1, pendingCount = 0
This basically means it is waiting for the task to start, but something is off in the setup and the task never starts. Any suggestions on where to look?
Note:
This is a testing app that spins up a browser, so no ports are required
No load balancer
Possibly a quay.io integration issue, but I can't figure it out without logs
The CloudTrail log contains only success messages for the taskDefinition create and update
Thanks
After about 8 hours of banging my head against the wall, this issue was solved.
It had been answered a long time ago by this fella: https://stackoverflow.com/a/36533601/5332494
Steps it took me to figure it out:
1. Look in CloudTrail => Event history => Event name column (UpdateService) => click View event => find the error message ("was unable to place a task because no container instance met all of its requirements. Reason: No Container Instances were found in your cluster. For more information, see the Troubleshooting section of the Amazon ECS Developer Guide"), which will take you to https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-event-messages.html#service-event-messages-1
2. The page linked above lists the possible causes if you got the same message as I did (see step 1). The first option on that page:
No container instances were found in your cluster
took me to https://docs.aws.amazon.com/AmazonECS/latest/developerguide/launch_container_instance.html
3. That's where I added a container instance to my ECS cluster and was finally able to add the EC2 Spot instance through the Codefresh pipeline talking to ECS.
Notes:
ECS had to talk to Quay.io to pull the Docker image from their private registry. All I had to do was create a secret in AWS Secrets Manager with the following default format:
{
  "username": "your-Quay-Username",
  "password": "your-Quay-password"
}
That's it :)
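For completeness, the secret is referenced from the container definition through the repositoryCredentials.credentialsParameter field of the ECS task definition. The sketch below only builds that fragment as a plain object; the function name, container name, image path, and ARN are all placeholders I made up for illustration, not values from my setup:

```typescript
// Shape of the relevant part of an ECS container definition.
interface PrivateRegistryContainer {
  name: string;
  image: string;
  repositoryCredentials: { credentialsParameter: string };
}

// Hypothetical helper: builds the container definition fragment that points
// ECS at a Secrets Manager secret holding the registry username/password.
function privateRegistryContainer(
  name: string,
  image: string,
  secretArn: string
): PrivateRegistryContainer {
  // Assumption for this sketch: the secret is passed as a full ARN.
  if (!secretArn.startsWith("arn:")) {
    throw new Error(`Expected a secret ARN, got: ${secretArn}`);
  }
  return {
    name,
    image,
    repositoryCredentials: { credentialsParameter: secretArn },
  };
}

// Placeholder values only.
const containerDefinition = privateRegistryContainer(
  "browser-tests",
  "quay.io/example-org/example-image:latest",
  "arn:aws:secretsmanager:us-east-1:123456789012:secret:quay-credentials"
);
```

The task execution role also needs permission to read that secret (secretsmanager:GetSecretValue), otherwise the image pull fails even when the secret format is right.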

Microsoft Service Fabric - fabric:/System/ImageStoreService not running

I am trying to copy an app to the service fabric image store.
I am not able to copy the application via VS or Powershell (probably because of the fabric:/System/ImageStoreService being in Error state). The operation times out when done using Visual Studio and when done using Powershell - it just stays stuck indefinitely.
I don't know how to approach services that are not running on the Service Fabric Cluster. I have other services failing on the cluster as well - this is a new test cluster created using the Azure portal yesterday (see attached screenshot).
Error event: SourceId='System.FM', Property='State'. Partition is below target replica or instance count.
ImageStoreService 3 3 00000000-0000-0000-0000-000000003000
N/P InBuild _nodr_0 131636712421204228
(Showing 1 out of 1 replicas. Total available replicas: 0)

How to run a quartz trigger in my local server in clustered environment

Multiple servers are running against the same database with the same quartz.properties file,
but there is a requirement (for debugging purposes) where I want the jobs created from my server to be triggered specifically on my server.
I tried changing org.quartz.scheduler.instanceName, but it didn't work; jobs are still triggering on random machines.
Is there any way to make jobs that were configured by my server trigger only on my server?
**my quartz properties**
org.quartz.scheduler.instanceName = myLocalInstance
org.quartz.scheduler.instanceId = AUTO
** all other server instance Quartz properties **
org.quartz.scheduler.instanceName = commonInstance
org.quartz.scheduler.instanceId = AUTO