I have tried to delete a Cloud Data Fusion instance. The console has said the instance is deleting for several days now. Even though I don't have an actual pipeline running, it is accumulating charges of ~$40/day. When I try to delete the instance that is stuck on delete, I get an error saying the deletion failed.
I had a similar problem when I mistakenly removed some of the service accounts, though I can't recall which SA it was. I used this to undelete the SA:
$ gcloud beta iam service-accounts undelete __ACCOUNT_ID__
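If you also can't remember which service account was removed, one hedged way to find it (and the numeric account ID that undelete expects) is to search the Admin Activity audit logs; the 30-day window below is just an assumption:

$ # List recent DeleteServiceAccount audit log entries for the current project
$ gcloud logging read \
    'protoPayload.methodName="google.iam.admin.v1.DeleteServiceAccount"' \
    --freshness=30d

The resourceName in each matching entry should include the deleted account's unique numeric ID.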
I had the same problem. Data Fusion (DF) was stuck in a deleting state for days.
There is an option to update the DF instance from the console. When I triggered an update, the instance went back to the running state. After that, the deletion was successful.
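If the console delete keeps failing, it may also be worth checking the instance state and retrying the delete from the CLI. The instance name and region below are placeholders, and these commands may require the gcloud beta component:

$ # Check the instance state, then retry the delete once it is back to RUNNING
$ gcloud beta data-fusion instances describe my-instance --location=us-central1
$ gcloud beta data-fusion instances delete my-instance --location=us-central1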
I hope this can help someone.
I am deploying a stack with CDK. It gets stuck in CREATE_IN_PROGRESS. CloudTrail shows these events repeating:
DeleteNetworkInterface
CreateLogStream
What should I look at next to continue debugging? Is there a known reason for this to happen?
I also saw the exact same issue with a CDK-based ECS/Fargate deployment.
In my case, I was able to diagnose the issue by following the AWS support article: https://aws.amazon.com/premiumsupport/knowledge-center/cloudformation-stack-stuck-progress/
What specifically diagnosed and then resolved it for me:
I updated my ECS service to set its desired task count to 0. At that point, the CloudFormation stack completed successfully.
From that, it became obvious that the actual issue was related to the creation of the initial task for my ECS service. I was able to diagnose it by reviewing the output in the Deployments and Events tabs of the ECS service in the AWS Management Console. In my case, task creation was failing because of an issue with accessing the associated ECR repository. Obviously there could be other reasons, but they should show up there.
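For reference, roughly the same steps from the AWS CLI; the cluster and service names here are placeholders:

$ # Scale the service to zero so CloudFormation can finish
$ aws ecs update-service --cluster my-cluster --service my-service --desired-count 0
$ # The same failure reasons that appear in the console's Events tab
$ aws ecs describe-services --cluster my-cluster --services my-service \
    --query 'services[0].events[:10]'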
When I try to delete the Cloud Object Storage service in my Cloud Pak for Data as a Service account, I get this message:
An unexpected error occurred while attempting to delete service instance: Unexpected response code: 500 (Instance is in pending_reclamation state. Use API to reclaim the instance.)
What does this error mean? What should I do to really delete this service?
According to https://cloud.ibm.com/docs/hyper-protect-dbaas-for-mongodb?topic=hyper-protect-dbaas-for-mongodb-what-new#dec-2020:
When you delete a service instance, it's disabled (pending reclamation) rather than deleted completely. You can restore a deleted service instance with no data loss within the retention period of seven days. You can also choose to permanently delete it. See Deleting service instances.
This seems to be true for all IBM Cloud services, not just MongoDB. Using the command line interface, I was able to actually delete the Cloud Object Storage service I had.
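If it helps, the commands involved should look roughly like this (I'm assuming the standard ibmcloud resource commands; the reclamation ID comes from the first command's output):

$ # List service instances that are pending reclamation
$ ibmcloud resource reclamations
$ # Permanently delete (reclaim) the instance
$ ibmcloud resource reclamation-delete RECLAMATION_ID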
I am using Ubuntu 18.04 on an AWS EC2 free tier instance, running websites on an Apache server with Node.js and a PostgreSQL database. All deployments went fine and the web apps work without any exceptions or errors.
However, I am facing an annoying issue: the instance stops frequently without any exception or error logs. After rebooting the instance, everything works fine again, but after some time it automatically stops, either within a few hours on the same day or within 1-2 days after that.
I created another free tier instance with a separate account, and it has the same issue. I cannot find any logs or troubleshooting options to get rid of this problem.
How can this be troubleshooted, and where can I find logs of any errors or exceptions for this instance?
The suggestions given by AWS under "Instance Status Check" (attached below) are not a practical solution to apply every time.
Something with your VM itself is causing its health checks to fail.
Have a look at the syslog and your application logs. Also take a look at CloudWatch metrics to see whether any of them change dramatically close to the time the instance stops.
You can also add a CloudWatch alarm with a reboot or recover action so the instance restarts automatically if there's an issue with your VM.
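As a sketch of those suggestions (the instance ID, region, and thresholds below are placeholders, not values from your setup):

$ # Pull the instance's console output, which often shows why it went down
$ aws ec2 get-console-output --instance-id i-0123456789abcdef0 --output text
$ # Alarm on the instance status check and reboot automatically when it fails
$ aws cloudwatch put-metric-alarm \
    --alarm-name ec2-reboot-on-status-check \
    --namespace AWS/EC2 --metric-name StatusCheckFailed_Instance \
    --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
    --statistic Maximum --period 60 --evaluation-periods 3 \
    --threshold 1 --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions arn:aws:automate:us-east-1:ec2:reboot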
We are experimenting with Kubernetes and Confluence in the cloud and have deployed Confluence connected to a pgsql database. When applying an update, something happened that caused the pgsql pod to crash and lose its persistent volume connections.
Thankfully the volume's reclaim policy was set to Retain, so we still have the volume, and I have since been able to point a new pgsql instance at it, but I can't find a way to get Confluence to see this existing database. Confluence just proceeds to the initial fresh-install screens. I've tried installing it against a temporary database and then modifying the confluence.cfg.xml file to point to the old data once the install completed, but Confluence will not restart when I try this.
Any help is appreciated.
Using the web installer, you should get a step to select "My own database". From there you can configure the database credentials and host. Go ahead and let the installer run; it will overwrite the default settings but retain your existing data.
Also, you may want to get on the psql shell via console and check to make sure that your data actually exists and you haven't ended up with an empty database.
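One hedged way to do that check from the cluster, assuming the pod, user, and database names used here:

$ # List the tables in the Confluence database; an empty list means no data survived
$ kubectl exec -it <postgres-pod> -- psql -U confluence -d confluence -c '\dt'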
If you're still stuck, reach out here and we can check out the next steps.
In my case, the original solution posted here is accurate.
However, I had to do this in a non-containerized environment. I installed Confluence on a VM using a blank database, then modified the confluence.cfg.xml file to point to the pgsql database in the Kubernetes cluster and restarted Confluence. I was able to see my data, so I then used Confluence's XML export feature to grab the dataset. I then blew away the Kubernetes environment, re-created it from scratch, and imported the backed-up XML into the new instance. Not a super clean way of doing it, but it got me where I needed to go.
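For anyone trying the same thing, the database connection Confluence uses lives in the hibernate.connection.* properties of confluence.cfg.xml. A rough sketch, assuming the default home directory and a standard Postgres JDBC URL:

$ # Show the current connection settings in the Confluence home directory
$ grep 'hibernate.connection' /var/atlassian/application-data/confluence/confluence.cfg.xml
$ # Point hibernate.connection.url at the recovered database, e.g.
$ # jdbc:postgresql://<pgsql-host>:5432/confluence, then restart Confluence.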
I can't start creating a Service Fabric cluster.
When I start the creation flow, the portal always shows a "rainy cloud" error and nothing can be entered.
Thanks for reporting this, we found a problem in the portal that may be causing this. We'll be rolling out a fix in the next few days.
BTW, we have a repo on GitHub that you can use to report issues like this for a faster response: https://github.com/Azure/service-fabric-issues