Google Cloud SQL Migration Job stuck on Running

I've got a database on Cloud SQL that is used by our application running on Kubernetes in GKE.
The MySQL instance is running 5.6, and I need to upgrade it to 5.7, so I tried using the new migration jobs.
I've set up the connection profile and all the required permissions for the source DB, then followed the instructions to set up a continuous migration.
The job says it's running, migrating the ~450 GB database. After about a day it's still running, the storage used seems to have stopped growing, and the replication delay is at 0. The source database is not currently in use (that's why I'm using it to try this out before doing the same with a more important DB).
According to this, if the dump phase is done, I should be able to promote the instance, but the promote button remains greyed out, and there's no way to check the running state (it only says "running", and I don't see any way to check whether it's dumping, doing CDC, or anything else).
The documentation seems a bit lacking, and I couldn't find anything by googling around. Has anyone been using this?
In short, my questions are:
Why can't I promote the instance?
And how can I check which phase the migration is in? (One way to check this from the CLI is sketched after this post.)
Here's a screencap of my job: link (because SO doesn't let me embed images yet).
Thanks.
P.S.: the tag that the documentation says should be used on Stack Overflow is google-cloud-database-migration-service, which is too long and Stack Overflow doesn't allow it, so I used google-cloud-sql instead :/
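One way to inspect the job outside the console, assuming the Database Migration Service commands are available in your gcloud install (the service was still new when this was asked, so they may only exist under gcloud beta), is to describe the migration job; the job name and region below are placeholders:

    # Describe the migration job; the output should include its state and,
    # if your gcloud version exposes it, a phase (e.g. full dump vs. CDC).
    gcloud database-migration migration-jobs describe my-56-to-57-job \
        --region=us-central1

    # Once the job is replicating changes (CDC) with zero delay, promotion
    # should also be possible from the CLI:
    gcloud database-migration migration-jobs promote my-56-to-57-job \
        --region=us-central1

If the commands aren't available, the same state and phase information should be returned by the Database Migration Service API's migrationJobs.get method.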

I am seeing an issue like this, but possibly more frustrating. After a week of migrating a 2 TB database, the storage used resets to near zero and the full dump restarts, without any errors or any indication of what happened.

Related

Cloud SQL MySQL - Stuck in failover operation in progress

My highly available Cloud SQL MySQL 5.7.37 instance is stuck on "Failover operation in progress. This may take a few minutes. While this operation is running, you may continue to view information about the instance." It is a fairly small database, it has been stuck like this for 5 hours, and the failover never completes, so no DB queries can be executed and our system is currently down.
No commands can be run on the DB since it is in an updating state; the error log is empty, and the operations log only contains this update and successful backups.
Does anyone have any suggestions? I am not paying for Google Support so I can't get support directly from them (which I think is terrible since this is a fully managed service).
Best,
Carl-Fredrik
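Not a fix, but when the console is stuck like this it can help to look at the instance's operations from the CLI and confirm whether the failover operation is actually still pending; the instance name below is a placeholder:

    # List recent operations on the instance; a stuck failover should show up
    # here with a PENDING or RUNNING status and a start time.
    gcloud sql operations list --instance=my-instance --limit=10

    # Given the operation ID from the list above, this waits for it to finish
    # (or fail), which at least tells you whether it is still making progress.
    gcloud sql operations wait OPERATION_ID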

When an RDS Postgres DB is dropped, does it clear the connected Redis ElastiCache?

This question is as simple as the title suggests...
We are not fully confident that reloading data into our RDS Postgres instance clears the Redis cache that is caching calls made to that DB. We are therefore not confident whether the data displayed in our UI is built from new data or stale cache data. Does anyone have an idea?
We've mined AWS to the best of our ability to see if we can see the data/size of what is in the cache, but to little avail.
It seems fairly difficult to research what appears to be a simple question, as most Google results are related to clearing a cache, full stop (I'm guessing a lot of people have issues doing that).
A bit more digging into this and I've found a reasonable metric which gives the information I needed. For anyone who runs into the same issue in the future, follow these steps:
1. Drop your DB.
2. Navigate to the ElastiCache dashboard in AWS.
3. Select the cache you're interested in.
4. Scroll down through the metric graphs until you see 'Current Items (Count)'.
From this graph you can see the number of items in your cache, and it should be pretty clear whether the DB drop emptied the cache.
N.B. At the time of writing, a DB drop appears to be clearing our cache, although we are not entirely sure whether this is down to an optional setting we've introduced that can be turned off.
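The same 'Current Items' metric can also be pulled from CloudWatch without clicking through the console; here's a rough sketch with the AWS CLI, where the cache node ID and the one-hour window are placeholders (the metric is published under the AWS/ElastiCache namespace as CurrItems):

    # Average number of items in the cache over the last hour, in 5-minute
    # buckets; if it drops to ~0 right after the DB reload, the cache was
    # emptied. (The date arithmetic assumes GNU date.)
    aws cloudwatch get-metric-statistics \
        --namespace AWS/ElastiCache \
        --metric-name CurrItems \
        --dimensions Name=CacheClusterId,Value=my-redis-node-001 \
        --statistics Average \
        --period 300 \
        --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
        --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)"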

Google Cloud SQL Postgres - randomly slow queries from Google Compute / Kubernetes

I've been testing Google Cloud SQL with PostgreSQL, but I have random queries taking ~3 s instead of a few ms.
Troubleshooting I did:
The queries themselves aren't the problem; rerunning the same query works.
Indexes are properly set. The database is also very, very small; it shouldn't do this even if there weren't any indexes.
The Kubernetes container is connecting to the database through the SQL proxy (I followed this: https://cloud.google.com/sql/docs/postgres/connect-kubernetes-engine). It is not the problem, though, as I tried connecting directly to the database and had the same issue.
I configured net.ipv4.tcp_keepalive_time to 60 to make sure the connections weren't being dropped (see the sketch after this list).
I also have a pool of connections that are never disconnected, to make sure it wasn't coming from that.
When I run queries directly through my local PostgreSQL client, I never have the problem.
I don't have this issue when developing locally and connecting to my local database, either.
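For reference, the keepalive tuning mentioned in the list above looks roughly like this; the sysctl value mirrors the one used above, the libpq keepalive parameters are standard connection options, and the host/credentials are placeholders:

    # Shorten TCP keepalives on the client host (the value used above).
    sudo sysctl -w net.ipv4.tcp_keepalive_time=60

    # libpq also accepts per-connection keepalive settings, so the same thing
    # can be done without touching the host-wide sysctl:
    psql "host=127.0.0.1 port=5432 dbname=mydb user=myuser keepalives=1 keepalives_idle=60 keepalives_interval=10 keepalives_count=3"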
What I'm getting at is: I feel there's some weird connection/link issue between my Compute Engine instances and my Cloud SQL instance that I can't seem to figure out.
Any idea?
Edit:
I also noticed these logs in my Cloud SQL instance every 30 seconds:

    ERROR: recovery is not in progress
    HINT: Recovery control functions can only be executed during recovery.
    STATEMENT: SELECT pg_is_xlog_replay_paused(), current_timestamp
That's an interesting problem you're facing. My knowledge of Kubernetes isn't that great, but I do have a general understanding, so let's see if I can provide some suggestions.
To start with, the API you linked to in your question does mention that it is still in beta, so I do believe there could still be issues to patch in maximizing speed and performance.
Secondly, from what I understand, Kubernetes is a great tool for handling stateless workloads; handling data where state is required for queries would thus be a slower operation. This article (although not entirely related) does explain some of the pitfalls of Kubernetes (not all of it is relevant).
Thirdly, could you explain your use case a little bit? Do you really need to use Kubernetes, or would another tool like a powerful Compute Engine instance or a Dataflow job resolve the issue? Are you making your database queries through a programming language or an application call?
Thanks, and do let me know!

Why is my MongoDB collection deleted automatically?

I have MongoDB running on three EC2 instances and I have created a replica set. Last time I had a problem with a space constraint that stopped my mongod process and halted the application; then, in an incident a couple of days back, some of my collections were gone from the database, so I set up logging on my database just to catch it if anything like that happened again. In a fresh incident this morning, I was unable to log in to my system, and that's when I found out that the whole database was empty. I checked other SO questions like this one, which suggest setting up a TTL, which I haven't done at all.
Now how do I debug this situation and do a proper root-cause analysis? I can't find anything in my debug logs either. The collections just vanished. How do I set up a proper logging mechanism, and how do I ensure my collections are never deleted again?
Today I got an email from Amazon saying that I was probably running an unsecured MongoDB instance and that this may have caused the issue. So whoever is facing this issue, please go through the Security Checklist provided by MongoDB. There are some points in there that are absolutely necessary:
1. Enable Access Control and Enforce Authentication
2. Encrypt Communication
3. Limit Network Exposure
These three are the core, and depending on how many people access your database you can also configure Role-Based Access Control.
These are all things I have done (a rough sketch of the first steps is below). Before this incident I had not taken security that seriously, but after being hit by it I made sure I had all the necessary precautions in place.
Hope this helps someone.
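In case it helps, the first two checklist items boil down to something like the sketch below on a single Linux node; paths, names, and the password are placeholders, and a replica set additionally needs internal authentication (e.g. a keyFile) configured on every member:

    # Create an administrative user while access control is still disabled.
    mongo admin --eval 'db.createUser({user: "admin", pwd: "CHANGE_ME", roles: ["userAdminAnyDatabase"]})'

    # Then enable access control and stop binding to every interface by editing
    # /etc/mongod.conf (the default path on many Linux packages):
    #   security:
    #     authorization: enabled
    #   net:
    #     bindIp: 127.0.0.1,<private address of this node>

    # Restart mongod and check that commands now require credentials.
    sudo systemctl restart mongod
    mongo admin -u admin -p --eval 'db.adminCommand({listDatabases: 1})'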

Why is Heroku telling me "Your app has no databases" when it clearly does?

I know the subject of migrating Heroku databases has lots of documentation, but I have yet to find my answer, and nobody seems to be mentioning the error I'm getting.
I developed my app on the basic/free version of Heroku, where the app gets a name made of two random dictionary words and a number. I've got a Rails app running in this instance, populated with data. It's what I've used to demo to management.
My company now has paid space on Heroku, including Postgres. I've gotten my application deployed to this new space, including an empty Postgres database (I've run the migrations), and now I would like to move my data over from the free/shared space to the paid space.
I believe this is the page of directions I'm supposed to be following:
https://devcenter.heroku.com/articles/migrating-from-shared-database-to-heroku-postgres
But when I get to this step:
    heroku pgbackups:capture --expire -a [my_app]
I get the error in my question, "Your app has no databases." I've done the necessary steps: added the pgbackups add-on and so forth. If I execute this command against my newly created paid app (with the empty database), it works fine, but running it against my old free/shared-db app gives the error.
I get that it does not have a paid database, no. If I go to http://postgres.heroku.com it doesn't even show up. But I've got data living in a database somewhere in Heroku world, and I'd like to get at it. The documentation does lead me to believe that these are the instructions for getting off the 5 MB shared space, which is what I'm on.
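For context, the flow in that Dev Center article boils down to the legacy pgbackups commands below (app names are placeholders, and the syntax has changed in later Heroku CLI versions), which is what the capture step above was attempting on the old app:

    # On the old free app: take a backup of the shared database...
    heroku addons:add pgbackups --app old-free-app
    heroku pgbackups:capture --expire --app old-free-app

    # ...get a public URL for that backup...
    heroku pgbackups:url --app old-free-app

    # ...and restore it into the new app's Heroku Postgres database.
    heroku pgbackups:restore DATABASE 'PUBLIC_BACKUP_URL' --app new-paid-app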
I didn't take into account some corner cases in an update I wrote for the client. A later version fixed it, as I think you figured out. Sorry.