Is it possible to issue a durable delete in Aerospike with asinfo using 'truncate'?

I wanted to avoid using Aerospike clients (e.g. for Python) and delete records from a set using the native asinfo command 'truncate', because it is fast. But after I restarted Aerospike, all the deleted records were back. I saw this question, "aerospike: delete all record in a set", but it doesn't answer mine. Neither does this page from the AS docs. It says that a tombstone should be written after a durable delete. Do I have to create it manually, or are there other ways?
UPD:
Thanks to kporter, who provided the accepted answer below, I was able to look into the differences between the Community and Enterprise editions of Aerospike and found more information on the problem. Some may find it helpful as well:
Persisted Delete [Community Edition]
This answer and the whole discussion from AS forum
And this thread
If I understood all of it correctly, the best way to get your records deleted completely in CE is to ensure that they have the right TTL and can expire naturally. If for some reason you have a lot of old records without a TTL, as in my case, you can issue the truncate command via asinfo and not restart the AS server until the data on the SSD is eventually overwritten. Or just truncate the sets with old records after every restart.
I also wonder whether it is possible to wipe the AS storage completely and then restore it from a backup of the already truncated data as an emergency measure.
UPD1:
So, I was able to wipe SSD with Aerospike storage and restore only needed records from a backup. Here is how I did it:
First, remove the old records from the sets with asinfo and the truncate command (links to the docs are above)
Then back up the namespaces you want to keep with asbackup
Stop your AS server; mine was in a Docker container, so I just stopped that container
Zero out the disk that is used as AS storage; mine was /dev/sdb
Create the necessary partitions on this disk
Start the AS server
Restore the data from the backup using asrestore
Useful links: how to remove and clean up an aerospike server installation, AS docs on SSD setup
I am not sure whether this is a good solution for large production setups, but it worked as intended in my case, with only one AS node and the opportunity to stop it for a while.
This way I was able to reduce the size of the data in my AS from 160 GB to 11 GB, and because of that my server now fully restarts in only half an hour instead of approximately eight hours as before.
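For reference, the steps above can be written down as a rough command plan. This is only a sketch under assumptions, not the exact commands I ran: the function name, the namespace, set, backup directory and container name are all hypothetical, it handles one namespace per call, and the partitioning step is left as a manual placeholder because it depends on the disk layout. Nothing is executed; the function only returns the commands so they can be reviewed first.

```python
def build_wipe_and_restore_plan(namespace, old_sets, backup_dir, device, container):
    """Return (description, argv) pairs for the steps above, in order.

    Nothing is executed here. argv is None for the step that has to be
    done by hand because it depends on the local disk layout.
    """
    plan = []
    # 1. Truncate every unwanted set through the asinfo "truncate" info command.
    for set_name in old_sets:
        info_command = f"truncate:namespace={namespace};set={set_name}"
        plan.append((f"truncate set {set_name}", ["asinfo", "-v", info_command]))
    # 2. Back up what is left in the namespace.
    plan.append(("back up the namespace",
                 ["asbackup", "--namespace", namespace, "--directory", backup_dir]))
    # 3. Stop the server (assumes it runs in a Docker container, as mine did).
    plan.append(("stop the server", ["docker", "stop", container]))
    # 4. Zero out the storage device. This destroys everything on it.
    plan.append(("zero out the storage device",
                 ["dd", "if=/dev/zero", f"of={device}", "bs=1M"]))
    # 5. Partitioning is setup-specific, so it stays a manual step.
    plan.append(("create the partitions (manual, setup-specific)", None))
    # 6. Start the server again on the clean device.
    plan.append(("start the server", ["docker", "start", container]))
    # 7. Restore only the records that survived the truncate.
    plan.append(("restore from the backup", ["asrestore", "--directory", backup_dir]))
    return plan


if __name__ == "__main__":
    import shlex

    # All values below are placeholders for illustration.
    steps = build_wipe_and_restore_plan(
        "test", ["old_set"], "/backup/test", "/dev/sdb", "aerospike")
    for description, argv in steps:
        print(description + ":", shlex.join(argv) if argv else "(manual)")
```

Printing the plan before running anything is deliberate: the dd step is destructive, so it is worth double-checking the device name by eye.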

You can find more information about truncating a set here:
https://www.aerospike.com/docs/operations/manage/sets/
As mentioned there, truncation is not durable in Aerospike Community.
In the Enterprise Edition, truncation is durable and preserves record deletions through a cold restart. In the Community Edition, truncation, like record deletes, is not durable: records in previously truncated sets can return after a cold start.

Related

Cloud SQL disk size is much larger than actual database

Cloud SQL reports that I've used ~4TB of SSD storage, but my database is only ~225 GB. What explains this discrepancy? Is there something I can delete to free up space? If I moved it to a different instance, would the required storage go down?
There are a couple of possible reasons why your Cloud SQL storage usage has increased:
- Did you enable point-in-time recovery? PITR uses write-ahead logs, and if you enabled this feature, that could be the reason for the increase.
- Have you used temporary tables that you have not deleted?
If neither of the above applies to you, I highly recommend opening a case with the GCP support team so that they can take a look at your Cloud SQL instance.
You can also open a case to decrease the disk to a smaller size. That way you do not need to create a new instance and copy all the data to it, and because the shrinking is done on Google's end, it requires the least effort from you.
Google can perform this task during a scheduled maintenance window, which minimizes the impact of the downtime. For this you need to tell them the new disk size and when you would like the operation to be performed.
Finally, if you prefer the migration method, you should export the DB, create the new instance, import the DB and then synchronize the old instance with the new one so that both hold all the data. Those four steps can take several hours to complete.
You do not specify what kind of database. In my case, for a MySQL database, there were several hundred GB of binary logs (mysql flag).
You could check with:
SHOW BINARY LOGS;
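To turn that output into a single number, the File_size column can be summed. Below is a minimal sketch, assuming the rows have already been fetched by whatever client you use and arrive as tuples whose first two fields are Log_name and File_size (in bytes); the function names and the sample rows are made up for illustration.

```python
def total_binlog_bytes(rows):
    """Sum the File_size column of rows shaped like SHOW BINARY LOGS output.

    Each row is expected to start with (Log_name, File_size); any further
    columns are ignored.
    """
    return sum(int(row[1]) for row in rows)


def format_gib(num_bytes):
    """Render a byte count in GiB with one decimal place."""
    return f"{num_bytes / 1024 ** 3:.1f} GiB"


if __name__ == "__main__":
    # Made-up sample rows, not real output.
    sample_rows = [
        ("mysql-bin.000001", 1073741824),
        ("mysql-bin.000002", 536870912),
    ]
    print(format_gib(total_binlog_bytes(sample_rows)))
```

If the total is close to the gap between the reported storage and the database size, the binary logs are the likely culprit.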

Are there any negative performance or functionality downsides to using pg_upgrade with --link option afterwards?

I'm about to upgrade a quite large PostgreSQL cluster from 9.3 to 11.
The upgrade
The cluster is approximately 1.2 TB in size. The database server has a disk system consisting of a fast HW RAID 10 array of 8 DC-edition SSDs, with 192 GB of RAM and 64 cores. I am performing the upgrade by first replicating the data to a new server with streaming replication, then upgrading that one to 11.
I tested the upgrade using pg_upgrade with the --link option; this takes less than a minute. I also tested the regular upgrade (without --link) with many jobs; that takes several hours (4+).
Questions
Now the obvious choice is of course for me to use the --link option. However, all this makes me wonder: are there any downsides (performance- or functionality-wise) to using it over the regular, slower method? I do not know the internal workings of PostgreSQL data structures, but I have a feeling there could be a performance difference after the upgrade between rewriting the data entirely and just using hard links, whatever that means.
Considerations
The only thing I can find in the documentation about the drawbacks of --link is the downside of not being able to access the old data directory after the upgrade is performed (https://www.postgresql.org/docs/11/pgupgrade.htm). However, that is only a safety concern and not a performance drawback, and it doesn't really apply in my case of replicating the data first.
The only other thing I can think of is reclaiming space, with whatever performance upsides that might have. However, as I understand it, that can also be achieved by running a VACUUM FULL DATABASE (or CLUSTER?) command after the --link upgrade has completed. Also, as I understand it, reclaiming space does not have much performance impact on an SSD.
I would appreciate it if anyone could help shed some light on this.
There is absolutely no downside to using hard links (with the exception you noted, that the old cluster is dead and has to be removed).
A hard link is in no way different from a normal file.
A “file” in UNIX is in reality an “inode”, a structure containing file metadata. An entry in a directory is a (hard) link to that inode.
If you create another hard link to the inode, the same file will be in two different directories, but that has no impact whatsoever on the behavior of the file.
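This can be checked with a small experiment. The sketch below is only an illustration with made-up directory and file names (it does not touch a real PostgreSQL data directory): it creates a file in one directory, hard-links it into another, and shows that both names point to the same inode, that a write through one name is visible through the other, and that removing the old name leaves the data intact.

```python
import os
import tempfile


def hard_link_demo():
    """Link one file into two directories and report what the OS sees."""
    with tempfile.TemporaryDirectory() as root:
        old_dir = os.path.join(root, "old_cluster")
        new_dir = os.path.join(root, "new_cluster")
        os.mkdir(old_dir)
        os.mkdir(new_dir)

        old_path = os.path.join(old_dir, "relation_file")
        new_path = os.path.join(new_dir, "relation_file")
        with open(old_path, "w") as handle:
            handle.write("table data")

        # Create a second directory entry for the same inode.
        os.link(old_path, new_path)
        same_inode = os.stat(old_path).st_ino == os.stat(new_path).st_ino
        link_count = os.stat(new_path).st_nlink

        # A write through the new name changes the one shared file.
        with open(new_path, "a") as handle:
            handle.write(" + new rows")
        with open(old_path) as handle:
            seen_via_old = handle.read()

        # Removing the old name only drops one directory entry.
        os.remove(old_path)
        with open(new_path) as handle:
            survives = handle.read()

        return same_inode, link_count, seen_via_old, survives


if __name__ == "__main__":
    print(hard_link_demo())
```

The third value is the important one for pg_upgrade: because both names refer to the same data, anything written through one directory is immediately visible through the other.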
Of course you must make sure that you don't start both the old and the new server. Instant data corruption would ensue. That's why you should remove the old cluster as soon as possible.

Sitecore 8.1 update 2 MongoDB backup

I am using a replica set (2 mongo, 1 arbiter) for my Sitecore CD servers.
Assuming all MongoDB data gets flushed to the Reporting SQL DB, do we need to take backups of the MongoDB database on the production CD servers?
If yes, what is the best approach and frequency, considering that my application makes moderate use of the analytics features (Personalization, Campaigns etc.)?
Unfortunately, your assumption is bad - the MongoDB is the definitive source of analytic data, not the reporting db. The reporting db contains only the aggregate info needed for generating the report (mostly). In fact, if (when) something goes wrong with the SQL DB, the idea is that it is rebuilt from the source MongoDB. Remember: You can't un-add two numbers after you've added them!
Backup vs Replication
A backup is a point-in-time view of the database, where replication is multiple active copies of a current database. I would advocate for replication over backup for this type of data. Why? Glad you asked!
Currency - under what circumstance would you want to restore a 50GB MongoDB? What if it was a week old? What if it was a month? Really the only useful data is current data, and websites are volatile places - log data backups are out of date within an hour. If you personalise on stale data is that providing a good user experience?
Cost - backing up large datasets is costly in terms of time, storage capacity and compute requirements; they are also a pain to restore and the bigger they are the more likely there's a corruption somewhere
Run of business
In a production MongoDB environment you really should have 2-3 replicas. That's going to save your arse if one of the boxes dies, which they sometimes do - MongoDB works the disks very hard.
These replicas are self-healing, and always current (pretty-much) so they are much better than taking backups. The chance that you lose all your replicas at once is really low except for one particular edge case... upgrades. So a backup is really only protection against hardware failure or data corruption which, in a multi-instance replica set, is already very effectively handled. Unless you're paranoid, you're never going to use that backup and it'll cost you plenty to have it.
Sitecore Upgrades
This is the killer edge-case - always make backups (see Back Up and Restore with MongoDB Tools) before running an upgrade because you can corrupt all of your replicas in one motion and you'll want to be able to roll back.
Data Trimming (side-note)
You didn't ask this, but at some point you'll be thinking "how the heck can I back up this 170GB monster db every day? this is ridiculous" - and you'll be right.
There are various schools of thought around how long this data should be persisted for - that's a question only you or your client can answer. I suggest keeping it until there's too much, then make a decision on how much you have to get rid of. Keep as much as you can tolerate.

Safe way to backup PostgreSQL when using Persistent Disk

I’m trying to set up daily backups (using Persistent Disk snapshots) for a PostgreSQL instance I’m running on Google Compute Engine and whose data directory lives on a Persistent Disk.
Now, according to the Persistent Disk Backups blog post, I should:
stop my application (PostgreSQL)
fsfreeze my file system to prevent further modifications and flush pending blocks to disk
take a Persistent Disk snapshot
unfreeze my filesystem
start my application (PostgreSQL)
This obviously brings with it some downtime (each of the steps took from seconds to minutes in my tests) that I’d like to avoid or at least minimize.
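For reference, the five steps can be sketched as a small script. This is a rough illustration under assumptions rather than the blog post's own code: the function name, disk, zone, snapshot name and mount point are hypothetical, and the sketch assumes PostgreSQL runs as a systemd service called postgresql. The try/finally blocks make sure the filesystem is unfrozen and PostgreSQL is started again even if the snapshot fails. The run parameter exists only so the sequence can be tried out without executing anything.

```python
import subprocess


def snapshot_with_freeze(disk, zone, snapshot_name, mount_point,
                         service="postgresql", run=None):
    """Stop the service, freeze, snapshot, unfreeze, start, in that order."""
    if run is None:
        def run(argv):
            subprocess.run(argv, check=True)

    run(["systemctl", "stop", service])
    try:
        run(["fsfreeze", "--freeze", mount_point])
        try:
            run(["gcloud", "compute", "disks", "snapshot", disk,
                 f"--zone={zone}", f"--snapshot-names={snapshot_name}"])
        finally:
            # Always unfreeze, otherwise writes to the filesystem stay blocked.
            run(["fsfreeze", "--unfreeze", mount_point])
    finally:
        # Always bring the database back, even after a failed snapshot.
        run(["systemctl", "start", service])


if __name__ == "__main__":
    # Dry run with placeholder values: print the commands instead of running them.
    snapshot_with_freeze("pg-data", "us-central1-a", "pg-data-daily",
                         "/mnt/pgdata", run=lambda argv: print(" ".join(argv)))
```

The downtime is the time between the first and the last command, which is what I would like to shorten or avoid.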
The steps of the blog post are labeled as necessary to ensure the snapshot is consistent (I’m assuming on the filesystem level), but I’m not interested in a clean filesystem, I’m interested in being able to restore all the data that’s in my PostgreSQL instance from such a snapshot.
PostgreSQL uses fsync when committing, so all data which PostgreSQL acknowledges as committed has made its way to the disk already (fsync goes to the disk).
For the purpose of this discussion, I think it makes sense to compare a Persistent Disk snapshot without stopping PostgreSQL and without using fsfreeze with a filesystem on a disk that has just experienced an unexpected power outage.
After reading https://wiki.postgresql.org/wiki/Corruption and http://www.postgresql.org/docs/current/static/wal-reliability.html, my understanding is that all committed data should survive an unexpected power outage.
My questions are:
1) Is my comparison with an unexpected power outage accurate or am I missing anything?
2) Can I take snapshots without stopping PostgreSQL and without using fsfreeze or am I missing some side-effect?
3) If the answer to the above is that I shouldn’t just take a snapshot, would it be idiomatic to create another Persistent Disk, periodically use pg_dumpall(1) to dump the entire database and then snapshot that other Persistent Disk?
1) Yes, though it should be even safer to take a snapshot. The fsfreeze stuff is really to be 100% safe (anecdotally: I never use fsfreeze on my PDs and have not run into issues)
2) Yes, but there is no 100% guarantee that it will always work (paranoid solution: take a snapshot, spin up a temp VM with that snapshot, check the disk is ok, and delete the VM. This can be automated)
3) No, I would not recommend this over snapshots. It will take a lot more time, might degrade your DB performance, and what happens if something happens in the middle of a dump? Also, PDs are very expensive for incremental backups. Snapshots are diffed, so you don't have to pay for the whole disk every copy (just the first one), only the changes.
Possible recommendation:
Do #3, but then create a snapshot of the new PD and then delete the PD.
https://cloud.google.com/compute/docs/disks/persistent-disks#creating_snapshots has recently been updated and now includes this new paragraph:
If you skip this step, only data which was successfully flushed to disk by the application will be included in the snapshot. The application experiences this scenario as if it was a sudden power outage.
So the answers to my original questions are:
1) Yes
2) Yes
3) N/A, since the answer to ② is Yes.

postgresql: Accidentally deleted pg_filenode.map

Is there any way to recover or re-create a pg_filenode.map file that was accidentally deleted? Or is there any solution for fixing this issue without affecting the database? Any suggestions to fix this issue are highly appreciated! The Postgres version that we have is 9.0, running on Red Hat Linux 5. Thanks!
STOP TRYING TO FIX ANYTHING RIGHT NOW. Everything you do risks making it worse.
Treat this as critical database corruption. Read and act on this wiki article.
Only once you have followed its advice should you even consider attempting repair or recovery.
Since you may have some hope of recovering the deleted file if it hasn't been overwritten yet, you should also STOP THE ENTIRE SERVER MACHINE or unmount the file system PostgreSQL is on and disk image it.
If this data is important to you I advise you to contact professional support. This will cost you, but is probably your best chance of getting your data back after a severe administrator mistake like this. See PostgreSQL professional support. (Disclaimer: I work for one of the listed companies as shown in my SO profile).
It's possible you could reconstruct pg_filenode.map by hand using information about the table structure and contents extracted from the on-disk tables. Probably a big job, though.
First, if this is urgent and valuable, I strongly recommend contacting professional support before anything else. However, if you can work on a disk image and it is not time critical, here are important points to note and how to proceed (we recently had to recover a bad pg_filenode.map). Moreover, you are better off working on a disk image of a disk image.
What follows is what I learned from having to recover a damaged file due to an incomplete write on the containing directory. It is current as of PostgreSQL 10, but that could change at any time.
Before you begin
Data recovery is risky business. Always note what recovery means to your organization, what data loss is tolerable, what downtime is tolerable etc before you begin. Work on a copy of a copy if you can. If anything doesn't seem right, circle back, evaluate what went wrong and make sure you understand why before proceeding.
What this file is and what it does
The standard file node map for PostgreSQL is stored in the pg_class relation, which is referenced by object ID inside the Pg catalogs. Unfortunately, you need a way to bootstrap the mappings of the system tables so you can look up this sort of information. This file provides that bootstrap mapping.
In most deployments this file will never be written. It can be copied from a new initdb on the same version of Postgres with the same options passed to initdb aside from data directory. However this is not guaranteed.
Several things can change this mapping. If you do a vacuum full or similar on system catalogs, this can change the mapping from the default and then copying in a fresh file from an initdb will not help.
Some Things to Try
The first thing to try (on a copy of a copy!) is to replace the file with one from a fresh initdb onto another filesystem from the same server (this could be a thumb drive or whatever). This may work. It may not work.
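Before copying a candidate file into place, it can be worth checking that it at least looks like a relation map file. The sketch below is an assumption-laden illustration, not a recovery tool: it relies on my understanding of the layout in the PostgreSQL source (relmapper.c) for versions of this era, namely a 512-byte file in the server's native byte order holding a magic number 0x592717, a mapping count of at most 62, the (OID, filenode) pairs, and a trailing CRC. Verify those constants against the source for your exact version. The CRC is not verified here, and the function name is made up.

```python
import struct

# Assumed constants, taken from my reading of relmapper.c; verify per version.
RELMAP_MAGIC = 0x592717
RELMAP_SIZE = 512
MAX_MAPPINGS = 62


def read_relmap(data):
    """Return the (oid, filenode) pairs from the bytes of a pg_filenode.map.

    Raises ValueError if the size, magic number or mapping count look wrong.
    The trailing CRC is NOT checked, so passing this test is necessary but
    not sufficient for the file to be usable.
    """
    if len(data) != RELMAP_SIZE:
        raise ValueError(f"expected {RELMAP_SIZE} bytes, got {len(data)}")
    # "=" means native byte order with standard sizes, so run this on the
    # same architecture as the server that wrote the file.
    magic, count = struct.unpack_from("=ii", data, 0)
    if magic != RELMAP_MAGIC:
        raise ValueError(f"unexpected magic number {magic:#x}")
    if not 0 <= count <= MAX_MAPPINGS:
        raise ValueError(f"implausible mapping count {count}")
    return [struct.unpack_from("=II", data, 8 + 8 * i) for i in range(count)]


if __name__ == "__main__":
    import sys

    # Usage: python check_relmap.py /path/to/copy/of/pg_filenode.map
    with open(sys.argv[1], "rb") as handle:
        for oid, filenode in read_relmap(handle.read()):
            print(oid, filenode)
```

Each filenode printed should correspond to a file that actually exists in the matching database directory of the copy; if several do not, the fresh-initdb file does not match this cluster.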
If that fails, then it would be possible perhaps to use pg_filedump and custom scripting/C programming to create a new file based on efforts to look through the data of each relation file in the data directory. This would be significant work as Craig notes above.
If you get it to work
Take a fresh pg_dump of your database and restore it into a fresh initdb. This way you know everything is consistent and complete.