Debezium Connector tries to open old log file - apache-kafka

I have a debezium connector that works fine, for a limited time. These errors occur in log file:
Caused by: java.sql.SQLException: ORA-00308: cannot open archived log '+RECO/XXXXXXXX/ARCHIVELOG/2022_01_04/thread_1_seq_53874.3204.1093111215'
ORA-17503: ksfdopn:2 Failed to open file +RECO/XXXXXXXX/ARCHIVELOG/2022_01_04/thread_1_seq_53874.3204.1093111215
ORA-15012: ASM file '+RECO/XXXXXX/ARCHIVELOG/2022_01_04/thread_1_seq_53874.3204.1093111215' does not exist
I've learnt in this database log files are deleted daily. Is my connector trying to read an old log file, which does not exist anymore? How can I tell my connector to check only last 12 hours, for example. Or should I do something in database side?

I've learnt in this database log files are deleted daily. Is my connector trying to read an old log file, which does not exist anymore?
It is fine to delete archive logs that are no longer needed, but it's critical that you make sure that you are not deleting logs that the Oracle Connector still requires in order to perform mining. In your particular case, the connector still required thread_1_seq_53874.3204.1093111215 but the log is no longer on the file system and therefore the connector will stop with an error. This error happens with any other connector such as MySQL if you remove the binlogs before the connector is done reading them.
How can I tell my connector to check only last 12 hours, for example.
You cannot.
The way the Debezium connectors are designed is that they're meant to read all changes from the logs in chronological order to guarantee that there is no change data event loss. If a log were to be deleted that was needed and we did not throw an error, then you would have gaps where changes from the source database would not be represented as change events and so your consumers wouldn't be kept in sync.
Or should I do something in database side
Archive logs need to be retained for as long as they're needed by the connector. The latency of the Oracle connector is dependent both on the volatility of your database but also on a number of factors such as the performance of the database server hardware (disk and cpu), the size of your redo logs, etc.
Some environments may not be able to keep archive logs available in the default destination location for extended periods of time due to space constraints. This is why we introduced a way that you can set up Oracle to write archive logs to a secondary destination location that is capable of retaining the logs for a longer period of time, often via a network mount, and then you can explicitly tell the connector use that archive destination name rather than the first valid/default location of the system.

Related

About delete binary logs file on Cloud Sql

I have a question about binary log on Google Cloud Sql.
Now that storage on Cloud SQL is constantly increasing, I want to delete the binary logs files. I have read the documentation about it, but it is not clear that when I disable the binary logs function, will the files be deleted immediately or have to wait for the next 7 days for the files to be deleted. Thank you.
https://cloud.google.com/sql/docs/mysql/backup-recovery/pitr#disk-usage
According to the official documentation :
Disk usage and point-in-time recovery
The binary logs are automatically deleted with their associated
automatic backup, which generally happens after about 7 days.
Diagnosing issues with Cloud SQL instances
Binary logs cannot be manually deleted.
Binary logs are automatically deleted with their associated automatic
backup, which generally happens after about seven days.
Therefore you have to wait for about 7 days for the Binary logs and their associated automatic backup to be deleted.

what is in zookeeper datadir and how to cleanup?

I found my zookeeper dataDir is huge. I would like to understand
What is in the dataDir?
How to cleanup? Does it automatically cleanup after certain period?
Thanks
According to Zookeeper's administrator guide:
The ZooKeeper Data Directory contains files which are a persistent copy of the znodes stored by a particular serving ensemble. These are the snapshot and transactional log files. As changes are made to the znodes these changes are appended to a transaction log, occasionally, when a log grows large, a snapshot of the current state of all znodes will be written to the filesystem. This snapshot supercedes all previous logs.
So in short, for your first question, you can assume that dataDir is used to store Zookeeper's state.
As for your second question, there is no automatic cleanup. From the doc:
A ZooKeeper server will not remove old snapshots and log files, this is the responsibility of the operator. Every serving environment is different and therefore the requirements of managing these files may differ from install to install (backup for example).
The PurgeTxnLog utility implements a simple retention policy that administrators can use. The API docs contains details on calling conventions (arguments, etc...).
In the following example the last count snapshots and their corresponding logs are retained and the others are deleted. The value of should typically be greater than 3 (although not required, this provides 3 backups in the unlikely event a recent log has become corrupted). This can be run as a cron job on the ZooKeeper server machines to clean up the logs daily.
java -cp zookeeper.jar:log4j.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>
If this is a dev instance, I guess you could just almost completely purge the folder (except some files like myid if its there). But for a production instance you should follow the cleanup procedure shown above.

How to configure WAL archiving for a cluster that *only* hosts dev or test databases?

I've got a dev and test database for a project, i.e. databases that I use to either run my project or run tests, locally. They're both in the same cluster ('instance' – I come from Redmond).
Note that my local cluster is different than the cluster that hosts the production database.
How should I configure those databases with respect to archiving the WAL files?
I'd like to be able to 'build' or 'rebuild' either of those databases by restoring from a base backup and running seed data scripts.
But how should I configure the databases or the cluster for archiving WAL files? I understand that I need them if I want to recover the database. I think that's unlikely (as I didn't even know about 'WAL' or their files, or that, presumably they're shared by all of the databases in the same cluster, which seems weird and scary coming from Microsoft SQL Server.)
In the event that I rebuild one of the databases, I should delete the WAL files since the base backup – how can I do that?
But I also don't want to have to worry about the size of the WAL files growing indefinitely. I don't want to be forced to rebuild just to save space. What can I do to prevent this?
My local cluster only contains a single dev and test database for my project, i.e. losing data from one of these databases is (or should be) no big deal. Even having to recreate the cluster itself, and the two databases, is fine and not an issue if it's even just easier than otherwise to restore the two databases to a 'working' condition for local development and testing.
In other words, I don't care about the data in either database. I will ensure – separate from WAL archiving – that I can restore either database to a state sufficient for my needs.
Also, I'd like to document (e.g. in code) how to configure my local cluster and the two databases so that other developers for the same project can use the same setup for their local clusters. These clusters are all distinct from the cluster that hosts the production database.
Rather than trying to manage your WAL files manually, it's generally recommended that you let a third-party app take care of that for you. There are several options, but pg_backrest is the most popular of the open-source offerings out there.
Each database instance writes its WAL stream, chopped in segments of 16MB.
Every other relational database does the same thing, even Microsoft SQL Server (the differences are in the name and organization of these files).
The WAL contains the physical information required to replay transactions. Imagine it as information like: "in file x, block 2734, change 24 bytes at offset 543 as follows: ..."
With a base backup and this information you can restore any given point in time in the life of the database since the end of the base backup.
Each PostgreSQL cluster writes its own "WAL stream". The files are named with long weird hexadecimal numbers that never repeat, so there is no danger that a later WAL segment of a cluster can conflict with an earlier WAL segment of the same cluster.
You have to make sure that WAL is archived to a different machine, otherwise the exercise is pretty useless. If you have several clusters on the same machine, make sure that you archive them to different directories (or locations in general), because the names of the WAL segments of different clusters will collide.
About retention: You want to keep around your backups for some time. Once you get rid of a base backup, you can also get rid of all WAL segments from before that base backup. There is the pg_archivecleanup executable that can help you get rid of all archived WAL segments older than a given base backup.
I'd like to be able to 'build' or 'rebuild' either of those databases by restoring from a base backup and running seed data scripts.
Where is the basebackup coming from? If you are restoring the PROD base backup and running the seed scripts over it, then you don't need WAL archiving at all on test/dev. But then what you get will be a clone of PROD, which means it will not have different databases for test and for dev in the same instance, since (presumably) PROD doesn't have that.
If the base backup is coming from someplace else, you will have to describe what it is. That will dictate your WAL needs.
Trying to run one instance with both test and dev on it seems like a false economy to me. Just run two instances.
Setting archive_mode=off will entirely disable a wal archive. There will still be "live" WAL files in the pg_wal or pg_xlog directory, but these get removed/recycled automatically after each checkpoint--you should not need to manage these, other than by controlling how often checkpoints take place (and making sure you don't have any replication slots hanging around). The WAL archive and the live WAL files are different things. The live WAL files are mandatory and are needed to automatically recover from something like a power failure. The WAL archive may be needed to manually recover from a hard-drive crash or the total destruction of your server, and probably isn't needed at all on dev/test.

initial sync not proceed on ops manager v2.0

I try to get backup on ops manager.
but, initial sync is not proceed.
The following content and check the backup-agent.log has is displayed .
[backup-agent/components/mothership.go:321] Total Slice #1031 - server syncStore is full. 0th attempt. Will resend this slice again soon.
I checked ulimit,head database space,back store space.but that is not full.
what is full?
Since the question is a little ambiguous, Iam making couple of assumption.
Assumptions:
- You are setting up the Ops Manager for an existing replica set.
- The Agents are configured correctly and you have at least one Backup agent running on the replica set you are trying to backup.
Coming back to your question, syncStore is the OpLog Store which syncs your Initial Sync data using Backup Daemon. The issue here seems to be, backup job has not been bound to a headDB. you can do so by selecting Backup - Jobs - Job - under "Machine" label select set binding and select appropriate headDB.
Hope the above solution solves the issue.
[backup-agent/components/mothership.go:321] Total Slice #1031 - server syncStore is full. 0th attempt. Will resend this slice again soon.
Please note that server syncStore is full message indicates the backup daemon is not processing ( applying to headDB) the incoming oplog/sync slices fast enough.
Hence, In order to know the rate at which backup daemon is applying the slices to the headDB and track the status of the ongoing backup initial-sync process.
So you need to analyze OpsManagerlog files, there are-
Diagnostic archive log (Admin-->General-->Overview-->System Overview-->Diagnostic Archive)
Backup daemon logs and mms0 logs [available in /opt/mongodb/mms/logs]
Backup agent log
Thanks,
Karthick
I once encountered the same problem because I did not place Mongo binaries in the mongodb releases directory during configuration. Because of this, the Deamon program cannot start the head database, resulting in the failure to take care of the transferred oplog, thus printing an error server syncStore is full.But that's not the reason.

Backup Zookeeper data on active server

I am trying to configure production-ready Zookeeper data backup.
As I learned from different sources, Zookeeper snapshot file is not enough to guarantee a return to a previous state. In fact, the snapshot file may not even represent the state of the tree at any point in time (see corresponding stackoverflow ticket answer).
So to make the consistent zk data storage backup (to store it on a cloud or elsewhere), I need to copy snapshots with transaction logs.
The question is: how can I copy transaction log files while zookeeper is active and makes hundreds of transactions a second? Won't the files be corrupted?
What other practices can be used in this case?