Orientdb delete empty cluster automatically - orientdb

Is there any way in OrientDB (distributed) to delete empty clusters automatically, so that the number of files stays small and the database stays cleaner?

According to the documentation, there seems to be nothing automatic that covers this. One way to approach it is to write a Java function that checks which clusters are empty and deletes them. That function could then be run by hand from time to time, or scheduled with some scheduler.
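As a rough sketch of that idea (written in Python here rather than Java, purely for brevity), the loop below counts the records in each cluster and drops the empty ones. `run_sql` is a hypothetical callable standing in for whatever client executes OrientDB SQL and returns rows as dictionaries; the cluster names and the `protected` set are supplied by the caller. Whether a given empty cluster is still attached to a class and safe to drop is not checked here, so treat this as a starting point only.

```python
from typing import Any, Callable, Dict, Iterable, List


def drop_empty_clusters(
    run_sql: Callable[[str], List[Dict[str, Any]]],
    cluster_names: Iterable[str],
    protected: Iterable[str] = (),
) -> List[str]:
    """Drop every cluster that holds zero records and return the dropped names.

    run_sql is a hypothetical helper: it executes one OrientDB SQL statement
    and returns the result rows as a list of dicts.
    """
    keep = set(protected)
    dropped = []
    for name in cluster_names:
        if name in keep:
            continue  # never touch clusters the caller wants to keep
        rows = run_sql(f"SELECT count(*) AS total FROM cluster:{name}")
        if rows and rows[0]["total"] == 0:
            run_sql(f"DROP CLUSTER {name}")
            dropped.append(name)
    return dropped
```

The function can be called from a cron job or any other scheduler; passing the default clusters of your classes in `protected` is one way to avoid dropping clusters that are only temporarily empty.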

Related

How to write integration tests depending on Druid?

I am coding an application that generates reports from a Druid database. My integration tests need to read data from that database.
My current approach involves creating synthetic data for each of the tests. However, I am unable to remove the data I create from the database (whether by removing entries or by dropping the schema completely). I tried this, but I still get data back after disabling the segment and firing the kill task.
I think that either I am completely wrong with my approach or there is a way to delete information from the database that I haven't been able to find.
You can do this with either of the two approaches below.
Approach 1:
Disable the segment (used=0)
Fire a kill task for that segment
Have the load and drop rules in place
Refer: http://druid.io/docs/latest/ingestion/tasks.html (look for destroying segments)
Approach 2 (prefer this for integration tests, before setting up production):
Stop the coordinator node and delete all entries in the druid_segments table in the metadata store
Stop the historical node and delete everything inside the directory pointed to by druid.segmentCache.locations on the historical node
Start the coordinator and historical nodes
Remember that this will delete everything from the Druid cluster.
In the end I worked around the issue by inserting data in Druid with ids specific to each unit test and querying for that.
Not very elegant since now one malicious test can (potentially) mess with the results of another test.
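A minimal sketch of that workaround, assuming each ingested row carries an extra dimension that identifies the test that created it. The field name `test_run_id` and the helper names are hypothetical, and a plain list of dictionaries stands in for Druid; a real test would ingest the tagged rows and add a filter on that dimension to its query.

```python
import uuid
from typing import Any, Dict, List

Row = Dict[str, Any]


def new_test_id() -> str:
    """Return an id that is unique per test, so tests do not collide by accident."""
    return f"it-{uuid.uuid4().hex}"


def tag_rows(rows: List[Row], test_id: str) -> List[Row]:
    """Copy each synthetic row and stamp it with the id of the owning test."""
    return [{**row, "test_run_id": test_id} for row in rows]


def rows_for_test(all_rows: List[Row], test_id: str) -> List[Row]:
    """Keep only the rows that belong to the given test."""
    return [row for row in all_rows if row.get("test_run_id") == test_id]
```

Generating the id randomly rather than hard-coding it per test reduces the chance that leftover data from an earlier run, or from another test, leaks into the results.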

AWS RDS instance created from snapshot very slow

I have restored a snapshot of a PostgreSQL instance as a new instance with exactly the same configuration as the original instance. However, running queries takes much longer on the new instance. A query that takes less than 0.5 ms to execute on the original instance, takes over 1.2 ms on the new one. A nightly Python script that runs in 20 minutes on the old instance is now taking over an hour with the new one. This has been going on for several days now.
I run VACUUM(ANALYZE, DISABLE_PAGE_SKIPPING); after our nightly snapshot restores for our Staging DB to get everything running smoothly again.
Unfortunately this is normal but it should go away after a while.
Snapshots are stored on S3, and when you create a new EBS volume from one, the volume only pulls in blocks of data as they are requested, which degrades performance until the whole volume is initialized. See these AWS docs for confirmation.
Those docs suggest using dd to force all the data to load, but on RDS you have no way to do that. You might want to try SELECTing everything you can instead, although that will still miss some things (like indexes).

Thread limited number of SQL database restores

I'm trying to find an example or a starting point for a project in which I have to restore databases into a test environment. I have a list of 40+ SQL instances, databases, and backup locations, and I'd like to use the Restore-SqlDatabase cmdlet but allow only 3 restores to run at a time. To minimize the impact on our network/storage, I don't want to initiate all 40+ restores at once. The list of what needs to be restored is contained in a CSV; in testing I can get the restores to run, but I'm not sure what options I have to run only 3 at a time.
I used the RunspaceFactory example and modified it to use a script-block to execute Restore-SqlDatabase. I'm sure there may be cleaner or simpler ways of doing this but so far it seems to work.
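The poster's PowerShell script is not shown, so the following is only an illustration of the same throttling idea in Python: a fixed-size worker pool that never runs more than three jobs at once. `restore_one` is a hypothetical placeholder for whatever actually performs a single restore, and the CSV column names (`Instance`, `Database`, `BackupPath`) are assumptions.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Job = Dict[str, str]


def read_jobs(csv_text: str) -> List[Job]:
    """Parse the restore list; assumed columns: Instance, Database, BackupPath."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def run_throttled(
    jobs: List[Job],
    restore_one: Callable[[Job], str],
    max_parallel: int = 3,
) -> List[str]:
    """Run restore_one for every job, with at most max_parallel jobs in flight.

    The pool size is the throttle: a new job starts only when a worker
    finishes its current one. Results come back in the order of `jobs`.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(restore_one, jobs))
```

A runspace pool created with a maximum of 3 runspaces plays the same role in PowerShell: all 40+ jobs are queued up front, and the pool size decides how many run concurrently.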

Postgresql replication without DELETE statement

We have a requirement that says we should keep a copy of every item that was in our system at some point. The simplest way to explain it would be replication that ignores DELETE statements (INSERT and UPDATE are OK).
Is this possible? Or maybe the better question is: what is the best approach to tackle this kind of problem?
Make a copy/replica of the current database and use triggers, via dblink, from the current database to the replica. Use AFTER INSERT and AFTER UPDATE triggers to insert and update the data in the replica.
That way, whenever a row is inserted or updated in the current database, the change is reflected directly in the replica.
I'm not sure that I understand the question completely, but I'll try to help:
First (in contrast to @Sunit), I suggest avoiding triggers. Triggers introduce additional overhead and hurt performance.
The solution I would use (and am actually using in a few of my projects with similar demands) is not to use DELETE at all. Instead, add a bit (boolean) column called "Deleted", set its default value to 0 (false), and instead of deleting a row, update this field to 1 (true). You'll also need to change your other queries (SELECT) to include something like "WHERE Deleted = 0".
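A small sketch of this soft-delete pattern, using an in-memory SQLite database as a stand-in for PostgreSQL so that it runs without a server. The table name `items` and the helper names are made up for illustration; in PostgreSQL the column would more naturally be a `boolean` with `DEFAULT false`.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a demo table with a 'deleted' flag that defaults to 0 (false)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " deleted INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def soft_delete(conn: sqlite3.Connection, item_id: int) -> None:
    """Mark the row as deleted instead of removing it."""
    conn.execute("UPDATE items SET deleted = 1 WHERE id = ?", (item_id,))
    conn.commit()


def active_items(conn: sqlite3.Connection) -> list:
    """Every normal read has to filter out the soft-deleted rows."""
    cur = conn.execute("SELECT name FROM items WHERE deleted = 0 ORDER BY id")
    return [row[0] for row in cur]
```

After `soft_delete(conn, 2)`, the row with id 2 no longer appears in `active_items(conn)` but is still present in the table, which is what the requirement asks for.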
Another option is to continue using DELETE as usual, and to allow records to be deleted from both primary and replica, but to configure WAL archiving and store the WAL archives in some shared directory. This gives you point-in-time recovery, meaning that you'll be able to restore another PostgreSQL instance to the state of your cluster at any moment in time (i.e. before the deletion). This way you'll have a trace of deleted records, but a pretty complicated procedure for reaching them. Depending on how often deleted records will be checked in the future (maybe they are not checked at all, but simply kept for just-in-case tracking), this approach may also help.

Quartz JDBC Job Store - Maintenance/Cleanup

I am currently in the process of setting up Quartz in a load-balanced environment using the JDBC job store, and I am wondering how everyone manages the Quartz job store DB.
For me Quartz (2.2.0) will be deployed as a part of a versioned application with multiple versions potentially existing on the one server at the one time. I am using the notation XXScheduler_v1 to ensure multiple schedulers play nice together. My code is working fine, with the quartz tables being populated with the triggers/jobs/etc as appropriate.
One thing I have noticed though is that there seems to be no database cleanup that occurs when the application is undeployed. What I mean is that the Job/Scheduler data seems to stay in the quartz database even though there is no longer a scheduler active.
This is less than ideal and I can imagine with my model the database would get larger than it needed to be with time. Am I missing how to hook-up some clean-up processes? Or does quartz expect us to do the db cleanup manually?
Cheers!
I ran into this issue once, and here is what I did to fix it. It should work, but in case it does not, you will have a backup of the tables, so you have nothing to lose by trying it.
Take sql dump of following tables using method mentioned at : Taking backup of single table
a) QRTZ_CRON_TRIGGERS
b) QRTZ_SIMPLE_TRIGGERS
c) QRTZ_TRIGGERS
d) QRTZ_JOB_DETAILS
Delete data from above tables in sequence as
delete from QRTZ_CRON_TRIGGERS;
delete from QRTZ_SIMPLE_TRIGGERS;
delete from QRTZ_TRIGGERS;
delete from QRTZ_JOB_DETAILS;
Restart your app which will then freshly insert all deleted tasks and related entries in above tables (Provided your app has its logic right).
This is more like starting your app with all the tasks being scheduled for the first time. So you must keep in mind that tasks will behave as if these are freshly inserted.
NOTE: If this does not work then apply the backup you took for tables and try to debug more closely. As of now, I have not seen this method fail.
It's definitely not doing any DB cleanup when undeploying the application or shutting down the scheduler. You would have to build some cleanup code that runs during application shutdown (i.e. some sort of StartupServlet or context listener that does the cleanup in the destroy() lifecycle event).
You're not missing anything.
However, these Quartz tables aren't different from any other application DB objects you use in your data model. You add an Employees table, and in a later version you don't need it anymore. Who's responsible for deleting the old table? Only you. If you have a DBA, you might hand it over to the DBA ;).
This kind of maintenance would typically be done using an uninstall script / wizard, upgrade script / wizard, or during the first startup of the application in its new version.
On a side note, different applications typically use different databases, or at least different schemas, which reduces inter-dependencies.
To clean Quartz Scheduler internal data one needs more SQL:
delete from QRTZ_CRON_TRIGGERS;
delete from QRTZ_SIMPLE_TRIGGERS;
delete from QRTZ_TRIGGERS;
delete from QRTZ_JOB_DETAILS;
delete from QRTZ_FIRED_TRIGGERS;
delete from QRTZ_LOCKS;
delete from QRTZ_SCHEDULER_STATE;
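The statements above wipe the data of every scheduler that shares these tables. Since the question runs several schedulers (XXScheduler_v1 and so on) side by side, a narrower variant may be more useful. The sketch below assumes the Quartz 2.x schema, where these tables carry a SCHED_NAME column, and removes only the rows of one scheduler, in the same child-before-parent order, inside a single transaction. It is written against a Python DB-API connection with `?` placeholders (as in sqlite3); other drivers may use a different placeholder style, and `purge_scheduler` is a made-up name.

```python
import sqlite3
from typing import Dict

# Same order as above: trigger subtype tables first, then the tables they
# reference, so foreign keys are not violated.
QUARTZ_TABLES = [
    "QRTZ_CRON_TRIGGERS",
    "QRTZ_SIMPLE_TRIGGERS",
    "QRTZ_TRIGGERS",
    "QRTZ_JOB_DETAILS",
    "QRTZ_FIRED_TRIGGERS",
    "QRTZ_LOCKS",
    "QRTZ_SCHEDULER_STATE",
]


def purge_scheduler(conn: sqlite3.Connection, sched_name: str) -> Dict[str, int]:
    """Delete the rows of one scheduler and return the row count per table.

    Assumes every listed table has a SCHED_NAME column (Quartz 2.x schema).
    Either all deletes are committed together or none are.
    """
    deleted = {}
    cur = conn.cursor()
    try:
        for table in QUARTZ_TABLES:
            cur.execute(f"DELETE FROM {table} WHERE SCHED_NAME = ?", (sched_name,))
            deleted[table] = cur.rowcount
        conn.commit()
    except Exception:
        conn.rollback()  # leave the job store untouched on any failure
        raise
    return deleted
```

Something like this could be run from an uninstall or upgrade script once the old application version is gone; it should not be run while a scheduler with that name is still active.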