Cassandra has no built-in SQL-style triggers; they can only be implemented with a custom trigger JAR. Given that, is it possible to have a trigger that fires only on deletes, as in SQL?
Related
Could you please explain how DDL replication works in AWS DMS (in the case of the two Postgres databases)? I didn't find an explanation of this process in the official documentation.
As far as I can see, the replication task installs the awsdms_ddl_audit trigger (here is information about this trigger). This trigger intercepts DDL operations and writes them to the awsdms_ddl_audit table. I don't understand what happens to these intercepted DDL operations after that.
P.S.
I am asking because I've noticed that DMS applies these DDL operations in the middle of the CDC process, i.e. it doesn't order them along the timeline of the CDC changes.
In my case, I update the source database for an hour and, at the end of that hour, remove several columns. DMS removes these columns on the target before the CDC synchronization has finished.
This is very strange behavior.
I have flow files with data records in them, and I'm able to place them in an S3 bucket. From there I want to run a COPY command followed by an UPDATE with joins to achieve a MERGE / UPSERT operation. Can anyone suggest ways to do this? Firehose only executes the COPY command, so I can't perform the UPSERT / MERGE directly; as the AWS docs prescribe, I have to copy into a staging table and then update or insert based on some conditions.
There are a number of ways to do this, but I usually go with a Lambda function, run every 5 minutes or so, that takes the data Firehose has put in Redshift and merges it with the existing data. Redshift likes to work on larger "chunks" of data, so it is most efficient to let some volume build up before performing these operations. The best practice is to move the data out of the Firehose target table in an atomic operation such as ALTER TABLE APPEND and use the resulting table as the source for the merge. That way Firehose can keep adding data while the merge is in progress.
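The approach above could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the function name `build_merge_statements` and the table names used in the usage note are hypothetical, and the sketch assumes a single-column key, identical column layouts in all three tables, and at most one row per key in each batch. It only builds the SQL statements; executing them (for example from a scheduled Lambda) is left to whatever Redshift client you use.

```python
from typing import List


def build_merge_statements(target: str, landing: str, staging: str,
                           key: str, columns: List[str]) -> List[str]:
    """Build the SQL for one merge cycle (hypothetical helper).

    landing: table that Firehose COPYs into
    staging: permanent table with the same columns, used as the merge source
    target:  table that should end up with the upserted rows
    Table and column names are assumed to be trusted identifiers.
    """
    non_key = [c for c in columns if c != key]
    if not non_key:
        raise ValueError("need at least one non-key column to update")

    set_clause = ", ".join(f"{c} = s.{c}" for c in non_key)
    col_list = ", ".join(columns)
    select_list = ", ".join(f"s.{c}" for c in columns)

    return [
        # Move the rows out of the landing table. Redshift does not allow
        # ALTER TABLE APPEND inside a transaction block, so it runs first.
        f"ALTER TABLE {staging} APPEND FROM {landing};",
        "BEGIN;",
        # Update rows whose key already exists in the target.
        f"UPDATE {target} SET {set_clause} FROM {staging} s "
        f"WHERE {target}.{key} = s.{key};",
        # Insert rows whose key is not in the target yet. Duplicate keys
        # inside the staging table are NOT handled by this sketch.
        f"INSERT INTO {target} ({col_list}) SELECT {select_list} "
        f"FROM {staging} s LEFT JOIN {target} t ON s.{key} = t.{key} "
        f"WHERE t.{key} IS NULL;",
        # Empty the staging table so the next cycle starts clean.
        f"DELETE FROM {staging};",
        "COMMIT;",
    ]
```

For example, `build_merge_statements("events", "events_landing", "events_staging", "id", ["id", "payload"])` returns six statements to run in order. If a batch can contain several rows for the same key, deduplicate the staging table before the UPDATE and INSERT steps.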
I have a table named "infrastructure" in my PostgreSQL database. Whenever a record is inserted into or updated in this table, 4 different triggers fire, and each trigger takes 1-2 seconds. This is a performance issue for me. Can I send the records to a queue inside PostgreSQL, so that consumers perform the trigger operations later? Is this possible? Does LISTEN/NOTIFY work for this purpose?
Usually queues are better left outside of the DB, using dedicated solutions, but if you insist on keeping it in the database, you can try the mBus extension.
I have not used it, so I cannot comment on it.
I have a Spark job that reads from an Oracle table into a dataframe. The jdbc.read method seems to pull in an entire table at once, so I built a spark-submit job that works in batch mode. Whenever I have data that needs to be manipulated, I put it in a table and run the spark-submit.
However, I would like this to be more event driven. Essentially, I want any data moved into this table to be run through Spark, so that events in a UI can drive these insertions while Spark is simply running. I was thinking about using a Spark streaming context to keep watching and operating on the table all the time, but with a long wait between streaming contexts. That way I can use the results (also written to Oracle in part) to trigger a deletion from the read table and avoid processing data more than once.
Is this a bad idea? Will this work? It seems more elegant than using a cron-job.
We have a DB with a massive amount of business logic stored in triggers inside the DB. Is there a way to log the firing of triggers, along with the arguments they were fired with and what they changed?
I have seen a lot of tutorials on how to audit tables with triggers, but I would like to audit the triggers, not the tables :)
Take one of the examples that do table auditing with triggers. Use their approach to extract the changed data, but instead of writing the data into an audit table, use it in a RAISE NOTICE.
That notice will then be written to the PostgreSQL log file if you set the logging configuration correctly (log_min_messages = notice).
See the manual for details on RAISE: http://www.postgresql.org/docs/current/static/plpgsql-errors-and-messages.html
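A minimal sketch of that idea follows, written as a small Python helper that renders the PL/pgSQL so it can be applied with any client. The helper name `render_logging_trigger` and the default trigger name `log_firing` are hypothetical, and the table name is assumed to be a trusted identifier. The generated function reports the trigger name, the operation, and the old and new row values through RAISE NOTICE; in an existing business-logic trigger, the same RAISE NOTICE lines would go inside that trigger's own function.

```python
# Hypothetical template: a row-level trigger that only logs via RAISE NOTICE.
LOGGING_TRIGGER_TEMPLATE = """
CREATE OR REPLACE FUNCTION {func}() RETURNS trigger AS $$
BEGIN
    -- OLD is not available for INSERT and NEW is not available for DELETE,
    -- so each operation gets its own branch.
    IF TG_OP = 'INSERT' THEN
        RAISE NOTICE 'trigger % fired: % on %, new=%',
            TG_NAME, TG_OP, TG_TABLE_NAME, row_to_json(NEW);
        RETURN NEW;
    ELSIF TG_OP = 'UPDATE' THEN
        RAISE NOTICE 'trigger % fired: % on %, old=%, new=%',
            TG_NAME, TG_OP, TG_TABLE_NAME, row_to_json(OLD), row_to_json(NEW);
        RETURN NEW;
    ELSE
        RAISE NOTICE 'trigger % fired: % on %, old=%',
            TG_NAME, TG_OP, TG_TABLE_NAME, row_to_json(OLD);
        RETURN OLD;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER {trigger}
AFTER INSERT OR UPDATE OR DELETE ON {table}
FOR EACH ROW EXECUTE PROCEDURE {func}();
"""


def render_logging_trigger(table: str, trigger_name: str = "log_firing") -> str:
    """Return the DDL for a logging trigger on `table` (hypothetical helper)."""
    return LOGGING_TRIGGER_TEMPLATE.format(
        table=table,
        trigger=trigger_name,
        func=f"{trigger_name}_fn",
    )
```

Running the rendered DDL for a table and then modifying a row should produce one NOTICE per affected row; with log_min_messages = notice those messages also reach the server log.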