No deleted entries in journal receivers? - db2

We have all Future 3 files journaled, but when I run DSPJRN with OUTFILFMT(*TYPE3) I get no delete (DL) entries. I have only one record in the physical file, but many add (PT) and update (UP) entries for that one record. How is that possible? What am I missing?

Two possibilities: either the CLRPFM command was used to clear the table, or all records in the table were deleted with a single DELETE statement. If you delete every record in a table with one DELETE, DB2 for i uses CLRPFM under the covers. In both cases the journal receiver contains an entry for the clear instead of individual delete entries.
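As a rough illustration of that difference (mylib.mytable and order_id are placeholders, not your actual objects):

-- deletes selected rows; these show up as individual delete entries in the journal
DELETE FROM mylib.mytable WHERE order_id = 42;

-- an unqualified DELETE of all rows may be carried out as a clear of the member,
-- so the journal shows a single clear entry instead of per-row delete entries
DELETE FROM mylib.mytable;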

Thank you for your thoughts. We determined that the journal was set up to save *AFTER images only. Since the *AFTER images were just blanks, there was no way to know which record was deleted. We changed the journaling to *BOTH images and now can see which record(s) were deleted.

Related

Is it possible to truncate sys_file_processedfile?

I am stuck with a sys_file_processedfile table with more than 200,000 entries. Is it possible to truncate the table and empty the folder /fileadmin/_processed_ without breaking anything?
Thanks!
It is possible.
In the Admin Tools (Install Tool), under Maintenance, there is a card named Remove Temporary Assets which you should use to do so.
TYPO3 stores processed files and cached images in a dedicated directory. This directory is likely to grow quickly.
With this action you can delete the files in this folder. Afterwards, you should also clear the cache database tables.
The File Abstraction Layer additionally stores a database record for every file it needs to process (e.g. image thumbnails). If you have modified some graphics settings (All Configuration [GFX]) and need all processed files to be regenerated, you can use this tool to remove the "processed" ones.
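If you would rather do it by hand at the database level, a minimal sketch would be to clear the table (back up first) and then delete the contents of /fileadmin/_processed_ on disk, followed by clearing the caches as described above:

TRUNCATE TABLE sys_file_processedfile;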

PostgreSQL logical replication - ignore pre-existing data

Imagine dropping a subscription and recreating it from scratch. Is it possible to ignore existing data during the first synchronization?
Creating a subscription with (copy_data=false) is not an option because I do want to copy data, I just don't want to copy already existing data.
Example: there is a users table with a corresponding publication on the master. The table has 1 million rows, and a new row is added every minute. Then the subscription is dropped for a day.
If we recreate the subscription with (copy_data=true), replication will not start due to a conflict with already existing data. If we specify (copy_data=false), 1440 new rows will be missing. How can we synchronize the publisher and the subscriber properly?
You cannot do that, because PostgreSQL has no way of telling when the data were added.
You'd have to reconcile the tables by hand (or INSERT ... ON CONFLICT DO NOTHING).
Unfortunately PostgreSQL does not offer nice skip options for conflicts yet, but I believe that will be enhanced in the future.
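A rough sketch of that manual reconciliation, assuming a users table with primary key id and a staging copy of the publisher's current rows called users_pub (both names are placeholders):

INSERT INTO users
SELECT * FROM users_pub
ON CONFLICT (id) DO NOTHING;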
Based on @Laurenz Albe's answer, which recommends using the statement:
INSERT ... ON CONFLICT DO NOTHING.
I believe it would be better to use the following statement, which also takes care of any updates that may have happened to your data before you start the subscription again:
INSERT ... ON CONFLICT ... DO UPDATE SET ...
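For example (hypothetical columns; users_pub again stands for a staging copy of the publisher's current rows):

INSERT INTO users (id, name, updated_at)
SELECT id, name, updated_at FROM users_pub
ON CONFLICT (id) DO UPDATE
SET name = EXCLUDED.name, updated_at = EXCLUDED.updated_at;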
Finally, I have to say that both are dirty solutions: between the execution of the statement above and the creation of the subscription, new rows may arrive, and they will be missing until you perform the custom sync again.
I have seen some other suggested solutions using the LSN from the PostgreSQL log...
For me, perhaps the most elegant and safe option is to delete all the data from the destination table and create the replication again!
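A rough sketch of that full resync, assuming a subscription my_sub and a publication my_pub (names and connection string are placeholders):

DROP SUBSCRIPTION IF EXISTS my_sub;
TRUNCATE TABLE users;
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=publisher dbname=mydb user=repl password=secret'
PUBLICATION my_pub
WITH (copy_data = true);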

Is it possible to restore the db with incomplete data?

I have a database (Postgres) backup which contains 100+ tables, most of them with many rows (100K+). But when I restored the db from the backup file (production data, a large volume of data), one table was restored with less data: nearly 300K rows are missing. Is it possible for this to happen, or am I missing something?
Thanks in advance
One option could be the following: store the data directory from the old db in a zip file and try again. More description here
Michael

Move items between collections

I need to move a large quantity of items between two collections. I tried changing the tables "item" and "collection2item" directly in the database (the "owning_collection" and "item_id" columns, respectively). Then I restarted Tomcat, cleaned the Cocoon cache, and rebuilt the index, but it's still not working.
Is the metadata-export/metadata-import process safer or easier than the above for a mass move of items?
What else can I do?
Your process should be ok if you run the reindex with the -bf flags (just -f may be enough too).
Without the -f flag, the reindex (link goes to code as of DSpace 5.x) will check the last_modified value (in the item table) and only reindex items whose value in that column has changed since the last reindex. This also means that a reindex without -f should work if you also updated the last_modified timestamp.
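For example, a hypothetical touch of the moved items (DSpace 5.x schema; the item IDs are placeholders):

UPDATE item SET last_modified = now() WHERE item_id IN (123, 456);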
Still not working?
If the reindex still doesn't happen, something else must be going wrong. Check your dspace.log -- are there any entries that look like "wrote item xyz to index"? If not then the items aren't being reindexed. Are there any error messages in the dspace.log around the time you do the reindex? Any error messages in the solr log file?
Also, make sure you always run the reindex (and all other dspace commands) as the same user that tomcat is running under, to avoid permissions problems. If you've ever run the commands as a different user, change the permissions of the solr data directory (probably [dspace]/solr/search/data) so that the tomcat user can create/write/delete files in it.
Overall recommendation
In most cases I'd go with batch metadata editing myself for moving items between collections; it avoids all these problems and will trigger a re-index of the affected items automatically.
The metadata import process is very reliable. It also provides a preview option that will allow you to see the changes before they are applied. After the items are updated, the proper re-indexing processes will run.
You only need to provide the item ids and the data fields you wish to edit.
If you prefer to build your CSV file by hand or from a SQL query, that will work as well. The name of the column at the top of your CSV will determine the fields to be updated.
https://wiki.duraspace.org/display/DSDOC5x/Batch+Metadata+Editing#BatchMetadataEditing-CSVFormat
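As a rough illustration, and assuming your DSpace version supports editing the collection column (check the linked documentation), a CSV that moves two items to another collection could look like this (the IDs and the handle are made up):

id,collection
102,123456789/45
103,123456789/45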

Postgresql replication without DELETE statement

We have a requirement that says we should keep a copy of all the items that were in our system at any point. The simplest way to explain it would be replication that ignores DELETE statements (INSERT and UPDATE are OK).
Is this possible? Or maybe the better question is: what is the best approach to tackle this kind of problem?
Make a copy/replica of the current database and use triggers via dblink from the current database to the replica. Use AFTER INSERT and AFTER UPDATE triggers to insert and update data in the replica.
That way, whenever a row is inserted or updated in the current database, the change is reflected directly in the replica.
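A minimal sketch of that trigger approach, assuming an items(id, name) table, a unique id on the replica, and a dblink connection string (all placeholders, not tested against your schema):

CREATE EXTENSION IF NOT EXISTS dblink;

CREATE OR REPLACE FUNCTION copy_to_replica() RETURNS trigger AS $$
BEGIN
  -- push the new/updated row to the replica; the upsert keeps updates idempotent
  PERFORM dblink_exec(
    'host=replica dbname=mydb user=repl password=secret',
    format('INSERT INTO items (id, name) VALUES (%L, %L) ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name',
           NEW.id, NEW.name));
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER items_copy
AFTER INSERT OR UPDATE ON items
FOR EACH ROW EXECUTE PROCEDURE copy_to_replica();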
I'm not sure that I understand the question completely, but I'll try to help:
First (contrary to @Sunit), I suggest avoiding triggers. Triggers introduce additional overhead and impact performance.
The solution I would use (and am actually using in a few of my projects with similar demands) is not to use DELETE at all. Instead, add a boolean column called "deleted", set its default value to false, and instead of deleting a row, update this field to true. You'll also need to change your other queries (SELECT) to include something like "WHERE deleted = false".
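A minimal soft-delete sketch, assuming a table called items (names are placeholders; adjust to your schema):

ALTER TABLE items ADD COLUMN deleted boolean NOT NULL DEFAULT false;

-- instead of: DELETE FROM items WHERE id = 42;
UPDATE items SET deleted = true WHERE id = 42;

-- existing reads need the extra filter
SELECT * FROM items WHERE NOT deleted;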
Another option is to continue using DELETE as usual and to allow deleting records from both the primary and the replica, but to configure WAL archiving and store the WAL archives in a shared directory. This gives you point-in-time recovery, meaning you'll be able to restore another PostgreSQL instance to the state of your cluster at any moment in time (i.e. before the deletion). This way you'll have a trace of deleted records, but a fairly complicated procedure to reach them. Depending on how often the deleted records need to be checked (maybe they are never checked at all and are simply kept for just-in-case tracking), this approach may also help.
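A sketch of the WAL-archiving settings (the archive path is a placeholder; archive_mode needs a server restart to take effect, archive_command only a reload):

ALTER SYSTEM SET archive_mode = 'on';
ALTER SYSTEM SET archive_command = 'cp %p /mnt/wal_archive/%f';
SELECT pg_reload_conf();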