Talend, Postgres and sequences

I have a JPA application and associated Postgres schema that was designed with a single sequence table (not a sequence). I am trying to populate several tables using Talend tPostgresqlOutput. These tables have keys that are sequenced by the JPA application. I am at a loss to work out how to read a sequence number from the table, update it and then use the sequence number to key a record on an insert with Talend. I can work it through with a sequence, but this is a table.
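For reference, a minimal sketch (all names hypothetical) of how such a sequence-table row can be read and advanced atomically in plain SQL, so that the claimed value can then be used to key the insert; in Talend this could be issued from a query component ahead of the tPostgresqlOutput step:

-- Assumed layout, names hypothetical:
--   sequence_table(seq_name text PRIMARY KEY, next_val bigint NOT NULL)
-- UPDATE ... RETURNING reads and advances the counter in one atomic statement.
UPDATE sequence_table
SET next_val = next_val + 1
WHERE seq_name = 'my_entity'
RETURNING next_val - 1 AS claimed_id;

The row lock taken by the UPDATE also keeps concurrent loaders from claiming the same value.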

Related

Cannot truncate the tables in the landing area after transferring data

I have 2 schemas with exactly the same 15 tables in Postgres. Data is inserted into the first schema every 3 hours; then the data needs to be transferred into the second schema.
After that, the tables of the first schema need to be truncated (the landing area needs to be empty). I wrote a trigger that transfers data into the second schema after an insert into the first schema,
but how should the tables of the first schema be truncated?
I searched already and tried two ways, but neither of them works.
1. I put the truncate command after all of the (insert into... conflict on...) statements in the same trigger function that transfers data from the first schema into the second schema. This doesn't work.
2. I made another trigger that fires after insert into (or update of) the tables of the second schema. This also doesn't work. Why not? If the data has already been inserted into second_schema, it should be possible to truncate first_schema; it is not in use by an active query.
Error -> cannot TRUNCATE "table1" because it is being used by active queries in this session
What should I do? I need to transfer data to the second schema after every insert into the first schema, and then truncate all 15 tables of the first schema.
The tables of the first schema should be empty for new inserts.
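For context, here is a sketch of the trigger structure described in attempt 1, with hypothetical schema, table, and function names, annotated with where the reported error comes from:

CREATE OR REPLACE FUNCTION first_schema.transfer_and_clear()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    -- Transfer step, as described in the question:
    INSERT INTO second_schema.table1
    SELECT * FROM first_schema.table1
    ON CONFLICT DO NOTHING;

    -- This is the statement that raises the reported error: the INSERT that
    -- fired the trigger is still reading first_schema.table1 in this session,
    -- and TRUNCATE refuses to run on a table used by an active query.
    TRUNCATE first_schema.table1;
    -- A plain DELETE FROM first_schema.table1; is not subject to that
    -- restriction, at the cost of being slower than TRUNCATE.

    RETURN NULL;
END;
$$;

CREATE TRIGGER transfer_after_insert
AFTER INSERT ON first_schema.table1
FOR EACH STATEMENT
EXECUTE PROCEDURE first_schema.transfer_and_clear();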

Configure Postgres auto-increment sequence starting value for all sequences in one place

In a special use case, I want to configure Postgres auto-increment sequences to start from 100001 instead of 1 across the whole DB for all tables. Could this be configured in one place instead of altering all the sequences after table setup?
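As far as I know there is no single server setting for this, but a one-off DO block, run once after table setup, can restart every user sequence in the database. A sketch:

DO $$
DECLARE
    seq regclass;
BEGIN
    FOR seq IN
        SELECT c.oid::regclass
        FROM pg_class c
        JOIN pg_namespace n ON n.oid = c.relnamespace
        WHERE c.relkind = 'S'  -- 'S' = sequence
          AND n.nspname NOT IN ('pg_catalog', 'information_schema')
    LOOP
        EXECUTE format('ALTER SEQUENCE %s RESTART WITH 100001', seq);
    END LOOP;
END;
$$;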

How to upsert/delete DB2 source table data using PySpark/Spark SQL/DataFrames/Spark RDDs?

I'm trying to run upserts/deletes on some of the values in a DB2 database source table, which is an existing table on DB2. Is this possible using PySpark/Spark SQL/DataFrames?
There is no direct way to update/delete in a relational database using a PySpark job, but there are workarounds.
(1) You can create an identical empty table (a secondary table) in the relational database, insert data into the secondary table using a PySpark job, and write a DML trigger that performs the desired DML operation on your primary table (a MERGE sketch of that operation follows below).
(2) You can create a DataFrame (e.g. a) in Spark that is a copy of your existing relational table, merge the existing-table DataFrame with the current DataFrame (e.g. b), and create a new DataFrame (e.g. c) that holds the latest changes. Then truncate the relational database table and reload it with the latest-changes DataFrame (c).
These are just workarounds, not an optimal solution for huge amounts of data.
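For workaround (1), the trigger or a scheduled statement on the DB2 side would effectively run a MERGE from the secondary table into the primary one. A sketch with hypothetical table and column names (the op_flag column marking rows to delete is an assumption):

MERGE INTO primary_table p
USING secondary_table s
ON p.id = s.id
WHEN MATCHED AND s.op_flag = 'D' THEN DELETE
WHEN MATCHED THEN UPDATE SET p.val = s.val
WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val);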

How to write another query in the IN clause when partitioning

I have 2 local Docker PostgreSQL 10.7 servers set up. On my HOT instance, I have a huge table that I wanted to partition by date (I achieved that). The data from the partitioned table (let's call it PART_TABLE) is stored on the other server; only PART_TABLE_2019 is stored on the HOT instance. And here comes the problem: I don't know how to partition 2 other tables that have foreign keys from PART_TABLE, based on the FK. PART_TABLE and TABLE2_PART are both stored on the HOT instance.
I was thinking something like this:
create table TABLE2_PART_2019 partition of TABLE2_PART for values in (select uuid from PART_TABLE_2019);
But the query doesn't work, and I don't know if this is a good idea (performance-wise and logically).
Let me just mention that I could solve this with a function or script, etc., but I would like to do this without scripting.
From the doc at https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE:
"While primary keys are supported on partitioned tables, foreign keys referencing partitioned tables are not supported. (Foreign key references from a partitioned table to some other table are supported.)"
With PostgreSQL v10, you can only define foreign keys on the individual partitions, not on the partitioned table itself. (Separately, the FOR VALUES IN clause only accepts literal values, so the subquery you tried is not allowed there.)
You could upgrade to PostgreSQL v11, which allows foreign keys to be defined on partitioned tables.
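A sketch of the v10 approach (hypothetical column names and placeholder uuids; it assumes uuid is the primary key of PART_TABLE_2019): the partition bounds are spelled out as literals, and the foreign key is added per partition:

CREATE TABLE TABLE2_PART_2019 PARTITION OF TABLE2_PART
FOR VALUES IN ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11',
               'b1ffcd99-9c0b-4ef8-bb6d-6bb9bd380a22');

ALTER TABLE TABLE2_PART_2019
ADD CONSTRAINT table2_part_2019_fk
FOREIGN KEY (part_table_uuid) REFERENCES PART_TABLE_2019 (uuid);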
Can you explain what a HOT instance is and why it would make this difficult?

How does a PostgreSQL sequence connect to its column?

So I'm working on a database at the moment, and I can see there are loads of sequences. I was wondering how sequences link up to their corresponding column in order to increment the value.
For example, if I create a new table with a column named ID, how would I apply a sequence to that column?
Typically, sequences are created implicitly, either with a serial column or (alternatively) with an IDENTITY column in Postgres 10 or later. Details:
Auto increment table column
Sequences are separate objects internally and can be "owned" by a column, which happens automatically for the above examples. (But you can also have free-standing sequences.) They are incremented with the dedicated function nextval(), which is used in the column default of those columns automatically. There are more sequence manipulation functions in the manual.
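A sketch of both variants, plus manually attaching a free-standing sequence to an existing column (mytbl and its integer id column are hypothetical):

CREATE TABLE t_serial (
    id serial PRIMARY KEY  -- implicitly: integer DEFAULT nextval('t_serial_id_seq')
);

-- Postgres 10 or later:
CREATE TABLE t_identity (
    id integer GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
);

-- By hand, for an existing table:
CREATE SEQUENCE mytbl_id_seq OWNED BY mytbl.id;
ALTER TABLE mytbl ALTER COLUMN id SET DEFAULT nextval('mytbl_id_seq');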
Details:
Safely and cleanly rename tables that use serial primary key columns in Postgres?
Or you can use ALTER SEQUENCE to manipulate various properties.
Privileges on sequences have to be changed explicitly for serial columns, while that happens implicitly for the newer IDENTITY columns.
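For instance (hypothetical names):

-- Adjust properties of an existing sequence:
ALTER SEQUENCE mytbl_id_seq RESTART WITH 1000;

-- Needed explicitly for a serial column's sequence; implicit for IDENTITY columns:
GRANT USAGE ON SEQUENCE mytbl_id_seq TO app_user;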